| EVENTS |
| Full Schedule PDF |
| Keynotes |
| M 1:30P |
Knowledge Architectures for Pharmaceutical R&D |
Jim Golden, CTO, SAIC Life Science Office |
| T 8:45A |
The Cancer Biomedical Informatics Grid (caBIG) |
John Speakman, Memorial Sloan-Kettering Cancer Center |
| Presentations |
| M 8:30A |
OLSUG Meeting Introduction |
John Burke, Data Mgmt. Consultant |
| M 8:45A |
Oracle’s Platform for Life Sciences and Preview of 10g Release 2 New Features |
Charlie Berger, Sr. Dir. Prod. Mgmt., Life Sciences and Data Mining, Oracle |
| M 9:45A |
From Mainframe to the “Grid” – a Real World Experience |
Marcus Collins, Applera Corporation |
| M 10:15A |
Database storage of multiple conformations of molecular structures,
application to structure determination, molecular docking and dynamics simulations |
Mark Forster, Syngenta R&D/IS, Jealott's Hill, UK. |
| M 11:05A |
Oracle's Solutions for Systems Biology |
Susie Stephens, Principal Product Mgr., Life Sciences, Oracle |
| M 11:35A |
Biological Traversal Engine (BTE) |
Shunguang Wang, BG Medicine |
| M 12:05P |
ISV Lightning Round 1 |
12:05P EKM Corporation
12:12P BioPoint Solutions
12:19P InforSense
12:26P SciTegic/Accelrys
12:33P Insightful
12:40P Tom Sawyer Software
|
| M 2:15P |
Maintaining Security and Identity Management in Life Sciences |
Roger Sullivan, VP Business Development Identity Management, Oracle |
| M 2:45P |
Rapid Application Development using InforSense Open Workflow and Oracle Chemistry Cartridge Technologies |
Anthony C. Arvanites, Cambria Biosci. |
| M 3:15P |
Clustering of Protein Space Using Oracle10g |
Gyorgy Babnigg and Carol S. Giometti, Argonne National Laboratory |
| M 4:05P |
PhiMS: An Extensible Phenotypic Information Management System for the Electronic Capture
and Storage of CRF Data in the Non-Profit Environment
|
Brent Richter, Channing Labs |
| M 4:35P |
Electronic Lab Notebooks in the Real World: A LABTrack case study documenting
productivity gains & the functionality needed to be successful.
|
Richard Stember, CEO, Scientific Div., EKM Corporation |
| T 9:45A |
Life Science Examples Using Oracle at SDSC |
Joshua Li and Shankar Subramaniam, SDSC/UCSD |
| T 10:15A |
Managing a Complex Migration Project – A Standards Based Approach |
Marcus Collins, Applera Corporation |
| T 11:05A |
Transforming a DBA group into a Database Center of Excellence - A Case Study |
Danielle Fleming, Vice President, TeraDBA Consulting |
| T 11:35A |
TeraGenomics - Migration of VLDB from Teradata to Oracle |
Eva Mitter, Development & Operations Manager, IMC |
| T 12:05P |
ISV Lightning Round 2 |
12:05P Waters Corporation
12:12P Sun Microsystems
12:19P CambridgeSoft
12:26P Thermo Electron
12:33P Symyx IntelliChem, Inc.
12:40P Applied Biosystems
|
| T 1:30P |
Data warehouse development for integrative biomedical informatics research |
Hai Hu, Dir. Biomedical Informatics,
Michael Liebman, CSO, Windber Research Institute
|
| T 2:00P |
Managing Research Collaboration with Oracle Collaboration Suite Workspaces |
Alok Srivastava, Sr. Group Mgr, Oracle |
| T 2:30P |
Enhanced Reporting for Oracle Clinical and Oracle AERS |
Prasad Inampudi, Oracle |
| Workshops |
| M 9:45A |
Text Mining with Oracle |
Raf Podowski, Sr. Prod. Mgr., Life Sciences, Oracle |
| M 9:45A |
BLAST and Regular Expression Searches |
John Burke, Data Management Consultant
Susie Stephens, Principal Prod. Mgr., Life Sciences, Oracle,
Shiby Thomas, Principal Member of Tech. Staff, Oracle
|
| M 11:05A |
HTML DB |
Richard Landry, Oracle |
| M 2:15P |
Network Data Model for Biological Pathways |
Jack Wang, Principal Member of Tech. Staff, Oracle,
Ning An, Sr. Member of Tech. Staff, Oracle,
Brendan Madden, CEO, Tom Sawyer,
Susie Stephens, Principal Prod. Mgr., Life Sciences, Oracle
|
| M 2:15P |
Managing Life Science and Medical Image Repositories Using Oracle interMedia 10g Release 2 |
Melliyal Annamalai, Principal Member of Tech. Staff, Oracle |
| T 10:00A |
Oracle Data Mining |
Carolyn Hamm, Walter Reed Medical Center,
Charlie Berger, Sr. Dir. Prod. Mgmt., Oracle
|
| T 10:00A |
RDF Data Model for the Semantic Web |
Mike DiLascio, Senior Director, Siderean,
Souri Das, Consulting Member Tech. Staff, Oracle,
Susie Stephens, Principal Prod. Mgr., Life Sciences, Oracle
|
| T 1:30P |
Statistical analysis of gene expression data with Oracle & “R” |
Pat Hoffman, Sr. Principal Analytical Consultant, Oracle,
Raf Podowski, Sr. Prod. Mgr., Life Sciences, Oracle
|
| T 1:30P |
Loading Life Sciences Data into Oracle Database 10g |
Ellen Batbouta, Consulting Member of Tech. Staff, Oracle,
Paul Narth, Senior Group Manager, Warehouse Builder Prod. Mgmt., Oracle,
Susie Stephens, Principal Prod. Mgr., Life Sciences, Oracle,
Ray Swonger, Director Software Dev., Oracle
|
|
| ABSTRACTS |
| Keynotes |
Knowledge Architectures for Pharmaceutical R&D
There are several well-known and pressing needs within pharmaceutical
discovery:
- A decrease in pharmaceutical R&D productivity and the need to reduce compound
attrition in drug discovery and development.
- The need to realize the value of previous investments in new technologies such
as genomics, proteomics and systems biology.
- The need to make sense of the increasing volume of research data and to access
and integrate information across internal silos and "data tombs".
- The need to connect and make sense of information across R&D business units such
as target biology, compound discovery, and clinical trials. This includes the need
for pharmacovigilance and safety signal detection systems throughout the entire
pharma value chain.
- The need to share and protect IP and knowledge across alliances with other
pharmaceutical companies, biotechnology companies and academic labs.
The goal of any IT system within pharma is to enable innovation or enhance
productivity. Most internal initiatives in informatics and knowledge management have
not yielded an overall IT architecture that enables hypothesis-driven drug discovery,
allows researchers to make sense of all the experimental data within the R&D organization,
identifies latent IP stored in data warehouses, and enables signal detection and
communication across business units.
What is needed is a Knowledge Architecture for Pharmaceutical R&D - a
blueprint for a new kind of IT architecture that enables high level reasoning, semantic
integration and inference, and alliance and knowledge management across pharmaceutical R&D.
By creating a knowledge model of pharmaceutical R&D and a supporting IT architecture,
biopharmaceutical companies can enable innovation, enhance productivity and increase safety
throughout the drug discovery, development and clinical trial processes. In this
presentation I will talk about what this blueprint might look like, discuss the technologies
needed to build these systems, and give an overview of how such an architecture would impact
the R&D process.
|
The Cancer Biomedical Informatics Grid (caBIG)
The National Cancer Institute is engaging the community of sixty NCI designated
Cancer Centers in the United States in an ambitious program to collaboratively build the
Cancer Biomedical Informatics Grid (caBIG). caBIG is intended to be a virtual web of
interconnected data, individuals, and organizations that will redefine how research is conducted,
care is provided, and patients/participants interact with the biomedical research enterprise.
caBIG will consist of:
- a common, widely distributed infrastructure to permit the cancer research community to
focus on innovation
- a shared vocabulary, data elements, and data models to facilitate information exchange
- a collection of interoperable applications developed to a common standard
- sets of raw published cancer research data available for mining and integration
Domain areas of interest, determined largely by expressed needs in the cancer
center community, include translational research, clinical trials management and tissue banking.
But the key facet of caBIG that has the capability to transform cancer research and move it to
the endgame is its promise of data sharing through syntactic and especially semantic
interoperability. caBIG is looking to bring in other organizations to achieve these goals
through the broad adoption of compatible standards.
|
| Presentations |
Oracle’s Platform for Life Sciences & Preview of 10g Release 2 New Features
Life sciences professionals have the daunting task of accessing, integrating,
assimilating, analyzing, and interpreting data and sharing their findings with other life
science professionals. They know that by integrating their data and
findings, newly discovered relationships, insights, and patterns can often lead to promising
new cures. They need to collaborate with other researchers, yet maintain
the security of their intellectual property. As a result of years of investing in information
technology, Oracle has emerged as the leading platform in life sciences with an 85% market
share (per IDC). This presentation highlights the features in the Oracle 10g Database,
Application Server and Collaboration Suite that have been found to be of most value
for life science customers:
- Access distributed data (genomic, proteomic, cheminformatic, pathways, clinical data, external tables, other databases, etc.),
- Integrate a variety of data types (files, relational data, unstructured text, XML, images, etc.),
- Manage vast quantities of data (grid, real application clusters, streams, data pump, backup, etc.),
- Find patterns and insights (statistics, data and text mining, pathways analysis, BLAST, regular expression searches, etc.),
- Collaborate securely with other researchers inside and outside an organization (security, file and document sharing, collaborative tools, etc.).
Oracle 10g provides the basis for an integrated information platform for storing,
analyzing, and sharing life sciences data and information. A number of customer use cases will
be highlighted as examples.
|
From Mainframe to the “Grid” – a Real World Experience
Applera Corp recently migrated their Life Science application portfolio off
a Unix mainframe compute complex to a distributed Linux based “Grid Computing” architecture
using commercial off the shelf (COTS) components. The application portfolio consists of a
mix of commercial 3rd party applications and life science applications with a large
component of in-house developed open source code. This session will review recent
developments in hardware architecture, operating system capabilities and Grid Computing
infrastructure; detail the key design objectives considered during the design of Grid
Computing architectures; describe detailed design decisions around chipset, clustered file systems,
processor count and Oracle 9i/10g configurations; and share real-world experiences gained during
a complex mainframe to distributed architecture migration.
Objectives
- Gain an understanding of recent developments in hardware architectures and Grid Computing.
- Gain an understanding of distributed architecture system design objectives.
- Gain an understanding of distributed architecture migration techniques and lessons learnt.
Outline
- Design Objectives of Distributed Architectures
- Server Provisioning and Reuse
- Server and Storage Capacity on Demand
- Storage Presentation on Demand
- Application Architecture (Service Oriented Architecture)
- Detailed Design
- Chipset and CPU/Memory Ratios
- Cluster File Systems
- High Availability (Clustering)
- Virtual Machines (VMware)
- Oracle 9i/10g Configurations
- Migration Techniques / Lessons Learnt
- Migration Techniques
- Lessons Learnt
|
Database storage of multiple conformations of molecular structures, application to structure
determination, molecular docking and dynamics simulations
Investigation of molecular conformation of small molecules and/or biological
macromolecules is a key activity in drug discovery and other aspects of life science research.
Multiple conformations (i.e. multiple coordinate sets) can be produced by structure determination
methods such as X-ray and NMR, as well as by computational tools for exploring small molecule and/or
macromolecule conformation. The amount of data generated can be large even for individual molecular
systems or models.
The output from the vast majority of available computational tools is often stored as
a number of flat files. This eventually leads to data management problems: the methods and
series of conformations used in a set of calculations become unclear with time, and flat
file storage is typically wasteful in that certain information is needlessly stored over and over again.
Even when database schemas for storing molecular data are available, they are often aimed at storing just
a single set of coordinates per molecule.
We will discuss a data model aimed at storing multiple instances of molecular conformations
and apply this to data derived from structure determination methods, protein/ligand docking, molecular
dynamics simulations and other tools for exploring or optimising molecular conformation. The
data model will be described in UML and SQL, with examples drawn from the above experimental and computational
methods.
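As a rough illustration of the idea, and not the presenters' actual schema, a model with many coordinate sets per molecule might be sketched as below. The table names, column names and sample coordinates are all assumptions made for this example, and SQLite (via Python's standard library) stands in for the database.

```python
import sqlite3

# Hypothetical schema: atom identity is stored once per molecule, and only
# the coordinates repeat for each conformation.
SCHEMA = """
CREATE TABLE molecule (
    molecule_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE atom (
    atom_id     INTEGER PRIMARY KEY,
    molecule_id INTEGER NOT NULL REFERENCES molecule(molecule_id),
    element     TEXT NOT NULL
);
CREATE TABLE conformation (
    conformation_id INTEGER PRIMARY KEY,
    molecule_id     INTEGER NOT NULL REFERENCES molecule(molecule_id),
    method          TEXT NOT NULL  -- how this coordinate set was produced
);
CREATE TABLE coordinate (
    conformation_id INTEGER NOT NULL REFERENCES conformation(conformation_id),
    atom_id         INTEGER NOT NULL REFERENCES atom(atom_id),
    x REAL NOT NULL, y REAL NOT NULL, z REAL NOT NULL,
    PRIMARY KEY (conformation_id, atom_id)
);
"""


def create_store():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_molecule(conn, name, elements):
    """Insert a molecule and its atoms once; return (molecule_id, atom_ids)."""
    molecule_id = conn.execute(
        "INSERT INTO molecule (name) VALUES (?)", (name,)
    ).lastrowid
    atom_ids = [
        conn.execute(
            "INSERT INTO atom (molecule_id, element) VALUES (?, ?)",
            (molecule_id, element),
        ).lastrowid
        for element in elements
    ]
    return molecule_id, atom_ids


def add_conformation(conn, molecule_id, atom_ids, method, coords):
    """Insert one coordinate set for an existing molecule."""
    conformation_id = conn.execute(
        "INSERT INTO conformation (molecule_id, method) VALUES (?, ?)",
        (molecule_id, method),
    ).lastrowid
    conn.executemany(
        "INSERT INTO coordinate VALUES (?, ?, ?, ?, ?)",
        [(conformation_id, a, x, y, z) for a, (x, y, z) in zip(atom_ids, coords)],
    )
    return conformation_id


conn = create_store()
mol_id, atoms = add_molecule(conn, "water", ["O", "H", "H"])
# Illustrative coordinates only; they are not measured data.
add_conformation(conn, mol_id, atoms, "X-ray",
                 [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)])
add_conformation(conn, mol_id, atoms, "molecular dynamics",
                 [(0.0, 0.0, 0.0), (0.97, 0.02, 0.0), (-0.25, 0.92, 0.01)])
```

Recording the method on each conformation row keeps the provenance of every coordinate set, which addresses the "unclear with time" problem the abstract describes for flat files.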
|
Oracle's Solutions for Systems Biology
In order to maximize success in drug discovery it is necessary to
integrate all available data that relates to biological systems and disease.
This is a considerable challenge due to the quantities of data being generated,
the many data types, and the plethora of data sources. These integration challenges
are compounded by the dynamic nature of scientific data and the lack of a mature
naming convention. This presentation focuses on techniques that can be used to
integrate data, and thereby allow new patterns and insights to be discovered.
|
BTE (Semantic Data Integration)
BG Medicine Inc (BGM), the leading company in the field of Systems
Pharmacology, uses Oracle technology for its information infrastructure.
Building around an Oracle database, we designed and developed the
Biological Traversal Engine™ (BTE), a bioinformatics application tool
for data discovery and information integration. BTE allows one to find
the information about sets of biological entities among an array of
databases based upon either rule-based or graph-based strategies. BTE
illustrates how one can take advantage of the novel Oracle technologies,
such as Network Data Model, that are provided for the life science
community.
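To make the graph-based strategy concrete, the sketch below runs a plain breadth-first search over a small in-memory graph of linked entities. It is only an assumption about what such a traversal could look like, not BTE's implementation or the Oracle Network Data Model API; the entity identifiers and the `reachable_entities` helper are hypothetical.

```python
from collections import deque


def reachable_entities(links, start, max_hops):
    """Breadth-first search: map each entity within max_hops of start to its hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        entity = queue.popleft()
        if hops[entity] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in links.get(entity, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[entity] + 1
                queue.append(neighbour)
    return hops


# Hypothetical cross-database links between biological entities.
LINKS = {
    "gene:G1": ["protein:P1"],
    "protein:P1": ["pathway:W1", "protein:P2"],
    "protein:P2": ["pathway:W2"],
}

found = reachable_entities(LINKS, "gene:G1", max_hops=2)
```

With a limit of two hops, `pathway:W2` is not returned because it lies three links away from the starting gene.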
|
Maintaining Security and Identity Management in Life Sciences
Identity is the foundation of any audit and compliance system.
However, knowing someone's identity is only part of the solution. This presentation
will explore the breadth of Identity Management solutions toward a better understanding
of the relevant technologies, key solution requirements, and strategic directions in
Identity Management. The presentation will describe Oracle's approach to our customers'
security and Identity Management requirements, outlining how these technologies are
integrated within the Oracle product suites and how recent acquisitions have positioned
Oracle as a leader in the Identity Management market space. The presentation will also
outline related Identity Management standards initiatives such as SAFE and the Liberty
Alliance and propose how these activities could significantly affect companies over the
coming years.
|
Rapid Application Development using InforSense Open Workflow and Oracle Chemistry Cartridge Technologies
At innovative biotechnology companies the challenge is to provide researchers with an
appropriate information environment that promotes discovery and improves productivity. Achieving
this goal requires effective collaborative access to a very heterogeneous information environment
- typically consisting of disparate data types and sources, and software tools and applications
in multiple locations and from a variety of vendors. The lack of appropriate solutions to this
problem can hinder internal collaboration, the capture and dissemination of best practices, and
the fostering of innovation. A workflow-centric Rapid Application Development environment based
on Oracle Database and Chemistry Cartridge technologies provides the ideal informatics solution to
meet the requirements of multi-disciplinary teams within a typical biotechnology research organization.
I will describe how Cambria Biosciences will combine InforSense Open Workflow technology
with Oracle Chemical Cartridge technology to rapidly develop and deploy solutions to solve problems in
a number of key research areas. Particular attention will be given to:
- Integrative cross domain processes (e.g. Hit-to-Lead processes that combine chemistry and biology).
- Chemistry Cartridge integration featuring structure handling, fingerprint calculation and chemical searching.
- In-Oracle analytics
- Advanced informatics workflow constructs
The resultant production level workflows can be deployed by a variety of means, including
web services and portals, to the research organization, thus providing a mechanism for enabling the required
seamless and collaborative access to data, tools and applications.
|
Clustering of Protein Space Using Oracle10g
- Attribute calculation for a large number of sequences using Oracle10g and partitioned tables
- Cluster computing and the use of Java stored procedures created by JDeveloper10g (new symbolic
math functions) and the use of Oracle10g for load balancing
- Similarity searches using Oracle10g and a Linux Cluster
- Sequence clustering using Oracle10g and a Linux cluster (use of partitioned table allows the
clustering of hundreds of millions of relationships within hours)
- Visualization of clustering results using an Oracle-based solution
- Attribute importance overlay on cluster result
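The outline does not say which clustering algorithm is used. Purely as an illustration of grouping sequences from pairwise similarity relationships, the sketch below uses single-linkage grouping with a union-find structure, which is a swapped-in technique and an assumption on our part. The sequence identifiers are hypothetical.

```python
def cluster_by_similarity(pairs):
    """Group items into clusters where each pair marks two items as similar."""
    parent = {}

    def find(item):
        # Locate the cluster representative, shortening the path as we go.
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left  # merge the two clusters

    clusters = {}
    for item in parent:
        clusters.setdefault(find(item), set()).add(item)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical similarity relationships between five sequences.
relationships = [("seqA", "seqB"), ("seqB", "seqC"), ("seqD", "seqE")]
clusters = cluster_by_similarity(relationships)
```

Each relationship is processed once, so the work grows roughly in proportion to the number of similarity pairs.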
|
PhiMS: An Extensible Phenotypic Information Management System
for the Electronic Capture and Storage of CRF Data in the Non-Profit Environment
Data collection, processing, and storage of case reports and
questionnaires throughout the life-cycle of clinical trials, observational,
and epidemiological studies is a challenging prospect but especially so within
the non-profit, biomedical research institution. In addition to the complexity of
the types of data under collection, semantic inconsistency across studies,
lack of metadata standards, and a changing collection protocol, the non-profit
research institution faces a non-trivial constraint on resources due to
the nature of publicly grant-funded research. This constraint is not only
felt on the informatics and software systems professionals but also upon the
project's research assistants, coordinators, and data managers. A system is
needed, therefore, that can be efficiently built using existing open-source
frameworks and third-party tools, enables data integration at all levels, and
increases productivity not only for the informatics and software implementors
but also for the project's study personnel and overall project budget.
Here we present PhiMS, the Phenotypic Information Management
System that is currently being developed and is supporting a distributed clinical trial
project in nutrition at the Channing Laboratory. Core architecture and
infrastructure components of PhiMS use the open-source frameworks of XML and
web services, publicly accessible ontologies for controlling vocabulary and
semantics, along with specific, inexpensive technologies used for deployment.
PhiMS employs ontology-aware, XSD-governed study design, data capture, and
data harvesting infrastructure where all project data is captured via XML
documents, encrypted, and securely transferred to a central Oracle repository
for integration, archiving, and analysis. Development tools assist
investigators in specifying data capture instruments and target data
structures using standard vocabularies for methodological concepts (e.g.,
study design or assay methods) and substantive concepts (e.g., specific
anatomical or functional properties defining study outcomes). Underlying
representations have been chosen to permit very flexible rendering of data
capture instruments and real-time validation of input data against XSD
schemata, and to allow efficient adaptation of intake/database/data
repository infrastructure to changes in study instrumentation.
|
Electronic Lab Notebooks in the Real World: A LABTrack case study documenting productivity
gains & the functionality needed to be successful.
EKM's LABTrack Electronic Lab Notebook has been on the market for over 8 years and
boasts a customer base of over 200 companies. This presentation describes a recent independent
productivity study performed on an actual LABTrack implementation. The study found a 15% productivity
gain achieved with carefully crafted user interface design and functionality.
- What makes an Electronic Lab Notebook?
- Maintaining a legally accepted document.
- What ELN functions can actually improve productivity?
- What features are needed for Research, R&D, QC and Services laboratories?
- How does integration with existing systems impact productivity?
- What features are important outside of the laboratory?
|
Life Science Examples Using Oracle at SDSC
We will show several examples using Oracle database and application servers
in the Molecule Pages Database, AfCS data center and BISTI projects. The Molecule Pages
Database project will be used to illustrate how we use Oracle database and OC4J server.
Putting a Java package inside our database as stored procedures and using a text index in
the database will be covered. The BISTI project will be used to show how to integrate the SMD Oracle
schema and the VAMP MySQL schema into one microarray database in Oracle 10g.
|
Managing a Complex Migration Project – A Standards Based Approach
Applera Corp recently migrated their Life Science application portfolio off
a Unix mainframe compute complex to a distributed Linux based “Grid Computing” architecture
using commercial off the shelf (COTS) components. One major reason for the success of this
project was the use of structured approaches for both the overall project and for the detailed
design phases. Systems Development Life Cycle (SDLC) was the methodology used for the overall
project and an expanded RASM approach (sometimes known simply as RAS) was used during the
detailed design phases. This paper will outline the structured methodology (SDLC) used to plan
and execute the migration project; the structured design approach for detailed design, with a
focus on Grid Computing; and real-world experiences of using these approaches during a
complex migration project.
Objectives
- Gain an overview of a structured methodology (SDLC).
- Gain an overview of a structured design approach for Grid Computing systems (RASM).
- Gain an understanding of how these approaches work in a real world migration project.
Outline
- Structured Methodology (SDLC)
- Waterfall Model
- Initiation, Design, Development, Transition, Disposal
- Structured Design (RASM)
- Reliability
- Availability
- Scalability
- Manageability
- Performance
- Measurability (Monitoring)
- Implementation / Lessons Learnt
- Implementation
- Lessons Learnt
|
Transforming a DBA group into a Database Center of Excellence - A Case Study
One of the most valuable resources in any organization is the Database Administrator -
the guardian of your corporation’s data. When these same valuable resources spend all of their
time on non value-added activities, you're wasting them.
Creating a Database Center of Excellence delivers value to the organization, reduces TCO,
retains your high value professionals, allows ITIL adoption, improves morale and teamwork throughout
the organization and allows you to fully engage in those standardization and consolidation plans that
you've had on the 'to do' list.
Hear how it is done.
What will be learnt?
- You'll learn how to transform a group of individual DBAs into a fully functioning, team-oriented Database Center of Excellence.
- You'll learn techniques on how to reduce your operational costs and TCO.
- You'll learn how to deliver value to the organization.
- You'll learn how to retain your specialized, talented professionals.
|
TeraGenomics - Migration of VLDB from Teradata to Oracle
More than three years ago IMC, Inc., started to develop the TeraGenomics solution
on top of the Teradata database. TeraGenomics is a high-performance data warehousing solution
for managing, analyzing, and sharing Affymetrix® GeneChip® gene expression microarray data.
To extend the potential customer base IMC recently decided to enable that solution to run on
top of other databases that can effectively support very large data sets. Oracle is the first
database IMC selected for the migration effort.
The presentation will focus on the issues and challenges the development team has had to
overcome. IMC currently hosts the TeraGenomics solution as an ASP. The database contains more than
7,000 microarrays, which translates to more than 4 billion rows of data to be managed. The
solution can scale to support hundreds of users analyzing data from tens of thousands of GeneChip
arrays.
TeraGenomics has a MIAME-compliant metadata structure comprised of 140 data elements
within controlled vocabularies (e.g., for organism, anatomy, and disease type) accessed through
drop-down menus. Microarray data in the warehouse can be queried by any of these elements.
TeraGenomics supports probe level pair-wise comparisons among dozens, hundreds,
or even thousands of chips through a rapid point-and-click interface, and stores the comparison
results in the warehouse for reuse. Among other methods, TeraGenomics also supports RMA analysis
with no limits on the number of chips included.
The application is a browser-based thin client solution that can be securely deployed
over the web or on an Intranet. The users can analyze their data using a variety of visualization
approaches (clustering, Venn diagrams, plots, etc.). TeraGenomics is integrated with the Affymetrix
GeneChip Operating Software (GCOS) platform to support seamless uploads of experiments.
TeraGenomics is an ideal solution for scientists in multiple locations who perform
experiments with GeneChip brand arrays to consistently manage their data and share it for collaborative
research. The presentation will also mention some of the recent articles published in the scientific
journals that came from research efforts using TeraGenomics.
TeraGenomics supports easy exporting of data to popular desktop tools such as GeneSpring®
and Spotfire®, and to powerful 3-D neurogenomic visualization tools from Neurome, as well as in both
MAGE-ML and CSV formats.
|
Data warehouse development for integrative biomedical informatics research
Windber Research Institute is conducting integrative high throughput research
involving clinical, genomic, and proteomic platforms to produce terabyte levels of data.
Working with Walter Reed Army Medical Center and other medical institutions we enroll subjects
into the study and gather data in about 500 fields ranging from demographics to pathological tissue
annotations. High throughput research is conducted using DNA, RNA, and protein extracted from
the collected blood, breast and other specimens. The major platforms include gene expression,
genotyping, comparative genomic hybridization, protein isolation, and mass spectrometry protein
identification. We have a data tracking system for most of those platforms, with additional
modules being developed. Recently the need for image handling has surfaced.
To prepare data on such a large scale, together with the relevant public data, for biomedical
informatics research, we envisioned a data warehouse as the solution, and we opted for a
hybrid structure. In the last two years we have contracted the data warehouse to a company using
the Teradata hardware and RDBMS. We have now decided that to best meet our research needs
we should re-design the data warehouse ourselves, using a patient-centric and object-based
structure, and the underlying database should be Oracle. This presentation will discuss this new
design of the data warehouse as well as the development of visualization and analysis tools.
|
Managing Research Collaboration with Oracle Collaboration Suite Workspaces
Research Collaboration sits at the heart of Life Sciences research. Given the
current tools for collaboration, researchers split their time between organizing research data,
sharing it with fellow researchers, complying with rules and regulations and spending actual
time analyzing it. This activity prolongs the time needed to conclude the research. It has
been shown time and again that research organizations spend enormous amounts of money funding
new research and studies. Any reduction in the time and effort needed to reach a conclusion directly
adds to savings. Better organization of research projects not only saves money, it also helps
encourage re-use of information and learning from one project to another.
Oracle Collaboration Suite Workspaces provide a number of capabilities that can
help simplify the management of research projects. Research information and activities can be
organized with minimal effort. Workspaces enable researchers to work with their favorite tools
and organize their research data behind the scenes. A number of content management features, and
the ability to add on records management capabilities for all content, help researchers meet
regulatory compliance requirements with relatively little effort. Oracle Workspaces provide the
following benefits to the Life Sciences research community:
- Enable team collaboration within the context of a project, complete with documents, discussions,
meetings, tasks, notifications and announcements
- Reduce unwanted copies of any information
- Named links to establish relationships among documents and the activities around them
- Task assignments and links to documents
- Privileged access to different content
- Default content management services transparently applied to workspace content including:
- Auto-attribution
- Auto-versioning
- Default workflows
- Other document policies
- Participation via email
- Access to relevant workspace content from their personal productivity tools such as:
- Documents from Windows-mounted OracleDrive with in-place editing and off-line access
to documents
- INBOX and Discussions accessible from IMAP compliant mail client
- Workspace meetings in personal calendar
- Optional email notification with summary and quick links
- Quickly capture best-practices from a workspace in a template and use them to quickly create
new workspaces
- Views to create focused access to workspace content
- Full access to workspace APIs to enable integration into custom applications and tools without
any duplication of information. This allows access to all the same information from tools provided with Collaboration Suite.
|
Enhanced Reporting for Oracle Clinical and Oracle AERS
This presentation will discuss how users can enhance reporting and analysis capabilities
of their Oracle Clinical and Oracle AERS applications. Using Oracle Discoverer, personalized dashboards
may be designed to enable clinical data management and pharmacovigilance users to interactively analyze
clinical and safety information. This powerful tool can provide valuable insights into data management
and pharmacovigilance processes, thereby enabling customers to expedite clinical trials through efficient
tracking of problems in data management, optimize study design and improve regulatory compliance.
Examples of reporting and analysis for Oracle Clinical data include:
- Trial status analysis (e.g. enrollment, progress, etc.)
- Discrepancy listing analysis by study, site and patient
- CRF page analysis
- Study Design analysis (e.g. DCI, DCM, procedures, etc.)
- Global Library analysis (e.g. Studies that utilized a parameterized question, question group, or particular DVG subset)
- Lab test analysis (e.g. lab ranges, etc.)
- TMS analysis (e.g. dictionary mapping, uncoded terms, etc.)
Examples of reporting and analysis for Pharmacovigilance data include:
- Case status analysis (e.g. case progress)
- Regulatory agency reporting metrics
- Case origin analysis (e.g. by country, by product, etc.)
|
| Workshops |
Text Mining with Oracle
Oracle Technology is known for its ability to store vast quantities of data and
bring data mining algorithms directly to the data. This capacity extends to unstructured data,
such as literature, which is of great importance in life sciences. Oracle encompasses a set of
capabilities designed specifically for text analysis, as well as more generic data mining
algorithms which can be applied in text mining.
This workshop will present the use of some of the Oracle Text and Oracle Data
Miner (ODM) features for text mining of MEDLINE. Practical methods of document searching,
clustering and classification will be demonstrated along with enhancements through the use of
ontologies, thesauri and the Oracle Text knowledge base. We will also show how to combine
unstructured and structured data, along with a few simple methods for visualizing results.
|
BLAST and Regular Expression Searches
In this workshop we discuss two database features in Oracle Database 10g: Oracle
Data Mining (ODM) BLAST and Regular Expression Searches. Scientists can use ODM BLAST to perform
sequence homology searches and Regular Expression Searches to perform pattern matching, inside
the database. Performing data analysis in the database simplifies data management by minimizing
the movement of data from disks to memory, allows pre-filtering and post-processing of data sets,
and enables data to remain in a secure, highly available environment. These new database features
enable scientists to take advantage of a new analytical paradigm to simplify data management in
research.
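The pattern-matching idea can be sketched outside the database as follows. This is a minimal Python illustration, not the Oracle feature itself: the sequence identifiers and residues are made up, and the motif used is the well-known N-glycosylation sequon (N, any residue but P, S or T, any residue but P). An in-database version would apply the database's own regular expression functions to a sequence column instead.

```python
import re

# Hypothetical sequences; identifiers and residues are invented for illustration.
sequences = {
    "seq_a": "MKTNGSAVLLNPTQ",
    "seq_b": "MKTAPLLGGH",
    "seq_c": "AANPSGNVTK",
}

# N-glycosylation sequon: N, then any residue except P,
# then S or T, then any residue except P.
MOTIF = re.compile(r"N[^P][ST][^P]")


def find_motif(seqs, pattern):
    """Return {sequence name: [0-based start positions]} for sequences with a hit."""
    hits = {}
    for name, seq in seqs.items():
        positions = [m.start() for m in pattern.finditer(seq)]
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    for name, positions in find_motif(sequences, MOTIF).items():
        print(name, positions)
```

Only sequences containing the motif are returned, which mirrors the pre-filtering role that pattern matching plays before further analysis.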
|
HTML DB
Oracle HTML DB - a no-cost feature of Oracle Database 10g - is a declarative
web-based application development & deployment environment. With it, you can quickly create and
deploy secure, scalable web applications.
|
Network Data Model for Biological Pathways
The Oracle Spatial Network Data Model (NDM) feature enables graph modeling and
analysis in Oracle Database 10g. NDM explicitly stores and maintains connectivity (nodes,
links, and paths) within networks and provides network analysis capability such as shortest
path and connectivity analyses. NDM includes a PL/SQL API package for network data query and
management, and a Java API for network creation, representation, editing and analysis. This
workshop will give an overview of the architecture of NDM, and provide customer use cases.
Demos of the functionality will be shown using Cytoscape and Tom Sawyer.
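As a rough illustration of the kind of shortest-path analysis described here, the sketch below runs a breadth-first search over a small network of nodes and links. It is plain Python, not the NDM PL/SQL or Java API; the node names are placeholders, and treating links as undirected and unweighted is an assumption made to keep the example short.

```python
from collections import deque


def shortest_path(links, start, goal):
    """Breadth-first search over undirected, unweighted links.

    Returns the list of nodes on a path with the fewest links,
    or None if the goal cannot be reached from the start.
    """
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    previous = {start: None}  # node -> node it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical network: placeholder nodes standing in for pathway entities.
links = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]

if __name__ == "__main__":
    print(shortest_path(links, "A", "D"))
```

Connectivity analysis falls out of the same search: a result of None means the two nodes are not connected.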
|
Managing Life Science and Medical Image Repositories Using Oracle interMedia 10g Release 2
Oracle interMedia extends the benefits of the database to media data.
Images can be stored and managed in the database for security, ease of maintenance, scalability,
and other advantages. interMedia functionality includes metadata management, image processing,
interfaces for media upload and retrieval, and tools for easy application development involving media.
Oracle interMedia 10g Release 2 (now in beta) provides support for metadata management and
DICOM (the medical image standard). Metadata support for Life Science applications is based on XMP,
and the XMP framework allows users to write metadata into images based on any user-defined schema.
The metadata is hence part of the image and cannot be lost or disconnected from the image itself.
DICOM support allows applications to read metadata such as patient information from medical images.
All metadata and images can be stored securely in the database using interMedia.
The workshop will start with a 10-minute overview of interMedia. This will be
followed by a closer look at two sample applications. The first application will walk through the
steps of storing images in the database and managing the metadata associated with images in the
database. The application will create a table of images in the database, create thumbnails of those
images, retrieve images from the table, extract metadata from the images, add new metadata, define
and add an application-defined schema to the image metadata, and search the image metadata. It will
also include some processing functionality. The second application will walk through the steps of
storing DICOM images in the database and will extract and manage DICOM metadata in the database.
Segments of PL/SQL and Java code from both applications will be presented.
|
Oracle Data Mining
This workshop provides information about the applications of data mining
technology using several life sciences examples. Participants will learn how to find
the factors associated with certain pathogens. The workshop will present the attribute
importance algorithm which is used to find the attributes (variables) that most influence
a user-specified target attribute, e.g. positive outcomes, high-risk patients, or a disease.
Several classification algorithms including Naïve Bayes, Adaptive Bayes Networks, Support
Vector Machines, and Decision Trees will be presented to build models that can be used to
make predictions, e.g. which patients are likely to respond to treatment, or to determine
whether a tissue sample is healthy or unhealthy based on gene expression data. Decision
trees, Association Rules, and Clustering techniques will be presented to help uncover
hidden factors and associations that can be used to improve the quality of life, outcomes,
or clinical care. Lastly, this workshop will present how “unstructured data”, e.g. a
physician’s notes or a laboratory report, can be used to extract more hidden information
and build better predictive models.
|
RDF Data Model for the Semantic Web
A project that is focused on true semantic interoperability is the Semantic Web
initiative by the W3C. The idea behind the Semantic Web is that it creates a globally distributed
database by adding definition tags to information that is available on the Web and linking the
tags in such a way that computers can discover data more efficiently and form new associations
between pieces of information. The aim is then that applications will be able to take advantage
of distributed data, and incorporate data that users were not aware existed. Semantic tagging is
unlike most other data integration approaches as it allows relationships to be discovered. For
example, gene definitions can become functionally linked to inherited diseases, and their position
within a biochemical pathway can be interwoven. The Resource Description Framework (RDF) is the
W3C's recommended standard for the common data format required to make data
available on the Semantic Web. An RDF statement is a triple, subject-predicate-object,
which allows data to be connected piece by piece and link by link. This workshop will describe
the RDF Data Model in Oracle Database 10g, and provide a demo of the capabilities.
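The subject-predicate-object idea can be illustrated with a few lines of Python. This is a toy in-memory sketch, not the Oracle RDF Data Model: the identifiers (`gene:G1`, `disease:D1`, `pathway:P1`) and predicate names are hypothetical, and real RDF would use URIs for them. It shows how linking triples lets a relationship be discovered that no single statement asserts.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers below are hypothetical placeholders.
triples = [
    ("gene:G1", "associatedWith", "disease:D1"),
    ("gene:G1", "participatesIn", "pathway:P1"),
    ("gene:G2", "participatesIn", "pathway:P1"),
    ("gene:G3", "participatesIn", "pathway:P2"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in store
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def genes_sharing_pathway(store, disease):
    """Genes in the same pathway as any gene associated with the disease."""
    linked = {s for s, _, _ in match(store, predicate="associatedWith", obj=disease)}
    pathways = {
        o
        for gene in linked
        for _, _, o in match(store, subject=gene, predicate="participatesIn")
    }
    candidates = {
        s
        for pathway in pathways
        for s, _, _ in match(store, predicate="participatesIn", obj=pathway)
    }
    return sorted(candidates - linked)


if __name__ == "__main__":
    print(genes_sharing_pathway(triples, "disease:D1"))
```

No triple states that the second gene is related to the disease; the association emerges from following two links through the shared pathway.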
|
Statistical analysis of gene expression data with Oracle & “R”
With Oracle's 10g database, most statistics, informatics, and traditional
machine learning can be done on all types of data completely inside the database. The
Oracle Data Miner interface is a Java-based GUI that supports most traditional
data mining functions; however, many specialized analyses, such as gene-to-gene correlation
and t-statistic calculations, are best done with Oracle's SQL or PL/SQL API. Traditional
programming languages such as Java, C++, and C# can also be used to access these database
functions.
This talk/workshop will demonstrate database techniques for this type of informatics.
SQL and PL/SQL code and packages will be demonstrated, analyzed and made available for many
specific gene expression and other high-throughput life science analyses.
Some of these techniques will include:
- Converting Affymetrix gene expression data from a 2D wide layout to transaction or nested table
format. This allows many thousands of genes to be attributes or columns in machine learning
clustering and predictive analysis.
- Efficient use of correlations, t-statistics, F-statistics and Minimum Description Length
(MDL) in determining the most significant genes in an expression experiment.
- Various statistical significance tests (Bonferroni, FDR, etc.).
- Other analysis techniques (protein arrays, mass spec, QSAR, etc.) within Oracle's 10g DB.
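Three of the techniques listed above can be sketched in a few lines. The example is plain Python rather than the SQL and PL/SQL packages the workshop refers to, and every sample name, gene name, and number in it is invented. It reshapes a wide expression matrix into transaction rows, computes a Welch t-statistic for one gene across two groups, and applies Bonferroni and Benjamini-Hochberg (one common FDR procedure, assumed here) to a list of p-values supplied by the caller.

```python
from math import sqrt
from statistics import mean, variance


def wide_to_transactions(wide):
    """Turn {sample: {gene: value}} into (sample, gene, value) rows."""
    return [(s, g, v) for s, genes in wide.items() for g, v in genes.items()]


def welch_t(group_a, group_b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def bonferroni(pvalues, alpha=0.05):
    """Reject where p <= alpha / number of tests."""
    threshold = alpha / len(pvalues)
    return [p <= threshold for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Reject the k smallest p-values, k being the largest rank with p <= k/m * alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    wide = {"s1": {"g1": 1.0, "g2": 2.0}, "s2": {"g1": 3.0, "g2": 4.0}}
    print(wide_to_transactions(wide))
    print(welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
    pvals = [0.01, 0.02, 0.03, 0.20]
    print(bonferroni(pvals), benjamini_hochberg(pvals))
```

On the sample p-values, Bonferroni keeps only the smallest one while the FDR procedure keeps three, which is the usual trade-off between the two corrections when many genes are tested at once.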
|
Loading Life Sciences Data into Oracle Database 10g
This workshop provides an overview of a range of techniques for loading data
into Oracle Database 10g. Features to be covered include SQL*Loader, Data Pump and Oracle
Warehouse Builder.
SQL*Loader allows data to be loaded into Oracle tables from operating system
files. It can load data from multiple datafiles during the same session, as well as having
the capability to load data into multiple tables. The data can be loaded from disk, tape,
or named pipe. SQL*Loader contains a powerful data parsing engine which puts little limitation
on the format of the data in the datafile.
Data Pump is a high-speed, parallel infrastructure that enables quick movement
of data and metadata from one Oracle database to another. This technology is the basis for Oracle's new
data movement utilities - Data Pump Import and Data Pump Export.
Oracle Warehouse Builder is a tool to enable the design and deployment of Business
Intelligence applications, data warehouses and data marts. Warehouse Builder enables users to design their own
Business Intelligence application from start to finish. Dimensional design, ETL process design, extraction from
disparate source systems, extensive metadata reporting and integration with Oracle Discoverer, Oracle Workflow
and Oracle Enterprise Manager enable an integrated Business Intelligence solution with Warehouse Builder at
the core.
|
|