Data Management

With the increased demand for computing power from life science researchers and scientists tackling big data issues, data storage infrastructure must be able to scale to handle billions of data points and files efficiently. The challenge is administering data so that information can be integrated, accessed, shared, linked, analyzed, and maintained to best effect across the organization. The Data Management track will explore how to manage data workflows and administer effective data processes to satisfy increased computing power demands.

Monday, September 20

7:30 am Registration Open
8:00 am Recommended Pre-Conference Workshops*

Cambridge Healthtech Institute is pleased to offer morning and afternoon pre-conference workshops on Monday, September 20, 2021. They are designed to be instructional and interactive and to provide in-depth information on a specific topic. They allow for one-on-one interaction and provide a great way to explain more technical aspects that would otherwise not be covered during the main conference tracks that take place Tuesday-Wednesday.

*Separate registration required. See Workshop page for details.

9:30 am Break
9:45 am Recommended Pre-Conference Workshops*
11:15 am Enjoy Lunch on Your Own
12:45 pm Recommended Pre-Conference Workshops*
2:15 pm Break
2:30 pm Recommended Pre-Conference Workshops*
4:00 pm Session Break and Transition to Plenary Keynote


4:15 pm Innovative Practices Awards – Winners Spotlight

Pharma Executive Roundtable: Broadening the Data Ecosystem

Panel Moderator:
Lita Sands, Head, Life Sciences, Amazon Web Services

The Bio-IT World community employed creativity, problem solving, and technical ingenuity to weather 2020 and never was the work more important. Meanwhile, digitization has been broadening the horizons of new possibilities and initiatives that are driving innovation in the life sciences sector. While over the past year many pharmaceutical companies have seen an acceleration of digital transformation, there are still many that are unsure what to expect going forward. Digital transformation is now a strategic imperative, not a buzzword. Join our Pharma Executive Roundtable to discover how biopharma companies are broadening their digital strategies and capabilities to develop products and services to scale, streamline operations, and drive innovation in life sciences R&D. 

Ramesh V. Durvasula, PhD, Vice President & Information Officer, Research Labs, Eli Lilly & Co.
Michael Montello, Senior Vice President, R&D Tech, GlaxoSmithKline
Bryn Roberts, PhD, Senior Vice President & Global Head of Data Services, Roche
Holly Soares, PhD, Vice President & Head, Precision Medicine, Pfizer Inc.
Lihua Yu, Chief Data Officer, FogPharma
5:45 pm Welcome Reception in the Exhibit Hall with Poster Viewing
7:00 pm Close of Day

Tuesday, September 21

7:00 am Registration Open and Morning Coffee


8:00 am Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Bio-IT World Conference & Expo

8:05 am Chairperson's Remarks
8:10 am

CO-PRESENTATION: Research Data Management Best Practices

Aleksandar Stojmirovic, PhD, Associate Scientific Director, Data Science, Janssen Research & Development, LLC
Weiwei Schultz, PhD, Senior Data Scientist, Immunology Biomarker, Janssen R&D LLC

We report on the practices established at Janssen R&D to enable efficient storage and consistent retrieval of molecular research data in a FAIR (Findable, Accessible, Interoperable, Reusable) manner. All incoming data is required to be annotated with standardized metadata describing the context and purpose of its originating study and experiment. Coupled with indexing of metadata in integrated catalogs, this workflow rewards researchers with the ability to seamlessly search and cross-reference translational datasets.

8:40 am

Leveraging Knowledge Graphs for Drug Discovery

Sebastian Scharf, PhD, Data Scientist, Roche Pharma

With each year, the number of curated and community sources of publicly available data and repositories increases. While this is great news for the whole scientific community, it can be quite overwhelming for non-expert users. Here, we present an approach using semantic technologies and heavily rooted in the FAIR principles to make data more easily available to scientists, with a special focus on enriching internal data with publicly available data in the biologic space.

9:10 am Coffee Break in the Exhibit Hall with Poster Viewing
10:00 am

Beyond Data Strategy – How to Shift Gears from Planning to Value-Based Execution

Magdalena Wienken, PhD, Associate Director Data Operations & Governance, AstraZeneca GmbH

When we set out to build a data strategy, it was a loose set of ideas together with a big vision. Three pillars of execution have helped to turn it into action: 1) the scientific question as the main framework for execution, 2) building a team with a diverse skill set that fully understands the data, and 3) integrating data governance to shape data citizenship and a novel way of data management.

Sangeet Khullar, Director, Data Science & Engineering, Daiichi Sankyo Inc.

Join us to hear firsthand from Sangeet Khullar, Director of Data Architecture at Daiichi Sankyo, as he walks through his experience with Data Quality, RDM, and MDM initiatives in Pharma, including:

  • Reference Data (industry-standard dictionaries for data analysis and publications)
  • Master Data (consistency of protocol numbers)
  • Cost reduction through simplifying data processes
  • Regulatory mandates (IDMP, etc.)

The session will detail how Daiichi Sankyo uses the Ataccama platform to control the quality of their data and maintain strict, holistic data governance—thereby supporting and ensuring the success of drug development processes.

Andrew LeBeau, Associate Vice President for Biologics Marketing, Marketing, Dotmatics, Inc.

In March 2021, Dotmatics, provider of enterprise scientific informatics software, joined forces with Insightful Science, holder of best-of-breed desktop software such as Geneious, SnapGene, and GraphPad Prism. This combination gives life sciences research organizations the compelling benefits of centralized data management provided by Dotmatics, while still allowing end-users to innovate in the tools they are familiar and productive with. This marks a major step forward for the industry.

11:00 am Session Break and Transition to Luncheon Presentation
Keir Evans, Sr. Technical Solutions Consultant, Digital Solutions, Thermo Fisher Scientific

Scientific organizations continue to strive for digital transformation and connectivity across their ecosystem. The need for better automation is apparent, including physical automation of the lab and the intelligent flow of information across the disparate digital tools that support their scientific and business processes. Automation allows for reduced downtime, increased stability, and optimized efficiency, providing the time and space for labs to focus on innovations and improvements crucial to their success. Through digital solutions, lab automation, and artificial intelligence, automation platforms are critical to the digital transformation journey.

12:00 pm Session Break and Transition to Exhibit Hall
12:15 pm Refreshment Break in the Exhibit Hall with Poster Viewing


Michael Stapleton, PhD, Managing Director, Life Sciences, Accenture
1:15 pm

How Digital Evolution and an Attitudinal Revolution are Re-Shaping the Future of the Life Sciences Industry

Nimita Limaye, PhD, Research Vice President, Life Sciences R&D Strategy and Technology, IDC

The world has rapidly transitioned to a model of disaggregated care and decentralized clinical trials, with a heightened focus on patient-centricity. Digital resiliency has become the priority and discretionary spend on R&D platforms has been delayed. Federated-learning models are fueling co-innovation and GPU-powered transformer models are accelerating drug discovery. Technology is enabling access and equity. The borders between healthcare and life sciences are blurring and real-world data is being leveraged to drive a precision medicine strategy.

1:50 pm

All of Us Research Program – Seeking To Advance Precision Health for All Populations

Joshua Denny, MD, MS, CEO, All of Us Research Program, National Institutes of Health

The All of Us Research Program launched on May 6, 2018, and currently has over 375,000 participants who have contributed biospecimens, health surveys, and a willingness to share their EHR. Participants are partners in the program and receive research results from data they contribute, including genetic ancestry and traits. In the future, participants will also receive health-related genomic results from whole genome sequencing. In May 2020, the program launched the beta version of the Researcher Workbench. Once researchers register and are approved to use the workbench, they can access individual-level data and a suite of tools to analyze these data. All of Us is committed to catalyzing a robust ecosystem of researchers and providing a rich dataset that drives discovery and improves health.

2:30 pm Refreshment Break in the Exhibit Hall with Poster Viewing


3:00 pm

PANEL SESSION: Data Governance: Cross-Industry Perspectives

Panel Moderator:
Santha Ramakrishnan, PhD, Global Data Governance Lead, Sanofi

There has been an explosion in the availability of data from various sources and in various formats, and in the use of data for everything from simple analytics to complex AI. The virtualization of all operations has further accelerated our engagement with data. Governance is key to data accessibility and quality. We hope to uncover lessons in data governance from a selection of industries that deal with diverse data and problem spaces.

Sangram Birje, Vice President, Data Management, Medifast, Inc.
Felix Matschke, Executive Director, UBS, the investment banking company
Jack Pollard, PhD, Precision Oncology, Sanofi
Alexander Sherman, Director, Center for Innovation and Bioinformatics, Massachusetts General Hospital
Christine Suver, PhD, Vice President Research Governance & Ethics, Sage Bionetworks
4:05 pm Refreshment Break in the Exhibit Hall with Poster Viewing
4:35 pm

Trust in Data, Integrity in Science

Leslie D. McIntosh, PhD, CEO, Ripeta, Inc.

Transparent research practices and the sharing of research data, protocols, and code accelerate scientific advancements and solve real-world problems in healthcare, the environment, and society. Yet broad, open research sharing has presented trust issues, particularly with limited checks on scientific integrity. This workshop will engage attendees on common trustbusters in open science and dive deep into the taxonomy of trust, providing research integrity recommendations and best practices.

Kevin Trimm, Vice President, Product Management, Certara

With the adoption of CDISC data standards by the US FDA and PMDA for new, investigational, and abbreviated new drug applications, and for certain biologics license application filings, most biopharmaceutical companies are converting data collected in clinical trials to comply with these standards. Successful methods for accomplishing and managing this process using a single data model across biopharmaceutical research will be discussed, with a focus on the benefits to research.

5:35 pm Networking Reception in the Exhibit Hall with Poster Viewing
6:35 pm Close of Day

Wednesday, September 22

7:30 am Registration Open
8:00 am Interactive Discussions

Interactive Discussions are informal, moderated discussions, allowing participants to exchange ideas and experiences and develop future collaborations around a focused topic. Each discussion will be led by a facilitator who keeps the discussion on track and the group engaged. For in-person events, the facilitator will lead from the front of the room while attendees remain seated. For virtual attendees, the format will be in an online networking platform. To get the most out of this format, please come prepared to share examples from your work, be a part of a collective, problem-solving session, and participate in active idea sharing. Please visit the website's Interactive Discussions page for a complete listing of topics and descriptions.

Emerson Huitt, Founder & CEO, Snthesis, Inc.
  • Fueled by automation and high throughput methods, life science data is growing at an exponential rate, creating more data about more aspects of biological systems than ever before.
  • The explosion in research data creates significant opportunities, but also exposes challenges in managing, integrating and utilizing this data at machine scale.
  • Discuss the opportunities presented by automating harmonization and normalization across disparate data sets.
9:00 am Coffee Break in the Exhibit Hall with Poster Viewing


9:45 am Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Bio-IT World Conference & Expo

9:50 am

Chairperson's Remarks

Benjamin R. Busby, PhD, Director, Solution Science, DNAnexus
9:55 am

Accelerated Analysis of Biobank Data

Rory Kelleher, Global Director, Industry Business Development, Healthcare & Life Sciences, NVIDIA

Biobanks offer a rich repository of information to study for greater understanding of disease mechanisms and the creation of more targeted treatments. While the growth of such datasets is impressive, their scale can be daunting to those who want to analyze them. GPUs dramatically accelerate the analysis of multi-omic data, from secondary and tertiary analysis of sequence data, to deep phenotyping image analysis, to natural language processing of records. Acceleration makes at-scale analysis of data not only tractable but, in some cases, interactive.

10:15 am

Unlocking the Promise of Multi-omics Data at Scale: Automated Harmonization with Clinical Data

Emerson Huitt, CEO, Snthesis, Inc.

The rising availability of multi-omics data for disease research represents an incredible opportunity for clinical research. Automated harmonization of ICD-10 diagnostic codes and disease identifiers in omics data allows research data and clinical data to be integrated at scale. This automation is the key to providing high-quality data for ML applications in multi-omics clinical research. We demonstrate automated harmonization and integration and its potential to unlock new research opportunities.

10:35 am

Translation of AI and ML to a Learning Healthcare System

Sean Davis, MD, PhD, Professor of Medicine, University of Colorado Anschutz Medical Campus

The development and deployment of AI and ML models in healthcare includes challenges that extend well beyond data availability and technology stacks. Data governance, regulatory compliance, stakeholder priorities, financial incentives, sustainability, and translational and clinical impact all affect if and how AI and ML research can benefit a learning healthcare system. In this talk, I will review some of the general challenges to translation of AI and ML into clinical practice and provide specific context based on experiences at our institution.


From BioBank Scale to Individual Patients: Bringing Complex Multi-Omic Data to the Clinic and Clinical Research!

Panel Moderator:
Benjamin R. Busby, PhD, Director, Solution Science, DNAnexus

Many multi-omics datasets for different diseases have been generated, and the availability of new analytical tools is now, for the first time, allowing these resources to be combined in several ways in clinical research. There are, however, serious challenges involved in realizing the promise of these developments. Developing new methods for multi-omics data will allow for better patient stratification, more targeted treatments, and greater understanding of disease mechanisms.

Emerson Huitt, CEO, Snthesis, Inc.
Sean Davis, MD, PhD, Professor of Medicine, University of Colorado Anschutz Medical Campus
Rory Kelleher, Global Director, Industry Business Development, Healthcare & Life Sciences, NVIDIA
Suman Kumar, Senior Manager, Deloitte Consulting LLP
Nick Lingler, Managing Director, Deloitte Consulting LLP

The traditional flow of data across the clinical trial life cycle can become a complicated maze of manual effort, rework, and inefficiency, contributing to trial time and cost. Companies should harness AI to streamline the clinical trial data life cycle and open new opportunities. We’ll discuss the challenges with traditional approaches to managing clinical study data and the potential for AI to deliver faster, more efficient, and significantly less expensive clinical trials.

11:55 am Session Break and Transition to Luncheon Presentation
Mike Mendoza, Senior Director, Solution Management - eClinical, Calyx
Elya Shaffer, Director, Patient Engagement Solutions, Product Management, Calyx

We ask patients and sites to provide a massive amount of data over the life of a clinical trial. The methods for collecting data, as well as the frequency and quantity of data collection, seem to expand and multiply every year as technology advances and matures. With all the data we are collecting, is it possible we are missing out on some critical elements of patient-related or site-related data? How do we also try to ensure patients and sites stay engaged in the arduous process of data generation and collection?

12:55 pm Session Break and Transition to Exhibit Hall
1:10 pm Refreshment Break in the Exhibit Hall with Poster Viewing



Trends from the Trenches

Panel Moderator:
Kevin Davies, PhD, Executive Editor, The CRISPR Journal; Founding Editor, Bio-IT World

Since 2010, the “Trends from the Trenches” presentation, given by Chris Dagdigian, has been one of the most popular annual traditions on the Bio-IT Program. The intent of the talk is to deliver a candid (and occasionally blunt) assessment of the best, the worthwhile, and the most overhyped information technologies (IT) for life sciences. The presentation has helped scientists, leadership, and IT professionals understand the basic topics related to computing, storage, data transfer, networks, cloud, data science, and machine learning that are involved in supporting data-intensive science. In 2021, Chris will give the “Trends from the Trenches” presentation in its original “state-of-the-state address” format, followed by guest speakers giving podium talks on relevant topics. An interactive, moderated Q&A discussion with the audience follows. Come prepared with your questions and commentary for this informative and lively session. To stay connected with Trends from the Trenches updates after today and all year, sign up for BioTeam's newsletter here:

Chris Dagdigian, Senior Director, BioTeam, Inc.
Fernanda S. Foertter, PhD, Director of Applications, NextSilicon
Karl Gutwin, PhD, Director, Software Engineering Services, BioTeam, Inc.
Adam Kraut, Director Infrastructure & Cloud Architecture, BioTeam, Inc.
3:30 pm Refreshment Break in the Exhibit Hall with Poster Viewing


4:00 pm

Deconvolution of Massive Scale Datasets from Etiological Lessons: Technical Tips and Tricks, Data Interoperability for Training, and Feature Extraction

Benjamin R. Busby, PhD, Director, Solution Science, DNAnexus
Ankita Das, PhD, Head of Product, MIODx
Ahmad Khleifat, Clinician Scientist, King's College London
Vivian Neilley, Lead Interoperability Solution Engineer, Google Cloud Healthcare

Clinical Reporting of TCR Data
Ankita Das, PhD, Head of Product, MIODx
TCR profiling can provide valuable insights about the immune health status of an individual, and is valuable in predicting response to immunomodulatory drugs. Running TCR sequencing analysis at scale and over multiple time points, and managing and presenting the outputs for clinical utility, is challenging. We present a scalable and clinically interpretable TCR analysis workflow that can take inputs from multiple platforms and provide TCR clonotypes of individuals over time. Results could be used to predict inflammation and response to treatment in conjunction with other omic data types.

An Extensible Prototype for MultiOmic Clinical Reporting
Ahmad Khleifat, Clinician Scientist, King's College London
Many multi-omics datasets for different diseases have been generated, and the availability of many new analytical tools is now, for the first time, allowing these resources to be combined in several ways in clinical reporting. We have developed a tool to facilitate reporting of multi-omics data. The tool generates two reports, one aimed at clinical use and the other at researchers, informing the interpretation of genetic variants pertaining to the gene provided by the user. The clinical research reports are harmonized to the Observational Medical Outcomes Partnership (OMOP) database, which allows for a systematic analysis of all the integrated multi-omics data. The identification of biologically meaningful targets using multi-omics data will allow for better stratification, more targeted treatments, and a greater understanding of disease mechanisms.

Getting Healthcare Standards to Interoperate with FHIR, HL7v2, VCF, and OMOP
Vivian Neilley, Lead Interoperability Solution Engineer, Google Cloud Healthcare
Government regulations have brought a wave of adoption for structured data standards in the payor and provider sectors. Mandates requiring the use of FHIR in patient data through APIs have accelerated the consolidation of data across clinical organizations. One clear gap has been helping the research community to utilize their existing workflows (with VCF and OMOP) in combination with the standardized clinical sets. This talk will give an overview of how to interoperate between research standards and EHR standards to best work across modalities. It will also cover how to bring research back into the point of care using integration and standardization work.

5:35 pm Close of Conference
