Original Agenda
We are actively working with our speakers to confirm their availability for the virtual event. Initial response from our speakers has been very positive, and we are optimistic we will have the new programs ready to share here soon.

Clinical Research and Translational Informatics

Advancing clinical trials and translational research requires transforming biological insights and raw research data into clean, actionable data for integration, visualization, and analysis. The Clinical Research and Translational Informatics track explores new and innovative tools and techniques—including big data analytics, machine learning, and artificial intelligence—and how they can be leveraged to address specific challenges faced across the drug discovery spectrum to accelerate the translation of scientific discoveries from the bench to medical care. Gain practical recommendations and real-world insights from case studies across pharma and academia. Actionable insights require making the results of complex analysis readily convertible into the common workflows of the clinician and researcher. How do you approach this problem?

Final Agenda


Monday, October 5

9:00 am - 5:00 pm Hackathon*

*Pre-registration required.

Tuesday, October 6

7:30 am Workshop Registration Open and Morning Coffee

8:30 am - 3:30 pm Hackathon*

*Pre-registration required.

8:30 - 11:30 am Recommended Morning Pre-Conference Workshops*

W3. Introduction to Data Visualization for Biomedical Applications

Nils Gehlenborg, PhD, Assistant Professor, Department of Biomedical Informatics, Harvard Medical School

12:30 - 3:30 pm Recommended Afternoon Pre-Conference Workshops*

W13. Structuring Data for Drug Development and Regulatory Submissions: The Role of Standards and Ontology

Lawrence Callahan, PhD, Chemist, Office of Health Informatics, Global Substance Registration System/Office of Health Informatics, Office of Chief Scientist, FDA

Hande Kucuk McGinty, PhD, Research Scientist, Collaborative Drug Discovery

Gregory Pappas, Associate Director, National Surveillance, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration

Michael Waters, Team Lead, System Harmonization and Interoperability Enhancement for Laboratory Data (SHIELD), U.S. Food and Drug Administration

*Separate registration required.

2:00 - 6:30 Main Conference Registration Open


4:00 Welcome Remarks

Cindy Crowninshield, RDN, LDN, Executive Event Director, Cambridge Healthtech Institute





4:05 Keynote Introduction

4:15 PLENARY KEYNOTE PRESENTATION: NIH’s Strategic Vision for Data Science

Susan K. Gregurick, PhD, Associate Director, Data Science (ADDS) and Director, Office of Data Science Strategy (ODSS), National Institutes of Health





Rebecca Baker, PhD, Director, HEAL (Helping to End Addiction Long-term) Initiative, Office of the Director, National Institutes of Health





5:00 - 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing



Wednesday, October 7

7:30 am Registration Open and Morning Coffee


8:00 Welcome Remarks

Allison Proffitt, Editorial Director, Bio-IT World




8:05 Awards Program Introduction

8:10 Benjamin Franklin Award and Laureate Presentation

J.W. Bizzaro, Managing Director, Bioinformatics.org





8:35 Bio-IT World Innovative Practices Awards

Allison Proffitt, Editorial Director, Bio-IT World




8:45 Keynote Introduction

8:55 PANEL DISCUSSION: Game On: How AI, Citizen Science, and Human Computation Are Facilitating the Next Leap Forward

Seth Cooper, PhD, Assistant Professor, Khoury College of Computer Sciences, Northeastern University






Lee Lancashire, PhD, Chief Information Officer, Cohen Veterans Bioscience






Pietro Michelucci, PhD, Director, Human Computation Institute






Jérôme Waldispühl, PhD, Associate Professor, School of Computer Science, McGill University






While the precision medicine movement augurs better outcomes through targeted prevention and intervention, those ambitions entail a bold new set of data challenges. Various panomic and traditional data streams must be integrated if we are to develop a comprehensive basis for individualized care. However, deriving actionable information requires complex predictive models that depend on the acquisition and integration of patient data on a massive scale. This picture is further complicated by new data streams emerging from quantified self-tracking and health social networks, both of which are driven by experimentation-feedback loops. These issues may seem insurmountable, but recent advancements in human/AI partnerships and crowdsourcing science add a new set of capabilities to our analytic toolkit. This talk describes recent work in online collective systems that combine human and machine-based information processing to solve biomedical data problems that have been otherwise intractable, and an information processing ecosystem emerging from this work that could transform the landscape of precision medicine for all stakeholders.

9:45 Coffee Break in the Exhibit Hall with Poster Viewing


10:50 Organizer’s Welcome Remarks

Cambridge Healthtech Institute

10:55 Chairperson’s Remarks

11:00 Advancing Pharma R&D with Digital Health and AI-Enabled Insights

Ray Liu, Senior Director, Advanced Analytics and Statistical Consultation, Takeda Pharmaceuticals

Novel digital technology allows study subjects to be assessed and phenotyped to reveal new patterns. Coupled with Big Data and new AI technologies, digital technology has great potential to make drug development more efficient and fulfill the promise of personalized medicine. The presentation explores the status of digital technology implementation in clinical trials and the impact of AI/ML. Challenges and opportunities for analytical development will also be discussed.

11:30 Bridging Clinical Research and Real-World Data in a Patient-Centric Multiverse

Alexander Sherman, Director, Center for Innovation and Bioinformatics, Massachusetts General Hospital

Patient- and disease-related information resides in a multiverse of data silos, mostly institutional or modality-based databases. A patient-centric approach may help to bring such information together and bridge clinical trials data with RWD, such as data from EHRs, DNA sequences, image banks, biobanks, -omics, etc. We are introducing patient-centric approaches with unique, secure patient identification and aligned incentives for all players in the research continuum, including academia, industry, government, patient advocates, and patients.

12:00 pm Sponsored Presentation (Opportunity Available)

12:30 Session Break

12:40 LUNCHEON PRESENTATION I (DNAnexus): Democratizing Molecular and Digital Data to Accelerate Precision Oncology Research

Samir Courdy, Vice President, Research Informatics, City of Hope Comprehensive Cancer Center

1:10 Luncheon Presentation II (Sponsorship Opportunity Available) 

1:40 Session Break


1:50 Chairperson’s Remarks

Alexander Sherman, Director, Center for Innovation and Bioinformatics, Massachusetts General Hospital

1:55 Enabling the Connection between Preclinical and Clinical Data

Michael Lange, ML/AI Lead, R&D Informatics, Small Molecule Discovery Informatics, Roche

Preclinical Project Data Hub is going clinical. Over the past years, it was established in Small Molecule Discovery and recently expanded into Large Molecule Research. Next stop: Clinical Data. We plan to provide our users with an application that allows the connection between preclinical and clinical metadata.

2:15 A Comprehensive Platform for Innovation with Data

Ajay Shah, PhD, MBA, Executive Director & Head of IT for Translational Medicine, Bristol-Myers Squibb

Sage is a comprehensive platform that enables FAIR data across discovery, clinical research, and real-world sources. This talk will give an overview of Sage and of the solutions developed in the Sage ecosystem for biomarker analytics, including the essential components of the platform, such as uniform high-quality data ingestion, data lake enhancement with semantic integration and conformance of data, and a reproducible research framework.

2:35 Maximizing Real-World Assets through a Comprehensive Patient Data Platform

Albert Wang, MS, Director, IT for Translational Research & Technologies, Bristol-Myers Squibb

The Sage ecosystem is a cohesive, cross-functional platform for finding, accessing, integrating, and analyzing patient-centric data. This talk will focus on real-world data (RWD). It will highlight how Sage catalogs, models, integrates, conforms, and presents patient-level metadata across all RWD assets to facilitate downstream cross-dataset analysis within an integrated managed analytics environment. The talk will also touch on the business drivers for this initiative, our current progress, and some lessons learned.

2:55 Optimising Site Feasibility Using AI and Predictive Insights

Nicola Marlin, Chief Product Officer, Pharma Intelligence

Co-Presenter to be Announced

3:10 Sponsored Presentation (Opportunity Available)

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing


4:00 Chairperson’s Remarks

Lawrence Callahan, PhD, Chemist, Office of Health Informatics, Global Substance Registration System/Office of Health Informatics, Office of Chief Scientist, FDA

4:05 Clinical Data Visualizations to Drive Clinical and Biomarker Exploration Using Both Clinical Trial and Real-World Data

Philip Ross, PhD, Director, Translational Bioinformatics Data Science, Bristol-Myers Squibb

Exploratory visualizations generated from clinical trials and real-world data sources provide important insights into safety, efficacy and biomarker responses to novel and standard-of-care treatments. Automation of data updates in near-real time increases the impact of this information on decision-making.

4:35 The Global Substance Registration System (GSRS): An Essential Tool for Structuring Translational Clinical and Regulatory Data

Lawrence Callahan, PhD, Chemist, Office of Health Informatics, Global Substance Registration System/Office of Health Informatics, Office of Chief Scientist, FDA

The ISO IDMP is a set of standards developed by regulators and industry to structure medicinal product information in a consistent manner. The GSRS is freely distributed software developed in collaboration with NIH/NCATS that implements the substance standard. The GSRS defines substances in medicinal products and related substances such as targets, metabolites, and impurities and links these substances to products, clinical trials, applications, and adverse events.

5:05 Offshoot Applications from the G-SRS Moonshot

Daniel Katzel, Senior Software Engineer, National Center for Advancing Translational Sciences, NIH

While developing G-SRS, the NIH/NCATS team developed useful standalone tools to help streamline registration, curation and regulation of medicinal ingredients. This presentation will cover three such projects: an image to chemical structure recognition program (molvec), an abstraction layer allowing cheminformatics software to switch between underlying informatics frameworks at runtime (molwitch) and Inxight:Drugs, which incorporates and unifies marketing, regulatory status, rigorous drug ingredient definitions, biological activity and clinical use.


5:35 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

6:45 End of Day

Thursday, October 8

7:30 am Registration Open and Morning Coffee


8:00 Organizer’s Remarks

Cindy Crowninshield, RDN, LDN, Executive Event Director, Cambridge Healthtech Institute




8:15 Toward Preventive Genomics: Lessons from MedSeq and BabySeq

Robert Green, MD, MPH, Professor of Medicine (Genetics) and Director, G2P Research Program/Preventive Genomics Clinic, Brigham & Women’s Hospital, Broad Institute, and Harvard Medical School




9:00 AI in Pharma: Where We Are Today and How We Will Succeed in the Future

Natalija Jovanovic, PhD, Chief Digital Officer, Sanofi Pasteur




9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced at 10:00




10:30 Organizer’s Remarks

Cambridge Healthtech Institute

10:35 Chairperson’s Remarks

10:40 CO-PRESENTATION: How a Knowledge-Base Analytics Platform Has Empowered Data-Driven Decision Making and Is Transforming Translational Research

Yan Ge, Director, Data Analytics, Data Science Institute, Takeda

Erik Koenig, Principal Scientist, Translational Oncology, Head Strategy Innovation Management, Takeda Pharmaceuticals

Takeda’s R&D Data Hub has been established to maximize the value of data, make them FAIR, increase access for efficient analysis, and drive data-driven decision making. The Strategic Translational Oncology Research Knowledge-base (STORK) platform is a mission-critical strategic application leveraging both the R&D Data Hub and leading-edge Big Data technologies to harmonize the increasing data density of Immuno-Oncology Research and Development. STORK provides better-catalogued and enriched biomarker assay data and allows researchers to intuitively and easily query internal preclinical data, clinical trials data, and external data such as full-text literature and clinicaltrials.gov sources using NLP. Furthermore, STORK’s self-service visualizations enable more efficient benchmarking, cross comparisons, and forward and reverse translational insights to support key decision-making throughout the therapeutic lifecycle.

11:10 A Grassroots Translational Research Data Commons: Lessons from a Patchwork Data Lake Implementation

Jay Bergeron, Director, Translational Research Business Technologies, Pfizer

Managing and exposing clinical biomarker information using traditional data warehouse platforms promotes efficient exploratory analysis. However, performing analyses across clinical study collections available in data warehouses has proven challenging due to a high degree of data heterogeneity coupled with the high cost of manually conforming study data to consistent standards. An alternative data management philosophy (i.e., a data lake) promotes aggregating clinical datasets in native formats and delaying costly dataset transformation until specific analytical needs arise. Effective content search capabilities are required to identify scientific datasets of interest in order to attain the cost advantage of purpose-driven analytical data preparation. A data lake implementation based on toolsets commonly available to pharmaceutical informaticians, including Elasticsearch, scientific terminology services, JupyterHub and R, will be presented as an option for building usable large-scale clinical data collections while limiting the necessity of comprehensive dataset transformations.

11:40 Sponsored Presentation (Opportunity Available)

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Last Chance Poster Viewing


1:55 Chairperson’s Remarks

Michael Liebman, PhD, Managing Director, IPQ Analytics, LLC

2:00 PANEL DISCUSSION: Big Data Meets RWE: There Are Elephants in the Room


Michael Liebman, PhD, Managing Director, IPQ Analytics, LLC


Anastasia Christianson, Vice President, R&D Business Technology, Janssen Pharmaceuticals

Michael Montgomery, MD, former Global Head Medical Affairs, Incyte Pharma

Jonathan Morris, MD, Vice President, Provider Solutions; Chief Medical Informatics Officer, Real World Insights, IQVIA

We are applying Big Data to solve complex health-related problems but do not always acknowledge that there are Elephants in the Room: 1) It’s the Diagnosis, Stupid!; and 2) Clinical Trials: Data, Data Everywhere but Not without Bias. This panel will address issues of diagnostic quality, mis- and missed diagnosis, diagnosis vs stratification in complex disorders and syndromes and how these impact healthcare decisions and clinical development.


3:00 Applications of AI and Data Science to Drug Development: Opportunities and Challenges

Faisal Khan, PhD, Executive Director, Advanced Analytics and AI, AstraZeneca

The application of advanced data science and artificial intelligence techniques is prevalent across drug development, from pre-clinical drug discovery, through Phase 1-3 trials, and beyond.  Whether analyzing images, digital devices with streaming data, predicting trial progress, or doing much more, opportunities to accelerate getting drugs to the market are numerous.  However, there are also challenges that are worth bearing in mind.

3:30 Data Science in Predictive Trial Enrollment: Increasing Success and Improving Risk Mitigation Strategies

Gina Rothenberger, Global Feasibility Therapeutic Area (TA) Head – Oncology, Janssen

Janssen is utilizing machine learning to compare planned vs. actual enrollment performance, showing success in early detection and intervention to improve strategies after study start-up and initial site selection. The strength and opportunity of utilizing tools and technologies to show meaningful correlations and visibility into tracking to plan will be demonstrated. Case studies utilizing data science algorithms will be shared to highlight how insights were derived and applied to increase success and improve risk mitigation strategies.

4:00 Close of Conference
