Pharmaceutical R&D Informatics

Pharmaceutical R&D departments are at a crossroads – we have more technology and data than ever before, priming us for novel discoveries, yet there are still many challenges informatics strategies must address. The digitalization of the lab is at the forefront, and it necessitates quality data as well as knowledge management strategies, especially in the search for effective, real-world uses of AI and machine learning. We must also address how these new technologies are transforming day-to-day workflow and knowledge exchange, and what change management, investment, and regulatory strategies must be employed to make them successful. The Pharmaceutical R&D Informatics track will explore real-world projects related to digitalization, FAIR data, knowledge management systems, and artificial intelligence development and implementation, and how such initiatives are driving precision medicine.

Final Agenda

Monday, April 20

9:00 am - 5:00 pm Hackathon*

*Pre-registration required.

Tuesday, April 21

7:30 am Workshop Registration Open and Morning Coffee

8:30 am - 3:30 pm Hackathon*

*Pre-registration required.

8:30 - 11:30 am Recommended Morning Pre-Conference Workshops*

W1. Data Management for Biologics: Registration and Beyond

Diana Bowley, Business Relationship Manager, Biologics, AbbVie

Benjamin Li, Head of IT RDM Biological Sample Production & Management, Boehringer Ingelheim

Yuan Lin, Senior Manager, Pfizer Digital, Pfizer

Sebastian Schlicker, Director, Biologics Business, Genedata, Basel, Switzerland

Monica Wang, PhD, Principal Bioinformatics Architect, Project and Program Manager, Global Research IT, Takeda

12:30 - 3:30 pm Recommended Afternoon Pre-Conference Workshops*

W9. Digital Biomarkers and Wearables in Pharma R&D and Clinical Trials

Timothy Aungst, PharmD, Associate Professor, Pharmacy Practice, MCPHS University

Ariel Dowling, PhD, Associate Director, Digital Clinical Devices, Data Sciences Institute, Research and Development, Takeda Pharmaceuticals

Graham Jones, PhD, Director, Innovation, Novartis

Larsson Omberg, Vice President, Systems Biology, Sage Bionetworks

*Separate registration required.

2:00 - 6:30 Main Conference Registration Open

4:00 Welcome Remarks

Cindy Crowninshield, RDN, LDN, Executive Event Director, Cambridge Healthtech Institute




4:05 Keynote Introduction

4:15 PLENARY KEYNOTE PRESENTATION: NIH’s Strategic Vision for Data Science

Susan K. Gregurick, PhD, Associate Director, Data Science (ADDS) and Director, Office of Data Science Strategy (ODSS), National Institutes of Health





Rebecca Baker, PhD, Director, HEAL (Helping to End Addiction Long-term) Initiative, Office of the Director, National Institutes of Health





5:00 - 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

Wednesday, April 22

7:30 am Registration Open and Morning Coffee

8:00 Welcome Remarks

Allison Proffitt, Editorial Director, Bio-IT World




8:05 Keynote Introduction

8:15 Toward Preventive Genomics: Lessons from MedSeq and BabySeq

Robert Green, MD, MPH, Professor of Medicine (Genetics) and Director, G2P Research Program/Preventive Genomics Clinic, Brigham & Women’s Hospital, Broad Institute, and Harvard Medical School




8:45 PANEL DISCUSSION: Game On: How AI, Citizen Science, and Human Computation Are Facilitating the Next Leap Forward

Pietro Michelucci, PhD, Director, Human Computation Institute






Additional Panelists to be Announced

9:45 Coffee Break in the Exhibit Hall with Poster Viewing


10:50 Organizer’s Welcome Remarks

Cambridge Healthtech Institute

10:55 Chairperson’s Remarks

Chairperson to be Announced, EPAM

11:00 Digital Transformation Driving Precision Medicine

Anastasia Christianson, Vice President, R&D Operations and Oncology IT, Janssen

Digital transformation is still a driving principle in pharma R&D with the ultimate goal being to streamline processes and enable precision medicine. This talk will showcase examples of digital technologies driving transformation and tangible results in R&D.

11:30 Enduring Value from Data-Centric Digital Transformation

Dana Vanderwall, PhD, Director, Biology & Preclinical Sciences IT, Research & Development IT, Bristol-Myers Squibb

Companies that have successfully transformed their business demonstrate the necessity of commensurate cultural change. Similarly, our ecosystem must embrace a data-centric design focus to deliver successful digital transformation and enduring value. Without implementing standards to deliver interoperable data with complete, consistent, standard contextual metadata, we will fail to transform the laboratory and data value chain. Continuing to generate data as a by-product of process and software underserves R&D objectives.

12:00 pm Making the Most of Real-World Data (RWD) with Natural Language Processing (NLP)

Jane Reed, Senior Director, Life Science Strategy, Linguamatics

Real-world data can inform drug use and all phases of drug development/commercialization. Many RWD sources, like electronic health records, patient forums, social media, etc., contain unstructured or semi-structured text. We describe client use cases where Linguamatics NLP transforms RWD to provide visualization or analysis, or feed machine learning models.

12:15 Presentation to be Announced

12:30 Session Break

12:40 Luncheon Presentation I to be Announced

1:10 LUNCHEON PRESENTATION II: New Frontiers in Data Science – The Role of Automation in Accelerating Scientific Discovery

Georges Heiter, CEO, Databiology

1:40 Session Break


1:50 Chairperson’s Remarks

Tom Plasterer, PhD, Director of Bioinformatics, Data Science & AI, Biopharmaceutical R&D, AstraZeneca

1:55 The Essentials of FAIR-ifying Data

Tom Plasterer, PhD, Director of Bioinformatics, Data Science & AI, Biopharmaceutical R&D, AstraZeneca

While the value of FAIR data has been established, as well as the costs of un-FAIR data, adoption lacks easy routes. The Pistoia Alliance FAIR data toolkit and Innovative Medicines Initiative (IMI) FAIRplus Cookbook offer frameworks to start. Key decisions on what to name things (e.g., identifiers) and their semantics (e.g., vocabularies) are critical at journey inception. Once established, FAIR knowledge graphs and FAIR analytic services become enterprise data-centric enablers.

2:25 How to Hold on to Your Knowledge in an Agile World

Etzard Stolte, PhD, Global Head, Knowledge Management PTD, F. Hoffmann-La Roche

As pharma is embracing digital and agile, new challenges for the retention and sharing of information are emerging. While the information standards of a validated environment remain, ad hoc processes and distributed cloud solutions are creating new islands of knowledge. In this presentation, I will present a knowledge strategy based on automated discovery and integration for a development department of several thousand scientists at Roche.

2:55 Modern Approaches to Screening and Lead Discovery Drive Improved Success Rates in Drug Discovery

David Gosalvez, PhD, Director, Cheminformatics, PerkinElmer

Traditional screening systems have intrinsic limitations in the assay types, data structures, and pipelines they can handle. The risk of misidentifying leads increases even further if the right assay data are not available for SAR analysis. PerkinElmer has applied modern computational approaches to common challenges in analyzing assay, screening, and SAR data. Along with a customer, we will describe recent advances in how future-proof approaches are being employed to enable scientists to increase hit rates for screening assays and lead development.

3:10 Enabling Insights through a Connected Laboratory Informatics Platform

Michael Huang, Product Manager, Digital Science, Thermo Fisher Scientific

Digital technology is changing every aspect of the world around us and the lab is no exception. Streamlining laboratory processes is important in creating efficiency and harmonization in the laboratory. Thermo Fisher Scientific™ Platform for Science™ delivers integration and connectivity capabilities, bringing data together into a single lab informatics solution.

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing


4:00 Chairperson’s Remarks

Dana Vanderwall, PhD, Director, Biology & Preclinical Sciences IT, Research & Development IT, Bristol-Myers Squibb

4:05 Umbrella: Providing FAIR Data Management from the Point of Entry

Frida Thorsteinsdottir, Head, Clinical & Biomarker Informatics, Roche

We present a novel informatics landscape that provides a flexible way to define, ingest, quality-check, store, and share data from complex, largely outsourced clinical studies. This landscape additionally supports efficient sample tracking, provides data analysis capabilities, and offers a variety of scientific dashboards to provide real-time insights into the status of ongoing studies.

4:30 A Journey to Build an Ecosystem of Data, Technology, and Culture for Insight Generation

Hongmei Huang, PhD, Senior Director, Head of Development Sciences Informatics, Development Sciences, Roche Genentech

We aim to transform drug development and translational sciences by establishing the data and informatics ecosystem. Where do you even start when data are not readily accessible and are constantly evolving? How do you mobilize the organization to change the mindset for data sharing? This talk will share with you the journey we are on to revamp the data management practice and to build the end-to-end engine for FAIRification of our key data assets. We will share the challenges we have overcome and the successes leading to scientific impacts.

4:50 Making Drug Discovery Data FAIR: The Yawning Gap between Aspiration and Implementation

Christopher Southan, PhD, Principal Consultant, TW2Informatics

The FAIRification of data is gaining impetus. However, for drug discovery, the envisaged increased flow of structures and bioactivity into major public databases, such as PubChem, has not happened. Reasons will be reviewed, but a key impediment is that even when supplementary data from journal papers is submitted to open repositories, such as figshare, there is neither push nor pull into PubChem. Ways to ameliorate this major bottleneck will be discussed, including bypassing the entombment of chemistry in PDFs.

5:05 Design Hub for Early Phase Drug Discovery

Andras Stracz, Head Developer, Drug Design, Hub Development, ChemAxon

We will introduce ChemAxon’s platform for integration of a variety of data sources and services to augment real-time design. We cover two use cases: 1) how MMP analysis supports design out of hERG liability; and 2) how a search on >500M molecules, from combined public databases, can be analyzed in seconds.

5:20 Sponsored Presentation (Opportunity Available)


5:35 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

6:45 End of Day

Thursday, April 23

7:30 am Registration Open and Morning Coffee

8:00 Organizer’s Remarks

Cindy Crowninshield, RDN, LDN, Executive Event Director, Cambridge Healthtech Institute




8:05 Awards Program Introduction

8:10 Benjamin Franklin Award and Laureate Presentation

J.W. Bizzaro, Managing Director,




8:35 Bio-IT World Innovative Practices Awards

Allison Proffitt, Editorial Director, Bio-IT World




9:00 AI in Pharma: Where We Are Today and How We Will Succeed in the Future

Natalija Jovanovic, PhD, Chief Digital Officer, Sanofi Pasteur




9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced at 10:00


10:30 Organizer’s Remarks

Cambridge Healthtech Institute

10:35 Chairperson’s Remarks

10:40 Powering Question-Driven Problem Solving to Improve the Chances of Finding New Medicines

Samiul Hasan, PhD, Scientific Analytics and Visualization Director, Data and Computational Sciences, GlaxoSmithKline

Making true “molecule”-“mechanism”-“observation” relationship connections is a time-consuming, iterative and laborious process. In addition, it is very easy to miss critical information that affects key decisions or helps make plausible scientific connections. The current practice for deciphering such relationships frequently involves subject matter experts (SMEs) requesting resources from resource-constrained data science departments to refine and redo highly similar ad hoc searches. The result of this is impairment of both the pace and quality of scientific reviews. In this presentation, I show how semantic integration can be made to ultimately become part of an integrated learning framework for more informed scientific decision-making. I will take the audience through our pilot journey and highlight practical learnings that should inform subsequent endeavors.

11:10 Computational Efforts on Drug Repurposing for Rare Diseases

Bin Li, PhD, Director, Computational Biology, Takeda Pharmaceuticals

We conducted in silico screens to repurpose >100 compounds for ~4000 rare disease indications. Various data types were utilized (protein-protein interaction networks, pathways, disease-driven genes, competitive intelligence, etc.), and different computational methods were implemented and evaluated. Some biologically interesting drug/disease pairs were observed.

11:40 Presentation to be Announced

12:10 pm Session Break

12:20 Luncheon Presentation I to be Announced

12:50 Luncheon Presentation II (Sponsorship Opportunity Available)

1:20 Dessert Refreshment Break in the Exhibit Hall with Last Chance Poster Viewing


1:55 Chairperson’s Remarks

Bino John, PhD, Associate Director, Data Science – Clinical Pharmacology & Safety Sciences, AstraZeneca R&D

2:00 KEYNOTE PRESENTATION: Accelerated Drug Development Using AI

Bino John, PhD, Associate Director, Data Science – Clinical Pharmacology & Safety Sciences, AstraZeneca R&D

Drug development is a costly endeavor, requiring on average 2.6 billion dollars to bring a drug to market. Data science and artificial intelligence are essential in reducing the cost and time needed to bring drugs to the clinic. This talk will highlight some of the current initiatives in analytics at AstraZeneca, spanning chemical and biological data. The talk will provide specific use cases in which we use AI to improve drug design and develop safer medicines.

2:30 Machine-Learned Molecular Models for Protein Structure, Networks, and Design

Mohammed AlQuraishi, PhD, Systems Biology Fellow, Harvard Medical School

3:00 Progress in Diagnosing Rare Disease Patients Leveraging NLP

Tom Defay, Senior Director, R&D Strategy and Alliances, SPMD, Strategy, Program Management and Data Sciences, Alexion

3:30 Mining Drug-Target-Disease Trends from Public Data Sources

Peter Henstock, PhD, AI & Machine Learning Technical Lead, Pfizer

4:00 Close of Conference
