AI for Drug Discovery and Development

The AI for Drug Discovery and Development track will discuss opportunities and challenges that biopharma organizations are experiencing in harnessing the power of artificial intelligence and machine learning technologies to maximize and accelerate drug discovery and development efforts from early stage to adoption to practical application. Speakers will explore the role of AI in transforming disease understanding and target ID, approaches using AI and human expertise to help identify and deliver validated targets, as well as enhance chemical drug design and precision medicine.

Monday, September 20

7:30 am Registration Open
8:00 am Recommended Pre-Conference Workshops*

Cambridge Healthtech Institute is pleased to offer morning and afternoon pre-conference workshops on Monday, September 20, 2021. They are designed to be instructional and interactive and to provide in-depth information on a specific topic. They allow for one-on-one interaction and provide a great way to explain more technical aspects that would otherwise not be covered during the main conference tracks that take place Tuesday-Wednesday.

*Separate registration required. See Workshop page for details.

9:30 am Break
9:45 am Recommended Pre-Conference Workshops*
11:15 am Enjoy Lunch on Your Own
12:45 pm Recommended Pre-Conference Workshops*
2:15 pm Break
2:30 pm Recommended Pre-Conference Workshops*
4:00 pm Session Break and Transition to Plenary Keynote


4:15 pm Innovative Practices Awards – Winners Spotlight

Pharma Executive Roundtable: Broadening the Data Ecosystem

Panel Moderator:
Lita Sands, Head, Life Sciences, Amazon Web Services

The Bio-IT World community employed creativity, problem solving, and technical ingenuity to weather 2020 and never was the work more important. Meanwhile, digitization has been broadening the horizons of new possibilities and initiatives that are driving innovation in the life sciences sector. While over the past year many pharmaceutical companies have seen an acceleration of digital transformation, there are still many that are unsure what to expect going forward. Digital transformation is now a strategic imperative, not a buzzword. Join our Pharma Executive Roundtable to discover how biopharma companies are broadening their digital strategies and capabilities to develop products and services to scale, streamline operations, and drive innovation in life sciences R&D. 

Ramesh V. Durvasula, PhD, Vice President & Information Officer, Research Labs, Eli Lilly & Co.
Michael Montello, Senior Vice President, R&D Tech, GlaxoSmithKline
Bryn Roberts, PhD, Senior Vice President & Global Head of Data Services, Roche
Holly Soares, PhD, Vice President & Head, Precision Medicine, Pfizer Inc.
Lihua Yu, Chief Data Officer, FogPharma
5:45 pm Welcome Reception in the Exhibit Hall with Poster Viewing
7:00 pm Close of Day

Tuesday, September 21

7:00 am Registration Open and Morning Coffee


8:00 am Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Bio-IT World Conference & Expo

8:05 am Chairperson's Remarks
8:10 am

AI: Earlier Cancer Detection

Jake Orville, Pipeline General Manager, Exact Sciences

Cancer is detected too late, but the innovation around multi-cancer early detection tests brings hope that we can change the detection paradigm and the outlook of a cancer diagnosis. This presentation will dive into the development of tools to catch cancer early and change how cancer is treated, and the importance of providing life-changing answers.

8:40 am

Intelligent Machines Take on Clinical Data Management

Prasanna Rao, Head, AI & Data Science, Data Monitoring and Management, Clinical Sciences and Operations, Global Product Development, Pfizer Inc.

Artificial Intelligence (AI) and Machine Learning (ML) are over-hyped, with very high expectations from various stakeholders. Many organizations have embarked on this journey and are in various stages of implementation. In this session we will discuss Pfizer's AI/ML journey in Clinical Development, a few successful outcomes, best practices, and the successful use of NLP/ML in Pfizer's portfolio.

9:10 am Coffee Break in the Exhibit Hall with Poster Viewing
Mike Lelivelt, PhD, Vice President, DNAnexus

Being able to predict whether an individual is more susceptible to Adverse Drug Reactions (ADRs) is incredibly useful in both research and clinical contexts. Pharmacogenomics companies can leverage the UK Biobank, a large-scale biomedical database, to better understand how genes affect a person’s response to drugs. Learn how DNAnexus Apollo efficiently analyzes this massive dataset with explainable machine learning models to gain insight into ADRs.

10:30 am

PIES – Novel AI to Automatically Generate Textual Narratives for Regulatory Reports, Based on Data from Experiments

Bing Chen, PhD, Senior Principal Data Scientist, DevSci Informatics, Genentech, Inc.

We introduce Pangaea's Intelligence Extraction and Summarization (PIES), a neural architecture that uses a unique data augmentation procedure to address data sparsity (training with only 200 examples). Coupled with the copy mechanism, PIES ensures model interpretability and precision of the values that appear in the output (textual narratives). PIES is generalizable to various input datasets. The outputs are validated by human experts, which allows efficient model improvement with minimal human effort.

Jingqing Zhang, Head of AI, Technology, Pangaea Data

Pangaea's Intelligence Extraction and Summarization (PIES) incorporates novel unsupervised NLP and NLG, demonstrating significantly higher accuracy and real-world applicability than generic language models like GPT-3. PIES has effectively helped the pharmaceutical industry to: discover new clinical features to characterize hard-to-diagnose conditions and find more patients, especially the undiagnosed and misdiagnosed; auto-generate textual narratives for regulatory reports; extract and format adverse events; summarize patient records; and generate synthetic data.

12:00 pm Session Break and Transition to Exhibit Hall
12:15 pm Refreshment Break in the Exhibit Hall with Poster Viewing


Michael Stapleton, PhD, Managing Director, Life Sciences, Accenture
1:15 pm

How Digital Evolution and an Attitudinal Revolution are Re-Shaping the Future of the Life Sciences Industry

Nimita Limaye, PhD, Research Vice President, Life Sciences R&D Strategy and Technology, IDC

The world has rapidly transitioned to a model of disaggregated care and decentralized clinical trials, with a heightened focus on patient-centricity. Digital resiliency has become the priority and discretionary spend on R&D platforms has been delayed. Federated-learning models are fueling co-innovation and GPU-powered transformer models are accelerating drug discovery. Technology is enabling access and equity. The borders between healthcare and life sciences are blurring and real-world data is being leveraged to drive a precision medicine strategy.

1:50 pm

All of Us Research Program – Seeking To Advance Precision Health for All Populations

Joshua Denny, MD, MS, CEO, All of Us Research Program, National Institutes of Health

The All of Us Research Program launched on May 6, 2018, and currently has over 375,000 participants who have contributed biospecimens, health surveys, and a willingness to share their EHR. Participants are partners in the program and receive research results from data they contribute, including genetic ancestry and traits. In the future, participants will also receive health-related genomic results from whole genome sequencing. In May 2020, the program launched the beta version of the Researcher Workbench. Once researchers register and are approved to use the workbench, they can access individual-level data and a suite of tools to analyze these data. All of Us is committed to catalyzing a robust ecosystem of researchers and providing a rich dataset that drives discovery and improves health.

2:30 pm Refreshment Break in the Exhibit Hall with Poster Viewing
3:00 pm Chairperson's Remarks
3:05 pm

Artificial Intelligence-Enabled De Novo Design of Novel Compounds That Are Synthesizable

Govinda R Bhisetti, PhD, Principal Investigator & Head, Computational Chemistry, Biogen

The development of computer-aided de novo design methods to quickly discover novel compounds for treating human diseases has been of interest to drug discovery scientists for the past three decades. Early efforts concentrated mostly on generating molecules that fit the active site of the target protein by sequentially building a molecule atom-by-atom and/or group-by-group while exploring all possible conformations to optimize binding interaction with the target protein. In recent years, deep learning approaches have been applied to generate molecules that are iteratively optimized against a binding hypothesis (to optimize potency) and predictive models of drug-likeness (to optimize properties). Synthesizability of molecules generated by these de novo methods remains a challenge. This review will focus on the recent development of synthetic planning methods that are suitable for enhancing the synthesizability of molecules designed by de novo methods.

3:35 pm

Accelerating Drug Discovery by Embedding Biomedical Knowledge Graphs into Research Workflows

Timothy J. Schultz, PhD, Principal Scientist, Data Science, Janssen Pharmaceuticals, Inc.

Biomedical knowledge graphs have exploded in popularity in recent years, providing a framework for integrating and querying insights that have conventionally been siloed. What was once a novelty has evolved into a powerful commodity for accelerating drug discovery by enabling unprecedented views of biology. Janssen has conceptualized knowledge graphs beyond the convention of standalone network visualizations and instead embedded them within research workflows, facilitating adoption across its therapeutic areas. Recent use cases will be highlighted to demonstrate the platform’s (1) extensibility in performing diverse analyses through its API, (2) integration with Jupyter Notebooks, and (3) coupling with raw biology to surface novel targets.

4:05 pm Refreshment Break in the Exhibit Hall with Poster Viewing
Archna Bhandari, Executive Vice President, Data and Analytics,

Data is critical for biopharma organizations, but this data often comes in the form of language – unstructured and varied across sources – making it difficult to process and gain insights at scale. This session will dive into how you can automate human-like understanding of language data, allowing large quantities of scientific literature, clinical studies, and medical reports to be analyzed. Reduce the hours spent reading documentation to empower your organization with accelerated drug discovery and development efforts and cutting-edge medical innovations.

Shimon Ben-David, CTO, Engineering, WekaIO

GPUs and AI/ML are used in multiple markets to accelerate results or to enable workflows that would not otherwise have been possible. At the core of these use cases is a requirement for massive amounts of data, and the need to give researchers the agility to conduct experiments and research. This session will review a use case of AI/ML for drug discovery and how it is enabled by the WekaIO data platform.

5:35 pm Networking Reception in the Exhibit Hall with Poster Viewing
6:35 pm Close of Day

Wednesday, September 22

7:30 am Registration Open
8:00 am Interactive Discussions

Interactive Discussions are informal, moderated discussions, allowing participants to exchange ideas and experiences and develop future collaborations around a focused topic. Each discussion will be led by a facilitator who keeps the discussion on track and the group engaged. For in-person events, the facilitator will lead from the front of the room while attendees remain seated. For virtual attendees, the format will be in an online networking platform. To get the most out of this format, please come prepared to share examples from your work, be a part of a collective, problem-solving session, and participate in active idea sharing. Please visit the website's Interactive Discussions page for a complete listing of topics and descriptions.

9:00 am Coffee Break in the Exhibit Hall with Poster Viewing


9:45 am Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Bio-IT World Conference & Expo

9:50 am

Chairperson's Remarks

Michael Liebman, PhD, Managing Director, IPQ Analytics, LLC
9:55 am

Unlocking the Value of R&D Data by Implementing a Semantic Knowledge Graph

Sabine Schefzick Jalaie, PhD, Director Advanced Analytics Platform, Science & Clinical Analytics & Analytic Innovation, Pfizer Inc.

This talk will summarize Pfizer's Knowledge Graph journey and highlight their lessons learned. By leveraging Knowledge Graph technology to make R&D data FAIR, Pfizer can surface actionable and meaningful insights and make knowledge accessible to and consumable by domain users. Pfizer's KG implementation helps accelerate R&D data discovery and understanding. It supports data scientists, researchers, scientists, and clinical and project leads in bringing medicine to the market faster by enabling advanced analytics on demand.

10:25 am

Improving Medical Phenotypes: Are Big Data and EHRs Enough?

Michael Liebman, PhD, Managing Director, IPQ Analytics, LLC
Jonathan Morris, MD, Vice President, Provider Solutions; Chief Medical Informatics Officer, Real World Insights, IQVIA
Nick Sarlis, MD, PhD, Executive Medical Advisor, The Lynx Group, LLC

An accurate medical phenotype, i.e., diagnosis, is essential to optimize patient treatment and outcome, to improve drug development and clinical trial performance, and to refine reimbursement policies. Advanced analytics can provide insights, but what matters more is the quality, not the quantity, of both the data and the clinical questions to be addressed. Examples using multiple sclerosis, heart failure, and triple-negative breast cancer will be discussed.

Thomas Hasaka, PhD, Scientific Account Manager, Genedata

By automating HCS image analysis across Biopharma R&D, the Bio-IT award-winning software Genedata Imagence enables scientists to apply deep learning methods to simple and complex phenotypic imaging assays. Intuitive and interactive data representations enable scientists to train neural networks within minutes for fast, automated analysis of production screens. The scalable enterprise architecture of Genedata Imagence can use on-premises compute and cloud resources to serve a global organization.

Yev Monisova, Manager, Life Sciences Practice, Kanda Software

There is a large amount of data generated outside of clinical trials. Real World Evidence (RWE), the analytical insights from Real World Data (RWD), plays an increasingly important role for Life Science companies both internally and with external stakeholders. However, data silos and interoperability challenges can limit the usability of data. With the right technology and ML/AI analytics infrastructure, different stakeholders can generate powerful RWE insights and visualizations.

11:55 am Session Break and Transition to Luncheon Presentation
Greg Mazzu, Strategic Accounts Manager, Sales, WekaIO

Health and Life Sciences organizations’ storage systems, infrastructure and applications have grown organically as new data sources and associated workflows were developed to meet demands of innovative projects like genomic sequencing, drug discovery, and Cryo-EM.

This session will discuss:

- Notes from the field: Overcoming legacy environments

- Leveraging flexibility in hybrid-cloud drug discovery

- Simplifying data ownership at any scale whether on-prem, hybrid or cloud-native




12:55 pm Session Break and Transition to Exhibit Hall
1:10 pm Refreshment Break in the Exhibit Hall with Poster Viewing



Trends from the Trenches

Panel Moderator:
Kevin Davies, PhD, Executive Editor, The CRISPR Journal; Founding Editor, Bio-IT World

Since 2010, the “Trends from the Trenches” presentation, given by Chris Dagdigian, has been one of the most popular annual traditions on the Bio-IT Program. The intent of the talk is to deliver a candid (and occasionally blunt) assessment of the best, the worthwhile, and the most overhyped information technologies (IT) for life sciences. The presentation has helped scientists, leadership, and IT professionals understand the basic topics related to computing, storage, data transfer, networks, cloud, data science, and machine learning that are involved in supporting data-intensive science. In 2021, Chris will give the “Trends from the Trenches” presentation in its original “state-of-the-state address” format, followed by guest speakers giving podium talks on relevant topics. An interactive, moderated Q&A discussion with the audience follows. Come prepared with your questions and commentary for this informative and lively session. To stay connected with Trends from the Trenches updates after today and all year, sign up for BioTeam's newsletter.

Chris Dagdigian, Senior Director, BioTeam, Inc.
Fernanda S. Foertter, PhD, Director of Applications, NextSilicon
Karl Gutwin, PhD, Director, Software Engineering Services, BioTeam, Inc.
Adam Kraut, Director Infrastructure & Cloud Architecture, BioTeam, Inc.
3:30 pm Refreshment Break in the Exhibit Hall with Poster Viewing


4:00 pm Chairperson's Remarks
4:05 pm

AI Applications in Drug Development Using Real-World Data

Xiong Sean Liu, PhD, Director, Data Science & Artificial Intelligence, Novartis

Advancements in AI technologies, such as machine learning and deep learning, have provided new strategies to analyze large, multidimensional real-world data (RWD). I will provide an overview of the drug development studies that use both AI and RWD based on a review of articles from the past 20 years. I will also discuss current research gaps and future opportunities.

4:35 pm

Vaxi-DL: A Web-based Deep Learning Server to Identify Potential Vaccine Candidates

Kamal Rawal, PhD, Associate Professor, Bioinformatics & Computational Biology, Amity University

As infectious diseases such as COVID-19 rage across the world, the demand for new vaccines is at an all-time high. Current vaccine candidate identification approaches are ill-suited to finding vaccine candidates at the whole-proteome level. Vaxi-DL combines the strengths of deep learning systems, immunoinformatics, and text mining algorithms to achieve a combination of high speed, sensitivity, and accuracy in predicting vaccine candidates. Antigen identification is an important step in the vaccine development process. Here we present Vaxi-DL, a web-based deep learning (DL) software that evaluates the potential of individual protein sequences as vaccine target antigens. It is designed to predict vaccine candidates for bacteria, protozoa, fungi, and viruses that cause infectious diseases in humans. The average validation accuracies obtained from the five iterations of the bacterial, protozoan, fungal, and viral models are 93%, 96%, 95%, and 92%, respectively.

5:35 pm Close of Conference
