Original Agenda
We are actively working with our speakers to confirm their availability for our new dates. Initial response from our speakers has been very positive, and we are optimistic we will have the new programs ready to share here soon.

Data and Metadata Management

With the increased demand for computing power from life science researchers and scientists tackling big data issues, storage and infrastructure must be able to scale to handle billions of data points and files efficiently. The challenge is administering data so that information can be integrated, accessed, shared, linked, analyzed, and maintained to best effect across the organization. The Data and Metadata Management track will explore how to manage workflows with data and metadata without rerunning everything, while still handling data updates and new versions of the software. We will also explore how to associate the processed data and features with the raw data for analysis purposes.

Final Agenda

Monday, April 20

9:00 am - 5:00 pm Hackathon*

*Pre-registration required.

Tuesday, April 21

7:30 am Workshop Registration Open and Morning Coffee

8:30 am - 3:30 pm Hackathon*

*Pre-registration required.

8:30 - 11:30 am Recommended Morning Pre-Conference Workshops*

W1. Data Management for Biologics: Registration and Beyond

Diana Bowley, Business Relationship Manager, Biologics, AbbVie

Benjamin Li, Head of IT RDM Biological Sample Production & Management, Boehringer Ingelheim

Yuan Lin, Senior Manager, Pfizer Digital, Pfizer

Sebastian Schlicker, Director, Biologics Business, Genedata, Basel, Switzerland

Monica Wang, PhD, Principal Bioinformatics Architect, Project and Program Manager, Global Research IT, Takeda

12:30 - 3:30 pm Recommended Afternoon Pre-Conference Workshops*

W10. Data Science Driving Better Informed Decisions

Meghan Raman, Head, R&D Data Lake and Analytics, Bristol-Myers Squibb

Nigel Greene, PhD, Director, Head, Data Science and Artificial Intelligence, Imaging and Data Analytics, Clinical Pharmacology and Safety Sciences, AstraZeneca

Farhan (CJ) Hameed, MD, MS, VP, Global Real World Data – Strategy, Analytics & Insights (GRWD-SAI), Analytics, Informatics & Business Intelligence (AIBI), Pfizer Digital

*Separate registration required.

2:00 - 6:30 Main Conference Registration Open


4:00 Welcome Remarks

Cindy Crowninshield, RDN, LDN, Executive Event Director, Cambridge Healthtech Institute




4:05 Keynote Introduction

4:15 PLENARY KEYNOTE PRESENTATION: NIH’s Strategic Vision for Data Science

Susan K. Gregurick, PhD, Associate Director, Data Science (ADDS) and Director, Office of Data Science Strategy (ODSS), National Institutes of Health





Rebecca Baker, PhD, Director, HEAL (Helping to End Addiction Long-term) Initiative, Office of the Director, National Institutes of Health





5:00 - 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing



Wednesday, April 22

7:30 am Registration Open and Morning Coffee


8:00 Welcome Remarks

Allison Proffitt, Editorial Director, Bio-IT World




8:05 Keynote Introduction

8:15 Toward Preventive Genomics: Lessons from MedSeq and BabySeq

Robert Green, MD, MPH, Professor of Medicine (Genetics) and Director, G2P Research Program/Preventive Genomics Clinic, Brigham & Women’s Hospital, Broad Institute, and Harvard Medical School




8:45 PANEL DISCUSSION: Game On: How AI, Citizen Science, and Human Computation Are Facilitating the Next Leap Forward

Seth Cooper, PhD, Assistant Professor, Khoury College of Computer Sciences, Northeastern University






Lee Lancashire, PhD, Chief Information Officer, Cohen Veterans Bioscience






Pietro Michelucci, PhD, Director, Human Computation Institute






Jérôme Waldispühl, PhD, Associate Professor, School of Computer Science, McGill University






While the precision medicine movement augurs for better outcomes through targeted prevention and intervention, those ambitions entail a bold new set of data challenges. Various panomic and traditional data streams must be integrated if we are to develop a comprehensive basis for individualized care. However, deriving actionable information requires complex predictive models that depend on the acquisition and integration of patient data on a massive scale. This picture is further complicated by new data streams emerging from quantified self-tracking and health social networks, both of which are driven by experimentation-feedback loops. Tackling these issues may seem insurmountable, but recent advancements in human/AI partnerships and crowdsourcing science add a new set of capabilities to our analytic toolkit. This talk describes recent work in online collective systems that combine human and machine-based information processing to solve biomedical data problems that have been otherwise intractable, and an information processing ecosystem emerging from this work that could transform the landscape of precision medicine for all stakeholders.


9:45 Coffee Break in the Exhibit Hall with Poster Viewing


10:50 Organizer’s Welcome Remarks

Cambridge Healthtech Institute

10:55 Chairperson’s Remarks

11:00 Building a Toolkit for FAIR Implementation by Life Science Industry

Ian Harrow, PhD, Consultant Project Manager and Manager, FAIR Implementation and Ontologies Mapping Project, Pistoia Alliance

We report on building a new toolkit to help the life science industry implement the FAIR (Findable, Accessible, Interoperable, Reusable) principles for data management and stewardship. It provides practical support by bringing together relevant methods for tools, training, and managing change, which are illustrated by use cases mostly from the life science industry. These elements are assembled as one user-friendly and freely accessible website.

11:20 Normalizing Adverse Events Terminologies for Text Processing

Qais Hatim, PhD, Computer Scientist, U.S. Food and Drug Administration

The FDA receives a high proportion of data as unstructured text. Natural Language Processing (NLP) is used to normalize information for further analysis or machine learning. A challenge is the variation in the ways that concepts are referred to. Although there are a number of open source terminologies, not all are designed for text processing. We will describe how to adapt terminologies and extend matching, e.g., for spelling or OCR errors. This work will highlight how to import a new ontology designed to agree with FDA standards in different fields such as drug label, NDA, etc.

11:40 A New Compound Platform for Enhanced Access to Chemical Space for Screening

Michael Lange, ML/AI Lead, R&D Informatics, Small Molecule Discovery Informatics, Roche

Over the last few years, the commercially available chemical space (with pharmaceutical relevance) has rapidly increased. Several providers today offer catalogs consisting of several hundred million screening compounds. We built a new compound platform to enable browsing, searching, selection, and ordering of compound sets from these libraries. The platform offers these capabilities by standardizing and preprocessing all molecules, calculating relevant properties, and enabling access to these libraries by combining fast structure-based search with property and metadata filters. This presentation will describe the overall architecture and highlight some of the challenges encountered during the implementation.

12:00 pm Sponsored Presentation (Opportunity Available)

12:30 Session Break

12:40 LUNCHEON PRESENTATION I: Petabytes Today, Exabytes Tomorrow: What Now?

Adam Marko, Scientific Solutions Lead, Engineering, Igneous

Data generation rates in the life sciences continue to increase. NGS, imaging data, and other scientific instruments and applications regularly stress storage systems, and few organizations are prepared to handle this challenge. Pieced-together solutions are unable to meet the needs of research. We will present how Igneous solutions can enable a storage management plan that allows organizations to simplify their NAS footprint and eliminate the need for heterogeneous applications and manual intervention.

1:10 Luncheon Presentation II (Sponsorship Opportunity Available)

1:40 Session Break


1:50 Chairperson’s Remarks

Brian Bissett, MBA, MSEE, FAC P/PM, IT Specialist, Hardware Engineering, US Government

1:55 Defending against the Persistence of Inevitability

Brian Bissett, MBA, MSEE, FAC P/PM, IT Specialist, Hardware Engineering, US Government

Most data breaches represent a systemic breakdown along multiple lines of both technical and human factors. While many factors can contribute to an unauthorized release, the effort necessary to protect against these factors is not equal. This discussion will take a holistic view of many security breaches, the breakdowns in fundamental security concepts that led to the breaches, and the factors of paramount consideration in protecting an enterprise.

2:15 Data Security and Governance for Biopharma

Jyotin Gambhir, MBA, CISM, Founder, SecureFLO

Governance provides a playbook for a biopharma company to manage security and privacy compliance. Good governance leads to a better managed goal and a focused IT environment. CyberHygiene today is critical for any company developing a drug or researching cures and trying to protect intellectual property, as well as subjects’ personal information. Regulations under FDA and FTC, as well as EU GDPR, can be complicated.

2:35 Dynamic Encryption and Watermarking of Genomic Sequencing Data to Facilitate Privacy-Preserving Ownership-Based Data Governance

Xiaowu Gai, PhD, Director, Bioinformatics; Associate Professor, Clinical Pathology, Pathology & Laboratory Medicine, Children’s Hospital of Los Angeles

To facilitate privacy-preserving ownership-based data governance, we developed two novel algorithms which can be used to implement flexible fine-grained protection of genomic data: a) dynamic privacy-preserving encryption of user-specified genomic regions; and b) ownership and utility-preserving watermarking of the sequencing data. This empowers individuals to control when, for how long, and for what purpose any portion of their genomic data is shared, all in an auditable manner.

2:55 Sponsored Presentation (Opportunity Available)

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing


4:00 Chairperson’s Remarks

Sanjay Joshi, Industry CTO, Healthcare, Dell EMC

4:05 PANEL DISCUSSION: Real-World Evidence (RWE): Data Provenance, Format, Ingest, Quality (Bias), Integration, Visualization, Transformation, Verification & Validation, and Implementation


Sanjay Joshi, Industry CTO, Healthcare, Dell EMC


Victoria Gamerman, PhD, Head of US Health Informatics and Analytics, Boehringer Ingelheim

Laura Goetz, FACS, MD, MPH, Assistant Clinical Professor, Department of Medical Oncology & Therapeutics Research, Division of Clinical Cancer Genomics, City of Hope Comprehensive Cancer Center

Kenna Mills Shaw, PhD, Executive Director, Institute for Personalized Cancer Therapy, MD Anderson Cancer Center

Kelly Zou, PhD, PStat®, ASA Fellow, Vice President, Head of Medical Analytics and Insights Research, Development and Medical, Upjohn Division, Pfizer, Inc.

The future of the intersection of healthcare and the life sciences will be data- and process-focused, not application- or software-focused. “Bringing the analytics to Data” is the challenge from an infrastructure and methods perspective. According to the FDA, Real-World Evidence (RWE) is defined as “the clinical evidence regarding the usage and potential benefits or risks of a medical product derived from analysis of Real-World Data (RWD): e.g., effectiveness or safety outcomes from an RWD source in randomized clinical trials or in observational studies.” Our topical, honest, and “real-world” panel will discuss the sources of RWD (EHR, Claims & Billing, Registries, Patient Reported Data, etc.) and their process implications for RWE and the future of clinical trials themselves.

5:05 Presentation to be Announced

5:20 Sponsored Presentation (Opportunity Available)






5:35 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing






6:45 End of Day

Thursday, April 23

7:30 am Registration Open and Morning Coffee


8:00 Organizer’s Remarks

Cindy Crowninshield, RDN, LDN, Executive Event Director, Cambridge Healthtech Institute




8:05 Awards Program Introduction

8:10 Benjamin Franklin Award and Laureate Presentation

J.W. Bizzaro, Managing Director, Bioinformatics.org





8:35 Bio-IT World Innovative Practices Awards

Allison Proffitt, Editorial Director, Bio-IT World




9:00 AI in Pharma: Where We Are Today and How We Will Succeed in the Future

Natalija Jovanovic, PhD, Chief Digital Officer, Sanofi Pasteur




9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced at 10:00




10:30 Organizer’s Remarks

Cambridge Healthtech Institute

10:35 Chairperson’s Remarks

10:40 Cascadia Data Discovery Initiative: Accelerating Health Innovation and Cancer Research through Collaboration, Data Sharing, and Data-Driven Research

Matthew Trunnell, Vice President and Chief Data Officer, Fred Hutchinson Cancer Research Center

11:10 The National Microbiome Data Collaborative: A FAIR Data Resource for Microbiome Research

Kjiersten Fagnan, PhD, Chief Informatics Officer, Data Science and Informatics Leader, DOE Joint Genome Institute, Lawrence Berkeley National Laboratory

Our multi-lab collaborative partnership will pilot an integrated, community-centric framework within 27 months to fully leverage existing microbiome data science resources and high-performance computing systems available within the DOE complex for data access, integration, and advanced analyses. In this talk I will cover some of the challenges in microbiome data sciences and how we aim to overcome these by creating a large, open-access repository of FAIR data.

11:40 Sponsored Presentation (Opportunity Available)

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Last Chance Poster Viewing


1:55 Chairperson’s Remarks

Kevin Davies, PhD, Executive Editor, The CRISPR Journal; Founding Editor, Bio-IT World



Chris Dagdigian, Co-Founder and Senior Director, Infrastructure, BioTeam, Inc.


Vivien Bonazzi, PhD, Chief Biomedical Data Scientist, Managing Director, Deloitte


Tim Cutts, PhD, Head, Scientific Computing, Wellcome Trust Sanger Institute


Kjiersten Fagnan, PhD, Chief Informatics Officer, Data Science and Informatics Leader, DOE Joint Genome Institute, Lawrence Berkeley National Laboratory


Matthew Trunnell, Vice President and Chief Data Officer, Fred Hutchinson Cancer Research Center


“Trends from the Trenches” will celebrate its 10th anniversary at Bio-IT! Since 2010, the “Trends from the Trenches” presentation, given by Chris Dagdigian, has been one of the most popular annual traditions on the Bio-IT program. The intent of the talk is to deliver a candid (and occasionally blunt) assessment of the best, the worthwhile, and the most overhyped information technologies (IT) for life sciences. The presentation has helped scientists, leadership, and IT professionals understand the basic topics related to computing, storage, data transfer, networks, and cloud that are involved in supporting data-intensive science. In 2020, Chris will give the “Trends from the Trenches” presentation in its original “state-of-the-state address” format, followed by guest speakers giving podium talks on relevant topics. An interactive, moderated Q&A discussion with the audience follows. Come prepared with your questions and commentary for this informative and lively session.

4:00 Close of Conference
