Cloud Computing

Cloud computing has become the platform enterprises use to analyze, store, process, explore, and share dynamic data. Data-intensive life science users, from biological researchers to biopharmaceutical organizations, regard it as both practical and necessary. As a result, adoption has been greater than anyone expected, and users continue to expand applications. Through case studies, the Cloud Computing track explores the rapid growth and progressive maturation of cloud as well as evolving provider and user experiences and challenges. These challenges include how to handle new equipment pumping data into the cloud when most internal tools run locally, how to reproduce processes and new workflows, how to accelerate research and find new ways to collaborate, how to remove data storage and processing bottlenecks, and how to make a significant business impact across R&D by enabling large-scale modeling and simulation. How will these activities be impacted by the evolution of quantum computing methods and, eventually, quantum computers?

Final Agenda

Monday, April 20

9:00 am - 5:00 pm Hackathon*

*Pre-registration required.

Tuesday, April 21

7:30 am Workshop Registration Open and Morning Coffee

8:30 am - 3:30 pm Hackathon*

*Pre-registration required.

8:30 - 11:30 am Recommended Morning Pre-Conference Workshops*

W5. Giving the Personalized Digital Health Ecosystem a FAIRshake

Amir Lahav, ScD, Digital Health Innovation Consultant

Avi Ma’ayan, PhD, Professor, Department of Pharmacological Sciences; Director, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai

12:30 - 3:30 pm Recommended Afternoon Pre-Conference Workshops*

W12. Cancer Genome Analysis

Jeffrey Rosenfeld, PhD, Manager, Biomedical Informatics Shared Resource and Assistant Professor of Pathology and Laboratory Medicine, Rutgers Cancer Institute of New Jersey; President, Rosenfeld Consulting LLC

*Separate registration required.

2:00 - 6:30 Main Conference Registration Open

4:00 Welcome Remarks

Cindy Crowninshield, RDN, LDN, Executive Event Director, Cambridge Healthtech Institute




4:05 Keynote Introduction

4:15 PLENARY KEYNOTE PRESENTATION: NIH’s Strategic Vision for Data Science

Susan K. Gregurick, PhD, Associate Director, Data Science (ADDS) and Director, Office of Data Science Strategy (ODSS), National Institutes of Health





Rebecca Baker, PhD, Director, HEAL (Helping to End Addiction Long-term) Initiative, Office of the Director, National Institutes of Health





5:00 - 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

Wednesday, April 22

7:30 am Registration Open and Morning Coffee

8:00 Welcome Remarks

Allison Proffitt, Editorial Director, Bio-IT World




8:05 Keynote Introduction

8:15 Toward Preventive Genomics: Lessons from MedSeq and BabySeq

Robert Green, MD, MPH, Professor of Medicine (Genetics) and Director, G2P Research Program/Preventive Genomics Clinic, Brigham & Women’s Hospital, Broad Institute, and Harvard Medical School




8:45 PANEL DISCUSSION: Game On: How AI, Citizen Science, and Human Computation Are Facilitating the Next Leap Forward

Pietro Michelucci, PhD, Director, Human Computation Institute






Additional Panelists to be Announced

9:45 Coffee Break in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)


10:50 Organizer’s Welcome Remarks

Cambridge Healthtech Institute

10:55 Chairperson’s Remarks

11:00 KEYNOTE PRESENTATION: The Road from Data Commons to Data Ecosystems: Challenges, Opportunities, and Emerging Best Practices

Robert Grossman, PhD, Frederick H. Rawson Distinguished Service Professor in Medicine and Computer Science; The Jim and Karen Frank Director, Center for Translational Data Science, University of Chicago

There are now several large-scale data commons supporting the biomedical research community and the beginnings of data ecosystems. In this talk, we discuss some of the emerging best practices around data ecosystems, as well as some of the challenges and opportunities. We also discuss some case studies of data commons and data ecosystems developed using the open-source Gen3 data platform.

11:30 Harnessing Cloud for Mega-Biobanks: Efficient Computing with Sensible Data Governance

Saiju Pyarajan, PhD, Director, Center for Data and Computational Sciences, VA Boston Healthcare System; Faculty, Harvard Medical School

12:00 pm Presentation to be Announced


12:15 Presentation to be Announced

12:30 Session Break

12:40 Luncheon Presentation I to be Announced

1:10 Luncheon Presentation II (Sponsorship Opportunity Available)

1:40 Session Break


1:50 Chairperson’s Remarks

John Dey, Senior HPC Systems Engineer, Fred Hutchinson Cancer Research Center

1:55 Evaluating Distributed Computing Infrastructures: An Empirical Study Comparing Hadoop Deployments on Cloud and Local Systems

Devipsita Bhattacharya, PhD, Assistant Professor, Information Security & Digital Forensics, University at Albany, State University of New York

This talk discusses current cloud-based distributed computing options and presents results from a study that compared the cost and performance of cloud systems and in-house deployments. It not only evaluates infrastructural choices but also proposes a metric framework that can serve as a baseline for researchers and practitioners examining distributed infrastructures on cloud.

2:25 Reproducible Software Stacks Everywhere - Cloud, Containers and OnPrem

John Dey, Senior HPC Systems Engineer, Fred Hutchinson Cancer Research Center

The Hutch uses EasyBuild (EB) to build software containers and all software for our compute cluster. Our software stack is published and citable, and we share our work with the global community of EasyBuild users.

2:55 Supercharging Transformation across the Life Science Value Chain

Shez Partovi, MD, Director, Healthcare, Life Sciences, Genomics, Amazon Web Services

Learn how life science companies are working with AWS to transform their organizations, from creating labs of the future to enhancing clinical trials with artificial intelligence and machine learning, to developing digital therapeutics. Finally, hear how this transformation helps organizations drive top-line revenue and improve patient experiences.

3:10 Sponsored Presentation (Opportunity Available)

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing


4:00 Chairperson’s Remarks

John Dey, Senior HPC Systems Engineer, Fred Hutchinson Cancer Research Center

4:05 Managing Large-Scale Lab Data in Cloud

Oleg Moiseyenko, Senior Cloud Architect, Scientific Computing Systems, Bristol-Myers Squibb

In days of high-speed internet and cloud computing, life sciences organizations still face significant difficulties moving laboratory data from labs to the cloud. These include, but are not limited to, the typical challenges associated with big data, as well as data validation, choosing an effective data ingest mechanism, establishing the right data tagging systems, indexing for metadata catalogs, applying proprietary encryption, systems qualification, petabyte-scale data enrichment, and data archival. This presentation highlights Bristol-Myers Squibb Company’s approach to these problems.

4:25 Mass Spectrometry Data Processing in the Cloud: From Biological Samples to Digital Results using Cloud Computing

Felipe Albrecht, PhD, Bioinformatics and Computer Scientist, Pharma Research and Early Development (pRED), Roche Diagnostics GmbH

In this presentation, we show how we are using cloud technologies for: (i) processing MS data through a cloud-based ETL pipeline for extracting features; (ii) analyzing MS data in a cloud environment using in-house tools developed for Windows systems; and (iii) storing and organizing MS data and its metadata in a cloud-based Data Lake. The Data Lake allows researchers to access and explore the MS data through a convenient web interface, using metadata and data parameters, as well as the extracted data features. Finally, we present how our system uses the extracted features for automatic System Suitability Test analysis, thus informing researchers about the status and expected performance of the MS instruments.

4:45 Single Point of Truth for Target Identification and Validation – In-House Hosted OpenTargets

Sean Liu, PhD, Global Head, Scientific Assets & Decision Support, Scientific Informatics, Takeda Pharmaceuticals

Takeda has joined the Open Targets consortium and has installed the Open Targets platform on an internal cloud platform to support our target discovery process. We will outline the motivation for this decision and give an overview of how we use the system and the benefits we derive from it.

5:05 Sponsored Presentation (Opportunity Available)


5:35 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

6:45 End of Day

Thursday, April 23

7:30 am Registration Open and Morning Coffee

8:00 Organizer’s Remarks

Cindy Crowninshield, RDN, LDN, Executive Event Director, Cambridge Healthtech Institute




8:05 Awards Program Introduction

8:10 Benjamin Franklin Award and Laureate Presentation

J.W. Bizzaro, Managing Director,




8:35 Bio-IT World Innovative Practices Awards

Allison Proffitt, Editorial Director, Bio-IT World




9:00 AI in Pharma: Where We Are Today and How We Will.

Natalija Jovanovic, PhD, Chief Digital Officer, Sanofi Pasteur




9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced at 10:00


10:30 Organizer’s Remarks

Cambridge Healthtech Institute

10:35 Chairperson’s Remarks

10:40 Umbrella: Infrastructure-Enabling FAIR Data Management

Ludovic Sternberger, PhD, Principal Scientist, pRED Informatics, Hoffmann-La Roche

We present a microservice and cloud-based pipeline designed to flexibly define, ingest, and automatically check data quality from complex, largely outsourced clinical studies. We will introduce a Kubernetes and Cloud platform agnostic solution that allows checking data quality at scale with dozens of external providers.

11:10 PEPSI-KOLA: A Data-Driven Approach to Find and Connect with Thought Leaders

Rishi Gupta, PhD, Research Scientist, Information Research, AbbVie, Inc.

Within a Life Science organization, Key Opinion Leaders, or KOLs, are generally influential physicians or academic researchers who are respected and acknowledged as leaders in a subject area. In this work, we will showcase the KOL application utilized by business functions within AbbVie that uses several structured and unstructured data sources to evaluate each thought leader in qualitative and quantitative ways with the goal of optimizing selection and management.

11:40 Sponsored Presentation (Opportunity Available)

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Last Chance Poster Viewing



1:55 Chairperson’s Remarks

Kevin Davies, PhD, Executive Editor, The CRISPR Journal, Mary Ann Liebert, Inc.


Chris Dagdigian, Co-Founder and Senior Director, Infrastructure, BioTeam, Inc.

Vivien Bonazzi, PhD, Chief Biomedical Data Scientist, Managing Director, Deloitte

Tim Cutts, PhD, Head, Scientific Computing, Wellcome Trust Sanger Institute

Kjiersten Fagnan, PhD, Chief Informatics Officer, Data Science and Informatics Leader, DOE Joint Genome Institute, Lawrence Berkeley National Laboratory

Matthew Trunnell, Vice President and Chief Data Officer, Fred Hutchinson Cancer Research Center

“Trends from the Trenches” will celebrate its 10th anniversary at Bio-IT! Since 2010, the “Trends from the Trenches” presentation, given by Chris Dagdigian, has been one of the most popular annual traditions on the Bio-IT program. The intent of the talk is to deliver a candid (and occasionally blunt) assessment of the best, the worthwhile, and the most overhyped information technologies (IT) for life sciences. The presentation has helped scientists, leadership, and IT professionals understand the basic topics related to computing, storage, data transfer, networks, and cloud that are involved in supporting data-intensive science. In 2020, Chris will give the “Trends from the Trenches” presentation in its original “state-of-the-state address” format, followed by guest speakers giving podium talks on relevant topics. An interactive, moderated Q&A discussion with the audience follows. Come prepared with your questions and commentary for this informative and lively session.

4:00 Close of Conference
