Original Agenda
We are actively working with our speakers to confirm their availability for the virtual event. Initial response from our speakers has been very positive, and we are optimistic we will have the new programs ready to share here soon.


The Bioinformatics track assembles thought leaders who will present case studies using computational resources and tools that discuss the problems and challenges of taking data from multiple -omics sources and aligning it with clinical action. Turning big data into smart data can lead to real-time assistance in disease prevention, prognosis, diagnostics, and therapeutics. With the ever-increasing volume of information generated for curing or treating diseases and cancers, bioinformatics technologies, tools, and techniques play a critical role in turning data into actionable knowledge to meet unstated and unmet medical needs. Case studies will be presented on addressing these problems and challenges, including making the jump from prototyping to production code, defining what a "validated" informatics pipeline means, how to balance agility needs with requirements to be consistent/compliant, pipeline and workflow frameworks, containerization for reproducibility, and more. How do your approaches deal with inconsistencies in definitions and meta-data across the multiple datasets that form the basis of big data?

Final Agenda


Monday, October 5

9:00 am - 5:00 pm Hackathon*

*Pre-registration required.

Tuesday, October 6

7:30 am Workshop Registration Open and Morning Coffee

8:30 am - 3:30 pm Hackathon*

*Pre-registration required.

8:30 - 11:30 am Recommended Morning Pre-Conference Workshops*

W3. Introduction to Data Visualization for Biomedical Applications

Nils Gehlenborg, PhD, Assistant Professor, Department of Biomedical Informatics, Harvard Medical School

12:30 - 3:30 pm Recommended Afternoon Pre-Conference Workshops*

W12. Cancer Genome Analysis

Jeffrey Rosenfeld, PhD, Manager, Biomedical Informatics Shared Resource and Assistant Professor of Pathology and Laboratory Medicine, Rutgers Cancer Institute of New Jersey; President, Rosenfeld Consulting LLC

*Separate registration required.

2:00 - 6:30 Main Conference Registration Open


4:00 Welcome Remarks

Cindy Crowninshield, RDN, LDN, Executive Event Director, Cambridge Healthtech Institute





4:05 Keynote Introduction

4:15 PLENARY KEYNOTE PRESENTATION: NIH’s Strategic Vision for Data Science

Susan K. Gregurick, PhD, Associate Director, Data Science (ADDS) and Director, Office of Data Science Strategy (ODSS), National Institutes of Health





Rebecca Baker, PhD, Director, HEAL (Helping to End Addiction Long-term) Initiative, Office of the Director, National Institutes of Health





5:00 - 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing



Wednesday, October 7

7:30 am Registration Open and Morning Coffee


8:00 Welcome Remarks

Allison Proffitt, Editorial Director, Bio-IT World




8:05 Awards Program Introduction

8:10 Benjamin Franklin Award and Laureate Presentation

J.W. Bizzaro, Managing Director, Bioinformatics.org





8:35 Bio-IT World Innovative Practices Awards

Allison Proffitt, Editorial Director, Bio-IT World




8:45 Keynote Introduction

8:55 PANEL DISCUSSION: Game On: How AI, Citizen Science, and Human Computation Are Facilitating the Next Leap Forward

Seth Cooper, PhD, Assistant Professor, Khoury College of Computer Sciences, Northeastern University






Lee Lancashire, PhD, Chief Information Officer, Cohen Veterans Bioscience






Pietro Michelucci, PhD, Director, Human Computation Institute






Jérôme Waldispühl, PhD, Associate Professor, School of Computer Science, McGill University






While the precision medicine movement augurs for better outcomes through targeted prevention and intervention, those ambitions entail a bold new set of data challenges. Various panomic and traditional data streams must be integrated if we are to develop a comprehensive basis for individualized care. However, deriving actionable information requires complex predictive models that depend on the acquisition and integration of patient data on a massive scale. This picture is further complicated by new data streams emerging from quantified self-tracking and health social networks, both of which are driven by experimentation-feedback loops. Tackling these issues may seem insurmountable, but recent advancements in human/AI partnerships and crowdsourcing science add a new set of capabilities to our analytic toolkit. This talk describes recent work in online collective systems that combine human and machine-based information processing to solve biomedical data problems that have been otherwise intractable, and an information processing ecosystem emerging from this work that could transform the landscape of precision medicine for all stakeholders.

9:45 Coffee Break in the Exhibit Hall with Poster Viewing


10:50 Organizer’s Welcome Remarks

Cambridge Healthtech Institute

10:55 Chairperson’s Remarks

11:00 KEYNOTE PRESENTATION: The Human Intelligence Revolution – How Collaboration, Data Sharing, and Human Intelligence Will Create a Healthier Future

Jeffrey Reid, PhD, Vice President, Head of Genome Informatics & Data Engineering, Regeneron

The future of medicine will be enabled by our understanding of genetic disease drivers. As the architects of the world’s largest database of genomic data paired with de-identified health records, the Regeneron Genetics Center (RGC) is aiming to improve human health through data innovation and collaborative knowledge sharing. Dr. Reid discusses initiatives like Project Glow and the RGC-UK Biobank exome consortium which enable the global scientific community to tap key datasets and identify new, better ways of preventing and treating human disease.

11:30 Precision Cancer Medicine

Jeffrey Rosenfeld, PhD, Manager, Biomedical Informatics Shared Resource and Assistant Professor of Pathology and Laboratory Medicine, Rutgers Cancer Institute of New Jersey; President, Rosenfeld Consulting LLC

This presentation will illustrate the current methods that are used for determining the precise treatment of cancer rather than the standard chemotherapy methods.

12:00 pm Sponsored Presentation (Opportunity Available)

12:30 Session Break

12:40 LUNCHEON PRESENTATION I: Advancing Precision Medicine with a Complete Bioinformatics Ecosystem

Brandi Davis-Dusenbery, PhD, CSO, Seven Bridges

1:10 LUNCHEON PRESENTATION II: A Network Polypharmacology Approach to Diffuse Intrinsic Pontine Glioma

Finlay MacLean, MSc, Data Scientist, Elsevier

Network medicine promises to be a potential linchpin in oncological drug repurposing. We developed a multi-scale heterogeneous knowledge graph spanning genomics, epigenetics, transcriptomics, and proteomics. We implemented a random walk, generated dense vector representations of the neighbourhoods (or interactomes) of key nodes, and used these in downstream supervised machine learning tasks. Leveraging Entellect, we plan to use the models in our collaboration with the University of Zurich to suggest potential DIPG drug repurposing candidates.

1:40 Session Break


1:50 Chairperson’s Remarks

1:55 Building an Artificial Intelligence-Based Vaccine Discovery System – Applications in Infectious Diseases & Personalized Neoantigen-Related Immunotherapy for Treatment of Cancers

Kamal Rawal, PhD, Associate Professor, Amity University, India; Adjunct Faculty, Baylor College of Medicine, Houston, USA

Infectious disease affects several million individuals all over the world, particularly in developing countries. We have built a bioinformatics pipeline that combines reverse vaccinology tools, a network biology system, and text mining algorithms to analyze the proteomes of pathogens and rank proteins by their propensity to be an optimal vaccine candidate. Our system compares various machine learning approaches such as support vector machines, neural networks, ensemble learning, and decision trees.

2:25 Flexible Platform for Providing Broad-Based Bioinformatics Service

Ethan Yaoyu Wang, PhD, Senior Research Scientist, Department of Biostatistics, Harvard T.H. Chan School of Public Health

We introduce CNAP, a flexible cloud-based framework for distributing bioinformatics pipelines as a web-service application to non-technical researchers with limited computational support. CNAP is particularly suited to research communities with limited computational resources and bioinformatics personnel that need to provide broad-based support for projects with a wide range of computational requirements and dataset sizes.

2:55 Sponsored Presentation (Opportunity Available)

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing


4:00 Chairperson’s Remarks

4:05 BLAST, Pipelines and FAIR on the Cloud

Thomas Madden, PhD, Staff Scientist, NCBI/NLM/NIH

A sequence similarity search often provides essential information about a DNA or protein sequence. With the rapidly expanding use of high-throughput sequencing, a few issues may occur for BLAST users. First, the need for searches may come in bursts, with many searches needing to be done at once and current resources unable to handle the load. Second, many searches are now part of a pipeline, which can be a powerful multiplier for bioinformatics tools, but is not straightforward to maintain if it does not conform to FAIR (Findable, Accessible, Interoperable, and Reusable) principles. The cloud can help with the first problem, allowing a user to scale up the computational resources per their specific needs. Containerization and formal pipeline languages can help with the second issue, making pipelines more reproducible and easier to maintain. We discuss a containerized version of BLAST, usage with CWL, and a cloud infrastructure that includes databases hosted on cloud providers.


4:35 Data Integration Expectation Maps Project

ClarLynda Williams-DeVane, PhD, Chair, Data Science and Bioinformatics, Fisk University

5:05 mTOR System: A Database for Systems-Level Biomarker Discovery in Cancer

Iman Tavassoly, MD, PhD, Physician-Scientist, Mount Sinai Institute for Systems Biomedicine, Icahn School of Medicine at Mount Sinai

mTOR System is a database I have designed for exploring biomarkers and systems-level data related to the mTOR pathway in cancer. The database consists of different layers of molecular markers and the quantitative parameters assigned to them through current mathematical models. It is an example of merging systems-level data with mathematical models for precision oncology.



5:35 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing


6:45 End of Day

Thursday, October 8

7:30 am Registration Open and Morning Coffee


8:00 Organizer’s Remarks

Cindy Crowninshield, RDN, LDN, Executive Event Director, Cambridge Healthtech Institute




8:15 Toward Preventive Genomics: Lessons from MedSeq and BabySeq

Robert Green, MD, MPH, Professor of Medicine (Genetics) and Director, G2P Research Program/Preventive Genomics Clinic, Brigham & Women’s Hospital, Broad Institute, and Harvard Medical School




9:00 AI in Pharma: Where We Are Today and How We Will Succeed in the Future

Natalija Jovanovic, PhD, Chief Digital Officer, Sanofi Pasteur




9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced at 10:00




10:30 Organizer’s Remarks

Cambridge Healthtech Institute

10:35 Chairperson’s Remarks

10:40 Powering Question-Driven Problem Solving to Improve the Chances of Finding New Medicines

Samiul Hasan, PhD, Scientific Analytics and Visualization Director, Data and Computational Sciences, GlaxoSmithKline

Making true “molecule”-“mechanism”-“observation” relationship connections is a time-consuming, iterative, and laborious process. In addition, it is very easy to miss critical information that affects key decisions or helps make plausible scientific connections. The current practice for deciphering such relationships frequently involves subject matter experts (SMEs) asking resource-constrained data science departments to refine and redo highly similar ad hoc searches. The result is impairment of both the pace and quality of scientific reviews. In this presentation, I show how semantic integration can ultimately become part of an integrated learning framework for more informed scientific decision-making. I will take the audience through our pilot journey and highlight practical learnings that should inform subsequent endeavors.

11:10 Computational Efforts on Drug Repurposing for Rare Diseases

Bin Li, PhD, Director, Computational Biology, Takeda Pharmaceutics

We conducted in silico screens to repurpose >100 compounds for ~4000 rare disease indications. Various data types were utilized (protein-protein interaction networks, pathways, disease-driven genes, competitive intelligence, etc.), and different computational methods were implemented and evaluated. Some biologically interesting drug/disease pairs were observed.

11:40 Presentation to be Announced (Accenture)

12:10 pm Session Break

12:20 Luncheon Presentation I to be Announced (Deloitte ConvergeHealth)

12:50 Luncheon Presentation II (Schrodinger)


1:20 Dessert Refreshment Break in the Exhibit Hall with Last Chance Poster Viewing


1:55 Chairperson’s Remarks

2:00 Pediatric Cell Atlas: Using Single-Cell Technology to Understand Childhood Health and Disease

Deanne Taylor, PhD, Director of Bioinformatics, DBH, Children’s Hospital of Philadelphia

2:30 Scaling scRNASeq Visualization to Unlimited Datasets with Cellxgene Gateway

Alok Saldanha, PhD, Technical Associate Director, NIBR Informatics, Novartis Institutes for Biomedical Research

Cellxgene Gateway is an open source tool (https://github.com/Novartis/cellxgene-gateway) that allows you to use the Cellxgene Server provided by the Chan Zuckerberg Initiative with multiple datasets. I will introduce this tool in the context of a typical single-cell RNA-Seq analysis workflow, and touch on deployment issues in an enterprise cloud with a budget.

3:00 Presentation to be Announced

John Quackenbush, PhD, Henry Pickering Walcott Professor of Computational Biology and Bioinformatics; Chair, Department of Biostatistics, Harvard T.H. Chan School of Public Health



3:30 Embedding Single-Cell RNA-Seq Profiles in Non-Euclidean Spaces

Jiarui Ding, PhD, Postdoctoral Researcher, Aviv Regev’s Lab, Broad Institute of MIT and Harvard

Single-cell RNA-Seq has become an invaluable tool for studying biological systems in health and disease. We introduce scPhere, a scalable deep generative model that embeds cells into low-dimensional hyperspherical or hyperbolic spaces as a more accurate representation of the data. scPhere resolves cell crowding, corrects multiple, complex batch factors, facilitates interactive visualization of large datasets, and gracefully uncovers pseudotemporal trajectories.

4:00 Close of Conference
