2018 Archived Content
Track 9: Data Visualization and Exploration Tools

As data-generating technologies become commonplace in research and drug discovery labs, data visualization is increasingly necessary for gaining insight from big data sets. It is also more important than ever to develop data visualization and exploration tools alongside the rest of the analytics, rather than later in the game. Track 9 will address ways to develop, design, and implement visualization tools in genomics, drug discovery, clinical development, and translational research, and will present real-world case studies where these tools have been used successfully.

Tuesday, May 15

7:00 am Workshop Registration Open (Commonwealth Hall) and Morning Coffee (Foyer)

8:00-11:30 Recommended Morning Pre-Conference Workshops*

W5. Data Visualization to Accelerate Biological Discovery

12:30-4:00 pm Recommended Afternoon Pre-Conference Workshops*

W11. Data Science Driving Better Informed Decisions

* Separate registration required.

2:00-6:30 Main Conference Registration Open (Commonwealth Hall)

4:00 PLENARY KEYNOTE SESSION (Amphitheater & Harborview 2)

5:00-7:00 Welcome Reception in the Exhibit Hall with Poster Viewing (Commonwealth Hall)

Wednesday, May 16

7:00 am Registration Open (Commonwealth Hall) and Morning Coffee (Foyer)

8:00 PLENARY KEYNOTE SESSION (Amphitheater & Harborview 2)

9:45 Coffee Break in the Exhibit Hall with Poster Viewing (Commonwealth Hall)


10:50 Chairperson’s Remarks

Melissa Landon, PhD, Regional Director, Applications Science, Northeast; Director, Education, Schrödinger

11:00 Lineage: Visualizing Multivariate Clinical Data in Genealogy Graphs

Alexander Lex, PhD, Assistant Professor, SCI Institute, School of Computing, University of Utah

The majority of diseases that pose a significant challenge for public and individual health are caused by a combination of hereditary and environmental factors. In this paper, we introduce Lineage, a novel visual analysis tool designed to support domain experts who study such multifactorial diseases in the context of genealogies. Incorporating familial relationships between cases can provide insights into shared genomic variants that could be implicated in diseases, but also into shared environmental exposures.

11:30 Creating Population Health Informatics Using Tableau

Frank Wang, Clinical Assistant Professor, Health Informatics and Health Sciences, Sacred Heart University

Population Health Management and Accountable Care Organizations (ACO) are key initiatives under the Medicare Access and CHIP Reauthorization Act of 2015 (MACRA). CMS has developed guidelines to reward quality of care and better outcomes. This talk will review CMS data sets (Medicare Utilization and Reimbursement, Part B and Part D pharmaceutical products usage, and clinical quality improvement) and demonstrate how to use Tableau to derive actionable insights in healthcare informatics and analytics.


12:00 pm Where Consumers Meet Biology: PatientsLikeMe

Kim Goodwin, Senior Advisor, Consumer Strategy, PatientsLikeMe

Expanding application of biological data in both precision medicine and consumer wellness means patients are both contributors to our datasets (in the form of patient-reported information) and consumers of the results. How do you structure data for consumer report and consumption, as well as computability? How do you show biological data in a way that’s relevant to patients? Kim Goodwin will share work in progress from PatientsLikeMe.

12:30 Session Break

12:40 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:40 Session Break


1:50 Chairperson’s Remarks

Baisong Huang, Principal Statistical Analyst, Novartis Institutes for BioMedical Research

1:55 Scalable Visualization and Exploration Tool for Single-Cell Genomics Data

Marcin Tabaka, PhD, Postdoctoral Associate, Regev Lab, Broad Institute of MIT and Harvard

We developed an interactive tool for the visualization and exploratory analysis of massive single-cell omics data. Users can visualize a billion cells on a personal computer, show the density of cells or gene expression values on a 2D embedding, plot gene expression profiles for selected groups of cells, and visualize and annotate clusters of cells.

2:15 CanvasXpress: A Versatile Interactive High-Resolution Scientific Multi-Panel Visualization Toolkit

Baohong Zhang, Director of Clinical Bioinformatics, Precision Medicine, Pfizer, Inc.

CanvasXpress (http://canvasxpress.org) and CanvasDesigner (https://baohongz.github.io/canvasDesigner) were developed as the core visualization component for bioinformatics and systems biology analysis at Pfizer and Bristol-Myers Squibb. They have been further enhanced by scientists around the world and serve as a key visualization engine for many popular bioinformatics tools. The toolkit offers a rich set of interactive plots to display scientific and genomics data, such as oncoprints of cancer mutations, heatmaps, and 3D scatter, violin, radar, and profile plots.

2:25 MJFF Initiative for Open Source PD Research and Data Integration

Luba Smolensky, Director, Data Science & Analytics, The Michael J. Fox Foundation

As the world’s largest nonprofit funder of Parkinson’s research, The Michael J. Fox Foundation (MJFF) is dedicated to accelerating a cure for Parkinson’s disease and improved therapies for those living with the condition today. MJFF is leading a Parkinson’s research data curation and standardization effort that will accelerate insights into the disease. The goal is to provide access to curated datasets across platforms for all researchers across academia, public institutions, and industry.

2:55 Collaborative Drug Discovery with LiveDesign: Integrated Computational Chemistry and Cheminformatics

Melissa Landon, PhD, Regional Director, Applications Science, Northeast; Director, Education, Schrödinger

Drug discovery has become increasingly dependent upon a plethora of computational tools and data, requiring collaboration across computational and medicinal chemistry project teams for ideation, querying, and project management. Herein we present LiveDesign, a highly collaborative web-based platform for workflow management that brings computational modeling together with experimental data and informatics, illustrated with real-world examples.

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing (Commonwealth Hall)


4:00 Chairperson

Sanjay Joshi, Chief of Technology, Healthcare and Life Sciences, H20.zi

4:00 Enhancing Precision Wellness with Knowledge Graphs and Semantic Analytics

James Hendler, PhD, Tetherless World Chair of Computer, Web and Cognitive Sciences; Director, RPI-IBM Center for Health Empowerment by Analytics, Learning and Semantics, Institute for Data Exploration and Applications, Rensselaer Polytechnic Institute

Today’s patients, clinicians and researchers have gone from a world of too little data to one of too much. Discovering relevant information, integrating it from multiple sources, deciding what to believe, and exploring alternative treatments are all challenges that go beyond many of today’s medical support systems. In this talk, we explore how AI and machine learning can be used by healthcare providers and consumers to better understand and overcome health challenges.

4:30 Drug Discovery at Scale: Interpreting Biology with High-Dimensional Data

Peter McLean, PhD, Lead Data Scientist, Analysis, Recursion Pharmaceuticals

Recursion Pharmaceuticals leverages image-based cellular phenotyping for drug discovery by using computer vision and machine learning to turn biological questions into tractable data-science questions. Translating data science-derived conclusions back into a biological or business framework introduces its own challenges. Here, I will introduce some of Recursion’s approaches for addressing the challenges around distilling computational models of high-dimensional cellular morphology - across hundreds of disease and treated states - into interpretable, actionable data.

5:00 Towards Tailor-Made Drugs with AI-Driven Drug Design

Ton van Daelen, PhD, BIOVIA

Enabling pharma, biotech, and agrichemical businesses to more efficiently produce safe, efficacious medicines and agents is key to improving productivity and competitiveness. Recent advances in machine learning methods and artificial intelligence (AI) have shown great promise in bringing true automation to this process and the potential for these tools to finally become mainstream. By leveraging both existing in-house and publicly available assay data, predictive models can be trained and then applied to rapidly design small-molecule or biologic therapeutic starting points in silico against desired target, anti-target, safety, and toxicity profiles in parallel. Once optimized, these in silico molecules are passed to the lab to be created and tested. Results from each “Virtual-to-Lab” cycle are then used to assess and retrain the predictive models ahead of the next design round. By tightly coupling the design cycle with predictive models, which are in turn tightly coupled to available assay results, research organizations can leverage all available data far more effectively than is currently possible. In turn, this helps research organizations discover drugs and chemicals not only faster, but with a much greater chance of meeting regulatory, business, and market requirements for a new product.

5:15 Architecting for Success with Machine Learning Data Platforms

Kurt Kuckein, Director, Marketing, DDN Storage

Machine learning is being applied to many aspects of precision medicine. Organizations with a vision for the future will realize how the data at the heart of their ML initiatives will require extensive scaling. This presentation reviews key considerations for creating and developing ML data platforms to ensure deeper insights, a shorter path to value, and capability for effortless scaling.

5:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing (Commonwealth Hall)


7:00-10:00 Bio-IT World After Hours @Lawn on D
**Conference Registration Required. Please bring your conference badge, wristband, and photo ID for entry.

Thursday, May 17

7:30 am Registration Open (Commonwealth Hall) and Morning Coffee (Foyer)

8:00 PLENARY KEYNOTE SESSION & Awards Program (Amphitheater & Harborview 2)

9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced (Commonwealth Hall)


10:30 Chairperson’s Remarks

Adnan Derti, PhD, Director of Translational Science, Surface Oncology Inc

10:40 Genomic Data Visualization for Large Heterogeneous Datasets

Peter Kerpedjiev, PhD, Postdoctoral Associate, Biomedical Informatics, Harvard Medical School

Visual tools bridge the gap between algorithmic data analysis approaches and the cognitive skills of investigators. Addressing this need has become crucial at a time when many studies are no longer driven by well-defined hypotheses but by the availability of vast amounts of genomic and other associated data. In this talk I will focus on common visualization techniques for genomic data and review their suitability for precision medicine data.

11:10 Target Gene Notebook: Connecting Genetics and Drug Discovery Through Computational and Logistical Tools

Mary Pat Reeve, Associate Director, Informatics, Eisai AiM Institute

Linking associations to functional biological information is essential to translating genetic insights into drug discovery; however, there is currently no way to maintain group curation of relevant genetic and functional information and to integrate it with proprietary experimental data. We introduce an e-notebook to facilitate the organization of results and provide freely available software, Target Gene Notebook, to assist therapeutic target evaluation and create durable institutional or public knowledge bases.

11:40 Q & A with Speakers

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing (Commonwealth Hall)

Beacon Hill

1:55 Chairperson’s Remarks

Peter Kerpedjiev, PhD, Postdoctoral Associate, Biomedical Informatics, Harvard Medical School

2:00 CO-PRESENTATION: Assessment of Disease and Relapse Using Remote Monitoring Technology

Maximilian Kerz, PhD, BRC Software Developer, Biostatistics & Health Informatics, King’s College London

Nikolay Manyakov, PhD, Principal Scientist, Janssen

Typically, disease progression is monitored during infrequent clinical visits, generating a sparse and subjective clinotype derived during periods of sickness. This subsequently leads to late intervention with modest outcomes. Continuous monitoring could help to generate a more objective, pervasive phenotype throughout the disease continuum. The EU Innovative Medicines Initiative's €25m major programme, RADAR-CNS (https://www.radar-cns.org/), is exploring the use of remote measurement technologies, utilizing smartphone sensors, consumer wearables, information about smartphone usage, and the experience sampling method to predict and avert negative outcomes through monitoring of current clinical states and assessment of future deterioration.

2:30 Developing Digital Biomarkers through Crowdsourcing

Larsson Omberg, Vice President, Systems Biology, Sage Bionetworks

The high-quality sensors embedded in the typical smartphone, coupled with the ease of gathering high-frequency data, are opening up new ways of tracking disease and performing participant-centered research. Building biomarkers from this data is, however, a nontrivial task. In this talk I will present our experience in collecting data from 20,000 participants to build disease biomarkers and how we engaged 400 researchers across the globe to enhance them.

3:00 Visualization Approaches for High Dimensional Data Found in Drug Discovery Screening

Peter Henstock, PhD, Senior Data Scientist, Pfizer

Screening imposes a bottleneck in the discovery phase of many institutions. It remains challenging since it encompasses many assay types and data sizes, but has little standardization. With the goal of extracting insight from the screening data, solutions span from Excel with macros through sophisticated cloud-based software. This presentation focuses on a few software platforms aimed at ensuring high screen quality and interpreting the hits for medium- to high-throughput screens.

3:30 IOBIO: Realtime Visualization and Analysis of Big Genomic Data

Chase Miller, Research Director, Center for Genetic Discovery, University of Utah

IOBIO is a web-based platform facilitating real-time analysis and visualization of large, remotely stored, distributed datasets. Real-time interaction makes it easier to explore and understand genomic data, which is often large, complex, and hard to access. We have developed several IOBIO web apps, including quality control analysis of genomic alignment and variant data, interrogation of potential disease-causing variants, and species identification and classification of raw sequencing data (see http://iobio.io).

4:00 Conference Adjourns
