Track 8: Data Visualization and Exploration Tools

With the sharp increase in the volume and complexity of big data sets in research and drug discovery labs, data visualization is needed to clearly express complex patterns. It is more important than ever to develop data visualization and exploration tools alongside the rest of the analytics, rather than later in the process. The Data Visualization & Exploration Tools track will address ways to develop, design, and implement visualization tools in genomics, drug discovery, clinical development, and translational research, and will also present real-world case studies where these tools have been used successfully.

Final Agenda

Tuesday, April 16

7:00 am Workshop Registration Open and Morning Coffee

8:00-11:30 Recommended Morning Pre-Conference Workshops*

W2. Data Visualization to Accelerate Biological Discovery

12:30-4:00 pm Recommended Afternoon Pre-Conference Workshops*

W9. Research Project Management

* Separate registration required.

2:00-6:30 Main Conference Registration Open


5:00-7:00 Welcome Reception in the Exhibit Hall with Poster Viewing

Wednesday, April 17

7:30 am Registration Open and Morning Coffee


9:45 Coffee Break in the Exhibit Hall with Poster Viewing

Back Bay

10:50 Chairperson’s Remarks

Simon Taylor, Vice President, Global Partners & Alliances, Lucidworks

11:00 Data-Driven Healthcare: Visual Analytics for Exploration and Prediction of Clinical Data

Adam Perer, PhD, Assistant Research Professor, School of Computer Science, Human-Computer Interaction Institute, Carnegie Mellon University

Healthcare institutions are now recording more electronic health data about patients than ever before. Many hope that if researchers tap into this real-world observational data, the collective experience of the healthcare system can be leveraged to unearth insights that improve the quality of care. My research focuses on building interactive visual systems that leverage machine learning so clinicians and researchers can derive such insights.

11:30 Interactive Concept Learning for Visual Exploration of Epigenetic Patterns

Fritz Lekschas, PhD Candidate, Hanspeter Pfister Lab, Computer Science, Harvard University

Epigenetic datasets contain rich sets of patterns, but searching for and exploring nonstandard patterns is often time-consuming, and visual feedback is needed to verify the results. I am going to present Peax, a new web-based tool for interactively training a classifier that learns your notion of interestingness and operates on deep learning-based unsupervised featurizations of the epigenetic datasets.

12:00 pm Data Analysis Without Borders: Eliminating Data Silos in Life Science R&D

Douglas Williams, Executive Director, Commercial Development, Riffyn, Inc.

Deeper insights into R&D results require close collaboration across teams. The cloud-based Riffyn software structures and links experimental designs and measurement data across organizational boundaries. Riffyn is helping R&D organizations use this capability to identify unexpected correlations, uncover root causes of error, and deliver right-first-time technology scale-up and technology transfer.

12:30 Enjoy Lunch on Your Own (Lunch Available for Purchase in Exhibit Hall)


Back Bay

1:50 Chairperson’s Remarks

Hector Corrada Bravo, PhD, Assistant Professor, Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park

1:55 Computational Steering of Interactive Exploratory Analysis of Genomics Data

Hector Corrada Bravo, PhD, Assistant Professor, Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park

Interactive visual analysis integrated with computational analyses has gained popularity in genomics. We have previously built interactive analysis systems that are efficient and effective for exploratory analyses of large datasets arranged over richly structured features. Here, we discuss the next generation of tools, where tighter integration between visualization and computation is used to guide and steer data analysts' exploration based on the results of computations of interest.

2:25 MERmaid: A WebGL-Based Tool for Exploring Spatially Resolved Single-Cell Transcriptomics Data

Jean Fan, PhD, Postdoctoral Fellow, Chemistry and Chemical Biology, Harvard University

Recent advancements in highly multiplexed spatially-resolved single-cell gene expression measurements demand scalable computational tools to assist in data exploration and hypothesis-generation. We present MERmaid, an open-source visualization tool built on WebGL, that provides a rich interface for rapid exploration of spatially-resolved transcriptomics data. We apply MERmaid to visualize cell-type heterogeneity in tissues as well as intra-cellular heterogeneity in mRNA localization in MERFISH data. MERmaid is available online at

2:55 Beyond the Arc Diagram: Interactive 3D Visualization of Chromatin Loops to Support Epigenetics Hypothesis Generation

Mark Wissler, Lead Data Scientist and UX Researcher, Exaptive, Inc.

The traditional method for visualizing chromatin has been to stretch it out along a one-dimensional axis and draw arcs representing areas that are packaged more closely together. Unfortunately, this visualization technique linearizes the very structures that make chromatin so interesting: its loops. In a collaborative project with the Oklahoma Medical Research Foundation, a team of Exaptive data scientists and visualization designers developed a novel interactive 3D visualization method that allowed the researcher to selectively visualize particular loops and cross-reference them with external genome annotation. This talk will explain not only the visualization method but also the variety of other UX considerations that were required to enable ad hoc exploratory hypothesis generation, and it will discuss the new opportunities this interface provides for both machine learning and multi-researcher collaboration.

Back Bay

3:10 Simplifying the Adoption of Machine Learning for Real Time Clinical Data Insight and Decision Making

Simon Taylor, Vice President, Global Partners & Alliances, Lucidworks

The biggest barrier to delivering business-relevant visualization for improved decision making is providing live access to consistent data, with advanced ML algorithms that turn it into valuable information in ways the business can understand. This is the realm of the data scientist; however, most spend their time cleansing data rather than delivering much-needed analytics value. In this session, we'll work through a real-life clinical KM challenge, from the initial "what question am I trying to answer" to driving rapid data ingest and correlation using an open core platform with integrated machine learning and workbench-driven visualization. The result: 10x faster time to value, with results in hours vs. months.

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing, Meet the Experts: Bio-IT World Editorial Team, and Book Signing with Joseph Kvedar, MD, Author, The Internet of Healthy Things℠ (Book will be available for purchase onsite) 

Harborview 2

4:00 FEATURED PRESENTATION: Expanding Access to Dynamic Clinical Biomarker Visualizations: Automation, Integration and Exploration of Data Lakes

Philip Ross, PhD, Head of Translational Bioinformatics Data Science, Translational Medicine, BMS

With biomarker samples from thousands of patients across multiple indications, how do we detect meaningful clinical biomarker results in a reasonable timeframe and at reasonable levels of effort? Dynamic visualizations with up-to-date data provide evolving insights. We are automating the integration of clinical and biomarker results in data lakes and leveraging dynamic visualizations to give the best possible access and exploration of emerging clinical biomarker data signals and trends.

4:30 Universal Spotfire Template (UniSpoT) for Clinical Biomarker Discovery

Sittichoke Saisanit, PhD, Principal Scientist, Data Science, Pharma Research and Early Development Informatics (pREDi), Roche Innovation Center New York

UniSpoT is the Roche pRED standardized visual analytics platform for clinical biomarker data. Enabled by the underlying BRAVE data process, it addresses the increasing and unmet business need for near real-time access to biomarker data, integrated with clinical data for exploratory analysis. It has been used for early clinical studies that have an open-label design (e.g., phase 1b). Using UniSpoT, scientists can gain an earlier and better understanding of biology, generate hypotheses, and improve biomarker strategy and the quality of data collection.

Harborview 2

5:00 From Data Inspection to Disease-Specific Data Viewing 

David Kreda, Sync for Science Project, Harvard Medical School Department of Biomedical Informatics

We will present a new data inspection tool for examining a patient's structured data in a disease-agnostic way. We will then show how adding disease-specific views can assist human medical reasoning. Finally, we will discuss our approach for integrating computational services to annotate and organize disease-specific views.

5:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

Thursday, April 18

7:30 am Registration Open and Morning Coffee


9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced

Back Bay

10:30 Chairperson’s Remarks

Carlos Rios, PhD, Senior Research Investigator, Computational Genomics - Translational Medicine, Bristol-Myers Squibb

10:40 Longitudinal and Context Visualization for Precision Oncology

Jeremy Goecks, PhD, Assistant Professor of Biomedical Engineering and Computational Biology, Oregon Health and Science University

The goal of precision oncology is to find effective treatments for each patient’s cancer based on its molecular profile. Visualization plays a key role in precision oncology, helping to understand and integrate longitudinal and complex data analyses and then communicate results to physicians, patients, and other stakeholders. We will discuss our work applying visualization for precision oncology and identify opportunities and challenges for visualization in precision oncology going forward.

11:10 Sharing and Visualizing Cancer Genomics Datasets Using cBioPortal

Carlos Rios, PhD, Senior Research Investigator, Computational Genomics - Translational Medicine, Bristol-Myers Squibb

BMS has been using cBioPortal for visualizing cancer genomics datasets since early 2016, supported by The Hyve, an open source bioinformatics company based in the Netherlands. The cBioPortal server runs on AWS, is tied to the company’s Active Directory for authentication, and uses Keycloak for authorization. Data can be loaded through a pipeline that takes input files from Amazon S3. For BMS, cBioPortal was extended with support for rich metadata and canvasXpress integration.

11:40 OncoThreads: Exploratory Visualization of Longitudinal Cancer Genomics Data

Theresa Anisja Harbig, MS, Research Associate, Visiting Graduate Student, Biomedical Informatics, Gehlenborg Lab, Harvard Medical School

New immuno-profiling assays and liquid biopsies have enabled researchers to study tumor development over time and to explore the effect of different therapies on cancer. To support exploration of such datasets, we developed OncoThreads, a tool for the visualization of longitudinal cancer genomics data in patient cohorts. The tool is based on alignment of patient timelines into blocks, which can show data associated with patient samples or events such as drug administration. We demonstrate how the design of OncoThreads enables researchers to find temporal patterns in longitudinal cancer genomics data, such as effects of treatments on mutation patterns.

12:10 pm Enjoy Lunch on Your Own (Lunch Available for Purchase in the Exhibit Hall)

1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing

Back Bay

1:55 Chairperson’s Remarks

Baohong Zhang, PhD, Director of Genome Informatics, Translational Biology, Biogen

2:00 Creating Effective Visualizations – Design and Choreography for the Chaos of Data

Martin Krzywinski, Staff Scientist, Genome Sciences Centre, BC Cancer Research Centre

The process of design, which is a kind of choreography for the page, can be of great help in assembling individual data visualizations into a cohesive explanation across many levels of detail. In the same way that visualizations are a way to organize data, design is a way to organize visualizations. I will share with you my experiences in combining science, visualization and design to create explanations, promote engagement, inspire imagination and, where possible, provide visual support in the often vexing process of research.

2:30 Big Data to Insights Visually

Baohong Zhang, PhD, Director of Genome Informatics, Translational Biology, Biogen

This talk shows how to use advanced JavaScript visualization toolkits, such as D3.js, canvasXpress.js, and canvasDesigner.js, to empower everyday scientists to extract biological insights from ever-growing data sets.

3:00 CanvasXpress: An R-Library Data Visualization for Reproducible Research

Isaac M. Neuhaus, PhD, Director, Computational Genomics, BMS

CanvasXpress is a standalone JavaScript library used for visualization of genomics and non-genomics data sets. It has a user-friendly and unobtrusive interface to allow users to explore data sets and customize their visualizations. It also has a sophisticated mechanism to track all user interactions and modifications, which makes it ideal for use in Reproducible Research. More information can be found at

3:30 Interactive Visualization of Person-Generated Health Data for Precision Health: Challenges & Possibilities

Arlene E. Chung, MD, MHA, MMCi, Associate Director of Health & Clinical Informatics, University of North Carolina School of Medicine; Lead Informatics Physician for Patient Engagement, UNC Health Care

While there is much interest in remote monitoring using person-generated health data (PGHD) from wearables and other data streams, transforming these data into meaningful and actionable insights for precision health is an open challenge as heterogeneity, missingness, and sparsity are inherent within these data. This presentation focuses on how interactive data visualization approaches could allow clinicians and patients to better understand the impact of lifestyle on symptoms and health outcomes.

4:00 Conference Adjourns
