2017 Archived Content

Track 9: Data Visualization and Exploration Tools

Data visualization tools have become necessary as the amount of data we are able to generate continues to grow exponentially. With more frequent use of sequencing, integrated informatics, and electronic health records, the need for more effective ways to process, analyze, and interpret this data is evident. Track 9 addresses designing and implementing data visualization and exploration tools, applications in visualizing genomic data for research and drug discovery, considerations for bioinformatics data, and visualization for clinical and translational research.

Tuesday, May 23

7:00 am Workshop Registration and Morning Coffee

8:00 – 11:30 Recommended Morning Pre-Conference Workshops*

(W4) Data Visualization to Accelerate Biological Discovery

12:30 – 4:00 pm Recommended Afternoon Pre-Conference Workshops*

(W9) Data Science Driving Better Informed Decisions

* Separate registration required.


2:00 – 6:00 Main Conference Registration Open


5:00 – 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing

Wednesday, May 24

7:00 am Registration Open and Morning Coffee


9:50 Coffee Break in the Exhibit Hall with Poster Viewing


10:50 Chairperson’s Remarks

Nils Gehlenborg, Ph.D., Assistant Professor, Biomedical Informatics, Harvard Medical School

11:00 Deploying OMERO at Harvard Medical School

Jay Copeland, Manager, Harvard Medical School Image Management Core, HMS IT, Harvard Medical School

OMERO is an open-source server platform for managing, viewing, and sharing microscopy images and metadata. The HMS IT Department is deploying OMERO as a major new service aimed at supporting researchers and science. The goal is to make it easier for researchers to manage and share their microscope data and to simplify collaboration using large, multi-terabyte microscope image datasets. We also aim for HMS to contribute to the development of OMERO for the benefit of the broader international community of scientists using OMERO.

11:30 A Lean Spotfire Implementation for Streamlining, Visualizing, and Analyzing Discovery Research Data

Yi Lin, Ph.D., Principal Scientist, Data Science V, pRED Informatics, Roche Pharma Research and Early Development, Roche Innovation Center Shanghai

We developed a Spotfire-based solution integrating multiple internal and external data sources to support decision making for drug discovery projects. An automated process extracts, streamlines, and transforms complex scientific data into a chemistry-friendly interface with customized views enabling interactive analysis of both molecular structures and biological data.

12:00 pm Sifting through the Noise of mHealth Data in Clinical Research

Nick Neri, Platform Manager, ERT

As drug development organizations incorporate wearables and other mHealth sensors into clinical trials, they face key challenges in analyzing the large volume of data these devices produce, and integrating these data with other patient data to create a coherent picture of patient outcomes. Nick Neri will present use cases on analyzing and visualizing integrated mHealth data in clinical research.

12:15 Enjoy Lunch on Your Own

1:40 Session Break


1:50 Chairperson’s Remarks

G. Elisabeta Marai, Ph.D., Associate Professor, Electronic Visualization Lab, University of Illinois

1:55 Galaxy as a Platform for Visual Analytics

Aysam Guerler, Ph.D., Software Engineer, Taylor Lab, Johns Hopkins University

Galaxy (https://galaxyproject.org) is an open-source, Web-based scientific gateway for analyzing large biomedical datasets that is used by thousands of scientists worldwide. In this talk, I will describe how Galaxy makes it simple to add new visualizations and combine them with Galaxy datasets, tools, and workflows. Visualizations implemented as static scripts, Web-based dynamic displays, and even client-server applications can all be integrated into Galaxy.

2:25 Multi-Scale Visualization Tools for Exploration of Chromosome Interaction Data

Nils Gehlenborg, Ph.D., Assistant Professor, Biomedical Informatics, Harvard Medical School

How do you visualize a 3 million x 3 million matrix and allow users to explore features across a wide range of different scales? We built HiGlass, a web-based visualization tool for analysis of Hi-C and other genome-wide chromosome interaction data that enables comparison of multiple contact matrices and integration of other data types. In my talk, I will discuss several use cases and describe how we architected HiGlass.

2:55 Delivering Computational Chemistry to Cheminformatics: Collaborative Drug Discovery with LiveDesign

Erin Davis, Ph.D., Senior Product Manager, Enterprise Informatics, Schrödinger Inc.

Reducing attrition in drug discovery means making better predictions earlier in the pipeline, drawing on the ideas and varied expertise of the whole team. LiveDesign facilitates this by enabling collaborative computational chemistry in near real time, delivered through a user-friendly web-based cheminformatics platform. Here we will demonstrate the streamlining of drug discovery through several successful use cases.

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing


4:00 Co-Presentation: Multiple and Extensively Drug-Resistant Tuberculosis Data Exploration Portal (MXDR-TB DEPOT)

Michael Harris, Senior Informatics Scientist, Bioinformatics and Computational Biosciences Branch (BCBB), NIAID, NIH

Darren Schneider, Senior Manager, Analytics + Information Management, Deloitte Consulting LLP

The National Institute of Allergy and Infectious Diseases (NIAID) and multiple vendors worked collaboratively to design and develop an analytics portal to support hypothesis generation and testing aimed at improving TB patient diagnostics and outcomes. The publicly available solution enables clinicians and researchers to create and compare cohorts of patients based on clinical, socioeconomic, genomic, and diagnostic image data.

5:00 Journey to the Center of the Nucleus: Exploring 3D Genomic Datasets with Juicebox

Muhammad Saad Shamim, MD/PhD Candidate, Medical Scientist Training Program, Baylor College of Medicine/Rice University

Juicebox is a tool for exploring contact maps generated using Hi-C and other 3D genome-sequencing technologies; it allows users to zoom in and out interactively and supports a variety of annotation tools, enabling researchers to more accurately examine genomes and how they fold in 3D. This talk will explore ways to use Juicebox as well as the types of data sets that can be examined.

5:30 – 6:30 15th Anniversary Celebration in the Exhibit Hall with Poster Viewing and Best of Show Awards

Thursday, May 25

7:00 am Registration Open and Morning Coffee


8:05 Benjamin Franklin Awards and Laureate Presentation

8:35 Best Practices Awards Program

8:50 Plenary Keynote

9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced


10:30 Chairperson’s Remarks

Baohong Zhang, Ph.D., Director of Clinical Bioinformatics, Early Clinical Development, Pfizer, Inc.

10:40 Data Visualization: Make Every Data Point Alive

Baohong Zhang, Ph.D., Director of Clinical Bioinformatics, Early Clinical Development, Pfizer, Inc.

This presentation will describe interactive visualization of next-generation sequencing data using the latest Web 2.0 techniques, such as jQuery, D3, and many other JavaScript libraries. I will use single-cell RNA-seq data as a showcase to demonstrate an easy-to-share, publication-ready, reproducible data exploration tool that needs neither a server nor an internet connection.

11:10 Big Display Visualization of Bioinformatics Data

G. Elisabeta Marai, Ph.D., Associate Professor, Electronic Visualization Lab, University of Illinois

Visualization is an increasingly important component in the effective analysis of large biological datasets. However, visualization of large datasets also suffers from scalability issues. While multi-tiled and high-resolution displays have the potential to address these issues, new approaches are needed to take advantage of such environments and enable effective visual analysis of large biological datasets. In this talk, I describe our group’s work on designing novel and scalable systems for visual analysis in large display environments.

11:40 Enjoy Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing


1:55 Chairperson’s Remarks

Christel Chehoud, Ph.D., Scientist, Data Sciences, Janssen

2:00 Framework for Management, Analysis, and Visualization of Matched Genetic and Clinical & Real-World Datasets

Alexandra Dumitriu, Ph.D., Manager, Computational Biology and Biomedicine, Genome Sciences and Technologies, Pfizer

Coupled Electronic Health Record (EHR) and genetic datasets have recently become more accessible for exploratory clinical research. This advancement brings opportunities for genotype-to-phenotype (G2P) queries, but also poses challenges for researchers, who need to address similar issues around data management, analysis, and visualization. The presentation will describe our approach and the lessons learned in building an analytical framework focused on EHR-based G2P resources, which allows for streamlined protocols, including semi-automated computational cohort definitions and PheWAS analyses.

2:30 Clinical Trials Innovations in the Age of Big Data and Advanced Analytics

Christel Chehoud, Ph.D., Scientist, Data Sciences, Janssen

Clinical trials operations have historically been a data-rich domain because clinical trials are, by nature, heavily regulated processes that entail data collection. As a result, an enormous amount of data is collected and stored in all phases of a trial's life cycle, from selection of sites to monitoring and auditing to ensure quality and compliance. Sponsors of trials therefore have a unique opportunity to leverage big data and advanced analytics to optimize and improve clinical trials operations at a time when advanced analytics is coming of age. At Janssen, the data sciences group, in partnership with global clinical operations, has launched initiatives in site selection, risk-based monitoring, and quality and compliance to bring innovations based on big data and advanced analytics to clinical trials operations. This has resulted in improved efficiencies in different aspects of operations during the life cycle of a trial. Additionally, we have pioneered the application of technologies such as machine learning, natural language processing, and artificial intelligence to create novel solutions, which have resulted in data-driven efficiencies realized from predictive and prescriptive analytics on clinical trials data. This presentation will delve into aspects of this work and present vignettes to highlight the challenges and successes.

3:00 Big Data Analysis of Human Gliomas Using Oncoscape

Eric C. Holland, M.D., Ph.D., Senior Vice President, Director, Human Biology, Seattle Translational Tumor Research, Fred Hutchinson Cancer Research Center

We have developed an open-access online tool for visualizing and interacting with large clinical/molecular datasets of cancer patients. This tool (Oncoscape) reduces large datasets using multidimensional scaling (MDS) and connects tumors with the molecular alterations found in them and with clinical outcome. The analytic tabs are websockets that can be written independently. Subsets of patients or genetic alterations can be identified in one tab and further refined in other tabs.

3:30 Conference Adjourns

Platinum Sponsors:


Dell EMC

Elsevier