Bio IT World Expo 2016  

Track 7 - April 21 – 23, 2015

Data Visualization and Exploration Tools 

Genomics, Drug Discovery and Clinical Development

As pharma and biotech generate ever greater volumes of data, data visualization allows for deeper analysis and better informed decision-making from these big data sets. Track 7 will showcase how to design, implement and evaluate visualization techniques and tools in support of genomics and sequencing research, as well as in drug discovery and clinical development.

Final Agenda

Tuesday, April 21

7:00 am Workshop Registration and Morning Coffee

8:00 – 11:30 Recommended Morning Pre-Conference Workshops*

Integrative Visualization Strategies for Large-Scale Biological Data

12:30 – 4:00 pm Recommended Afternoon Pre-Conference Workshops*

Customizing Your Digital Research Environment with Genome Browsers

* Separate registration required

2:00 – 6:30 Main Conference Registration


5:00 – 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing

Wednesday, April 22

7:00 am Registration Open and Morning Coffee


9:00 Benjamin Franklin Awards and Laureate Presentation

9:30 Best Practices Awards Program

9:45 Coffee Break in the Exhibit Hall with Poster Viewing


10:50 Chairperson’s Opening Remarks

Alexander Lex, Ph.D., Postdoctoral Fellow & Lecturer, Harvard School of Engineering & Applied Sciences

11:00 AIDEAS: An Integrated Cheminformatics Solution

Rishi Gupta, Senior Research Scientist, Platform Informatics and Knowledge Management, AbbVie, Inc.

AIDEAS brings scientific tools and techniques together under a unified platform that enables chemists and biologists to do their own data analysis and visualization. It is a simple, easy-to-use workbench with a single point of entry, accessed via Spotfire, with BioVia's Accelrys Enterprise Platform as the work engine; this combination has led to the development of chemistry design and synthesis workflows. The platform has not only significantly improved the use of all the cheminformatics tools but has also made AIDEAS an indispensable tool across the Discovery community. This presentation will focus on iSCORE, a probabilistic multi-parametric scoring methodology. iSCORE draws on AbbVie’s proprietary in vivo and in vitro assay data as well as in silico ADMET models.

11:30 Bringing Process, Chemical & Analytical Data Together: Data Mining & Visualization

Jean-Michel Adam, Ph.D., Senior Principal Scientist, Preclinical CMC Process Research, Roche Pharma Research & Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd.

Automated reactors, coupled with in-/off-line analytical tools, are routinely used in the chemical process R&D world. While these do help increase process knowledge and overall productivity, an increasing amount of data is being generated, generally in a fragmented way. We would like to report a first approach that aims to integrate process data from automated reactors, output from analytical systems, and chemical information from the electronic lab notebook.

12:00 pm Collaborative Drug Design at Bristol-Myers Squibb

Brian Claus, Senior Scientist, Bristol-Myers Squibb

Bristol-Myers Squibb has created an environment, based on the LiveDesign and Protein-Ligand Database (PLDB) products from Schrödinger, that empowers its scientists to share ideas and modeling results and to keep up to date on the latest data in their projects. The environment centralizes and connects computational tools with experimental data. Project team members can add idea compounds simultaneously or collaborate asynchronously. The presentation will discuss system design, customization, and lessons learned.

12:30 Session Break

12:40 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own

1:40 Session Break

1:50 Chairperson’s Remarks

Hector Corrada Bravo, Ph.D., Assistant Professor, Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park

1:55 UpSet: Visualization of Intersecting Sets

Alexander Lex, Ph.D., Postdoctoral Fellow & Lecturer, Harvard School of Engineering & Applied Sciences

Understanding relationships between sets is an important analysis task in the life sciences. The major challenge in creating insightful visualizations is the combinatorial explosion of the number of set intersections once the number of sets exceeds a trivial threshold. The large number of intersections is particularly problematic for Venn and Euler diagrams. To address this, we developed UpSet, a novel, interactive, web-based visualization technique for the quantitative analysis of sets, their intersections, and aggregates of intersections. UpSet has been used for various applications both in the life sciences and in other domains. It was employed, for example, to evaluate variant calling algorithms and to explore the specificity of drugs at inhibiting genes. UpSet is open source and can be accessed at
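As a rough illustration of the combinatorial growth this abstract describes, the sketch below enumerates every non-empty combination of sets (2^n - 1 of them for n sets) and collects the elements that belong to exactly that combination. It is not UpSet's own implementation; the function name `exclusive_intersections` and the caller and variant names are hypothetical, chosen only for this example.

```python
from itertools import combinations


def exclusive_intersections(sets):
    """Map each non-empty combination of set names to the elements that
    belong to exactly those sets and to no others."""
    names = sorted(sets)
    universe = set().union(*sets.values())
    result = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            members = set(universe)
            for name in names:
                if name in combo:
                    members &= sets[name]   # must be in every chosen set
                else:
                    members -= sets[name]   # must be in no other set
            result[combo] = members
    return result


# Hypothetical data: variants reported by three made-up variant callers.
calls = {
    "caller_a": {"v1", "v2", "v3"},
    "caller_b": {"v2", "v3", "v4"},
    "caller_c": {"v3", "v5"},
}

table = exclusive_intersections(calls)
for combo, members in table.items():
    if members:
        print(" & ".join(combo), sorted(members))
```

With three sets there are only 7 combinations to inspect, but the count doubles with each added set, which is why diagram styles that draw one region per combination stop scaling quickly.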

2:15 Talk Title to be Announced

Heike Hofmann, Ph.D., Professor, Statistics, Iowa State University

2:35 Toward an Open Source Suite to Bridge the Gap between Plate-Based Screening and Results

Peter Henstock, Ph.D., Senior Principal Scientist, Research Business Technology Group, Pfizer, Inc.

Scientists from academic laboratories to large pharmaceutical companies have all encountered the challenges of efficiently extracting results from plate-based assay data. Issues with compound/reagent/plate management, assay format variability, instrumentation, output file formats, and analysis software invariably lead to a cumbersome process. To improve efficiency, an open source suite of web-based tools is being developed that spans the key steps of plate editing, QC/QA calculation and visualization, and a user-driven non-coding approach to output file parsing. For results analysis, the suite includes visualization and computational approaches for interactively interpreting single-point, dose-response, and multivariate data.

Andrés Arslanian*, David Bonner*, Ivan Bugarinovic*, Mark Ford*, Alexander Galushka*, Cindy J. Liu*, Zachary Martin*, Frank O’Connor*, Alan A. Orcharton*, Gerson A. Rodrigues*, Sean M. Sinnott*, Timothy S. Stefanski*, Jaime A. Valencia*, Nikita E. Yaroshevsky*, Saaqib Zaman*, Robert Zupko*‡, Peter V. Henstock*†
*Harvard University, ‡Essen BioScience Inc., †Pfizer Inc.

2:55 Combining Machine & Human Intelligence to Successfully Integrate Biomedical Data

Timothy Danford, Ph.D., Field Engineer, Tamr

Clinical and biomedical data collections are the source of an increasingly large number of data types, formats, data capture systems, and experimental measurement methodologies. Before this data can be gathered into a data warehouse, analyzed to support decisions, or submitted to a regulatory agency, it must first be integrated and curated into a single dataset. Traditional methods for data integration, involving teams of data curators and manually constructed sets of programmatic scripts or rules, are unable to scale to the growing size of modern data sources or are brittle in the face of changing data standards. We will describe Tamr, a new and automated method for data integration and curation, which combines the power and speed of automated machine learning techniques with the accuracy of human domain expertise. By bringing human data curators and domain experts “into the loop” of advanced machine learning algorithms, Tamr can speed the process of data curation while improving reproducibility of integration and maintaining the high standards of accuracy expected from human curators. Using examples from clinical and genomics datasets, we will describe how Tamr shortens the time between raw data collection and advanced analytic insight.

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing

4:00 Making Visualization and Exploration Tools Truly Useful in the Regulatory Setting

Timothy Kropp, Ph.D., Associate Director for Innovation, Office of Computational Science, US FDA/CDER

As FDA applies tools and technologies to regulatory data (“big” data as well as “little”), a lot is being learned about what is truly useful and in what contexts (not what is pretty or simply interesting). This talk will provide an overview of the informatics approaches FDA/CDER is using for visualization and exploration of scientific/clinical review data, how we are modifying what we use to make it more useful, what our biggest challenges and opportunities are, and where we want to go.

4:30 Feeding the Analytics Engine: Targeting Optimal Clinical Trial Sites, a Case Study

James Gill, Ph.D., Director, Analytics Tools and Technology, Bristol-Myers Squibb R&D

Balazs Flink, Feasibility Analytics Lead, Bristol-Myers Squibb R&D

It is no surprise that as soon as an analytical approach is proposed, access to data becomes a hurdle. In this talk we review a successful approach to improving our clinical trial site selection process by leveraging unique data in a dashboard format. Our keys to success included a clear understanding of the impact of different factors on site performance, finding surrogates for nonexistent data, and using an exploratory process with our scientists.

5:00 Delivering Standardized Clinical and Preclinical Data to Scientists in Guided Analysis

Baisong Huang, Principal Statistical Analyst, Novartis Institutes for BioMedical Research, Inc.

As visualization tools evolve and become widely accepted in investigating and monitoring drug safety and efficacy, rapid access to standardized, interpretable data views is becoming essential. We will present examples of how we standardized and aggregated data in both translational and clinical settings and provided guided analysis to visualize the data in real time.

5:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

6:30 Close of Day

Thursday, April 23

7:00 am Registration and Morning Coffee


10:00 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced


10:30 Chairperson’s Remarks

William J.R. Longabaugh, MS, Senior Software Engineer, Institute for Systems Biology


Jessie Kennedy, Dean of Research and Innovation, Edinburgh Napier University

Most visualizations that display pedigree structure for genetic research have been designed to deal with human family trees. Animal and plant breeders study the inheritance of genetic markers in pedigrees to identify regions of the genome that contain genes controlling traits of economic benefit and, ultimately, to improve the quality of animal and plant breeding programs. However, due to the size and nature of plant and animal pedigree structures, human pedigree visualization tools are unsuitable for studying animal and plant genotype data. We discuss two visualization tools, VIPER (designed for cleaning genotyping errors in animal pedigree genotype datasets) and Helium (designed to visualize the transmission of alleles encoding traits and characteristics of agricultural importance in a plant pedigree-based framework), and show how they support the work of biologists.

11:10 Visualization Tools for the Refinery Platform

Nils Gehlenborg, Ph.D., Research Associate, Center for Biomedical Informatics, Harvard Medical School

The Refinery Platform is a web-based data visualization and analysis system for epigenomic and genomic data designed to support reproducible biomedical research. The analysis backend employs the Galaxy Workbench and connects to a data repository based on the ISA-Tab data description format. In my talk I will discuss the exploratory visualization tools that we have integrated into Refinery.

11:40 Visualizing Genomic Variants and Annotations is Vital for Accurate Interpretation

Gabe Rudy, Vice President, Product & Engineering, Golden Helix, Inc.

In both the research and clinical context, the analytical steps to discover candidate variants of importance involve many transformations and cross-referencing of genomic datasets. Genomic visualization with tools like GenomeBrowse provides a genomic context critical for accurately interpreting function as well as detecting false-positive and false-negative calls and annotations. With visual case studies of variants, their alignments and genomic context, I will discuss the different representations of multi-nucleotide polymorphisms and other issues that impact public data annotations and functional classification of variants.

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing


1:55 Chairperson’s Remarks

Nils Gehlenborg, Ph.D., Research Associate, Center for Biomedical Informatics, Harvard Medical School

2:00 Combing the Hairball: Network Visualization with BioTapestry and BioFabric

William J.R. Longabaugh, MS, Senior Software Engineer, Institute for Systems Biology

Network models are crucial for understanding complex biological systems, yet traditional node-link diagrams of large networks provide very little visual intuition, and there is a need to develop scalable, unambiguous, and rational network visualization techniques. Our applications, BioTapestry and BioFabric, are designed to address this need, and I will discuss how they use novel approaches to avoid the “hairball” trap.

2:30 Visualization of Comparative Genomics Data: Results, Challenges, and Open Questions

Inna Dubchak, Ph.D., Senior Scientist, Lawrence Berkeley National Laboratory

As the rate of generating sequence data continues to increase, visualization tools for interactive exploration and interpretation of comparative data at the level of gene, genome, and ecosystem are of critical importance. We will talk about strengths and limitations of existing methods, and highlight new challenges in the visualization of huge volumes of complex comparative data.

3:00 Interactive and Exploratory Visualization of Epigenome-Wide Data

Hector Corrada Bravo, Ph.D., Assistant Professor, Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park

Data visualization is an integral aspect of the analysis of epigenomic experimental results. Commonly, the data visualized in these tools is the output of analyses performed in computing environments like Bioconductor. These two essential aspects of data analysis, algorithmic/statistical analysis and visualization, are usually distinct and disjoint but are most effective when used iteratively. We will introduce epigenomics data visualization tools that provide tight-knit integration with computational and statistical modeling and data analysis: Epiviz, a web-based genome browser application, and the Epivizr Bioconductor package that provides interactive integration with R/Bioconductor sessions. This combination of technologies permits interactive visualization within a state-of-the-art functional genomics analysis platform. The web-based design of our tools facilitates the reproducible dissemination of interactive data analyses in a user-friendly platform. We will illustrate these tools via analyses of the colon cancer epigenome, in particular, the relationship between clonal and population heterogeneity as inferred from DNA methylation sequencing data.

3:30 Visual-Analytic Systems for Integrative Genomic Analysis of Cancer Data

Raghu Machiraju, Ph.D., Professor, Ohio State University

Cancers are highly heterogeneous, with different subtypes. Recently, integrative approaches have been adopted that combine multiple types of omics data. In this talk, I present visual analytic solutions for the simultaneous and integrative exploration of multiple types of genomics data, including those from The Cancer Genome Atlas (TCGA) project. Using different combinations of mRNA and microRNA features, we suggest potential combined markers for prediction of patient survival.

4:00 Conference Adjourns
