Track 6 - April 5 – 7, 2016

Next-Gen Sequencing Informatics

Advances in Large-Scale Computing

Tremendous advancements have been made to broaden NGS applications from research to the clinic, especially as genomics becomes more integrated with precision medicine initiatives. In spite of this, enormous challenges for NGS still exist, including real-time sequencing, data storage, processing, scaling, quality control management, security and compliance in the cloud, and interpretation. Track 6 presents case studies on these challenges.

Tuesday, April 5

7:00 am Workshop Registration and Morning Coffee

8:00 – 11:30 am Recommended Morning Pre-Conference Workshops*

Intelligent Methods Optimization of Algorithms of NGS

12:30 – 4:00 pm Recommended Afternoon Pre-Conference Workshops*

Determining Genome Variation and Clinical Utility

* Separate registration required

2:00 – 6:00 Main Conference Registration



5:00 – 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing

Wednesday, April 6

7:00 am Registration Open and Morning Coffee



9:00 Benjamin Franklin Awards and Laureate Presentation

9:30 Best Practices Awards Program

9:45 Coffee Break in the Exhibit Hall with Poster Viewing


10:50 Chairperson’s Opening Remarks

Hans Cobben, CEO, Bluebee

11:00 Time to Build Personal Genome

Wenming Xiao, Ph.D., Staff Fellow, Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA

Precision medicine is based on interrogating the genetic alterations of an individual, which requires precise and complete characterization of the personal genome. Whole genome sequencing has become cheaper and more affordable, and the challenge of routinely applying it in the precision medicine era largely rests on bioinformatics solutions, particularly for personal genome assembly. This study aims to establish best practices and quality metrics for personal genome assembly and to provide guidance for the use of personal genomes in clinical applications by investigating the impact of various next-generation sequencing (NGS) parameters, such as coverage, read length, and methods, on assembly quality.

11:30 An Innovative and Globally Distributed Genome Management System

Thomas Thies, Senior Scientist, Data/Information Architecture and Terminology, pREDi, Roche

The huge amount of genomic data that needs to be analyzed in a timely manner by a globally distributed scientific workforce cannot be moved around the globe. Instead, the analysis pipelines are brought to the data. This talk will introduce a solution that follows this new paradigm. In addition, it will explain how we are leveraging existing HPC environments, including governance models, which fuel the innovative capacity of our computational scientists.

12:00 pm An Integrated High Performance Analytics Solution for Genomics and Translational Research

Kathy Tzeng, WW Technical Lead, Healthcare and Life Science Solutions, IBM Systems, IBM

Janis Landry-Lane, WW Program Director, Healthcare and Life Science Solutions, IBM Systems, IBM

The rapid advances in sequencing technology are driving the use of genomics information in various domains. Processing raw data from a sequencer and translating it into insights in a timely fashion requires a high performance, scalable analytics solution to integrate genomics information with other data sources. IBM’s approach of building integrated solutions with our customers and partners will be highlighted.

12:30 Session Break

12:40 Luncheon Presentation I: Not Just Noise: Transforming Big Data into Smart Data

Brady Davis, Senior Director, Informatics, Illumina, Inc.

When it comes down to it, big data is only a big deal when you can attach context and meaning to it. Smart data -- that is the right data at the right time to the right person -- can help professionals enhance and inform care decisions. That’s the prize; and while everyone’s got their eyes on it, not everyone knows how to get their hands on it. This session will focus on how Illumina is working to provide solutions that look at data at every stage, from collection and protection to collaboration, storage and analysis.

1:10 Luncheon Presentation II: The Edge of Analytics Insight

Ted Slater, M.A., M.S., Global Head of Healthcare & Life Sciences, Cray Inc.

Matt Gianni, Functional Solution Architect, Cray Inc.

Learn how to power your life science pipelines — from deep learning to clinical genomics — using the latest advances in analytics. Step up performance, enable the rigor your workloads require, and flex with the evolving needs of your business. Learn what software strategies scale best with Cray’s novel, advanced system — including interconnects, advanced memory stacks, graph engines, storage and cluster management.

1:40 Session Break


1:50 Chairperson’s Remarks
Shanrong Zhao, Ph.D., Director, Pfizer Worldwide Research & Development

1:55 QuickRNASeq Lifts Large-scale RNA-seq Data Analyses to the Next Level of Automation and Interactive Visualization

Shanrong Zhao, Ph.D., Director, Pfizer Worldwide Research & Development

RNA sequencing is being increasingly used, in part driven by the decreasing cost of sequencing. Nevertheless, the analysis of the massive amounts of data generated by large-scale RNA-seq remains a challenge. By combining the best open source tools developed for RNA-seq data analyses and the most advanced web 2.0 technologies, we have implemented QuickRNASeq, a pipeline for large-scale RNA-seq data analyses and visualization. The high degree of automation and interactivity in QuickRNASeq leads to a substantial reduction in time and effort. QuickRNASeq advances primary RNA-seq data analyses to the next level of automation and is mature for public release and adoption.

2:25 High-Throughput NGS Sequencing Using Ion Proton in a Clinical Genetic Testing Lab

Yirong Wang, Associate Director, Production Informatics, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai

Clustered Ion Protons provide a highly scalable framework for high-throughput sequencing in any genetic testing lab or core sequencing facility while keeping the cost manageable. A highly customized LIMS and an efficient data analysis pipeline also play critical roles in quality control and in report generation and delivery. In an initial pilot study, we were able to sequence and process 6000 samples for a large-panel (500+ genes) screening in under 8 weeks.

2:55 Shifting a Pure Academic HPC Environment to a Mixed Protected and Free Environment – on the Same Platform

Vanessa Borcherding, Director, Scientific Computing Unit, Weill Cornell Medicine, Department of Physiology and Biophysics, Weill Cornell Medical College

Completely open research computing platforms make sense. They’re less expensive to design, build, and maintain, and giving unfettered access to data helps the collaborative process. However, increased collaborations with commercial entities and increased use of “deidentified” patient data are putting pressure on HPC operations to make security options seamless without sacrificing the performance and price points to which users are accustomed.

3:10 Genomic Analysis on a Loosely Coupled AWS Platform with Highly Distributed NGS Data Analytics at a Massive Scale

Tristan Lubinski, Associate Scientist, NGS Informatics, AstraZeneca

The global NGS team at AstraZeneca implements a robust, flexible and consumable platform to perform genomic analysis at scale. The Bina solution was tested by processing tens of thousands of TCGA exomes with modern algorithms against the latest reference genome (hg38), in turn demonstrating that the driver mutational landscape of the TCGA can be redefined when compared against public domain data.

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing


4:00 Lessons Learned Analyzing Thousands of Samples for Clinical Use Cases Using Amazon Web Services

Ravi Madduri, Fellow, Computation Institute, University of Chicago; Project Manager, Math and Computer Science Division, Argonne National Lab

Globus Genomics is a cloud-based, large-scale genomics analysis service that is used by research consortiums and healthcare providers to analyze thousands of raw genomics datasets. To deliver analysis results on tight deadlines, we created cost-aware resource scheduling on AWS resources and reusable recipes for setting up the security controls required for compliance. In this talk, we will present some of the use cases and success stories from our work.

4:30 Federated EHR Network for Patient Cohort Discovery
Bhanu Bahl, Director of Informatics, Harvard Catalyst

Patient cohort discovery across multiple healthcare institutions is a challenge. Accrual of sufficient numbers of patients for orphan disease clinical trials further compounds the challenge. The Shared Health Research Information Network (‘SHRINE’), Harvard Catalyst’s open source web-based query tool, helps overcome the barriers arising from variability in the source electronic health record (EHR) systems and returns aggregate numbers of patients across all sites with user-defined characteristics, currently demographics, diagnoses, medications, and selected lab values. To provide semantic interoperability and consistency of data elements, SHRINE leverages the Informatics for Integrating Biology and the Bedside (‘i2b2’) Hive software, an open source scalable informatics framework. Using a federated search architecture, real-time queries can be performed across collaborating institutions, each with its own locally managed patient datasets.
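As a rough illustration of the federated idea described in this abstract, the Python sketch below sends one query to several sites and collects only aggregate counts, so patient-level records never leave the site that holds them. It is a toy model under stated assumptions, not SHRINE or i2b2 code: the `Patient` and `Site` classes, the `federated_count` function, the site names, and the diagnosis code `DX_RARE` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Patient:
    # Hypothetical, simplified record; real sites use their own EHR schemas.
    age: int
    diagnoses: frozenset
    medications: frozenset


@dataclass
class Site:
    name: str
    patients: List[Patient]  # locally managed data, never returned to the caller

    def count_matching(self, criteria: Callable[[Patient], bool]) -> int:
        # Evaluate the query locally and expose only an aggregate number.
        return sum(1 for p in self.patients if criteria(p))


def federated_count(
    sites: List[Site], criteria: Callable[[Patient], bool]
) -> Tuple[Dict[str, int], int]:
    # Broadcast the same criteria to every site and combine the counts.
    per_site = {site.name: site.count_matching(criteria) for site in sites}
    return per_site, sum(per_site.values())


# Illustrative data only; "DX_RARE" is a made-up diagnosis code.
SITES = [
    Site("site_a", [
        Patient(34, frozenset({"DX_RARE"}), frozenset({"drug_x"})),
        Patient(61, frozenset({"DX_OTHER"}), frozenset()),
    ]),
    Site("site_b", [
        Patient(12, frozenset({"DX_RARE"}), frozenset()),
        Patient(45, frozenset({"DX_RARE", "DX_OTHER"}), frozenset({"drug_x"})),
        Patient(70, frozenset({"DX_OTHER"}), frozenset({"drug_y"})),
    ]),
]


def adult_with_rare_dx(p: Patient) -> bool:
    # User-defined characteristics: a demographic filter plus a diagnosis filter.
    return p.age >= 18 and "DX_RARE" in p.diagnoses


if __name__ == "__main__":
    per_site, total = federated_count(SITES, adult_with_rare_dx)
    print(per_site, total)
```

A production network would additionally need a shared terminology mapping so that each site interprets the criteria consistently, which is the role the abstract assigns to semantic interoperability.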

5:00 Selected Poster Presentation: Integrating Data, Tools, and Infrastructure for Efficient Collaboration and Management in Large-scale Biomedicine
Sven Nahnsen, Ph.D., Head, Quantitative Biology Center (QBiC), University of Tuebingen

High-throughput biology in the medical context aims at developing predictive models for disease development and therapy outcome. OMICS technologies and especially next-generation sequencing are becoming increasingly popular for the acquisition of adequate system-wide data. Such experiments need to involve stringent modelling of experiments and bioinformatics workflows to reach comprehensive metadata annotation and to enable automated processing and analyses. We present the latest developments towards the integrative analysis of large and complex high-throughput data; these include the integration of data and project management with state-of-the-art bioinformatics pipelines, as well as a production-scale hardware and software stack. Our integrated technology builds on Liferay as a portlet container, on workflow engines and finally on openBIS for data management. The infrastructure is embedded in a multi-center environment and allows for distributed data acquisition and management. The modular nature of our software architecture allows for rapid extension of the functionality, such as novel pipelines or visualisation tools for NGS data.

5:30 – 6:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

Thursday, April 7

7:00 am Registration and Morning Coffee



10:00 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced


10:30 Chairperson’s Opening Remarks

10:40 Application of Targeted NGS Sequencing in Personalized Clinical Cancer Therapies

Qichao Zhu, Ph.D., Associate Professor, Genetics & Genomics Sciences, Icahn School of Medicine at Mount Sinai

Our current clinical cancer genome research project is focused on three key components: sequence analysis for patient genetic profiling, biomarker (genetic variation) collection for cancer precision medicine, and a data processing and integration platform for clinical reporting. The goal of the project is to develop a comprehensive platform that fully supports a precision medicine approach to cancer treatment. The approach is based on the established concepts that tumor biomarkers are associated with patient prognosis and tumor response to therapy, and that a patient's genetic profile can be associated with drug metabolism, drug response and toxicity. Personalized tumor genetic profiles, combined with tumor site and other relevant information, are then used to determine optimum individualized therapy options. This presentation concentrates on the following major components of our project: 1) Accurately detecting tumor genetic and molecular variants, in terms of both coverage and precision, by developing new algorithms to improve our variant calling; 2) Matching patients with treatments that are more likely to be effective and cause fewer side effects by collecting, curating and associating biomarkers (genetic and molecular variations) with diseases, drugs and treatment plans; and 3) Handling cases in a high-throughput manner by developing a web-based pipeline platform for cancer data processing, sequence analysis, data integration and report generation.

11:10 Integration of Whole Genome and RNA Sequencing to Inform Clinical Treatment of Cancer

Michael Zody, Ph.D., Research Director, Computational Biology, New York Genome Center


11:40 Building National-Scale Genomics Projects with Collaborative, Portable, Reproducible Analysis

Deniz Kural, CEO, Seven Bridges

The number of large genomics projects worldwide is rapidly growing. Such projects involve analysis of hundreds of thousands of whole genomes to accelerate discovery in basic and clinical research. National-scale genomics projects make intensive demands on computation and storage, and test the limits of existing infrastructure. They present severe challenges that require novel approaches to overcome.

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing

NGS and Informatics to Advance Precision Care

1:55 Chairperson’s Remarks
Yuval Itan, Ph.D., MRes, Research Associate, Human Genetics of Infectious Diseases, The Rockefeller University

2:00 Talk Title to be Announced

Gunaretnam (Guna) Rajagopal, Ph.D., Vice President & Global Head, Computational Sciences, Discovery Sciences, Janssen Research & Development, A Johnson & Johnson Company

2:30 A Clinical Genetics Diagnostic System Incorporating Next-Gen Sequencing and Informatics to Advance Pediatric Precision Care

Marcia Nizzari, MS, CIO, Claritas Genomics

Claritas Genomics serves children affected with complex genetic disorders by providing timely and accurate results, resolving families’ long search for answers. We developed a unique “orthogonal sequencing” approach that simultaneously sequences exomes on both the Illumina NextSeq and the Life Technologies Ion Proton instruments. This talk will cover both the lab approach and the bioinformatics analysis pipelines, key components of Claritas’ enterprise architecture for pediatric precision care.

3:00 Software for Interpretation of Next-Gen Sequencing Data in a Clinical Setting

Neil Miller, Director, Informatics, Center for Pediatric Genomic Medicine, Children’s Mercy, Kansas City

The scale and complexity of NextGen Sequencing data present unique informatics challenges, particularly with the issues of variant characterization and clinical interpretation. The Center for Pediatric Genomic Medicine at Children's Mercy, Kansas City has developed novel software applications that are specifically designed to enable non-expert clinicians and researchers to make use of targeted NGS in the diagnosis and management of rare disease. The software programs described are the analytical backbone of the clinical and research applications at CMH, including STAT-seq, a program for the ultra-rapid whole genome sequencing of critically ill patients in the neonatal intensive care unit (NICU). Children's Mercy, Kansas City is a leader in the field of applying genomics to clinical care; STAT-seq was named one of Time Magazine's top 10 medical breakthroughs of 2012. The software developed at CMH has been referenced in multiple publications and will soon become available at no cost for research use. Attendees will get an overview of an end-to-end solution for interpretation of NextGen Sequence data that is used extensively in a children's hospital, and an introduction to software that will shortly become publicly available.

3:30 Finding a Needle in a Haystack: New Approaches to Identify Disease-Causing Mutations in Patients’ Next Generation Sequencing Data

Yuval Itan, Ph.D., MRes, Research Associate, Human Genetics of Infectious Diseases, The Rockefeller University

4:00 Conference Adjourns
