Bio IT World Expo 2016  


Track 5: Next-Gen Sequencing Informatics 


Final Agenda: 

Track 5 explores sequencing platforms and managing the instruments, data mining, analysis tools and workflows, sequencing informatics and cancer, and trends and new applications.



7:00 am Workshop Registration and Morning Coffee

8:00 am - 4:00 pm Pre-Conference Workshops*

Recommended workshop: Tools and Methods for RNA-seq Analysis (W6) 8:00-11:30am

Recommended workshop: Next-Generation Sequencing: from Data to Discovery (W13) 12:30-4:00pm


*Separate Registration Required

2:00 - 6:00 Main Conference Registration

4:00 Event Chairperson’s Opening Remarks

Cindy Crowninshield, RD, LDN, Conference Director, Cambridge Healthtech Institute

Sponsored by
Isilon Systems
4:05 Keynote Introduction
Chris Blessington, Life Sciences Solutions Architect, Isilon

Plenary Keynote
4:15 Making the World’s Knowledge Computable

Stephen Wolfram, Ph.D., CEO, Wolfram Research; Creator of Wolfram|Alpha




5:00 Welcome Reception in the Exhibit Hall and Poster Viewing



7:00 am Registration and Morning Coffee

8:15 Event Chairperson’s Opening Remarks

Phillips Kuhl, Co-Founder and President, Cambridge Healthtech Institute

Sponsored by
8:20 Keynote Introduction
Grant Stephen, CEO, Tessella, Inc.

Plenary Keynote
8:30 Interacting with Complex Information Landscapes: Integration and Next Generation User Interfaces

Bryn Roberts, Ph.D., Global Head, Informatics, Pharma Research and Early Development, F. Hoffmann-La Roche Ltd.




9:00 Benjamin Franklin Award/Presentation & Best Practices Awards Program


9:45 Coffee Break in the Exhibit Hall and Poster Viewing

Data Mining, Analysis and Flow

10:50 Chairperson’s Remarks

David J. Dooling, Ph.D., Assistant Director, The Genome Center, Washington University, St. Louis

11:00 From Data to Discovery: Case Studies, Lessons Learned, and Next Steps

Joseph Szustakowski, Ph.D., Senior Group Head, Bioinformatics, Biomarker Discovery, Novartis

This presentation will describe several case studies to highlight the bioinformatics challenges we face when analyzing NGS data, the computational infrastructure required to enable such analyses, and the analysis algorithms and strategies used to solve the problems at hand. From our early successes (and failures) we have already learned crucial lessons that will help to maximize the impact of future NGS projects, and to prepare for third generation sequencing technologies.


11:30 Extremely Fast Queuing and Sorting for Next-Gen Sequencing Data Flow and Data Mining

Jochen Kumm, Ph.D., Director, Biomathematics; Head, IT, Stanford Genome Technology Center, Stanford University

The Stanford Genome Technology Center is a world-leading genomics facility bridging the gap between genomics and medical care. Our data flow and analysis pipelines are integrated to deliver high throughput with simultaneous analysis. By linking queuing theory and sorting algorithms, we see a five-fold increase in throughput and a significant reduction in IT infrastructure cost. This case study discusses the next-gen sequencing pipeline and illustrates the algorithms and software used to achieve significant performance gains and cost savings.

Sponsored by
Netezza
12:00 pm Identification and Modeling of Gene-Environment Interactions: A Data Intensive Discovery Initiative Case Study with Netezza

Murali Ramanathan, Director of Graduate Studies, Pharmaceutical Sciences and Neurology, State University of New York
The risk of developing many complex diseases is related to the interactions of environmental factors with genes. Effective and efficient methods for identifying and modeling gene-environment interactions (GEI) are critical for medical discovery from next-generation sequencing studies. However, GEI analysis is a combinatorially explosive problem. I will describe AMBIENCE and related GEI analysis algorithms that use novel information-theoretic metrics to search the combinatorial space. I will also demonstrate how novel data-intensive supercomputing architectures can enhance computational efficiency in these applications.

Sponsored by
GenoLogics
12:30 Luncheon Presentation
Selecting A LIMS for Next Generation Sequencing Research

Michael Kuzyk, Ph.D., Product Manager, Omics Labs, GenoLogics Life Sciences Software
Bruce Pharr, VP Products, Marketing and Business Development, GenoLogics Life Sciences Software

No other industry has seen processing speeds rise and costs drop as dramatically as genomics. Modern genomics labs are now struggling to manage the data these techniques generate. A recent survey cites data storage, data management, and informatics as the biggest hurdles to expanding next-gen sequencing (NGS). Moreover, analysis costs for sequencing remain high, spotlighting the need for better ways to centralize information and track sample information across experiments. This talk reviews the informatics challenges presented by NGS and proposes three criteria that labs should assess when selecting an NGS lab information management system (LIMS).

1:40 Chairperson’s Remarks

David J. Dooling, Ph.D., Assistant Director, The Genome Center, Washington University, St. Louis

1:45 Assessing Next-Gen Data Quality in Production Analysis

Tim Fennell, Senior Developer, Broad Institute

2:15 Data Driven Sequence Analysis

David J. Dooling, Ph.D., Assistant Director, The Genome Center, Washington University, St. Louis

The massive scale of next-generation sequence data often forces analysts to compromise between sensitivity and specificity, accuracy and speed, and so on. How can an analyst be certain that they are making the right choices? This presentation will discuss a combined computational and laboratory framework that allows for unprecedented exploration of the space of computational variables (tools and their parameters), ensuring optimal analysis pipelines are employed for each data set.

Sponsored by
Omixon
2:45 How Many Indels Are You Missing? Highly Accurate Variant Analysis in Diagnostic Applications with Omixon Variant Toolkit

Attila Berces, Ph.D., CEO, Omixon

This presentation shows case studies applying the Omixon Variant Toolkit, a highly sensitive tool for finding variants and small indels. The cases range from pathogen strain identification to a human exome study. Results are compared to Bowtie, BFAST, SHRiMP, and Bioscope results.

Sponsored by
Complete Genomics
3:00 Accurate Complete Sequences of Over 1000 Human Genomes including an Ethnically Diverse Reference Panel Released to the Public
Steve Lincoln, Vice President of Scientific Applications, Complete Genomics
We have developed a custom sequencing platform which can inexpensively produce high-depth sequences of complete human genomes at large scale. Sophisticated and specialized bioinformatics algorithms leveraging local de novo assembly allow this system to achieve high sensitivity and specificity for discovering SNPs, indels, and structural variants. We have applied these methods to family studies of simple and complex disease, cell biology studies, and the study of somatic mutation in cancer. We will review progress on the platform to date and focus on a set of over 60 ethnically diverse genomes which have been generated for release to the public.

3:15 Refreshment Break in the Exhibit Hall and Poster Viewing

3:45 Genome Sequencing in Support of Translational Research

Sandor Szalma, Ph.D., Head, Oncology Informatics, Oncology Biomarkers, Centocor R&D, Inc.

We have implemented tranSMART, a Bio-IT World award-winning knowledge management platform supporting translational research. The initial focus was to combine clinical, genomics and proteomics data from clinical and non-clinical studies. We are now extending the system to support biomarker discovery using genetics data, in particular SNP chips and next-generation sequencing. In this talk we will present how this open source system is being extended and highlight initial successes.

4:15 A Bi-Asymmetric-Laplace Model (BALM) to Analyze ChIP-seq and MBD-seq Data

Victor Jin, Ph.D., Assistant Professor, Department of Biomedical Informatics, The Ohio State University

This talk presents a novel algorithm based on a bi-asymmetric-Laplace model (BALM) to analyze both ChIP-seq and MBD-seq data. In tests on publicly available TF ChIP-seq data, the algorithm achieved better accuracy than other tools, and it was also applied to MBD-seq data from breast cancer MCF7 cells. The results demonstrate the algorithm’s ability to distinguish closely positioned target sites and to accurately predict DNA methylation regions. This study demonstrates that BALM may provide another useful tool for the sequencing user community.

Sponsored by
4:45 The Pipeline Pilot NGS Collection: A New Approach to the Challenges of NGS Data Analysis
Clifford Baron, Product Marketing Director, Accelrys
In repeated surveys, scientists using next-generation sequencing technologies report that data analysis is their greatest challenge and the most significant impediment to continued market growth. This is so despite the availability of over a dozen commercial software offerings and literally hundreds of public domain NGS algorithms, with more appearing weekly. The most frequently discussed factor contributing to the data analysis challenge is the sheer volume of data generated. Equally significant, though less frequently acknowledged, are the rapid evolution of available algorithms and attendant computational best practices, and the need for techniques tailored to specific research goals. We discuss how Pipeline Pilot, a widely used commercial software system for the rapid development and deployment of computational pipelines, can be used along with a newly released collection of NGS analysis components to address these fundamental challenges.

5:00 Sponsored Presentation (Opportunity Available)
5:15 Best of Show Awards in the Exhibit Hall

6:15 Exhibit Hall Closes


8:45 am Event Chairperson’s Opening Remarks

Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World

Sponsored by
BT
8:50 KEYNOTE PANEL: Keynote Introduction
A special plenary session featuring a series of succinct, forward-looking presentations by:

Ken Buetow, Ph.D., Associate Director, Bioinformatics and Information Technology, National Cancer Institute

Debra Goldfarb, Senior Director, Strategy, Microsoft

Martin D. Leach, Ph.D., Executive Director, MRL IT for Discovery & Pre-Clinical Sciences, Merck & Co.

Mark Boguski, M.D., Ph.D., Founder, Resounding Health Incorporated

Jamie Heywood, Co-founder and Chairman, PatientsLikeMe
Yury Rozenman, Global Head of Marketing, Pharmaceutical and Life Sciences Sector, BT Global Services


10:30 Coffee Break in the Exhibit Hall with Poster Competition

Sequencing Informatics and Cancer

10:55 Chairperson’s Remarks

Tim Harris, Ph.D., CTO and Director, Advanced Technology Program, SAIC-Frederick

11:00 Does the Sequencing Data Tsunami Mean that People and Projects Are Going to be Left High and Dry? (60 min session)

Tim Harris, Ph.D., CTO and Director, Advanced Technology Program, SAIC-Frederick

Ewen Kirkness, Ph.D., Professor, The J. Craig Venter Institute

Robert Stephens, Ph.D., Director, Bioinformatics Support Group, Advanced Biomedical Computing Center, Information Systems Program, SAIC-Frederick/NCI-Frederick

There is an increasing disconnect between the ability to generate sequence data by using second and third generation methods and the ability to interpret what the sequence data means. In tumor DNA sequencing, for example, there are many common mutations being found in cancers but there are also mutations that are being found in the same cancers by some sequencing techniques but not by others. This presentation will explore why this is and what it means.

Sponsored by
12:00 pm Dell Next Generation Bioinformatics and Research Computing Solutions: The Power to do more Science

Jose Alvarez, Business Development Manager, HPC Solutions, Dell
Using high-performance, purpose-built building blocks, Dell is simplifying research computing. Dell has created an ecosystem that is helping research groups accelerate their time to results and enhance the user interaction by simplifying reference architecture, deployment and integration. Dell has also partnered with Next Generation Sequencing (NGS) industry leaders and instrument vendors to deliver an array of solutions that facilitate the collection and analysis of NGS data. With an array of high-performance storage and archival solutions, Dell has simplified the retention and management of the NGS data life cycle. In this short presentation the Dell Life Sciences Research Computing team will give a snapshot of the ecosystem that gives researchers the power to do more science.

Sponsored by
12:15 From NexGen Sequencing Data Management to 4th Generation Sequencing
Michael Hehenberger, Ph.D., IBM, T.J. Watson Research Center
IBM is currently working with leading sequencing centers on data management challenges posed by whole genome sequencing activities. We show how leading-edge hardware and software solutions can be used to address the related extreme requirements. In addition, IBM Research has partnered with Roche 454 to develop a new "DNA Transistor"-based sequencing technology. While the technical challenges are significant, the partners are optimistic about being able to succeed with this exciting project.

12:30 Luncheon in the Exhibit Hall and Poster Viewing

2:00 Exhibit Hall Closes

1:55 Chairperson’s Remarks

Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World

2:00 Using Next-Gen Analysis to Improve Cancer Treatment Decisions

Paul Aldridge, CIO, Genomic Health

This presentation will cover various use cases for next-generation sequencing data and analysis in research into cancer treatment efficacy. Attendees will gain a broader understanding of costs and other considerations when choosing among approaches that help R&D researchers make more discoveries.

Sequencing Informatics Trends and New Applications

2:30 NGS-AaaS: Next Generation Sequencing-Annotation as a Service
Robert Haines, University of Manchester, UK

Next Generation Sequencing technologies bring genome-wide sequencing within the reach of a greater number of research labs. The $1000 genome, however, is accompanied by the $100,000 analysis. How do we keep down the cost of analytics? How do we enable labs with limited bioinformatics capability or local compute provision to benefit from NGS? Scientific workflow systems can be used for assembly and annotation pipelines. Focusing on the latter, Manchester, together with partners in Liverpool and at Eagle Genomics Ltd, is using the commercial Amazon EC2 cloud and the open source Taverna workflow system to operate an on-demand, low-cost, online analytics service for DNA analysis. As a case study we will present an AaaS application for understanding genetic variation between cattle breeds.

3:00 Sequencing without a Sequencer: How Buying Lanes Can Beat Buying a Machine

Keith Robison, Ph.D., Lead Senior Scientist, Informatics, Infinity Pharmaceuticals, Inc.

What are the economics of buying sequencing services vs. owning your own lab? How can you mix internal operations with contracted ones? What are potential issues in vendor performance? What are the trade-offs of accessing multiple sequencing platforms through vendors? This talk will focus on the economic and operational issues around contracting for sequencing and analysis services, including vendor selection issues, vendor experiences, and opportunities.

3:30 The Atlas Cloud Computing Infrastructure for Organizing and Querying Multiomics Data (Joint with Tracks 1-5)

Misha Kapushesky, Ph.D., Functional Genomics Team Leader, EBI, Cambridge, UK

The Expression Atlas is a cloud-based distributed infrastructure for organizing and querying multiomics data. Built upon the open-source Expression Atlas project at the EBI in partnership with the pharmaceutical industry, the Atlas provides a scalable solution that can be easily deployed on in-house servers or accessed remotely in the cloud. Learn how the Atlas deals with secure processing and combined analysis and integration of public/private transcriptomic and proteomic data, with an emphasis on our novel pipeline for next-generation sequencing data processing and reporting.

4:00 Conference Adjourns



View 2016 Photos & Videos  

View 2016 Brochure
Platinum Sponsors


Cycle Computing

DDN Storage  

Elsevier R&D Solutions


IBM

Illumina

Intel

Precision for Medicine


 Seven Bridges Genomics

View All Sponsors

Official Media Partner

Official PR Partner

View All Media Partners

Conference CD

Order the 2015 event proceedings, now available on CD

Complimentary Downloads

View white papers, listen to podcasts, and more!

  • Making the World's Knowledge Computable
  • Bioinformatics in the Cloud
  • The Application of Text Analytics to Drug Safety Surveillance

Related Event

Medical Informatics World