Final Agenda:
Track 5 explores sequencing platforms and managing the instruments, data mining, analysis tools and workflows, sequencing informatics and cancer, and trends and new applications.
TUESDAY, APRIL 12
7:00 am Workshop Registration and Morning Coffee
8:00 - 4:00 pm Pre-Conference Workshops*
Recommended workshop: Tools and Methods for RNA-seq Analysis (W6) 8:00-11:30am
Recommended workshop: Next-Generation Sequencing: from Data to Discovery (W13) 12:30-4:00pm
*Separate Registration Required
2:00 - 6:00 Main Conference Registration
4:00 Event Chairperson’s Opening Remarks
Cindy Crowninshield, RD, LDN, Conference Director, Cambridge Healthtech Institute
4:05 Keynote Introduction
Chris Blessington, Life Sciences Solutions Architect, Isilon
Plenary Keynote
4:15 Making the World’s Knowledge Computable
Stephen Wolfram, Ph.D., CEO, Wolfram Research; Creator of Wolfram|Alpha
5:00 Welcome Reception in the Exhibit Hall and Poster Viewing
WEDNESDAY, APRIL 13
7:00 am Registration and Morning Coffee
8:15 Event Chairperson’s Opening Remarks
Phillips Kuhl, Co-Founder and President, Cambridge Healthtech Institute
8:20 Keynote Introduction
Grant Stephen, CEO, Tessella, Inc.
Plenary Keynote
8:30 Interacting with Complex Information Landscapes: Integration and Next Generation User Interfaces
Bryn Roberts, Ph.D., Global Head, Informatics, Pharma Research and Early Development, F. Hoffmann-La Roche Ltd.
9:00 Benjamin Franklin Award/Presentation & Best Practices Awards Program
9:45 Coffee Break in the Exhibit Hall and Poster Viewing
10:50 Chairperson’s Remarks
David J. Dooling, Ph.D., Assistant Director, The Genome Center, Washington University, St. Louis
11:00 From Data to Discovery: Case Studies, Lessons Learned, and Next Steps
Joseph Szustakowski, Ph.D., Senior Group Head, Bioinformatics, Biomarker Discovery, Novartis
This presentation will describe several case studies to highlight the bioinformatics challenges we face when analyzing NGS data, the computational infrastructure required to enable such analyses, and the analysis algorithms and strategies used to solve the problems at hand. From our early successes (and failures) we have already learned crucial lessons that will help to maximize the impact of future NGS projects, and to prepare for third generation sequencing technologies.
11:30 Extremely Fast Queuing and Sorting for Next-Gen Sequencing Data Flow and Data Mining
Jochen Kumm, Ph.D., Director, Biomathematics; Head, IT, Stanford Genome Technology Center, Stanford University
The Stanford Genome Technology Center is a world-leading genomics facility bridging the gap between genomics and medical care. Our data flow and analysis pipelines are integrated to deliver high throughput with simultaneous analysis. By linking queuing theory and sorting algorithms, we see a five-fold increase in throughput and a significant reduction in IT infrastructure cost. This case study discusses the next-gen sequencing pipeline and illustrates the algorithms and software used to achieve significant performance gains and cost savings.
12:00 pm Identification and Modeling of Gene-Environment Interactions: A Data Intensive Discovery Initiative Case Study with Netezza
Murali Ramanathan, Director of Graduate Studies, Pharmaceutical Sciences and Neurology, State University of New York
The risk of developing many complex diseases is related to the interactions of environmental factors with genes. Effective and efficient methods for identifying and modeling gene-environment interactions (GEI) are critical for medical discovery from next generation sequencing studies. However, GEI analysis is a combinatorially explosive problem. I will describe AMBIENCE and related GEI analysis algorithms that use novel information-theoretic search metrics to search combinatorial space. I will also demonstrate how novel data-intensive supercomputing architectures are capable of enhancing computational efficiency in these applications.
12:30 Luncheon Presentation
Selecting A LIMS for Next Generation Sequencing Research
Michael Kuzyk, Ph.D., Product Manager, Omics Labs, GenoLogics Life Sciences Software
Bruce Pharr, VP Products, Marketing and Business Development, GenoLogics Life Sciences Software
No other industry has seen processing speeds rise and costs drop as dramatically as genomics. Modern genomics labs are now struggling to manage the data these techniques generate. A recent survey cites data storage, data management, and informatics as the biggest hurdles to expanding next gen sequencing (NGS). Moreover, analysis costs for sequencing remain high, spotlighting the need for better ways to centralize information and track sample information across experiments. This talk reviews the informatics challenges presented by NGS and proposes three criteria that labs should assess when selecting an NGS lab information management system (LIMS).
1:40 Chairperson’s Remarks
David J. Dooling, Ph.D., Assistant Director, The Genome Center, Washington University, St. Louis
1:45 Assessing Next-Gen Data Quality in Production Analysis
Tim Fennell, Senior Developer, Broad Institute
2:15 Data Driven Sequence Analysis
David J. Dooling, Ph.D., Assistant Director, The Genome Center, Washington University, St. Louis
The massive scale of next-generation sequence data forces analysts to often make compromises between sensitivity and specificity, accuracy and speed, etc. How can an analyst be certain that they are making the right choices? This presentation will discuss a combined computational and laboratory framework that allows for unprecedented exploration of the computational variable (tools and their parameters) space, ensuring optimal analysis pipelines are employed for each data set.
2:45 How Many Indels Are You Missing? Highly Accurate Variant Analysis in Diagnostic Applications with Omixon Variant Toolkit
Attila Berces, Ph.D., CEO, Omixon
This presentation shows case studies applying the Omixon Variant Toolkit, a highly sensitive tool to find variants and small indels. The cases range from pathogen strain identification to a human exome study. Results are compared to Bowtie, BFAST, SHRiMP, and Bioscope results.
3:00 Accurate Complete Sequences of Over 1000 Human Genomes including an Ethnically Diverse Reference Panel Released to the Public
Steve Lincoln, Vice President of Scientific Applications, Complete Genomics
We have developed a custom sequencing platform which can inexpensively produce high-depth sequences of complete human genomes at large scale. Sophisticated and specialized bioinformatics algorithms leveraging local de novo assembly allow this system to achieve high sensitivity and specificity for discovering SNPs, indels, and structural variants. We have applied these methods to family studies of simple and complex disease, cell biology studies, and the study of somatic mutation in cancer. We will review progress on the platform to date and focus on a set of over 60 ethnically diverse genomes which have been generated for release to the public.
3:15 Refreshment Break in the Exhibit Hall and Poster Viewing
3:45 Genome Sequencing in Support of Translational Research
Sandor Szalma, Ph.D., Head, Oncology Informatics, Oncology Biomarkers, Centocor R&D, Inc.
We have implemented tranSMART, a Bio-IT World award-winning knowledge management platform supporting translational research. The initial focus was to combine clinical, genomics, and proteomics data from clinical and non-clinical studies. We are now extending the system to support biomarker discovery using genetics data, in particular SNP chips and next-generation sequencing. In this talk we will present how this open source system is being extended and highlight initial successes.
4:15 A Bi-Asymmetric-Laplace Model (BALM) to Analyze ChIP-seq and MBD-seq Data
Victor Jin, Ph.D., Assistant Professor, Department of Biomedical Informatics, The Ohio State University
This talk presents a novel algorithm based on a bi-asymmetric-Laplace model (BALM) to analyze both ChIP-seq and MBD-seq data. The algorithm not only achieved better accuracy than other tools on publicly available TF ChIP-seq data, but was also applied to analyze MBD-seq data from breast cancer MCF7 cells. The results demonstrate the algorithm’s ability to distinguish closely positioned target sites and to accurately predict DNA methylation regions. This study demonstrates that BALM may provide another useful tool for the sequencing user community.
4:45 The Pipeline Pilot NGS Collection: A New Approach to the Challenges of NGS Data Analysis
Clifford Baron, Product Marketing Director, Accelrys
In repeated surveys, scientists using next generation sequencing technologies report that data analysis is their greatest challenge, and the most significant impediment to continued market growth. This is so despite the availability of over a dozen commercial software offerings and literally hundreds of public domain NGS algorithms, with more appearing weekly. The most frequently discussed factor contributing to the data analysis challenge is the sheer volume of data generated. Equally significant, though less frequently acknowledged, are the rapid evolution of available algorithms and attendant computational best practices, and the need for techniques tailored to specific research goals. We discuss how Pipeline Pilot, a widely used commercial software system for the rapid development and deployment of computational pipelines, can be used along with a newly released collection of NGS analysis components to address these fundamental challenges.
5:00 Sponsored Presentation (Opportunity Available)
5:15 Best of Show Awards in the Exhibit Hall
6:15 Exhibit Hall Closes
THURSDAY, APRIL 14
8:45 am Event Chairperson’s Opening Remarks
Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World
8:50 KEYNOTE PANEL: Keynote Introduction
A special plenary session featuring a series of succinct, forward-looking presentations by:
Ken Buetow, Ph.D., Associate Director, Bioinformatics and Information Technology, National Cancer Institute
Debra Goldfarb, Senior Director, Strategy, Microsoft
Martin D. Leach, Ph.D., Executive Director, MRL IT for Discovery & Pre-Clinical Sciences, Merck & Co.
Mark Boguski, M.D., Ph.D., Founder, Resounding Health Incorporated
Jamie Heywood, Co-founder and Chairman, PatientsLikeMe
Yury Rozenman, Global Head of Marketing, Pharmaceutical and Life Sciences Sector, BT Global Services
10:30 Coffee Break in the Exhibit Hall with Poster Competition
10:55 Chairperson’s Remarks
Tim Harris, Ph.D., CTO and Director, Advanced Technology Program, SAIC-Frederick
11:00 Does the Sequencing Data Tsunami Mean that People and Projects Are Going to be Left High and Dry? (60 min session)
Tim Harris, Ph.D., CTO and Director, Advanced Technology Program, SAIC-Frederick
Ewen Kirkness, Ph.D., Professor, The J. Craig Venter Institute
Robert Stephens, Ph.D., Director, Bioinformatics Support Group, Advanced Biomedical Computing Center, Information Systems Program, SAIC-Frederick/NCI-Frederick
There is an increasing disconnect between the ability to generate sequence data by using second and third generation methods and the ability to interpret what the sequence data means. In tumor DNA sequencing, for example, there are many common mutations being found in cancers but there are also mutations that are being found in the same cancers by some sequencing techniques but not by others. This presentation will explore why this is and what it means.
12:00 pm Dell Next Generation Bioinformatics and Research Computing Solutions: The Power to do more Science
Jose Alvarez, Business Development Manager, HPC Solutions, Dell
Utilizing high-performance, purpose-built building blocks, Dell is simplifying research computing. Dell has created an ecosystem that is helping research groups accelerate their time to results and enhance the user interaction by simplifying reference architecture, deployment, and integration. Dell has also partnered with Next Generation Sequencing (NGS) industry leaders and instrument vendors to deliver an array of solutions that facilitate the collection and analysis of NGS data. With an array of high-performance storage and archival solutions, Dell has simplified the retention and management of the NGS data life cycle. In this short presentation the Dell Life Sciences Research Computing team will give a snapshot of the ecosystem that gives researchers the power to do more science.
12:15 From NexGen Sequencing Data Management to 4th Generation Sequencing
Michael Hehenberger, Ph.D., IBM, T.J. Watson Research Center
IBM is currently working with leading sequencing centers on data management challenges posed by whole genome sequencing activities. We show how leading-edge hardware and software solutions can be used to address the related extreme requirements. In addition, IBM Research has partnered with Roche 454 to develop a new "DNA Transistor" based sequencing technology. While the technical challenges are significant, the partners are optimistic about being able to succeed with this exciting project.
12:30 Luncheon in the Exhibit Hall and Poster Viewing
2:00 Exhibit Hall Closes
1:55 Chairperson’s Remarks
Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World
2:00 Using Next-Gen Analysis to Improve Cancer Treatment Decisions
Paul Aldridge, CIO, Genomic Health
This presentation will cover various use cases for next generation sequencing data and analysis for research into cancer treatment efficacy. Attendees will gain a broader knowledge of costs and other considerations when using various approaches to enable R&D researchers to make more discoveries.
2:30 NGS-AaaS: Next Generation Sequencing-Annotation as a Service
Robert Haines, University of Manchester, UK
Next Generation Sequencing technologies bring genome-wide sequencing within the reach of a greater number of research labs. The $1000 genome, however, is accompanied by the $100,000 analysis. How do we keep down the cost of analytics? How do we enable labs with limited bioinformatics capability or local compute provision to benefit from NGS? Scientific workflow systems can be used for assembly and annotation pipelines. Focusing on the latter, Manchester, together with partners in Liverpool and Eagle Genomics Ltd, is using the commercial Amazon EC2 cloud and the open source Taverna workflow system to operate an on-demand, low-cost, online analytics service for DNA analysis. As a case study we will present an AaaS application for understanding genetic variation between cattle breeds.
3:00 Sequencing without a Sequencer: How Buying Lanes Can Beat Buying a Machine
Keith Robison, Ph.D., Lead Senior Scientist, Informatics, Infinity Pharmaceuticals, Inc.
What are the economics of buying sequencing services vs. owning your own lab? How can you mix internal operations with contracted ones? What are potential issues in vendor performance? What are the trade-offs of accessing multiple sequencing platforms through vendors? This talk will focus on the economic & operational issues around contracting for sequencing & analysis services including vendor selection issues, vendor experiences, and opportunities.
3:30 The Atlas Cloud Computing Infrastructure for Organizing and Querying Multiomics Data (Joint with Tracks 1-5)
Misha Kapushesky, Ph.D., Functional Genomics Team Leader, EBI, Cambridge UK
The Expression Atlas is a cloud computing based distributed infrastructure for organizing and querying multiomics data. Built upon the open-source Expression Atlas project at the EBI in partnership with the pharmaceutical industry, the Atlas provides a scalable solution that can be easily deployed on in-house servers or accessed remotely in the cloud. Learn how the Atlas deals with secure processing and combined analysis and integration of public/private transcriptomic and proteomic data, with an emphasis on our novel pipeline for next-generation sequencing data processing and reporting.
4:00 Conference Adjourns