 

Track 5 - April 21 – 23, 2015

Next-Gen Sequencing Informatics 

Advances in Large-Scale Data Analysis and Interpretation

Tremendous advances have broadened NGS applications from research to the clinic. Even so, enormous challenges remain for NGS, including data storage, processing, scaling, quality control management, and interpretation. Track 5 presents case studies on these challenges. Themes to be covered include database systems to manage and analyze NGS data, analytic tools and workflow solutions, cloud computing and collaborative technologies, and NGS variants & gene mapping and expression.

Final Agenda

Download Brochure | Workshops 

Tuesday, April 21

7:00 am Workshop Registration and Morning Coffee


8:00 – 11:30 Recommended Morning Pre-Conference Workshops*

Genome Assembly and Annotation

Intelligent Methods Optimization of Algorithms for NGS

12:30 – 4:00 pm Recommended Afternoon Pre-Conference Workshops*

Customizing Your Digital Research Environment with Genome Browsers

Large Scale NGS Analysis Using Globus Genomics

* Separate registration required


2:00 – 6:30 Main Conference Registration


» 4:00 PLENARY SESSION 



5:00 – 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing

 

Wednesday, April 22

7:00 am Registration Open and Morning Coffee


» 8:00 PLENARY SESSION  



9:00 Benjamin Franklin Awards and Laureate Presentation

9:30 Best Practices Awards Program

9:45 Coffee Break in the Exhibit Hall with Poster Viewing


EMERGING TRENDS AND PREDICTIONS OF NGS INFORMATICS

10:50 Chairperson’s Opening Remarks

Chairperson to be Announced, Bina Technologies, Inc.

11:00 Global Next Generation Sequencing Informatics Markets: Inflated Expectations in an Emerging Market

Greg Caressi, Senior Vice President, Healthcare and Life Sciences, Frost & Sullivan

This presentation evaluates the global next-generation sequencing (NGS) informatics markets from 2012 to 2018. Learn key market drivers and restraints, a detailed analysis of the changing competitive landscape, revenue forecasts, and important trends and predictions that affect market growth. Key highlights for many of the leading NGS informatics services providers, commercial primary and secondary data analysis tools vendors, commercial biological interpretation and clinical reporting tools vendors, and NGS LIMS vendors will be presented.


OPEN SOURCE AND LARGE-SCALE COMPUTING

11:30 Large-Scale NGS Analysis Using Globus Genomics: Challenges and User Success Stories

Ravi Madduri, Fellow, Computation Institute, University of Chicago and Argonne National Lab

Dinanath Sulakhe, Solutions Architect, Computation Institute, University of Chicago and Argonne National Lab

In this talk, we will present some of the challenges in scaling up NGS analysis on public cloud infrastructure and present user success stories where we have overcome them.

12:00 pm Turn-Key Variant Analysis for the Biologist: Using the Maverix Analytic Platform (Maverix Biomics)

Dan Kearns, Director, Software Development, Maverix Biomics, Inc.

Studies leveraging WGS, Exome, and Targeted sequencing data are commonly limited by the tools, infrastructure, and trained bioinformaticians necessary to process, interpret and manage the data. The Maverix Analytic Platform addresses these challenges through a unique environment designed for biologists. This cloud-based platform leverages best-in-class tools and methods, and provides an integrated environment to enable visualization and interpretation of results.

12:15 Developing and Provisioning Robust Automated Analytical Pipelines for Whole Genome-Based Public Health Microbiological Typing (Data Direct Networks)

Anthony Underwood, Ph.D., Lead, Bioinformatics, Infectious Disease Informatics, Microbiology Services Division, Public Health England

Whole genome sequencing has great potential for microbial characterization in public health. Open source bioinformatics tools can generate the necessary information; however, adapting these tools for routine public health use is challenging. They must be automated, auditable, timely, and robust, and they must record errors and log outputs. Dr. Underwood will discuss the infrastructure, software architecture, and algorithms used for this at Public Health England.

12:30 Session Break

12:40 Luncheon Presentation I: Sample Aggregation and Analytics in the Post-$1,000 Genome Era (Illumina)

Scott Kahn, Ph.D., Vice President, Commercial Enterprise Informatics, Illumina, Inc.

With the launch of the Illumina HiSeq X Ten system, the long-promised $1,000 genome became a reality. But as is often the case in science and engineering, the realization of one goal reveals new challenges to surmount. The economics of sequencing now make the sequencing of entire populations feasible, but aggregating, tracking, and analyzing whole human genome data cannot be done serially when it is produced in parallel. This presentation will discuss parallel sample processing approaches that enable multi-sample genome interpretation and analysis of large cohorts by employing cloud-scale computing.

1:10 Luncheon Presentation II (Elsevier)

Speaker to be Announced

 

1:40 Session Break

1:50 Chairperson’s Remarks

1:55 The Cloud Reigns: Enabling Scalable Analysis and Storage for High-Throughput Next-Gen Sequencing

John Penn, Associate Manager, NGS Data Analysis, Regeneron Genome Center

2:25 Data Intensive Academic Grid (DIAG): A Free Computational Cloud Infrastructure Designed for Bioinformatics Analysis

Anup Mahurkar, Executive Director, Software Engineering and IT, Institute for Genome Sciences, University of Maryland School of Medicine

2:55 Co-Presentation: The Challenges of Scaling Platforms for Translational Science: New Approaches and Case Studies (IBM)

Houtan Aghili, Ph.D., Senior Technical Staff Member, Industry Solutions - Healthcare and Life Sciences; IBM Software Group

Janis Landry-Lane, Genomics Solutions, Software Defined Infrastructure, IBM World-Wide

As researchers build platforms for translational science, High Performance Data Centric Computing will be a key investment that must be considered in order to provide an integrated and scalable solution which fulfills the needs of multiple departments. In this session, we will cover: processing the NGS pipeline in order to bring omics data into a scalable information management platform, the role of natural language processing for integrating unstructured information, the integration of on-premise and cloud solutions, and effective data and content management at scale. IBM will present both a vision and potential solutions that have enabled our customers to build an effective architecture.

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing


NGS VARIANTS & GENE MAPPING AND EXPRESSION

4:00 Deep Sequencing Based Analysis of Ig repertoire in Humanized Mice

Stefan Klostermann, Ph.D., Expert Scientist, Bioinformatics / Data Science, Roche

In our quest for human biotherapeutic antibodies, we developed a novel methodology: instead of replacing the mouse genomic immune loci with the human orthologs, we reconstituted the humoral immune response in immunodeficient mice transplanted with human hematopoietic stem cells. An in-depth characterization of the reconstituted immune system, based on analysis of deep sequencing data from the Ig repertoire, validated that the humanized mice are immunologically equivalent to human donors.

4:20 Development of Novel Algorithms for Assembly of RNA-seq Reads Into Transcriptomes

Guojun Li, Ph.D., Professor, Mathematics, Shandong University

We developed a more effective and efficient assembler that assembles RNA-seq reads into the full-length transcripts encoded in a genome. It is based on a new insight: full-length transcripts are better recovered from combinations of splice junctions, which can be detected by aligning RNA-seq reads against a reference genome with a splice-aware aligner, than from overlapping reads. As currently done with de novo assembly, we have modeled the de novo assembly problem as finding a min-cost minimum path cover over a junction graph defined on the splice junctions of a gene. The preliminary implementation shows that it performs even better than reference-based approaches in most cases, since it is not adversely affected by errors introduced from the reference genome. Motivated by this investigation, we further found that a more general assembly problem can be modeled as finding a min-cost minimum path cover over a so-called interval graph, which can be solved exactly by solving a series of bin packing problems. This is a global optimization strategy, in contrast to our current de novo assembler Bridger, a heuristic approach that has improved on existing de novo assemblers by effectively incorporating sequencing depth information into the assembly procedure and by introducing the junction graph in place of the overlap graph used in some existing assemblers.

4:40 BLASTing with Chromatin Architecture: A Novel Method of Genomic Functional Element Identification and Annotation

Michael J. Buck, Ph.D., Associate Professor, Department of Biochemistry, SUNY at Buffalo; Director, Stem Cell Sequencing/Epigenomics Center, The State University of New York at Buffalo; Co-Director, Next-Generation Sequencing & Expression Analysis Core, The State University of New York at Buffalo

Identification of genomic functional elements, i.e. promoters, insulators and enhancers, is essential to understanding the complex regulatory processes involved in cellular differentiation, response to the environment, and disease development and progression. However, finding these locations within the genome can be a laborious and expensive undertaking requiring site specific assays. Even more difficult is identifying entirely new classes of genomic features. In order to facilitate identification and characterization of new classes of genomic features, we developed and implemented a chromatin Architecture Basic Local Alignment Search Tool (ArchBLAST). The ArchBLAST algorithm utilizes conserved chromatin architecture or DNA-binding protein signatures at known sites of interest and globally searches the genome for similar sites. ArchBLAST differs from other approaches in that it uses the amplitude and spatial arrangement of all types of sequencing data to score similarity. ArchBLAST is extremely flexible and can search with all chromatin-based assays such as ChIP, FAIRE, and DNase-Seq as well as non-chromatin assays such as RNA and CAGE-Seq. Importantly, ArchBLAST allows for identification of subtypes of known genomic features and can accurately predict previously uncharacterized locations. ArchBLAST uses an innovative weighted profile generated from only the most informative genome-wide datasets and then scores the entire genome. We have validated the accuracy of our approach with multiple genomic features in both yeast and humans. We show ArchBLAST is capable of predicting both gene expression and genomic feature directionality as well as identifying cell-type specific enhancers using chromatin architecture and/or DNA-binding protein signatures.

5:00 Sponsored Presentation (Opportunity Available)

5:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

6:30 Close of Day

Thursday, April 23

7:00 am Registration Open and Morning Coffee


» 8:00 PLENARY SESSION PANEL 



10:00 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced


NGS DATA MANAGEMENT, PROCESSING, AND ANALYSIS

10:30 Chairperson’s Remarks

10:40 Informatics Infrastructure for Secure Access, Visualization and Analysis of NGS data

Ted Kalbfleisch, Ph.D., Assistant Professor, Biochemistry and Molecular Biology, University of Louisville

The Variant Call Format (VCF) file provides a list of the variants detected and genotypes measured in a next-generation sequencing dataset, along with summary statistics that allow a user to assess the confidence with which they should accept each call. We provide a novel mechanism by which the source NGS records from which the VCF file was derived may be accessed for additional scrutiny or re-evaluation, either visually or algorithmically. This new formalism lets users drag and drop links for NGS datasets between autonomous applications, and even to the command line, for straightforward access to and inspection of the subsets of NGS records that are relevant to questions posed by researchers or clinicians. The audience will learn that it is possible to securely access and share NGS data for both visualization and analysis in distributed environments. We will describe an architecture that may be extended to other -omics technologies and that may fundamentally change the way researchers access, analyze, and publish high-throughput data.

11:10 NGS Data Management at Lilly: Progress towards Standardization

Yuhao Lin, Associate Consultant, Informatics Capabilities, Eli Lilly

11:40 Sponsored Presentation (Opportunity Available)

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing

1:55 Chairperson’s Remarks

2:00 Technology and Data Analysis Methods for NGS Data

Yaoyu Wang, Ph.D., Associate Director, Center for Cancer Computational Biology, Dana Farber Cancer Institute

2:30 Talk Title to be Announced

Craig Pohl, Co-Director, Bioinformatics, The Genome Institute, Washington University

3:00 Reproducible NGS Research: Practical Approaches and Case Studies

Joseph D. Szustakowski, Ph.D., Senior Group Head, Novartis Institutes for BioMedical Research

3:30 Steroid Resistance in Childhood Nephrotic Syndrome: Transcriptome-Wide Sequence Analysis Identifies SULF2 and Other Marker Genes

Saras Saraswathi, Ph.D., Research Scientist - Data Analyst, Sidra Medical and Research Center

4:00 Conference Adjourns


