Track 5 - April 5 – 7, 2016


Computational Resources and Tools to Turn Big Data into Smart Data

Track 5 assembles thought leaders who will present case studies using computational resources and tools that take data from multiple –omics sources, including microbiomics and metabolomics, and align it with clinical action. Turning big data into smart data can lead to real-time assistance in disease prevention, prognosis, diagnostics, and therapeutics. With the ever-increasing volume of information generated for curing or treating diseases and cancers, bioinformatics technologies, tools and techniques play a critical role in turning data into actionable knowledge to meet unstated and unmet medical needs.

Tuesday, April 5

7:00 am Workshop Registration and Morning Coffee

8:00 – 11:30 am Recommended Morning Pre-Conference Workshops*: Visualization for Biomedical Data Analysis: From the Basics to Applications

12:30 – 4:00 pm Recommended Afternoon Pre-Conference Workshops*: iConquerMS™: A Patient-Centered Research Model

* Separate registration required

2:00 – 6:00 Main Conference Registration



5:00 – 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing

Wednesday, April 6

7:00 am Registration Open and Morning Coffee



9:00 Benjamin Franklin Awards and Laureate Presentation

9:30 Best Practices Awards Program

9:45 Coffee Break in the Exhibit Hall with Poster Viewing


10:50 Chairperson’s Opening Remarks

John Quinn, Ph.D., Application Scientist, FlowJo, LLC

11:00 canvasXpress: A Highly Interactive JavaScript Library for Analytic Visualization of Genomics (and Other High Dimensional) Data

Isaac Neuhaus, Ph.D., Senior Principal Scientist, Bristol-Myers Squibb

This talk will describe how this package integrates with our R environment.

11:30 eTRIKS and tranSMART in IMI’s PreDiCT-TB: Data Management, Modeling and Comparison

Francisco Bonachela Capdevila, Ph.D., Postdoc Data Coordinator, Translational Informatics and External Innovation, Janssen Pharmaceutica

PreDiCT-TB is an IMI-funded project that takes a comprehensive model-based approach to fill the gaps in the current drug development pathway in tuberculosis. Preclinical information is propagated into the clinical stage in order to optimize drug selection at the clinical phase. In this context, we have developed a tranSMART-based solution that extracts PK-PD modeling data from studies at any in vitro, in vivo or clinical stage of the drug development chain. This allows the ranking of drug regimens to be compared across preclinical and clinical studies. This ranking comparison will provide us with an informed framework for the translatability of drug regimen data during the clinical phase.

12:00 pm Fusing Systems Biology & Predictive Analytics for Multi ’Omic Data: Demo of the PATH Platform for Knowledge Generation

Scott Marshall, Ph.D., Managing Director, Biomarker and IVD Analytics, Precision for Medicine

The future of healthcare will be transformed by flexible frameworks designed to discover complex signals in rich datasets through the merger of predictive genomic analytics and systems biology, incorporating information about molecular and cellular systems across multi ’omic data. PATH™, a secure, scalable, cloud-based solution for predictive genomic analytics, serves as a knowledge generation platform for translational and clinical research.

12:30 Session Break

12:40 Luncheon Presentation I: Medical Evidence Is Becoming the Currency of Healthcare Transformation

Mark Sexton, Principal Offering Manager, Epidemiology Method

This session will share experiences applying IBM Watson Real World Evidence solutions to help researchers explore huge volumes of unstructured and structured content to discover insights and information and produce medical evidence. Examples include identifying unmet medical needs; demonstrating product value and differentiation for pharmaceuticals and medical devices; improving comparative effectiveness studies for drugs; and supporting competitive intelligence.

1:10 Luncheon Presentation II: When Every Piece Matters: Mobilizing Informational Resources for Rare Diseases

Anton Yuryev, Ph.D., Consultant, R&D Solutions, Elsevier

Disease-centric knowledgebases remain a challenge because information is scattered across multiple resources. Elsevier collaborates with the charity Findacure to create a portal that helps patients, researchers, and doctors find up-to-date information and assists in the discovery of new treatments. We describe our integrative approach to constructing a knowledgebase for congenital hyperinsulinism containing disease mechanisms, targets, drugs, key opinion leaders and institutions.

1:40 Session Break

Novel Bioinformatics and Data Analysis Approaches

1:50 Chairperson’s Remarks

Michael Liebman, Ph.D., Managing Director, IPQ Analytics, LLC

1:55 From Phenotype to Genotype: Using TranSMART for Managing Human Genetics Data

Andrew Hill, Science and Technology Lead, Research Business Technology, Pfizer

Genotype/phenotype analysis informs target identification, validation, mechanistic understanding, and precision medicine. Genetic variants and associated phenotype datasets are large, complex, and difficult to manage and access. The bioinformatics community needs to share information about both challenges and solutions. In this presentation we’ll describe our experience with using TranSMART as a repository for human genotype-phenotype data.

2:25 Case Study: Cloud-based High-Performance Platforms to Unblock and Speed Genome Analysis

Kurt Florus, CTO, Bluebee

Reducing the cost and complexity of genetic analysis in a rapidly growing number of genome studies is challenging. Learn from our recent key projects how cutting-edge innovations in hardware acceleration and distributed computing are affecting cloud-based genomics platforms, and how leveraging HPC techniques enables researchers to ask even more ambitious research questions and allows clinicians to take full advantage of these powerful NGS diagnostic tools.

2:55 Increasing the Competitiveness of Pharma Companies: Real Time Search and Analytics Across Structured & Unstructured Data

Xavier Pornain, WW Vice President, Sales and Alliances, Sales, Sinequa

This presentation highlights how Sinequa’s platform helps leading pharma companies in the following areas: 1) speeding up submission of New Drug Applications to reduce the cost of new drug development; 2) driving innovation, accelerating research and shortening drug time-to-market; 3) fostering cooperation in R&D while respecting information governance and security; and 4) optimizing clinical trials and catalyzing drug repositioning.

3:10 From Out that Shadow: Diagnosis, Discovery and Data Integration in Single-Cell Phenomics

Michael Stadnisky, Ph.D., CEO, FlowJo, LLC

The standardization, throughput, and content of single-cell assays have brought flow cytometry and digital PCR into the mainstream. However, data analysis has remained in the shadows, relying on expert supervision and manual analysis, and is rarely integrated into the life science data ecosystem. We show that an intuitive analysis platform can democratize diagnosis and discovery in single-cell assays and significantly accelerate time to insight.

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing

4:00 Selected Poster Presentation: A Computational Approach to Identify Antibody Functional Paratopes for Synthetic Antibody Library Design

Hung-Pin Peng, Ph.D., Postdoctoral Fellow, Genomics Research Center, Academia Sinica

A synthetic antibody library can be used to find antibodies that recognize a specific antigen. The size and diversity of a library determine its utility. Since both size and diversity have physical limitations, the design of a synthetic antibody library is crucial. One way to reach this goal is to understand how antibodies interact with antigens, learn the common features of known functional antibodies, and incorporate them into library design. In this work, a computational approach is developed to predict functional paratope residues from antibody structures. The prediction results are comparable with experimental alanine scanning data. The method is applied to 111 non-redundant antibody-protein complex structures to predict functional residues. The potential functional residues show an amino acid preference for both aromatic residues and short-chain hydrophilic residues. The distribution of potential functional residues across the six CDR loops is also diverse. These features are designed onto a common antibody framework to build a synthetic antibody library. The designed library is screened against 14 protein antigens, 12 of which are recognized by antibodies that emerged from the screening. The result shows that the features of functional antibodies captured by the computational method are suitable for binding various types of protein antigens. The computational method can be used to analyze antibody sequences to condense functional characteristics. As next-generation sequencing is applied in synthetic antibody screening, the computational method can be used to predict the functionality of library designs or screening results.

4:30 From GWAS and Whole Genomes to Personalized Therapeutics: Non-Coding Variants for New Drugs

Leonard Lipovich, Ph.D., Associate Professor, Center for Molecular Medicine and Genetics, Wayne State University

The ENCODE (Encyclopedia of DNA Elements) Consortium revealed that two-thirds of human genes do not encode proteins, and catalogued non-coding regulatory elements genome-wide. Nevertheless, bioinformatics of significant disease-associated genetic variants identified from whole-exome chips, Genome-Wide Association Studies, and whole-genome sequencing continues to focus on protein-coding genes, even when those genes are far from the significant variants and separated from them by recombination breakpoints. The audience will learn how to use public transcriptome and epigenome datasets from the UCSC Genome Browser and its underlying UCSC Genome Database, including but not limited to ENCODE data, for both manual and automated integrative reannotation of disease-associated SNPs, with the goal of ranking SNPs outside of protein-coding regions based on the likelihood of their localization in a non-coding genomic functional element. Given that numerous post-GWAS bioinformatics portals still concentrate on protein-centric SNP annotations and poorly account for non-coding data types, this is an important insight for anyone in academia and industry who is interested in improving variant annotation pipelines to better account for the vast numbers of functional, and therefore candidate disease-causative, genomic elements outside of protein-coding gene exons. The audience will also gain an appreciation of the phenomenon of “SNP clouds” that we discovered during our pipeline development. This phenomenon manifests as genomic positional aggregations of multiple significant disease-associated non-coding variants, drawn from public GWAS datasets for related but nonidentical quantitative phenotypes and diseases, that reside within discrete, contiguous genomic intervals of less than 1 Mb.
For example, we found that multiple SNPs significantly associated with BMI, waist circumference, fasting glucose levels, fasting insulin levels, obesity, and/or type 2 diabetes frequently cluster together in short discrete genomic regions. These “SNP clouds” allude to pleiotropic regulation in cis and/or to the existence of multiple disease-specific non-coding regulatory elements that all may target the same nearby gene, causing their distinct but partially overlapping effects on phenotypes. The ultimate goal and promise of this approach is to identify functional, directly disease-causal, non-coding RNA genes and non-coding regulatory sequences from exome, GWAS and whole-genome sequencing data. These genes and sequences can be therapeutically targeted using genome editing and, post-transcriptionally, antisense oligonucleotides. The recent evolutionary history, in human populations, of the non-coding candidate disease-causal variants that we have canvassed in trans-ethnic mapping efforts will allow targeting to be customized to both populations and individuals, finally making post-genomic GWAS-empowered personalized medicine a reality.

Full list of authors:
Leonard Lipovich*, Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI;
Virginia Fisher, Department of Biostatistics, Boston University School of Public Health, Boston, MA;
Aldi T. Kraja, Division of Statistical Genomics, Washington University School of Medicine, St. Louis, MO;
James B. Brown, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA;
Jerome I. Rotter, LABioMed, Los Angeles, CA;
Ida Chen, LABioMed, Los Angeles, CA;
James B. Meigs, General Medicine Division, Massachusetts General Hospital, Boston, MA;
Ingrid B. Borecki, Division of Statistical Genomics, Washington University School of Medicine, St. Louis, MO;
CHARGE Adiposity Working Group;
CHARGE T2D / Glycemia Working Group;
CHARGE Consortium.

* study leader, first and presenting author

5:00 Selected Poster Presentation: Right-Size Your TCO: Advanced Cost Optimization Techniques for Secure Genomics in the Amazon Cloud

Andrey Kislyuk, Ph.D., Research Assistant, Georgia Institute of Technology

In recent years, public and private genomics enterprises have accelerated their adoption of cloud technologies, driven by the innovation and flexibility offered by cloud infrastructure service providers. Important barriers continue to impede scalable and efficient use of the cloud: lack of expert knowledge of cloud APIs, security best practices, compliance requirements, data management techniques, and application migration paths, as well as cost and lock-in concerns. We present a comprehensive system for simplifying the management of a multi-user research and clinical production environment in the AWS cloud. The system offers cost savings of up to 50-80% vs. list price through use of compute optimizations and data lifecycle management techniques, while providing integrated authentication, monitoring, security and regulatory compliance policy enforcement. A migration path is provided for traditional HPC workloads through innovative use of dynamic shared filesystems and batch scheduling interfaces. Through use of GA4GH Common Workflow Language-based infrastructure, we also achieve portability and reproducibility of workload definitions, reducing platform lock-in. Beyond batch workloads, our system accommodates economical real-time data analytics using recently introduced AWS APIs and open-source front-end tool integrations. We demonstrate this capability in a prototype pathogen surveillance and identification application.

5:30 – 6:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

Thursday, April 7

7:00 am Registration and Morning Coffee



10:00 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced


10:30 Chairperson’s Opening Remarks

Mike Stadnisky, Ph.D., CEO, FlowJo, LLC

10:40 Dramatic Changes in US Patent Law: The Implications for Bioinformatics

John Conley, J.D., Ph.D., Professor, Law, University of North Carolina

Not too long ago, patents on software and software-based analytical methods (in medicine, finance, and business generally) were commonplace and concern about their effects was profound. Now, after a series of Supreme Court cases that brought about a dramatic shift in the approach taken by the lower courts and the Patent Office, those patents are facing legal extinction. These developments matter to bioinformatics: because of the centrality of software-dependent data analysis, whether that software can be patented, directly or indirectly, is a question of enormous economic significance to the industry. Whether software patents look like a good or bad thing will depend on where you are positioned in the industry: are you primarily a creator of analytical tools or a user of others’ creations? This presentation will explain the recent developments in patent law and their legal, practical, and economic implications for the bioinformatics industry. The audience will gain an understanding of 1) why patents play an important role in bioinformatics; 2) the dramatic changes in the patentability of software-based analytical methods that have occurred over the past 3-5 years; 3) the implications of these changes for the bioinformatics industry, in legal, practical, and economic terms; and 4) the differential effects of these changes, depending on whether one is positioned as a producer or consumer of analytical inventions.

11:10 Building a Platform for Modeling Risk and Opportunities in Drug Development

Michael Liebman, Ph.D., Managing Director, IPQ Analytics, LLC

Sabrina Molinaro, Ph.D., Institute for Clinical Physiology, National Research Council, Italy

11:40 Selected Poster Presentation: Capturing BIA-10-2474 and Related FAAH Inhibitor Data in the IUPHAR/BPS Guide to PHARMACOLOGY 

Christopher Southan, Ph.D., Database Curator, IUPHAR/BPS Guide to PHARMACOLOGY, University of Edinburgh

The clinical trial disaster in France, where a fatality was associated with the Phase 1 FAAH inhibitor BIA-10-2474, has been widely reported on since January 15th 2016. Commentaries in Science, Nature News, Forbes and Chemical and Engineering News have included interviews with two of us (SPHA and CS). While the unfortunate events will have wide repercussions, the immediate consequence was a deficit of pharmacologically relevant information. Because the IUPHAR/BPS Guide to PHARMACOLOGY database (GtoPdb) has a primary focus on the annotation of drugs and clinical candidates (PMID 26464438), we endeavoured to fill the gap. We already had entries for the primary target FAAH and the possible secondary target FAAH2, so we extended these with such provenanced data as we could find. This is particularly important to support the in silico modelling community as soon as possible (e.g. for off-target prediction, cross-docking and ADMET computation). Since there were no journal articles, name-to-structure declarations or open clinical trials entry, BIA-10-2474 was completely blinded until the release of a clinical protocol by Le Figaro newspaper on Jan 21st. This enabled us to curate N-cyclohexyl-N-methyl-4-(1-oxidopyridin-1-ium-3-yl)imidazole-1-carboxamide as ligand ID 9001 for BIA-10-2474. In addition, this is now name-mapped in PubChem via our submission. While the only document mapping was to a Bial patent, this included 388 analogues for interested parties to follow up. However, no IC50s were reported for the purified human enzyme, only % inhibition from a crude rat brain preparation. To expand key topics beyond what we typically capture according to our curation rules (see the database FAQ), we make use of blog posts. In this case we have used this avenue to highlight low-activity analogues (i.e. SAR pairs) for Bial leads, JNJ-42165279 and PF-04457845. In addition, we have connected patents to access extended SAR results.
This information is valuable for the in silico modelling community since it provides comparator controls and sets for pharmacophore alignments. A set of links related to this abstract can be found as the NC-IUPHAR “Hot Topics” bulletin.

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing


1:55 Chairperson’s Remarks

William Loging**, Ph.D., Associate Professor of Genomics & Head, Production Bioinformatics, Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai

2:00 A Bioinformatics Pipeline for Detection of Fusions and Gene Expression in Clinical Oncology Samples using RNA-Seq

Keith Callenberg, Ph.D., Lead Bioinformatics Scientist, Molecular & Genomic Pathology, University of Pittsburgh Medical Center

2:30 Talk Title to be Announced

Andreas Matern, GeneDx

3:00 Molecular Impacts of Immune Modulating Drugs on Cancer Patients

William Loging**, Ph.D., Associate Professor of Genomics & Head, Production Bioinformatics, Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai

The area of Immuno-Oncology provides a novel strategy for cancer treatment by utilizing the patient’s immune system to combat tumor growth. We investigated the impact of specific immune modulating drugs on patients with diagnosed tumors in order to understand the molecular changes that take place at the pathway level. These data are correlated with phenotypic effect and provide insights into the mechanism of immune system-directed therapies for cancer.

3:30 Biosimilar Structural Comparability Assessment by NMR: From Small Proteins to Monoclonal Antibodies

Bostjan Japelj, Ph.D., Senior Scientist, Protein Biophysics and Bioinformatics, Sandoz Biopharmaceuticals

This talk will discuss 1) how to use NMR as a method to evaluate higher-order similarity between a biosimilar and the reference product on the market; 2) methods to evaluate the degree of similarity between two NMR spectra of proteins, shown by examples from three case studies; and 3) an update on the current state of the art of NMR spectroscopy in biosimilar drug product formulations and associated challenges.

4:00 Conference Adjourns

**Book signing in the Exhibit Hall (preceding talk), Thurs April 7, 10:15 am, Booths 122 & 124: “Bioinformatics and Computational Biology in Drug Discovery and Development”
