Bio IT World Expo 2016  

Track 2 - April 21 – 23, 2015

Software Development  

Harnessing Data for Scientific Decision Making

Track 2 explores the technology and tools needed to connect data, applications, people, processes, and partners to ensure available, reliable, and actionable information for scientific decision making. Case studies will show how life science organizations address common problems in harnessing data, including analytics, methods and standards, use of open source, in-house vs. customized commercial platforms, transparency, efficiency, security, and cost-effective solutions.

Final Agenda


Tuesday, April 21

7:00 am Workshop Registration and Morning Coffee

8:00 – 11:30 Recommended Morning Pre-Conference Workshops*

Aligning Projects with Agile Approach

Gamification of Science

12:30 – 4:00 pm Recommended Afternoon Pre-Conference Workshops*

Predictive Analytics

Large Scale NGS Analysis Using Globus Genomics

* Separate registration required

2:00 – 6:30 Main Conference Registration




5:00 – 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing


Wednesday, April 22

7:00 am Registration Open and Morning Coffee




9:00 Benjamin Franklin Awards and Laureate Presentation

9:30 Best Practices Awards Program

Internet 2

9:45 Coffee Break in the Exhibit Hall with Poster Viewing



10:50 Chairperson’s Opening Remarks
Wanmei Ou, Ph.D., Director, Product Strategy in Translational and Precision Medicine, Health Sciences Global Business Unit, Oracle 



Chris Dagdigian, Founding Partner & Director, Technology, BioTeam, Inc.

In one of the most popular presentations of the Expo, Chris delivers a candid assessment of the best, the worthwhile, and the most overhyped information technologies (IT) for life sciences.

12:00 pm Introduction to EVO:RAIL by VMware

Michael McDonough, Senior Director, EVO:RAIL, VMware

VMware EVO:RAIL™ combines compute, networking, and storage resources into a hyper-converged infrastructure appliance to create a simple, easy to deploy, all-in-one solution offered by Qualified EVO:RAIL Partners. EVO:RAIL is a scalable Software-Defined Data Center (SDDC) building block that delivers compute, networking, storage, and management to empower private and hybrid cloud, end-user computing, test/dev, and branch office environments.

12:30 Session Break

12:40 Luncheon Presentation I: Big Data for Genomics -- SCALE, SPEED and SMART

Frank Lee, Ph.D., Lead Architect, Genomics Solution, IBM

Explosive growth of big data is challenging researchers in genomics and life sciences around the world. Learn about some of the latest solutions, architectures, and best practices to 1) acquire, store, and access data at scale; 2) build a high-throughput computing infrastructure to process large genomic data sets; 3) gain insights and knowledge from the data through translational research. Illustrated through real-life projects and case studies, this session covers the latest approaches to tackling big data, the evolving ecosystem, success stories, and lessons learned that highlight the potential for collaboration among genomic research communities. Share in a preview of the upcoming IBM genomics turn-key platform currently under development.

1:10 Luncheon Presentation II: Optimizing Genomic Sequence Searches to Next-Generation Intel Architectures

Bhanu Rekepalli, Ph.D., Senior Scientific Consultant & Principal Investigator, BioTeam Inc.

Upcoming bioinformatics and biomedical research requires fast processing and analytic tools because of the immense growth of genomic data added to the biological knowledge base with the advent of next-generation sequencing technologies. The design of these tools should adhere efficiently to homogeneous and heterogeneous architectures while supporting scalability, accuracy, and reproducibility. The National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) for genomic sequence searches has been redesigned to scale on hybrid parallel architectures composed of Intel Xeon processors and Intel Xeon Phi coprocessors, denoted here as Highly Scalable Parallel Hybrid BLAST (HSPH-BLAST). Functionality enhancements, such as cross-compilation, dynamic load scheduling, a master-worker model, input/output management, and database distribution, are discussed. A performance evaluation of HSPH-BLAST demonstrates reduced execution time, high scalability, and balanced processor utilization. HSPH-BLAST and similar tools integrated into scientific workflow pipelines can allow biologists to easily perform systematic studies resulting in rapid and high-impact scientific discovery.

1:40 Session Break



1:50 Chairperson’s Remarks
Brian Bissett, Senior Member, Institute of Electrical and Electronics Engineers 

1:55 Lies, Damn Lies, and Big Data: How to Best Utilize Data to Drive Decisions

Brian Bissett, Senior Member, Institute of Electrical and Electronics Engineers

Big Data is hailed as the solution to many problems in industry. In many respects this is a fallacy because it only takes a small amount of erroneous data to corrupt the usefulness of a large dataset. While Big Data can be extremely useful in predicting patterns for the masses such as traffic patterns and peak usage hours for a utility, its usefulness begins to diminish in situations where quality is more important than quantity. In addition, the underlying assumption of Big Data that the behavior of the masses is the correct course of action is not always true. The audience will gain an appreciation for how to best utilize data to drive decisions. Common fallacies will be addressed including the notion that Big Data sets are always superior to smaller data sets. The limitations of big data sets, the importance of quality data, effective display of quantitative information, boundary conditions, and the evaluation of quantitative and qualitative factors will all be discussed.

2:25 Data Publication and Discovery Using Globus Research Data Management Software-as-a-Service

Vas Vasiliadis, Director, Products, Computation Institute, University of Chicago and Argonne National Laboratory

Globus is software-as-a-service for research data management, used at dozens of institutions and national facilities for moving, sharing, and publishing big data. Recent additions to Globus include services for data publication and discovery that enable: publication of large research data sets with appropriate policies for all types of institutions and researchers; the ability to publish data directly from your own storage or from cloud storage that you manage, without third party publishers; extensible metadata that describe the specific attributes for your field of research; publication and curation workflows that can be easily tailored to meet institutional requirements; public and restricted collections that give you complete control over who may access your published data; a rich discovery model that allows others to search and use your published data. This presentation will give an overview and demonstration of these services, as well as case studies that illustrate how Globus is increasing researcher productivity and facilitating enhanced collaboration among researchers.

2:55 Leveraging Hadoop MapReduce in Building Patient Timelines & Analyzing Health Resource Utilization

Saar Golde, Ph.D., Informationist, Knowledgent

During this presentation we will introduce methodological innovations in analyzing real-world evidence and observational data in health outcomes research. Attendees will learn how we leveraged a Hadoop data lake to transform transaction-level data into a patient-centric data model and to run large-scale analyses efficiently, yielding robust results in a timely and cost-effective manner.

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing

4:00 Semantic Integration of Unstructured Safety Study Data: Experiences and Outlook

Alain Nanzer, Ph.D., Global Head Safety & Development Workflows, Pharma Research and Early Development Informatics, Roche Innovation Center Basel

In pharmaceutical R&D, vast amounts of study data are generated - in house and externally - which are used to advance drug projects and then end as reports in document management systems or on file shares. Most often these data are lost for further scientific analysis, as no structured search and access is possible. Common approaches to loading such data into scientific data warehouses require complex ETL processes to normalize the results, are very labor intensive, and are not well suited for large sets of unstructured legacy data. The presentation will share our experiences implementing a platform using semantic integration technologies to provide scientists with search, evaluation, and advanced visualization capabilities for safety in vivo study data. Furthermore, we will show how the platform has been extended to provide fast access to real-time study data, and then evolved into a data turntable for external study data and submissions to regulatory authorities.

4:30 DIVOS: A Platform for Effective in vivo Study Knowledge Management at Genentech

Dana Caulder, Senior Software Engineer, Bioinformatics and Computational Biology, Genentech

Preclinical animal models are essential to understanding the fundamental biology of disease and the efficacy, pharmacokinetic, pharmacodynamic, and toxicity profiles of potential therapeutics. The variety of study designs and endpoints across therapeutic areas makes it challenging to develop systems that meet researchers’ short-term and the business’s long-term needs. In the Bioinformatics department within Genentech Research & Early Development we have developed an in vivo data management platform DIVOS that not only enables researchers across all therapeutic areas to manage their day to day work, but it also enables data reuse and data exploration of historical studies by both bench scientists and statisticians. We will present a DIVOS case study that includes both examples of the scientific successes that have been enabled by the system, and technical / architectural details that underlie DIVOS’s flexibility and extensibility. Attendees will walk away with an overview of a successful case study in in vivo data management, including 1) how the system was architected to be flexible enough to handle data and work processes across multiple therapeutic areas, 2) success factors in both the project/people and technical/implementation realms, 3) the importance of having a committed and engaged user community, and 4) how we’ve achieved that at both the sponsor and bench scientist level.

5:00 Accelerate Life Sciences Data Processing in a Secure, HIPAA-compliant Cloud Platform 

Ben Butler, Vice President, Business Development & Solutions Architecture, REĀN Cloud Solutions

REAN Cloud has partnered with several leading life sciences organizations to deploy and manage genomics and personalized medicine research data processing pipelines on the Amazon Web Services cloud. In this session, learn about win-win design patterns that leverage the benefits of high-scale, low-cost compute and storage of the cloud while also being highly secure and meeting stringent compliance standards, specifically the requirements of the U.S. Health Insurance Portability and Accountability Act (HIPAA). We will provide insights into several customer case studies which showcase how REAN Cloud accelerates data processing for genomics research, while reducing the time required to meet compliance requirements. REAN offers an innovative solution to meet analytical challenges such as accommodating peak compute demand, coordinating secure access for teams of scientists and analysts, and securely sharing validated tools and results. Attendees will receive our blueprint for implementing a robust, defense-in-depth architecture that directly addresses processing data that contains protected health information (PHI).

5:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

6:30 Close of Day


Thursday, April 23

7:00 am Registration Open and Morning Coffee




10:00 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced



10:30 Chairperson’s Remarks
Noel Southall, Ph.D., Informatics, National Center for Advancing Translational Sciences, NIH   

10:40 A Case Study in Building a Clinical Research Database in a Translational Research Environment

Charlie Quinn, Director, Data Management & Software Development, Benaroya Research Institute

We have developed a database that integrates public and private clinical and experimental data in a translational research environment. We will discuss some of the challenges and solutions that we encountered in developing the database. Even though our research is primarily concentrated on autoimmune diseases, the techniques and technologies developed are applicable to all. In addition, we will discuss our new open source spreadsheet wrangling tool, which is instrumental in allowing us to capture, integrate, and manage the world of Excel spreadsheets that live in most research environments.

11:10 Sciencescape - An Innovative Research Discovery Platform that Connects Users to Breaking Research As It Happens, Around the World, and Throughout History

Sam Molyneux, CEO & Co-Founder, Sciencescape

Sciencescape is an online platform that draws on biomedical and life science research from the past 100 years, and adds every paper as it appears - right up to the minute. Using a network-based algorithm called Eigenfactor, Sciencescape takes into account not only how many citations a paper has, but also where those citations come from and why they are important. Sciencescape organizes and delivers the most comprehensive real-time updates of peer-reviewed journals in life sciences based on scientists’ personalized preferences. Through our extensive publisher relationships, we’ve scanned, grouped, tagged, and categorized the full text of the majority of the over 24 million published peer-reviewed biomedical papers. This allows users to connect, organize, and display the scientific literature and stay on the leading edge while broadcasting their ideas and collaborating with their peers. Sciencescape transforms the research process, making it much easier for researchers to be efficient with their time, while also intuitively opening new opportunities for discovery.

11:40 Building a Global Framework for the Exchange of Drug Substance Information

Noel Southall, Ph.D., Informatics, National Center for Advancing Translational Sciences, NIH  

FDA needs a knowledge management system that can handle the enormous variety of substances found in commerce in a scientifically rigorous way. NIH’s National Center for Advancing Translational Sciences (NCATS) is working with FDA, global regulators and stakeholders to build this software and enhance the cooperation between agencies. NCATS’ charge is to develop, demonstrate and broadly disseminate tools for translational research that impact health care delivery, the proper use of medications, and their risk management. This project serves these goals and provides an example of how vision and innovation can come together within government to better serve public health.

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing



1:55 Chairperson’s Remarks

John M. Conley, J.D., Ph.D., William Rand Kenan, Jr. Professor of Law, University of North Carolina, Chapel Hill; Counsel, Robinson Bradshaw & Hinson



Roselie A. Bright, Sc.D., MS, PMP, Program Manager, Office of Information Management and Technology, Office of Informatics Technology and Innovation, Office of Operations, Office of the Commissioner, U.S. Food and Drug Administration (FDA)

OpenFDA was the first innovation created by Taha Kass-Hout, M.D., MS, upon joining FDA as the first Chief Health Information Officer in March 2013. OpenFDA was launched on June 2, 2014, allowing software developers, researchers and the public to tap into adverse events for drugs and medical devices; recalls, for drugs, devices and foods; and labeling for products on the market.

2:30 Global Developments in Privacy and Data Security Law

John M. Conley, J.D., Ph.D., William Rand Kenan, Jr. Professor of Law, University of North Carolina, Chapel Hill; Counsel, Robinson Bradshaw & Hinson

The international legal climate governing privacy and data security is changing. The European Union is in the midst of a fundamental shift in its approach. The U.S. still lacks a national data law, so the states and individual federal agencies are groping toward a strategy. This presentation focuses on the impact of these ongoing changes on genomics, bioinformatics and health research.

3:00 PANEL DISCUSSION: Achieving Much-Needed Innovation while Hurdling the Barriers of Stringent Regulation

Moderator: John M. Conley, J.D., Ph.D., William Rand Kenan, Jr. Professor of Law, University of North Carolina, Chapel Hill; Counsel, Robinson Bradshaw & Hinson


Roselie A. Bright, Sc.D., MS, PMP, Program Manager, Office of Information Management and Technology, Office of Informatics Technology and Innovation, Office of Operations, Office of the Commissioner, U.S. Food and Drug Administration (FDA)

Dana Caulder, Senior Software Engineer, Bioinformatics and Computational Biology, Genentech

Chris Dwan, Assistant Director, Research Computing and Data Services, Broad Institute of MIT and Harvard

Sanjay Joshi, CTO – Life Sciences, Emerging Technologies Division, EMC

Dave Peterson, Executive Director, Vendor & Third Party Assurance, National IT Compliance, Kaiser Permanente Information Technology

Vas Vasiliadis, Director, Products, Computation Institute, University of Chicago and Argonne National Laboratory

The growth in patient healthcare and life sciences innovations can be attributed to technology enhancements like cloud computing, big data analytics and mobile applications, but may conflict with increasing regulatory compliance demands to ensure protection of healthcare life and quality as well as patient data privacy and security. This panel of esteemed technology solution providers and regulators debates real-world challenges and how regulation must also innovate at technology’s pace.

4:00 Conference Adjourns
