2018 Archived Content
Track 1: Data Storage & Management

The unprecedented growth of data generation and research storage isn’t slowing down anytime soon. As such, storage is becoming a major cost element in the genomic IT world where organizations are spending millions on systems and platforms. The role of data engineering is critical in orchestrating, configuring, managing, and monitoring solutions to manage the data bloat problem. Track 1 assembles thought leaders and organizations from data centers and “centers of excellence” who have pioneered advances in large-scale data management, predictive analytics, and workflow automation. Presentations will focus on people, process and technology issues related to storage platforms, integration and migration plans, architectures, governance, and scalability.

Tuesday, May 15

7:00 am Workshop Registration Open (Commonwealth Hall) and Morning Coffee (Foyer)

8:00–11:30 Recommended Morning Pre-Conference Workshops*

W6. An Intro to Blockchain in Life Sciences

12:30–4:00 pm Recommended Afternoon Pre-Conference Workshops*

W12. Bio-IT IOT Workshop: Accurate Data for Good Decisions

* Separate registration required.

2:00–6:30 Main Conference Registration Open (Commonwealth Hall)

4:00 PLENARY KEYNOTE SESSION (Amphitheater & Harborview 2)

5:00–7:00 Welcome Reception in the Exhibit Hall with Poster Viewing (Commonwealth Hall)

Wednesday, May 16

7:00 am Registration Open (Commonwealth Hall) and Morning Coffee (Foyer)

8:00 PLENARY KEYNOTE SESSION (Amphitheater & Harborview 2)

9:45 Coffee Break in the Exhibit Hall with Poster Viewing (Commonwealth Hall)

Waterfront 1

10:50 Chairperson’s Remarks

Vaughan Wittorff, PhD, Co-Founder & Chief Commercial Officer, PetaGene Ltd.

11:00 Business and Research Responses to the Changing Legal Environment for Data Management

John M. Conley, JD, PhD, William Rand Kenan, Jr. Professor of Law, University of North Carolina, Chapel Hill; Counsel, Robinson Bradshaw & Hinson

We are in the midst of major legal changes affecting data collection, storage, transfer, and use. For example, in the U.S., a major revision to the Common Rule for human subjects research will take effect in early 2018, the Federal Trade Commission has unexpectedly moved into the regulation of health data, and the federal government has interpreted HIPAA to expand patients' access to raw genomic data. In the European Union, a new General Data Protection Regulation will take effect in 2018, with major implications for both collecting health and research data and transferring it to the U.S. This presentation will review these developments and then discuss how Bio-IT companies and institutions should respond. The most fundamental questions are: who needs to worry, and what should they do?

11:30 A Reusable Cloud-Based Infrastructure for Growing Biotechs

John Keilty, General Manager, Platform Operations, Third Rock Ventures
Karina Chmielewski, Senior Director, Platform Operations, Third Rock Ventures

Pharmaceutical R&D has a data problem. With so many types of data - from experimental, to operational, to clinical, and more - coming from many disparate sources, managing data has become a prevalent issue in the industry. The companies hit the hardest are the small, growing biotechs that attempt to rapidly scale innovative science but lack the formal infrastructure to get past these logistical hurdles. This presentation will address these issues and provide a case study on how Third Rock Ventures, a veritable expert on launching biotech startups, is addressing this common problem. By removing the operational bottlenecks involved with data management and storage, Third Rock Ventures enables its portfolio companies to focus on what matters - making a dramatic difference in patients’ lives.

12:00 pm Internet2: Leveraging Distributed Resources to Speed Discovery

Dan Taylor, Director, Business Development, Network Services, Internet2

Few life sciences organizations take advantage of the vast resources available to R&D organizations for continuous innovation and keeping pace with big data. This session will discuss the infrastructure underlying collaborations that use private, academic, and public resources – including commercial cloud and supercomputing center storage and processing – to maximize options and speed discovery.


12:15 Storage Systems that Support Tomorrow's Life Science Applications Today

David Hiatt, Director, Product Marketing, Marketing, WekaIO

Research has become increasingly compute intensive. While new tools and analytical processes such as AI and deep learning hold great promise, they stress the supporting IT infrastructure beyond the expectations of system designers. Learn how today's storage systems leverage software to deliver the performance, scale, and cost efficiencies for applications.

12:30 Session Break


12:40 Luncheon Presentation I: Addressing the Big Data Challenges in Genomics and BioImaging

Linda Zhou, Director, Research and Life Sciences Solutions, Western Digital

We will cover the data challenges in both genomics and bioimaging, including data growth and scale, the need for both collaboration and security, and hybrid cloud processing requirements. We will describe best practices for cloud-scale storage solutions that address these challenges, with example architectures from real customers in genomics and bioimaging research.


1:10 Luncheon Co-Presentation II: Data Storage Benchmark Results for NGS and CryoEM Research

David Sallak, Vice President, Industry Solutions, Panasas, Inc.

Adam Marko, Senior Scientific Consultant, BioTeam, Inc.

Panasas and BioTeam will share benchmark results impacting NGS and CryoEM research. The benchmarks were performed at the BioTeam Convergence Lab. BWA indexing of the human genome was performed for multiple simultaneous indexes and varying numbers of CPUs. The RELION application was used to perform a 3D classification of a publicly available CryoEM dataset of a human malaria parasite ribosome.

1:40 Session Break

Waterfront 1

1:50 Chairperson’s Remarks
Rachana Ananthakrishnan, Head of Products, Globus, University of Chicago

1:55 Managing Scientific Data Intelligently with iRODS

John Jacquay, Scientific Systems Engineer, BioTeam, Inc.

Scientific instrumentation generates vast quantities of data that must be processed, analyzed, and stored according to organization policies. The burden of managing this data grows larger every day, increasing exponentially with each scientific breakthrough and technological innovation. How can a lab, core facility, or large corporation keep up with this pace? Enter iRODS: the Swiss army knife of data virtualization and management. This talk will demonstrate how organizations can leverage the features of iRODS to set up automated bioinformatics pipelines, optimize data storage media and access patterns, share and collaborate on data, and provide intelligent insight via data visualizations.

2:25 Globus: Secure, Scalable Research Data Management Infrastructure for Life Sciences

Rachana Ananthakrishnan, Head of Products, Globus, University of Chicago

Scalable and robust data management infrastructure is now table stakes for life sciences researchers who wish to remain competitive in a data-intensive world. The Globus service supports over 80,000 investigators in multiple disciplines, who depend on its reliable, secure file transfer, sharing, and data publication capabilities to streamline research workflows and simplify collaboration. We present use cases from genomics, imaging, and other biomedical research fields, and describe how recent enhancements to the service make Globus suitable for use in protected data environments.

2:55 Real-World Use Cases for High-Performance Storage Accelerating Life Sciences Research

George Vacek, PhD, Global Director, Life Sciences, DDN Storage

This session features in-depth case studies of leading life sciences organizations that are leveraging high-scale data solutions for genomics, imaging and simulation workflows. These focus on implemented solutions including: capturing and exploiting large scale data at speed; regulated and non-regulated stewardship considerations; transitioning from non-scaling architectures; and bringing the benefits of high-end HPC technologies into smaller deployments and collaborative scenarios.


3:10 Bringing Machine Learning to Imaging Clinical Practice by Deploying a Data Platform

Esteban Rubens, Global Enterprise Imaging Principal, Pure Storage

There is great interest in using machine learning to enhance human diagnostic ability across many areas of healthcare. The common denominator in all successful implementations of this technology is the training of models with robust and abundant annotated data. In this session we will discuss how IT infrastructure can support the timely and efficient training of these models.

3:25 Refreshment Break in the Exhibit Hall with Poster Viewing (Commonwealth Hall)

Waterfront 1

4:00 How the pRED Data Commons Facilitates Integration of –omics Data

Jan Kuentzer, PhD, Principal Scientist, pRED, Roche Innovation Center Munich

Omics data increasingly influences clinical decision-making. Well-designed and highly integrated informatics platforms become essential for supporting structured data capture, integration, and analytics to enable effective drug development. This talk presents principles and key learnings in designing such a platform, and contrasts our current approach with previous approaches in biomedical informatics. Finally, I will provide insights into the implementation of such a platform at Roche.

4:30 SELECTED POSTER PRESENTATION: Making the Most of Rare Data: Contextualizing Rare Data and Research Through Semantic Technologies
Robert Stanley, President and CEO, Melissa Informatics

5:00 PetaSuite Compression Cloud Edition - Get the Most Out of Hybrid Cloud and Radically Simplify Migration

Dan Greenfield, PhD, Co-Founder & CEO, PetaGene

Launching at Bio-IT World 2018, PetaSuite Cloud Edition (CE) combines two innovations: (i) the ability for a user’s software tools and pipelines to seamlessly integrate with a wide variety of cloud platforms without modification, and (ii) significantly improved, high-performance, scalable PetaSuite genomic compression technology with streaming of the compressed data for transparent on-the-fly decompression during use.


5:15 Sponsored Presentation (Opportunity Available)

5:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing (Commonwealth Hall)


7:00–10:00 Bio-IT World After Hours @ Lawn on D
**Conference Registration Required. Please bring your conference badge, wristband, and photo ID for entry.

Thursday, May 17

7:30 am Registration Open (Commonwealth Hall) and Morning Coffee (Foyer)

8:00 PLENARY KEYNOTE SESSION & AWARDS PROGRAM (Amphitheater & Harborview 2)

9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced (Commonwealth Hall)

Waterfront 2

10:30 Chairperson’s Remarks

Vahan Simonyan, PhD, Lead Scientist & R&D Director, High-Performance Integrated Virtual Environment (HIVE), FDA

10:40 Healthcare Data Exchange Framework: Scalable Economy of Secure Information and Services

Vahan Simonyan, PhD, Lead Scientist & R&D Director, High-Performance Integrated Virtual Environment (HIVE), FDA

This project demonstrates a unique framework that enables digital transformation of healthcare at a scale that was not possible before. The Healthcare Data Exchange Framework has the potential to liberate data, empower patient ownership of data, and create a free market where data assetization and securitization might serve as incentives for data sharing.

11:10 Healthcare Security Framework

Jim McGinnis, PhD, Assistant Professor, Engineering Technology, The University of Memphis

Healthcare is vulnerable to data infiltration and hacking. Sensitive patient data, the entity's financial data, and insurance information are just some of the data that need to be protected. The National Institute of Standards and Technology (NIST) has provided a general-purpose Cybersecurity Framework. In this paper, we examine some of the underlying layers of cybersecurity that pertain to healthcare. This research aims to provide a concise framework that healthcare providers can use as a guideline for building their own cybersecurity programs and for engaging third-party cybersecurity companies for assistance. The five layers of the NIST framework (Identify, Protect, Detect, Respond, and Recover) leave healthcare organizations with a large number of in-house examinations to perform in order to protect the organization's data. While the organization must be diligent in protecting its data, outside resources are a must for exercising the utmost due diligence in protecting patient data, the patient's financial and insurance data, and the entity as a whole. This document will attempt to build and expound on the NIST framework to provide additional guidance to healthcare providers.

Waterfront 2

11:40 A Modern Approach to Data Storage for Next Generation Sequencing & Medical Imaging

Steve Noel, Principal Systems Engineer, Qumulo

File storage is a critical component of the life sciences research workflow. For researchers to be able to do their work, their storage must be able to scale to and handle billions of files efficiently. They must also be able to access their research data from anywhere in the world. Learn how universal-scale file storage allows research organizations to manage massive, globally distributed file sets with ease.

11:55 Sponsored Presentation (Opportunity Available)

12:10 pm Session Break

12:20 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing (Commonwealth Hall)



1:55 Sponsor Introduction

Scott Jeschonek, Director, Cloud Services, Avere Systems

2:05–4:00 Panel Session: BioTeam Town Hall: 2018 Bio-IT Trends

Chris Dwan, Senior Technologist and Independent Life Sciences Consultant (Moderator)

Ari Berman, PhD, Vice President and General Manager of Consulting Services, BioTeam, Inc.

Tanya Cashorali, Founder, TCB Analytics

Kristen Cleveland, PMP, Director of Operations, BioTeam, Inc.

Chris Dagdigian, Co-Founder and Senior Director, Infrastructure, BioTeam, Inc.

Karl Gutwin, PhD, Senior Scientific Consultant, BioTeam, Inc.

Adam Kraut, Director of Infrastructure and Cloud Architecture, BioTeam, Inc.

Since 2010, the “Trends in the Trenches” presentation, given by Chris Dagdigian, has been one of the most popular annual traditions on the Bio-IT Program. The intent of the talk was to deliver a candid (and occasionally blunt) assessment of the best, the worthwhile, and the most overhyped information technologies (IT) for life sciences. The presentation tried to recap the prior year by discussing what has changed (or not) around infrastructure, storage, computing, and networks. This presentation has helped scientists, leadership, and IT professionals understand the basic topics involved in supporting data intensive science. In 2017, the “Trends in the Trenches” presentation evolved and expanded from 60 minutes to 120 minutes and featured more content, speakers, and interactive discussion. We will continue this format for 2018, featuring short, focused podium talks on current trends related to computing, storage/data transfer, networks, cloud, and managing successful IT projects. An interactive Q&A moderated discussion with the audience follows. Come prepared with your questions and commentary for this informative and lively session.

4:00 Conference Adjourns

