Bio IT World Expo 2016  
Archived Content

Track 2: Software Development 

Track 2 explores data handling and integration activities. Themes covered include technologies and applications for managing/sharing/publishing/preserving data, software tools, open source software, grid engines, Hadoop, compute job management, and advances & trends.

Final Agenda



7:00 am Workshop Registration and Morning Coffee

8:00 Pre-Conference Workshops*

*Separate Registration Required

2:00 - 7:00 pm Main Conference Registration

4:00 Event Chairperson’s Opening Remarks

Cindy Crowninshield, RD, LDN, Conference Director, Cambridge Healthtech Institute

4:05 Keynote Introduction

Kevin Brode, Senior Director, Health & Life Sciences, Americas, Hitachi Data Systems


Do Network Pharmacologists Need Robot Chemists?

Andrew L. Hopkins, DPhil, FRSC, FSB, Division of Biological Chemistry and Drug Design, College of Life Sciences, University of Dundee


10 Minute Welcome to the Reception!

Mike Nolte, Regional Sales Manager – East, Okta

5:00 Welcome Reception in the Exhibit Hall with Poster Viewing

Drop off a business card at the CHI Sales booth for a chance to win 1 of 2 iPads® or 1 of 2 Kindle Fires®!*

*Apple® is not a sponsor or participant in this program


7:00 am Registration and Morning Coffee

8:00 Chairperson’s Opening Remarks

Phillips Kuhl, Co-Founder and President, Cambridge Healthtech Institute

8:05 Keynote Introduction

Sanjay Joshi, CTO, Life Sciences, EMC Isilon


Atul Butte, M.D., Ph.D., Division Chief and Associate Professor, Stanford University School of Medicine; Director, Center for Pediatric Bioinformatics, Lucile Packard Children’s Hospital; Co-founder, Personalis and Numedii


8:55 Benjamin Franklin Award & Laureate Presentation

9:15 Best Practices Award Program

9:45 Coffee Break in the Exhibit Hall with Poster Viewing

Technologies and Applications for Data Handling and Integration 

10:50 Chairperson’s Remarks

Will McGrath, Strategic Marketing Manager, Big Data Product Division, Quantum

11:00 Bringing Disaster Recovery to a Peta-Scale Archive Cost Effectively

Brant Kelley, Director, IT Services, The Scripps Research Institute

The Scripps Research Institute (TSRI) faced a tsunami of data from genomic sequencers, mass spectrometry, imaging and other big research data generators. The silos of NAS and RAID storage across campus were no match for the amount of data already being generated, and data growth would soon exceed the capacity of the legacy tape-based data archive. In addition, data protection practices across the silos of NAS and RAID arrays were inconsistent, and the data archive was contained within a single tape library with no redundancy and no backup. A data center disaster such as a flood or fire could mean a total loss of years of data and have a catastrophic impact on research projects underway. In this presentation, The Scripps Research Institute examines the issues with its legacy systems and how it was able to conceive and deploy a highly scalable, tiered storage system that leverages disk and the economies of tape in a fully redundant architecture. Determined to address these deficiencies and prepare for the massive growth in research data, TSRI evaluated a number of solutions, from scale-up and scale-out NAS to tape-only solutions. It also noted feedback from the research community on the current data archive environment and on what would be useful in a replacement solution. IT Services quickly determined that scale-up and scale-out NAS solutions were prohibitively expensive and that the research community, having already used a tape library for archive data, did not require all research data to be hosted on disk. Researchers simply wanted the archive data to be accessible through modern protocols such as CIFS and NFS rather than FTP, and wanted recall requests to perform better. With this knowledge in hand, IT Services developed a set of simple yet important success criteria for this initiative.
A successful peta-scale storage solution for The Scripps Research Institute had to meet the following requirements: a two-tiered storage solution in a global namespace composed of 20% disk and 80% tape; direct access over CIFS and NFS, using established directory services (Active Directory and NIS) for access control and file ownership; mixed-mode CIFS and NFS file system access; a data migration policy from disk to tape at the directory level; advanced data protection features for data stored on tape for many years; the ability to replicate data between redundant equipment in multiple data centers; and commercial support. The Scripps Research Institute was able to meet all of these criteria and today has capacity for two petabytes of research archive data. It can also scale to many more petabytes of capacity with a high-performance file system and frame-based library solution from Quantum Corporation.

11:30 Developing Scalable Production Software for a Clinical Molecular Diagnostics Lab: A Case Study

Marcia Nizzari, Director of Informatics, Informatics & High Performance Computing/IT, Good Start Genetics, Inc.

This case study discusses many problems in our clinical laboratory business: data visualization, managing huge data loads, and creating a system that is both malleable and nimble. Learn about integration of different types of assay data, build-versus-buy tradeoffs, design considerations, and how a system built for a startup company can scale to handle large volumes in a commercial setting.

12:00 pm Technologies and Applications for Managing and Sharing Data

Janis E. Landry-Lane, Program Director, World Wide Deep Computing, Life Sciences/Higher Education Segments, IBM

Large data analytics poses several challenges. 1. How can I afford to keep all of this data online? Discussions will center on cost-effective architectures for long-term data archive and use. 2. Where is the data? Is it in an online file system, on near-line tapes, in relational databases, in sensor data streams or on the web? Proven technologies allow for lifecycle management of data, and schemas allow users to access data without worrying about where it is located or what protocol to use to connect, and without establishing a separate account or password/certificate for each of the underlying computer systems. Genomic information is one of the newest sources of insight. Learn how to integrate data and perform the analysis.

12:30 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own

1:40 Chairperson’s Remarks

Brian Bissett, Program Manager, Office of Systems, Social Security Administration 

1:45 Talk Title to be Announced

Brian Bissett, Program Manager, Office of Systems, Social Security Administration

2:15 Integration of High Throughput Omics Data Sets Using the Hadoop/HBase Platform

Ronald Taylor, Ph.D., Research Scientist, Computational Biology & Bioinformatics Group, Pacific Northwest National Laboratory

The scalable biological data warehouse system being built at the U.S. Department of Energy’s Environmental Molecular Sciences Laboratory (EMSL) using Hadoop and HBase will be described. This warehouse is being designed to store and manipulate data into the high terabyte range. A summary will be given of the current state of data storage capabilities and parallelization of analytics work.

2:45 Accelerated Bioinformatics - The Promises and Pitfalls  

Martin Gollery, CEO and chief consultant, Tahoe Informatics

Server farms built to feed the ever-growing need for biocomputing processing power have become huge drains on electricity, air conditioning, floor space and manpower. To reduce these problems, many labs are turning to various types of accelerators to get their work done faster and more efficiently than ever before. This talk will include a discussion of the various accelerators available, the companies that supply them, and specific case studies that show how each of them is used in real-world settings.

3:15 Refreshment Break in the Exhibit Hall with Poster Viewing 

Project Management and Workflow Solutions 

3:45 Global Is the New Normal: How Can We Succeed in Implementing Laboratory Informatics Projects Globally?

Eduard de Vries, Senior Manager, Information Technology, IDEXX Laboratories

How do you establish the right goals and guiding principles for successful implementation of laboratory informatics projects in an international, multicultural environment and across acquisitions? This case study will describe what worked for us and the change management methodologies we used.

4:15 Transition of Project to Portfolio

Gurpreet Kanwar, Senior Project Manager, Information Management, Nav Canada

This presentation will help project managers successfully move a project into a portfolio. It also covers various techniques that can be used together with portfolio managers to close the project. The speaker has worked as a project manager for more than 10 years and has 18 years of experience in IT technical project planning and implementation.

4:45 The Algorithm Makes the Results 

Jeffrey Rosenfeld, Ph.D., Assistant Professor of Medicine, IST/High Performance & Research Computing, New Jersey Medical School (UMDNJ)

There are currently a large number of pipelines available for processing genome sequencing data. Many of these pipelines rely on simplifying assumptions that do not always hold in the data. I will describe how the calling of complex variants and the joint annotation of nearby SNPs are critical for obtaining correct sequencing results. Further, I will discuss how variability in the lengths of sequencing reads, along with the algorithms being used, can greatly alter the outcomes of an RNA-seq experiment.

5:15 Best of Show Awards Reception in the Exhibit Hall

6:15 Exhibit Hall Closes


7:00 am Breakfast Presentation Panel: Enabling Technology. Leveraging Data. Transforming Medicine.


Samuel Aronson, Executive Director, IT, Partners HealthCare Center for Personalized Genetic Medicine
Sanjay Joshi, CTO, Life Sciences, EMC Isilon
Glen Otero, Life Sciences HPC Solution Architect, Dell
Ketan Paranjape, Global Director, Healthcare & Life Sciences, Intel Corp.
Toby Bloom, Director, Informatics, Genomics Platform at Broad Institute

As we arrive at the $1000 genome, we find the fundamental problems have shifted. The challenge is no longer shrinking the cost of sequencing but the explosive growth of big data: the downstream analytics with rapidly evolving parameters, data sources and formats; the storage, movement and management of massive datasets and workloads; and, perhaps most paradoxical of all, the challenge of articulating the results and translating the latest findings directly into improved patient outcomes. Please join Intel and our distinguished panel to discuss how collaboration with a broad range of ecosystem partners, aimed at developing innovative solutions to seemingly intractable problems emerging in healthcare and life sciences today, is driving us towards the vision of personalized medicine.


8:00 Featured Presentation Introduction

Geoffrey Noer, Senior Director, Product Marketing, Panasas

8:10 Trends in the Trenches 2013

Chris Dagdigian, Founding Partner and Director of Technology, BioTeam, Inc.

HPC Trends in the Trenches is one of the most popular presentations of the Expo! This talk will present how common HPC problems in life science informatics have been approached by organizations of varying type and size. We will discuss observed trends in computing, workflows and data movement, along with details on particularly clever solutions observed in production environments around the world.

Managing Big Data: Genome Center Perspectives 

8:45 Chairperson’s Opening Remarks

Jason Wang, Co-Founder & CTO, Arpeggi, Inc.

8:50 Managing Big Data: The Genome Center Perspective

Moderator: Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World

Guy Coates, Ph.D., Informatics Systems Group, The Wellcome Trust Sanger Institute

Eric Jones, Manager, Research Computing, Broad Institute

Xing Xu, Ph.D., Director, Cloud Computing Product, BGI Americas Corporation

Alexander (Sasha) Wait Zaranek, Ph.D., Director, Informatics, Personal Genome Project, Harvard Medical School; Scientific Director, Clinical Future, Inc.

Genome centers not only have the challenge of managing petabytes of data but the implied responsibility of sharing their hard-fought solutions and best practices with the multitude of organizations lacking their IT resources. This special session draws together the IT directors of various world-class genomics institutes to discuss their technological and organizational strategies for managing big data.

9:50 Big Data: The Challenges Around the 5 Vs

Kevin Brode, Senior Director, Health & Life Sciences, Americas, Hitachi Data Systems

10:20 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced

10:45 Plenary Keynote Panel Chairperson’s Remarks

Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World

10:50 Plenary Keynote Panel Introduction

Yury Rozenman, Head of BT for Life Sciences, BT Global Services

Niven R. Narain, President & CTO, Berg Pharma


11:05 The Life Sciences CIO Panel

Remy Evard, CIO, Novartis Institutes for BioMedical Research
Martin Leach, Ph.D., Vice President, R&D IT, Biogen Idec
Andrea T. Norris, Director, Center for Information Technology (CIT) and Chief Information Officer, NIH
Gunaretnam (Guna) Rajagopal, Ph.D., VP & CIO - R&D IT, Research, Bioinformatics & External Innovation, Janssen Pharmaceuticals
Cris Ross, Chief Information Officer, Mayo Clinic
Matthew Trunnell, CIO, Broad Institute of MIT and Harvard

12:15 Luncheon in the Exhibit Hall with Poster Viewing

Panel Session: Building the IT Architecture of the New York Genome Center

2:00 Closing Featured Panel Session Introduction

Wanmei Ou, Senior Product Strategist, Oracle Health Sciences

2:10 Panel Session: Building the IT Architecture of the New York Genome Center

Moderator: Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World

Chris Dwan, Acting Senior Vice President, Information Technology and Research Computing, New York Genome Center

Jim Harding, CTO, Sabey Corporation

Sanjay Joshi, CTO, Life Sciences, EMC Isilon Storage Division

Robert B. Darnell, M.D., Ph.D., President & Scientific Director, New York Genome Center

George Gosselin, CTO, Computer Design & Integration LLC


In 2011, a consortium of 11 major academic and medical organizations in and around New York announced the creation of the New York Genome Center (NYGC). Under the direction of Robert B. Darnell, the NYGC aspires to be a world-class genomics and medical research center, and is currently undergoing construction in the heart of Manhattan. NYGC management has the opportunity to design and create a state-of-the-art IT and data management infrastructure to handle, store and share the output from what will rapidly become one of the world’s foremost genome sequencing facilities. This series of talks will describe the thinking that went into the design, creation and construction of the NYGC’s IT infrastructure and entire data management strategy.

4:00 Conference Adjourns


*IBM and the IBM logo are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide.
