Data Science and Analytics Technologies

Best Practice Methods for Large-Scale Data to Advance Biomedical Research

May 17 - 18, 2023 ALL TIMES EDT

The Data Science and Analytics Technologies track will explore data science and analytics tools, technologies, and languages that data scientists are using to gain extra insights and value from data. Presentations will explore the importance of scalable platforms vs individual data science support, becoming a data-driven organization, innovative approaches to data management and analytics, understanding real questions that need to be answered, making real impact with data science, and applying data science and tools.

Monday, May 15

8:00 am – 6:00 pm Hackathon*

*Separate Complimentary Registration Required, see Hackathon page to submit your project OR register to participate

2:00 pm – 5:00 pm Registration Open – Come Early and Avoid the Lines

Tuesday, May 16

7:00 am Registration Open

8:00 am Recommended Pre-Conference Workshops and Symposia*

On Tuesday, May 16, 2023 Cambridge Healthtech Institute is pleased to offer nine pre-conference workshops scheduled across three time slots (8:00-10:00 am, 10:30 am-12:30 pm, and 1:45-3:45 pm) and two Symposia from 8:25 am-3:45 pm. All are designed to be instructional, interactive and provide in-depth information on a specific topic. They allow for one-on-one interaction and provide a great way to explain more technical aspects that would otherwise not be covered during the main conference tracks that take place Wednesday-Thursday.

*Separate registration required. For details, see Workshop agendas, FAIR Data Symposium agenda, and Knowledge Graphs Symposium agenda.

8:00 am – 3:45 pm Hackathon*

*Separate Complimentary Registration Required, see Hackathon page to submit your project OR register to participate

3:45 pm Refreshment Break and Transition to Plenary Keynote

PLENARY KEYNOTE PROGRAM

4:00 pm

Plenary Keynote Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

4:05 pm

Innovative Practices Awards

Joseph Cerro, Independent Consultant

Chris Dwan, Independent Consultant, Dwan, LLC

Allison Proffitt, Editorial Director, Bio-IT World

The Innovative Practices Awards recognize and celebrate innovation that advances life sciences research. Bio-IT World is currently accepting entries for the 2023 Innovative Practices Awards, a competition designed to recognize partnerships and projects pushing our industry forward. Winners will be announced in mid-April 2023, recognized during the Tuesday, May 16 Plenary Keynote Program, and scheduled to give a 30-minute podium presentation about their project during the conference. The deadline for entry is March 3, 2023. For more details about the Awards and to submit an application, visit the official Bio-IT World Innovative Practices Awards page: https://www.bio-itworld.com/Award/.

4:20 pm Plenary Keynote Introduction

David Gosalvez, PhD, Executive Director, Strategy & Informatics Portfolio, Revvity Signals

4:30 pm PLENARY KEYNOTE PRESENTATION:

The Promise of Data, Analytics, and Technology: Fueling Scientific and Medical Breakthroughs

Anastasia Christianson, PhD, Vice President, Global Head of AI, ML, Analytics, and Data, Pfizer Inc.

Edward Cox, Head & General Manager, Digital Health & Medicines (DHM), Pfizer Inc.

The 21st century has been referred to as the Century of Biology. With 90% of the world’s 97 zettabytes of data generated in the past two years and 30% of today’s data being healthcare-related, how are we using data technology and advanced analytics (artificial intelligence, machine learning, and deep learning) to advance our understanding of disease and deliver “breakthroughs that change patients' lives”?

5:45 pm Welcome Reception in the Exhibit Hall with Poster Viewing

7:00 pm Close of Day

Wednesday, May 17

7:00 am Registration and Morning Coffee

PLENARY KEYNOTE PROGRAM

8:00 am

Plenary Keynote Organizer's Remarks

Allison Proffitt, Editorial Director, Bio-IT World

8:05 am PLENARY KEYNOTE INTRODUCTION:

Life Science Automation Opportunities – So Many Options, So Little Time

Santanu Sen, Vice President, Healthcare & Life Sciences, Virtusa

The COVID pandemic has demonstrated that therapies and vaccines can be developed in 18 months with a high degree of safety and efficacy. Pioneering work done by the companies involved has shed light on archaic processes that have been in existence for decades with little need for change. In this presentation, we will discuss collaborative efforts, enabling technologies, regulation, and workflow to automate these processes to advance personalized medicine initiatives.

8:15 am PLENARY KEYNOTE PRESENTATION:

Federated Futures: How the Largest Federated Learning Effort in Medicine Will Inform Our Next Steps

Spyridon Bakas, PhD, Assistant Professor, Radiology & Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania

Raymond Y. Huang, MD, PhD, Division Chief, Neuroradiology, Brigham and Women’s Hospital; Associate Professor of Radiology, Harvard Medical School

Jason Martin, Principal Engineer AI Research Science, Security Solutions Lab, Intel Labs

Is a federated learning model sufficient to handle data from 71 institutions and more than 6,000 patients located on six continents? Researchers from Penn Medicine and Intel Labs say yes. An interdisciplinary team created the largest to-date global federated learning effort to develop an accurate and generalizable machine learning model for detecting glioblastoma borders. We will share what we learned about creating and maintaining such a federation, how the software infrastructure evolved over the course of the study, and how this work will empower the future of high-quality, precision clinical care worldwide.

9:30 am Coffee Break in the Exhibit Hall with Poster Viewing

10:15 am Organizer's Welcome Remarks

STRATEGY DEVELOPMENT AND OPTIMIZING DATA SCIENCE

10:20 am

Chairperson's Remarks

Ari E. Berman, PhD, CEO, BioTeam, Inc.

10:25 am PANEL DISCUSSION:

The Future of Data Science in Biomedicine: New Approaches to Make FAIR a Reality

PANEL MODERATOR:

Ari E. Berman, PhD, CEO, BioTeam, Inc.

Data science in life sciences and biomedical research has surged forward in the last few decades, and the community has seriously considered new data hygiene approaches to reach a more universal state of FAIR (Findable, Accessible, Interoperable, and Reusable). Unfortunately, FAIR data remains an elusive goal. While there are pockets of excellence and portions of the greater scientific community are aligning on data strategies, governance, hygiene, and common formats, the community writ large is still quite resistant to making the changes necessary to reach those goals. In this session, a core group of experts from across the field will explore what practical technologies, methods, and cultural changes might lead the field to accomplish our FAIR goals sooner rather than later.

PANELISTS:

Vivien R. Bonazzi, PhD, Managing Director & Chief Biomedical Data Scientist, Deloitte Consulting LLP

Kjiersten Fagnan, PhD, CIO, Data Science & Informatics, Lawrence Berkeley National Laboratory

Matthew Trunnell, Data Commoner, Pandemic Response Commons

11:25 am

The Hackensack Meridian Health & Carenostics CKD Collaboration, Nominated by Bayer G4A Project: AI for Earlier Clinical Intervention in Chronic Kidney Disease (Innovative Practices Awards Winner)

Kash Patel, Executive Vice President, Chief Digital Information Officer, Hackensack Meridian Health

Kanishka Rao, Co-Founder & COO, Carenostics

Bharat Rao, PhD, Co-Founder & CEO, Carenostics

Chronic Kidney Disease (CKD) is a leading cause of mortality, affecting >800M people worldwide. Early detection and intervention have been shown to slow disease progression, saving lives and reducing costs. Unfortunately, ~85% of CKD cases are estimated to be undiagnosed. To tackle this, Carenostics has developed machine learning (ML) models that have retrospectively identified 50% of the undiagnosed CKD population at 3x the specificity of current testing practices. Carenostics is deploying these models into clinical practice at Hackensack Meridian Health (HMH), an 18-hospital health system with >6M patient records. The solution identifies undiagnosed and untreated CKD patients using existing EHR data, addresses health inequities through bias-adjusted ML, and activates clinicians with an intuitive, EHR-integrated interface. In the next 12 months, Carenostics expects to help HMH diagnose 50,000 previously undiagnosed CKD patients and will expand its platform to help HMH identify and proactively treat patients with other chronic diseases.

11:55 am The Best of Both? Powering Scientific and Clinical Insights with NLP and ML

Constantinos Katevatis, Associate Director R&D NLP, Natural Language Processing, IQVIA

Advanced analytics using AI (ML, NLP and more) have the potential to transform drug discovery and development. We will present on the effectiveness of state-of-the-art natural language processing (NLP), combining rules-based and model-based approaches, in tackling real-world use cases in life sciences and healthcare, to improve drug development and human health. 

12:10 pm Computational Phenotyping and Analytics

Steven E. Labkoff, MD, FACP, FACMI, FAMIA, Global Head, Clinical and Healthcare Informatics, Quantori

One of the most important things in analytics is selecting the right cohort on which to perform analyses. Without proper computational phenotyping, quality analytics becomes difficult. Learn about the challenges that computational phenotyping poses for data analytics. Dr. Labkoff will discuss challenges in sensitivity and specificity in cohort generation and the downstream challenges that these bring to drawing conclusions from RWE data sets.

12:25 pm Conquering the Lab Data Deluge

Rob Brown, PhD, Senior Director Product Marketing, Product Services, Sapio Sciences

As the volume and variety of lab data continue to grow exponentially, scientific researchers can become consumed with the struggle to access and organize that fragmented data. Research productivity suffers, and the power to unlock insight and unleash AI is impacted. We will present a new Science-Aware solution that empowers scientists to directly access all of their data in one convenient location, with science-context, and using science-aware applications.

12:40 pm How Generative AI and ChatGPT Together with Intelligent Document Review Can Help Pharma Accelerate Filings

Dimitrios Mizantsidis, Director, Product Marketing, CDAS, IQVIA Technologies

Gary Shorter, Head of AI and Data Science, IQVIA Technologies

The pharmaceutical industry is experiencing pressure to speed up the process of bringing products to market. During this presentation, attendees will learn how Intelligent Document Review can help bring a document to “life” by creating a “Digital Twin” and using this data to train Generative AI and ChatGPT-like models to automate processes and reduce errors in near real time, accelerating clinical trials and high-quality filings.

12:55 pm Session Break and Transition to Luncheon Presentation

1:05 pm LUNCHEON PRESENTATION: Your Secret Weapon for Powerful Scientific Analytics

Josh Bond, Senior Director, Product Portfolio, Revvity Signals

Signals Inventa is designed to help scientists make better decisions faster and takes a unique approach to modeling scientific data, placing flexibility, simplicity, and usability at the forefront. Compared to traditional Oracle data warehouses, Signals Inventa's data models are easy to maintain, query, and scale. In this talk, we will share how the Inventa approach can be used as your secret weapon to unify vast amounts of data and accelerate decision-making.

1:50 pm Refreshment Break in the Exhibit Hall with Poster Viewing

DATA SCIENCE METHODS AND ANALYSIS TO SUPPORT KNOWLEDGE DISCOVERY

2:35 pm

Chairperson's Remarks

Rishi R. Gupta, PhD, Associate Director, Data Science, Novartis Institute for Biomedical Research

2:40 pm

Genome Citation Service: Identify Publications That Used Your Data

Kjiersten Fagnan, PhD, CIO, Data Science & Informatics, Lawrence Berkeley National Laboratory

The DOE Joint Genome Institute collaborated with NamesforLife, LLC to build a service that takes data set identifiers and returns publications that have a high likelihood of having used that data. In this talk I will give an overview of the service and the algorithms that we use to identify the relevant publications. I will also give an overview of the methods we use to process the publications to identify relevant metadata and associate it with the original data to improve discoverability. 

3:10 pm

Harnessing Free Text Insights through the Power of NLP

Alice Chung, Senior Analytics Manager, Strategic Analytics & Intelligence, Genentech, Inc.

We will present the journey to harness free text data for analysis and knowledge discovery. We discuss the benefits of deploying an automated system that generates easy-to-comprehend insights and share lessons learned along the way (including the implementation approach) to ensure that NLP is indeed able to assist in delivering business value when it comes to medically oriented data sets. There is no one-size-fits-all solution; only a "fit for purpose" approach can promise the greatest impact.

3:40 pm

AutoFocus: Automated Design Platform with Augmented Learning

Rishi R. Gupta, PhD, Associate Director, Data Science, Novartis Institute for Biomedical Research

AutoFocus is a centralized web-based compound design and visualization platform that allows medicinal chemistry teams working together with CADD experts to generate new ideas, refine them in 3D, and collaborate to provide chemistry design assessment and analysis in an interoperable and efficient manner. Several impactful tools are already available and being used across GDC, such as Library Enumeration, 3D overlay of structures, Docking, and property calculation. One of the most recent features has allowed us not only to democratize design, but also to put the focus on augmented learning for better compound design and optimized cycle times.

4:10 pm An Assessment of PubMed Bibliographic Records: The Importance of Data Quality on Knowledge Graph Construction

Stephen Howe, Principal Product Manager for Data and Analytics, CCC (Copyright Clearance Center)

Knowledge graphs nimbly connect diverse data sources and types in order to facilitate new insights. To have confidence in the output, it is imperative to understand the quality of our data in a quantifiable way. Drawing on our own experience building a knowledge graph of authors, we present an evaluation and assessment of the data quality of one commonly used data source in the life sciences: PubMed bibliographic citations.

4:25 pm FAIR Data Platforms Need FAIR Pipelines

Dave Clifford, Head of Technology and AI, BDH Data, Biogen

In August 2022, BDH embarked on a journey to modernize its data platform leveraging AWS and Databricks as foundational elements. The initial effort focused on developing repeatable modular workflows for ingestion, curation and harmonization required to support Biogen’s real world research networks. Dave will share an overview of the approach and lessons learned through this ongoing initiative.  

4:40 pm Best of Show Awards Reception in the Exhibit Hall with Poster Viewing

6:00 pm Close of Day

Thursday, May 18

7:30 am Registration and Morning Coffee

PLENARY KEYNOTE PROGRAM

8:00 am

Plenary Keynote Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

8:05 am Plenary Keynote Sponsor Introduction (Opportunity Available)

8:15 am PLENARY PANEL DISCUSSION:

Assessing Innovation: How Pharma Makes Tech Investment Decisions

PANEL MODERATOR:

Aaron Mann, CEO, Clinical Research Data Sharing Alliance

This panel session will assemble senior leaders who evaluate new technology adoption. We will hold an interactive discussion to help provide transparency in the evaluation and decision-making process for assessing and investing in new technologies. Themes we will cover include: 1) the process for evaluating, piloting, and scaling new technologies and technology approaches; 2) how an organization evaluates an emerging technology vendor landscape; 3) when and how a formal buying process becomes required; and 4) identifying key stakeholders, decision-makers, and gatekeepers.

PANELISTS:

April Bingham, Executive Director, Global Medical Compliance and Governance Chapter, Roche

Peter Mesenbrink, PhD, Executive Director, Biostatistics, Novartis Pharmaceuticals

Maria Palombini, Global Practice Leader, Healthcare & Life Sciences, IEEE Standards Association

Laszlo Vasko, Senior Director, Clinical Innovation R&D IT, Janssen Pharmaceuticals, Inc.

9:30 am Coffee Break in the Exhibit Hall with Poster Viewing

10:15 am Organizer's Remarks

REGULATORY CONSIDERATIONS: BIOINFORMATICS, HEALTH RESEARCH, AND DIGITAL HEALTH TECHNOLOGIES

10:20 am

Chairperson's Remarks

John M. Conley, PhD, William Rand Kenan Jr. Professor, Law, University of North Carolina

10:25 am

US and EU on Verge of Ending Data Transfer Crisis: What It Means for Bioinformatics Research

John M. Conley, PhD, William Rand Kenan Jr. Professor, Law, University of North Carolina

For more than two years, lawful data transfers between the EU and US have been difficult to impossible, and even harder for health data. The EU and US have now reached an agreement in principle, which should be finalized by Spring 2023. This presentation will describe the new arrangement and the ways that companies and institutions in bioinformatics and health research generally will be able to take advantage of it so as to conduct their research more efficiently and cheaply, and without the current legal risks.

10:55 am

Regulatory Considerations When Designing Digital Health Technologies

Wenjing Wang, Associate Director, Global Regulatory Affairs & Clinical Safety, Merck & Co., Inc.

Digital Health Technologies are playing a more critical role in patient care, including remote patient monitoring and self-administered injection. In the US, the FDA has started to increase regulation of digital products based on the level of risk they pose to patients. This presentation will cover some of the regulatory considerations when designing Digital Health Technologies.

SELF-DRIVING CHEMICAL OPTIMIZATION

11:25 am

Self-Driving Chemical Optimization

Cihan Soylu, Senior Expert I – Data Science, Novartis

Clayton Springer, PhD, Computational Chemist, Global Discovery Chemistry, Novartis Institutes for BioMedical Research, Inc.

We describe an algorithmic approach to lead optimization. We take inspiration from geostatistics methods, which consider the uncertainty in predictions as well as the predictions themselves. This leads to an approach that has a concept of a chemical space. From a current dataset, the approach will suggest both large and small changes aimed at finding chemical matter that is better than what is known. This approach is also called "active learning." We use data from a project as a retrospective example to show how the approach works.

DATA SCIENCES & ANALYTICS TECHNOLOGIES TO ADVANCE BIOMEDICAL RESEARCH

11:55 am Data Sciences & Analytics Technologies to Advance Biomedical Research

Sreeni Reddy, Associate Vice President, Life Sciences & Healthcare, Birlasoft

Data Science and Analytics Technologies. Best Practice Methods for large-scale data to advance biomedical research

 

12:10 pm Leveraging Commoditized Large Language Models and Data Ops for Rapid Biotech Innovation

Kartik Thakore, Entrepreneur in Residence, Elevance Health

This talk explores the evolving biotech industry and the importance of data operations and real-world evidence in driving innovation and delivering value. With Large Language Models becoming more commonplace, we'll examine the impact of innovative AI technologies. We'll discuss strategies to reduce the lag between data acquisition and operational fine-tuning. We'll also explore how reimagining past real-world cases with advanced AI can inspire novel solutions and enhance the industry's growth.
 
12:40 pm Data-Centric AI, What is That?

Alex Wilson, Team Lead, Knowledge & Insights, DrugBank

A brief introduction to data-centric AI, an exciting shift that is taking place in how researchers, data scientists, and organizations are approaching AI, and in particular how these changes might impact the healthcare spectrum.

12:55 pm Session Break and Transition to Luncheon Presentation

1:05 pm LUNCHEON PRESENTATION: Unlock Omic Data Potential for Biomarker Discovery and Optimize Lab Operations

Anushka Brownley, Associate Director, Product Management, Illumina

Michael Edwards, PhD, Director of Discovery Analytics, MycoTechnology

Rohita Sinha, PhD, R&D Director, Bioinformatics, Eurofins-Viracor

Connecting omic data in a single ecosystem empowers users to unlock the true potential of their data. In this session you will hear from Rohita Sinha, PhD, of Eurofins Viracor, a molecular and immunodiagnostic company, and Michael Edwards, PhD, of MycoTechnology, a biotechnology company focused on increasing global protein supply through mushrooms, on how each organization has accelerated lab operations and biomarker discovery to gain efficiencies through Connected Software.

1:50 pm Refreshment Break in the Exhibit Hall with Poster Viewing

TRENDS FROM THE TRENCHES

2:35 pm

Chairperson's Remarks

Ari E. Berman, PhD, CEO, BioTeam, Inc.

2:40 pm

Trends from the Trenches

Ari E. Berman, PhD, CEO, BioTeam, Inc.

Adam Kraut, Director, Infrastructure & Cloud Architecture, BioTeam, Inc.

Anna Sowa, PhD, Senior Scientific Consultant, BioTeam, Inc.

Since 2010, the “Trends from the Trenches” presentation has been one of the most popular annual traditions of the Bio-IT Program. The intent of the session is to deliver a candid (and occasionally blunt) assessment of the best, the most worthwhile, and the most overhyped information technologies (IT) for life sciences. The presentation has helped scientists, leadership, and IT professionals understand the basic topics related to computing, storage, data transfer, networks, cloud, data science, and machine learning that are involved in supporting data-intensive science. In 2023, consultants from BioTeam will give an overview of the trending issues in life sciences. An interactive Q&A moderated discussion with the audience follows. Come prepared with your questions and commentary for this informative and lively session.

4:10 pm Close of Conference





