2019 Hackathon - Bio-IT World Conference & Expo

2019 Archived Content

April 15-16, 2019 | Seaport World Trade Center | Boston, MA

Cityview 1

Bio-IT World is proud to bring together innovative data scientists and developers from across the industry to solve real-world data challenges using the principles of FAIR: Findable, Accessible, Interoperable and Reusable Data.

For the past two years, the Bio-IT World FAIR Data Hackathon has delivered a new level of collaboration to the annual Bio-IT World Conference & Expo in Boston. Facilitated in partnership with FAIR Data pioneers from the National Center for Biotechnology Information (NCBI), Dutch Techcentre for Life Sciences, and Ontoforce, the 2017 and 2018 Hackathons resulted in numerous notable FAIR Data projects.

The third annual Bio-IT FAIR Data Hackathon will continue the tradition of uniting life science and IT teams to tackle real genomic data sets with high potential impact. The hackathon will initially focus on evaluating the FAIRness of a range of data sets. In the second stage, each team will work with a different data set and develop approaches to improve its FAIRness through unique identifiers, links to additional data sets, collection of appropriate metadata, and other applicable techniques.
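
As a rough illustration of what a first-stage FAIRness evaluation could look like, here is a minimal Python sketch. The record fields, the four checks, and the equal weighting are all assumptions made for this example; they are not an official FAIR metric or the method used by the hackathon teams.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical description of a data set (field names are assumptions)."""
    identifier: str = ""            # a unique, persistent identifier
    access_url: str = ""            # where the data can be retrieved
    license: str = ""               # terms of reuse
    metadata: dict = field(default_factory=dict)
    linked_datasets: list = field(default_factory=list)


def fairness_report(record: DatasetRecord) -> dict:
    """Return one pass/fail check per FAIR letter (illustrative checks only)."""
    return {
        "findable": bool(record.identifier),
        "accessible": record.access_url.startswith(("http://", "https://")),
        "interoperable": len(record.linked_datasets) > 0,
        "reusable": bool(record.license) and bool(record.metadata),
    }


def fairness_score(record: DatasetRecord) -> float:
    """Fraction of checks passed, from 0.0 to 1.0."""
    report = fairness_report(record)
    return sum(report.values()) / len(report)


if __name__ == "__main__":
    # Placeholder values, not a real data set.
    record = DatasetRecord(
        identifier="example-accession-001",
        access_url="https://example.org/data",
    )
    for check, passed in fairness_report(record).items():
        print(f"{check}: {'pass' if passed else 'fail'}")
    print(f"score: {fairness_score(record):.2f}")
```

The techniques named for the second stage (adding an identifier, linking to other data sets, collecting metadata) each flip one of these checks from fail to pass, which gives a simple before-and-after measure.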

The two-day intensive hackathon precedes a full conference track dedicated to the effort: FAIR Data. Together, the conference and hackathon promise to deliver impactful projects and significant insight into the rapidly growing field of FAIR data in research. Consider extending your time at Bio-IT World to join this focused group of FAIR Data experts!

2019 Projects

Learn more about our 2019 FAIR Data Hackathon Projects.

BioAssay Express: Applying FAIR Principles to Bioassay Protocols

Collaborative Drug Discovery

Using BioAssay Express, we have created custom metadata annotation templates for a subset of five minimum information guidelines relating to experimental assay protocols for qPCR, microarray, RNAi, in situ hybridization/immunohistochemistry, and flow cytometry. We seek to determine how widely these guidelines are followed in the literature and to provide a solution for faster, more complete, and FAIR methodology reporting.

BLAST, Pipelines, and FAIR

NCBI, NIH

We examine making bioinformatics pipelines more FAIR, starting with those that use BLAST. We will be using a workflow language and a dockerized version of BLAST.

FAIR Beyond Data – Applications as FAIR

The Jackson Laboratory

The Jackson Laboratory is working on registering an application by specifying its inputs, its outputs, and the expected transformation. This proof of concept has two goals: being platform-agnostic and being FAIR. The project extends the principles of FAIR to applications that transform input into another form through algorithms, e.g. machine learning, normalization, or transformation algorithms.
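
The idea of registering an application by its inputs, outputs, and expected transformation can be sketched in Python as follows. This is only an illustration under stated assumptions: the `AppDescriptor` fields, the JSON layout, and the example values are hypothetical and do not describe The Jackson Laboratory's actual registry.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AppDescriptor:
    """Hypothetical, platform-agnostic description of an application."""
    identifier: str        # unique identifier for the application
    name: str
    inputs: tuple          # descriptions of expected inputs
    outputs: tuple         # descriptions of produced outputs
    transformation: str    # what the application does to its input


def to_registry_entry(app: AppDescriptor) -> str:
    """Serialize the descriptor to JSON so any platform can read it."""
    return json.dumps(asdict(app), sort_keys=True, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    app = AppDescriptor(
        identifier="example-app-001",
        name="count-normalizer",
        inputs=("raw count matrix (TSV)",),
        outputs=("normalized count matrix (TSV)",),
        transformation="normalization",
    )
    print(to_registry_entry(app))
```

Keeping the descriptor as plain data rather than executable code is what makes it independent of any one platform in this sketch.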

The Broad Institute’s Single-Cell RNA-Seq Data Set

The Broad Institute

This project will visualize the cancer genome using FAIR single-cell RNA-seq data. We will develop a prototype web application that uses REST APIs for single-cell genomics to drive genome visualizations of cancer genomes. Overall, the project will demonstrate how FAIR principles can drive useful applications in cancer genomics.

Bringing the Power of Synthetic Data Generation to the Masses

The Broad Institute

Everyone from tool developers and educators to researchers publishing their own work or trying to build on someone else’s is hamstrung by the lack of open-access genomic datasets appropriate for reproducing biologically meaningful analyses at scale (as opposed to plumbing testing, i.e. "will it run?", for which we have data in spades). The solution involves generating custom synthetic datasets, but current tools to do so are complex and require a lot of computational work that ends up being redundant when applied to multiple studies. This project will build on existing tools to provide community resources and streamlined tooling for generating custom synthetic data efficiently.

DOE JGI Genomics Data Set

U.S. Department of Energy Joint Genome Institute

The DOE Joint Genome Institute has a wealth of environmental genomics data that is available for public use. The JGI has been working on a new search and download system to ensure the data are findable and accessible in order to address concerns from the community. This is an open project, and hackathon participants will be able to help assess the FAIRness of the data access point and link it to other community efforts.

Generating a Fungal Index for the SRA

Find Bioscience

This project aims to produce an index of all fungi in the Sequence Read Archive (SRA). In support of this exercise, we'll provide the high-throughput compute resources necessary for the task, e.g. (a) an implementation of BLAST linked to a distributed archive of the SRA (minus quality scores); and (b) a simple web UI for use of this service. Generation of a fungal index for the SRA would help guide researchers to datasets of interest. We also intend to show that distributed computing can be used to generate and maintain indices.

Integrating Globus into Galaxy to Enable FAIRifying Data

Globus, University of Chicago

Galaxy is a widely used workflow engine, with over 7,000 publications citing it. Galaxy is deployed at small and large scales by academic and commercial users. Integrating Globus Auth into Galaxy as open source will give Galaxy users a simple and direct mechanism for signing in through their campus and other identity providers, simplifying identity management. This will enable the use of Globus within Galaxy tools for finding, accessing, and identifying data.

What is FAIR Data?
The volume of data continues to rise exponentially, but the capacity to fully employ this data is hampered by a series of limitations. FAIR is an initiative that has taken root in Europe and has the potential to significantly increase the value of life science data sets. While the concept shares some commonality with the semantic web, FAIR Data goes further to expand opportunities for knowledge-sharing and value.

How to Get Involved

Click here to register
Once you have registered, you will receive a participation form. Fill out the participation form to finalize your registration
Team and project assignments will be communicated via email closer to the event start date
Stay up-to-date on the Bio-IT World FAIR Data Hackathon 2019 schedule and project outcomes at #BioIT19

Maximize Your Time at Bio-IT World!

Complimentary access to the Bio-IT World Exhibit Hall and Plenary Keynote Session is provided to all hackathon participants. Extend your time onsite and take part in the collaboration and knowledge-sharing!
Consider joining FAIR data leaders for Bio-IT World's FAIR Data conference track.

Conference Tracks

T1: Data Platforms & Storage Infrastructure