Intel Select Solutions: BIGstack 2.0 for Genomics

In this video from SC17, Mark Bagley describes the newest Intel Select Solution for HPC: BIGstack 2.0 for Genomics.

BIGstack 2.0 incorporates our latest Intel Xeon Scalable processors, Intel 3D NAND SSDs, and Intel FPGAs while also leveraging the latest genomic tools from the Broad Institute in GATK 3.8 and GATK 4.0. This new stack provides a 3.34x speedup in whole genome analysis and a 2.2x increase in daily throughput. It delivers these performance improvements at a cost of just $5.68 per whole genome analyzed. The result: researchers will be able to analyze more genomes, more quickly and at lower cost, enabling new discoveries, new treatment options, and faster diagnosis of disease.

Advancements in genomics are opening new doors for understanding human diseases, and they are increasingly informing innovative precision treatment plans. Discoveries are dependent on processing, storing, and analyzing a growing amount of genomic sequencing data. In 2015, worldwide sequencing storage capacity approached a petabyte per year, and it continues to double every seven months. At this rate, genomics sequencing will generate hundreds of petabytes per year in the next five years, and could require nearly a zettabyte of storage per year by 2025.
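
The doubling arithmetic behind these projections can be sketched in a few lines of Python. This is only a rough illustration: the function name `projected_rate_pb`, the 1 PB/year baseline for 2015, and the assumption of perfectly constant seven-month doubling are simplifications introduced here, not figures or methods taken from Intel or the Broad Institute.

```python
def projected_rate_pb(months_elapsed, baseline_pb=1.0, doubling_months=7.0):
    """Yearly data rate in PB after `months_elapsed` months of constant doubling.

    baseline_pb and doubling_months are illustrative assumptions:
    roughly 1 PB/year in 2015, doubling every seven months.
    """
    return baseline_pb * 2 ** (months_elapsed / doubling_months)


if __name__ == "__main__":
    # Project the yearly rate at five-year intervals from the 2015 baseline.
    for year in (2015, 2020, 2025):
        months = (year - 2015) * 12
        print(year, f"{projected_rate_pb(months):,.0f} PB/year")
```

Under these simplified assumptions, the five-year figure lands in the hundreds of petabytes per year, in line with the projection above. The 2025 figure is far more sensitive to the assumed baseline and doubling period.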

The Broad Institute of MIT and Harvard is one of the world’s largest producers of human genomic data, creating about 24 TB of new data per day. The Broad Institute currently manages more than 50 PB of data.

Researchers require tools to analyze these enormous volumes of data in a timely manner to gain insights into disease and possible treatments. They need tools like the Genome Analysis Toolkit (GATK), a set of leading software methods created by the Broad Institute and trusted by the majority of genomics centers worldwide.

The Broad Institute will release GATK 4.0 as its next major version under an open source license for all users, including for commercial purposes. An open source license will make GATK available to a wider audience of scientists and researchers and will help accelerate and advance genomics analytics worldwide.

Intel-Broad Center for Genomic Data Engineering

Intel and the Broad Institute have collaborated on computing infrastructure and software optimization for years. In 2017, they launched the Intel-Broad Center for Genomic Data Engineering, a five-year collaboration between the two organizations to simplify and accelerate genomics workflow execution using GATK, Burrows-Wheeler Aligner (BWA), Cromwell, Intel Genomics Kernel Library (Intel GKL), GenomicsDB, and other tools and techniques.
