BioScience Gets a Boost Through Order-of-Magnitude Computing Gains

In this guest post from Intel, the company explores how bioscience is benefiting from order-of-magnitude gains in computing performance.

The life sciences are approaching an inflection point. It took nearly 13 years and about three billion dollars to sequence the first human genome. Today, it takes only a few hours and about USD 1,000[1] to accomplish the same task.[2] Meanwhile, technologies such as Cryo-Electron Microscopy (Cryo-EM) and Molecular Dynamics (MD) are revealing biological processes at the atomic level, helping scientists map complex molecular pathways with unprecedented clarity.

In combination, these advances have the potential to transform the way we understand life and how we treat injury and disease. Yet processing the volume of data generated by today’s complex research tools has become a barrier to rapid progress. Scientists often have to wait days for experimental data to be transformed into usable information.

Tackling the Data Challenge Through Software Optimization

Intel is working with leaders in the field to eliminate today’s data processing bottlenecks. The first step is to modernize key scientific algorithms to make better use of modern processors, such as multicore Intel Xeon Scalable processors and many-core Intel Xeon Phi™ processors. The benefits can be profound. For example:

  • Professor Knut Reinert at the Free University of Berlin is working to accelerate genome analysis. His SeqAn* library of genomic algorithms can speed up analysis by as much as 55X through multicore scaling.[3]
  • Professor Simon Warfield of Harvard Medical School reduced image reconstruction times for his advanced brain scanning technique by up to 161X.[3] His team can now generate high-resolution images of neural tracts in minutes, fast enough to support high-volume research and clinical emergencies.
  • Professor Erik Lindahl of Stockholm University worked with Intel to reduce Cryo-EM image reconstruction times from as long as 47 hours to as little as 6.6 hours.[3] The work continues, with the goal of providing near-real-time images of biomolecules. This would allow researchers to adapt their Cryo-EM experiments on the fly, potentially shaving months off experimental timelines.
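
To put the Cryo-EM result on the same footing as the "X" figures in the other two examples, the quoted times can be converted into a speedup factor. The short Python sketch below does that arithmetic. The helper name `speedup` is ours, not from the white paper, and the calculation assumes the 47-hour and 6.6-hour figures describe comparable workloads, which the text ("as long as" and "as little as") does not guarantee.

```python
def speedup(before_hours: float, after_hours: float) -> float:
    """Return how many times faster the 'after' run is than the 'before' run."""
    if before_hours <= 0 or after_hours <= 0:
        raise ValueError("durations must be positive")
    return before_hours / after_hours


# Figures quoted above for Cryo-EM image reconstruction (assumed comparable runs).
cryo_em = speedup(47.0, 6.6)
print(f"Cryo-EM reconstruction: about {cryo_em:.1f}X faster")  # about 7.1X
```

Read this way, the Cryo-EM gain is roughly 7X, smaller than the 55X and 161X figures but applied to a job that previously took almost two days.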

Faster, Simpler Hardware

Modernized codes not only make better use of modern processors; they also tend to scale more efficiently across the multiple server nodes of an HPC cluster. Depending on the amount of parallelism exposed in the code, cluster computing can potentially increase performance by additional orders of magnitude. Calculations that once took days or weeks can often be completed in a matter of minutes or even seconds.
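
Why the amount of exposed parallelism matters so much can be illustrated with Amdahl's law, a standard textbook model that this article does not itself cite. The Python sketch below is an illustration under simplifying assumptions only: the parallel fractions and worker counts are hypothetical values chosen for the example, not measurements from any of the projects above, and the model ignores communication and I/O overhead.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup under Amdahl's law.

    parallel_fraction: share of the original runtime that can run in parallel (0..1).
    workers: number of cores or nodes sharing the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical values: compare codes that expose 90%, 99%, and 99.9% parallelism.
for fraction in (0.90, 0.99, 0.999):
    for workers in (16, 256, 4096):
        gain = amdahl_speedup(fraction, workers)
        print(f"parallel={fraction:.1%} workers={workers:>4} speedup={gain:7.1f}X")
```

In this model a code that is 90% parallel can never exceed a 10X speedup no matter how many nodes are added, while pushing the parallel share toward 99.9% is what opens the door to gains of two or more orders of magnitude. That is the rationale for modernizing the code before scaling out the hardware.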

Of course, traditional HPC introduces another layer of cost and complexity, and typically requires specialized expertise to deliver high performance for real-world workloads. Intel is working to eliminate those barriers with the Intel Scalable System Framework (Intel SSF) for Life Sciences. This reference architecture brings together all the elements of an HPC cluster into a balanced system architecture optimized for bioscience workloads, including molecular dynamics, genomics, molecular imaging, machine learning and AI, data visualization, and more.

Intel’s framework is designed to make HPC simpler, more affordable, and more powerful. It includes advanced Intel technologies like Intel Omni-Path Architecture fabric and Intel Optane SSDs that work together to resolve performance bottlenecks at every layer of the HPC solution stack. Just as importantly, Intel SSF is built to scale and adapt as new technologies emerge, because one thing is certain: biological data sets will continue to grow. It is essential to have a cluster that can expand as needed, so it continues to support fast, high-quality science.

To learn more about the hardware and software advances that are helping some of today’s leading scientists kick their research into a higher gear, read the Intel white paper, “Science without Constraints.”

[1] Source: National Institutes of Health (NIH) National Human Genome Research Institute, The Cost of Sequencing a Human Genome, last updated July 6, 2016. https://www.genome.gov/27565109/the-cost-of-sequencing-a-human-genome/

[2] Source: Illumina introduces the NovaSeq Series—a New Architecture Designed to Usher in the $100 genome, Illumina press release, January 9, 2017. https://www.illumina.com/company/news-center/press-releases/press-release-details.html?newsid=2236383

[3] For experimental details and configurations, see the Intel white paper, “Science Without Constraints.” https://software.intel.com/en-us/articles/science-without-constraints