Hengeveld: Big Data Meets HPC to Solve Hard Problems and Improve Lives

By John Hengeveld

John Hengeveld is the HPC Segment Marketing Director for Intel’s Technical Computing Group.  His Intel Developer Forum session titled “Big Data Meets High Performance Computing” will take place at 3:30 p.m. Wednesday in Room 2002 of Moscone West, San Francisco.

I’ve been hearing a lot of buzz about “Big Data” … people talking in terms of mining Facebook posts for marketing data. I didn’t take all the talk seriously at first, but I do now. … Let me tell you how Big Data might just save my life.

In March, I had a major appendix attack. And it turns out that within my appendix was a material called appendiceal mucinous neoplasm, which is a very rare type of cancer.  There is no cure for my cancer—not yet, anyway. I’m just hanging on and crossing my fingers and hoping things work out.

Now, the first time my doctor went over the pathology report, she told me I had a 30-60 percent chance of having less than seven years to live. But then I got some good news from my doctors. After a lot of study and analysis, they offered a more encouraging assessment. They reasoned that I had a better-than-average prognosis after all, given that I didn’t appear to have very much of the material or to have had a lengthy exposure to it. So I went back to work.

But it turns out that in the relatively near future, Big Data and high-performance computing (HPC) may well work together to unravel the mysteries of rare cancers like mine and offer new hope to people like me.

I like to think of Big Data as an oil field with a lot of breadth and a lot of depth. To get value out of the field, you need a powerful pump, and that’s HPC. The HPC pump allows you to draw insights from the Big Data. Today, researchers are doing just this across a broad spectrum of fields. For me, the research being done in the field of genomics hits closest to home, because this research could eventually lead to a world of personalized therapies based on a genomic analysis of a patient’s cancer.

This is one of the topics we will dive into during a session I will lead Wednesday at the Intel Developer Forum. That session—titled “Big Data Meets High Performance Computing”—will include an appearance by Professor Michael Franklin, a computer scientist who directs the AMPLab at UC Berkeley, one of the leading teams working on applications of Big Data to a new generation of problems.

Professor Franklin will explore some of the latest innovations in five applications that combine Big Data with HPC. These applications range from genomics research to crowd-sourcing to increase battery life on your cell phone (yes, it works—I’ve done it). I, of course, will have a special interest in the discussion of the role that Big Data and HPC can play in helping researchers understand the genetics in cancers and formulate appropriate therapies.

Already, people at Berkeley are using HPC to study the public data on cancer genomes. They have accessed what’s called The Cancer Genome Atlas. This atlas shows the genomics of tumors and their hosts. The study is focused on finding the mutations that separate the cancers from their hosts, and then using that knowledge to understand the nature of those mutations and how they might be blocked or eliminated.

This kind of research is good news—not just for me but for many other cancer patients to come. In this sense, Big Data and HPC provide hope for the future.

From my perspective, Big Data is not about sifting through massive numbers of Facebook posts and seeing who the “likes” are. It’s really about generating insights to solve hard problems and improve the lives of people.