Fighting Cancer with Deep Learning at Scale with the CANDLE Project


In this episode of Let’s Talk Exascale, Mike Bernhardt discusses the CANDLE project for cancer research with Rick Stevens of Argonne National Laboratory. The CANcer Distributed Learning Environment (CANDLE) is an ECP application development project targeting new computational methods for cancer treatment with precision medicine.

Transcript:

What is CANDLE all about?


It has to do with building a scalable deep-learning environment that can be applied, initially, to a variety of problems in cancer. CANDLE is designed to run on the big machines that we have at the US Department of Energy (DOE). The goal is to have an easy-to-use environment that can take advantage of the full power of these big systems to search through large combinations of deep-learning models to find optimal models for making predictions in cancer. Eventually, we’ll use the same environment in many other areas of DOE research, such as materials science, cosmology, or climate analysis.

What will CANDLE enable researchers to accomplish that they cannot today?

Right now, we can run individual deep-learning models on the nodes of supercomputers. However, it’s very difficult to sweep through thousands or tens of thousands of model configurations to look for the optimal models, to database all of those results, and to visualize them. So the CANDLE environment will really enable individual researchers to scale up their use of DOE supercomputers for deep learning in a way that’s never been possible before.
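
As a rough illustration of the kind of configuration sweep Stevens describes, here is a minimal Python sketch. It is not the actual CANDLE API: the search space and the train_and_score function are hypothetical stand-ins for the node-sized training jobs that CANDLE would distribute across a supercomputer and whose results it would database.

```python
# Minimal sketch of a model-configuration sweep, assuming a hypothetical
# search space and scoring function. A real CANDLE run distributes these
# trainings across supercomputer nodes; here everything runs serially.
import itertools
import json

# Hypothetical hyperparameter search space over network configurations.
SEARCH_SPACE = {
    "hidden_layers": [1, 2, 4],
    "units": [64, 256, 1024],
    "dropout": [0.0, 0.2, 0.5],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def train_and_score(config):
    """Hypothetical stand-in: build, train, and validate one model.

    A real job would train a network on cancer data (e.g., drug response)
    and return a validation loss; here we return a deterministic placeholder.
    """
    return config["dropout"] + config["learning_rate"] * config["units"] / config["hidden_layers"]

# Enumerate every configuration; at scale, each iteration would be an
# independent job dispatched to its own node.
keys = sorted(SEARCH_SPACE)
results = []
for values in itertools.product(*(SEARCH_SPACE[k] for k in keys)):
    config = dict(zip(keys, values))
    results.append({"config": config, "score": train_and_score(config)})

# "Database all of that": persist every configuration with its metric,
# then report the best model found.
with open("sweep_results.json", "w") as f:
    json.dump(results, f, indent=2)

best = min(results, key=lambda r: r["score"])
print("best config:", best["config"], "score:", best["score"])
```

Even this toy grid has 81 configurations; the combinatorics are why sweeping tens of thousands of models is impractical without an environment that manages the runs and their results automatically.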

Is this research area taking advantage of some of the computer time available through the ECP?

Yes, we have a significant allocation as part of the ECP to use for development of the algorithms, but also to do some science. So as we debug our algorithms and debug the software stacks, assuming we have enough time left over, we’ll use our allocation to do real science.

What new collaborations have you developed through this project?

CANDLE is great in the collaboration sense because it involves Argonne, Oak Ridge, Lawrence Livermore, and Los Alamos, as well as the National Cancer Institute (NCI) and the Frederick National Laboratory for Cancer Research. The four DOE labs plus the NCI lab are working together on software infrastructure that plugs into the deep-learning frameworks, and they are also partnering on different cancer research problems, bringing those problems as the test cases for this new environment.

What milestones has CANDLE reached so far?

We made a major software release in July, and before that, we released seven benchmark problems that represent our design targets. In the last couple of months, we’ve also released state-of-the-art problems that use this environment to advance deep-learning research in cancer. Further, we’ll do another major software release in the spring. Another accomplishment is that we have the system running now at Argonne, the National Energy Research Scientific Computing Center, Oak Ridge National Laboratory (ORNL), and the National Cancer Institute. So we have a large base of people who are collaborating on developing the software, testing it, and benchmarking it. I have a great set of collaborators in Fred Streitz of Lawrence Livermore National Laboratory, Gina Tourassi of ORNL, Frank Alexander of Brookhaven National Laboratory and formerly of Los Alamos National Laboratory (LANL), and Marian Anghel of LANL. We’ve been able to bring together people at each of the labs who are really thinking about deep learning and how it’s going to apply to problems in DOE.

How do you think a project such as CANDLE will affect research within DOE?

I believe that in the next five years, the number of research projects at DOE using machine learning to augment simulation will increase dramatically to take advantage of the large-scale data that DOE collects. That growth will require hundreds of people at each laboratory to come up to speed on how to use deep learning, including advanced network types and research methods, to run these large-scale deep-learning problems on the supercomputers. Deep learning has not been the traditional application type for these supercomputers, and it has different requirements in terms of how it uses data, how it produces outputs, and how you need to scale up and use the resources differently than traditional simulation applications do.

What do you hope to achieve during calendar year 2018?

I think we will have made major progress in different areas of cancer research and in this idea of predictive oncology: being able to predict which types of drugs would be most appropriate for given tumor types. We are using deep learning to analyze millions of medical records and to pull that data out in a way that we can then compute on it, and I think we’ll make major progress in that area. ORNL is already showing really positive results. Ultimately, we also want to steer simulations with deep-learning supervisors, where the deep-learning system analyzes the data as the supercomputer is running and steers the computation in a way that humans never could. The system can absorb more information about what’s happening in the simulation and make decisions about where to take it.
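
As a rough illustration of that steering loop, here is a minimal Python sketch. It is not CANDLE code: advance_simulation and Supervisor are hypothetical stand-ins for a long-running simulation and a trained model doing in-flight inference on its output.

```python
# Minimal sketch of a deep-learning "supervisor" steering a simulation,
# under the assumption that the simulation exposes its state between
# chunks and accepts updated parameters for the next chunk.
import random

def advance_simulation(state, params):
    """Hypothetical stand-in for one chunk of a long-running simulation."""
    state["value"] += params["step"] * (random.random() - 0.4)
    return state

class Supervisor:
    """Hypothetical stand-in for a trained deep-learning model that
    analyzes simulation output in flight and proposes new parameters."""

    def steer(self, state, params):
        # A real supervisor would run model inference on the latest output;
        # here we simply shrink the step size when the state drifts too far.
        if abs(state["value"]) > 1.0:
            params["step"] *= 0.5
        return params

state, params = {"value": 0.0}, {"step": 0.1}
supervisor = Supervisor()
for _ in range(100):
    state = advance_simulation(state, params)  # the simulation advances on the machine
    params = supervisor.steer(state, params)   # the model steers it between chunks
print("final state:", state, "final step size:", params["step"])
```

The point of the design is the feedback loop: because the supervisor sees every chunk of output as it is produced, it can react far faster and on far more data than a human watching the run ever could.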

Could CANDLE achieve its objectives without exascale?

No. All of these problems are really aimed at big, overarching challenges that require exascale, and we’re climbing up the exascale mountain, so to speak. The first machine that will get us substantially closer will be Summit at Oak Ridge. It will have thousands of GPUs and be extremely well suited to deep learning. But we’re also using the Intel processors at Argonne and Berkeley to advance this, so the system is not specific to any given architecture. It’s designed to run the problems across multiple architectures, but it’s definitely forward-looking toward exascale. Many of the problems we have in our imagination that we want to do over the next few years will absolutely require exascale.

Is there anything you would like people to know about CANDLE that we’ve not discussed?

It’s pretty amazing how much interest there is in applying high-performance computing and deep learning at scale to a problem like cancer. You can talk a lot about materials, cosmology, or even climate, and people are interested in those. But when we start talking about problems like cancer, which affects basically everybody, the level of interest is just off the charts, and it’s not just interest from DOE or our NCI collaborators. It’s interest from the vendors. Every one of the major vendors that provides computing capability, whether it’s Cray, Intel, NVIDIA, IBM, HPE, AMD, or ARM, is interested in trying to figure out how they can help make this kind of problem work extremely well for the entire community. It really is a problem that touches people in a completely different way than most of what we’ve worked on in the past.

Rick Stevens is a professor at the University of Chicago in the Department of Computer Science and holds senior fellow appointments in the University’s Computation Institute and the Institute for Genomics and Systems Biology, where he teaches and supervises graduate students in the areas of computational biology, collaboration and visualization technology, and computer architecture. He co-founded and has co-directed the University of Chicago/Argonne Computation Institute, which provides an intellectual home for large-scale interdisciplinary projects involving computation. He is also Associate Laboratory Director responsible for Computing, Environment, and Life Sciences research at Argonne National Laboratory. 

