Discussing the ORNL Titan Supercomputer with ORNL’s Jack Wells


The U.S. Department of Energy’s (DOE) Oak Ridge National Laboratory recently held the official launch of the Titan supercomputer. This powerful system, boasting 20 petaflops of computational muscle, is the extension of Jaguar and will overcome many of the power and space limitations of the previous generation.

While Titan has yet to prove what it can do, it is a milestone that holds much promise for U.S. technology leadership.

We tracked down Jack Wells, Director of Science for the National Center for Computational Sciences (NCCS) at ORNL, to look behind the announcement and technical specs for an insider’s discussion of Titan.

The Exascale Report: When did the first discussion of “Titan” take place?

Wells: The plan to install a 20 petaflops system in the Oak Ridge Leadership Computing Facility (OLCF) in 2012 began in 2005. The heterogeneous architecture was finalized in 2010, and the name “Titan” came last, being decided in 2011.

The Exascale Report: Approximately how many man-hours of research time have gone into Titan to date, if you factor in researchers, designers, engineers … everyone who has been part of the ‘bring Titan to life’ effort?

Wells: This question has no firm answer. Titan leverages many technologies from Jaguar and from the DARPA HPCS research project. It also leverages years of development and research at NVIDIA on how to build and program GPGPUs. The number of man-hours can be as big or as small as an individual guesser wants to make it.

The Exascale Report: Will Titan eventually grow to be an exascale-class system?

Wells: No. The Oak Ridge Leadership Computing Facility plans to bring in a new system in 2016 with hundreds of petaflops and an exascale-class system around 2020.

The Exascale Report: What has been the most significant accomplishment or milestone achieved to date in the development of Titan?

Wells: At the time of this writing, Titan has just been delivered and is still going through acceptance tests. Several science teams that have been preparing for Titan’s arrival are eagerly awaiting its availability so they can make their important science runs.

The Exascale Report: What is unique about Titan?

Wells:

  • Largest machine for open scientific research.
  • Largest GPU-accelerated supercomputer.
  • First deployment of NVIDIA Tesla K20 GPU accelerators.

The Exascale Report: What is the biggest challenge you have faced in ensuring Titan will be a highly functional and high-performance system for multiple applications?

Wells: Two big challenges come to mind: The first challenge is that the greatly increased parallelism needed by Titan and all future supercomputers requires rethinking how applications solve their science problems. A bonus is that the increased parallelism in the applications we have already prepared for Titan is enabling them to run faster on all other classes of supercomputers, GPU-accelerated or not. As for the second challenge, Titan can do arithmetic so fast that data movement is becoming a significant factor in the application run time. Scientists now have to think about reducing data movement as well as arithmetic in order for Titan and other systems of this class to be highly functional.
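To make the data-movement point concrete, here is a minimal back-of-the-envelope sketch. It compares the time a kernel would spend on arithmetic with the time it would spend moving data. The function name `estimate_times` and the peak and bandwidth figures are hypothetical, chosen only for illustration; they are not Titan's specifications and do not come from the interview.

```python
def estimate_times(flops, bytes_moved, peak_flops_per_s, bandwidth_bytes_per_s):
    """Return (compute seconds, transfer seconds, limiting factor).

    A crude model: it assumes arithmetic and data movement each run at
    their peak rate and simply reports which one takes longer.
    """
    compute_s = flops / peak_flops_per_s
    transfer_s = bytes_moved / bandwidth_bytes_per_s
    bound = "data movement" if transfer_s > compute_s else "arithmetic"
    return compute_s, transfer_s, bound


# Hypothetical device (assumed numbers, not Titan's):
PEAK_FLOPS_PER_S = 1.0e12        # 1 teraflop/s of arithmetic
BANDWIDTH_BYTES_PER_S = 2.0e11   # 200 GB/s of memory bandwidth

# Element-wise addition of two double-precision vectors, a[i] = b[i] + c[i]:
# one floating-point operation per element, and 24 bytes moved per element
# (two 8-byte reads plus one 8-byte write).
n = 100_000_000
compute_s, transfer_s, bound = estimate_times(
    flops=n,
    bytes_moved=24 * n,
    peak_flops_per_s=PEAK_FLOPS_PER_S,
    bandwidth_bytes_per_s=BANDWIDTH_BYTES_PER_S,
)

print(f"arithmetic: {compute_s:.6f} s, data movement: {transfer_s:.6f} s")
print(f"limited by: {bound}")
```

Under these assumed figures the data movement takes far longer than the arithmetic, which is the situation Wells describes: making the arithmetic faster would not shorten the run unless the amount of data moved is reduced as well.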

The Exascale Report: Remind us of what the ultimate goal is for Titan?

Wells: The ultimate goal for Titan is to satisfy the Department of Energy’s mission need for leadership computing in the 2013 to 2016 time frame.

The Exascale Report: How important is Titan to establishing and maintaining U.S. technology leadership?

Wells: Computational science and engineering in general, and the mission of leadership computing in particular, are very important for U.S. technology leadership. The impact is both very broad and deep. It is difficult to imagine that a country can have leadership in science and engineering and, as a result, define the leading edge in innovative technologies without being truly excellent in computational science. This importance has been realized in many ways, from the Department of Energy’s national leadership in computational science and engineering, to the broad bipartisan support for supercomputing in federal R&D budgets, to the growing utilization of supercomputing by industry. Titan, specifically, is a major step on the road to maintaining expected growth rates in performance while “changing the game” with respect to energy efficiency for supercomputers based on commodity hardware components.

The Exascale Report: We know it’s early, but what is the most impressive achievement for Titan? Has it been technology or process or something else?

Wells: To date, the most impressive achievement for Titan is its construction on scope, on schedule, and on budget. We expect many more impressive results as we hand Titan over to our users.

The Exascale Report: What has been the most frustrating aspect of working on this program?

Wells: The decision to perform a rolling upgrade of Jaguar “in place” to produce Titan saved many millions of dollars. But it did take large fractions of our leadership computing capability away from our ongoing user program for significant amounts of time. Ideally, we would field our new machines alongside our existing machines without disrupting our ongoing user program.

The Exascale Report: What would you like our readers to know about Titan?

Wells: That this large, scalable, accelerated supercomputer is an excellent machine for computational science and engineering, and that we have a growing list of well-known scientific applications that are ready to utilize its capabilities to the fullest extent possible.
