…and how the NSF plans to help catalyze change.
In late September, the Chronicle of Higher Education attended the 20th anniversary celebration of the Coalition for Academic Scientific Computation (CASC) and filed a story called Supercomputers Often Run Outdated Software. It included this quote, among several others that I took the Chronicle to task for in this posting:
Supercomputers keep breaking records for processing speed, but software to operate them has not kept up with that increasingly zippy hardware. The often-rickety supercomputing computer code is becoming an obstacle to making better weather models, medical simulations, and other applications of high-performance computers, said experts at a conference here Wednesday on the future of academic supercomputing.
As I said in my response, the ideas of the speakers at the event are sound — for example, Ed Seidel, director of the National Science Foundation’s Office of Cyberinfrastructure, said that legacy supercomputing codes for large scale science are going to have to be “retooled or rethought” to take full advantage of the latest supercomputers. Right on.
But the article around the quotes painted a broad picture of “rickety software” written in languages that are no longer “stylish,” such as (brace yourself) Fortran. I argued that the article is inaccurate and unhelpful in the broader context of the absolute criticality of HPC to national and international research agendas.
Following the Chronicle article and my response to it, Ed Seidel agreed to talk with us in detail about his comments, the meeting, and how the NSF is positioning itself to spearhead change in a new generation of scientific software.
Seidel described an NSF that is acutely aware of the central role that computing plays in the process of scientific discovery. As a reflection of the cross-cutting impact of computing on all the disciplines that the NSF supports, Seidel told me the agency supports an Advisory Committee on Cyberinfrastructure (ACCI) composed of representatives from all the discipline areas of the Foundation. The committee is organized into six task forces:
- Grand Challenge Communities and Virtual Organizations, aimed at helping disparate science communities create products that integrate with each other to solve a total problem (hurricane satellite observations input to atmospheric models whose output goes into wave models, and so on)
- HPC and Advanced Computing, which covers all aspects of the computing pyramid, not just apex systems
- Software
- Data and Visualization, which considers tools, algorithms, and policies for data-driven scientific applications
- Learning and Workforce Development
- Campus Bridging, which considers ways to connect researchers on university campuses to remote computing resources
Together these task forces will spend the next 18-24 months conducting a series of workshops and gathering ideas and recommendations before submitting final reports on their focus areas. These reports will serve as input for the NSF’s strategic plan for cyberinfrastructure, which should be complete within the next three years.
A point that Seidel was at pains to emphasize is that the Office of Cyberinfrastructure isn’t just about apex computing hardware. This isn’t a recent shift in focus for the agency when it comes to computing: Revolutionizing Science and Engineering Through Cyberinfrastructure, a report published by the NSF in 2003 (also known as the Atkins Report after its chair, Dan Atkins), connected all of these components into a complete cyberinfrastructure to support scientific discovery. But the dramatic drop in cost per FLOPS that our community has enjoyed over the past five to ten years has certainly made it more feasible to distribute available funds equitably among all components, rather than focusing only on the hardware. “The investments are out of balance in terms of hardware and software,” says Seidel as he considers how cyberinfrastructure has been funded. “Hardware is very tangible, and is just easier to fund.”
He notes that hundreds of millions of dollars have been invested in hardware in the recent past, without anything close to that going into software. Hardware certainly used to be a bigger proportion of the challenge than it is going to be in the near future. But as we prepare for exascale, software has become a first-class concern on hardware architectures that will support billion-way parallelism.
“Software is really becoming the broader language of science,” says Seidel. “Even broader than mathematics, but we don’t really know how to fund it.” He notes that we have decades of experience funding hardware, and we now have a culture that knows how to build and run very large scale datacenters. By contrast, software efforts to date have been very individual, and “there is less and less efficiency in that model,” he says. “Software needs to be treated like a first-class citizen. So much is riding on the software side that it is really time to rethink how we build, fund, and maintain it.”
Seidel identifies two key issues: reproducibility, and bringing software engineering disciplines out of business applications and into scientific software. Work is also needed on abstraction layers and documentation, and researchers need to be taught as students how to contribute to existing codes.
The NSF at SC09
Interested in learning more about what the NSF is doing in software and beyond? Of course NSF-funded researchers will be contributing to the SC09 program throughout the week, but you’ll also want to consider attending some of the Birds-of-a-Feather sessions they are organizing this year (all of these are on Tuesday):
- NSF Strategic Plan for a Comprehensive National CyberInfrastructure
- Accelerating Discovery in Science and Engineering through Petascale Simulations and Analysis: The NSF PetaApps Program
- NSF High End Computing University Research Activity (HECURA)
“As we look forward to architectures of the near future,” argues Seidel, “the number of cores in these systems will surpass the number of transistors on the Motorola 68000 processor. Parallelism at that scale completely changes the game. There is a lot of work to be done.”

After decomposing and eliciting requirements for a recent supercomputer project, I found that the need to run legacy code imposes severe constraints on hardware construction and configuration. The reason is simple: even at hundreds of millions of dollars, an inefficient hardware configuration is still cheaper than re-engineering the code and retraining the scientists to program their experiments and use current technology.
In supercomputing, Moore’s law prevails. The reality of effective and efficient compute capability depends upon the close coupling of current hardware and software technologies. These current technologies will not be generally seen in supercomputing for 5 to 10 years, perhaps longer, until both current and emerging paradigms can be adopted by the user community. Flynn’s taxonomy, including MIMD, is simply inadequate to this task, IMO.
Kenneth A. Lloyd
Watt Systems Technologies Inc.