DARPA calls for petaflops-in-a-rack proposals

We talked about this back in June of last year, when DARPA issued an RFI for its Ubiquitous High Performance Computing (UHPC) program. That RFI was intended to gauge the community’s thinking on how we’d get to a system that was easy to program, computationally efficient, and delivered 50 GFLOPS/W. That’s a big jump, roughly 70-fold, considering that the #1 slot on the Green500 in November of 2009 weighed in at only 0.7 GFLOPS/W.

Here’s a link to the BAA [PDF] for details; here are some of the specs for the effort:

  • Cabinet: width < 24 inches; height < 78 inches; and depth < 40 inches
  • 50 GFLOPS/W LINPACK (HPL) benchmark
  • Peak Performance of 1 PFLOPS (HPL)
  • Maximum Cabinet Power of 57 kW including: UHPC System, storage system, fans, self contained cooling, high bandwidth I/O, etc.
  • Cooling: Self contained within cabinet. All approaches not requiring external resources are allowable.
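To put those numbers in perspective, here is a quick back-of-the-envelope sketch of the power arithmetic implied by the specs above. It only uses figures quoted in this post; the function names are mine, and it assumes the 50 GFLOPS/W target applies to the compute delivering the full 1 PFLOPS, which is my reading rather than something the spec list spells out.

```python
# Back-of-the-envelope power arithmetic for the UHPC targets.
# All constants are figures quoted in the post; the function names
# and the way the power budget is split are illustrative assumptions.

TARGET_EFFICIENCY_GFLOPS_PER_W = 50.0   # UHPC efficiency goal
TARGET_PERFORMANCE_PFLOPS = 1.0         # UHPC performance goal
MAX_CABINET_POWER_KW = 57.0             # whole cabinet, cooling included
GREEN500_NOV2009_GFLOPS_PER_W = 0.7     # #1 on the Green500, Nov 2009

GFLOPS_PER_PFLOPS = 1e6


def power_kw(performance_pflops, efficiency_gflops_per_w):
    """Power in kW needed to deliver a performance at a given efficiency."""
    watts = performance_pflops * GFLOPS_PER_PFLOPS / efficiency_gflops_per_w
    return watts / 1e3


def efficiency_gflops_per_w(performance_pflops, power_in_kw):
    """Efficiency in GFLOPS/W implied by a performance and a power draw."""
    return performance_pflops * GFLOPS_PER_PFLOPS / (power_in_kw * 1e3)


if __name__ == "__main__":
    at_target = power_kw(TARGET_PERFORMANCE_PFLOPS,
                         TARGET_EFFICIENCY_GFLOPS_PER_W)
    at_2009 = power_kw(TARGET_PERFORMANCE_PFLOPS,
                       GREEN500_NOV2009_GFLOPS_PER_W)
    whole_cabinet = efficiency_gflops_per_w(TARGET_PERFORMANCE_PFLOPS,
                                            MAX_CABINET_POWER_KW)

    print(f"1 PFLOPS at the 50 GFLOPS/W target: {at_target:.1f} kW")
    print(f"1 PFLOPS at Nov 2009 Green500 efficiency: {at_2009:.1f} kW")
    print(f"Left for storage, fans, cooling, I/O: "
          f"{MAX_CABINET_POWER_KW - at_target:.1f} kW")
    print(f"Efficiency if all 57 kW is counted: {whole_cabinet:.1f} GFLOPS/W")
```

Under that assumption the compute would draw about 20 kW of the 57 kW cabinet budget, whereas the same petaflops at late-2009 Green500 efficiency would need well over a megawatt.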

Many of the requirements address the usability of the system. For example:

The UHPC software effort spans operating systems; runtime systems for scheduling and lower level resource management; memory management; communication; performance monitoring; power management; self-aware operation; and prototype compilers. It is anticipated that a new system software stack will be developed for a UHPC System….

A significant problem is managing parallelism and locality. OS-related challenges include parallel scalability, spatial partitioning of OS and application functionality, direct hardware access for inter-processor communication, and fault isolation. There are additional challenges in runtime systems including scheduling, memory management, communication, performance monitoring, power management, and dependability. All of these must be solved by future ExtremeScale operating systems. The OS should be a self-aware system that “learns” to favorably respond to user goals and adapting to changing goals, resources, models, operating conditions, attacks and failures. Self-aware OS will take active measures to mitigate the effects of attacks and failures, closing exploited vulnerabilities.

They’ve also decided to release respondents from any requirement to support legacy languages (like C and FORTRAN), and to encourage the development of “revolutionary programming models.” And, unlike DARPA’s HPCS program, there is no requirement that the design have a market:

The final UHPC System Designs and supporting technologies developed under this program must have a demonstratable path to future products that could be used within a DoD mission scenario. There is no requirement that a UHPC System Design be based on current economically viable technologies or that the resulting prototype UHPC System become a product.

The document is remarkable in its breadth and scope, and I could keep excerpting for pages. It’s well worth a read. One thing I am concerned about, though, is the very low level of funding, described well by Timothy Prickett Morgan in his article on the BAA at The Register:

Interestingly, DARPA is not ponying up the hundreds of millions of dollars you might expect with the UHPC effort. In phase one and two, there are teams that will design UHPC systems and another set of teams that will design the benchmarks and data sets to test the machines. DARPA is allocating $3.25m for the first year of phase one and $5.25m for the second year for the developers; the UHPC testers get $1.75m per year. (Clearly, it is easier to come up with a test than come up with a system design.) In phase two, UHPC developers get $8.65m per year and testers get $2m per year.