The third of the four task area 1 teams has issued its UHPC press release. Sandia has added its name to the list of teams that have told us at least a little about what they are planning (see the earlier news on NVIDIA and Intel). Sandia’s Richard Murphy is the PI.
To accomplish the mission, Sandia Labs is leading a team of industry partners, including Micron Technology, Inc. and LexisNexis Special Services, Inc. Academic partners include Louisiana State University, University of Illinois at Urbana-Champaign, University of Notre Dame, University of Southern California, University of Maryland, Georgia Institute of Technology, Stanford University and North Carolina State University.
The Sandia team, which includes two universities in my neighborhood (Georgia Tech and LSU, congrats guys!), released a picture of a notional design for a compute node. Although neither the picture nor the compute node is discussed in the body of the press release, I suppose it’s reasonable to assume that the X-caliber node will be a central part of the Sandia team’s efforts. From the press release (a hi-res image is available from the original press release, link above):
This is the notional design of an X-caliber compute node. This design features two processors (Ps) that handle high temporal locality processing. Each P is supported by eight stacked memory cubes (Ms) that also include embedded memory units (EMUs) that handle low temporal locality processing. Each node also includes a pair of network interface card/routers for inter-node communication.
The inclusion of LexisNexis is interesting. Based on my previous reporting on the company, I know that it has its own large-scale data processing hardware and software called the Data Analytics Supercomputer, or DAS (the hardware is all standard components, or was as of last year, but the configuration is specialized to process large data streams). From that previous article:
A DAS comprises some combination of data refinery and data delivery nodes. These nodes handle the processing of data queries and the presentation of results, and up to 500 of them can be connected together by a non-blocking switch (from Force10 Networks, again a standard commercial part) that allows the nodes participating in an operation to cooperate directly with one another. A DAS larger than 500 nodes can be assembled by linking together multiple 500-node sub-assemblies. The system runs a standard Linux kernel with non-essential services turned off to reduce OS jitter and improve performance.
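As a rough illustration of the scaling rule in that excerpt (the 500-node figure comes from the article; the helper function and ceiling-division assumption are mine, not anything LexisNexis publishes):

```python
import math

# Per the article: up to 500 refinery/delivery nodes hang off one
# non-blocking switch; larger systems link 500-node sub-assemblies.
NODES_PER_SUBASSEMBLY = 500

def subassemblies_needed(total_nodes: int) -> int:
    """Smallest number of 500-node sub-assemblies covering a DAS of this size."""
    return math.ceil(total_nodes / NODES_PER_SUBASSEMBLY)

print(subassemblies_needed(500))   # a single switch suffices -> 1
print(subassemblies_needed(1200))  # spills into a third sub-assembly -> 3
```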
How does it all perform? The company says that in 2008 one of its DAS systems was 14 percent faster than the then TeraSort champion, Hadoop, on a cluster that used less than half the hardware. Interestingly, LexisNexis also claims that its approach needed 93 percent less code than the Hadoop solution, and this is a big part of the system’s appeal.
LexisNexis has also developed its own declarative language for the system (called Enterprise Control Language, or ECL), and that experience may play a role in the project, since the UHPC tasks include specific mention of making the hardware easier to use:
LexisNexis productivity studies show that ECL is about five times more efficient than SQL for specifying the same tasks, and Simmons gave me an example of a specific data function that was coded in 590 lines of assembly, 90 lines of C, and just two lines of ECL.
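The assembly/C/ECL comparison above is really a claim about how much a declarative formulation compresses data-processing code. As a hedged analogue in Python (the task and both versions are my own invention for illustration, not the function Simmons described, and Python is standing in for ECL), here is the same word-count job written imperatively and then declaratively:

```python
from collections import Counter

words = ["alpha", "beta", "alpha", "gamma", "beta", "alpha"]

# Imperative version: spell out the loop and the bookkeeping yourself.
counts_imperative = {}
for w in words:
    if w not in counts_imperative:
        counts_imperative[w] = 0
    counts_imperative[w] += 1

# Declarative version: state the result you want, not the steps.
counts_declarative = Counter(words)

assert counts_imperative == dict(counts_declarative)
print(counts_declarative)  # Counter({'alpha': 3, 'beta': 2, 'gamma': 1})
```

The gap between a handful of declarative lines and the hand-rolled loop only widens as the task grows, which is the dynamic the 590/90/2 line counts are pointing at.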
IBM has been working in this area for a while with its System S data processing hardware and software, but that offering must not be as strong as what LexisNexis has developed for this particular application. We do know that Sandia has been test-driving the DAS for at least a couple of years, and that relationship and experience no doubt played a part in LexisNexis’s inclusion on the team.