Seagate SSDs Boost Analytics on Comet Supercomputer


The San Diego Supercomputer Center is adding 800GB Seagate SAS SSDs to significantly boost the data analytics capability of its Comet supercomputer. To expand its node-local storage capacity for data-intensive workloads, device pairs will be added to all 72 compute nodes in one rack of Comet, alongside the existing SSDs. This will bring the flash storage in a single node to almost 2TB, with total rack capacity at more than 138TB.

“Comet is a Dell-integrated cluster using Intel’s Xeon Processor E5-2600 v3 family, with two processors per node and 12 cores per processor running at 2.5GHz. Each compute node has 128 GB of traditional DRAM and 320 GB of local flash memory. Since Comet is designed to optimize capacity for modest-scale jobs, each rack of 72 nodes (1,728 cores) has a full bisection InfiniBand FDR interconnect from Mellanox, with a 4:1 over-subscription across the racks. There are 27 racks of these compute nodes, totaling 1,944 nodes or 46,656 cores.”
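The capacity and core counts quoted above can be checked with a few lines of arithmetic. The sketch below is only a sanity check, not anything published by SDSC or Seagate. It assumes that a "device pair" means two 800GB drives per node, that the upgrade covers a single rack, and that decimal units apply (1TB = 1,000GB). The constant and function names are made up for illustration.

```python
# Figures quoted in the article.
EXISTING_FLASH_GB = 320    # local flash already in each compute node
NEW_SSD_GB = 800           # capacity of each added Seagate SAS SSD
NEW_SSDS_PER_NODE = 2      # assumption: a "device pair" is two drives per node
NODES_PER_RACK = 72
CORES_PER_NODE = 2 * 12    # two processors per node, 12 cores per processor
RACKS = 27


def node_flash_gb() -> int:
    """Flash capacity of one upgraded node, in GB."""
    return EXISTING_FLASH_GB + NEW_SSDS_PER_NODE * NEW_SSD_GB


def rack_flash_tb() -> float:
    """Flash capacity of the upgraded rack, in TB (assumes 1 TB = 1000 GB)."""
    return NODES_PER_RACK * node_flash_gb() / 1000


def cores_per_rack() -> int:
    return NODES_PER_RACK * CORES_PER_NODE


def total_nodes() -> int:
    return RACKS * NODES_PER_RACK


def total_cores() -> int:
    return total_nodes() * CORES_PER_NODE


if __name__ == "__main__":
    print(f"Flash per upgraded node: {node_flash_gb()} GB")
    print(f"Flash in upgraded rack:  {rack_flash_tb():.2f} TB")
    print(f"Cores per rack:          {cores_per_rack()}")
    print(f"Total nodes / cores:     {total_nodes()} / {total_cores()}")
```

Under these assumptions the upgraded node holds 1,920GB of flash ("almost 2TB") and the rack holds 138.24TB ("more than 138TB"), while the node and core totals match the 1,728, 1,944, and 46,656 figures in the quote.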

Installation of the Seagate drives began in October under a donation arrangement with SDSC and its Center for Large Scale Data Systems Research (CLDS), an industry-university collaboration that focuses on issues including “big data” system architectures and software, analytics, and performance. Its goal is to understand the full value that can be extracted from the voluminous amounts of data now becoming available to organizations. User access to the drives will begin before January 2016.

Under the partnership, Seagate and SDSC/CLDS are deploying a lightweight framework for extracting metrics suitable for the analysis of data I/O patterns and overall drive performance. These results and other metrics will be used to further develop best practices and reference HPC architectures, while leading to more precise analyses of SSD performance in operational HPC workloads.

“The addition of the Seagate solid state drives on Comet continues the work we pioneered in the area of system versatility around flash-based SSDs on our Gordon supercomputer,” said SDSC Director Michael Norman, also principal investigator for the Comet project. “It complements Comet’s other dimensions, such as its fast Lustre parallel file storage systems and large memory nodes. With such a wide range of workflows in both traditional and emerging science domains such as genomics, the greater research community will benefit from these heterogeneous but integrated capabilities.”

The new drives will also extend the abilities of Comet’s upcoming virtualized HPC clusters. “Currently, some virtual machines on Comet can have large disk images that take advantage of the fast local storage on the compute nodes hosting them,” said Rick Wagner, SDSC’s manager of HPC systems. “Groups using virtual machines will be able to store more data inside of their virtual machines, closer to their custom application stacks.”

Comet is capable of an overall peak performance of almost two petaflops, or two million billion operations per second. It can run 10,000 research jobs simultaneously. Like the tail of a comet, SDSC’s newest HPC cluster is intended to serve what’s called the ‘long tail’ of science: the idea that the large number of modest-sized, computationally based research projects represents, in aggregate, a tremendous amount of research that can yield scientific discovery.
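The article does not say how the peak figure is derived, but a rough estimate from the numbers it does give lands in the same range. The sketch below multiplies the 46,656 cores by the 2.5GHz clock and by an assumed 16 double-precision floating-point operations per core per cycle. That last value comes from the two 256-bit fused multiply-add units in this processor generation and is not stated in the article. The estimate covers only the standard compute nodes described earlier and ignores any other node types, so treat it as an approximation.

```python
TOTAL_CORES = 46_656      # from the article: 1,944 nodes x 24 cores
CLOCK_HZ = 2.5e9          # from the article: 2.5GHz
# Assumption (not in the article): 2 FMA units x 4 doubles x 2 ops per cycle.
FLOPS_PER_CYCLE = 16


def peak_petaflops(cores: int = TOTAL_CORES,
                   clock_hz: float = CLOCK_HZ,
                   flops_per_cycle: int = FLOPS_PER_CYCLE) -> float:
    """Theoretical peak in petaflops (1 petaflop = 1e15 operations/second)."""
    return cores * clock_hz * flops_per_cycle / 1e15


if __name__ == "__main__":
    print(f"Estimated peak: {peak_petaflops():.2f} petaflops")
```

With these inputs the estimate comes to roughly 1.87 petaflops, consistent with the "almost two petaflops" quoted above.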

“The goal of our partnership with SDSC is to inform the wider HPC community via papers and workshops on how to select the most appropriate, high-performance components suitable to their architectures and workloads, while gaining insight into how Seagate SSDs are used in domains that are relying on advanced computation and storage, such as genomics and the social sciences,” said Tony Afshary, director of ecosystem solutions and marketing for flash at Seagate.

The result of a National Science Foundation grant worth nearly $24 million including hardware and operating funds, Comet is available for use by U.S. academic researchers through the NSF’s eXtreme Science and Engineering Discovery Environment (XSEDE) program, a national collection of advanced, integrated digital resources and services.

SDSC and Seagate will be discussing this collaboration in booth #823 at SC15 in Austin.
