Today Cray unveiled its new Shasta supercomputing architecture for exascale.
Shasta is an entirely new design and is set to be the technology that underpins the next era of supercomputing, characterized by exascale performance capability, new data-centric workloads, and an explosion of processor architectures. With sweeping hardware and software innovations, Shasta incorporates next-generation Cray system software to enable modularity and extensibility, a new Cray-designed system interconnect, unparalleled flexibility in processing choice within a system, and a software environment that provides for seamless scalability.
The DOE announced today that NERSC has chosen a Cray “Shasta” supercomputer for its NERSC-9 system, named “Perlmutter,” planned for 2020. The contract, valued at $146 million, is one of the largest in Cray’s history and covers a Shasta system with Cray® ClusterStor storage.
Shasta’s hardware and software innovations tackle the bottlenecks, manageability, and job completion issues that emerge or are magnified as core counts grow, compute node architectures proliferate, and workflows expand to incorporate AI at scale. Shasta eliminates the distinction between clusters and supercomputers with a single new breakthrough supercomputing system architecture, enabling customers to choose the computational infrastructure that best fits their mission, without tradeoffs. With Shasta you can mix and match processor architectures (X86, Arm, GPUs) in the same system as well as system interconnects from Cray (Slingshot), Intel (Omni-Path) or Mellanox (InfiniBand).
With Shasta, Cray is also announcing Slingshot, a new high-speed, purpose-built supercomputing interconnect. Slingshot advances Cray’s industry leadership in scalable network performance and adds capabilities that broaden Cray’s market reach. The Cray-developed Slingshot interconnect will have up to 5x more bandwidth per node and is designed for data-centric computing. Slingshot will feature Ethernet compatibility, advanced adaptive routing, first-of-a-kind congestion control, and sophisticated quality-of-service capabilities. Support for both IP-routed and remote memory operations will broaden the range of applications beyond traditional modeling and simulation. Quality-of-service and novel congestion management features will limit the impact to critical workloads from system services, I/O traffic, and co-tenant workloads, to increase realized performance and limit performance variation. Reduction in the network diameter from five hops (in the current Cray® XC™ generation) to three will reduce latency and power while improving sustained bandwidth and reliability.
“We listened closely to our customers and dug into the future needs of AI and HPC applications as we designed Shasta,” said Steve Scott, senior vice president and CTO of Cray. “Customers wanted leading-edge, scalable performance, but with lots of flexibility and easy upgradeability over time. I’m happy to say we’ve nailed this with Shasta. The Shasta infrastructure accommodates a wide variety of processor and network options, allowing customers to run diverse workloads on a single system. And it’s got the headroom to accommodate increasingly power-hungry processors and accelerators coming in future years. The Slingshot network tightly binds the compute and storage resources in the system, with groundbreaking congestion control to isolate applications from other network traffic, and Ethernet compatibility for datacenter and storage integration. We’re immensely excited to bring this new network to market to help accelerate our customers’ discoveries.”
Shasta systems are expected to be commercially available in Q4 of 2019.