Numascale Teams with Supermicro & AMD for Large Shared Memory System


Last week at SC14, Numascale announced the successful installation of a large shared memory Numascale/Supermicro/AMD system at a customer datacenter facility in North America. The system is the first part of a large cloud computing facility for analytics and simulation of sensor data combined with historical data.

The Numascale system, installed over the last two weeks, consists of 108 Supermicro 1U servers connected in a 3D torus with NumaConnect, using three cabinets with 36 servers apiece in a 6x6x3 topology. Each server has 48 cores in three AMD Opteron 6386 CPUs and 192 GBytes memory, providing a single system image and 20.7 TBytes to all 5184 cores. The system was designed to meet user demand for “very large memory” hardware solutions running a standard single image Linux OS on commodity x86 based servers.
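The totals quoted above follow directly from the per-server figures. As a quick check, here is a minimal Python sketch; the variable names are illustrative, and it assumes decimal units (1 TByte = 1000 GBytes), which matches the 20.7 TBytes figure.

```python
# Figures taken from the article; names are illustrative only.
TORUS_DIMS = (6, 6, 3)        # 6x6x3 topology
CORES_PER_SERVER = 48         # three 16-core CPUs per server
MEMORY_GB_PER_SERVER = 192

# Number of servers is the product of the torus dimensions.
servers = 1
for size in TORUS_DIMS:
    servers *= size

total_cores = servers * CORES_PER_SERVER
total_memory_gb = servers * MEMORY_GB_PER_SERVER
total_memory_tb = total_memory_gb / 1000  # assumes decimal TBytes

print(servers, total_cores, total_memory_gb, total_memory_tb)
```

The product 6 x 6 x 3 gives the 108 servers, and 108 servers at 48 cores and 192 GBytes each give the 5184 cores and roughly 20.7 TBytes cited.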

“Supermicro’s 1U 4-way servers (AS-1042G-LTF) with HyperTransport (HTX) connectivity provide users with flexible computing resources, including capabilities for handling very large data sets such as those found in applications for handling near real-time sensor inputs, combined with massive amounts of historical data,” said Tau Leng, Vice President of HPC at Supermicro. “Our 4-way systems can be used as a single system or partitioned into smaller systems where each partition runs one instance of the OS. With proper NUMA-awareness, applications with high bandwidth requirements will be able to utilize the combined bandwidth of all of the memory controllers and still be able to share data with low latency access through the coherent shared memory.”

Systems equipped with NumaConnect provide high performance shared memory programming capabilities with the same cost structure as a cluster. NumaConnect also eliminates the difficulty of MPI coding for big data problems and can increase programmer productivity. This alternative represents a compelling solution for scientists who currently work with their shared memory codes on x86 desktops and laptops. These users can now scale up their data sets without any extra effort within a familiar, standard Linux OS environment.

“AMD is proud to have been selected to participate in this exciting project,” said Suresh Gopalakrishnan, General Manager and Corporate Vice President, Server Business Unit, AMD. “This collaboration with Numascale and Supermicro brings together the right technologies to solve the challenging problem of large shared memory implementation.”

Einar Rustad, CTO of Numascale, agrees. “The single memory image cluster provides both shared memory — including threads and OpenMP — and MPI programming options,” he said. “The scalable system takes advantage of low cost commodity x86 hardware to offer significant savings when compared to conventional shared memory systems. With NumaConnect, system administration is identical to that of a single server because there are no separate node images to maintain and distribute. We are very proud to have been part of the project that enabled us to build this groundbreaking system in close cooperation with Supermicro and AMD.”

NumaConnect works with AMD Opteron-based servers and provides up to 256 TBytes of system-wide shared memory using cache coherency logic with a directory-based protocol that scales to 4096 nodes. The cache coherency logic is implemented in an ASIC together with interconnect fabric circuitry with routing tables for multi-dimensional torus topologies. This type of fabric scales well, and the same topology is used in the world’s largest supercomputers.
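For readers unfamiliar with torus topologies, the sketch below shows the general idea of wrap-around links in a 3D torus such as the 6x6x3 layout of this installation. It is a generic illustration only: the function name and the coordinate scheme are assumptions, and it does not describe how NumaConnect actually builds its routing tables.

```python
def torus_neighbors(coord, dims):
    """Return the distinct direct neighbours of `coord` in a torus of size `dims`.

    Each axis wraps around, so the node at index 0 is linked to the node
    at index size - 1 on the same axis.
    """
    neighbors = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            nxt = list(coord)
            nxt[axis] = (coord[axis] + step) % size  # wrap-around link
            neighbors.add(tuple(nxt))
    neighbors.discard(tuple(coord))  # only relevant if an axis has size 1
    return sorted(neighbors)


# Hypothetical corner node in a 6x6x3 torus.
print(torus_neighbors((0, 0, 0), (6, 6, 3)))
```

Because every axis wraps around, even a node at the "corner" of the grid has six direct neighbours, so no node sits at an edge of the fabric.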
