
From Forty Days to Sixty-five Minutes without Blowing Your Budget Thanks to GigaIO FabreX

In this sponsored post, Alan Benjamin, President and CEO of GigaIO, discusses how the ability to attach a group of resources to one server, run the job(s), and then reallocate the same resources to other servers solves a growing problem: AI and HPC applications are changing at an incredible rate, driving the need for ever-faster GPUs and FPGAs to take advantage of new software updates and newly developed applications.

Inspur Launches 5 New AI Servers with NVIDIA A100 Tensor Core GPUs

Inspur released five new AI servers that fully support the new NVIDIA Ampere architecture. The new servers support up to 8 or 16 NVIDIA A100 Tensor Core GPUs, delivering AI computing performance of up to 40 PetaOPS as well as non-blocking GPU-to-GPU P2P bandwidth of up to 600 GB/s. “With this upgrade, Inspur offers the most comprehensive AI server portfolio in the industry, better tackling the computing challenges created by data surges and complex modeling. We expect that the upgrade will significantly boost AI technology innovation and applications.”
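As a rough sanity check on the figures above, both numbers line up with published A100 specifications, assuming (our reading, not stated in the article) that 600 GB/s refers to the A100's third-generation NVLink bandwidth and 40 PetaOPS refers to peak low-precision Tensor Core throughput aggregated over 16 GPUs:

```python
# Hedged back-of-the-envelope check of the Inspur figures.
# Assumptions: 600 GB/s = per-GPU NVLink 3.0 total bandwidth;
# 40 PetaOPS = 16 GPUs at ~2.5 PetaOPS each (A100 peak INT4
# Tensor Core throughput with structured sparsity).

NVLINK_LINKS_PER_A100 = 12   # NVLink 3.0 links on an A100
GB_S_PER_LINK = 50           # bidirectional bandwidth per link

nvlink_bw = NVLINK_LINKS_PER_A100 * GB_S_PER_LINK
print(f"Per-GPU NVLink bandwidth: {nvlink_bw} GB/s")       # 600 GB/s

POPS_PER_GPU = 2.5           # assumed peak low-precision OPS per A100
aggregate = 16 * POPS_PER_GPU
print(f"16-GPU aggregate: {aggregate} PetaOPS")            # 40.0 PetaOPS
```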

NVIDIA EGX Platform Brings Real-Time AI to the Edge

NVIDIA announced two powerful products for its EGX Edge AI platform — the EGX A100 for larger commercial off-the-shelf servers and the tiny EGX Jetson Xavier NX for micro-edge servers — delivering high-performance, secure AI processing at the edge. “Large industries can now offer intelligent connected products and services like the phone industry has with the smartphone. NVIDIA’s EGX Edge AI platform transforms a standard server into a mini, cloud-native, secure, AI data center. With our AI application frameworks, companies can build AI services ranging from smart retail to robotic factories to automated call centers.”

Paperspace Joins NVIDIA DGX-Ready Software Program

AI cloud computing company Paperspace announced that its Gradient platform is certified under the new NVIDIA DGX-Ready Software program. The program offers proven solutions that complement NVIDIA DGX systems, including the new NVIDIA DGX A100, with certified software that supports the full lifecycle of AI model development. “We developed our NVIDIA DGX-Ready Software program to accelerate AI development in the enterprise,” said John Barco, senior director of DGX software product management at NVIDIA. “Paperspace has developed a unique CI/CD approach to building machine learning models that simplifies the process and takes advantage of the power of NVIDIA DGX systems.”

Video: Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze Research Breakthroughs

Nick Nystrom from the Pittsburgh Supercomputing Center gave this talk at the Stanford HPC Conference. “The Artificial Intelligence and Big Data group at Pittsburgh Supercomputing Center converges Artificial Intelligence and high performance computing capabilities, empowering research to grow beyond prevailing constraints. The Bridges supercomputer is a uniquely capable resource for empowering research by bringing together HPC, AI and Big Data.”

Perlmutter supercomputer to include more than 6,000 NVIDIA A100 GPUs

NERSC is among the early adopters of the new NVIDIA A100 Tensor Core GPU announced by NVIDIA this week. More than 6,000 of the A100 chips will be included in NERSC’s next-generation Perlmutter system, which is based on an HPE Cray Shasta supercomputer that will be deployed at Lawrence Berkeley National Laboratory later this year. “Nearly half of the workload running at NERSC is poised to take advantage of GPU acceleration, and NERSC, HPE, and NVIDIA have been working together over the last two years to help the scientific community prepare to leverage GPUs for a broad range of research workloads.”

NVIDIA A100 Tensor Core GPUs come to Oracle Cloud

Oracle is bringing the newly announced NVIDIA A100 Tensor Core GPU to its Oracle Gen 2 Cloud regions. “Oracle is enhancing what NVIDIA GPUs can do in the cloud,” said Vinay Kumar, vice president, product management, Oracle Cloud Infrastructure. “The combination of NVIDIA’s powerful GPU computing platform with Oracle’s bare metal compute infrastructure and low latency RDMA clustered network is extremely compelling for enterprises. Oracle Cloud Infrastructure’s high-performance file server solutions supply data to the A100 Tensor Core GPUs at unprecedented rates, enabling researchers to find cures for diseases faster and engineers to build safer cars.”

AMD Wins Slot in Latest NVIDIA A100 Machine Learning System

Today AMD demonstrated continued momentum in HPC with NVIDIA’s announcement that 2nd Generation AMD EPYC 7742 processors will power its new DGX A100 dedicated AI and machine learning system. AMD has an impressive set of HPC wins in the past year, and has been chosen by the DOE to power two pending exascale-class supercomputers, Frontier and El Capitan. “2nd Gen AMD EPYC processors are the first and only current x86-architecture server processors supporting PCIe 4.0, providing up to 128 lanes of I/O per processor for high performance computing and connections to other devices like GPUs.”
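To put the 128-lane figure in perspective, a quick estimate of the aggregate I/O bandwidth it implies, using PCIe 4.0 spec values (16 GT/s signaling with 128b/130b encoding) and ignoring protocol overhead beyond line encoding:

```python
# Rough aggregate PCIe 4.0 bandwidth for 128 lanes, per direction.
# Spec values only; real throughput is lower due to packet overhead.

GT_S_PER_LANE = 16        # PCIe 4.0 raw signaling rate, GT/s
ENCODING = 128 / 130      # 128b/130b line encoding efficiency
LANES = 128

gb_s_per_lane = GT_S_PER_LANE * ENCODING / 8   # GB/s per lane, per direction
total = gb_s_per_lane * LANES
print(f"~{gb_s_per_lane:.2f} GB/s per lane, ~{total:.0f} GB/s aggregate per direction")
```

At roughly 2 GB/s per lane, 128 lanes give on the order of 250 GB/s of aggregate host I/O per direction, which is what makes a single-socket EPYC attractive for feeding multiple GPUs.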

Atos Launches First Supercomputer Equipped with NVIDIA A100 GPU

Today Atos announced its new BullSequana X2415, the first supercomputer in Europe to integrate NVIDIA’s next-generation Ampere graphics processing unit architecture, the NVIDIA A100 Tensor Core GPU. This new supercomputer blade will deliver unprecedented computing power to boost application performance for HPC and AI workloads, tackling the challenges of the exascale era. The BullSequana X2415 blade will increase computing power by more than 2X and optimize energy consumption thanks to Atos’ patented, highly efficient water-cooled DLC (Direct Liquid Cooling) solution, which uses warm water to cool the machine.

Lenovo to deploy 17 Petaflop supercomputer at KIT in Germany

Today Lenovo announced a contract for a 17 petaflop supercomputer at Karlsruhe Institute of Technology (KIT) in Germany. Called HoreKa, the system will come online this fall and will be handed over to the scientific communities by summer 2021. The procurement contract is reportedly on the order of EUR 15 million. “The result is an innovative hybrid system with almost 60,000 next-generation Intel Xeon Scalable Processor cores and 220 terabytes of main memory as well as 740 NVIDIA A100 Tensor Core GPUs. A non-blocking NVIDIA Mellanox InfiniBand HDR network with 200 GBit/s per port is used for communication between the nodes. Two Spectrum Scale parallel file systems offer a total storage capacity of more than 15 petabytes.”
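The 17-petaflop figure is plausible as theoretical peak FP64 performance for the component counts quoted above. A hedged estimate (our assumptions, not the article's: A100 FP64 Tensor Core peak of 19.5 TFLOPS, and a generic AVX-512 CPU at 32 FP64 FLOP/cycle and ~1.5 GHz):

```python
# Back-of-the-envelope peak-performance check for HoreKa.
# All per-device figures are assumptions for this estimate.

A100_FP64_TENSOR_TFLOPS = 19.5   # A100 peak FP64 via Tensor Cores
GPUS = 740

gpu_peak_pflops = GPUS * A100_FP64_TENSOR_TFLOPS / 1000
print(f"GPU partition peak: ~{gpu_peak_pflops:.1f} PFLOPS")

# CPU partition: ~60,000 cores * 32 FP64 FLOP/cycle (AVX-512 FMA) * ~1.5 GHz
cpu_peak_pflops = 60_000 * 32 * 1.5e9 / 1e15
print(f"CPU partition peak (rough): ~{cpu_peak_pflops:.1f} PFLOPS")

print(f"Combined: ~{gpu_peak_pflops + cpu_peak_pflops:.1f} PFLOPS")
```

The GPU partition alone lands near 14.4 PFLOPS, and the CPU partition's few additional petaflops bring the combined estimate into the neighborhood of the quoted 17 petaflops.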