Accelerated HPC for Energy Efficiency with AWS and NVIDIA


Over the last two years, global electricity demand reached the highest peak on record, increasing by 6 percent in 2021 and by 2.4 percent in 2022. According to the International Energy Agency (IEA), data center workloads account for almost two percent of global energy. 

As HPC simulations continue to grow in complexity and size, traditional compute infrastructure is challenged to provide increasingly large computational resources and energy to run HPC workloads. 

HPC scientists and engineers need to serve high-volume user requests on low-latency, high-throughput infrastructure that combines high network performance, fast storage, and large amounts of memory, while maintaining or reducing total energy consumption.

Convergence of Cloud, HPC, and AI/ML

HPC workloads are shifting, and a new category is emerging. As HPC users increasingly integrate artificial intelligence (AI) and machine learning (ML) technologies into their workloads, interest is growing in the methods and models associated with large language models (LLMs) and foundation models (FMs).

In a recent survey, Hyperion Research found that nearly 90 percent of HPC users surveyed are currently using or plan to use AI to enhance their HPC workloads. These enhancements can be implemented on multiple levels including hardware (processors, networking, data access), software (data management, queueing, developer tools), AI expertise (procurement strategy, maintenance, troubleshooting), and regulations (data provenance, data privacy, legal concerns).

As a result, the cloud, HPC, and AI/ML are converging through two simultaneous shifts. The first is toward workflows, ensembles, and broader integration; the second is toward tightly coupled, high-performance capabilities. The outcome is tightly integrated, massive-scale computing that accelerates innovation across industries, from automotive and financial services to healthcare, manufacturing, and beyond.

Increase energy efficiency and lower workload carbon footprints with AWS and NVIDIA

Amazon Elastic Compute Cloud (Amazon EC2) P5 instances, powered by NVIDIA H100 Tensor Core GPUs, improve energy efficiency by running more HPC simulations and AI/ML applications at scale using on-demand AWS compute infrastructure.

Drug discovery, protein target identification, structure discovery (cryoEM), seismic processing, molecular dynamics, computational fluid dynamics (CFD), risk modeling, and fraud detection are common use cases for HPC.

AWS continues to develop HPC-optimized instances. The latest GPU-powered Amazon EC2 P5 instances speed up solution development by up to six times compared with previous generations.

In addition, AWS and NVIDIA announced a strategic collaboration to offer new supercomputing infrastructure, software, and services to supercharge HPC, design and simulation workloads, and generative AI. This includes NVIDIA DGX Cloud coming to AWS and Amazon EC2 instances powered by NVIDIA GH200 Grace Hopper Superchip, H200, L40S and L4 GPUs.

Start now with AWS and NVIDIA

HPC solutions from AWS and NVIDIA provide organizations with accelerated computing infrastructure to gain insights faster, enhance energy efficiency, and run HPC and AI/ML at scale.

Learn more about how AWS and NVIDIA can help accelerate HPC workloads