Microsoft and NVIDIA Together Advance AI

[SPONSORED GUEST ARTICLE]  Think of the Microsoft Azure cloud platform as a general contractor that, technologically speaking, brings together the most skilled and knowledgeable artisans to offer the latest and most powerful AI capabilities. These include the latest NVIDIA AI hardware and software, such as the new NVIDIA Blackwell platform, announced at NVIDIA’s recent GTC 2024 conference, held March 18 – 21 in San Jose.

Throughout the GTC extravaganza, Microsoft and NVIDIA announced a series of innovations, from AI infrastructure to new platform integrations and industry breakthroughs, marking advances across a host of AI fronts.

Much of the news from Microsoft and NVIDIA’s collaboration centers around the new NVIDIA Blackwell Platform. Azure will be one of the first cloud providers to offer the NVIDIA GB200 Grace Blackwell Superchip, designed for large-scale generative AI workloads, data processing and high-performance workloads. The NVIDIA GB200 offers up to a massive 16 TB/s of memory bandwidth and up to an estimated 30 times faster real-time inference on trillion-parameter models than the previous-generation NVIDIA Hopper GPUs.

Microsoft engineers worked closely with NVIDIA to ensure NVIDIA GPUs, including the GB200, can handle the latest large language models (LLMs) trained on Azure AI infrastructure. These models require enormous amounts of data and compute to train and run, and the GB200 will enable Microsoft to help customers scale these resources to new levels of performance and accuracy.

The Microsoft and NVIDIA collaboration on AI extends beyond processing power. They are deploying an end-to-end AI compute fabric with the recently announced NVIDIA Quantum-X800 InfiniBand networking platform. By taking advantage of its in-network computing capabilities with SHARPv4, and its added support for advanced AI techniques, NVIDIA Quantum-X800 extends the GB200’s parallel computing tasks to massive GPU scale.
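
In practice, the collective operations such a fabric accelerates, for example the all-reduce that sums gradients across GPUs in data-parallel training, are issued from frameworks like PyTorch and offloaded transparently by the networking stack. The sketch below is purely illustrative of that pattern under assumed defaults (NCCL backend, a torchrun-style launcher); it does not configure SHARP or the Quantum-X800 fabric itself.

```python
# Illustrative sketch: the kind of GPU collective that in-network
# computing (SHARP-style reductions in the InfiniBand fabric) speeds up.
# Assumes a multi-GPU job launched with a tool such as torchrun.
import torch
import torch.distributed as dist

def allreduce_demo():
    # The NCCL backend uses the InfiniBand fabric when present; any
    # SHARP offload happens transparently below this API.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    # Each rank contributes a gradient-like tensor; all_reduce sums it
    # across every GPU in the job, the core step of data-parallel training.
    grad = torch.ones(1024 * 1024, device="cuda") * (rank + 1)
    dist.all_reduce(grad, op=dist.ReduceOp.SUM)

    print(f"rank {rank}: reduced value = {grad[0].item()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    allreduce_demo()
```

Launched with, for example, torchrun --nproc_per_node=8, each rank contributes its tensor and receives the job-wide sum.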

Combined, the new Azure instances based on the GB200 and Quantum-X800 InfiniBand are designed to accelerate the new generation of frontier and foundational models for generative AI workloads such as natural language processing, computer vision and speech recognition.

“Taking advantage of the powerful foundation of Azure AI infrastructure using the latest NVIDIA GPUs, Microsoft is infusing AI across every layer of the technology stack, helping customers drive new productivity gains,” said John Lee, Principal Lead for AI Platforms & Infrastructure at Microsoft Azure. “We have more than 53,000 Azure AI customers, and we’re providing them access to the most advanced foundation and open-source models, including both LLMs and small language models (SLMs), all integrated deeply with infrastructure, data and tools on Azure.” (For more on Azure AI, see this video.)

Lee expanded on these themes in this video interview with insideHPC at GTC.

Also during GTC, Microsoft announced Azure NC H100 v5 virtual machines (VMs), a platform for AI and HPC powered by NVIDIA H100 NVL Tensor Core GPUs. The VMs deliver substantial computational power, large high-performance GPU memory per VM and high memory bandwidth to accelerate AI inference and mid-range AI training workloads. Along with AI, the Azure NC H100 v5 is suited for demanding modeling and simulation workloads like computational fluid dynamics, molecular dynamics, quantum chemistry, weather forecasting and climate modeling, and financial analytics.

The NC H100 v5-series offers two classes of virtual machines, with one or two NVIDIA H100 94GB NVL Tensor Core GPUs per VM. The H100 NVL PCIe GPUs support NVLink v4, which provides 600 GB/s of bi-directional bandwidth between the GPUs, reducing the latency and overhead of data transfer for faster and more scalable AI and HPC applications.
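
For readers who want to check which of these sizes are offered in a given region, a minimal sketch using the Azure Python SDK is shown below. The subscription ID, region, and the "H100_v5" name filter are illustrative assumptions; consult the Azure documentation for the exact size names.

```python
# Sketch: list VM sizes in a region and filter for the NC H100 v5 family.
# Requires: pip install azure-identity azure-mgmt-compute
# The subscription ID, region, and "H100_v5" filter string are placeholders
# for illustration, not a definitive configuration.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

subscription_id = "<your-subscription-id>"  # placeholder
client = ComputeManagementClient(DefaultAzureCredential(), subscription_id)

for size in client.virtual_machine_sizes.list(location="eastus"):
    if "H100_v5" in size.name:
        print(size.name, size.number_of_cores, "vCPUs,",
              size.memory_in_mb // 1024, "GiB RAM")
```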

Microsoft is also expanding its collaboration with NVIDIA to accelerate healthcare and life sciences innovation with advanced cloud, AI and accelerated computing capabilities.

The partnership brings together the power of generative AI, the cloud and accelerated computing by combining Microsoft Azure with NVIDIA DGX Cloud and the NVIDIA Clara suite of computing platforms, software and services. The combination is designed to help healthcare and life sciences organizations accelerate clinical research and drug discovery, enhance medical image-based diagnostic technology, and increase access to precision medicine.

The collaboration will enable healthcare providers, pharmaceutical and biotechnology companies, and medical device developers to innovate across clinical research, drug discovery and care delivery with improved efficiency and effectiveness. (More details on this joint effort can be found here.)

In the chatbot arena, NVIDIA GPUs and NVIDIA Triton Inference Server will now help serve AI inference predictions in Microsoft Copilot for Microsoft 365, which will soon be accessible via a dedicated physical keyboard key on Windows 11 PCs. The solution combines the power of large language models with proprietary enterprise data to deliver real-time contextualized intelligence, enabling users to enhance their creativity, productivity and skills.
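
For context, Triton serves models over standard HTTP/gRPC endpoints that applications query with the official tritonclient library. The minimal sketch below shows only that request pattern; the server URL, model name, and tensor names are hypothetical placeholders, not the configuration behind Copilot.

```python
# Sketch: sending an inference request to NVIDIA Triton Inference Server
# with the official Python client (pip install "tritonclient[http]" numpy).
# The endpoint, model name, and tensor names below are hypothetical.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Hypothetical language model expecting a batch of token IDs.
tokens = np.random.randint(0, 32000, size=(1, 128), dtype=np.int64)
inp = httpclient.InferInput("input_ids", list(tokens.shape), "INT64")
inp.set_data_from_numpy(tokens)

result = client.infer(model_name="example_llm", inputs=[inp])
logits = result.as_numpy("logits")
print(logits.shape)
```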

In addition, NVIDIA NIM inference microservices are coming to Azure AI for AI deployments. Part of the NVIDIA AI Enterprise software platform, also available on the Azure Marketplace, NIM provides cloud-native microservices for optimized inference on more than two dozen foundation models, including NVIDIA-built models that users can access at ai.nvidia.com.
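
NIM endpoints are typically queried through an OpenAI-compatible API, so existing client code largely carries over. The sketch below assumes a locally deployed NIM with placeholder endpoint, key, and model name; a deployment on Azure AI would supply its own values.

```python
# Sketch: querying a deployed NIM microservice via its OpenAI-compatible API
# (pip install openai). Endpoint, key, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",   # placeholder NIM endpoint
    api_key="not-needed-for-local-nim",    # placeholder credential
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",       # example NIM model name
    messages=[{"role": "user",
               "content": "Summarize the NVIDIA Blackwell platform."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```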

Microsoft is also working with NVIDIA Omniverse Cloud APIs, which will power industrial digital twin software tools. The five new Omniverse Cloud APIs are designed to enable developers to integrate core Omniverse technologies directly into existing design and automation software applications for digital twins, or into their simulation workflows for testing and validating autonomous machines, such as robots or self-driving vehicles. (For more details, see this video.)

“Everything manufactured will have digital twins,” said Jensen Huang, founder and CEO of NVIDIA. “Omniverse is the operating system for building and operating physically realistic digital twins. Omniverse and generative AI are the foundational technologies to digitalize the $50 trillion heavy industries market.”

Learn more about AI solutions from Microsoft and NVIDIA: