Improving AI Inference Performance with GPU Acceleration in Aerospace and Defense


The aerospace/defense industry must solve mission-critical problems as they arise while also planning and designing for the rigors of future workloads. Advances in technology put the benefits of AI within reach of aerospace/defense agencies, but it's essential to understand those advances and the infrastructure requirements for AI training and inference.

The area of machine perception is exploding today, with deep learning and machine learning poised to impact mission-critical objectives for aerospace/defense agencies. Once you get past the first stage of training your neural network on all that data, the next step is to use it to accomplish useful, predictive tasks, such as recognizing images, RF signals, shipping patterns, and more.

Advantages of GPU-Accelerated Neural Net Training and AI Inference

Getting to and running the predictive, or inference, stage of a neural network can be time-consuming, and every bit of processing time saved delivers a better application experience. Running neural network training and inference on CPUs alone no longer provides the performance these workloads demand.

These AI workloads have ushered in new hardware standards that rely heavily on GPU acceleration. With NVIDIA's Triton Inference Server 2.3 and the Ampere architecture, introduced with the A100 GPU, it is now easier, faster, and more efficient to put GPU acceleration to work for both training and inference.
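
To illustrate the serving side, here is a minimal sketch of sending an inference request to a Triton server using its Python HTTP client (tritonclient). The model name and tensor names used here (image_classifier, INPUT__0, OUTPUT__0) are placeholders; substitute the names defined in your own model's config.pbtxt.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton server assumed to be listening on its default HTTP port.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a single-image request. The model and tensor names below are placeholders;
# use the names from your model's config.pbtxt.
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = httpclient.InferInput("INPUT__0", list(batch.shape), "FP32")
infer_input.set_data_from_numpy(batch)
requested_output = httpclient.InferRequestedOutput("OUTPUT__0")

# Triton schedules and batches the request onto the GPU-backed model instance.
result = client.infer(model_name="image_classifier",
                      inputs=[infer_input],
                      outputs=[requested_output])
scores = result.as_numpy("OUTPUT__0")
print("Top class:", int(scores.argmax()))
```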

When AI workloads are optimized for this hardware, they gain performance and scalability, along with a degree of flexibility not found in out-of-the-box solutions.

Tools to Handle Increased Throughput Demands

The aerospace/defense industry compiles mountains of data for research, data modeling, and neural network training. Ingesting and distributing all this data can present its own set of challenges. GPUs can consume and process data much faster than CPUs, but that very speed can strain I/O bandwidth, leading to higher latency and bandwidth bottlenecks.

Building a platform that can keep up with these increasing throughput demands requires several tools. Two of the most important are GPUDirect Storage and GPUDirect RDMA, both part of Magnum IO, a group of technologies developed by NVIDIA.

GPUDirect Storage essentially eliminates the memory bounce that occurs when data is read from storage, copied into system memory, and then copied again into GPU memory. It provides a direct path from local storage, or remote storage such as NVMe over Fabrics (NVMe-oF), straight into GPU memory, removing the extra reads and writes through system memory and reducing the strain on I/O bandwidth.
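
As a rough illustration of that direct path, the sketch below uses the RAPIDS kvikio library, which wraps NVIDIA's cuFile (GPUDirect Storage) API, to read a file straight into GPU memory. The file path and buffer size are arbitrary examples, and the code assumes a system where GPUDirect Storage is installed and enabled; without it, kvikio falls back to a conventional path through system memory.

```python
import cupy as cp
import kvikio

# Allocate the destination buffer directly in GPU memory.
gpu_buf = cp.empty(1_000_000, dtype=cp.float32)

# Read the file contents straight into GPU memory. With GPUDirect Storage
# enabled, this skips the bounce buffer in system memory entirely.
f = kvikio.CuFile("/data/sample.bin", "r")
nbytes = f.read(gpu_buf)
f.close()

print(f"Read {nbytes} bytes into GPU memory")
```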

GPUDirect RDMA provides direct communication between GPUs in remote systems. It bypasses the system CPUs and eliminates the buffer copies through system memory that would otherwise be required, which can deliver a significant performance improvement.
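
Here is a minimal sketch of what this enables in practice, assuming a CUDA-aware MPI build (for example, Open MPI with UCX) and mpi4py 3.1 or newer: device buffers (CuPy arrays here) are passed straight to MPI calls, so data can move GPU-to-GPU over the fabric without staging through host memory.

```python
# Run with: mpirun -np 2 python gpu_rdma_sketch.py
from mpi4py import MPI
import cupy as cp

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank allocates its buffer directly in GPU memory.
n = 1_000_000
buf = cp.arange(n, dtype=cp.float32) if rank == 0 else cp.empty(n, dtype=cp.float32)

# Make sure any pending GPU work on the buffer has finished before MPI touches it.
cp.cuda.get_current_stream().synchronize()

# With a CUDA-aware MPI, these calls accept device buffers; over InfiniBand with
# GPUDirect RDMA the transfer goes NIC-to-GPU without a copy through system memory.
if rank == 0:
    comm.Send(buf, dest=1, tag=0)
elif rank == 1:
    comm.Recv(buf, source=0, tag=0)
    print("Rank 1 received", float(buf[0]), "...", float(buf[-1]))
```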

Removing Roadblocks to AI

These kinds of innovations, along with high-speed network and storage connections such as InfiniBand HDR with throughput of up to 200Gb/s (and the soon-to-come InfiniBand NDR at 400Gb/s), enable a multitude of storage options, including NVMe-oF, RDMA over Converged Ethernet (RoCE), Weka fast storage, and almost anything else available today.

These technologies also remove the hurdles AI modeling encounters today, and Silicon Mechanics can help clear the remaining roadblocks so you can meet your GPU-acceleration goals.

To learn more about how the aerospace/defense industry can get the advantages of AI, watch our on-demand webinar.