Overcome Form Factor and Field Limitations with AI/HPC Workloads on the Edge

Delivering AI and HPC workloads at the edge has historically been a challenge. Form factor, latency, and power constraints all limit what can run there. For this discussion, the edge means any compute workload that runs outside both cloud and traditional on-prem data centers.

Recently, however, key technology advances have made higher performance possible at the edge. Powerful technologies, including NVIDIA GPUs, InfiniBand, and Ethernet, can deliver the performance required for AI and HPC at the edge while still allowing strong ROI.

In addition, 5G networking will both drive the explosion of new connected devices that generate more valuable real-time data and improve the abilities of AI deep learning (DL) systems. Accomplishing DL in the field without latency or bandwidth issues can become a reality with the advancement of 5G.

Reference Architectures to Achieve AI/HPC on the Edge

Strong design teams can create reference architectures that consider both the accelerated computing needs and edge form factors involved in these types of workloads. A reference architecture is a starting point for customized design, with critical elements already thought through and tested.

Each reference architecture addresses engineering, testing, and optimization for the power, latency, and related concerns of resource-hungry applications. It also accounts for space, size, ruggedization, and other issues unique to in-field deployments.

Here are several examples of reference architecture designs that allow you to optimize infrastructure to accommodate high-performance edge workloads.

Example 1: Hyperconverged Infrastructure (HCI) on the Edge

This first cluster is a small-footprint, air-gapped cloud environment-in-a-box designed for performance, reliability, portability, and security at the edge. This solution provides a high-performance computing environment to support critical operations at the edge, such as secure development environments, remote location computing, and more.

The design is a small rack unit configuration (about 6U) to ensure the solution is portable. The cluster makes expansion simple, allowing you to add compute and storage resources and even scale up to a full rack if you choose. It also allows pooled resilient storage and features a configurable network speed to meet your needs.

Benefits:

Power (single Intel CPU per node)
Density (8TB per node)
Fast networking (10 Gbps)
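
To make the density figure concrete, here is a rough sketch of how pooled capacity grows as nodes are added. Only the 8TB-per-node figure comes from the list above. The node counts and the `copies` parameter (how many copies of each block the resilient pool keeps) are hypothetical, since the article does not state the resiliency scheme.

```python
def pooled_capacity_tb(nodes: int, tb_per_node: float = 8.0, copies: int = 1) -> float:
    """Estimate usable pooled capacity in TB.

    tb_per_node defaults to the 8TB-per-node figure from the article.
    copies is a hypothetical resiliency setting: the pool is assumed to
    store that many full copies of every block.
    """
    if nodes < 1 or copies < 1:
        raise ValueError("nodes and copies must both be at least 1")
    return nodes * tb_per_node / copies


# Hypothetical cluster sizes, for illustration only.
for node_count in (3, 6, 12):
    raw = pooled_capacity_tb(node_count)
    mirrored = pooled_capacity_tb(node_count, copies=2)
    print(f"{node_count} nodes: {raw:.0f} TB raw, {mirrored:.0f} TB with 2 copies")
```

Real usable capacity depends on the storage software and its overhead, so treat these numbers as an upper bound for planning.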

This solution supports numerous users in remote, air-gapped areas, offering all the performance they need to deploy workloads at the edge without the latency of a round trip to the cloud. It’s ideal for operating environments that simply can’t be connected to the cloud, and it dramatically simplifies development and management without sacrificing performance or security.

Ideal Use Cases

Edge cloud
Geospatial Intelligence (GEOINT)
Edge AI/Inference (object detection, image recognition, image processing)
Geographic information systems (GIS)

Example 2: Edge Appliance

An edge appliance allows you to cost-effectively access the storage and compute power you need via a local resource ruggedized for the edge, capable of running complex operations in harsh, non-data center environments. This appliance uses standard hardware but provides a complete software solution stack already bundled together to meet the needs of the deployment.

Key Features:

Turnkey and preconfigured for storage-optimized, compute-and-storage-optimized, or GPU-compute-optimized workloads
Ruggedized to MIL-SPEC for edge
Operates on limited power and across a broad temperature range, and resists dust and moisture
Suited for edge environments that may be air-gapped for security and/or due to lack of infrastructure
Protects equipment from environmental hazards
Fast networking with 2x NVIDIA 100 Gbps Ethernet
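
As a back-of-the-envelope illustration of what two 100 Gbps links mean for moving data on and off the appliance, the sketch below estimates transfer time. It assumes both links can carry a single transfer in parallel, which the article does not confirm, and the `efficiency` factor for protocol overhead is a hypothetical parameter.

```python
def transfer_seconds(data_gigabytes: float, link_gbps: float = 100.0,
                     links: int = 2, efficiency: float = 1.0) -> float:
    """Estimate seconds needed to move data_gigabytes (decimal GB).

    link_gbps and links default to the 2x 100 Gbps figure from the article.
    efficiency is a hypothetical fraction of line rate actually achieved.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    gigabits = data_gigabytes * 8  # 1 byte = 8 bits
    usable_gbps = link_gbps * links * efficiency
    return gigabits / usable_gbps


# 1 TB (1000 GB) at full line rate, then at an assumed 80% efficiency.
print(transfer_seconds(1000))
print(transfer_seconds(1000, efficiency=0.8))
```

Storage throughput, not the network, is often the real bottleneck, so this is a lower bound on transfer time rather than a prediction.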

The key differentiator for these edge appliances is the ruggedized chassis, which is built to MIL-SPEC and able to run in difficult operating environments.

Ideal Use Cases

Edge cloud
Geospatial Intelligence (GEOINT)
Edge AI/Inference (object detection, image recognition, image processing)
Geographic information systems (GIS)

Example 3: Deep Learning (DL) on the Edge

This cluster can process AI inference workloads at the edge in real time while protecting equipment from environmental hazards. It operates on limited power, in small footprints, in broad temperature ranges, and it resists dust and moisture.

Key Features:

Pre-configured for edge AI and deep learning (DL) inference workloads
More cost-effective than public cloud options, without vendor lock-in
Ruggedization or MIL-SPEC can be matched to your requirements
Supports NVIDIA A100 GPU for optimal performance
Operable in limited power environments

Ideal Use Cases

Edge inference
Computer vision
Object detection
Autonomous sentry

Example 4: Composable Infrastructure at the Edge

This cluster is purpose-built to support cloud and accelerated workloads at the edge without the need for virtualization. This gives you bare-metal performance for HPC, AI, and ML workloads at the edge.

It’s also very flexible because composable disaggregated infrastructure (CDI) allows you to reconfigure systems as needs dictate. Resources such as GPUs, FPGAs, and NVMe storage are connected over PCIe, so you can scale each element independently. Because the hardware itself is recomposed, your deployment can be reconfigured for each workload without losing performance. If you’re in the field and need to deploy several different workloads, CDI allows you to dynamically reconfigure the hardware so that each workload runs on a configuration optimized for it.
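
The idea of recomposing hardware per workload can be sketched as a simple pool that lends devices out and takes them back. This is a conceptual model only, not the interface of any real CDI product: the `ResourcePool` class, its methods, the device counts, and the workload names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Hypothetical pool of PCIe-attached devices, counted by device type."""
    free: dict[str, int]
    # Devices currently attached to each composed system, keyed by workload name.
    composed: dict[str, dict[str, int]] = field(default_factory=dict)

    def compose(self, workload: str, request: dict[str, int]) -> dict[str, int]:
        """Attach the requested devices to a workload, or raise if the pool is short."""
        if workload in self.composed:
            raise ValueError(f"{workload} is already composed")
        # Check the whole request first so a failure leaves the pool unchanged.
        for kind, count in request.items():
            if self.free.get(kind, 0) < count:
                raise ValueError(f"not enough free {kind} for {workload}")
        for kind, count in request.items():
            self.free[kind] -= count
        self.composed[workload] = dict(request)
        return self.composed[workload]

    def release(self, workload: str) -> None:
        """Return a workload's devices to the pool so they can be reused."""
        for kind, count in self.composed.pop(workload).items():
            self.free[kind] += count


# Illustrative use: run inference, then reuse the same pool for sensor processing.
pool = ResourcePool(free={"gpu": 4, "fpga": 2, "nvme": 8})
pool.compose("inference", {"gpu": 4, "nvme": 2})
pool.release("inference")
pool.compose("sensor-processing", {"fpga": 2, "nvme": 6})
```

The point of the sketch is that each device type is tracked and scaled independently, so one workload can take every GPU while the next takes none.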

Key Features:

Limited footprint
Prepared for environmental concerns (vibration, heat, dust, moisture) and power envelope limitations
CDI for flexible system configurations and bare-metal performance within physically limited environments such as the edge
Fast networking with NVIDIA 200 Gbps HDR InfiniBand

Ideal Use Cases

Edge (space, power, environmental)
Artificial Intelligence/Deep Learning/Inference
Weather modeling
Image processing
Processing sensor data
Geographic Information Systems (GIS)

Custom Engineered Edge AI/HPC Reference Architectures

Silicon Mechanics is an engineering firm providing custom, best-in-class solutions for HPC/AI, storage, and networking, based on open standards. The experts at Silicon Mechanics understand that implementing HPC and AI on the edge requires a strong understanding of both computing and form factor. That’s why we created a series of reference architectures for specific types of edge deployments and workloads.

Get a more comprehensive understanding of Silicon Mechanics edge reference architectures and what they can do for your organization at www.siliconmechanics.com/edge.
