Sponsored Post
By Braden Cooper, Product Marketing Manager at One Stop Systems
The explosion of AI applications in edge environments has changed the fundamental framework for storage in HPC. The question for an optimized HPC edge workflow is no longer how best to use the limited data available, but how to scale data storage and communication to match the massive capture and processing rates of modern FPGA and GPU technologies. As the volume of data captured at the edge grows, the limitations of cloud-based storage and networking quickly surface in the form of data security, throughput, and cost. Advances in NVMe storage density and throughput enable new physical point-to-point capture and transfer options that bypass the inherent weaknesses of cloud-based workflows.
The first step in the process, the information capture, relies on a storage infrastructure that does not bottleneck the flow of the data. For HPC-scale applications, this requires storage with sufficient capacity to meet the resolution and duration of capture without significant system-level downtime to swap storage media. The storage throughput must also keep pace with the rate of data ingest so that data flows into the workflow smoothly, without lapses when operated at capacity. The latest NVMe storage devices meet both the scalable capacity and high-speed throughput needs. With PCIe Gen 4.0 M.2 and E1.S drives available in capacities of 8TB or more, scale-out storage platforms meet the capacity needs of any application with speeds up to 8 GB/s per storage device.
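To make the capacity and throughput requirements concrete, the following is a minimal sizing sketch. It assumes decimal units (1 TB = 10^12 bytes) and takes the 8TB and 8 GB/s per-drive figures above as defaults. The function name `drives_needed` and the example ingest rate are hypothetical, and sustained write rates on real drives are often lower than peak figures.

```python
import math

GB = 10**9   # decimal gigabyte, in bytes (assumption)
TB = 10**12  # decimal terabyte, in bytes (assumption)


def drives_needed(ingest_gb_per_s, capture_hours,
                  drive_capacity_tb=8, drive_write_gb_per_s=8):
    """Return the minimum drive count that satisfies both constraints.

    Capacity: the drives must hold everything captured in one session.
    Throughput: the drives together must absorb the ingest rate.
    """
    total_bytes = ingest_gb_per_s * GB * capture_hours * 3600
    for_capacity = math.ceil(total_bytes / (drive_capacity_tb * TB))
    for_throughput = math.ceil(ingest_gb_per_s / drive_write_gb_per_s)
    return max(for_capacity, for_throughput)


# Hypothetical example: a 20 GB/s sensor feed captured for 2 hours
# produces 144 TB, so capacity (18 drives) dominates throughput (3 drives).
print(drives_needed(ingest_gb_per_s=20, capture_hours=2))
```

For long captures the capacity term usually sets the drive count, while for short, very fast bursts the throughput term does.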
After the data is captured, it must be transported to the processing node, which is often located in a different city, state, or country than the source. Cloud-based data communication can be useful for smaller or intermittent data uploads but loses efficiency at scale due to limits on Ethernet upload speeds and the cost of transmitting massive datasets. The cybersecurity restrictions and risks associated with cloud-based storage also limit remote uploads for many government and commercial applications. To mitigate these data transportation challenges, sneakernet workflows (physical transportation of data between locations) have resurged as an attractive solution. Small form factor, high-capacity storage devices built into scale-out, transportable nodes combine maximum capture speed with secure, rapid, and inexpensive transportation per byte of data.
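The trade-off between a network upload and physical transport can be estimated with simple arithmetic. The sketch below is illustrative only: the function names, the 10 Gbit/s link, the 80% link efficiency, and the 24-hour transit time are all assumptions chosen for the example, not measured figures.

```python
def network_transfer_hours(data_tb, link_gbit_per_s, efficiency=0.8):
    """Hours to upload data_tb terabytes over a link of the given speed.

    efficiency is an assumed fraction of line rate actually achieved.
    """
    bits = data_tb * 1e12 * 8
    return bits / (link_gbit_per_s * 1e9 * efficiency) / 3600


def sneakernet_effective_gbit_per_s(data_tb, transit_hours,
                                    load_unload_hours=0.0):
    """Effective bandwidth of physically shipping data_tb terabytes."""
    bits = data_tb * 1e12 * 8
    seconds = (transit_hours + load_unload_hours) * 3600
    return bits / seconds / 1e9


# Hypothetical example: 1 PB (1000 TB) of captured data.
upload_days = network_transfer_hours(1000, link_gbit_per_s=10) / 24
shipped_rate = sneakernet_effective_gbit_per_s(1000, transit_hours=24)
print(f"10 Gbit/s upload: about {upload_days:.1f} days")
print(f"24-hour shipment: about {shipped_rate:.0f} Gbit/s effective")
```

Under these assumptions the upload takes well over a week, while the shipment behaves like a link of roughly 90 Gbit/s, and the gap widens as the dataset grows because transit time stays roughly constant.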
OSS AI on the Fly® Storage takes advantage of the latest NVMe storage devices in products optimized for physical point-to-point transportable storage. For peak storage capacity and environmental ruggedization, OSS partners with TSecond, using their BRYCK product, which offers up to 1PB of NVMe storage in a lightweight, rugged form factor. Two BRYCK units can be installed into an OSS 4U Pro system, supporting up to 100 GB/s of sustained data throughput to up to 2PB of storage. The rugged enclosure has a toolless hinged lid, allowing rapid removal and insertion of additional BRYCKs to minimize downtime of the data capture or upload. For lower capacity applications, OSS offers the OSS PCIe 4.0 Dual (or Quad) M.2 Carrier, which supports up to four PCIe 4.0 NVMe M.2/E1.S drives. The M.2 Carrier board uses lightweight removable carriers to hot-swap storage between systems quickly and securely.
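The 2PB and 100 GB/s figures above imply how often media would need to be swapped during continuous capture at the maximum rate. A quick back-of-the-envelope check, assuming decimal units and a hypothetical helper named `fill_time_hours`:

```python
def fill_time_hours(capacity_pb, throughput_gb_per_s):
    """Hours of continuous writing before the given capacity is full."""
    seconds = capacity_pb * 1e15 / (throughput_gb_per_s * 1e9)
    return seconds / 3600


# Two 1PB units written at the stated 100 GB/s maximum.
print(f"{fill_time_hours(2, 100):.1f} hours")
```

At the full stated rate this works out to a little over five and a half hours between swaps; lower ingest rates extend the interval proportionally.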
At the current rate of data expansion, exabyte-scale storage and communication requirements are inevitable in the not-so-distant future. The most rapid and secure path for sending this scale of data from source to processing center is in locked transportation cases by truck and plane. OSS uses the latest in NVMe storage technologies and rugged environmental design to meet the needs of next-gen HPC storage applications.