Spend Less on HPC/AI Storage (and more on CPU/GPU compute)

Sponsored Post

According to Gartner, the best-run organizations constantly search for efficiencies in the Run category of the Run-Grow-Transform model so they can shift budget toward accelerating their digital transformation.

If you are looking to realize cost efficiencies or prevent a looming cost explosion, one opportunity to examine is new parallel storage designed to feed unstructured data to high-performance clusters of GPU- and/or CPU-powered rack servers from Hewlett Packard Enterprise.

The long list of typical enterprise workloads running on these high-performance clusters includes machine learning training, computer-aided engineering, crash test simulations, wind tunnel simulations, automated driving systems, digital twins, mechanical design, chemical engineering, electronic design automation, precision medicine, genome sequencing, drug discovery, cryogenic electron microscopy, fraud and anomaly detection, quantitative pre-trade analytics, algorithmic trading, cybersecurity, seismic exploration, reservoir simulation, and many more.

All these workloads have one thing in common: they must move massive data sets quickly during processing. For that, they all depend on low-latency, high-speed networks running at 200 Gbps and beyond, such as InfiniBand, 200 Gigabit Ethernet, or HPE Slingshot, to connect compute nodes with each other and with shared high-speed storage.
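
To put that bandwidth in perspective (a back-of-the-envelope illustration, not an HPE figure): a 200 Gbps link delivers at most 25 GB/s, so even at full line rate a 10 TB working set takes 10,000 GB ÷ 25 GB/s = 400 seconds, roughly seven minutes, to stream to a single node. Keeping dozens of such nodes fed simultaneously is what shared parallel storage has to sustain.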

The convergence of computational modeling and simulation (often referred to as classic high-performance computing, or classic HPC) and artificial intelligence (AI) is changing everything. The impact of this convergence is especially disruptive on the HPC storage layer, as the I/O profiles of both methods could not be more different.

The storage architectures that served us well in the past, when simulation and AI ran on separate infrastructure stacks, are breaking down, both architecturally and economically, in the new era.

The costs of staying with legacy storage architectures in the new era add up quickly:

  • Delayed new product introduction and development due to long job run times in business-critical R&D processes
  • Low asset utilization of expensive CPU/GPU compute infrastructure due to I/O bottlenecks that leave compute nodes idle, waiting for data (see the sketch after this list)
  • Regrettable attrition of top talent such as key engineers and data scientists due to frustration over job pipeline congestion that prevents them from working productively
  • Frequent, unbudgeted spending requests for ad hoc upgrades of the file storage infrastructure to cope with the architectural deficiencies, at the expense of CPU/GPU compute budgets
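
To make the idle-node point concrete, here is a minimal Python sketch (our illustration, not HPE's method) that times how much of a training loop is spent waiting on storage versus computing. The `slow_loader` and `train_step` names are hypothetical placeholders; in practice you would wrap your own data loader and per-batch training function.

```python
import time

def run_epoch(data_loader, train_step):
    """Measure how long compute sits idle waiting on storage.

    `data_loader` and `train_step` are hypothetical stand-ins for a
    real framework's loader and per-batch training function.
    """
    io_wait = compute = 0.0
    t0 = time.perf_counter()
    for batch in data_loader:           # blocks until storage delivers the batch
        t1 = time.perf_counter()
        io_wait += t1 - t0              # time spent waiting on I/O
        train_step(batch)               # actual CPU/GPU work
        t0 = time.perf_counter()
        compute += t0 - t1              # time spent computing
    total = io_wait + compute
    print(f"I/O wait: {io_wait:.2f}s ({100 * io_wait / total:.0f}% of epoch)")
    print(f"Compute:  {compute:.2f}s ({100 * compute / total:.0f}% of epoch)")

if __name__ == "__main__":
    # Toy demo: simulated slow storage (80 ms/batch) feeding fast compute (20 ms/batch).
    def slow_loader(n_batches=10, io_seconds=0.08):
        for _ in range(n_batches):
            time.sleep(io_seconds)      # simulated storage latency
            yield object()

    run_epoch(slow_loader(), train_step=lambda batch: time.sleep(0.02))
```

In this toy run the "compute" sits idle roughly 80% of the time; the same kind of instrumentation on a real cluster quickly reveals whether storage, not compute, is the asset holding utilization down.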

As the market share leader in HPC servers [1], Hewlett Packard Enterprise anticipated the convergence of classic modeling and simulation with AI methods such as machine learning and deep learning, and now offers a portfolio of parallel HPC/AI storage systems purpose-engineered to address all of the challenges above in a cost-effective way.

Interested in learning about the three approaches that can help you feed your CPU- and GPU-accelerated compute nodes without I/O bottlenecks while creating efficiencies in Gartner's Run category? Read the white paper!

[1] Hyperion Research, HPC Market Update During ISC21, June 2021