

Successfully Managing Workload Convergence in Your Data Center

This is the first in a series of features exploring new resource management solutions for workload convergence, such as Bright Cluster Manager by Bright Computing. The full report focuses on how scheduling solutions can help your business better manage workload convergence in your data centers.


Modern enterprises are being challenged to implement new technologies and produce results faster without increasing their IT budgets. Diverse workloads like big data, deep learning, and cloud computing add to this challenge. System administrators need processes and easy-to-use tools to optimize and manage the array of available storage resources, whether in the data center or in conjunction with a cloud computing environment.

A powerful scheduling and resource management solution, such as Bright Cluster Manager, can slot other workloads into clusters that would otherwise sit idle, extracting maximum value from the hardware and software investment and rewarding IT administrators with satisfied users.


Computing Support for Competitive Enterprises

Today’s enterprises are taking advantage of the teraflops of compute and terabytes of storage contained in a modern cluster to gain a competitive advantage in the global economy. Early adopters in the IT departments are also embracing artificial intelligence (AI) to up their data science game. By combining machine learning with big data, high-performance computing and an abundance of compute power, today’s enterprises can capably pursue projects that drive innovation.

Managing this complex computing and storage infrastructure and establishing workload convergence, however, can be a daunting task. IT departments must determine where each workload can run most cost-effectively while managing the entire computing footprint and delivering maximum benefit to the user. As projects expand, purchasing additional servers or adding a new cluster is rarely an option. Software can provision clusters for top performance, but it remains a challenge for today’s leading IT organizations to use the available computing, storage and networking equipment effectively, deliver the best possible experience to the end user and achieve the best results per dollar.


Modern Workloads and the Data Center

High-performance computing (HPC) applications are ravenous. They will, at times, consume all of the compute resources in a given system. Machine learning applications also require significant computing resources to power their algorithms. Domains such as manufacturing, energy exploration, life sciences and basic research can often scale their applications, either within an SMP server or across systems using communication protocols such as MPI, but that scaling is often limited to a few thousand cores. Even so, the benefits of getting results faster cannot be overlooked. As more applications have been re-coded to take advantage of GPUs, performance has increased significantly compared to traditional CPU-based implementations.


The performance of deep learning algorithms has increased greatly due to the expanded use of GPUs and co-processor accelerators. In most cases, the more data that a machine learning system can use, the better the results. Accelerators facilitate the creation of even faster and more scalable deep learning systems; more data is processed in less time, leading to new and innovative applications for a variety of domains.

Currently, computing power comes in three forms:

  1. Older servers
  2. Newer servers
  3. Servers with accelerators

While the primary goal of building out new IT infrastructure is increased capability and easier workload convergence, no IT project comes with an unlimited budget. The more the acquired systems are used for computation, whether running simulations or analyzing data, the higher the return on the capital invested in them. Since newer machine learning algorithms can take advantage of many of the same hardware profiles as HPC applications, IT departments can keep the servers busier by running machine learning applications on them.
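The idea of keeping servers busy by placing mixed workloads on them can be sketched in a few lines of Python. This is only an illustrative sketch and not how Bright Cluster Manager or any real scheduler works: the `Node` and `Job` classes, the node names, the core counts and the greedy first-fit policy are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical server; `kind` mirrors the three forms of computing power."""
    name: str
    kind: str  # "older", "newer", or "accelerated"
    free_cores: int
    jobs: list = field(default_factory=list)


@dataclass
class Job:
    """A hypothetical workload request."""
    name: str
    cores: int
    needs_accelerator: bool = False


def place_jobs(nodes, jobs):
    """Greedy first-fit: take the largest jobs first and put each on the
    first node that has enough free cores (and an accelerator if required).
    Returns the names of jobs that could not be placed."""
    unplaced = []
    for job in sorted(jobs, key=lambda j: j.cores, reverse=True):
        for node in nodes:
            if job.needs_accelerator and node.kind != "accelerated":
                continue
            if node.free_cores >= job.cores:
                node.free_cores -= job.cores
                node.jobs.append(job.name)
                break
        else:
            # No node could host this job.
            unplaced.append(job.name)
    return unplaced


def utilization(nodes, total_cores):
    """Fraction of all cores that are currently assigned to jobs."""
    free = sum(n.free_cores for n in nodes)
    return (total_cores - free) / total_cores


# Made-up cluster: one server of each kind, all idle at the start.
nodes = [
    Node("old-01", "older", 8),
    Node("new-01", "newer", 32),
    Node("gpu-01", "accelerated", 16),
]
total = sum(n.free_cores for n in nodes)

# Made-up mix of HPC, machine learning and data-processing jobs.
jobs = [
    Job("cfd-sim", 24),
    Job("dl-train", 16, needs_accelerator=True),
    Job("etl", 8),
    Job("big-sim", 40),
]

unplaced = place_jobs(nodes, jobs)
print("unplaced:", unplaced)
print("utilization: {:.0%}".format(utilization(nodes, total)))
```

In this toy run the machine learning job lands on the accelerated server while the simulation and data-processing jobs fill the CPU-only servers, and the one job that is larger than any single server stays queued. A production scheduler would also weigh priorities, memory, run time and multi-node placement, none of which is modeled here.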


Newer CPUs contain multiple independent cores that can be used by applications designed to spread their workloads across them. Modern CPUs may contain as many as 30 cores in a single socket. An application that is designed for parallel execution might see a near-linear speedup, as long as memory contention is well managed. In a traditional two-socket server, up to 60 cores are available for compute, which can satisfy many applications today. Although the cores can operate independently, contention for memory from up to 60 threads (assuming no hyperthreading) can slow down a large simulation.
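The effect of memory contention on speedup can be illustrated with a toy model. The formula below, `cores / (1 + contention * (cores - 1))`, is an assumption chosen for illustration rather than a measurement or anything taken from the report, and the `contention` coefficient is a hypothetical tuning value. With zero contention the model gives linear speedup; any positive value bends the curve downward as cores are added.

```python
CORES_PER_SOCKET = 30  # upper figure quoted in the text
SOCKETS = 2            # traditional two-socket server
total_cores = CORES_PER_SOCKET * SOCKETS


def modeled_speedup(cores, contention=0.0):
    """Toy model (an assumption, not measured data): each additional core
    adds a fixed fractional delay, `contention`, from competing for memory.
    contention=0.0 reproduces ideal linear speedup."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores / (1.0 + contention * (cores - 1))


for c in (0.0, 0.01, 0.05):
    print("contention={:.2f}: modeled speedup on {} cores = {:.1f}x".format(
        c, total_cores, modeled_speedup(total_cores, c)))
```

Even a small contention coefficient in this model leaves the 60-core result well short of 60x, which is the point the paragraph above makes: core count alone does not determine how fast a large simulation runs.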

Many applications in both HPC and machine learning have been developed to take advantage of the economical computing power of accelerators. These applications can use hundreds to thousands of accelerator cores simultaneously. Typically attached through a PCIe bus, accelerators are excellent for certain classes of application whose algorithms lend themselves to vectorization.

The insideHPC Special Report on scheduling solutions for easier workload convergence in data centers will also cover the following topics over the next few weeks:

  • Clusters for Faster Results

  • How Cloud Computing Delivers Flexibility and Speed

  • Scheduling to Optimize Infrastructure

  • Resource Management Across the Private/Public Divide

Download the full report, “insideHPC Special Report: Successfully Managing The Convergence of Workloads In Your Data Center,” courtesy of Bright Computing. 
