A Simpler Path to Reliable, Productive HPC

This sponsored post outlines how Intel products, including Intel HPC Orchestrator, can simplify some of the complexities and challenges that arise in high performance computing environments.

HPC is becoming a competitive requirement as high performance data analysis (HPDA) joins multi-physics simulation as table stakes for successful innovation across a growing range of industries and research disciplines. Yet complexity remains a very real hurdle for both new and experienced HPC users.

An HPC cluster is inherently more complex than a single server, and subtle problems, such as a wrong fabric configuration or an incompatible software version, can lead to suboptimal performance or downtime. Identifying such problems can take hours or, in some cases, days. New application requirements bring further challenges. Special tools and strategies are needed to avoid the kinds of bottlenecks that throttle application performance in clustered environments.

Intel HPC Orchestrator simplifies the design, deployment, and use of HPC clusters, and includes optimized development and runtime support for AI applications.

Solving the Complexity Challenge with Intel HPC Orchestrator

Intel HPC Orchestrator offers comprehensive solutions to these common problems, while simplifying the entire process of deploying, using, and maintaining an HPC cluster. This complete system software stack is based on OpenHPC, which is managed by the Linux Foundation. It includes provisioning tools, resource management, I/O clients, development tools, and a variety of scientific libraries. Intel adds a number of proprietary software components, additional testing, and validated support for clusters with up to 2,000 nodes.

Using Intel HPC Orchestrator, organizations have found they can get a small to mid-size cluster up and running in a matter of hours instead of days or weeks, and that’s just the beginning. Two capabilities offer particularly high value for organizations that are looking to get higher performance and more productivity from their HPC cluster.

Checkups with Intel Cluster Checker

Intel Cluster Checker combines one of the most comprehensive cluster health checking tools available today with a rules-based engine that provides expert recommendations. Issues that might have taken hours or days to identify can potentially be fixed in a matter of minutes.

Intel Cluster Checker is based on more than 10 years of research and development. The current version is the third generation of production code, and Intel continues to enhance this tool to better serve the needs of new HPC users, seasoned experts, system vendors, and application developers. A typical health check assesses more than 100 characteristics at both the node and cluster levels, and built-in benchmarks provide high-level performance verification. A cluster can also be validated against a baseline state or a specification to avoid the kinds of subtle configuration drift that so often compromise performance and uptime in HPC environments.
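To make the idea of validating against a baseline concrete, here is a minimal Python sketch of a drift check. It is not how Intel Cluster Checker works internally and does not use its interface; the function name `find_drift`, the characteristic names, and all values are hypothetical and chosen only for illustration.

```python
# Hypothetical baseline: the expected value of each node characteristic.
# Keys and values are illustrative, not taken from any real cluster.
BASELINE = {
    "kernel": "kernel-A",
    "fabric_mtu": "4096",
    "mpi_version": "mpi-1.0",
}


def find_drift(baseline, nodes):
    """Return {node: [(key, expected, actual), ...]} for nodes that differ."""
    drift = {}
    for node, config in sorted(nodes.items()):
        # A missing key is reported with an actual value of None.
        diffs = [
            (key, expected, config.get(key))
            for key, expected in sorted(baseline.items())
            if config.get(key) != expected
        ]
        if diffs:
            drift[node] = diffs
    return drift


# Two hypothetical nodes: one matches the baseline, one has a different MTU.
nodes = {
    "node01": dict(BASELINE),
    "node02": {**BASELINE, "fabric_mtu": "2048"},
}

report = find_drift(BASELINE, nodes)
for node, diffs in report.items():
    for key, expected, actual in diffs:
        print(f"{node}: {key} expected {expected!r}, found {actual!r}")
```

In this sketch only `node02` is reported, because its `fabric_mtu` value differs from the baseline. A production health checker would collect these characteristics from the nodes themselves and cover many more of them.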

Faster Development of Fast Applications

Most applications today cannot take full advantage of all the processor cores in a single workstation or server, much less in a multi-node HPC cluster. Intel HPC Orchestrator provides a complete development environment for HPC applications, including a comprehensive set of performance tools that have helped many developers boost application performance by more than a hundredfold.

Developers can use the integrated tools to analyze their code and identify hotspots, bottlenecks, and inefficiencies across virtually all layers of the solution stack, from hardware utilization to MPI communications.

  • Lightweight application profiling tools, such as PAPI, mpiP, and TAU (Tuning and Analysis Utilities), provide rich information about hardware, software, and their runtime interactions.
  • Expert tools, such as Intel Advisor, Intel Trace Analyzer and Collector, and Intel VTune Amplifier, provide clear visualizations and recommendations for identifying and resolving problems in threading, vectorization, memory structures, and much more.
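For readers new to profiling, the short Python sketch below shows what finding a hotspot looks like in practice. It uses Python's built-in `cProfile` and `pstats` modules as a stand-in and does not represent any of the tools listed above; the functions `slow_sum`, `fast_sum`, and `workload` are hypothetical examples.

```python
import cProfile
import pstats


def slow_sum(n):
    """Sum of squares 0..n-1 computed with an explicit loop (the hotspot)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum(n):
    """Same result from the closed-form formula (n-1)n(2n-1)/6."""
    return (n - 1) * n * (2 * n - 1) // 6


def workload():
    slow_sum(200_000)
    fast_sum(200_000)


# Collect timing data for every function called inside workload().
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Print the five most expensive entries, sorted by cumulative time.
stats = pstats.Stats(profiler)
stats.sort_stats("cumulative").print_stats(5)
```

The report should show `slow_sum` accounting for nearly all of the time spent in `workload`, which marks it as the place to optimize first. Cluster-level tools apply the same principle across threads, vector units, memory, and MPI communication.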

Simpler Deployment, Enduring Value

Intel Cluster Checker is integrated with Intel HPC Orchestrator at no additional cost. The Intel performance tools are also integrated with Intel HPC Orchestrator and are provided with a 90-day evaluation license.

This comprehensive software stack can help HPC users and vendors design, build, and deploy clusters faster, and keep them running better and more reliably. Just as importantly, it can help them optimize their applications so they can take full advantage of the performance and scale of their HPC cluster.

It’s a good way to improve productivity in almost any HPC environment.

Find more information on Intel HPC Orchestrator or Intel Cluster Checker.