With High Performance Computing (HPC) systems that comprise tens, hundreds, or even thousands of computing cores, users can increase application performance and accelerate their workflows, realizing dramatic productivity improvements.
This performance potential, however, often comes at the cost of complexity. By their very nature, supercomputers comprise a great many components, both hardware and software, that must be installed, configured, tuned, and monitored to maintain maximum efficiency.
In a recent report, IDC lists downtime and latency as two of the most important problems faced by data center managers.
While HPC has its roots in academia and government, where extreme performance was the primary goal, it has evolved to serve the needs of businesses through sophisticated monitoring, preemptive memory-error detection, and workload management capabilities. This evolution has enabled “production supercomputing,” in which resilience is sustained without sacrificing performance or job throughput.
Production supercomputing can be roughly defined as the convergence of HPC and high performance analytics. IT departments expect their HPC systems to run smoothly, with all major elements – hardware, software, and networking – fully integrated.