The all-new Journal of Supercomputing Frontiers and Innovations has published a new paper entitled Toward Exascale Resilience – 2014 Update. Written by Franck Cappello, Al Geist, William Gropp, Sanjay Kale, Bill Kramer, and Marc Snir, the paper surveys what the community has learned in the past five years and summarizes the research problems still considered critical by the HPC community.
Hadoop configuration and management are very different from those of HPC clusters. Develop a method to easily deploy, start, stop, and manage a Hadoop cluster to avoid costly delays and configuration headaches. Hadoop clusters have more “moving software parts” than HPC clusters; any Hadoop installation should fit into an existing cluster provisioning and monitoring environment and not require administrators to build Hadoop systems from scratch. Learn about managing a Hadoop cluster from the insideHPC article series on Successful HPC Clusters.
In late 2010 and throughout 2011, however, we noticed a shift in the HPC market as new workloads such as digital media, various financial services applications, new life sciences applications, on-demand cloud computing services and analytics workloads made their way onto HPC servers. We are now seeing another new trend developing in the HPC space with the introduction of ultra-dense servers.
Make sure you use Cloud services that are designed for HPC applications, including high-bandwidth, low-latency networking, exclusive node use, and high-performance compute and storage capabilities for your application set. Develop a flexible and quick Cloud provisioning scheme that mirrors your local systems as much as possible and is integrated with the existing workload manager. Ideally, your existing cluster can be seamlessly extended into the Cloud and managed and monitored in the same way as local clusters. Read more from the insideHPC Guide to Managing HPC Clusters.
Everything from life sciences to the financial industry is relying on HPC clusters to perform complex and critical operations. Moving forward, there will be a lot more reliance on various HPC systems. So the all-important question arises: how do you select, deploy, and manage it all? Fortunately, IBM, Intel, and NCAR have teamed up to explain their view on best practices for selecting an HPC cluster, using the process behind building the NCAR Wyoming Supercomputing Center.
High performance technical computing continues to transform the capabilities of organizations across a range of industries—helping them to tackle unprecedented big data analysis, generate competitive business advantage, and expand the limits of science and medicine. To keep pushing those boundaries, organizations are continually seeking ways to get more out of their technical computing systems.
Heterogeneous hardware is now present in virtually all clusters. Make sure you can monitor all hardware on all installed clusters in a consistent fashion. With extra work and expertise, some open source tools can be customized for this task. There are few versatile and robust tools with a single comprehensive GUI or CLI that can consistently manage all popular HPC hardware and software. A monitoring solution should never interfere with HPC workloads.
“A successful HPC cluster requires administrators to provision, manage, and monitor an array of hardware and software components. Currently, there are many trends in HPC clustering that include software complexity, cluster growth and scalability, system heterogeneity, Cloud computing, as well as the introduction of Hadoop services. Without a cogent strategy to address these issues, system managers and administrators can expect less-than-ideal performance and utilization. There are many component tools and best practices to be found throughout the industry.”