Entries filed under “System Management”

News related to batch schedulers, installers, and operating systems

Bright Cluster Manager Speeds Chinese Climate Change Research Efforts

This week Bright Computing announced that Tsinghua University in China is using the company’s software to manage its Hadoop-based cluster for climate modeling. The university selected Bright Cluster Manager because it provided a powerful solution for deploying, testing, provisioning, monitoring and managing its cluster while minimizing staffing requirements.

“We needed a solution that would provide deep insights and better visibility into every aspect of our cluster. Bright’s highly intuitive interface gives us a complete view, including the ability to drill down to examine specific issues,” said Dr. Xue of the Center for Earth System Science at Tsinghua University. “In addition, Bright provides a multi-OS image that gives us more control over every aspect of our cluster’s operations. As a result, our researchers can develop benchmark software more quickly because we minimize downtime for maintenance and troubleshooting performance issues.” Paratera, a leading professional HPC software and services provider in China, led the cluster project.

The Center for Earth System Science was established in 2009 to develop an earth system science discipline with a focus on global change issues. The center currently concentrates on four broad academic fields: earth system science, earth system modeling, earth observation technology, and global change economics. Read the Full Story.

Also posted in HPC, HPC Software

Webinar Preview: Talking to your CFO about HPC – Measuring Profitability & ROI

As members of the HPC community, we often find ourselves needing to communicate the value of high performance computing to management teams who don’t quite “get it.” How we can do that more effectively is the subject of an upcoming insideHPC webinar sponsored by X-ISS, a leader in HPC implementation services.

Webinar Title: How to Talk to your CFO about HPC: Measuring Profitability and ROI
Date: January 22, 2013
Time: 8:30am PST

Panelists:

  • Merle Giles, NCSA
  • Sharan Kilwani, Industry Expert
  • Ramesh Krishnan, ATK Aerospace Systems


To set the stage for this topic, I caught up with Deepak Khosla, founder and CEO of X-ISS.

insideHPC: It seems like the HPC community doesn’t always speak the same language as enterprise executives, who seem to be focused on cost and ROI. With all the broad cloud adoption going on, why do you think that cloud is gaining traction in the business world while HPC seems to remain flat?

Deepak Khosla: What we see in HPC is that everything is about Performance. And while we are seeing growth in both the HPC market and the cloud space, in many cases the cloud is a better fit for the Enterprise, which is where we are seeing more growth. There are more issues limiting HPC growth in the cloud, such as:

  • Security – This is a common issue for both HPC and Enterprise, and uncertainty about data security limits growth in the cloud.
  • Big Data Sets – HPC often involves very large data sets, and because of bandwidth and latency limits in the cloud, accessing that data with acceptable performance becomes an issue.
  • Proximity – With most cloud implementations, there isn’t a guarantee of how ‘close’ the machines can be to each other. Low-latency, high-performance networks are heavily used in in-house HPC environments today, but they aren’t always easily available in cloud environments. Even with 10 Gig technology, you have better bandwidth, but latency is still a concern. This leads to lower performance in clouds.
  • Virtualization – Whether it’s based on legitimate performance issues or just tradition, many HPC sites find that their apps tend to run better on physical hardware than on virtual servers. Enterprise applications in general have been running on virtual environments for quite a while.

In the end, quite a bit is about performance – and ROI. If users can achieve the performance they need and address the above concerns, the cloud will become more widely adopted in HPC. But there is also a break-even point at which having your own cluster system makes more sense.
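As an editorial aside, the break-even idea can be made concrete with a rough calculation. The sketch below is not an X-ISS method; the function name and every dollar figure are hypothetical, chosen only to show the arithmetic. It assumes an owned cluster has a fixed annual cost plus a small running cost per core-hour, while cloud capacity is billed purely per core-hour.

```python
def break_even_core_hours(cluster_fixed_cost_per_year,
                          cluster_cost_per_core_hour,
                          cloud_price_per_core_hour):
    """Return the annual core-hours above which owning a cluster is cheaper.

    Returns None when the cloud is no more expensive per core-hour than
    the cluster's own running cost, because then owning never pays off.
    """
    margin = cloud_price_per_core_hour - cluster_cost_per_core_hour
    if margin <= 0:
        return None
    return cluster_fixed_cost_per_year / margin


# Hypothetical inputs: $500,000 per year in fixed costs (hardware
# amortization, staff, facilities), $0.02 per core-hour to run the
# cluster, and $0.10 per core-hour in the cloud.
hours = break_even_core_hours(500_000.0, 0.02, 0.10)
print(f"Owning breaks even at about {hours:,.0f} core-hours per year")
```

Below that usage level the cloud is the cheaper option under these assumptions; above it, the owned cluster wins. A real comparison would also need to account for the performance concerns listed above.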

Finally, because X-ISS works with a diverse set of customers, we also agree with the model that this industry can be broken into two categories:

  1. High Performance Technical Computing, or a more traditional HPC user; and
  2. High Performance Business Computing.

We’re seeing growth in both categories, but High Performance Business Computing seems to be growing faster and since it is closer to the Enterprise ‘production’ business, it will be interesting to see the variance in cloud adoption rates between the two.

insideHPC: On your site, you talk about how there is a major shortage of skills and software available to provide management, analytics and ROI on HPC systems. What does X-ISS do to help mitigate this problem?

Deepak Khosla: These are two different problems that the HPC industry is facing. The shortage of qualified people with HPC management skills has been an ongoing problem for years. Every organization is focused on keeping its HPC system tuned, running, and highly available to users, and this shortage affects its ability to produce timely results. X-ISS offers a turnkey outsourced system management service called ManagedHPC that takes care of that challenge by providing experienced people and the right monitoring and reporting systems. This is a cost-effective solution, especially for sites where skills are not available or where running efficiently with best practices is important.

Regarding the shortage of analytics software, most commercial options today provide only technical analytics, and these solutions are also tied to a single vendor stack. Not only is this insufficient, it also sets up undesirable vendor lock-in. The reality is that most HPC systems are very heterogeneous, made up of a variety of hardware and software stacks. Through our DecisionHPC software, we provide both business and system analytics on diverse systems, even geographically dispersed environments. The software was built to help improve performance, identify the overall cost, and help organizations create strategies to track and allocate costs, and thus understand all of the performance and ROI implications.

insideHPC: Your DecisionHPC product is all about providing visibility into what is really going on inside the datacenter. What are your clients able to do with that information in terms of business advantage?

Deepak Khosla: At a high level, accurate information is powerful. The ability to make changes and react quickly to improve execution, and to be more competitive and more timely, is invaluable today.

For example, some organizations are in the business of charging for HPC cycles. Clearly, if this is your business model, then you need to know how past jobs were done, so you know how to price effectively. With this information you can react in a timely manner and improve your ability to perform successfully.

Or if you want to be able to charge back for projects to internal or external customers, you should be able to easily do that in a heterogeneous environment, with the right tools.

If high performance computing is an important aspect of your business success, it’s vital to have information on both system costs and performance to maximize your HPC investment. It’s necessary to have insight into how your systems and applications are performing and producing results in a production environment. These are business metrics like any other Key Performance Indicators – and better tools are needed to help businesses track, measure, and make better decisions today.
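As an editorial illustration of the chargeback idea in this answer, the sketch below totals cost per project from job accounting records. It is not DecisionHPC code. The record fields (`project`, `cores`, `wall_hours`), the `chargeback` function, and the flat rate are all hypothetical assumptions; a real site would pull equivalent fields from its scheduler's accounting logs and would likely use different rates per resource type.

```python
from collections import defaultdict


def chargeback(jobs, rate_per_core_hour):
    """Sum the cost of finished jobs per project at a flat core-hour rate."""
    totals = defaultdict(float)
    for job in jobs:
        # Core-hours consumed by this job: cores held times wall-clock hours.
        core_hours = job["cores"] * job["wall_hours"]
        totals[job["project"]] += core_hours * rate_per_core_hour
    return dict(totals)


# Hypothetical accounting records, as might be exported from a scheduler.
jobs = [
    {"project": "cfd", "cores": 64, "wall_hours": 2.0},
    {"project": "crash", "cores": 128, "wall_hours": 0.5},
    {"project": "cfd", "cores": 32, "wall_hours": 4.0},
]

bill = chargeback(jobs, rate_per_core_hour=0.05)
for project, cost in sorted(bill.items()):
    print(f"{project}: ${cost:.2f}")
```

The same per-project totals can feed pricing decisions for sites that sell HPC cycles, since they show what past jobs actually consumed.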

insideHPC: Large-scale clusters have existed for years. What is changing and why is it more important than ever to look at system management in a new light?

Deepak Khosla: One important change that we discussed earlier is that HPC is moving more and more into the business/commercial space. Here cost management, improving efficiencies, and productivity have a direct impact on business. Therefore it becomes vitally important to have the technical and business historical information; some of this has not been as important in the research space in the past. However, even in traditional HPC environments, there is more pressure to measure, track, report on, and make better decisions, based on the analytics created.

insideHPC: How does X-ISS turn instrumentation into business insight?

Deepak Khosla: Firstly, we are now able to collect data from disparate sources – not just hardware, but also schedulers and other sources – and then associate that data with business reasons, which allows us to formulate real business cost or performance. In other words, X-ISS can help customers identify and analyze how projects or business endeavors are performing, and their impact on the business.

Secondly, we also allow for the creation of business metrics based on unique business needs. Because every business is different, the ability to track, store, and visualize these custom metrics leads to better insights, allowing for better decision-making.

insideHPC: Why sponsor a panel like this that focuses on how to talk about HPC to Execs?

Deepak Khosla: HPC is now being leveraged for commercial uses more than ever before, and this trend will grow. The purpose of this panel discussion, which addresses the ‘Business of HPC’, is to help educate technical professionals, such as computer scientists and engineers, on how to have a dialog that relates to the CxO world, and how to identify solutions that can facilitate this dialog. An important part of this dialog is relevant metrics – how do we have a conversation on both the technical and business metrics critical to success? In the past, HPC has relied on metrics related more to pure system performance and hardware and software purchase costs, and not necessarily to business performance. While many of the traditional tools used in research environments are not adequate for next-generation HPC use, there are solutions to this changing need. New analytics tools are available today to help organizations manage HPC with the metrics and KPIs more commonly associated with commercial businesses. The reason for this panel is that our industry is ready to have this conversation.

We see from our perspective there is a growing need to treat all HPC systems as a business – and the language of business is numbers. This panel will open the discussion up, as to how a technical research and development manager, or research computing professional, can speak successfully to a CFO, or a committee evaluating ROI, and performance of any given HPC program.

Whether you’re fighting for grants, or fighting for internal funding, or fighting for customers, it’s time our community understands what metrics are necessary to be successful.

Register Now for the Webinar to be eligible to win a $100 AMEX gift card.

Also posted in Business of HPC, Events, HPC, Webinar

Altair Releases PBS Professional 12.0 with 40 Percent Faster Scheduling

Today Altair released PBS Professional 12.0, the latest version of its widely used workload management and job scheduling solution for HPC.

“PBS Professional 12.0 advances Altair’s technology significantly farther along the road to exascale systems, in which clusters will grow from today’s tens of thousands of processors to a million or more,” said Bill Nitzberg, chief technology officer for PBS Works at Altair. “Our latest release is faster, is more robust and offers better utilization. The HPC landscape is changing faster than ever around Clouds, Green, GPUs, and co-processors making customization key. Our new plug-ins provide the agility needed to keep up with this ever changing landscape.”

Read the Full Story.

Also posted in HPC, HPC Software

Video: What’s New in Moab HPC Suite 7.2?

In this video from SC12, Brady Kimball from Adaptive Computing presents: What’s New in Moab HPC Suite 7.2?

“These latest editions of Moab demonstrate our continued commitment to improving the user experience of Moab as well as the back-end functionality,” noted Michael Jackson, president of Adaptive Computing. “By integrating the latest technology from other industry leaders into our solutions, we are making HPC systems run more effectively, which means manufacturers and researchers can more quickly bring their discoveries to the world.”

Read the Full Story and be sure to check out our SC12 Video Gallery featuring over 50 interviews from the show floor.

Also posted in Events, HPC, HPC Software, SC12, Video

Slurm Workload Manager Built for Speed

Based on the most recent release of the Top500 List, Slurm Workload Manager continues to be the most widely used workload manager on the fastest of the fast: 33 per cent of the top 15 supercomputers use the product.

Slurm, an open-source workload manager designed for the most demanding HPC environments, originated at Lawrence Livermore National Laboratory (LLNL) 10 years ago and has evolved over time with the contributions of more than 100 developers. It remains an important workload manager at LLNL, providing scheduling and other functionality to their Sequoia supercomputer, currently number two in the Top500 and ranked the fastest in the previous Top500 List.

The other supercomputers in the 15 fastest supercomputers using Slurm are Stampede at TACC; Tianhe-1A in China; Curie at the CEA in France; and Helios at Japan’s International Fusion Energy Research Centre. Beyond the top 15 systems, SchedMD, the organisation overseeing the code base for Slurm, estimates that as many as 30 per cent of the supercomputers in the Top500 list are using the open-source workload manager.

“We built Slurm to schedule efficiently resources for the world’s biggest systems and, through simulation, have proven its scalability to an order of magnitude higher than the currently largest systems,” said Moe Jette, CTO of SchedMD. “It’s now one of the most widely used workload managers in the Top500. As we move to Exascale computing requirements, Slurm is the workload manager best positioned to schedule jobs at that scale.”

Outside of the large supercomputer centres, Slurm is gathering momentum. HPC computer manufacturers Bull and Cray frequently provide Slurm as part of their solutions, and Bright Computing now offers Slurm as the default workload manager in Bright Cluster Manager.

This story appears here as part of a cross-publishing agreement with Scientific Computing World.


Also posted in HPC, HPC Software

IBM Replaces LoadLeveler with Platform LSF on x86 Clusters

Over at The Register, Timothy Prickett Morgan writes that IBM is mothballing its own LoadLeveler workload manager for x86 clusters in favor of Platform LSF, which the company acquired a little more than a year ago.

In other HPC software news, IBM has announced that it is going to put all of its weight behind the Platform LSF workload scheduler on x86-based clusters and withdraw its own Tivoli-branded LoadLeveler program for x86-based machines. IBM will sell LoadLeveler for x86-based machines until March 15 of next year and support the software until April 30, 2015. The LoadLeveler V5 for both AIX and Linux on Power will continue to be sold and supported on Power Systems servers and the variant for the BlueGene/Q will also still be available, too. That said, IBM is telling customers that Platform LSF is the workload scheduler of choice for its System x, PureFlex, and Power Systems clusters and grids, so take that into consideration when you are planning.

In related news, Morgan writes that Red Hat Enterprise Linux 6 is IBM’s Linux of choice for its massively parallel BlueGene/Q supercomputers and the Power 775 machines. Read the Full Story.

Also posted in HPC, HPC Software

Video: Moab/TORQUE Support for Intel Xeon Phi

In this video from SC12, Gary Brown from Adaptive Computing presents: Moab/TORQUE Support for Intel Xeon Phi.

“The latest version of Moab was designed to recognize and work with the new Intel Xeon Phi coprocessors, based on the Intel Many Integrated Cores (MIC) technology. This ability to automatically detect Intel Xeon Phi coprocessors – and determine their location and availability – improves processor utilization to more intelligently schedule jobs and removes the need for extensive reprogramming to integrate Intel Xeon Phi coprocessors into existing systems. It also allows for policy-based scheduling, optimizing the choice of accelerators and coprocessors. As Intel Xeon Phi coprocessors are introduced into existing systems, this keeps costs and management efforts at a minimum, while maximizing utilization to ensure the most efficient job processing – by utilizing metrics including the number of cores and hardware threads, physical and memory available (total and free), max frequency, architecture and load.”
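The metric-driven placement described in the quote can be sketched generically. The code below is not Moab or TORQUE code or configuration. The `Node` fields and the `pick_node` function are hypothetical, and the policy shown (filter on free coprocessors and free coprocessor memory, then prefer the least-loaded node) is just one simple rule a scheduler could apply to such metrics.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_coprocessors: int          # coprocessors not currently allocated
    free_coprocessor_mem_gb: float  # free memory per coprocessor
    load: float                     # recent host load average


def pick_node(nodes: List[Node],
              coprocessors_needed: int,
              mem_gb_needed: float) -> Optional[Node]:
    """Pick the least-loaded node that satisfies the job's coprocessor needs."""
    candidates = [
        n for n in nodes
        if n.free_coprocessors >= coprocessors_needed
        and n.free_coprocessor_mem_gb >= mem_gb_needed
    ]
    if not candidates:
        return None  # nothing fits right now; the job would stay queued
    return min(candidates, key=lambda n: n.load)


# Hypothetical cluster state reported by a resource manager.
nodes = [
    Node("node01", free_coprocessors=0, free_coprocessor_mem_gb=8.0, load=0.2),
    Node("node02", free_coprocessors=2, free_coprocessor_mem_gb=6.0, load=1.5),
    Node("node03", free_coprocessors=1, free_coprocessor_mem_gb=7.5, load=0.4),
]

chosen = pick_node(nodes, coprocessors_needed=1, mem_gb_needed=4.0)
print(chosen.name if chosen else "no node available")
```

Because the selection relies only on reported metrics, jobs do not need to hard-code which hosts carry coprocessors, which matches the point in the quote about avoiding extensive reprogramming.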

Read the Full Story.

Also posted in Co-processors, Events, HPC, HPC Hardware, HPC Software, SC12, Video

Configuring Moab to Fairly Share a Supercomputer in a University Setting

In this video from the Adaptive Computing booth at SC12, Jenett Tillotson from Indiana University presents: Configuring Moab to Fairly Share a Supercomputer while Preventing Starvation in a University Setting.


Also posted in Datacenter operations, Events, HPC, HPC Software, SC12, Video

Demo: All-Spark Cube at Adaptive Computing SC12 Booth

In this video from the Adaptive Computing booth at SC12, Ian Nate from Adaptive demonstrates the flexibility of the company’s Moab software through a custom-built All Spark Cube.

“Adaptive Computing, a cloud management and high performance computing outfit in Utah, needed something really cool to bring to their trade shows. Something that makes order out of chaos, and demonstrates their attention to detail in the midst of miles of wiring. They decided building the largest non-commercial LED cube would be a good project, and thus the 16x16x16 All Spark Cube was born. The All Spark Cube was constructed using 10 mm RGB LEDs wired together with three-foot lengths of 16 ga pre-tinned copper wire. In this video, [Kevin] shows off the process of constructing a single row; first the LEDs are placed in a jig, the leads are bent down, and a bus wire is soldered to 16 individual anodes per row.”


Also posted in Events, HPC, HPC Software, SC12, Video

Video: Using Moab Node-sets to Improve Job Scheduling

In this video from the Adaptive Computing booth at SC12, Gabriel Mateescu from Virginia Tech presents: Using Moab Node-sets to Improve Job Scheduling.

Also posted in Events, HPC, HPC Software, SC12, Video

Video: Community HPC Clusters at Purdue University

In this video from the Adaptive Computing booth at SC12, Andrew Howard from Purdue discusses how the community cluster program has moved forward with the help of Moab software at the Rosen Center for Advanced Computing.

Also posted in Events, HPC, HPC Software, SC12, Video

Realizing Energy Efficient Scheduling in a Network of Data Centers

In this video from the Adaptive Computing booth at SC12, Dr. Bastian Koller from the HLRS HPC Center in Stuttgart presents: Realizing Energy Efficient Scheduling in a Network of Data Centers.


Also posted in Datacenter operations, Events, HPC, HPC Software, SC12, Video

Adaptive Computing Leverages TORQUE to Manage Intel Xeon Phi Resources

In this video from SC12, Nick Ihli from Adaptive Computing demonstrates how the company’s TORQUE resource manager works with Intel Xeon Phi. By relaying Intel Xeon Phi instrumentation such as memory availability to the company’s Moab workload manager, the system is able to schedule coprocessor resources efficiently.

“With the amazing capabilities of the latest supercomputing coprocessors such as the Intel Xeon Phi coprocessor, it’s vital to make it as simple as possible to integrate them into existing supercomputers,” noted Robert Clyde, CEO of Adaptive Computing. “The latest iteration of Moab was designed to maximize the investment being made by today’s HPC providers.”


Also posted in Events, HPC, HPC Software, SC12, Video

ATK Aerospace Taps PBS Pro Supercomputing at SC12

In this video from SC12, Ramesh Krishnan from ATK Aerospace describes how the company uses supercomputing resources managed by PBS Pro to simulate, test, and build better and safer products.

Also posted in Events, HPC, HPC Software, SC12, Video

Video: IBM Platform HPC Speeds Applications with Intel Xeon Phi

In this video from SC12, Jie Wu from IBM Platform Computing describes how the company’s Platform HPC software manages computing resources from the new Intel Xeon Phi.

Also posted in Co-processors, Events, HPC, HPC Hardware, HPC Software, SC12, Video


insideHPC.com is a production of insideHPC, LLC. © 2006-2013