Entries filed under “Featured Stories”

The historical archive of exclusive in-depth articles written by insideHPC’s editorial staff that you’ll find only at insideHPC.com.

Video: Dynamic GPU Reassignment with NextIO vCORE Appliance

I had grand designs to tape a bunch of demos at ISC10, but time just wouldn’t allow. So after scouting around a bit, I decided to film the best demo that I could find.

In this video, Kyle Geisler from NextIO demonstrates the company’s vCORE Appliance, the “world’s first flexible platform for GPU reassignment in the HPC datacenter.” Watch as he moves GPU resources around on-the-fly, even as they continue to run applications. Consider me impressed, and I can just imagine how powerful this capability will be for putting GPUs to work in the cloud.


Also posted in Datacenter operations, Events, GPUs, HPC, HPC Hardware, System Management, Video | Leave a comment

Video: For HPC, The Russians are Coming!

There was something very new this year at ISC. At the very front of the exhibit hall was a large booth from T-Platforms, a Russian HPC vendor that drew a lot of attention with their hospitality and innovative approach to high performance computing. They’re young, they’re enthusiastic, and HPC is all they do.

In this video, T-Platforms VP of Marketing Alexey Nechuyatov describes the company’s focus on HPC and how they plan to bring their solutions to market in Western Europe.


Also posted in Events, HPC, Video | Leave a comment

Video: My iPad vs. IBM Blue Waters

Thanks to John West, I was able to embed this video from the ISC video blog coverage. I was watching the video and there I was on my iPad, which is kind of an odd feeling. Just so I don’t have to settle for that one second of fame, a couple of people told me that my iPad was the coolest new technology they saw at the conference.

In the beginning montage, you’ll see a brief shot of me working with my iPad. Then Heike Jagode, a computer scientist at the University of Tennessee, talks with Jack Dongarra about exascale software. After that, Stuttgart Supercomputing Chief Michael Resch gives his views on the conference. Later on, Professor Thom Dunning from NCSA shows off the prototype “IH Server Node” from their 10 Petaflops Blue Waters project.

Heike Jagode and John Shalf, head of NERSC’s Advanced Technology Group, did a great job of covering the show. The rest of their videocasts are now posted for viewing at the ISC conference site.


Also posted in Events, HPC, Video | 1 Comment

The Rich Report: Five Minutes with Cray’s CTO, Steve Scott

Now that ISC has turned 25 and I’m seeing all these familiar faces, I’ve been getting a little nostalgic for the Old Cray Days. And as I look around the show floor at all this capitalism going on, I think about how it all goes back to Seymour Cray, the guy who created the supercomputer industry. So this week it was a great pleasure for me to meet with Cray CTO, Steve Scott.

insideHPC: You know, I started my career at Cray back in 1986, so I just wanted to say how wonderful it is to see the company doing so well and making money again.

Steve Scott: It is nice. A lot of the credit I think goes to our CEO, Peter Ungaro. He knows the industry inside and out and he put together a management team that got the finances in order. That was a big deal because people would go, “We like what you’re saying, but what about the financial viability of the company?” So we don’t get that any more.

insideHPC: So I’d like to start out by talking about Exascale. I was having a discussion earlier and someone asked which company is going to get there first. It seems to me it’s down to Cray or IBM.

Steve Scott: You certainly can’t count them out. But there’s actually a record to consider. If you go back and look at not the first peak number or linpack result, but the first sustained application Gigaflop, Teraflop, and Petaflop, they were all on Crays. So the first sustained application Gigaflop was on a CRAY Y-MP in 1988. The first Teraflop was in 1998 on a CRAY T3E. And the first sustained application Petaflop was in 2008 on Jaguar and the CRAY XT-5. So I’ve gone out on a limb and said publicly that the first sustained application Exaflop will be on a Cray in 2018. So that’s our internal target. And this one is going to be harder than the last one. So now that Cascade is kind of in the bag, most of my time is spent thinking about how we get to Exascale.

insideHPC: So what is the superscale user community doing right now to get things going?

Steve Scott: There’s a lot of stuff heating up in the Exaflop race or whatever you want to call it. The DOE Exascale program is getting closer to reality and the DARPA Ubiquitous High Performance Computing program just got off the ground. They’re not specifically targeting Exascale, but UHPC actually grew out of all the study teams that DARPA sponsored: one on hardware, one on software, and one on resilience.

So UHPC is focused on a Petascale in a box, but at power efficiency targets that are good enough for an Exascale. They target 50 sustained Gigaflops per Watt and if you scale that up, that turns into Exascale for 20 Megawatts. So if we can hit the UHPC target, we will be on our way.

Interestingly, the biggest datacenters today are north of 100 Megawatts. That’s the big Internet datacenters, you know Google, Amazon, and Microsoft. So this Exascale system would be 20 Megawatts, which is bigger than any single system today. Jaguar (#1 on TOP500) is the biggest HPC power consumer today at roughly 7 Megawatts, but it’s also the greenest x86 system in the TOP50 in terms of sustained flops per watt.

insideHPC: So if they can hit these targets, the Exascale system would consume roughly three times as much power as Jaguar?

Steve Scott: Yes, and that’s a big factor because you have to consider the budget to run the system and what’s practical from a political perspective. The power budget for a 20 Megawatt system would be roughly $20 Million per year, since power costs about a million dollars per Megawatt year.
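
As a rough sanity check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. This is only a back-of-envelope calculation: the inputs (50 sustained Gigaflops per Watt, roughly 7 Megawatts for Jaguar, about a million dollars per Megawatt year) are taken from the interview, while the variable names are illustrative and not anything Cray or DARPA uses.

```python
# Back-of-envelope check of the power figures quoted in the interview.
# Inputs come from the text; names are illustrative assumptions.

EXAFLOP = 1e18                 # sustained flop/s target for Exascale
uhpc_flops_per_watt = 50e9     # UHPC target: 50 sustained Gigaflops per Watt
jaguar_power_mw = 7.0          # Jaguar's approximate power draw in Megawatts
cost_per_mw_year = 1_000_000   # roughly $1 million per Megawatt year

# Power needed to sustain an Exaflop at the UHPC efficiency target.
exascale_power_mw = EXAFLOP / uhpc_flops_per_watt / 1e6

# How that compares with Jaguar, and what it would cost to run per year.
ratio_to_jaguar = exascale_power_mw / jaguar_power_mw
annual_power_cost = exascale_power_mw * cost_per_mw_year

print(f"Exascale power at UHPC target: {exascale_power_mw:.0f} MW")
print(f"Ratio to Jaguar: {ratio_to_jaguar:.1f}x")
print(f"Annual power bill: ${annual_power_cost:,.0f}")
```

Under these assumptions the numbers line up with the conversation: 20 Megawatts, a bit under three times Jaguar's draw, and about $20 Million per year for power.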

insideHPC: It sounds like the next 8 years are going to be very exciting.

Steve Scott: It is fun. And the past five or so years have been pretty interesting as well. We had this huge technology inflection point, you know, with power and delay and transistor counts are going up, a bunch of things came together which said we’ve got to do things completely differently. That’s why we had the whole multicore phenomenon happening.

But when we look out to Exascale, it’s clear that just straight multi-core x86 is not going to get us there. Fundamentally we have to do something different underneath the covers to get the energy efficiency. So that’s why we’re pretty interested these days in looking at forms of accelerated computing. We formed a partnership with NVIDIA and we’re working with them on some future stuff. It’s not going to take over the world overnight, but it’s headed in the right direction.

insideHPC: So what are the most important things you would like potential customers to know about Cray?

Steve Scott: I would say number one is Efficiency. We’re all about power efficiency and system efficiency. Our focus is entirely on sustained computing, sustained results for real scientists. So we work very, very closely with our end users. We have a relationship with them that is not typical. We know their codes and we work on things that don’t make the peak of the machine any better, but things that help them get sustained results. So the fact that the Jaguar system is sitting at number one on the list is nice, but we really don’t care how fast it runs linpack. There are three applications on Jaguar that are achieving over a Petaflop, sustained, and that’s what it’s all about.

I would say the third thing is that Cray now has complete coverage of the HPC space, from the desk-side up to the biggest supercomputers. We’ve got a collaboration with Microsoft where we offer Windows HPC on our low-end, desk-side boxes. And we have now introduced the CX1000, which is a single rack system that uses Intel processors. It has SMP nodes as well as distributed-memory nodes. It also has graphic acceleration, all integrated and ready to run. So HPC is all we do and we can get you started.

Also posted in Events, HPC | 3 Comments

Exclusive: Russia’s biggest HPC provider talks Rusnano and the bid to spur innovation with HPC

insideHPC talks with Vsevolod (Seva) Opanasenko, CEO of T-Platforms Group

T-Platforms Group, headquartered in Moscow, has quietly been expanding its installed base throughout Russia and CIS, now approaching 200 supercomputer installations, and starting to build a presence in Western Europe. The company’s next-generation operating system, Clustrx, combined with an innovative architectural approach, is getting significant attention in the HPC community.

Last week they announced the award of a $6 million nanotechnology and supercomputing enablement program funded by the Russian Corporation of Nanotechnologies, Rusnano. The program is somewhat unique and could stand out as a model for other countries in that it is designed to create real-world adoption of advanced HPC and nanotechnologies in production environments.

Neither T-Platforms nor Rusnano is very well known throughout this community, so we wanted to give our readers more insight into both this announcement and the organizations behind it. We were fortunate to catch up with Vsevolod (Seva) Opanasenko, CEO of T-Platforms Group, and as a result, are pleased to bring you this feature interview.


insideHPC: Can you tell us a little more about Rusnano?

Vsevolod (Seva) Opanasenko: Rusnano is the Russian Corporation of Nanotechnology and was established by the Russian government in 2007 as a state corporation chartered to co-invest in nanotechnology and supercomputing industry projects that have high commercial potential or social benefit. The corporation also provides scientific and educational programs to help ensure the success of the various investment projects.

Collaborative funding from Rusnano is very specific toward the development of actual real-life simulations to bring companies immediate value and benefits — accelerating the evaluation and analysis of the impact of supercomputing to industrial and scientific discovery. Overall, the Rusnano program goals have been established to eliminate technical barriers to adoption and to improve production and productivity.

insideHPC: What are some of the specific deliverables from this program, and what level of support will organizations receive?

Seva: Well first, T-Services will oversee the project management for this program. We will accept and review the many submissions to determine their feasibility based on a number of criteria such as potential importance, practicality, possible ROI, and applicability to spawning commercially sound results or products.

There are two engagement scenarios we anticipate as being ideally suited for this program: organizations that have appropriate computational tasks identified but have no expertise or computational capacity, or those who have the expertise in software and need only the computational capacity.

We will provide extensive consulting expertise, from helping to fully define the computational problems being addressed, to preparing the task budgets, and finally selecting the most promising tasks to be presented to the Rusnano Expert Council. Then, once an organization is engaged in the program, T-Services will provide, as necessary, the actual modeling, simulation and analysis of the computational tasks. This in itself makes the program quite unique, setting it apart from other programs in which organizations are granted use of computers but receive nothing at this level of deep technical support.

It’s actually quite amazing to think that the program will fund the efforts for 40 different tasks — half of the selected tasks will be directly involved with nanotechnology, or nanoscience, and half will be in support of engineering efforts in areas such as shipbuilding, aerospace, oil and gas, chemistry, pharmaceuticals, energy and construction.

insideHPC: Could you expand on that just a bit. What kind of tasks?

Seva: Well, think of this as a nanotechnology / HPC industry watershed. Essentially, half of the tasks should come from industrial companies and organizations engaged in actual, real-life production or manufacturing, while the other 50% can come from the organizations involved in science and R&D. The bottom line though is that these tasks (or projects) should be aimed at research and development of technologies that can subsequently be commercialized. So, for example, an organization might have a very creative approach requiring a new computational process that could greatly assist in the search for new oil or gas deposits. But the type or capability of computer power may be way out of their reach, or they may not even have the in-house expertise to develop the software for modeling and simulating the process. If the potential for this task looks promising and fits the program criteria, they could in theory receive funding that would provide both the computational cycles and the deep technical expertise necessary to proceed to a proof of concept stage, and even on to production.

insideHPC: Will this program have an impact on HPC adoption outside of Russia?

Seva: HPC adoption, in general, and having nothing to do with geography, faces many barriers when it comes to commercial organizations. HPC systems, historically, have not been practical for organizations without specific expertise in this area of computing. The lack of talent such as the skilled resources needed to develop modeling and simulation software is a deal breaker for many companies. The high Total Cost of Ownership (TCO) often associated with HPC systems has made it extremely difficult for smaller and mid-size companies to justify moving in this direction — especially without a proof of concept, which is essentially impossible to provide in the absence of professional end-to-end technical services.

Programs such as the U.S. Department of Energy’s INCITE program will give away approximately 1.6 billion supercomputer processor hours in 2011. This is an amazing commitment on the part of the U.S. Federal Government to accelerate scientific discovery. However, and this is not to downplay the vital importance of this program in any way, this type of program only helps those organizations who already have extensive modeling and simulations expertise in place. In contrast, the Rusnano program is designed to engage companies with little or no expertise in these areas of advanced computing. An organization only has to have an idea of how relatively large-scale computation, modeling and simulation could help. The organization doesn’t have to boast expertise in software or even have CAD models. In Russia, this will help to ignite many innovative tasks with companies who would not get there on their own. But, beyond Russia, this is an example of market building on a much broader scale. The Rusnano program will bring HPC into many new companies to advance discovery in nanotechnology, a number of scientific research disciplines, and critical production areas requiring advanced computation, with both the supercomputing physical resources and the technical expertise necessary to make it happen.

insideHPC: So, the contract described in this announcement is between Rusnano and T-Services, which I understand is a T-Platforms Group company. Can you tell us a little more about T-Services?

Seva: You are correct in that T-Services is one of the T-Platforms Group companies. The easiest way to describe the unique value proposition of T-Services is to discuss it in the context of T-Platforms.

T-Platforms provides solution integration consulting to assist with basic HPC enablement such as hardware integration and system configuration as well as software optimization focused on helping customers improve application performance. The technical benchmarking and integration specialists at T-Platforms work with customers to develop optimum hardware and software configurations, matching the unique requirements of the user, along with application software optimization to achieve improved application performance where necessary.

T-Services however provides end-to-end simulations and modeling for customers using a wide range of traditional HPC applications. The company’s deep technical expertise includes disciplines such as CFD and structural analysis. T-Services also offers supercomputer center management services to deliver maximum effective use of all HPC systems to improve an organization’s ROI, as well as on-demand computing services from simply allocating compute cycles to implementation, operations and administration of commercial software packages.

The powerful combination of T-Platforms’ flexible hardware architecture and technical system management expertise, combined with the end-to-end modeling, simulation and technical services available from T-Services, enables organizations to achieve time-to-solution advantages while avoiding pre-packaged, pre-configured systems ill-suited to their unique challenges.

T-Services delivers value to organizations who only need HPC occasionally, or would rather outsource certain parts of the HPC simulation workflow.

Another level of support offered by T-Services is management of supercomputer sites. T-Services can act as a management company, taking responsibility for all resources and providing not only the operation and administration talent, but also managing the workflow with customers, from marketing and sales to actually doing the technical tasks.

insideHPC: And can you elaborate just a bit on the working process of this program and the role T-Services will play in the overall program management?

Seva: Sure. This program is actually very well structured and organized. The first stage is the initial evaluation and selection phase. T-Services will interface with all the applicants to help define the proposed tasks, identify and choose appropriate software, establish the proposed working budgets for each task, and generally organize the selected applicants and task descriptions, along with all the requirements for submission to the Rusnano Expert Council.

Then, for the next stage, members of the Rusnano Expert Council will review the submissions based on the defined criteria which I can break down into six pieces.

  • Does the project fit the scope of industries authorized for this program?
  • Is HPC really needed and applicable?
  • Is the project innovative — and could it lead to creation of an original technology?
  • Can this project be co-financed by the author of the problem?
  • Specific to industry — how would any potential profit from solving this computational problem be measured for the organization?
  • Specific to research organizations — might this research lead to creating a technology with a good commercialization potential?

insideHPC: What organizations are eligible for consideration within this program and how would they apply? What do they need to know?

Seva: This program is being launched within Russia only. Any organization, either research or industrial, nano or other qualified industries, may apply, but their ideas should meet the criteria of Rusnano that we’ve mentioned. Organizations should note that the Rusnano Expert Council includes representatives from Rusnano, experts in modeling in engineering and nano applications (not ISVs but users), experts from industry and nanoscience (to evaluate the potential benefits to industry and the degree of innovation of the projects), a representative from the government, and even a representative of a T-Services competitor, to control budget evaluations and methods chosen by T-Services. The organizations apply first to T-Services through the website, submitting their ideas and explaining how they correspond to the criteria, and then after the evaluation process mentioned earlier, T-Services submits the chosen projects to the Rusnano Expert Council.

insideHPC: How long will this program run? What will happen when the program ends?

Seva: Overall, 10 months. The expert council gathers several times through the project but it should select all the 40 tasks within 5 months from the project start — otherwise some simulation tasks that take a couple of months to complete might not be fulfilled in the program’s identified window of ten months. The preliminary results should be presented and evaluated seven months from the project start.

So, there should be at least 40 computational problems corresponding to criteria and approved by the council. The level of co-financing from the participating organizations will be an important measure of success. And finally, upon completion, Rusnano will consider whether it is worthwhile to build a supercomputer center for collective use, for nano and other industries.

Also posted in Events, National and Legislative Action | 1 Comment

Microsoft unwraps technical computing initiative, leaves most to the imagination

In an email posted on the “executive e-mail blog” earlier this month, Microsoft Server and Tools Business president Bob Muglia announced Microsoft’s Technical Computing Initiative:

Our goal is to unleash the power of pervasive, accurate, real-time modeling to help people and organizations achieve their objectives and realize their potential. We are bringing together some of the brightest minds in the technical computing community across industry, academia and science at www.modelingtheworld.com to discuss trends, challenges and shared opportunities.

They say it’s good to have a vision, and Microsoft is long on vision in the announcement:

One day soon, complicated tasks like building a sophisticated computer model that would typically take a team of advanced software programmers months to build and days to run, will be accomplished in a single afternoon by a scientist, engineer or analyst working at the PC on their desktop. And as technology continues to advance, these models will become more complete and accurate in the way they represent the world. This will speed our ability to test new ideas, improve processes and advance our understanding of systems.

In order to realize this vision Microsoft says it will be investing in three core areas: cloud tools for technical computing (this is the angle that Microsoft evidently talked up as it was pre-briefing reporters), tools for writing parallel apps, and tools that facilitate technical computing. Since it’s not clear from my shorthand how that last one is different from the second one, here’s what Muglia had to say:

Develop powerful new technical computing tools and applications: We know scientists, engineers and analysts are pushing common tools (i.e., spreadsheets and databases) to the limits with complex, data-intensive models. They need easy access to more computing power and simplified tools to increase the speed of their work. We are building a platform to do this. Our development efforts will yield new, easy-to-use tools and applications that automate data acquisition, modeling, simulation, visualization, workflow and collaboration. This will allow them to spend more time on their work and less time wrestling with complicated technology.

So what’s at www.modelingtheworld.com? The homepage (which takes a long time to load) is a heavy Silverlight app with a bunch of marketing videos about why science and computing are Good. Useful as far as it goes, if you are trying to inspire young people. Nota bene: HPC Rock Star Thomas Sterling is featured, as are HPC luminaries like Tony Hey, Horst Simon, and Dan Reed; HPC Rock Star Bill Kramer is scheduled to debut later, along with Gordon Bell, Jack Dongarra, Burton Smith, and others. That’s part one of the site.

Part two is the “social ecosystem,” which gathers tweets related to technical computing. Two columns of three tweets each that scroll. With the promise of more in the future — things like abstracts from journals. So…that’s part two.

B for effort, for content…not so much

To be honest, the launch left me wanting to know more, and irritated that Microsoft wasn’t putting it out there, especially given that they launched a new web site dedicated specifically to the idea. They obviously put in a lot of effort creating a snazzy site and building all those videos (but, seriously, no copy and paste? Of any text? What the hell?). And kudos for the possible good the videos might do in inspiring someone to pick a science and engineering field.

But with some of the smartest computational people of this and the past two generations on Microsoft’s payroll, I thought surely there was more involved. What’s really going on here?

So yesterday I talked with Microsoft’s Kyril Faenov about the announcement. Kyril leads the Technical Computing Group, which includes Parallel Computing Platform and Windows HPC Server. One of the first things I asked him was, essentially, “where’s the beef?”

Faenov says that this announcement put a “stake in the ground” for Microsoft (his words), and that it marks the “beginning of a conversation” that Microsoft will be having with the community over the coming months and years as it looks to develop tools to bring more technical computing to more people.

15 million users are a big target

According to Faenov, Microsoft’s own analysis indicates that there are 15 million “technical computing” users out there — domain specialists, analysts, and so on — who could potentially benefit from more powerful tools. As the most ubiquitous computing platform on the planet, Microsoft wants to be the provider of choice of tools that build an easier to use infrastructure and workflow for these users.

But with so much diversity in the types of work that are being lumped together in that 15 million, where is the opportunity for one-size-fits-most software? “We will be helping users with specialized requirements through our partners,” Faenov says, “but there is still a lot of commonality in basic tasks across those user groups, and those are situations in which Microsoft can add value. Areas like visualization, and tools for prototyping mathematical models.” When I started to dig in and look for specific examples, Faenov would only say that they are still working on the strategy and they hope to be able to say more later this fall.

The pain points, and a bottle of Azure salve

“Microsoft sees three top pain points for technical computing users,” explains Faenov. “Skill sets and tools for parallel programming, cost efficiency for computing at large scale, and infrastructure challenges.” And here is where we get to the part of this technical computing initiative where Microsoft has something to offer in the very near term.

According to Faenov, Microsoft will be integrating support for burst computing (grabbing cycles from somewhere else to satisfy a transient computing demand) into this summer’s release of HPC Server. Initially this support will be for scavenging cycles from the other Windows computers on your enterprise’s network, but next year this will be expanded to include integration with Azure, Microsoft’s cloud computing offering. More of an Amazon EC2 kind of gal? Just point your cycle scavenger in that direction and you can use Amazon’s cloud instead of machines scattered around your own network.

Same mistake, second verse

When I first dug into Muglia’s executive letter, I was dramatically underwhelmed — upset even — by the degree to which Microsoft seemed to have missed the boat. After having dug into it with Kyril, I’m at least cautiously optimistic that Microsoft has a plan and is executing against it. But for the amount of attention this release got, perhaps unintentionally from Microsoft’s perspective, I would suggest in hindsight that they should have waited a year to make this push. It really needs to have a few accomplishments and an articulated strategy in order to be taken seriously.

Microsoft made this same mistake with the first round of their HPC operating system development. Big fanfare, with not much to show for it at the time. Over subsequent years they’ve managed to build out a capability that people are finally ready to take seriously.

It appears they didn’t take any lessons at all from that first experience and the backlash from the HPC community. Again they are making a big announcement about something they are going to start doing, real soon now.

I find this frustrating, but I’m still listing myself as a Microsoft supporter in technical computing. Of all the companies currently in the market, only Microsoft and Intel have the wherewithal to bring forth a complete, mature solution for the technical computing ecosystem that dramatically expands the number of people who do technical and high performance computing. I hope at least one of them succeeds.

Also posted in Computing Research, HPC, Tools | Leave a comment

More on SGI’s new ICE system

A few weeks ago SGI launched the successor to its workhorse ICE 8200 systems, the Altix ICE 8400. The fourth generation of the ICE offering, the 8400 is a blade-based, scale-out, distributed memory solution based on Intel’s and AMD’s compute technology. ICE is distinguished from the forthcoming Altix UV system (and from SGI’s current generation Altix 4700 systems) by the lack of shared memory.

SGI has tried to build a lot of choice into this line by offering 5 different compute blade configurations (both Intel and AMD) and a host of storage and interconnect offerings. A big change for the 8400 is that they’ve increased the bandwidth on the backplane and now offer a ratio of 3 switch fabric ports per node port, an advantage that SGI claims over competing solutions that are closer to a 1:1 ratio. The new configuration led SGI to a new record on the SPECmpiL_2007 benchmark: the 8400 scored 51.3, outpacing the previous ICE 8200 as the record holder with 43.3.

In terms of blade choices, your five options are mostly distinguished by InfiniBand networking choices. The Intel blades can host either Nehalem or Westmere with up to 96 GB of memory. Systems can be configured with Mellanox QDR IB HCAs that are single port, dual port/single channel, or two single ports each with a dedicated PCI-e channel. Obviously your choice would depend upon the degree to which your particular workload is constrained by the cluster interconnect, but if you need lots of bandwidth or need to segregate your application and storage traffic, the last option is the way to go. Your options for the AMD Opteron 6100 blade are slightly more limited, with choices of either the dual port/single channel HCA or two single ports. The AMD blade is interesting in that it features 8 DIMMs per CPU socket, or a total of 16 DIMMs per blade (the Appro AMD solution, for example, is an 8+4 configuration).

NASA saves 2 million hours of downtime

The 8400 also features a couple interconnect topology options with both the common hypercube and fat tree topologies. In addition SGI offers an all to all option and their own unique enhanced hypercube topologies. Either hypercube topology choice has an interesting capability versus typical competitor topologies (fat tree and 3D Torus) in that SGI can enable a system to be upgraded with additional nodes without having to shut the entire system down to re-cable the network. NASA took advantage of this feature in a recent upgrade of its Pleiades system to add a new 512-core rack into its existing system while it was running a production workload. NASA Ames estimates that it saved 2 million node hours of downtime from this live integration feature. SGI stated that while this is not a standard product offering, they can certainly work with customers to enable this at other sites as well. NASA added eight additional racks worth of SGI Altix ICE 8400 gear to their system using this same approach.

Another interesting feature of the NASA installation is that they have integrated a total of 32 racks of ICE 8400 into their existing 8200 system, a nice feature if you aren’t quite ready to toss out your old super.

Also posted in Compute, HPC Hardware, New Installations | Leave a comment

IBM adoption moves NVIDIA into the big leagues

Since the first day that NVIDIA spotted a few researchers using its cards for computation, not graphics, and decided to get into the computing business, the company has been focused on building an entire ecosystem around its product. Unlike accelerator vendor ClearSpeed (the darling of SC several years ago), NVIDIA was able to leverage mass market economics (inexpensive-ish hardware) alongside a community of partners enabled by its own investments in programming interfaces, documentation, and “community” to turn good hardware into a phenomenon. It is now enjoying the fruits of those investments and riding a network effect into even deeper penetration in the HPC market. Today the company estimates that there are over 200,000 CUDA developers worldwide. While it’s not clear that NVIDIA’s accelerator hegemony will last forever (both Intel and AMD are pursuing plans to build accelerators into their base silicon), the company is in an outstanding position in the near term.

And more evidence of that came this week as IBM announced that it had integrated both NVIDIA’s current and last generation GPU technology into its scale-out iDataPlex server line. The configuration puts 2 CPUs in a node plus 2 GPUs, either the M2050 Fermi technology or the last generation M1060. The “M” series are shipped without heatsinks and are designed to be embedded into servers that will handle the cooling. Other vendors are working with NVIDIA, of course, including Appro and SuperMicro. But among the top-tier HPC vendors, most of those efforts are in low-end systems like the Octane III from SGI or Cray’s CX-1. HP is said to be working with customers on custom builds that include GPUs, but it isn’t talking about that work publicly.

The move by IBM quadruples the peak FLOPS customers can stuff into a rack to 49 TFLOPS (double precision FLOPS using the M2050 “Fermi” technology), even though the number of Intel Westmere-powered nodes in that rack falls from 84 to 42 to accommodate the GPUs. Potential customers should also be aware that the power per rack increases slightly, from 27.55 kW to 31 kW, but this is not bad at all when you look at the change in FLOPS/kW: from 0.42 TFLOPS/kW without GPUs to 1.6 TFLOPS/kW with GPUs. Of course, all FLOPS are not created equal, and in the GPU-enabled configuration most of the performance comes from the GPUs: 43.3 TFLOPS from the GPUs versus 5.9 from the Westmeres. If your application isn’t well-suited to GPUs you obviously won’t see much benefit from all those added FLOPS. You can learn more about the new iDataPlex option in this video from IBM at YouTube.
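The rack-level arithmetic above can be checked with a short script. This is a minimal sketch that uses only the figures quoted in this article; the variable and function names are illustrative, and the CPU-only rack total is an assumption, derived by doubling the 5.9 TFLOPS quoted for 42 Westmere nodes to cover 84 nodes.

```python
# Figures quoted in the article (peak double-precision TFLOPS, kW per rack).
GPU_TFLOPS = 43.3           # contribution of the M2050 GPUs in the GPU rack
CPU_TFLOPS_42_NODES = 5.9   # contribution of the 42 Westmere nodes in that rack
POWER_CPU_ONLY_KW = 27.55   # rack power without GPUs
POWER_WITH_GPUS_KW = 31.0   # rack power with GPUs

# Assumption: the CPU-only rack holds 84 identical nodes, so its peak is
# taken to be twice the 42-node CPU figure.
cpu_only_tflops = 2 * CPU_TFLOPS_42_NODES
with_gpus_tflops = GPU_TFLOPS + CPU_TFLOPS_42_NODES


def tflops_per_kw(tflops, kilowatts):
    """Peak TFLOPS delivered per kW of rack power."""
    return tflops / kilowatts


cpu_only_eff = tflops_per_kw(cpu_only_tflops, POWER_CPU_ONLY_KW)
with_gpus_eff = tflops_per_kw(with_gpus_tflops, POWER_WITH_GPUS_KW)
gpu_share = GPU_TFLOPS / with_gpus_tflops  # fraction of peak that lives on GPUs

print(f"CPU-only rack:  {cpu_only_tflops:.1f} TFLOPS, {cpu_only_eff:.2f} TFLOPS/kW")
print(f"GPU rack:       {with_gpus_tflops:.1f} TFLOPS, {with_gpus_eff:.2f} TFLOPS/kW")
print(f"GPU share of peak: {gpu_share:.0%}")
```

Under that assumption the results land close to the rounded figures quoted above, and the GPU share makes the caveat concrete: nearly all of the added peak is only reachable by code that runs well on the GPUs.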

NVIDIA’s Sumit Gupta briefed me earlier this week on the announcement, and he took the opportunity to share some performance results. The graph below compares the performance of a 1U, 2 socket, 2.66 GHz Nehalem server with 48 GB of memory to that same server plus 2 Tesla C2050 Fermis. On HPL at least the difference is an 8x improvement. Gupta also walked me through an example he had built to show that $1M gets you about 10 TFLOPS in a CPU-only IB-based cluster, and 50 TFLOPS if you are able to do most of your computation courtesy of GPUs.

Fermi performance on HPL

To demonstrate the size of its reach into the research application space, NVIDIA examined the major codes in use by the Barcelona Supercomputing Center, and found that of the 11 applications that make up the majority of that center’s work, 5 already have CUDA ports (NAMD, AMBER, GADGET, GROMACS, and CHROMA), and 4 have viable alternatives that are ported to CUDA (ABINIT, TeraChem, PHMC, and CDOCK). Notice that there is a lot of molecular dynamics there, and NVIDIA is clearly picking an example center that is dominated by applications it does especially well on, but I don’t think this changes the value of the message. If you fall into a certain class of applications then not only might GPUs be a good option for you, but the application work may already be done.

There is one item of interest that NVIDIA wasn’t talking about this week, though: the ORNL system. A deal was implied during the Fermi launch in the fall of last year through which ORNL would build an ORNL-scale super using NVIDIA’s GPUs, but so far neither side has been willing to spill the beans on any details.

Also posted in Business of HPC, GPUs, HPC, HPC Hardware | 1 Comment

QLogic Aims for the Fences with Infiniband Fabric Suite

Before you mark this as “just another Infiniband press release,” you might want to reconsider.  I had the pleasure of speaking this week with Phil Murphy, VP of QLogic’s Network Solutions Group.  The Network Solutions Group heads up the goodness that is QLogic’s TrueScale Infiniband product suite.  Those who have been around the Infiniband block before will remember that this group was formerly its own company, called PathScale.  QLogic acquired the startup and pumped them full of funding and corporate clout with the fabs.  After several years of work, what they have is a high bandwidth, low latency interconnect that looks like Infiniband, smells like Infiniband but runs like a scalded cat.

Our conversation got off to a quick start with a bit of Infiniband history.  Infiniband was originally designed as a data center consolidation product.  Ethernet, fibre channel and even PCI carried over the same phy was the ambitious dream of the early adopters.  As such, the early protocol stacks reflected the idea of encapsulating multiple frame or packetized network layers over a single interconnect.  Exactly the sort of design that most HPC network gurus cringe at.

Fast forward to 2010.  QLogic has decided to change the face of their Infiniband network stack.  Rather than barreling down the path of “queue-pair” style Infiniband communication [Verbs for those in the know], they have implemented a new connection-less and state-less communication primitive.  The new software layer allows applications to send millions [literally] of concurrent messages without paying a terrible amount of setup penalty.  How many millions?  According to Phil, traditional Infiniband products will peak at around 7 million messages per second.  QLogic’s new stack will hit 30 million messages per second.

QLogic accomplishes all this by going down into the guts of Infiniband routing and QoS metrics in order to tune the fabric for a myriad of different message classes.  Hammering a disk subsystem with large blocks?  They can do that.  Hitting a neighboring node with billions of small messages?  They do that too.  With IFS 6.0 they’ve wrapped up the following additional features:

  • Virtual Fabrics combined with application-specific CoS, which automatically dedicates classes of service within the fabric to ensure the desired level of bandwidth and appropriate priority is applied to each application. In addition, the virtual fabrics capability helps eliminate manual provisioning of application services across the fabric, significantly reducing management time and costs.
  • Adaptive Routing continually monitors application messaging patterns and selects the optimum path for each traffic flow, eliminating slowdowns caused by pathway bottlenecks.
  • Dispersive Routing, which load-balances traffic among multiple pathways and uses QLogic® Performance Scaled Messaging (PSM) to automatically ensure that packets arrive at their destination for rapid processing. Dispersive Routing leverages the entire fabric to ensure maximum communications performance for all jobs, even in the presence of other messaging-intensive applications.
  • Full leverage of vendor-specific message passing interface (MPI) libraries to maximize MPI application performance. All supported MPIs can take advantage of IFS’s pipelined data transfer mechanism, which was specifically designed for MPI communication semantics, as well as additional enhancements such as Dispersive Routing.
  • Full support for additional HPC network topologies, including torus and mesh as well as fat tree, with enhanced capabilities for failure handling. Alternative topologies like torus and mesh help users reduce networking costs as clusters scale beyond a few hundred nodes, and IFS 6.0 ensures that these users have full access to advanced traffic management features in these complex networking environments.

QLogic has gone well out of their way to make Infiniband even more HPC-friendly.  So much so that Dell, IBM, HP and SGI have already signed up to resell/OEM the new gear.  Keep an eye on the continued change in the QLogic Infiniband landscape.  This could change HPC interconnects as we know them.

Correction: SGI remains a Voltaire customer for Infiniband products.

Also posted in Enterprise HPC, HPC, HPC Hardware, Network | 1 Comment

insideHPC talks with Wolfgang Gentzsch about ISC and the world of HPC

As we prepare for the 25th anniversary of the International Supercomputing Conference (ISC’10), we once again tracked down one of our favorite commentators, Wolfgang Gentzsch, for some inside perspective.


insideHPC: ISC is celebrating its twenty-fifth anniversary. How many of the ISC conferences have you attended?

Wolfgang Gentzsch: I’ve been fortunate to have attended all the ISC conferences since the first event. It seems hard to believe that it has been 25 years.

insideHPC: What are some of the most significant changes you have seen with the ISC conference over those years?

Wolfgang Gentzsch: Watching the growth of the conference has been just amazing. And, not only did ISC grow from about 150 attendees in 1986 to almost 2,000 today, the unique, genial atmosphere that this conference has become known for managed to survive that growth.

Looking back over the years, some of us certainly miss the unforgettable wine tours to the Mannheim hinterland in the summertime, but of course, today that would just not be practical. Can you imagine trying to load 2,000 participants into 40 buses for a winery tour?

insideHPC: What do you think will be some of the hottest topics being discussed at ISC’10?

Wolfgang Gentzsch: HPC in the context of Multicore, Green IT, Cloud Computing and, certainly, the organizer of this conference for the past 25 years, Hans Meuer himself. And, purely from the networking standpoint, what are all my HPC friends doing this year?

insideHPC: While the ISC conference is on a much different scale than the ACM/IEEE SC conference, it is seen by many as being just as important — and in many ways — even more important for HPC companies looking to do business in Europe. How would you describe ISC to a company that is new to HPC?

Wolfgang Gentzsch: In an area as fast-changing as HPC, it makes a lot of sense to me to have two supercomputing conferences per year, every 6 months, one in the US and one in Europe. Change happens fast in this community, and once a year is just not enough to keep up with all the advancements. While the annual SC conference attracts a growing number of international attendees, there will always be some number of HPC professionals in Europe who just can’t travel to the annual US event. ISC brings a good balance and affords the vendors an opportunity to get close to their European customers.

ISC attracts various levels of decision makers in research, government and industry. So, for any HPC vendor, new or established, ISC is a great venue for meeting existing and future customers.

insideHPC: What advice do you have for first time attendees coming to ISC’10?

Wolfgang Gentzsch: Before you come to ISC, look at the rich online program. There is something of interest for everyone. My advice would be to make a detailed plan for the talks you wish to attend, the booths you would like to visit, the people you want to meet, and of course the parties where you can relax and network with your colleagues. Without some forethought and planning, the show can be overwhelming.

insideHPC: Tell us a little more about your role at DEISA — and can you explain the role of DEISA to our readers?

Wolfgang Gentzsch: DEISA is the Distributed European Infrastructure for Supercomputing Applications, providing access to HPC resources for individuals and teams of researchers, supporting them with solutions for solving their grand challenge big science applications. Hundreds of researchers in Europe and around the world have so far benefited from this Grid of HPC resources residing at 12 of the largest HPC centers in Europe. DEISA has now been in production for 5 years. My main role in DEISA is to work with the dissemination team, to spread the message widely, and to invite scientists and researchers to make use of this precious e-infrastructure.

insideHPC: Stepping back to take a much larger view of the global HPC community, what is your perspective on the HPC business climate for 2010 — and how does it compare to 2009?

Wolfgang Gentzsch: I would sum up the climate for 2010 as “evolution everywhere.” At one end (the smallest scale) we have increasing cores on the chip for lower power consumption, and at the other end (the large scale) we see the trend to consider Cloud services even for HPC applications. And in general, continuous commoditization with more user-friendly access, and more sectors using simulations on HPC systems.

So, how does this compare to the climate for 2009? I’d say it’s like a battery that’s been recharged and running at full speed. The activity level is exciting and HPC is experiencing renewed vigor and growth.

insideHPC: This brings up an interesting question. Some critics say we are putting way too much hype on Cloud services, and that it is really just a new label for Grid computing. What is your perspective on this?

Wolfgang Gentzsch: Cloud Computing is the result of a natural evolution of our community’s work on distributed computing. 10 years ago, when we talked about Grid Computing, our goal indeed was what Cloud Computing promises today: remote, secure, dependable, consistent, pervasive, and inexpensive access to computing, as just another (new) utility. That’s Cloud today. Grid remained more in the realm of the scientists, with their complex workflows which a Grid can accommodate, and with their needs to adjust (and match) applications and resources. And still, a Cloud can be a service node in a Grid, whenever a scientific workflow component is suitable for the simpler architecture of a Cloud.

Finally, in my opinion, Clouds will serve a useful purpose in a number of areas, but I don’t see Clouds, at least not in the midterm, for those environments where you need the highest performance computing while requiring low latency and powerful interconnect architectures for tightly coupled algorithms.

insideHPC: What do you see as the key enablers to us being able to really advance scientific discovery, and what are the key barriers?

Wolfgang Gentzsch: From my perspective, the most important key enablers will be easy access for every scientist to any HPC system, along with supporting interdisciplinary collaboration in virtual organizations.

A key barrier is that we still do not have enough scientists and engineers, compared with the challenges we face. Unfortunately, attracting many more students to science education early on is a topic still strongly neglected by many governments. I have some hope now that, with great technologies such as Web 2.0 and Cloud Computing, we can start bringing science simulations and other content to our children and students, making it easy for them to access, and delivering programs in an “edutaining” way, such as what is being demonstrated already on the Web, for example, with the GridwiseTech e-School prototype and other projects.

insideHPC: Looking out to the next 3-5 years, how will HPC make a difference in our lives?

Wolfgang Gentzsch: HPC already makes a profound difference. Today, we are enjoying many amenities to make our lives easier, such as the advancements in travel, living, health, leisure, and knowledge.

Many advanced products and technologies have been designed on HPC systems, and HPC systems have allowed us to gain deeper insight into the secrets of nature. I have no doubt that this progress will continue and even accelerate in the coming years.

However, for HPC enthusiasts, this means: no hope for early retirement.

Wolfgang Gentzsch is currently Advisor to the EU funded project DEISA, a member of the Board of Directors of the OGF Open Grid Forum standards organization, and a senior consultant to HPC, Grid, and Cloud companies and governments. Before that he directed the German D-Grid Initiative and was an adjunct professor of computer science at Duke University in Durham and at NC State in Raleigh, and visiting scientist at the RENCI Renaissance Computing Institute at UNC Chapel Hill, North Carolina; Vice Chair of the EU e-Infrastructure Reflection Group e-IRG; and a member of the US President’s Council of Advisors for Science and Technology, PCAST.

Also posted in Events, HPC People | 1 Comment

Adaptive Computing introduces smarter Moab suite

Last month Adaptive Computing announced the latest release of the Moab Adaptive Computing Suite and a new product, Moab Viewpoint. The new features are aimed principally at banking, financial services and enterprise customers, but that doesn’t mean that Adaptive is walking away from their connections with HPC. I talked with Peter ffoulkes (yes, it’s supposed to be lower case), Adaptive’s Vice President of Marketing, to find out what’s in the new release and where the company is headed.

In 2009 Adaptive Computing adopted its new name and embarked on a strategy that reflected the growth in its customer base from HPC to a mix of HPC and enterprise business. Since then the company has been building a name for itself among enterprise customers who need to manage their infrastructure as a service.

An important new feature for this crowd that the new Moab 5.4 brings is the notion of a transactional workflow that allows customers to build complex chains of action and reaction in response to automatically detected events that impact the enterprise IT infrastructure.

For example, let’s say you are a web company that sees a spike in activity following the launch of a new product. Moab 5.4 will see the surge and know that it needs to dynamically re-provision servers and add them to the web server pool. But let’s say that your product includes video instructions that all those new customers are going to watch — in this case Moab will not only avoid stealing resources dedicated to the video pool when it’s looking for new web servers, it will also watch that load and add to video serving resources in response to the spike in web load. This gives customers a way to watch and respond to events as a system, rather than in isolation. A smart addition.

Also new in this version of Moab is support for dynamically provisioning and migrating virtual machines. Adaptive’s Peter ffoulkes says that this is in direct response to conversations they’ve had with their customers and a reflection of the dramatic increase in the use of virtual servers to get closer to full utilization out of their pricey infrastructure. Moab’s Services Manager now hooks into IBM’s open source xCAT cluster manager to provision virtual machines based on VMware, KVM, and Xen (with support for Hyper-V coming). One possible use? As the load varies throughout any given day Moab may migrate VMs from several physical servers on to one central server, either shutting down or redeploying the newly freed resources. When the VMs support it, Moab will use live migration to make the move transparent to users of that VM’s services.
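The consolidation scenario described above can be sketched as a simple packing problem. The following is only an illustration of the idea, not Moab’s actual scheduling logic: it uses a first-fit decreasing heuristic, and the VM names, load units, and host capacity are all hypothetical.

```python
def consolidate(vm_loads, host_capacity):
    """Pack VMs onto as few hosts as possible using first-fit decreasing.

    vm_loads: dict mapping a VM name to its current load (arbitrary units).
    host_capacity: load that a single physical host can carry.
    Returns a list of hosts, each a list of the VM names placed on it.
    """
    hosts = []      # VM names placed on each host
    remaining = []  # spare capacity left on each host
    # Place the heaviest VMs first so small ones can fill the gaps.
    for vm, load in sorted(vm_loads.items(), key=lambda kv: kv[1], reverse=True):
        if load > host_capacity:
            raise ValueError(f"{vm} does not fit on any host")
        for i, spare in enumerate(remaining):
            if load <= spare:
                hosts[i].append(vm)
                remaining[i] -= load
                break
        else:
            # No existing host has room, so bring another one into use.
            hosts.append([vm])
            remaining.append(host_capacity - load)
    return hosts


# Hypothetical off-peak loads for four VMs that normally occupy four servers.
off_peak = {"web-1": 20, "web-2": 15, "video-1": 30, "db-1": 25}
placement = consolidate(off_peak, host_capacity=100)
freed = len(off_peak) - len(placement)
print(placement, "hosts freed:", freed)
```

A real workload manager would also weigh migration cost, memory, and service-level policies before moving anything; the sketch only shows why a low-load period can free whole servers for shutdown or redeployment.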

ffoulkes says that Adaptive has also spent a lot of time tuning the internals of Moab: the 5.4 release uses 80% less memory than previous versions. This has the real impact of allowing a single instance of Moab 5.4 to manage much larger environments.

Adaptive Computing is also expanding their portal ambitions with the release of Moab Viewpoint 1.0, a webby interface for the Moab Adaptive Computing Suite of products. The Java-based Access Portal and command line interfaces are still there as well, but Viewpoint introduces what ffoulkes described as a “Web 2.0-like” feel to the creation and management of virtual private clouds.

HPC-ers, fear not.

But the Moab Adaptive HPC Suite is still available, and ffoulkes was mindful to sketch out the lines from these features to the HPC community. Certainly the dramatic reduction in the memory footprint is a bonus for everyone, and this release included other tuning of the internals according to ffoulkes.

We also talked about some potential new uses of HPC where the dynamic resource provisioning and workflow management that the new Moab offers could be a real benefit. For example, some are experimenting with the deployment of crisis response HPC centers that have to be able to turn on a dime to provide decision support in emergency situations: earthquakes, fires, and the like. Adaptive’s software could be used to manage that infrastructure, automatically shifting it from routine operations that support one type of calculation to addressing the emergency of the moment.

South African HPC

And speaking of HPC, Peter and I had talked the week before about how CHPC, South Africa’s largest supercomputing facility, is using the Moab Adaptive HPC Suite to manage its “zoo” of architectures.

CHPC has a variety of architectures including AMD Opteron, Intel Xeon, IBM Power PC, IBM Power 4+ and Sun Microsystems SPARC processor based systems running a mixture of operating systems including UNIX (Solaris), Linux (SLES) and Microsoft Windows HPC Server 2008. The center uses Moab Adaptive HPC Suite to integrate and manage all of these resources (and their respective batch systems) as one pool, automatically directing tasks to resources as they become available and relieving users of the burden of having to track what processors are available on which machines. They are also taking advantage of Adaptive’s capabilities to dynamically re-provision portions of their clusters from Windows to Linux, avoiding the need to guess ahead of time what the demand for either operating system is going to be.

Also posted in Datacenter operations, System Management | 1 Comment

Rock Stars of HPC: Thomas Sterling

This series is about the men and women who are changing the way the HPC community develops, deploys, and operates the supercomputers we build on behalf of scientists and engineers around the world.

As part of the team that explored the Beowulf computing concept, Thomas Sterling has already revolutionized the high performance computing community once. But the author of six books and a raft of journal and conference publications isn’t ready to leave the hard work of change to someone else. His sights are set now on changing the way we use and build the supercomputers of today and the exascale monsters of tomorrow.


insideHPC: You have such a rich history in this community and have been involved in so many milestone activities — what would you call out as one or two of the high points of your career — some of the things of which you are most proud?

Thomas Sterling: In the broadest sense, I am most proud — or perhaps I should just say “grateful” — for being allowed to continuously stay in and contribute to the field of HPC for three decades since my graduate work at MIT. But if I had to pick two high points they would be Beowulf and ParalleX; the former more than a decade ago and the latter of current active engagement.

Beowulf was an experiment that explored the potential of low cost hardware and software in ensembles to perform real world science computation at unprecedented performance to cost. It was one of several projects in what has become known as “clusters” but I think had significant impact on the community. Today on the Top-500 list 1) the most widely used processor architecture is X86, 2) the most widely used network is Ethernet, 3) the most widely used OS is Linux, 4) the most widely used system configuration is the commodity cluster, and 5) the most widely used programming model is distributed memory message passing. Beowulf was the first research project to implement and explore this synthesis of elements, many in their inchoate phase. We recognized that community education was as critical as technology in accomplishing this paradigm shift and therefore through a series of tutorials and books we provided easy access to this approach. Admittedly I had no idea that it would dominate HPC. In some sense I was just lucky to be at the right place at the right time with a budget, a need, and a good idea.

ParalleX reflects the next HPC paradigm shift or more accurately is an exploratory project of key system elements from semantics through functionality to structures and mechanisms that in synthesis is catalyzing the 6th Phase of HPC.

insideHPC: As the co-author of six books, (an accomplishment in its own right), would you talk a little bit about your commitment to education: how we can both attract the next generation of HPC professionals into the community, and provide them with the experience-based training that they will need to be successful?

Sterling: As a faculty member in a computer science department at a major state university, I have become keenly aware of the challenge of education to attract and train the next generation to the field of HPC.

I teach an introduction course to HPC at the first year graduate and senior undergraduate level (LSU CSC-7600). This course is a bit different from your conventional parallel programming course because it provides strong cross cutting themes of parallelism, performance, and system structures that determine the effectiveness of scalability on real-world systems. The students learn how to program in three different modalities (capability, cooperative, capacity) and corresponding programming interfaces (OpenMP, MPI, Condor) along with related system architectures (SMP, clusters, farms/clouds).

But such courses are not readily available at all of the thousands of colleges across the country. I hate the fact that a young man or woman is robbed of the choice of personal goals and opportunities simply due to their demographics and socio-economic circumstances. To counter this truly unfortunate aspect of the American condition, I have been working, perhaps inadequately, in the realm of distance learning, exporting the course in real-time in high definition to a few remote campuses. My external partner in this has been Professor Amy Apon at University of Arkansas and my colleagues at LSU have been Chirag Dekate and Hartmut Kaiser (among others), in combination with a staff of technology types led by Ravi Parachuri. Petre Holub at Masaryk University in the Czech Republic was the force behind the high definition stuff strongly driven by Ed Seidel and Steve Beck, directors at the Center for Computation and Technology at LSU.

My next book will be my first textbook for this course; it’s a lot harder to write a textbook than my previous efforts. Jose Munoz has been an advocate of this work, and we hope to expand this to other communities. We have run a “Beowulf Bootcamp” for two summers that involved high school students to get them excited about going to college, and therefore (hopefully) to finish high school. With a dropout rate of one third such as that in Louisiana, we need to find ways to motivate our kids to aspire and excel. We should do an entire interview just on this topic. It’s so important and so inadequately addressed at this point.

insideHPC: Anyone researching your background can’t help but notice the long list of volunteer activities through which you have selflessly served this community. Why do you do that? I know from personal experience, there is seldom any real recognition for this type of service — the reward has to be internal. What motivates you to be so active in supporting the HPC community through these activities?

Sterling: Community engagement is critical to the success of the HPC field. In a sense, any discipline that is system oriented requires a community-system approach as it must engage the diversity of talents, expertise, and resources reflected by the many sub disciplines defining the diversity of components, technologies, and methodologies integrated within the single complex system. Therefore, it is out of necessity that one participate in, and sometimes lead, community-led forums that focus on the many enabling leading-edge issues.

When I and my colleagues conducted a number of meetings and tutorials on early Beowulf cluster implementation and application, we were doing this out of a necessity for technology transfer. When I and colleagues conducted a number of workshops related to Petaflops computing, then orders of magnitude away from contemporary capability, this was genuine pursuit of knowledge, perspectives, and concepts. Today, there is a dramatic surge in the domain of Exascale computing led by DOE and DARPA with strong NSF participation as well. These are important exploratory meetings both devising and guiding future work towards this challenging goal less than a decade away. I have been fortunate to be included in these initiatives.

Finally, I am honored by the number of presentations at conferences and workshops I am invited to give, and it is a pleasure to serve the community to the best of my ability in this way. In particular, I have enjoyed providing a presentation every year at the ISC in Germany on summarizing the accomplishments in the field of HPC during the previous year. This June will mark the seventh such talk, and I am grateful to Hans Meuer and the other program organizers for this opportunity. I guess, to be honest, it’s part of the fun.

insideHPC: Are there any people who have been an influence on you during your years in this community?

Sterling: Nothing is more humbling than reflecting on all of the colleagues who have contributed to one’s own accomplishments and in my case there have been, and continue to be, many. To note any would be to fail to identify so many others. But with that acknowledgement of inadequacy, allow me to recognize a few who have had appreciable impact in chronological order:

Bert Halstead was my doctoral thesis advisor at MIT, and it was from him that I learned the critical importance of deep-thinking in the intellectual arenas of abstraction and models of computation, not just as a mental exercise but as important tools to innovation.

Jim Fischer of NASA Goddard Space Flight Center taught me the importance of enlightened but responsible management as a key element of collective achievement in HPC system advancement to serve science applications. It was he who empowered the Beowulf Project in the face of strong resistance, and prevailed.

Paul Messina, formerly of Caltech and currently at Argonne National Laboratory, has been my mentor in working within the HPC community, supporting it and being supported by it, and complementing individual accomplishment through the leveraging of group engagement. He led the CCSF initiative that in 1991 deployed the Intel Touchstone Delta at Caltech, which was the fastest open HPC system in the world at the time and the prototype of the successful line of Intel Touchstone Paragon computers of the 1990s. He and I co-authored my first two books together.

Larry Bergman of the Jet Propulsion Laboratory has been among my most important collaborators over almost a decade of accomplishment, performing at different times as my boss, my program manager, and my research partner. Without Larry, a decade of accomplishment in my professional career would most likely not have occurred. It was from him that I learned my limitations as he seamlessly complemented my strengths with his own to form an effective working partnership driving pursuit of advancement in HPC.

Ed Seidel created the new LSU Center for Computation and Technology that embodied a unique melding of resources at the state and federal level enabling aggressive and innovative HPC research both in end computational science and systems technology (software and hardware). It was within the context of this environment and the opportunities that it afforded that I have been able to conduct my most recent explorations and endeavors. At LSU CCT I am fortunate to work with a small group of research scientists who are making possible these researches of the new horizon of HPC: Maciej Brodowicz, Hartmut Kaiser, and Steve Brandt.

As pivotal as these people have been to me at different stages of my career, in many cases providing real role models as well, there is a group of colleagues who have contributed, and continue to contribute, both to the field of HPC and to my own work. Since they are all very well known to your readership I identify them in no particular order without explanation: Bill Gropp, Bob Lucas, Dan Reed, Guang Gao, John Salmon, Al Geist, Jack Dongarra, Bill Carlson, Kathy Yelick, Horst Simon, Almadena Chtchelkanova, Thomas Zacharia, Burton Smith, Marc Snir, Pete Beckman, Hans Meuer, Jose Munoz, Peter Kogge, Fred Johnson, George Cotter, Bill Dally, Rusty Lusk, Bill Harrod, and Paul Saylor.

Oh, yes, and of course there was Don. What can I say? Without him you would not be conducting this interview in all likelihood. It’s one of those strange things, a chance meeting (at MIT, he was a freshman and I a finishing doctoral student) which could easily never have occurred and yet one’s life changes. Don Becker (now CTO at Penguin) and I collaborated on a number of projects but it was he who developed the first Beowulf systems with a group of young highly motivated implementers to realize my system concept and architectural strategy.

insideHPC: What “non-HPC” hobbies or activities do you have? If you ever really have time off — how do you spend it?

Sterling: There is very little time for extra-professional pursuits but I do, when time permits, engage in three activities beyond HPC:

Sailing is the only activity that I can get involved in during which I truly forget about work. Of course I achieve this also during very scary airplane landings in thunderstorms, but this is not by choice so it doesn’t count as a hobby.

The study of history, in particular the 3rd Millennium BCE which is the late Bronze Age. I am fascinated with how small groups of people catalyzed into large aggregations of what we would recognize as civilization enabled or driven by technological advances and ad hoc experiments in political science.

Got to love it; the brain fascinates me and for no useful purpose over the last decade I have found myself pursuing knowledge related to brain structure, function, and emergent behavior. Thinking about thinking, nature’s own recursion. I suppose Cognitive Science thinks about the process of thinking about thinking. But I’m not there yet.

insideHPC: Approximately how many conferences do you attend each year? What would you say is your percentage of travel?

Sterling: While I don’t travel anywhere nearly as much as Jack Dongarra, I average about two and a half trips per month although peak travel can reach four in any given month. I limit the number of general conferences to four or five a year but attend half a dozen focused workshops a year which I find far more useful and productive. Of course then there is the plethora of program and project meetings. I get much of my work done in the Admiral’s Club at DFW.

insideHPC: How do you keep up with what’s going on in the community, and what do you use as your own “HPC Crystal Ball?”

Sterling: Your electronic publication, and that of your competition, proves very useful in keeping up with the day by day incremental advances and offerings of the industrial community with some valuable information on academic near-term accomplishments as well.

At this time of phase change in the field, the rate of advancement is too fast to be adequately represented by conventional professional society journals. These are valuable for archiving, but not for timely communication. I rely instead on direct contact with contributing scientists and institutions, both at organized forums like workshops and through unstructured side-bar private communication. I am amused by the conventionality of technical program committees of even small workshops on focused topics and their frequent fear of including new work-in-progress research.

My “HPC Crystal Ball” is more of a lens focused in deep space at a narrow part of the sky rather than the total space. I exploit blinders to narrow my scope of interest. I could, of course, completely miss a major important development. But I use foundational challenges of the logic and physics of the space of concern to inform about future directions of the field. It doesn’t always work but it has provided a unique viewpoint.

insideHPC: There are people in our community that are motivated by the science and discovery that we enable others to make, and people that are motivated by the science and engineering of HPC itself. Where do you fall on that spectrum?

Sterling: Like many, the answer for me is “both” but in selective areas in either case.

On the end science and engineering side, two major classes of problems are of interest to me. The first are two examples of classical supercomputing that I feel are essential to the advancement of civilization. These are the development of controlled fusion and the control at the atomic level of molecular dynamics. In the first case, we are challenged by the sheer scale of computation required at Exascale and beyond. In the second case, we are challenged by the need for dramatic improvements in strong scaling.

The second class of problems that absolutely fascinates me is the broad family of dynamic directed graphs as applied to knowledge management including but not limited to machine intelligence. Of course I am strongly focused on your latter domain of the science and engineering of HPC itself. This is particularly the case as the leading edge of the community are now considering revolutionary hardware and software techniques to extend delivered performance in an era of flatlined clock rates and processor core design complexity. This is a very exciting time to be engaged in HPC system research.

insideHPC: What do you see as the most exciting possibility of what we can hope to accomplish over the next 5-10 years through the application of HPC?

Sterling: As excited as I am about the computers themselves, their true value lies in their role as the third pillar of science (complementing experimental observation and theoretical modeling). More than ever, civilization must rely on the strength of HPC in devising new methods to address major challenges of climate, energy, medicine, design search-space optimization, and national security applications.

But my real interest lies in a very different area of application: symbolic computing. I believe we are likely to encounter a renaissance in intelligent computing, not seen since the 1980s, because the need for intelligent systems is growing for knowledge management, data mining, real-time decision making, declarative human interfaces, robotics, target recognition, and many other problems that need to directly manipulate abstractions and their inter-relations rather than raw data. I am particularly intrigued by the new opportunities afforded for self-aware machine intelligence by the Petascale computing systems coming on line with hundreds of Terabytes of main memory and their concomitant memory bandwidth which is critical for effective symbolic computing. I believe that by the middle of the next decade (yes, 15 years from now) symbolic applications will compete with or exceed the demands for cycles of numeric intensive applications.

HAL, are you listening?

insideHPC: What do you see as the single biggest challenge we face over the next 5-10 years?

Sterling: Viva la Revolution! Or to paraphrase a worn-out expression: “It’s the execution model, stupid!”

We are at the leading edge of a phase change in HPC; the sixth by my count over a period of as many decades. As previously suggested, the last two decades have been dominated by the communicating sequential processes (CSP) model which has served well for both MPPs and commodity clusters. But with the forced reliance on multi/many core and the flirtation with GPU accelerators, the model is stretching past the yield point. A new model of computation will become essential before the end of this decade when current technology trends will demand billion-way parallelism, latency hiding for tens of thousands of cycles, global address spaces that are not statically nailed to specific physical hardware, the ability to migrate flow control as easily as data, and an increase in programmer productivity (it just shouldn’t be this hard).

For me the biggest challenge we face over the next 5-10 years is: What is the new model of computation for HPC to replace CSP? And given that answer, whatever that may prove to be (yes, I have my suspicions), how will such a model’s set of intrinsic governing principles influence the co-design of the new programming models (sorry, new languages guys, there is no way around it), runtime and operating systems, and most exciting of all, the new core architectures? HPC system development is going to be fun again.

insideHPC: Any final thoughts you would like to share with our readers?

Sterling: HPC is multi-faceted. I spend too much time out on the fringe pushing the performance limits, but have come to appreciate the challenge of strong scaling required to shorten the execution time of problems that already take far too long (many weeks and months) but cannot effectively employ anywhere near the available processing resources. This isn’t some future Exaflops problem, but real problems in AMR and molecular dynamics that people are trying to work today. To serve these effectively will require a real change in how we build hardware core architectures because it is the inefficiencies in overhead, latency, and contention as well as poor use of available parallelism that are inhibiting better scaling for these and other current applications.

From this you may infer that “no” all the focus on Exascale is not the right thing, but many of the challenges to Exaflops performance of the future are the same as some of the challenges to strong scaling of problems in the present. The current generation of Petaflops machines, with the possible exception of Roadrunner, is really the end of the classical static CSP era. Core architectures of the future will have to incorporate hardware mechanisms that recognize the shared context of thousands or millions of like-cores rather than passing off all such interchange to the I/O subsystem which is not optimized for inter-thread cross-system interoperability.

Making a machine easier to use requires a machine that is easier to use and that is not what we have been building over the last two decades. Too many people, including those in sponsoring agencies, wish the problem to be tractable through uniquely software means. It is not a software problem, at least not exclusively. It is a multi-level system problem; yes, including software but only in conjunction with an enabling hardware solution as well. The new DARPA UHPC program recognizes this truth and is pushing for a holistic perspective towards innovative solutions. A consequence of that is that productivity will be enhanced and users will find systems easier targets to employ.

I will make an outrageous prediction: Exaflops systems in 2020 will be easier to use than Petaflops machines are in 2010.

Also posted in HPC People, Rock Stars of HPC | 4 Comments

Meeting notes: HPC User Forum explores manufacturing and more

Contribution by regular reader Steve Conway, Research Vice President in IDC’s High Performance Computing group.

The thirty-sixth High-Performance Computing (HPC) User Forum meeting took place at the Dearborn Inn in Dearborn, Michigan, April 13-14, 2010.  The meeting was held for the first time back to back with the annual DICE Alliance meeting.

IDC said the global recession hit HPC hard, dropping revenue from roughly $10 billion in 2008 to $8.6 billion in 2009, a decline of 11.6%. But the “supercomputers” segment for HPC systems priced at $500,000 or more grew 25% in 2009, and the top bracket for systems sold for $3 million and up jumped 65% last year. In 2009, IBM and HP finished in a statistical tie for market leadership, while Dell and Cray gained market share.

The meeting’s main topic featured a strong cast of speakers from tier one and supply chain manufacturers: BMI, Boeing, BP, Caterpillar, Chrysler, Ford, General Electric, General Motors, L&L Products, Metacomp, Procter & Gamble, and R-Systems – not to mention organizations aiming to boost HPC usage in the SMB “missing middle”: the Council on Competitiveness, National Association of Manufacturers, National Center for Manufacturing Sciences, NCSA, and USC-ISI.

L&L’s Steve Reagan brought home the benefits of HPC use in the supply chain by stressing that his 800-person firm could not compete without being able to use HPC to design solutions for Ford and others on short notice, typically within a 4-5 week window in the client’s product design cycle.

In a panel session, ISVs outlined their strategies for boosting the scalability of HPC applications and addressing the ballooning issue of software licensing on multicore and heterogeneous platforms. Another session looked at the current realities (as opposed to hype) of cloud computing for HPC. The verdict: it’s a slam dunk for some (e.g., CERN) but too early to call for others.

Government program updates from DoD HPCMP, INCITE, NASA and NSF rounded out the HPC User Forum agenda. The DICE Alliance 2010 agenda featured keynotes from Vic Reis and Gil Weigand of DOE, along with a panel on HPC data center power and cooling and other talks. A joint dinner and touristic experience at the amazing Ford Museum gave attendees from both meetings a chance to mix and mingle.

The next HPC User Forum meeting takes place September 13-15 in Seattle. For more information, go to www.hpcuserforum.com.

Steve Conway is a Research Vice President in IDC’s High Performance Computing group which sponsors the HPC User Forum.


Also posted in Events | 1 Comment

Platform turns its HPC management software up a notch

Platform hopes its HPC Enterprise Edition will encourage commercial adoption of HPC

This week Platform Computing announced that it was turning some of its focus back to HPC, after spending the past year or so aiming at a more traditional large-IT audience. On Monday the company announced the launch of HPC Enterprise Edition, a package of tools that Platform believes will help “enterprise” (I read that as either non-traditional or inexperienced) users in the commercial market ease into HPC.

Platform HPC Enterprise Edition brings together several pieces of Platform’s HPC toolkit into one offering that will help customers manage their cluster from deployment and management to monitoring and job submission. It combines Cluster Manager, LSF, Platform MPI (recall that Platform bought MPI stacks from both HP and Scali), Application Center, RTM monitoring dashboard, and ISF Adaptive Cluster into a single, web-based interface for the entire system lifecycle.

Enterprise Edition is the evolution of what Platform used to call Platform HPC Workgroup Manager — now called HPC Workgroup Edition — but without the 32-node limitation of that solution. Although it could theoretically be used to manage systems of arbitrary size, it really is designed for smaller clusters and COTS applications.

Platform hopes that it can ease the cluster integration and training worries of the IT admin group by providing its pre-integrated tools for installation, management, and monitoring. HPC EE will probably run on your favorite x86-64 Linux, with support for popular distros such as Red Hat, SUSE, CentOS, and Scientific Linux. If you are a mixed Windows/Linux shop then Adaptive Cluster, which integrates with LSF to reconfigure a pool of nodes for either Linux- or Windows-based jobs on the fly, will also address the real requirement that companies not have to reinvent their application infrastructure to move to HPC.

Getting users over the hurdle

But it is the inclusion of Platform Application Center and the pre-integrated COTS applications templates that will help get the users of these systems over some of their entry hurdles. Launched during SC09, Application Center is a shrink-wrapped role-based portal framework. Users or system managers can create templates for various applications and types of jobs (using XML) that allow users to launch applications and manage files from within a web portal. HPC Enterprise Edition comes with seven templates that integrate applications popular with Platform customers: Abaqus, Ansys, Blast, Eclipse, Fluent, LS-Dyna, and Nastran. Visualization is not supported out of the box with HPC Enterprise Edition, but users can upgrade their installation to enable vis in the portal.

The pre-integration of those seven application templates is part of what William Lu, Platform’s director of HPC marketing, emphasized when I spoke with him as the shrink-wrapped advantage. “HPC Enterprise Edition groups our existing commercial-grade technologies together into a single solution,” Lu says. “It is not just a collection of open source technologies.” Whether this line of marketing continues to hold up over time is entirely up to the open source community.

If you’re going to have an HPC hardware vendor as your launch partner…

When asked about pricing Lu said that HPC Enterprise Edition will run customers “a few hundred dollars per node,” and that Platform expects the majority of licenses will ship out through hardware partners, although they will also sell to users directly through their existing sales channels (you can’t buy this online at the website — not yet, anyway). HPC Enterprise Edition is already available, and Platform is launching with a significant HPC hardware partner: Cray. Cray is rebranding the software with its own Cray Cluster Manager nameplate and shipping it to CX1 and CX1000 customers. Although Lu wasn’t ready to talk about other partners yet, he did mention that they are in discussions with Dell and HP.

Interestingly, Cray’s Ian Miller (senior vice president of the productivity solutions group and marketing) told me that Cray Cluster Manager is now the default clusterware on both the CX1 and the CX1000. The Cray CX1-iWS sold by Dell won’t ship with CCM, which makes sense since it is entirely a Windows platform. “However, some customers have their own preferred cluster management software,” says Miller, “and some of our resellers are also aligned with other partners in this space, giving us the flexibility to be able to handle these situations too.”

Also posted in Applied HPC, HPC Software, System Management, Tools | 1 Comment

Looking forward to ISC, an interview with conference founder Hans Meuer

An insideHPC Exclusive Interview with Hans Meuer, Co-founder and Organizer of the International Supercomputing Conference, ISC.

While the HPC community has seen its share of vendor companies that haven’t been able to demonstrate the staying power needed for survival, it’s quite the opposite story when it comes to conferences focused around this community.

The ACM/IEEE SC conference series will hold its 23rd annual meeting in New Orleans in November, the HPCC insider’s conference known as the Newport Conference has wrapped up its 24th annual event in Newport, Rhode Island, and this year marks an impressive 25th anniversary for Europe’s most important HPC event, ISC’10, the International Supercomputing Conference.

insideHPC is pleased to share this exclusive interview with Prof. Dr. Hans Werner Meuer, the man behind the ISC conference.


insideHPC: First of all, congratulations on celebrating the 25th anniversary of the International Supercomputing Conference. It is truly a remarkable achievement. When you reflect on the history of this conference and how much you have achieved, what stands out as something you are particularly proud of?

Meuer: We started out in 1986 as a seminar at the University of Mannheim with 81 local attendees and 11 speakers. We are thus the HPC conference with the longest tradition worldwide. Over time we evolved to become the International Supercomputing Conference (ISC). For our 25th Anniversary in 2010, we are expecting a visitor crowd of 2,000 and as many as 140 exhibitors from all over the world. And, around 200 experts will be sharing their expertise at ISC’10.

We are proud to have realized early on that the IT world needs two major HPC events per year — the SC in the US alone is not sufficient to cater for such a fast-evolving sector like supercomputing. ISC is a good addition and also an alternative to the SC conference.

insideHPC: Will the difficult global economic situation have an impact on the conference this year, and what, if anything, are you doing differently this year because of the economy?

Meuer: Even during ISC’09, we were confronted with the global financial crisis and a poor economic situation. It was exactly one year ago that many of our sponsors predicted that our visitor numbers would plummet by 50 percent. On the contrary, our visitor numbers soared up by 21 percent. Even though there is of course no guarantee as to what things will be like this year, we will stick to our proven strategy. We have increased our efforts to extend and improve our overall concept and in particular our conference program and I believe we have succeeded — this year’s program is the best in our 25 years of history.

insideHPC: What do you believe will be some of the highlights of this year’s conference?

Meuer: Let me summarize the highlights of this year’s conference:

  • An absolute MUST will be High Performance Computational Life Sciences — The Challenge for HPC Systems
  • The panel about hitting the Exascale frontier in the future, including the keynote from Prof. Dr. Horst Zuse, 80 Years of Computing: From Konrad Zuse to Exascale Computing.
  • The session Parallel Computing in the Years to Come focuses on the development of parallel computing tools, especially under the point of view of millions of cores.
  • The HPC-enabled simulation of global warming in the session Supercomputers for Modeling the Climate & the Roles of Energy Production in Climate Change
  • And the upcoming supercomputer countries in Evolving New HPC Markets: China, Middle East & Russia.

insideHPC: Do you have many first-time exhibitors coming to ISC10? Will the number of exhibitors be up from last year?

Meuer: This year we see a 20 percent surge in new exhibitors — about 30 businesses and scientific institutions will be exhibiting at ISC for the first time. We are hopeful to host around 140 exhibitors compared to 120 in 2009. Our exhibitors come from all continents, apart from Australia. For example, Brazil, South Africa, China, Japan, Russia, US and almost all European countries will be represented at our show.

Even though we increased our exhibition space by 15 percent, we are almost sold out at the beginning of February. There are only very few slots left in the industrial area.

Moreover, sponsors continue to value ISC as a core HPC event, which further contributes to the success of the exhibition.

insideHPC: Many people speak fondly about the networking events at ISC. How important are these networking events to the conference?

Meuer: The ISC networking events have become an integral part of our conference. They offer an effective instrument for establishing, maintaining and extending professional contacts. What participants appreciate most is the casual and relaxed atmosphere that makes it particularly easy to meet new and interesting people.

It is important to point out that these networking events are no longer limited to ISC itself but are now also taking place before and after ISC. When choosing the locations for our conferences, one crucial factor is to offer suitable networking locations also outside of the convention centers. Fortunately, with Hamburg we have found a location that offers ideal conditions.

As organizers, we would like to invite our participants and exhibitors to join our 25th Anniversary opening party on May 31.

insideHPC: What advice do you have for a first-time attendee to help them get the most out of ISC10?

Meuer: This year’s conference program offers a broad spectrum for all attendees. New attendees will benefit significantly from the seven tutorials on Sunday, May 30. Topics like GPUs, Manycore, Parallel Programming, Interconnects, LINPACK and HPC Software will be covered extensively in the tutorials.

For non-HPC experts whose day-to-day work demands HPC solutions, the Crash Course on High Performance Computing, on Wednesday, June 2 will prove to be beneficial.

And naturally, they shouldn’t miss the opening party, networking events and the Hot Seat session which is unique to ISC.

Finally, I would advise attendees to register before April 30 because the early bird fees will disappear forever after that.

insideHPC: I’ve heard some people refer to ISC as the “sister event” to the annual SC conference. But in fact, the two conferences are not linked in any way — is that correct?

Meuer: We are completely independent of the SC, even if we certainly acknowledge that the SC is the largest HPC event around. I think that we can both learn a lot from one another, and we surely make use of this opportunity. As for myself, I have participated in all SC events so far, from Orlando in 1988 to Portland in 2009. Many HPC key players from the US have been ISC regulars from the very beginning of the conference, as speakers as well as attendees. As regards the involvement of the SC officials in ISC, I feel there is room for improvement. I believe, however, that this is due to a lack of continuity in the SC which is caused by the annual rotation of responsibilities.

Finally I take the opportunity to thank insideHPC for this wonderful opportunity.

See you soon in Hamburg!


For more background on Prof. Dr. Hans Werner Meuer, check out his CV [PDF].

insideHPC has already started running special feature coverage of ISC’10. To submit items for editorial coverage related to ISC’10, send us an email.


Also posted in Events, HPC People | 1 Comment

insideHPC.com is a production of insideHPC, LLC. © 2006-2013