Why Hardware is Leaving Software Behind

In the first report from last week’s PRACEdays15 conference in Dublin, Tom Wilkie from Scientific Computing World considers why so much Exascale software will be open source and why engineers are not using parallel programs.

Tom Wilkie, Scientific Computing World

Disruptive change is needed if software is to take full advantage of next-generation and Exascale machines, delegates to the PRACEdays15 conference in Dublin were warned last week. Researchers and engineers may have to reverse the way in which they approach the task of writing software, starting from parallelism and only then thinking about the mathematics, said Mark Parsons, director of the UK’s Edinburgh Parallel Computing Centre (EPCC).

There is currently too little focus on usability, he told his audience: “It’s not got any easier to use HPC over the past 20 years. The amount of hand-holding we [at EPCC] have to give an SME for it to do its first simulation is enormous.” Usability is not just an issue of programming languages, he went on, but also of applications.

Open source software may not scale

Reporting on the EU-funded Cresta project to review software and encourage co-design for Exascale, he warned that two major open-source software packages, OpenFOAM and OpenIFS, would not scale to the massively parallel architectures of the future. “It will cost many man years to port the codes to Exascale and massive parallelism by 2025,” he told the meeting.

The Cresta project, which has now completed its work, revealed that OpenIFS “is in no way appropriate for Exascale,” he continued. The community supporting the code had taken the point to heart and was starting to look at how it could restructure the software.

OpenIFS is, technically, not open source despite the name, but perpetual licences are available free of charge to institutions, though not to individuals. The project is led by the European Centre for Medium-Range Weather Forecasts (ECMWF) to provide an easy-to-use, exportable version of the IFS system used at ECMWF for operational weather forecasting.

The situation for OpenFOAM, one of the most widely used computational fluid dynamics (CFD) codes, is worse, however. “It is not even a petascale code. It’s never going to get to Exascale in its current form,” Parsons said. In fact, the Cresta workers “gave up on” OpenFOAM.

But Exascale software will be open source

The importance of open-source software was underscored at the meeting in comments by Eric Chaput from Airbus, who said that in future the company would be using open-source software as the basis for its engineering simulation work. It would not be relying on commercial software from the independent software vendors (ISVs) because of the cost of licences. Commercial software was too expensive even for a company the size of Airbus, he explained, and in part precisely because of the company’s size: most licensing models are ‘per user’, and Airbus has so many engineers and researchers carrying out simulations and other tasks that paying for so many licences was more than the company was willing to do.

In any case, according to Lee Margetts, lecturer in computational mechanics at the University of Manchester, the ISVs see their market as desktops and workstations, not HPC, let alone Exascale. Reporting on the results of a survey of 250 companies conducted late last year for NAFEMS, the international organization set up to foster finite element analysis, he told the meeting that ISVs were moving towards supporting accelerators such as Nvidia GPUs and the Intel Xeon Phi coprocessor. However, rewriting their code to run on FPGAs or ARM processors, or to cater for features important to Exascale such as fault tolerance and energy awareness, was, for many ISVs, ‘not on their roadmap. Some vendors don’t know what it is,’ he said.

Most engineers use workstations, not HPC clusters

In paying so little attention to the extreme end of computing, the ISVs are simply following the market, according to his presentation. Margetts’ survey looked at the computing platforms that engineers are using to carry out engineering simulation, and it revealed that the standard workstation and laptop are where most engineers do most of their work.

Those ISVs that are aware of developments in Exascale computing see it as a high-risk area. They see several radically different approaches being taken to the hardware and no clarity as to what will be the final design. The ISVs are not going to deliver Exascale software, he continued, and hence Exascale engineering software needs to be open source.

The EPCC’s Mark Parsons reached a very similar conclusion, based on his view that the first Exascale machines are ‘going to be large and difficult systems’ that would be power hungry and would employ parallelism at 500 million to a billion threads. In contrast, he pointed out, most users are not taking full advantage of the parallelism that already exists today. Even on ARCHER, the UK’s national supercomputer facility at Edinburgh, most jobs were running on a few tens of thousands of cores, not hundreds of thousands, he said.

Software is being left behind

In Parsons’ view: “Hardware is leaving software behind, which is leaving algorithms behind.” In contrast to the relentless changes in hardware, “algorithms have changed only incrementally over the past 20 years. Software parallelism is a core challenge. We need to see codes moving forward.”

However, there was a strand of pessimism in Parsons’ presentation. Although he thinks that Europe needs to spend as much on software as on hardware development in its Exascale research program, “I don’t see that money being spent.” Work on the parallel implementation of algorithms was being done in the USA, but much less effort was being undertaken in Europe.

The meeting heard that the issue is not one of government or EU funding alone, nor of the researchers: part of the problem is the inherent conservatism of the end users of applications. Chemists might have confidence, gained over years of use, in codes that scale to 250 nodes, say, but will not trust new codes that scale to thousands of nodes. There is a very similar situation in engineering, Parsons concluded, where companies would stick with, say, Ansys’ commercially available software rather than put their confidence in a new, highly scalable code.

This story appears here as part of a cross-publishing agreement with Scientific Computing World.

Comments

  1. Recently, I attended an interesting presentation by Greg Clifford of Cray. He showed a number of ISV applications that are able to scale to 10,000 cores and more, including our solvers Altair RADIOSS for Crash, Safety & Impacts and Altair AcuSolve for Computational Fluid Dynamics. He concluded that using 1,000 cores per computation will soon be the default in mechanical engineering.
    Well, 1,000 cores is not Exascale, but it is already nearly 10x more than what is currently used for many load cases!

    Initiatives like PRACE [Altair, PSA and Ecole Polytechnique/LMS were awarded PRACE preparatory access for a project on rupture during automotive simulations using RADIOSS] and partnerships with hardware vendors like Cray are really helpful for vendors like Altair in improving solver scalability at large scale, up to 10,000 cores and beyond.

    The other difficulty is dealing with more and more complex multi-physics options that need to run together and achieve the same level of performance, even though some options may be less scalable than others. This constraint is certainly more severe for general-purpose legacy codes than for niche solvers focused on only one type of physics.

    As Altair did, a huge amount of work was performed in the past to adapt those legacy codes to MPI.
    Continuous improvements have made it possible to reach the milestone of running on 10K cores and more today, pushing the limits of pure MPI and hybrid MPI/OpenMP approaches.
    This is state of the art today, but I agree with the author: in the longer run, for true Exascale computing that pushes scalability further still, it is probably not with MPI-style programming that such an ambitious goal will be achieved!
    We would need more asynchronism and parallelism, dynamic load balancing, fault tolerance… ideally embedded in a high-level programming model able to deliver performance and robustness at large scale.
    This is true for both commercial and open-source codes!

    Furthermore, beyond the current trend towards many-core processors, it is difficult to anticipate all the hardware changes of the next 10 years and their effects on HPC software design.
    So, stay focused and ready to move forward!
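
As a footnote to the comment above: the hybrid MPI/OpenMP model it refers to combines MPI, which distributes work across the nodes of a cluster, with OpenMP threads that share memory within each node. The sketch below is a minimal, illustrative example only, not code from any of the solvers discussed; the file name hybrid_sum.c, the array size, and the build and run commands are assumptions made for the sake of the example. Each MPI rank sums its own slice of an array with an OpenMP reduction, and MPI_Reduce combines the partial sums.

    /* Minimal hybrid MPI + OpenMP sketch (illustrative only): each MPI rank
     * owns a slice of an array and sums it with an OpenMP reduction; the
     * partial sums are then combined across ranks with MPI_Reduce.
     * Build and run, for example:  mpicc -fopenmp hybrid_sum.c -o hybrid_sum
     *                              mpirun -np 4 ./hybrid_sum
     */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N_LOCAL 1000000          /* elements per rank (arbitrary choice) */

    int main(int argc, char **argv)
    {
        int provided, rank, nranks;

        /* Request an MPI library that tolerates threaded ranks */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        double *x = malloc(N_LOCAL * sizeof *x);
        for (int i = 0; i < N_LOCAL; ++i)
            x[i] = 1.0;               /* trivial data so the answer is known */

        /* Node-level parallelism: OpenMP threads share this rank's slice */
        double local_sum = 0.0;
        #pragma omp parallel for reduction(+:local_sum)
        for (int i = 0; i < N_LOCAL; ++i)
            local_sum += x[i];

        /* Cluster-level parallelism: combine partial sums across MPI ranks */
        double global_sum = 0.0;
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM,
                   0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("%d ranks x %d threads: sum = %.0f (expected %.0f)\n",
                   nranks, omp_get_max_threads(), global_sum,
                   (double)nranks * N_LOCAL);

        free(x);
        MPI_Finalize();
        return 0;
    }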