Community Opinion: The Exascale Report Asks

Q: The term ‘co-design’ is becoming as ambiguous as cloud computing. How
do you (or your organization) define ‘Co-design’ as it applies to
exascale research and development – what do you see as the opportunities
– and what do you see as the misconceptions around this term?

A: The more “co-design” gets used by everybody to describe whatever
their favorite approach is, the more it joins “cloud”, “HPC going
mainstream” and other terms that have earned mixed credibility through
over-use. The basic message – that we must get the different communities
working together for success (hardware designers, HPC software
professionals and, not least, the scientists) – expresses a very
desirable goal.

But will we really see future hardware products with fundamental design
influences from the exascale community? I think this is unlikely without
significant investment from the user side – industrial users of HPC,
government funders/users of HPC, etc. Funding drives R&D, and if the
funding for design does not come from exascale focused efforts, it will
come from mass market product drivers, which likely will not reflect
exascale needs. I do think many advocating co-design get this and are
trying to convince funders. But equally, the software and end user
community must accept that co-design means they have to change as well.
This is the often forgotten side of the co-design debate. Your
applications, algorithms, etc. will have to adapt (maybe radically) to
meet the hardware – even if the hardware evolves closer to
programmer/user ideals.

Co-design is relevant to the emerging widespread use of petascale too, not
just exascale. The users of modern supercomputers based on lots of
parallelism (petascale, manycore, GPU, whatever) – researchers and
engineers – must accept that software has to evolve with the hardware.
Co-design must mean a partnership of technology providers, HPC software engineers,
and domain scientists – it cannot be code for “change your hardware
to suit our applications”.

========

Q: The term ‘co-design’ is becoming as ambiguous as cloud computing. How
do you (or your organization) define ‘Co-design’ as it applies to
exascale research and development – what do you see as the opportunities
– and what do you see as the misconceptions around this term?

A: Co-design may have different meanings, some covering activities that
would have been labeled “interdisciplinary” a decade ago! It is worth
underlining that “co-design” has recently taken on different meanings in
the US and in Europe. In the US, it is very much a funding vehicle for
government agencies to drive a specific project or program. In Europe, it
takes the form of joint collaborations to improve the performance
(measured according to a number of criteria) of a given software package
(application, tools and software stack) so that it better exploits highly
parallel machines or future architectures.

Let’s concentrate on co-design for exascale application software. This
is a domain in which, on the one hand, we need to prepare the
requirements of future exascale applications and the use cases for
production campaigns, and to work on present limitations; on the other
hand, application developers need a good understanding of the features of
forthcoming architectures in order to change or rethink their models or
algorithms, for example to avoid data transfers at all levels (which we
know are very costly) or to avoid being limited by the decreasing ratio
of memory available per thread. The European Exascale Labs created in
2010 by Intel and leading European HPC organizations play an important
role in this software co-design and feedback loop; we want to contribute
significantly to building this knowledge base in Europe. For example, in
the Paris lab we work with several applications, characterize them using
advanced tools, and identify exascale performance bottlenecks. In the
Leuven lab we develop numerical kernels and test their performance using
a new simulator for many-core architectures. And finally, in Juelich, we
are building a cluster of KNCs and developing the cluster management
software to run it. These chapters are highly complementary, which is
helping to position Intel as a research partner in Europe for exascale
architectures and software.

========

Q: What progress is being made in collaborative exascale “co-design”
efforts? Has anything been done beyond forming numerous working groups
and meetings, and the many discussions and articles on why we need a
global co-design effort?

A: There is general agreement that the only way to address the major
challenges of exascale computing, such as energy consumption, scalability
and resilience, is through co-design. These challenges touch a range of
stakeholders: application developers, HPC facility managers, ISVs and
hardware suppliers. To start with, facility efficiency has become critical
in recent times because centres saw their capital and operating
expenditures grow very steeply from the terascale era to the
multi-petascale era. A good deal of creativity has gone into power and
cooling capacity, the result of work by teams of engineers specialized in
machine-room construction, air flow, thermodynamics and electrical
systems, together with IT vendors. The driver is easily identifiable:
cost control for installation and operations. This is a good example of
a certain form of co-design, one targeting the facility management
segment, and it will benefit the exascale era.

The same is now happening in software development, and leading research
institutions are putting top priority on software re-engineering or new
software development. In the application segment, the drivers for HPC
efficiency can be diverse: time to solution, the complexity or size of
the problem, minimizing the power or cost of solving a problem, or
extreme resiliency, for example. Hence the problem has more than one
angle for defining success. However, we see three trends emerging across
exascale software efforts: first, there is a healthy tendency among
leading scientists to FUNDAMENTALLY rethink how to model their problem or
to recode the core of their numerical methods; second, many research labs
and industrial players are willing to hire more skilled, excellent
programmers to join the scientific and engineering arena for
breakthroughs in highly parallel programming; and finally, there is a
growing demand for highly scalable tools to support decision makers, on
both the hardware and software side. This creates many good opportunities
for scientists and tool developers to collaborate and innovate while
working on the exascale challenge. It is also an excellent case for
competence transfer between higher education centres and large research
institutions, both public and private, as there is not yet enough young
talent to work on extreme parallel computing. To address these
challenges, we are seeing requests from industry and academia for
cooperation to jointly improve application performance on highly parallel
machines, because developers understand they need a better grasp of the
machine architecture, and hardware designers need representative use
cases to stretch system performance.

========

Closing Comment

It seems very few people in the community are sold on Intel’s
declaration of exascale leadership. Intel’s press briefing at ISC
raised a lot of eyebrows and resulted in mostly negative reactions from
a broad audience – from system architects to U.S. government funding
agency representatives.

So, The Exascale Report asked, “With this commitment, has Intel taken
the lead in the race to exascale?”

NVIDIA certainly doesn’t think so. According to NVIDIA Fellow, Dr.
David Kirk, “NVIDIA has set a target to build all the right
technologies to enable an exascale machine in 2018. This includes
dramatic reductions in power in the GPU, while achieving much higher
performance. These are not based on process technologies – 3D process
technology is a very small part of the exascale solution – but on new
circuit design techniques. The first of these improvements will be seen
in our next GPU architecture, Kepler, which will offer a big jump in
performance / watt over our current GPU architecture, Fermi. We are also
working on new software methods like global shared address spaces with
the HPC community to address the programmability and scalability of
exascale applications. In addition, exascale has to be driven by
standard technologies that have a high volume commercial business behind
them. Doing custom and proprietary HPC processors like Intel MIC will
not fly because it will carry a steep price tag and as such will not be
adopted by industry. We have seen this before and we will likely see it
again.”

Andrew Jones
Vice-President, HPC Services and Consulting
Numerical Algorithms Group (NAG)

Marie-Christine Sawley
Exascale Labs Director
Intel Paris

Marie-Christine Sawley
Exascale Labs Director
Intel Paris

Dr. David Kirk
NVIDIA Fellow
NVIDIA