Entries filed under “Collaborations”

Partnerships between vendors or institutions to develop, deploy, or productize HPC technology

Collaboration with IBM Aims to Make U.K. World Leader in HPC

Today the UK’s Science and Technology Facilities Council (STFC) and IBM announced a collaboration to create one of the world’s foremost centres in software development. With the goal of establishing high performance computing as a highly accessible and invaluable tool for UK industry, the new International Centre of Excellence for Computational Science and Engineering will be located at STFC’s Daresbury Laboratory in Cheshire.

“This new cutting-edge research centre will bring together experts from the science and business worlds,” said Minister for Universities and Science David Willetts. “It will be vital in realising the Government’s ambition for the UK to be a world leader in high performance computing and will provide industry and academia with the tools needed to drive growth and innovation. It will also build on the strengths of the Daresbury Science and Innovation Campus – which recently became an Enterprise Zone – and shows that our £145 million investment in e-infrastructure is attracting international companies to the UK.”

Under the initial three-year agreement, STFC will invest in IBM’s most advanced hardware systems, most notably the Blue Gene/Q and iDataPlex. With a peak performance of 1.4 Petaflops, the Blue Gene/Q system at Daresbury will be one of the UK’s most powerful machines.

Read the Full Story.

Also posted in HPC | Leave a comment

Terascala Looks to Whamcloud for Level 3 Lustre Support

Today Whamcloud announced that the company will now provide Level 3 Lustre support for Terascala storage appliances.

“Lustre has proven to be the fastest parallel file system available today. At Terascala, we’re seeing huge interest in fully supported, packaged Lustre solutions from both traditional HPC customers and commercial organizations that need the performance but lack the technical resources to build and manage their own solutions,” said Steve Butler, CEO at Terascala. “Our agreement with Whamcloud will enable us to continue to deliver superior support to new and existing customers.”

To speed the analysis of big data sets by large interconnected server installations, Terascala builds high-throughput, high-capacity storage appliances that combine the Lustre file system with extensive analysis and optimization tools. According to the company, commercial organizations are increasingly looking to Terascala solutions to improve their ability to process pools of big data, which until now has been limited by the throughput of standard file systems.

Read the Full Story.

Also posted in HPC, HPC Hardware, HPC Software, Storage | Leave a comment

Dell Spikes Xeon E5/R Servers with Nvidia GPUs

Today Nvidia announced that, for the first time ever, the company’s GPUs will be available in Dell PowerEdge rack and tower servers. Tailor-made for technical computing, the new servers combine 512-core Nvidia Tesla M2090 GPUs with the latest Intel Xeon E5/R CPUs based on the Sandy Bridge microarchitecture.

“GPU computing is growing in demand and adoption based on its ability to provide a unique combination of ultra-high performance and energy efficiency,” said Virginia Swink, executive director of Dell Server Solutions. “Integrating accelerator technologies in Dell’s PowerEdge portfolio opens up new usage models, and extends our ability to deliver more cycles to a broader base of scientific and commercial users.”

Read the Full Story.

Also posted in Business of HPC, Compute, GPUs, HPC, HPC Hardware | 1 Comment

Eadline’s Retrospective on Clustered HPC

Douglas Eadline has posted a brief history of parallel computing over at Admin HPC. From Beowulf clusters to multicore to hybrid computing, the industry continues to turn the difficult into the mainstream, and it’s nice to see this perspective from someone who’s been there from the get-go.

One of the driving slogans of early Beowulf Clusters was “cheaper, better, faster.” To a large degree, this has been the case. Because of the many changes in the market, it might be worthwhile to rethink how best to use mainstream hardware for HPC. In many respects, the x86 HPC market has become legitimized, and it now shows up on many marketing pie charts. Other components of the HPC market, including interconnect, storage, and software, have also helped move HPC clusters from the back of labs and server rooms to the front of a respectable and sizable market. Given the hardware pressures facing the market, however, it might be time to set up some “wrong” hardware on those old shelves and see what happens.

Read the Full Story.

In case you missed it, this video features Douglas Eadline, Don Becker, and others providing the backstory for the Beowulf Bash at SC10 in New Orleans.

Also posted in Compute, HPC, HPC Hardware, Video | 1 Comment

PRACE Launches Advanced Training Centres

PRACE, the Partnership for Advanced Computing in Europe, has selected six of its members’ sites as the first PRACE Advanced Training Centres. The mission of the PRACE Advanced Training Centres (PATCs) is to carry out and coordinate training and education activities that enable the European research community to utilise the computational infrastructure available through the organisation. The long-term vision is that such centres will become the hubs and key drivers of European high-performance computing education.

The chosen sites are Barcelona Supercomputing Center, Spain; CINECA – Consorzio Interuniversitario, Italy; CSC – IT Center for Science, Finland; EPCC at the University of Edinburgh, UK; Gauss Centre for Supercomputing, Germany; and Maison de la Simulation, France. In addition to providing education and training opportunities for computational scientists in Europe, the training centres are also the main bodies responsible for producing materials for the PRACE training portal: www.prace-ri.eu/training.

“Contemporary HPC systems offer unprecedented computing power and their architectures are constantly evolving. The on-going challenge has always been to upskill scientists and programmers to maximise efficiency and research output on such systems,” said Dr Simon Wong, leader of the training work package in PRACE-2IP and Head of Education and Training at ICHEC in Ireland. “PRACE has shown its commitment to address this challenge by establishing the PATCs to significantly expand its training programme.”

There will be at least one PRACE PATC in operation at any one time, but the geographical locations of centres, assessed every two years, will vary over time. Training events may also be organised at locations external to PATC hosting sites.

This story originally appeared on HPC Projects. It appears here as part of a cross-publishing agreement with Scientific Computing World.

Also posted in HPC | Leave a comment

Video: How HPC Wales is Powering Economic Change

In this video, David Craddock from HPC Wales presents: How HPC Wales is Powering Economic Change.

HPC Wales is an innovative collaboration aimed at giving businesses and universities involved in commercially focused research access to the most advanced and evolving computing technology available. HPC Wales will invest in state-of-the-art computing technology, infrastructure and facilities on a pan-Wales basis, as well as in high-level skills development and training, and will provide tailor-made support services to business.


Also posted in Digital Manufacturing, HPC, Video | Leave a comment

PRACE Research Initiative Welcomes Denmark, Israel, and Slovenia

The European PRACE Research Infrastructure has announced three new members: Denmark, Israel and Slovenia. Now 24 member countries strong, PRACE is a non-profit organization with a mission to: “enable high impact European scientific discovery and engineering research and development across all disciplines to enhance European competitiveness for the benefit of society.”

“This clearly shows the high level of interest in High-performance Computing (HPC) by so many European Member States and Associated States to the Framework Programme for Research and Technological Development in Europe. PRACE aims to be the living proof of the successful exploitation of HPC as a tool for innovation in Research and Industry in Europe,” said Dr. Maria Ramalho, Chair of the Board of Directors of the PRACE Research Infrastructure.

Read the Full Story.

Also posted in Computing Research | Leave a comment

RENCI, Duke, and IBM to Build Experimental ExoGENI Networking

What will the high speed networks of the future look like? This week RENCI at the University of North Carolina, Chapel Hill and Duke University announced a collaboration with IBM to build a nationwide test bed for next-gen networks. As part of the National Science Foundation’s GENI initiative, the project will deploy and operate 13 ExoGENI sites at research universities and labs across the U.S.

“Future computer science and applied research must bring together computation, storage and network capabilities on a global scale to address emerging complex problems related to network science, large-scale distributed computations, large dataset mobility and future network architectures,” said Baldine. “With ExoGENI, researchers will gain a global, elastic, reconfigurable platform to conduct such research.”

Read the Full Story.

Also posted in Computing Research, HPC, HPC Hardware, Network | Leave a comment

Video: OpenMP Seeks Participation at SC11, IWOMP Call For Papers

In this video, Matthijs van Waveren from the OpenMP ARB discusses the organization’s mission to oversee the OpenMP specification and organize conferences, workshops and other related events. Recorded at SC11 in Seattle.

In related news, the 8th International Workshop on OpenMP has issued its Call for Papers. The 2012 IWOMP event will take place in Rome, June 11-13, 2012.

The International Workshop on OpenMP (IWOMP) is an annual workshop dedicated to the promotion and advancement of all aspects of parallel programming with OpenMP. It is the premier forum to present and discuss issues, trends, recent research ideas and results related to parallel programming with OpenMP. The international workshop affords an opportunity for OpenMP users as well as developers to come together for discussions and sharing new ideas and information on this topic. Deadline is January 31, 2012.

Also posted in Events, HPC Software, SC11, Video | Leave a comment

CATA Survey Seeks Input from Canadian HPC Users


Clipped from: cata.ca

The good folks at the Canadian Advanced Technology Alliance (CATA) are conducting an in-country survey on high performance computing.

The objective of this study is to evaluate the commercial values of High Performance Computing (HPC aka Supercomputing) to Canadian industry. Most HPC experts agree that Canadian adoption of supercomputing lags other nations against which our economy competes, so we will also be studying the barriers to adoption. To be competitive on a global scale, Canadian enterprises need to supercharge their business and R&D processes with supercomputing. Initiatives are being launched in the US and other nations to encourage greater HPC adoption by small and medium sized enterprises, if similar initiatives are not developed for Canada, we’ll be left behind. This HPC study will create a foundation of solid data from which to design initiatives together with the study partners tailored to the needs of Canadian business.

Fellow Canucks “who can speak to the business value of computing” are asked to participate. Read the Full Story.

Also posted in Computing Research | Leave a comment

A New ‘Home Base’ for HPC – ACM’s Newest Special Interest Group


Clipped from: sighpc.org

 

Today ACM launched a new Special Interest Group on High Performance Computing: SIGHPC, the first international group within a major professional society that is devoted exclusively to the needs of students, faculty, and practitioners in high performance computing. Its mission is simple: spread the use of high performance computing, help raise the standards of the profession, and ensure a rich and rewarding career for people involved in the field.

“Part of the excitement of high-performance computing as a career is that it is very multi-disciplinary in nature,” says Cherri Pancake, Professor at Oregon State University and the first Chair of SIGHPC. “HPC brings together computational techniques, algorithms, system software, computer architecture, parallel programming, and system administration. But finding your way among the choices and career paths can be challenging.”

The new group will host a booth in the Main Lobby at SC11. There, prospective members can take advantage of a discounted introductory rate, or they can join anytime at sighpc.org. Read the Full Story.

Also posted in Events, HPC, SC11 | Leave a comment

Interview: Norman Morse of OpenSFS on New Lustre Development Contract with Whamcloud

Today Whamcloud announced a multi-year contract with OpenSFS, the Lustre community group in North America. The contract, which goes out to 2013, covers performance and namespace improvements, and an online file system checker that will maintain distributed coherency across the file system.

To learn more, I caught up with Norman Morse, CEO of OpenSFS.

insideHPC: I have a couple of catch-up questions first. How have the OpenSFS bylaws changed since the LUG meeting in April?

Norm Morse: Per the agreement at LUG2011, the bylaws were changed to implement a Community Representative Board Member and to streamline the contribution process.

insideHPC: Were there any significant developments in Lustre-land at the recent ISC’11 conference?

Norm Morse: The major development at ISC’11 was the signing of a Memorandum of Understanding (MOU) between European Open File System (EOFS) and OpenSFS. The MOU provides each organization with membership in the other, joint participation in executive meetings of both organizations and joint future operations, e.g., a joint booth at SC11.

insideHPC: About today’s news, OpenSFS has been extremely active recently. Can you disclose the size of this contract?

Norm Morse: The contract is for up to $2.1M over two years.

insideHPC: Will the software developed for this contract be added to Lustre’s canonical open source tree?

Norm Morse: Yes.

insideHPC: Will OpenSFS host a user meeting at SC11 in November? Will you have a booth as well?

Norm Morse: There will be a joint EOFS/OpenSFS booth at SC11. We have jointly proposed two Birds of a Feather (BOF) sessions for SC11 – a Lustre BOF meeting and a BOF session on open software in general.

insideHPC: Is OpenSFS now working with the European EOFS organization?

Norm Morse: Yes. Examples above. There is a close working relationship between EOFS and OpenSFS.

insideHPC: When can we expect the next major release of Lustre?

Norm Morse: The next major release of Lustre is 2.1 and is expected before the end of summer. Whamcloud, as you know, has been leading this effort, so they are the ones making the time estimate. OpenSFS and the community have been solidly supporting the effort.

Also posted in Business of HPC, HPC, HPC Software | Leave a comment

Video: Why Create a European Open File Systems Initiative?

In this video, Hugo Falter of the EOFS leads a panel discussion on Why Create a European Open File Systems Initiative? Recorded at ISC’11 in Hamburg.

Panelists:

  • Eric Barton, Whamcloud
  • Prof. Dr. Arndt Bode, Leibniz-Rechenzentrum der Bayerischen Akademie der Wissenschaften
  • Peter Braam, Xyratex
  • Johannes Diemer, Hewlett-Packard GmbH
  • Jean Gonnord, CEA/DAM
  • Brent Gorda, Whamcloud
  • Jacques-Charles Lafoucriere, CEA/DAM
  • Prof. Dr. Volker Lindenstruth, Goethe Univ. of Frankfurt a. Main
  • Prof. Dr. Thomas Ludwig, Deutsches Klimarechenzentrum
  • Eric Monchalin, Bull
  • Norman R. Morse, OpenSFS
  • Klaus Wolkersdorfer, Jülich Supercomputing Centre

A tip of the hat goes to Hugo Falter and Frank Severin for sending us this video.

Also posted in Events, HPC, ISC11, Video | 1 Comment

Video: PROSPECT – Creating a European Technology Platform for HPC

In this video from ISC’11, PROSPECT panelists discuss the creation of a European Technology Platform for HPC. PROSPECT is the “Promotion of Supercomputing Partnerships for European Competitiveness and Technology.”

Discussion Panel:

  • Moderator: Dr S. Girona, Barcelona Supercomputing Centre, Operations Director
  • Kostas Glinos, The European Commission
  • Dr. G. Tecchiolli, Eurotech, Executive Vice President and Chief Technology Officer
  • Mr. H. Falter, ParTec Cluster Competence Centre, Chief Operating Officer
  • Prof. T. Lippert, Research Centre Jülich, Institute for Advanced Simulation (IAS), Managing Director, also Prospect Executive Committee member
  • Prof. A. Bode, Leibniz Research Centre, Managing Director, also Prospect Executive Committee member

A tip of the hat goes to Hugo Falter and Frank Severin for sending us this video.

Also posted in Events, ISC11, Video | Leave a comment

Report: Accelerating HPC Apps Through MPI Offloading

In this guest feature from the HPC Advisory Council, authors Gilad Shainer, Tong Liu, Pak Lui, and Richard Graham explore the advantages of offloading MPI collectives communications from the CPU to the cluster interconnect on various applications’ performance.

Abstract
In the past, performance tuning of parallel applications could largely be accomplished by separately optimizing their algorithms, communication, and computational aspects. However, as we continue to scale to larger machines, these issues become co-mingled and must be addressed comprehensively. MPI collective communications are frequently used for process synchronization, and their performance is critical for scalable, high-performance applications. Collective communication performance can be optimized by offloading these communications to the network, thereby minimizing the negative effect of system noise and jitter and separating them from the rest of the CPU activities. Through application profiling we can identify applications that will benefit greatly from such offloading and determine the associated performance and productivity benefits.

1. Introduction
For many years, high-performance computing applications could be optimized simply by tuning their algorithms, communication, and computational aspects separately. As we move into many-core and many-node compute environments, these issues must be addressed comprehensively. According to the June 2011 TOP500 list of supercomputers, we have entered the petascale era: all of the top 10 systems have demonstrated performance above one petaflop. Several systems now exceed ten thousand nodes, with tens of cores per node (or hundreds in the case of GPGPUs). The Message Passing Interface (MPI) library and the Shared Memory (SHMEM) environment are examples of libraries that provide implementations of collective communications for use by HPC applications. Collective communications have a crucial impact on the performance and scalability of engineering and scientific applications because they are frequently used for operations such as broadcasts for distributing initial input data, reductions for consolidating data from multiple sources, and barriers for global synchronization. These operations tend to have the most significant negative impact on an application’s scalability. In addition, the explicit and implicit communication coupling used in high-performance implementations of collective algorithms tends to magnify the effects of system noise on application performance, further hampering application scalability.
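The way collectives magnify system noise can be illustrated with a back-of-the-envelope model. The sketch below is not from the report: it assumes each rank is independently interrupted during a collective with some small probability (the value 0.001 is purely illustrative), and that a collective is delayed whenever any one participating rank is delayed.

```python
def prob_collective_delayed(p_noise: float, n_ranks: int) -> float:
    """Chance that at least one of n_ranks is interrupted during a collective.

    Assumes (for illustration only) that interruptions are independent and
    that each rank is hit with probability p_noise per collective.
    """
    if not 0.0 <= p_noise <= 1.0:
        raise ValueError("p_noise must be a probability between 0 and 1")
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    # The collective is on time only if every single rank is on time.
    return 1.0 - (1.0 - p_noise) ** n_ranks


# Hypothetical per-rank noise probability; rank counts chosen for illustration.
for ranks in (1, 16, 192, 10_000):
    print(ranks, round(prob_collective_delayed(0.001, ranks), 4))
```

Under these assumptions, noise that is negligible on one rank becomes a near-certain delay once thousands of ranks must synchronize, which is one way to read the scalability concern raised above.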

A recent development, the result of a collaboration between an HPC research center (Oak Ridge National Laboratory) and an InfiniBand vendor (Mellanox Technologies), addresses the collective communication scalability problem by offloading MPI collective communications from the host CPU to the network. This solution provides the mechanism needed to support not only the overlap of computation and communication (allowing the communication to progress asynchronously in hardware, as specified by the MPI Forum for MPI-3), but also simultaneous computation on the CPU for higher application performance. This minimizes the negative effects of system noise and jitter and of CPU activities unrelated to application computation. Offloading the MPI communication semantics from the software MPI library to the network provides a comprehensive solution for emerging scalability and performance challenges, and enables the use of “smart” clustering elements beyond the CPU for next-generation productive HPC.

2. MPI Collectives Offloads
The recent InfiniBand interconnect solutions include new hardware technology to support offloading communication management (Figure 1). The new technology defines a general purpose mechanism for coordinating multiple network operations. In the design process, care was taken to ensure this supports effective implementation of asynchronous collective communications (MPI, SHMEM and others) used by scientific applications. The goal of these enhancements is to relieve communication management workload from the CPU and to enhance the scalability of applications on ultra-scale computer systems.

The new offloading technology, named CORE-Direct®, includes Management-Queue, Multiple Work Request, and the wait task functionality. It is designed to support arbitrary communication patterns and to manage the data dependencies between tasks in these patterns. This was added specifically to support collective operations. The intent is to offload the progression of collective operations to the network, with the CPU being involved only in the completion of the collective communication. MPI collective operations are implemented using an interdependent sequence of network operations executed by each process.
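To make "an interdependent sequence of network operations" concrete, here is a minimal simulation of one well-known collective algorithm, recursive doubling for a sum allreduce. This is a generic textbook pattern chosen for illustration; the report does not say which algorithms CORE-Direct or any particular MPI library uses, and the function name is hypothetical.

```python
def recursive_doubling_allreduce(values):
    """Simulate a sum-allreduce across len(values) ranks.

    Returns the per-rank results and the exchange schedule: one list of
    (rank, partner) pairs per step. The rank count must be a power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("rank count must be a power of two")
    current = list(values)
    schedule = []
    distance = 1
    while distance < n:
        # In each step, every rank exchanges with the rank whose index
        # differs in exactly one bit, then adds the partner's partial sum.
        step = [(rank, rank ^ distance) for rank in range(n)]
        current = [current[rank] + current[partner] for rank, partner in step]
        schedule.append(step)
        distance *= 2
    return current, schedule


results, steps = recursive_doubling_allreduce([1, 2, 3, 4])
print(results, len(steps))
```

Each step consumes the partial sums produced by the previous one, so the exchanges cannot be issued all at once. That data dependency between steps is the kind of ordering a host CPU would otherwise have to drive in software.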

Figure 1 – HPC system architecture (cluster) with MPI offloading

3. Applications Communications Profiling

To determine which applications can benefit from MPI collectives offloads, we need to review the communication patterns of each application, or of application groups with similar network characteristics. MPI profiling can be done with dedicated tools, either open source or commercial, for example mpiP for MVAPICH MPI and IPM for Platform MPI. Through MPI profiling we can determine how much of the MPI communication is done via collective communications and what the associated overhead is. The higher the usage of MPI collectives, the higher the benefit from MPI offloading. The current offload mechanism supports offloading of the MPI Barrier, MPI Broadcast, MPI Allreduce, MPI Reduce, MPI Allgather and MPI Allgatherv collective communications. The rest of the communications are handled by the CPU.
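As a rough illustration of that screening step, the sketch below computes the fraction of MPI time spent in the collectives listed above. The input format (a plain mapping from MPI call name to time) and the sample numbers are hypothetical; real profilers such as mpiP or IPM produce their own report formats, which would need to be parsed first.

```python
# Collectives the report lists as offloadable, written with their
# standard MPI function names.
OFFLOADABLE = {
    "MPI_Barrier",
    "MPI_Bcast",
    "MPI_Allreduce",
    "MPI_Reduce",
    "MPI_Allgather",
    "MPI_Allgatherv",
}


def offloadable_share(mpi_time_by_call):
    """Return the fraction of total MPI time spent in offloadable collectives."""
    total = sum(mpi_time_by_call.values())
    if total <= 0:
        raise ValueError("profile contains no MPI time")
    offloadable = sum(
        seconds for call, seconds in mpi_time_by_call.items() if call in OFFLOADABLE
    )
    return offloadable / total


# Hypothetical profile, in seconds of MPI time per call type.
sample_profile = {"MPI_Allreduce": 80.0, "MPI_Waitall": 15.0, "MPI_Send": 5.0}
print(offloadable_share(sample_profile))
```

A high share suggests an application is a candidate for offloading; a low share suggests most of its communication would still be handled by the CPU.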

3.1 OpenFOAM Computational Fluid Dynamics (CFD) Application
From concept to engineering, and from design to test and manufacturing, engineering relies on powerful virtual development solutions. CFD is performed in an effort to secure quality and speed up the development process. The OpenFOAM (Open Field Operation and Manipulation) CFD Toolbox is a free, open source CFD software package produced by a commercial company, OpenCFD Ltd. It has a large user base across most areas of engineering and science, from both commercial and academic organizations. OpenFOAM has an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics.

Figure 2 – OpenFOAM MPI Profiling Information

Figure 3 – OpenFOAM Performance with and without MPI offloads

OpenFOAM communication profiling is presented in Figure 2 for a 16-node, 192-core cluster configuration. The main collective communication used is MPI Allreduce, which is responsible for 80% of the MPI communications. Using the cluster network to offload the MPI collective communications from the CPU provides a performance increase of more than 20%, as presented in Figure 3.

3.2 Amber Molecular Dynamics Application
Amber refers to two things: a set of molecular mechanical force fields for the simulation of biomolecules (which are in the public domain, and are used in a variety of simulation programs) and a package of molecular simulation programs which includes source code and demos. The current version of the code is Amber version 11, which is distributed by UCSF. Amber is one of the most widely used programs for biomolecular studies, with an extensive user base. It is being used for classical molecular dynamics simulations (NVT, NPT, etc), force field for biomolecular simulations, combined Quantum Mechanics/Molecular Mechanics (QM/MM) implementation and more.

Figure 4 – Amber MPI Profiling Information

Amber communication profiling is presented in Figure 4 for a 16-node, 192-core cluster configuration. The main collective communications used are MPI Allreduce and MPI Allgatherv, which are responsible for 83% of the MPI communications. Using the cluster network to offload the MPI collective communications from the CPU provides a performance increase of more than 30%, as presented in Figure 5.

Figure 5 – Amber Performance with and without MPI offloads

3.3 CPMD Car-Parrinello Electronic Structure and Molecular Dynamics Application
Car-Parrinello Molecular Dynamics (CPMD) is an ab initio electronic structure and molecular dynamics (MD) simulation software that provides a powerful way to perform molecular dynamic simulations from first principles, using a plane wave/pseudopotential implementation of density functional theory. The CPMD code has been used to examine systems including protein active sites, liquid-surface interactions, and surface catalysts. The ability to examine interactions on the nanoscale makes this approach ideal for studying systems where chemical and biological interactions are critical.

CPMD communication profiling is presented in Figure 6 for a 16-node, 192-core cluster configuration. The main collective communications used are MPI Alltoall, MPI Allreduce and MPI Barrier, which are responsible for 90% of the MPI communications. Using the cluster network to offload the MPI collective communications from the CPU provides a performance increase of more than 35%, as presented in Figure 7.

Figure 6 – CPMD MPI Profiling Information


Figure 7 – CPMD Performance with and without MPI offloads

4. Summary
In this paper we explored how offloading MPI collective communications from the CPU to the cluster interconnect affects the performance of various applications. Offloading the MPI collectives enables faster execution of the collective communications, greater overlap of computation and communication, and a lower load on the CPU. The latter brings two benefits: first, it frees more CPU cycles for the application, and second, it minimizes the negative effect of other CPU activity on the most critical communications, the collective operations.

We explored three open source applications by profiling their communications to examine their usage of collective operations, and then reviewed the performance benefits of using collective offloads. As expected, the more collective communications were in use, the higher the performance gain we saw. Our testing was limited to a small cluster of 16 nodes, or 192 cores; we therefore expect to see higher performance benefits at larger cluster sizes.

Acknowledgment
We would like to thank the HPC Advisory Council for providing access time to the council compute center for conducting the described tests.

Download the PDF version of this report.

Also posted in HPC, HPC Software | Leave a comment



insideHPC.com is a production of insideHPC, LLC. © 2006-2011 Sitemap