Interview: Behind the Scenes at the DEEP Project

Estela Suarez, Project Manager of DEEP & DEEP-ER at Jülich

In this interview, Estela Suarez from the DEEP and DEEP-ER projects describes progress towards the goals of energy efficient Exascale computing.

DEEP: By now, the DEEP project is in its final phase – you have roughly one more year to go. What has been the greatest achievement so far?

Estela Suarez: Actually, I would like to highlight that we have come a long way in every part of the project, be it hardware, software or application optimization.

In terms of the hardware, one of the biggest successes was surely getting the Intel Xeon Phi to boot via the Extoll network. This might not sound so special, but for the DEEP project it is, because it is the essential milestone for proving our architectural concept: the Cluster-Booster approach. In traditional heterogeneous architectures, the accelerators cannot boot without a host CPU. Our aim was to develop a cluster – made up of conventional CPUs – and a booster – made up of accelerators – that can both act autonomously while being interconnected via two networks.

DEEP: Which aspects are most noteworthy on the software and application side?

Estela Suarez: The team has been very productive in the software area: the Cluster-Booster protocol is basically ready, and we have defined the programming environment, for which we have worked hard on OmpSs (OpenMP Superscalar System) and on ParaStation MPI. Performance analysis tools have been ported to the DEEP architecture, and the resource management and schedulers are prepared. Additionally, the RAS software is almost ready. It will be responsible for reading out all the sensors in the system, collecting data to analyse the energy consumption of every part of the system, and measuring its overall efficiency.

Regarding the applications, the team initially spent quite some time on training. Working with the DEEP architecture requires scientists from the various academic disciplines to think quite differently from what they are used to. It is really crucial to understand the concept first before you can start optimising your code and trying to benefit from this Exascale architecture in the best way possible. However, all improvements made to their code that allow them to maximise the use of the DEEP concept will also be beneficial when the applications run on any other system.

All in all, we can definitely be very proud of what we’ve achieved so far. But obviously there is still some work left.

DEEP: That sounds pretty exciting! What do you value most about your Exascale endeavor?

Estela Suarez: Definitely the holistic approach we pursue – or co-design, as we like to say. DEEP presents an architectural concept that spans all aspects from hardware to system software to tools to applications. This means we are working on quite a few subjects in parallel and there are quite a few researchers involved, all of them eager to contribute to this bigger picture.

DEEP: What have been the toughest moments in the project?

Estela Suarez: Obviously this co-design approach I was just talking about is an extremely complex undertaking. Hence, we had expected to face some challenges from the beginning. It is really tough to develop and integrate hardware and software at the same time. If you have delays, for instance in the hardware part of the project, this immediately affects the software part as well. You have to come up with a mitigation plan and basically re-adjust the whole project plan. For the DEEP project, that meant we also had to apply for a six-month extension, taking the project duration from 36 to 42 months. But a lot of positive things also came from that turn of the project: we learned quite a few lessons in terms of hardware prototyping. Plus, we realized how flexible, adaptable, creative and innovative we can be as a research project team, despite having as many as 16 partners at the table, dispersed over eight European countries.

DEEP: What is the focus of the project for the months to come?

Estela Suarez: The most important task at the moment is to get a fully functioning prototype up and running. The hardware colleagues in the project are working very hard on this, and we are optimistic that we will achieve this goal by the end of this year.

Having this hardware available will also push our research and experiments in the field of energy efficiency. In addition to the main prototype system installed at the Jülich Supercomputing Centre, we will have an Energy Efficiency Evaluator installed at the Leibniz Supercomputing Centre in Munich. This machine will be built from the same components. It is just a smaller system that allows us to run our tests without interfering with the fine-tuning of the actual prototype system.

Most importantly, our application teams will use the main system to do all the necessary fine-tuning of their applications on the hardware. This work is meant to demonstrate the whole DEEP concept and its real-world applicability. They just can’t wait to get started.

DEEP: Talking about research projects inevitably raises the question of the real-world applicability of the research results and outcomes. How do you see this for the DEEP project?

Estela Suarez: This is definitely a very sensitive issue and it has been on our minds ever since the project started. We are very aware that just because the machine can be scaled to be 1,000 times faster than current ones does not mean that the applications actually benefit from it – we are talking about peak versus sustained performance here. Therefore, it was planned from the very beginning to test our concept by running actual scientific applications on the machine. For this, we work with six application partners, which we have chosen because they will be in need of Exascale resources soon. Just to name two of these: one application deals with climate simulation and a second one with brain simulation. As mentioned, optimizing the applications for the hardware architecture is actually one of the major milestones to be achieved in the very last phase of the project.

DEEP: Do you see a chance that the DEEP prototype will actually live on after the project ends? And do you have any plans for continuing research in the Exascale area?

Estela Suarez: The prototype will remain in service after the project ends. Ideally, we can use it to convince the necessary stakeholders that the concept is valuable and mature enough to become a real product. The technology developed in the project aims at advancing European industry and research and increasing their competitiveness in the future.

Nonetheless, this does not mean that we are developing a machine for the European market only. Rather, we hope that our system will appeal to researchers and industry representatives from the global HPC – or more specifically, Exascale – community.

Spearheading research in this Exascale environment, however, means that we have to develop our concept even further. We have found that the topics of highly scalable I/O and resiliency in particular need to be addressed in more detail. That is why we have opted for an extension of the project and started its successor, DEEP-ER (DEEP – Extended Reach), at the end of last year.

Dr. Estela Suarez is Project Manager of both DEEP & DEEP-ER and is based at Jülich Supercomputing Centre, Germany. She holds a PhD in Physics and has collaborated on a high-energy astrophysics project at the University of Geneva, Switzerland. Estela gained her first programming experience quite early in her studies, and programming accompanied her throughout her academic career. From there, it was only a small step to pursuing a career in high performance computing.

Source: DEEP Project.

Check out our insideHPC interview with the DEEP project team at ISC’14.