How Allinea Performance Reports Helps Jülich Scale their Applications

In this video from SC14, Bernd Mohr from the Jülich Supercomputing Centre describes how the institution is using Allinea tools for debugging at scale.

Full Transcript:

insideHPC: Bernd, I wanted to ask you something. I heard you’re using the Allinea debugging tools and I wanted to know, what are your thoughts?

Bernd Mohr: So, in our center, the Jülich Supercomputing Centre, we are a PRACE center and a national center, so we need to provide good tool support for our users, and so we have had the debugging tools from the beginning. But actually, what happened this summer was, Allinea brought out these new performance tools. And as you perhaps know, I’m one of the few people in the world who develop performance tools themselves.

So, they asked me to test it and I said, “Okay. I have known these guys for a long time, so let’s take it,” and we take that stuff apart and tell them all the bad things like, “This isn’t working and that’s not working, but I have to tell you, you totally failed.”

But this tool, especially the Performance Reports tool, is amazing. We took it up to scale. We ran it on all 8,000 processes. We ran it on our big lab applications and it just worked. It’s really nice.

We bought it, meanwhile, because it fills in a part that our tools don’t support. We have these complicated, can-do-everything tools, but they intimidate our users. They’re just too powerful. And the Performance Reports tool is something that’s very simple and easy. You first run it and it gives you this performance report which tells you what’s wrong with your CPU, with your network and your memory. It pinpoints it. So, it’s like a one-click performance assessment and it tells you what’s wrong and how to fix it, and it does it in a really good way.

insideHPC: Well, it’s in the form of a webpage, isn’t it?

Bernd Mohr: Kind of. Yes, exactly. Then, if that’s not enough, you can go on. If it’s like, “Oh, this is wrong,” and you need a better tool to look into it, they have a second-level tool called MAP, or you can use tools like ours. As I said, for me the bottom line is: it’s a great tool, it works out of the box, it does what it promises. I suggest everyone try it out themselves.

insideHPC: I’m curious, you are known as an IBM shop. Does that matter? Does Performance Reports care?

Bernd Mohr: Unfortunately, it’s currently focusing on x86 platforms. We have five or six clusters and we have a Blue Gene, and unfortunately it doesn’t work on the Blue Gene, but for us, that’s not a big thing. On the Blue Gene, we have really experienced people who handle large codes. They typically know what they’re doing.

This tool is good for our normal people doing the day-to-day work on our clusters. As I said, we have 500 users to support. The tool is just perfect. It actually helps us scale our user support. Normally, I have a few people helping out and tuning user applications, but we cannot serve all of our 500 users; we don’t have enough people. But if I have a tool like Performance Reports, we tell them to try that out first, and if they have a problem, they can do the first assessment themselves. Then we can use this tool to find the hard cases where my guys need to get involved. This is great.
