In this video from the 2015 PBS Works User Group, Satoshi Matsuoka from the Tokyo Institute of Technology presents “TSUBAME2: How to Manage a Large GPU-Based Heterogeneous Supercomputer.”
“The Tokyo Tech TSUBAME2 supercomputer is one of the world’s leading supercomputers, ranked as high as #4 in the world on the Top500 and recognized as the “greenest supercomputer in the world” on the Green500. With the GPU upgrade in 2013, it still sustains high performance (5.7 petaflops peak) and high usage (nearly 2000 registered users). However, such performance levels have been achieved through pioneering adoption of the latest technologies, such as GPUs and SSDs, which necessitated non-traditional strategies in resource scheduling. Moreover, unlike some mission-oriented supercomputers, such as those in the DoE or certain commercial sectors, TSUBAME2’s usage is tremendously diverse, not only in its application portfolio but also in usage patterns, user expertise, and so on, which aggravates the scheduling challenge. Furthermore, the power shortage after the 3/11 Fukushima accident resulted in a societal mandate to control power usage, so strict power-aware scheduling strategies had to be brought into production. Combined, these were technical adventures that perhaps no other supercomputer had undertaken, but solutions were made possible with PBS Professional as the underlying resource scheduler.”