Case Study: PBS Pro on a Large Scale Scientific GPU Cluster

Professor Taisuke Boku from the University of Tsukuba presented this talk at the PBS User Group. “We have been operating HA-PACS, a large-scale GPU cluster with 332 computation nodes equipped with 1,328 GPUs, managed by the PBS Professional scheduler. The users are spread across a wide variety of computational science fields, with resource requests ranging from a single node to full-scale parallel processing. There are also several categories of user groups, with paid and free scientific projects. Operating such a large system is challenging: it requires keeping the system utilization rate high while maintaining fairness across these user groups. We have successfully kept job utilization over 85%-90% under multiple constraints.”

Video: HPC Technology Panel at the PBS Works User Group

Rich Brueckner from insideHPC moderated this panel discussion on current trends in HPC. “President Obama’s Executive Order establishing the National Strategic Computing Initiative (NSCI) will set the stage for a new chapter in leadership computing for the United States. In this panel discussion, thought leaders from leading supercomputing vendors share their perspectives on current HPC trends and the way forward.”

TSUBAME2: How to Manage a Large GPU-Based Heterogeneous Supercomputer

Satoshi Matsuoka gave this talk at the PBS Works User Group this week. “The Tokyo Tech. TSUBAME2 supercomputer is one of the world’s leading supercomputers, ranked as high as #4 in the world on the Top500 and recognized as the ‘greenest supercomputer in the world’ on the Green 500. With the GPU upgrade in 2013, it still sustains high performance (5.7 Petaflops Peak) and high usage (nearly 2000 registered users). However, such performance levels have been achieved with pioneering adoption of the latest technologies such as GPUs and SSDs that necessitated non-traditional strategies in resource scheduling.”

Altair Moves Forward with PBS Pro 13 at ISC 2015

In this video from ISC 2015, Bill Nitzberg from Altair describes why PBS Professional 13.0 is the biggest release yet for the company. “The industry needs to accomplish a lot in the coming years to deliver a working, useful exascale machine. PBS Pro is only one piece of the puzzle… but it’s an important piece. Job scheduling and workload management are core capabilities – a ‘must have’ for every HPC system – ensuring HPC goals are met by enforcing site-specific use policies, enabling users to focus on science and engineering rather than IT, and optimizing utilization (of hardware, licenses, and power) to minimize waste.”