In this video from the 2015 PBS Works User Group, Professor Taisuke Boku from the University of Tsukuba presents "Case Study on PBS Pro Operation on Large Scale Scientific GPU Cluster."
“At the Center for Computational Sciences, University of Tsukuba, we have been operating HA-PACS, a large-scale GPU cluster with 332 computation nodes equipped with 1,328 GPUs and managed by the PBS Professional scheduler. The users span a wide variety of computational science fields, and their resource requests range from a single node to full-scale parallel processing. There are also several categories of user groups, covering both paid and free scientific projects. Operating such a large system is challenging: we must keep the system utilization rate high while maintaining fairness across these user groups. We have successfully maintained over 85% to 90% job utilization under multiple constraints. In this talk, I will provide a case study of how PBS Pro works in our operation, as well as some proposals for more flexible and valuable system operation.”