Today the University of Oxford unveiled its new Arcus Phase B cluster. Integrated by OCF in the UK, the 5,000-core Lenovo cluster will support research across all four Divisions at the University: Mathematical, Physical and Life Sciences; Medical Sciences; Social Sciences; and Humanities.
“As a central resource for the entire University, we really see ourselves as the first stepping stone into HPC,” said Dr. Andrew Richards, Head of Advanced Research Computing at the University of Oxford. “People from PhD students upwards – those who haven’t used HPC before – are who we really want to engage with. I don’t see our facility as just running a big machine; we’re here to help people do their research. That’s our value proposition and one that OCF has really helped us to achieve.”
The new HPC cluster includes Lenovo NeXtScale servers with Intel Haswell CPUs, connected by 40Gb InfiniBand to an existing Panasas storage system. The storage system was also upgraded by OCF, adding 166TB for a total capacity of 400TB. Existing Intel Ivy Bridge and Sandy Bridge CPUs from the University of Oxford’s older machine are still running and will be merged into the new cluster.
With around 120 active users per month, the new HPC resource will support a broad range of research projects across the University. As well as computational chemistry, engineering, financial modeling, and data mining of ancient documents, the new cluster will be used in collaborative projects such as the T2K experiment using the J-PARC accelerator in Tokai, Japan. Other research will include the Square Kilometre Array (SKA) project, and anthropologists using agent-based modeling to study religious groups.
The new service will also support the Networked Quantum Information Technologies Hub (NQIT), led by Oxford, which aims to design new forms of computers that will accelerate discoveries in science, engineering, and medicine.
Twenty NVIDIA Tesla K40 GPUs were also added at the request of NQIT, which co-invested in the new machine. These will also benefit NVIDIA’s CUDA Centre of Excellence, likewise based at the University.
The Simple Linux Utility for Resource Management (SLURM) job scheduler manages the new HPC resource, supporting both the GPUs and the three generations of Intel CPUs within the cluster.
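In practice, users target a particular CPU generation or request GPUs through SLURM’s standard `--constraint` and `--gres` options. The batch script below is a minimal sketch of how such a submission might look; the feature names (`haswell`) and GRES type (`k40`), along with the partition and account names, are site-defined and assumed here for illustration, not taken from Oxford’s actual configuration.

```shell
#!/bin/bash
# Hypothetical SLURM batch script for a heterogeneous cluster.
# Feature and GRES names below are assumptions, not Oxford's real config.

#SBATCH --job-name=example-job
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --time=01:00:00
#SBATCH --constraint=haswell      # pin the job to one CPU generation
#SBATCH --gres=gpu:k40:1          # request one Tesla K40 per node

# Run the application under SLURM's launcher
srun ./my_application
```

Submitting with `sbatch script.sh` lets the scheduler place the job only on nodes tagged with the requested feature, which is how a single SLURM instance can fairly serve Haswell, Ivy Bridge, and Sandy Bridge nodes alongside the GPU nodes.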
“With Oxford providing HPC not just to researchers within the University, but to local businesses and in collaborative projects such as T2K and NQIT, the SLURM scheduler really was the best option to ensure different service level agreements can be supported,” said Julian Fielden, Managing Director at OCF. “If you look at the Top500 list of the world’s fastest supercomputers, they’re now starting to move to SLURM. The scheduler was specifically requested by the University to support GPUs and the heterogeneous estate of different CPUs, which the previous TORQUE scheduler couldn’t, so this forms quite an important part of the overall HPC facility.”