“We are seeking an HPC Tools Software Engineer for our Engineering Support team on the HPC contract. Lockheed Martin provides High Performance Computing services throughout the HPC lifecycle for computational requirements, architecture, acquisition, and operations to federal government customers. The program provides key supercomputing capabilities for solving important problems in science and technology. The program is involved in efforts to develop scientific software and libraries for HPC platforms. This work involves working on cutting-edge HPC technologies to ensure that scientists and engineers will be able to fully utilize modern HPC systems.”
Video: Monitoring a Heterogeneous Lustre Environment with Splunk
“Monitoring a large Lustre site running multiple generations of Lustre filesystems can be a challenge. Some equipment offers vendor-specific monitoring interfaces, while other systems, built on open source Lustre, have minimal monitoring capabilities. This talk will report on our operational experience using a homegrown Python module to collect data from each filesystem. We will discuss in detail how the data is visualized centrally in Splunk and cross-referenced with users' workloads to analyze and troubleshoot our environment.”
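To give a rough idea of what per-filesystem collection might look like, here is a minimal sketch. It is not the speakers' module: the function names `parse_lustre_stats` and `to_event`, the filesystem and target names, and the sample text are all hypothetical. It assumes the common Lustre `stats` text layout (a `snapshot_time` line followed by `name count samples [unit] min max sum` lines) and emits one JSON line per target, a format that Splunk can generally ingest.

```python
import json


def parse_lustre_stats(text):
    """Parse Lustre-style stats text into a dict of counters (assumed layout)."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        name = fields[0]
        if name == "snapshot_time":
            counters["snapshot_time"] = float(fields[1])
            continue
        # Counter lines look like: name <count> samples [unit] [min max sum]
        if len(fields) >= 4 and fields[2] == "samples":
            entry = {"samples": int(fields[1]), "unit": fields[3].strip("[]")}
            if len(fields) >= 7:
                entry["min"] = int(fields[4])
                entry["max"] = int(fields[5])
                entry["sum"] = int(fields[6])
            counters[name] = entry
    return counters


def to_event(filesystem, target, counters):
    """Serialize one target's counters as a single JSON line."""
    event = {"filesystem": filesystem, "target": target}
    event.update(counters)
    return json.dumps(event, sort_keys=True)


# Hypothetical sample in the assumed stats layout, for illustration only.
SAMPLE = """\
snapshot_time             1409777887.590578 secs.usecs
read_bytes                27846475 samples [bytes] 4096 1048576 14421705314304
write_bytes               16230483 samples [bytes] 1 1048576 14761109479164
get_info                  3735777 samples [reqs]
"""

if __name__ == "__main__":
    print(to_event("scratch1", "scratch1-OST0000", parse_lustre_stats(SAMPLE)))
```

Tagging every event with the filesystem and target name is what would allow data from several Lustre generations to be searched side by side in one central index.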