“Monitoring a large Lustre site running multiple generations of Lustre filesystems can be a challenge. Some equipment offers vendor-specific monitoring interfaces, while other systems, built on open source Lustre, have minimal monitoring capabilities. This talk will report on our operational experience using a homegrown Python module to collect data from each filesystem. We will discuss in detail how the data is visualized centrally in Splunk and cross-referenced with users’ workloads to analyze and troubleshoot our environment.”
Dell has teamed with Intel to create innovative solutions that can accelerate the research, diagnosis and treatment of diseases through personalized medicine. The combination of leading-edge CPUs from Intel and the systems and storage expertise from Dell creates a state-of-the-art solution that is easy to install, manage and expand as required.
“With the current Lustre Performance Monitoring Tool (LMT) no longer in active development, and the current version incompatible with DNE-based Lustre 2.5 deployments, there is a critical need for a new set of tools that deliver the same basic Lustre performance metrics and also work with contemporary releases of Lustre.”
“The Cray-Seagate partnership is helping expand the boundaries of what’s possible in large-scale, data-intensive computing, far beyond what we could have imagined just 10 years ago. This continued innovation using the Lustre open file system is assisting data-intensive applications critical to advancements in important industries around the world.”
“Large-scale HPC IO is usually done either with a file per process or to a single shared file. Single-shared-file IO does not scale as well in Lustre as file-per-process IO. This presentation from Cray’s Patrick Farrell will give details, examine the reasons for this, and explore existing and potential solutions. Group locks and a new feature, lock ahead, will be discussed in the context of strided IO.”
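To make the strided, single-shared-file pattern mentioned in the abstract concrete, here is a minimal Python sketch. It is not taken from the presentation. It assumes that "strided IO" means each rank writes fixed-size blocks interleaved round-robin with the other ranks, and that the file uses plain round-robin striping. The function names `strided_extents` and `ost_index`, and all the sizes used, are hypothetical choices for illustration.

```python
from typing import List, Tuple


def strided_extents(rank: int, nranks: int, block_size: int,
                    nblocks: int) -> List[Tuple[int, int]]:
    """Return the (offset, length) extents one rank writes to the shared file.

    Assumed layout: in each round, rank 0 writes a block, then rank 1, and
    so on, so one rank's consecutive blocks are nranks * block_size apart.
    """
    stride = nranks * block_size
    return [(i * stride + rank * block_size, block_size)
            for i in range(nblocks)]


def ost_index(offset: int, stripe_size: int, stripe_count: int) -> int:
    """Map a file offset to a stripe index, assuming round-robin striping."""
    return (offset // stripe_size) % stripe_count


if __name__ == "__main__":
    # Illustrative values only: 4 ranks, 256 KiB blocks, 1 MiB stripes on 2 targets.
    nranks, block_size, nblocks = 4, 256 * 1024, 3
    stripe_size, stripe_count = 1024 * 1024, 2
    for rank in range(nranks):
        for offset, length in strided_extents(rank, nranks, block_size, nblocks):
            target = ost_index(offset, stripe_size, stripe_count)
            print(f"rank {rank}: offset {offset:>8} length {length} stripe {target}")
```

With these illustrative sizes, the four ranks' blocks in each round all fall inside the same 1 MiB stripe, so every rank writes to the same stripe object at nearly the same time. Under the stated assumptions, that is the kind of access pattern in which ranks would contend for locks on a shared object, whereas a file-per-process layout gives each rank its own file.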