In this RCE podcast, Brock Palen and Jeff Squyres speak with Matei Zaharia about Apache Spark, a fast engine for large-scale data processing.
“Monitoring a large Lustre site, running multiple generations of Lustre filesystems can be a challenge. Some equipment offer vendor specific monitoring interfaces while others, built on open source Lustre, have minimal monitoring capabilities. This talk will report on our operational experience using a homegrown python module to collect data from each filesystem. We will discuss in detail how the data is visualized centrally in Splunk and cross-referenced with users workload to analyze and troubleshoot our environment.”