In this video from LUG 2015 in Denver, Steve Simms from Indiana University presents: Scalability Testing of DNE2 in Lustre 2.7.
“Comet is really all about providing high-performance computing to a much larger research community – what we call ‘HPC for the 99 percent’ – and serving as a gateway to discovery,” said SDSC Director Michael Norman, the project’s principal investigator. “Comet has been specifically configured to meet the needs of researchers in domains that have not traditionally relied on supercomputers to solve their problems.”
“We are now working with over 100 channel partners globally. You can get access to Intel Lustre from almost everyone who sells storage or compute worldwide. We’re expanding this to include software partners, cloud partners. We want to create the best product possible out of this open source technology, and make it available economically to the channel partner, and enable you to go after these hugely expanding markets of cloud and big data, while not giving up on HPC.”
“Data caching can provide increased performance when using a mix of high and low performance storage, but traditional replacement algorithms like LRU may evict important data in multi-tenant environments, or in situations where the cache is ‘cold’. By tagging and prioritizing data within the storage system, we can create a more intelligent mechanism that avoids many of the problems inherent to traditional caching. Methods for prioritizing data and passing this information through the filesystem will be discussed, as well as a performance analysis of small file IO in Lustre with cache hinting, and possible future enhancements.”
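The eviction idea described in the abstract can be sketched as a cache that prefers to evict low-priority entries before high-priority ones, falling back to LRU order only among entries of equal priority. This is an illustrative toy in Python, not the actual Lustre cache-hinting implementation; the class name and API are hypothetical.

```python
from collections import OrderedDict

class PriorityCache:
    """Toy cache illustrating priority-aware eviction.

    Victim selection: lowest priority first; ties are broken by
    least-recent use. A hypothetical sketch, not Lustre code.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        # key -> (priority, value); dict order doubles as LRU order
        self.entries = OrderedDict()

    def put(self, key, value, priority=0):
        if key in self.entries:
            self.entries.pop(key)
        elif len(self.entries) >= self.capacity:
            # Evict the lowest-priority entry; among equal priorities,
            # min() returns the least recently used (earliest in order).
            victim = min(self.entries, key=lambda k: self.entries[k][0])
            self.entries.pop(victim)
        self.entries[key] = (priority, value)

    def get(self, key):
        if key not in self.entries:
            return None  # cache miss
        self.entries.move_to_end(key)  # mark as recently used
        return self.entries[key][1]
```

With plain LRU, a burst of low-priority tenant traffic would push out the tagged high-priority data; here the high-priority entry survives even when it is the least recently used.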
“In this talk, Seagate presents details on its efforts and achievements around improving Hadoop performance on Lustre, including a summary of why and how HDFS and Lustre differ and how those differences affect Hadoop performance on Lustre compared to HDFS; Hadoop ecosystem benchmarks and best practices on HDFS and Lustre; Seagate’s open-source efforts to enhance performance of Lustre within ‘diskless’ compute nodes involving core Hadoop source code modification (and the unexpected results); and general takeaways on running Hadoop on Lustre more effectively.”
In this video from LUG 2015 in Denver, James Simmons from ORNL presents: Lustre + Linux – Putting the House in Order. “In the last year, great strides have been made to sync up the Lustre Intel branch with what is upstream. We present the current state as well as what is left for the Intel branch to bring this to completion.”
“Monitoring a large Lustre site running multiple generations of Lustre filesystems can be a challenge. Some equipment offers vendor-specific monitoring interfaces, while other equipment, built on open source Lustre, has minimal monitoring capabilities. This talk will report on our operational experience using a homegrown Python module to collect data from each filesystem. We will discuss in detail how the data is visualized centrally in Splunk and cross-referenced with users’ workloads to analyze and troubleshoot our environment.”
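A collector like the one described typically starts by parsing the per-target `stats` files that Lustre exposes under `/proc` (or `/sys/kernel/debug/lustre` on newer versions) before shipping the records to Splunk. The sketch below shows only the parsing step, in Python as in the talk; the exact file paths and field layout vary across Lustre versions, so treat both as assumptions.

```python
def parse_lustre_stats(text):
    """Parse the body of a Lustre 'stats' procfile into a dict of
    operation name -> sample count.

    Typical lines look like (layout assumed from common Lustre versions):
        read_bytes  42 samples [bytes] 0 4096 567890
    Header lines such as 'snapshot_time 1426000000.123 secs.usecs'
    are skipped because their second field is not an integer.
    """
    metrics = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # skip headers and malformed lines
        metrics[fields[0]] = int(fields[1])
    return metrics
```

In a real deployment the module would glob something like `/proc/fs/lustre/obdfilter/*/stats` on each server, run this parser on every file, and emit the resulting dicts as timestamped JSON events for Splunk to index.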