In this special guest feature, Ken Strandberg offers this live report from Day 2 of the Lustre User Group meeting in Portland. “Scott Yockel from Harvard University shared how they are deploying Lustre across their three data centers, up to 90 miles apart, with 25 PB of storage, about half of which is Lustre. They’re using Docker containers and employ a backup strategy that spans the distance between sites, covering every NFS system and a parse of the entire MDT, which includes 10k directories of small files.”
“The first release of OpenFabrics Interfaces (OFI) software, libfabric, occurred in January of 2015. Since then, the number of fabrics and applications supported by OFI has increased, with considerable industry momentum building behind it. This talk discusses the current state of OFI, then speaks to the application requirements driving it forward. It will detail the fabrics supported by libfabric, identify applications which have ported to it, and outline future enhancements to the libfabric interfaces and architecture.”
Today Silicon Mechanics announced that the University of New Orleans is a recipient of the company’s fifth annual Research Cluster Grant (RCG). Each grant awardee will receive a High-Performance Computing cluster with the latest high-performance processing and GPU technologies, valued at over $100,000, for use in demonstrated research purposes. This is the second year that Silicon Mechanics has made the award to two institutions.
“With Supermicro’s 90 top-load 3.5” hot-swap bay JBOD as the storage core of our Lustre Pod Cluster, we maximize performance, density, and capacity, and simplify serviceability for massive-scale HA storage deployments. Combining our preconfigured, validated 2U SuperStorage OSS, 1U Ultra SuperServer with Intel Enterprise Edition for Lustre software, and global service and support, Supermicro has the Total Solution for Lustre ready for HPC, Genomics, and Big Data.”
Today Egypt’s Bibliotheca Alexandrina library announced plans to build an HPC platform using Huawei technologies. Based on high-density FusionServer servers, the 118 Teraflop Huawei cluster employs high-speed InfiniBand and 288 TB of storage capacity for parallel file systems.
Today Mellanox announced a new line of InfiniBand router systems. The new EDR 100Gb/s InfiniBand Routers enable a new level of scalability critical for the next generation of mega data-center deployments, as well as expanded capabilities for data center isolation between different users and applications. The network router delivers a consistent, high-performance, low-latency routing solution that is mission critical for high performance computing, cloud, Web 2.0, machine learning, and enterprise applications.
NNSA’s next-generation Penguin Computing clusters based on Intel SSF are bolstering “capacity” computing capability at the Tri Labs. “With CTS1 installed in April, the NNSA scientists can continue their stewardship research and management on some of the most advanced commodity clusters the Tri Labs have acquired, ensuring the safety, security, and reliability of the nation’s nuclear stockpile.”
Gary Grider from Los Alamos National Laboratory will keynote the 2016 OpenFabrics Workshop this year with a talk on HPC Storage and IO Trends and Workflows. The event takes place April 4-8, 2016 in Monterey, California.
“High performance computing has begun scaling beyond Petaflop performance towards the Exaflop mark. One of the major concerns throughout the development toward such performance capability is scalability – at the component, system, middleware, and application levels. A Co-Design approach between the development of the software libraries and the underlying hardware can help overcome those scalability issues and enable a more efficient design approach towards the Exascale goal.”