When Oracle took over Sun Microsystems in February, the future of the open-source Lustre file system came into question for many users in the HPC community. Enter Whamcloud, a startup founded by Brent Gorda that plans to take Lustre to the next level of scalability.
insideHPC: So what is Whamcloud going to do for LLNL in this contract?
Mark Seager: What we are announcing is a research and development contract with Whamcloud. They are going to do a couple of things for us related to our use of Lustre for high performance computing here at the lab.
First, Livermore wrote a Lustre monitoring tool that our system administrators and operators use to monitor the health and status of the Lustre file systems that we have in both our classified and unclassified environments. It’s a tool used every day in operations, and Whamcloud is going to port it to Lustre 2.0 and include the tool in their testing and release infrastructures. So it will become part of the mainline Lustre code base as opposed to living where it does now, out in the contrib directory.
The second thing they are doing for us is working with our recently announced deployment of a multimillion-dollar data-intensive testbed that has 80 nodes of Supermicro 1U servers and flash memory I/O devices. This testbed has demonstrated data rates of over 500 gigabytes per second for 1 MB transfers, exporting raw devices with InfiniBand. We did that with local file systems; now what we want to do is see how much more performance we can get, not just out of Lustre as a whole, but also out of the OSS (object storage server) layer.
insideHPC: What is Whamcloud going to do with the testbed?
Mark Seager: They are going to use that testbed and tweak the performance.
insideHPC: Which hardware vendors did you work with to set up the testbed?
Mark Seager: Appro was the integrator, Fusion-io supplied the flash memory devices, and Supermicro supplied the motherboards and chassis.
insideHPC: So why would you be going outside for this Lustre support from Whamcloud? Is that a fair question?
Mark Seager: Sure, that’s a fair question. Why not do this in house? We have four or five people working in the global parallel file system area, doing development and support for six file systems in two production environments, one classified and one unclassified. They are way busy. We don’t have the manpower to do these activities. It’s something we wanted to do, but we just couldn’t squeeze it in with our very demanding production workload.
Then Whamcloud became available. They just announced the company recently and created an opportunity for us to partner with them. So we can now get some work done that we need and help fund those guys to get them going.
insideHPC: That’s terrific. I know you can’t speak for Brent Gorda, but do you think this kind of work is typical of the kind of work that Whamcloud plans on doing for customers?
Mark Seager: From what I understand from Brent, the Whamcloud business model is to do both development for higher-end kinds of stuff and also Lustre support activities as well.
insideHPC: What do you think about the future of Lustre? You must have a lot invested in this platform.
Mark Seager: We recently did a study to see how much we have invested. It’s about $85 million, including all the R&D funding, support and capital expenses. Then there’s the manpower to deploy and operate the world’s largest Lustre environment over 8 or 10 years of work. If you add it up, that is a major Lustre investment. We have a huge investment in Lustre and we’re very committed. We are also working with the Lustre-on-Linux HPC community to create the Open Scalable File Systems, or OpenSFS, organization, which will create a center of mass for Lustre that complements what Oracle is doing and effectively engages with the HPC community.
The OpenSFS focus on Lustre will complement what Oracle is trying to do. We are very positive about the future of Lustre. Lustre is in a period of transition right now. The Lustre transitions in the past have gone well, and we are hopeful that this transition will, with HPC community involvement, also be successful. Right now we’re just trying to work with the HPC community to show leadership and a viable path in order to take Lustre to the next level.