HPE Scalable Storage for Lustre: The Middle Way

This sponsored post explores HPE scalable storage and the Lustre parallel file system, and outlines ‘a middle ground’ available via solutions that offer the combination of Community Lustre within a qualified hardware solution. 

Lustre is a widely used parallel file system in the High Performance Computing (HPC) market. Its parallel design delivers the performance that HPC workloads require, its open source nature gives users the flexibility to customize it exactly as needed, and it is highly scalable. Lustre is widely understood within the HPC user community and has a robust and innovative development community.

A middle ground is available via solutions that offer the combination of Community Lustre within a qualified hardware solution. (Photo: Shutterstock/ktsdesign)

However, consideration needs to be given as to how Lustre should be implemented for any given scenario. One way to implement Lustre is a fully custom approach: purchase a smorgasbord of servers, disk arrays, networking cards, etc., and put it all together. Given sufficient knowledge and expertise, a system can be built from generic parts, with all the necessary operating systems and software installed alongside Lustre, to create a home-grown system. As with any fully custom system, the advantage is that every single component is selected by the user and the overall system is tailor-made to the specific use case.

The disadvantage of a fully custom approach is the burden of designing, building and qualifying the system, as well as maintaining, troubleshooting and upgrading it. The user will have to debug issues arising from any incompatibilities or bugs discovered while building and tuning the system. Multiple vendors may need to be contacted to resolve issues, bugs, and part incompatibilities, with the user coordinating all the communications. After the system is in production, the user will have to pull software upgrades from multiple sources and resolve any problems that arise in day-to-day operation. Scaling the system may also be a burden, since the user will have to design and build or install the upgrades themselves, with the same requirement to troubleshoot and tune the system after the upgrade. A fully custom system may be desirable, but it clearly lays the onus of design, troubleshooting, maintenance, and upgrades on the user.

Another approach is to buy a commercial “appliance” Lustre solution. Typically, these solutions include a vendor-specific version of Lustre, customized vendor hardware, additional software (possibly including functionality such as system management and encryption), and full qualification of all the equipment within the solution. These solutions can be quickly installed and set up as they’re typically prepared at the factory, and offer high performance and scalability.

However, buying a commercial Lustre solution does lock the user in to the vendor’s specific version of Lustre, as well as all related hardware and software. The benefit of this approach is that support is handled from one source, minimizing maintenance and support time and costs. Upgrades, whether hardware or software, and replacement equipment are all pre-qualified for compatibility and performance. This type of solution eliminates the maintenance and support headaches of a custom solution, but with the restriction of only using equipment from that vendor’s ecosystem.

Cost is also a consideration in these approaches. A custom solution offers the potential for lower cost, since inexpensive off-the-shelf hardware can be acquired and a free Lustre distribution obtained. The commercial solutions will probably be more expensive than a cobbled-together white box open source solution, since the costs of design, QA and support are included. Essentially, with a white box open source solution, CapEx is minimized while OpEx is maximized. Commercial solutions, on the other hand, will probably have higher CapEx but offer the benefit of lower overall OpEx over time.
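To make the CapEx versus OpEx trade-off concrete, the short Python sketch below compares the cumulative cost of two options over time and estimates when the option with higher up-front cost catches up. It is only an illustration: the function names and every figure in it are hypothetical placeholders chosen for the example, not vendor pricing or measured costs.

```python
# Illustrative cost comparison. All figures are hypothetical placeholders
# in arbitrary cost units, not real pricing for any Lustre solution.

def total_cost(capex, annual_opex, years):
    """Up-front cost plus operating cost accumulated over `years`."""
    return capex + annual_opex * years


def break_even_years(capex_a, opex_a, capex_b, opex_b):
    """Years until option B (assumed higher CapEx, lower annual OpEx)
    costs the same as option A. Returns None if B never catches up."""
    if opex_a <= opex_b:
        return None  # B saves nothing per year, so it never catches up
    return max(0.0, (capex_b - capex_a) / (opex_a - opex_b))


# Hypothetical inputs: a white box build and a commercial appliance.
white_box = {"capex": 100.0, "annual_opex": 40.0}
appliance = {"capex": 180.0, "annual_opex": 20.0}

for years in (1, 3, 5):
    wb = total_cost(white_box["capex"], white_box["annual_opex"], years)
    ap = total_cost(appliance["capex"], appliance["annual_opex"], years)
    print(f"year {years}: white box = {wb:.0f}, appliance = {ap:.0f}")

print("break-even (years):", break_even_years(
    white_box["capex"], white_box["annual_opex"],
    appliance["capex"], appliance["annual_opex"],
))
```

With these placeholder inputs the white box option is cheaper in the early years and the appliance becomes cheaper once the system has been in service long enough. Real values would need to come from actual quotes and staffing estimates.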

A middle ground is available via solutions that offer the combination of Community Lustre within a qualified hardware solution. The use of Community Lustre provides an open source Lustre distribution with a known roadmap and community support. Training would be available from many sources as well. Another benefit is that Community Lustre allows use of other community solutions, such as the Integrated Manager for Lustre system management tool, thus minimizing software costs.

Factory-qualified hardware eliminates the time required to support a custom solution, and offers components that have been tested for compatibility, reliability and performance with Community Lustre. Configurations may be ordered and pre-built at the factory to minimize start-up time, and upgrades can be effectively plug-and-play since they have been pre-qualified. Support would be available as well with such a solution, so internal resources won’t be tied up troubleshooting systems and users can get assistance quickly from the solution provider. This type of solution offers the quick time to production of commercial systems, along with the ability to use Community Lustre.

Another benefit of a factory hardware solution is that benchmark data may be available to guide configuration decisions, reducing the guesswork around performance requirements. Guidance on configurations and component choices to meet performance and capacity requirements would also be readily available, both in documentation and from the sales team, in contrast to the uncertainty of how a generic white box solution might perform.

As described, there are multiple ways to create a Lustre storage solution. The middle way offers an appealing alternative to pure white box solutions and commercial solutions, offering the best of both worlds: lower CapEx and OpEx, along with the flexibility to suit specific use cases.

Learn more about HPE scalable storage and a factory-supported Community Lustre solution.

Comments

  1. Peter Braam created Lustre and made it available to the world for free. He really deserves mention and credit for this achievement in any article about Lustre.
