SC10 Session Preview: ZFS on Linux for Lustre BoF

According to a recently updated post on Lustre.org, Lustre will add ZFS support to improve the reliability and resilience of the underlying file system on the object storage server (OSS) and metadata server (MDS) components. ZFS support will offer a number of advantages, such as improved data integrity through transaction-based, copy-on-write operations and an end-to-end checksum on every block.

At this SC10 BoF session, Christopher Morrone and Brian Behlendorf from Lawrence Livermore National Laboratory will discuss progress on the ZFS port to Linux and how they plan to form a developer and user community around this port.

ABSTRACT:
The Lustre parallel filesystem is widely used on large systems in HPC. In order to continue scaling up Lustre filesystems at a price that is not prohibitive, we believe it is vital to replace the underlying ext3/4-based backend filesystem with ZFS. ZFS takes a fundamentally different approach to disks than previous Linux filesystems. It assumes that cheap commodity disks are unreliable, which is a given at the scales used in HPC. It takes measures to detect and recover from silent errors and corruption. Additionally, at HPC scales with Lustre, even large regular IO from applications looks like random IO to disks, making disk IOPS the limiting factor in filesystem performance. Thus another beneficial feature of ZFS is its ability to sequentialize random IO streams, reducing the number of IOPS that drives must handle. We will discuss the current state of the ZFS port to Linux, what work remains to be done, and other general issues of running ZFS on Linux with large drive counts. While our main goal of supporting Lustre only requires ZFS up through the DMU layer, we welcome those interested in porting the ZPL (POSIX layer) and those interested in including a FUSE-based version of ZFS in the same code base. We are eager to form a community of users and developers of ZFS on Linux.
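
To make the "detect and recover from silent errors" idea concrete, here is a minimal Python sketch of checksum-verified reads over two mirrored copies of a block. It is a toy model under stated assumptions, not ZFS code: the `MirroredBlock` class, its fields, and the choice of SHA-256 are all hypothetical and chosen only for illustration. The point it illustrates is that the checksum is kept apart from the data it covers, so a copy that was silently corrupted on disk fails verification and can be repaired from a copy that passes.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Checksum used to verify a block (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(data).digest()


class MirroredBlock:
    """Hypothetical toy model: two copies of a block plus a separately stored checksum."""

    def __init__(self, data: bytes):
        # Two mirrored copies, standing in for two disks.
        self.copies = [bytearray(data), bytearray(data)]
        # The expected checksum lives apart from the data it covers.
        self.expected = checksum(data)

    def read(self) -> bytes:
        """Return the first copy that verifies, repairing any copy that does not."""
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                for i, other in enumerate(self.copies):
                    if checksum(bytes(other)) != self.expected:
                        self.copies[i] = bytearray(good)  # overwrite the bad copy
                return good
        raise IOError("all copies failed checksum verification")


# Usage: flip bits in one copy to simulate silent corruption, then read.
block = MirroredBlock(b"lustre object data")
block.copies[0][0] ^= 0xFF
recovered = block.read()
```

After the read, `recovered` holds the original bytes and the damaged copy has been rewritten from the intact one. If every copy fails verification, the sketch raises an error rather than returning bad data.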