First Look: BeeGFS File System at CSCS


Hussein Harake, Systems Engineer, CSCS

In this video from the HPC Advisory Council Spain Conference, Hussein Harake provides an overview of CSCS and then introduces the audience to the BeeGFS parallel file system.

“BeeGFS (formerly FhGFS) is the leading parallel cluster file system, developed with a strong focus on performance and designed for very easy installation and management. If I/O-intensive workloads are your problem, BeeGFS is the solution. BeeGFS transparently spreads user data across multiple servers. By increasing the number of servers and disks in the system, you can simply scale performance and capacity of the file system to the level that you need, seamlessly from small clusters up to enterprise-class systems with thousands of nodes.”
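
The scaling model described here is striping: each file is split into chunks, and the chunks are distributed across the storage targets so that clients can read and write through many servers in parallel. The minimal Python sketch below illustrates the idea with round-robin chunk placement; it is not BeeGFS code, and the target count and chunk size are made-up values for the example (in BeeGFS both are configurable, for instance per directory via the beegfs-ctl tool).

# Simplified illustration of striping: a file's data is split into fixed-size
# chunks that are assigned round-robin to storage targets, so I/O can hit all
# targets in parallel. NUM_TARGETS and CHUNK_SIZE are illustrative values,
# not BeeGFS defaults.
NUM_TARGETS = 4
CHUNK_SIZE = 1 * 1024 * 1024  # 1 MiB

def chunk_layout(file_size: int):
    """Yield (chunk_index, target_id, offset, length) for a file of file_size bytes."""
    offset = 0
    index = 0
    while offset < file_size:
        length = min(CHUNK_SIZE, file_size - offset)
        yield index, index % NUM_TARGETS, offset, length
        offset += length
        index += 1

if __name__ == "__main__":
    # A 10 MiB file ends up spread over all four targets; adding more targets
    # spreads the same file over more servers, which is where the bandwidth
    # scaling described above comes from.
    for idx, target, off, length in chunk_layout(10 * 1024 * 1024):
        print(f"chunk {idx}: target {target}, offset {off}, {length} bytes")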

Founded in 1991, CSCS, the Swiss National Supercomputing Centre, develops and provides the key supercomputing capabilities required to solve important problems for science and society. The centre enables world-class research with a scientific user lab that is available to domestic and international researchers through a transparent, peer-reviewed allocation process. CSCS’s resources are open to academia and are also available to users from industry and the business sector. The centre is operated by ETH Zurich and is located in Lugano.

The mission of the HPC Advisory Council is to bridge the gap between high-performance computing use and its potential: to bring the beneficial capabilities of HPC to new users for better research, education, innovation and product manufacturing; to give users the expertise needed to operate HPC systems; to provide application designers with the tools needed to enable parallel computing; and to strengthen the qualification and integration of HPC system products.


Comments

  1. Harry Mangalam says

    Very nice presentation. We use multiple BeeGFS file systems on our cluster, optimized for different IO profiles, and we also use Robinhood via FS scans in the same way that you do. I agree with most of what you say, but I think you are mistaken when you say (at ~19m) that the BeeGFS metadata server contains the information that would allow Robinhood to extract all the info needed in ~30m for a PB-sized FS. I recently spoke to Sven Breuner (ThinkParq) specifically about this, and while BeeGFS could feed Robinhood on an updating basis, it has to be done by all the clients feeding a log (which would then feed Robinhood), since (as I understood it) the MD server does not hold the complete path info to the chunks. I know this is on their dev path, but I’m not sure how far it has gotten. Sven obviously has the final say on this.

    This is one place where Lustre has the advantage, because I think the Lustre MD changelogs can now be read directly by Robinhood.

    I enthusiastically echo your experience about BeeGFS’s setup, general speed, expandability, performance on Zillions of Tiny files/recursion, performance under heavy load, and especially reliability.

    • Hi Harry,
      I think what Hussein was referring to is the fact that all of the BeeGFS metadata is available from the metadata targets (typically SSDs), so a file system scan from a client does not need to retrieve any information from the “slow” object storage targets (typically rotating media). This is different from a number of other distributed file systems, which either do not have this clear separation of metadata or still need to touch the storage targets to retrieve certain parts of the metadata, such as file size.
      Best regards,
      Sven
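
As a rough illustration of the client-side, metadata-only scan discussed in the comments above (the kind of file system walk a policy engine such as Robinhood performs), here is a minimal Python sketch. It only issues stat calls, which on BeeGFS are answered from the metadata targets; the mount point is a hypothetical example path, and Robinhood itself uses its own scanner and database rather than a script like this.

# Minimal sketch of a client-side, metadata-only scan over a parallel file
# system mount. Only directory listings and stat calls are issued, so on a
# file system with fully separated metadata these requests never touch the
# object storage targets. /mnt/beegfs is a made-up example mount point.
import os

MOUNT_POINT = "/mnt/beegfs"  # hypothetical BeeGFS mount point

def scan(path):
    """Recursively yield (path, size, mtime) using only stat metadata."""
    with os.scandir(path) as entries:
        for entry in entries:
            try:
                st = entry.stat(follow_symlinks=False)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if entry.is_dir(follow_symlinks=False):
                yield from scan(entry.path)
            else:
                yield entry.path, st.st_size, st.st_mtime

if __name__ == "__main__":
    for p, size, mtime in scan(MOUNT_POINT):
        print(p, size, mtime)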