First Look: BeeGFS File System at CSCS


Hussein Harake, Systems Engineer, CSCS

In this video from the HPC Advisory Council Spain Conference, Hussein Harake provides an overview of CSCS and then introduces the audience to the BeeGFS parallel file system.

“BeeGFS (formerly FhGFS) is the leading parallel cluster file system, developed with a strong focus on performance and designed for very easy installation and management. If I/O-intensive workloads are your problem, BeeGFS is the solution. BeeGFS transparently spreads user data across multiple servers. By increasing the number of servers and disks in the system, you can simply scale performance and capacity of the file system to the level that you need, seamlessly from small clusters up to enterprise-class systems with thousands of nodes.”
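
The scaling model described here is striping: each file is split into chunks, and the chunks are distributed across the storage targets so that clients can read and write through many servers in parallel. The minimal Python sketch below illustrates the idea with round-robin chunk placement; it is not BeeGFS code, and the target count and chunk size are made-up values for the example (in BeeGFS both are configurable, for instance per directory via the beegfs-ctl tool).

# Simplified illustration of striping: a file's data is split into fixed-size
# chunks that are assigned round-robin to storage targets, so I/O can hit all
# targets in parallel. NUM_TARGETS and CHUNK_SIZE are illustrative values,
# not BeeGFS defaults.
NUM_TARGETS = 4
CHUNK_SIZE = 1 * 1024 * 1024  # 1 MiB

def chunk_layout(file_size: int):
    """Yield (chunk_index, target_id, offset, length) for a file of file_size bytes."""
    offset = 0
    index = 0
    while offset < file_size:
        length = min(CHUNK_SIZE, file_size - offset)
        yield index, index % NUM_TARGETS, offset, length
        offset += length
        index += 1

if __name__ == "__main__":
    # A 10 MiB file ends up spread over all four targets; adding more targets
    # spreads the same file over more servers, which is where the bandwidth
    # scaling described above comes from.
    for idx, target, off, length in chunk_layout(10 * 1024 * 1024):
        print(f"chunk {idx}: target {target}, offset {off}, {length} bytes")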

Founded in 1991, CSCS, the Swiss National Supercomputing Centre, develops and provides the key supercomputing capabilities required to solve important problems for science and society. The centre enables world-class research with a scientific user lab that is available to domestic and international researchers through a transparent, peer-reviewed allocation process. CSCS’s resources are open to academia and are also available to users from industry and the business sector. The centre is operated by ETH Zurich and is located in Lugano.

The mission of the HPC Advisory Council is to bridge the gap between high-performance computing use and its potential: to bring the beneficial capabilities of HPC to new users for better research, education, innovation and product manufacturing; to give users the expertise needed to operate HPC systems; to provide application designers with the tools needed to enable parallel computing; and to strengthen the qualification and integration of HPC system products.


Comments

  1. Harry Mangalam says

    Very nice presentation. We use multiple BeeGFS file systems on our cluster, optimized for different IO profiles, and we also use Robinhood via FS scans in the same way that you do. I agree with most of what you say, but I think you are mistaken when you say (at ~19m) that the BeeGFS metadata server contains the information that would allow Robinhood to extract all the info needed in ~30m for a PB-sized FS. I recently spoke to Sven Breuner (ThinkParq) specifically about this, and while BeeGFS could feed Robinhood on an updating basis, it has to be done by all the clients feeding a log (which would then feed Robinhood), since (as I understood it) the MD server does not hold the complete path info to the chunks. I know this is on their dev path, but I’m not sure how far it has gotten. Sven obviously has the final say on this.

    This is one place where Lustre has the advantage, because I think the Lustre MD changelogs can now be read directly by Robinhood.

    I enthusiastically echo your experience about BeeGFS’s setup, general speed, expandability, performance on Zillions of Tiny files/recursion, performance under heavy load, and especially reliability.

    • Hi Harry,
      I think what Hussein was referring to is the fact that all of the BeeGFS metadata is available from the metadata targets (typically SSDs), so a file system scan from a client does not need to retrieve any information from the “slow” object storage targets (typically rotating media). This is different from a number of other distributed file systems, which either do not have this clear separation of metadata or still need to touch the storage targets to retrieve certain parts of the metadata, such as file size.
      Best regards,
      Sven
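
As a rough illustration of the client-side, metadata-only scan discussed in the comments above (the kind of file system walk a policy engine such as Robinhood performs), here is a minimal Python sketch. It only issues stat calls, which on BeeGFS are answered from the metadata targets; the mount point is a hypothetical example path, and Robinhood itself uses its own scanner and database rather than a script like this.

# Minimal sketch of a client-side, metadata-only scan over a parallel file
# system mount. Only directory listings and stat calls are issued, so on a
# file system with fully separated metadata these requests never touch the
# object storage targets. /mnt/beegfs is a made-up example mount point.
import os

MOUNT_POINT = "/mnt/beegfs"  # hypothetical BeeGFS mount point

def scan(path):
    """Recursively yield (path, size, mtime) using only stat metadata."""
    with os.scandir(path) as entries:
        for entry in entries:
            try:
                st = entry.stat(follow_symlinks=False)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if entry.is_dir(follow_symlinks=False):
                yield from scan(entry.path)
            else:
                yield entry.path, st.st_size, st.st_mtime

if __name__ == "__main__":
    for p, size, mtime in scan(MOUNT_POINT):
        print(p, size, mtime)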