NCAR Taps Globus Plus and GPFS in a “Dropbox for Big Data”

The Globally Accessible Data Environment (GLADE) is the centralized file service located at the NCAR-Wyoming Supercomputing Center in Cheyenne. (Photo courtesy David Read, NCAR.)

NCAR has implemented an enhanced data sharing service that gives scientists greater access to data and improved capabilities for collaborative research. In addition to the data sharing service, NCAR has significantly upgraded its centralized file service, known as the Globally Accessible Data Environment (GLADE).

Managed by NCAR’s Computational and Information Systems Laboratory (CISL), both GLADE and the data sharing service are important upgrades for the HPC user community, allowing faster and better access to data and a more flexible virtual workspace.

The data sharing service leverages the capabilities of Globus Plus to increase customization options for storage as well as data sharing. Globus, a project of the Computation Institute (a partnership of The University of Chicago and Argonne National Laboratory), is a software service that has been described as a “Dropbox for big data.” It is broadly used in the scientific community. “Plus” refers to a new feature that allows researchers to share data with colleagues outside of their home institutions, making collaborative work much easier.

“Scientific collaborations are global endeavors, and researchers need to share data with colleagues around the world. As data sets have grown in size and number, the process of moving and managing access to them has become a significant challenge,” said Pam Gillman, manager of NCAR’s Data Analysis Services Group. “Globus Plus is a robust and user-friendly service that eases the workflow, and it allows users to be more productive by spending less time on the minutiae of data transfers.”

NCAR users have been accessing the Globus transfer service for many years. In addition to making data available to external colleagues, the upgrade now lets users of CISL’s HPC environment control which users or groups of users can access their data. With the sharing service, outside users need only a free Globus account, not a UCAR username/token, to access shared data.

The Globus Plus service has a 1.5-petabyte capacity, and most users can take advantage of the Globus web interface to transfer data. Advanced users or service developers can leverage the Globus Plus features via a command-line interface.

CISL recently added 5 petabytes of high performance storage to the GLADE environment, bringing the total to 16.4 petabytes. GLADE is based on the GPFS file system and provides over 90 GB/s of sustained bandwidth across HPC, analysis, and visualization resources. GLADE file spaces are intended as work areas for day-to-day tasks and are well suited for managing software projects, scripts, code, and data sets.

“We strive to meet the growing needs of our user community, which expand as the data sets grow and require greater and more efficient resources,” said Gillman. “These major upgrades are part of CISL’s ongoing commitment to giving users the tools and services they need to carry out cutting-edge computational research.”
