Exascale Computing Project: Researchers Accelerate I/O with Novel Processing Method

MPI-IO’s two-phase I/O strategy, whose collective functions call on all processes to answer an I/O request, has delivered high performance in parallel computing but is expected to become a performance bottleneck, with higher costs than independent I/O, as computing reaches exascale. The researchers’ new method, TAM, offers a new tool for the high-performance computing (HPC) toolbox.

TAM adds an intranode request aggregation layer: local aggregators coalesce requests from processes on the same node into fewer, contiguous requests, then communicate those requests across nodes to global aggregators, which complete the I/O on behalf of the group. The researchers implemented TAM in ROMIO and benchmarked its performance against traditional two-phase I/O using E3SM-IO, S3D-IO, and BTIO.

On NERSC Cori KNL nodes, for example, the MPICH ROMIO collective write bandwidth for the I/O kernel of the E3SM F case drops from ~600–700 MB/s on a small number of compute nodes to below 100 MB/s when the problem scales to 16K processes on 256 compute nodes, owing to contention in the communication phase of two-phase I/O. At the same scale, TAM sustains a collective write bandwidth of 700 MB/s.
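The article itself includes no code, but the two-layer idea can be sketched in plain MPI. The sketch below is illustrative rather than ROMIO’s actual TAM implementation: it assumes uniform node sizes, uses MPI_Comm_split_type to form node-local groups, gathers each node’s data at one local aggregator, and has only the aggregators issue the collective file write. The file name tam_sketch.out and buffer sizes are hypothetical.

```c
/* Minimal sketch of two-layer aggregation: ranks on each node funnel
 * their buffers to one local aggregator, and only the aggregators
 * perform the file write. Illustrative only; the real TAM lives inside
 * the ROMIO library beneath MPI-IO's collective calls. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Layer 1: group ranks by shared-memory node. */
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);
    int node_rank, node_size;
    MPI_Comm_rank(node_comm, &node_rank);
    MPI_Comm_size(node_comm, &node_size);

    /* Each process contributes a small local buffer (size is made up). */
    const int count = 1024;
    int *local = malloc(count * sizeof(int));
    for (int i = 0; i < count; i++)
        local[i] = world_rank;

    /* The local aggregator (node_rank == 0) coalesces the node's
     * requests into one contiguous buffer. */
    int *coalesced = NULL;
    if (node_rank == 0)
        coalesced = malloc((size_t)node_size * count * sizeof(int));
    MPI_Gather(local, count, MPI_INT,
               coalesced, count, MPI_INT, 0, node_comm);

    /* Layer 2: only aggregators join an inter-node communicator and
     * issue the I/O, one large contiguous write per node. */
    MPI_Comm agg_comm;
    MPI_Comm_split(MPI_COMM_WORLD, node_rank == 0 ? 0 : MPI_UNDEFINED,
                   0, &agg_comm);
    if (node_rank == 0) {
        int agg_rank;
        MPI_Comm_rank(agg_comm, &agg_rank);
        MPI_File fh;
        MPI_File_open(agg_comm, "tam_sketch.out",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        /* Offset math assumes every node hosts the same number of
         * ranks; a real implementation would exchange sizes first. */
        MPI_Offset off = (MPI_Offset)agg_rank * node_size * count
                         * (MPI_Offset)sizeof(int);
        MPI_File_write_at_all(fh, off, coalesced, node_size * count,
                              MPI_INT, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
        MPI_Comm_free(&agg_comm);
        free(coalesced);
    }

    free(local);
    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}
```

Coalescing within the node first means the expensive all-to-many exchange of two-phase I/O happens among far fewer endpoints, which is where the communication contention described above arises.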

Their experiments showed that the new method works well for applications that run many processes per node and exhibit highly noncontiguous access patterns, enabling richer output and faster time-to-science.
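To make “noncontiguous” concrete, the hedged example below shows the kind of strided access pattern such applications generate, expressed with an MPI derived datatype and a collective write. All sizes and the file name strided.out are invented for illustration; the actual E3SM-IO, S3D-IO, and BTIO patterns are far more complex.

```c
/* Each rank owns interleaved, strided blocks of the shared file, so
 * its accesses are highly noncontiguous. A collective write lets the
 * MPI-IO layer (two-phase or TAM) rearrange the interleaved requests
 * into large contiguous file accesses. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    enum { BLOCK = 64, NBLOCKS = 128 };   /* illustrative sizes */
    double buf[BLOCK * NBLOCKS];
    for (int i = 0; i < BLOCK * NBLOCKS; i++)
        buf[i] = (double)rank;

    /* Strided file type: this rank's blocks are separated by every
     * other rank's blocks in the file. */
    MPI_Datatype ftype;
    MPI_Type_vector(NBLOCKS, BLOCK, BLOCK * nprocs, MPI_DOUBLE, &ftype);
    MPI_Type_commit(&ftype);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "strided.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_Offset disp = (MPI_Offset)rank * BLOCK * sizeof(double);
    MPI_File_set_view(fh, disp, MPI_DOUBLE, ftype, "native",
                      MPI_INFO_NULL);

    /* Collective write over the noncontiguous view. */
    MPI_File_write_all(fh, buf, BLOCK * NBLOCKS, MPI_DOUBLE,
                       MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&ftype);
    MPI_Finalize();
    return 0;
}
```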

Co-author Rob Ross recently received the Ernest Orlando Lawrence Award, one of DOE’s highest honors, for “significant research contributions in the areas of scientific data storage and management, and communication software and architectures; and leadership in major DOE initiatives such as the SciDAC program.”

source: ECP