What are alternatives to MPI for communication in parallel computing?

MPI may be the de facto standard for communication, but it is not the only library. While MPI attempts to be broad in scope, its message semantics are actually quite limited. Some developers have had to create their own communication libraries for special cases.

For example, the ARMCI library can perform one-sided communication of non-contiguous data, similar to the POSIX functions readv() and writev(). Seamless transmission of data regardless of type, especially in heterogeneous environments, can be beneficial for the user. However, although some applications gain performance this way, the marshaling required to serialize data for transmission usually adds overhead that can degrade performance, so the user must be cautious. ARMCI is used as the basis for Global Arrays.
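To make the readv()/writev() analogy concrete, here is a minimal sketch that uses only those POSIX calls, not ARMCI itself. It gathers two separate buffers into one transfer through a pipe and scatters them into two separate destination buffers on the other side. The function name scatter_gather_roundtrip and the sample strings are my own, chosen for illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Illustrative helper (not part of ARMCI): sends two non-contiguous
 * buffers with one writev() and receives them into two non-contiguous
 * buffers with one readv(). Returns 0 on success, -1 on failure. */
int scatter_gather_roundtrip(char out_hdr[5], char out_body[8])
{
    char hdr[] = "head";     /* 4 bytes, first piece */
    char body[] = "payload"; /* 7 bytes, second piece */
    int fds[2];

    assert(out_hdr != NULL && out_body != NULL);
    if (pipe(fds) != 0)
        return -1;

    /* Gather: one call transmits both source buffers back to back. */
    struct iovec src[2] = {
        { .iov_base = hdr,  .iov_len = 4 },
        { .iov_base = body, .iov_len = 7 },
    };
    ssize_t written = writev(fds[1], src, 2);

    /* Scatter: one call fills both destination buffers in order. */
    struct iovec dst[2] = {
        { .iov_base = out_hdr,  .iov_len = 4 },
        { .iov_base = out_body, .iov_len = 7 },
    };
    ssize_t got = (written == 11) ? readv(fds[0], dst, 2) : -1;

    close(fds[0]);
    close(fds[1]);
    if (got != 11)
        return -1;

    out_hdr[4] = '\0';
    out_body[7] = '\0';
    return 0;
}
```

No intermediate packing buffer is needed on either side. That is the convenience a non-contiguous transfer interface offers, although the cost of describing and walking the pieces does not disappear.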

Another remote memory access library worth mentioning is Sandia Portals, also known simply as Portals. In this system, each process maintains a queue of events. An event is generated whenever a message enters a new state of progress, such as submission, completion or failure. The events are entered into the queue, which the user may check to determine message progress. The goal is to allow independent progress and thus better overlap of communication and computation. Portals also served as the basis for the networking layer of the Lustre file system.
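The event-queue idea can be sketched in a few lines of C. This is not the Portals API. Every name here (event_queue, eq_post, eq_poll, the msg_state values) is hypothetical, and the queue is a plain single-threaded ring buffer, whereas a real implementation would be filled by the network layer concurrently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message states, mirroring the ones named in the text. */
enum msg_state { MSG_SUBMITTED, MSG_COMPLETED, MSG_FAILED };

struct event {
    int msg_id;           /* which message changed state */
    enum msg_state state; /* the state it entered */
};

#define EQ_CAPACITY 8

struct event_queue {
    struct event slots[EQ_CAPACITY];
    size_t head;  /* index of the oldest unread event */
    size_t count; /* number of unread events */
};

void eq_init(struct event_queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Called by the (simulated) transport when a message changes state.
 * Returns 0 on success, -1 if the queue is full. */
int eq_post(struct event_queue *q, int msg_id, enum msg_state state)
{
    if (q->count == EQ_CAPACITY)
        return -1;
    size_t tail = (q->head + q->count) % EQ_CAPACITY;
    q->slots[tail].msg_id = msg_id;
    q->slots[tail].state = state;
    q->count++;
    return 0;
}

/* Non-blocking check by the user: returns 1 and fills *out if an
 * event was waiting, 0 if the queue was empty. */
int eq_poll(struct event_queue *q, struct event *out)
{
    if (q->count == 0)
        return 0;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % EQ_CAPACITY;
    q->count--;
    return 1;
}
```

Because eq_poll never blocks, the application can do useful computation between checks, which is the overlap of communication and computation described above.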

Users of message-passing models must develop special-case functionality when mere transmission of data is not enough. An alternative approach is to use active messages, which alert the remote node once data arrives. The arrival of an active message triggers an interrupt or alerts a thread. The message contains the address of a handler and the parameters of that handler. The handler is a small function that is invoked immediately and runs to completion (to prevent deadlock, it is not allowed to block or busy-wait). The handler incorporates the incoming data and may optionally respond to the sender with a result.

In a way, active messages behave somewhat like remote versions of POSIX signals, and require that the application handle messages explicitly. This paradigm is useful when the remote node must act on an unexpected message. Indeed, it is the architecture underlying ARMCI.
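The handler mechanism can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the types and functions (active_message, am_deliver, add_to_counter) are hypothetical and belong to no real library, and "delivery" is a direct function call rather than a network interrupt.

```c
#include <assert.h>
#include <stddef.h>

/* Optional result sent back to the sender. */
typedef struct am_reply {
    int has_result;
    long result;
} am_reply;

/* A handler receives the receiving node's state and the message
 * parameters. It must run to completion without blocking. */
typedef void (*am_handler)(void *node_state, const long *args,
                           size_t nargs, am_reply *reply);

/* The message carries the handler's address and its parameters. */
struct active_message {
    am_handler handler;
    long args[4];
    size_t nargs;
};

/* Example handler: folds args[0] into a counter on the receiving
 * node and replies with the new total. */
void add_to_counter(void *node_state, const long *args,
                    size_t nargs, am_reply *reply)
{
    long *counter = node_state;
    if (nargs >= 1)
        *counter += args[0];
    reply->has_result = 1;
    reply->result = *counter;
}

/* Simulates arrival at the remote node: the handler is invoked
 * immediately and its reply is returned to the sender. */
am_reply am_deliver(void *node_state, const struct active_message *msg)
{
    am_reply reply = { 0, 0 };
    assert(msg != NULL && msg->handler != NULL);
    msg->handler(node_state, msg->args, msg->nargs, &reply);
    return reply;
}
```

Sending a raw function address only works when both sides run the same program image. Real systems often send an index into a registered handler table instead.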

The message-driven model can be seen in GASNet. GASNet exposes an interface for active messages and uses that same interface internally to implement one-sided messages on network platforms that do not support remote direct memory access. GASNet is used in a number of partitioned global address space languages, and thus has goals similar to ARMCI’s.
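A rough sketch shows how a one-sided put might be layered on an active message when the network offers no remote direct memory access. This is not GASNet's API. The names (node, put_request, put_handler, emulated_put) and the fixed sizes are assumptions made for illustration, and the network hop is replaced by a direct call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEGMENT_SIZE 64 /* hypothetical size of the remotely writable region */
#define MAX_PAYLOAD 32  /* hypothetical per-message payload limit */

/* The target node exposes a memory segment that others may write to. */
struct node {
    unsigned char segment[SEGMENT_SIZE];
};

/* The active message body: where to write, how much, and the data. */
struct put_request {
    size_t offset;
    size_t len;
    unsigned char payload[MAX_PAYLOAD];
};

/* Runs on the target when the request arrives. It copies the payload
 * into the segment and never blocks. Returns 0, or -1 if out of range. */
int put_handler(struct node *target, const struct put_request *req)
{
    if (req->len > MAX_PAYLOAD || req->offset > SEGMENT_SIZE ||
        req->len > SEGMENT_SIZE - req->offset)
        return -1;
    memcpy(target->segment + req->offset, req->payload, req->len);
    return 0;
}

/* Initiator side: looks like a one-sided put to the caller. */
int emulated_put(struct node *target, size_t offset,
                 const void *src, size_t len)
{
    struct put_request req;
    assert(target != NULL && src != NULL);
    if (len > MAX_PAYLOAD)
        return -1;
    req.offset = offset;
    req.len = len;
    memcpy(req.payload, src, len);
    /* A real runtime would transmit req here; this sketch delivers it
     * directly to the handler. */
    return put_handler(target, &req);
}
```

The target application never posts a matching receive. The handler does the copy on its behalf, which is what makes the operation one-sided from the programmer's point of view.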

In sum, MPI’s restrictive message semantics make it a poor choice for developing diverse applications. A few smaller initiatives aim to provide the features it lacks.