Following up on the announcement of ship dates for Fermi products, NVIDIA also announced this morning that it has been working with Mellanox to increase application performance by reducing memory copy latency when processes communicate over InfiniBand (IB). From the company:
The system architecture of a GPU-CPU server requires the CPU to initiate and manage memory transfers between the GPU and the InfiniBand network. The new software solution enables Tesla GPUs to transfer data to pinned system memory that a Mellanox InfiniBand solution is able to read and transmit over the network. The result is increased overall system performance and efficiency.
Basically, the CPU gets removed from the critical path when GPUs communicate over an InfiniBand network, improving performance on IB-connected GPU clusters.
“In GPU-based clusters, most of the compute intensive processing is running on the GPUs,” said Gilad Shainer, director of high performance computing and technical marketing at Mellanox Technologies. “It’s a natural evolution of the system architecture to enable GPUs to communicate more intelligently over InfiniBand. This helps create a computing platform that will enable future Exascale computing and dramatically increase performance for a broad spectrum of applications.”
…This software capability will be available in the NVIDIA CUDA™ architecture toolkit beginning in Q2 2010 and will work on existing Tesla S1070 1U computing solution systems and Tesla M1060 module-based clusters and also with the new Tesla 20-series S2050 and S2070 1U systems.