Jeff Squyres on MPI Process and Memory Affinity

February 1, 2013 by Doug Black

Over at the MPI Blog, Cisco’s Jeff Squyres writes that if you’re not using processor and memory affinity in your MPI programs, you’re likely experiencing performance degradation without even realizing it.

You can’t completely eliminate the amount of traffic that is flowing across the NUMA-node-connecting-network, but you do want to minimize it. Networking 101 tells us that, in many cases, reducing congestion and contention on network links leads to overall better performance of the fabric. The same principle is true on networks inside a server as it is for networks outside of a server. Using processor and memory affinity helps minimize all of the effects described above. Processes start and stay in a single location, and all the data they use in RAM tends to stay on the same NUMA node (thereby making it local). Caches aren’t thrashed. Well-behaved MPI implementations use local NICs (when available). Less inter-NUMA-node traffic = more efficient computation.

Read the Full Story.

Jeff Squyres on MPI Process and Memory Affinity

Sponsored Guest Articles

Lenovo and NVIDIA at GTC 2024: An Alliance Enabling AI at Scale

White Papers

Energy efficiency drives HPC to the cloud

Featured RSS Feed

More News from insideBIGDATA

Jeff Squyres on MPI Process and Memory Affinity

Sponsored Guest Articles

Lenovo and NVIDIA at GTC 2024: An Alliance Enabling AI at Scale

White Papers

Energy efficiency drives HPC to the cloud

Join Us On Social Media

Related Posts

Featured RSS Feed

More News from insideBIGDATA