Leveraging Parallelism on Intel Xeon Phi Coprocessors and Multicore Processors


Intel’s James Reinders and Jim Jeffers have just published a new book, High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches. Reinders is Intel’s parallel programming evangelist and Jeffers is Engineering Manager & PE, Visualization Engineering at the company.

With 69 contributors from academia and industry, the book shows how to leverage parallelism on processors and coprocessors with the same programming model, providing detailed illustrations of effective ways to combine Intel Xeon Phi coprocessors with multicore processors.

The book’s many contributors share how they successfully optimized codes to run on highly parallel architectures. That means, the editors explain, tuning for scaling, using many hardware threads, and exploiting vector data-parallel hardware in a fashion abstract enough that the code runs extremely well on both Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors.

But it’s also about running efficiently on any parallel processor with multiple CPU cores, including Intel Xeon processors and the many-core Intel Xeon Phi product family. The book contains many specific examples of what people have done to make their code run faster.
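As an illustration only (this sketch is not taken from the book), the kind of code the editors describe might look like the following C fragment. It assumes OpenMP 4.0 or later; the function names saxpy and dot are hypothetical. Each loop carries a single directive that asks the compiler to spread iterations across hardware threads and to vectorize each thread's share, so the same source can be built for a multicore processor or a many-core device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);

    /* One directive covers both levels of parallelism: iterations are
     * divided among threads, and each thread's chunk is vectorized. */
    #pragma omp parallel for simd
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical reduction: dot product accumulated in double precision. */
double dot(size_t n, const float *x, const float *y)
{
    assert(x != NULL && y != NULL);

    double sum = 0.0;

    /* The reduction clause gives each thread a private partial sum
     * and combines the partial sums when the loop finishes. */
    #pragma omp parallel for simd reduction(+:sum)
    for (size_t i = 0; i < n; ++i) {
        sum += (double)x[i] * (double)y[i];
    }
    return sum;
}
```

Nothing in the source names a particular core count or vector width; those choices are left to the compiler and the OpenMP runtime. OpenMP support has to be enabled at build time (for example, -fopenmp with GCC). Without it the directives are ignored and the loops simply run serially.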

“We worked with the contributors to have them share examples of the actual code in the chapters, and point out and discuss the actual changes – not just talk about them in high-level terms,” Jeffers comments. “This includes graphs that clearly show performance changes. Some code changes have more effect than others and some are interrelated.”

All the code discussed in the book will be available for download at lotsofcores.com.

A New Buzz Word

The editors include a new term, “neo-heterogeneous programming,” in the book. Explains Reinders, “It’s actually a term that some of our customers came up with. One of the buzz phrases in the industry is ‘heterogeneous computing’ – the idea that there are advantages in building a system by combining a variety of different processing devices.

“This is part of what motivated our many-core Intel Xeon Phi products,” he continues. “It’s about building a device that’s highly parallel. But heterogeneous also carried the stigma that it must be harder to program by using processing devices that are programmed differently. It will be harder because we’re going to introduce devices that aren’t compatible with each other. GPUs are an example and FPGAs another. They may have a permanent place in architecture, but they have a disadvantage – the way that you have to go about programming different devices in a heterogeneous system varies.”

Reinders adds, “The idea of neo-heterogeneity is this: by mixing multicore and many-core, but with the same programming model and the same X86 architecture underneath, you don’t have to decide which one you’re targeting at the same level that you do for a GPU or an FPGA. It’s the common programming model applied to a heterogeneous architecture that brings the benefits, particularly in power performance.”

The editors point out that the coprocessor supports a common programming model. “That’s the ‘neo,’ meaning it’s new, but it’s actually ‘old’ in the sense it’s familiar to programmers,” says Jeffers. “It’s the common programming model, applied to a heterogeneous architecture that brings benefits, particularly in power/performance. Keeping that common programming model allows you to take advantage of future capabilities as well – for instance, what’s coming with Intel’s Knights Landing product. Utilizing this common programming model will allow modernized code to take advantage of architectural performance benefits without requiring reprogramming every time.”

Code Modernization

One of the recurring themes in the book is the benefits of code modernization.

Many of the contributors are domain experts as well as highly experienced computer scientists. For example, some are expert in computational fluid dynamics or finite element analysis; others have become proficient in parallel processing. Building on their strengths, many of the contributors collaborated in choosing code for parallelization and then methodically modernized the code to run in parallel not only on Intel Xeon Phi, but on Intel Xeon processors as well. They were rewarded with various levels of performance gains.

“Almost all the chapters have more than one author – most of them have anywhere from three to five authors,” says Jeffers. “This exemplifies the collaborative effort that people went through to develop codes that take advantage of everyone’s knowledge.”

Reinders and Jeffers report that the challenge of programming a coprocessor is not all that different from programming a general-purpose computer. That benefit continues with the next-generation Knights Landing, a powerful processor that combines the attributes of a processor and coprocessor and is amenable to the same programming methods.

High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches is available on store.elsevier.com and other bookstores. Copies will also be available at SC14 in New Orleans.

The Print ‘n Fly Guide to SC14 New Orleans is designed to be an in-flight magazine custom tailored for your journey to the Big Easy at SC14. Sponsored by Intel, the Guide will feature articles on code modernization.

As a supplement to the guide, we also offer this Sci-Fi Original story by Rich Brueckner: Angels of Silence. We hope you can enjoy it on your flight to New Orleans.