Introduction to Concurrency in Programming Languages
by Matthew J. Sottile, Timothy G. Mattson, and Craig E. Rasmussen
Chapman and Hall/CRC Press (2009)
I just recently finished reading Introduction to Concurrency in Programming Languages, one of the entries in CRC’s incredibly active Computational Science Series (“Incredibly active?” Yes: the series homepage lists 7 titles applicable to HPC coming in 2011, and a similar number published in 2010.)
I picked this book out of my large-ish stack of books waiting, mostly patiently, to be reviewed because I’m working on a research project these days that has to do with new models of parallel programming. I figured I’d get a decent grounding in what’s already been done, and why, but I was also concerned I’d get lost in computer science formalism. I was right, and wrong.
The authors are from a nice mix of the theoretical and the applied: Sottile is from the U of Oregon, Mattson is from Intel, and Rasmussen is from Los Alamos National Lab. All are active in supercomputing; you may recognize Mattson from his work on OpenMP or his book adapting the patterns concept to parallel programming. They set out for themselves as a goal “the motivation and definition of language level constructs for concurrency that have been designed, studied, and found their way into languages that we use today.”
Don’t let that turn you off. The authors don’t assume you are fresh out of a computer science languages course. They go to some pain to create a gentle slope back into the languages pool, with many explanations by way of analogy. “Language level constructs,” they explain, are things like loops that let the programmer express the concept of a loop without forcing her to manage the program counter explicitly. By extension, a language-level construct for parallelism might be a
for loop that executes in parallel. Of course, in theory, the benefit of this approach is that the compiler is free to pick the implementation that works best and the programmer gets to worry about higher-level tasks, like whether the science is right.
In practice, however, we’ve never really gotten much further than first base with this approach. The literature is littered with attempts to separate specification from implementation in HPC that worked fine for some subset of special cases, but never really panned out in the general case (for example, HPF). But there is also quite a large body of advances that have made it into general use, and the authors cover those in this text in the hope that the way forward is in understanding why these worked, and building upon them.
What won’t you find?
This is not a book about OpenMP, MPI, and the other libraries and language tools that extend or augment traditional sequential programming languages like C and Fortran so that programmers can develop code that executes concurrently. These approaches are discussed, but only to set them apart from the true focus of the book: languages that include concurrency in fundamental operators as part of the language.
I’ve already mentioned one alluring, if difficult to attain, advantage of including concurrent constructs in the language: the programmer can raise his focus from the implementation details and focus on correctness. This can be, as I’ve already discussed, a somewhat suspect motivation for further work given the dismal history of the practice. Happily there are more convincing reasons to care. Whenever the compiler encounters a library call, say to
MPI_Send, it has to assume that no optimizations are possible across that call. No code reordering, no optimizations on the calling side of the function to help the function execute more effectively, no elimination of variables that are never used again, and no optimization across processors to create more efficient code (for example, by coalescing many small messages on the programmer’s behalf). Promotion of concurrent constructs to be a member of the language itself, as the authors explain, puts all of this back into play, and puts the compiler back to work on behalf of the programmer.
This seems like a fairly small step that could have large payoffs in programmer productivity, and to my mind makes a convincing case for pursuing this work, no matter how jaded you are by previous grand plans to give over implementation to all-knowing compilers.
The lay of the text
As I’ve already mentioned, a clear focus in this book is on keeping the material accessible. The authors succeed at this brilliantly. I came to this book with the equivalent of a minor in computer science finished almost two decades ago, and almost no memory of language theory (other than the word “automata”, which I always liked as a word). That was enough grounding to enable me to easily keep up with the authors and still come away from the book with a deeper understanding of the concurrent world around me.
The text opens with a few chapters of introductory material on the core concepts in parallelism and concurrency. Then they move on in chapter 3 to concurrency control mechanisms, discussing the merits and demerits of techniques such as synchronization, locks, monitors, and so on. In chapter 4, “The State of the Art,” the authors cover libraries (and their limitations), along with message passing, explicitly controlled threads, and more advanced techniques such as transactional memory. Chapter 5 lays the groundwork for discussing and assessing the effectively of high-level language constructs for concurrency, including a nice subsection on cognitive dimensions which I found quite helpful.
Chapter 6 introduces the historical context (dataflow or ALGOL, anyone?), and chapter 7 pushes forward to modern day approaches with work on array notation, co-arrays, functional languages, and more. Chapter 8 addresses performance considerations.
Chapters 9 through 13 look at parallel algorithms and how they present themselves for effective concurrent implementation. After an introduction to parallel algorithms in general, each of the remaining chapters looks at a specific pattern of parallel processing (remember, Mattson is one of the authors of this book) and studies its implementation from several different perspectives. These chapters are quite helpful, and one can imagine them serving as focal points for practitioners expanding their knowledge or for students finishing out a semester with a big project.
The book has three appendices that provide additional material on three very different approaches to programming in parallel today: OpenMP, Erlang, and Cilk.
The last word
Ok, so you (probably) won’t be cracking this book open in front of a crackling fire at a ski lodge this winter. At least not if you want to go home with someone you didn’t know at the start of the trip. But if you are just jumping into the world of concurrent programming, or taking a more theoretical look at the approaches we’ve all been taking for granted for the past 20 years in an attempt to make things better, then this book is a great start.
The authors present a clear motivation for the relevance of continuing this work, and provide both the historical context and knowledge of present day practice that you’ll need to get off on the right foot. That they manage to do this while keeping the language clear and the text accessible is a tribute to the effort Sottile, Mattson, and Rasmussen put into the creation of the text.