A stream is an array whose elements can be operated on in parallel, similar to SIMD computing. In stream programming, data is gathered from memory into a stream, operated on in the stream, and then scattered from the stream back into memory. Because data moves in large chunks rather than one element at a time, the cost of memory latency is spread over many elements, which is similar in effect to caching.
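To make the gather, operate, scatter cycle concrete, here is a minimal sketch in plain Python. The names `gather`, `kernel`, and `scatter` are my own illustrative choices, not part of Brook or any vendor API, and the kernel is applied sequentially here, whereas a real stream processor would apply it to the elements in parallel.

```python
def gather(memory, indices):
    """Collect scattered memory locations into one contiguous stream."""
    return [memory[i] for i in indices]


def kernel(x):
    """Per-element operation; no element depends on any other element."""
    return 2.0 * x + 1.0


def scatter(memory, indices, stream):
    """Write each stream element back to the location it was gathered from."""
    for i, value in zip(indices, stream):
        memory[i] = value


# Hypothetical "main memory" and the locations we want to process.
memory = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
indices = [6, 1, 4]

stream = gather(memory, indices)        # one bulk read
stream = [kernel(x) for x in stream]    # independent per-element work
scatter(memory, indices, stream)        # one bulk write
```

Because the kernel never looks at a neighboring element, the middle step can be split across as many processing units as the hardware offers, and the only traffic to main memory is the bulk read and the bulk write.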
This style of computing was popularized by Stanford’s Merrimac project, which featured the Brook programming language (an extension to C). At least two start-ups have come out of this work: Stream Processors, which will develop signal processors, and PeakStream, which has just released a software engineering tool known as “Platform” that is intended to simplify development on co-processors.
PeakStream’s Platform is a combination of a virtual machine (no kernel modification required) and a library (a standard C++ API). The VM includes a scheduler that directs operations to the best available runtime system, whether that is a CPU, a GPU, or even the Cell processor. The library should be easy to pick up for anyone familiar with products like MATLAB. Ultimately, the goal of Platform is to let technical computing customers obtain much better performance from so-called “heterogeneous” systems.
In a way, the combination of PeakStream + GPU resembles ClearSpeed’s own approach, though ClearSpeed’s Advance uses standard BLAS rather than a proprietary library. It is interesting to note that programming solutions for both co-processors and multicore CPUs have appeared recently: Intel is now offering its Threading Building Blocks library, and Mercury has released its MultiCore Plus SDK. What is still missing is better vendor-supported tooling for distributed-memory programming.
Update: To be fair to competitors, RapidMind is a commercial distribution of Sh. RapidMind / Sh is similar to PeakStream / Brook as both use stream programming to target CPU, GPU, and Cell. Pick whichever you prefer.