World record crypto hash throughput


My friend Ilya at software maker Cilk Arts sent me an email with a pointer to a recent example application that uses Cilk’s multicore libraries, this time for the MD6 cryptographic hash:

“We implemented MD6 for multicore processors using the CILK extension to the C programming language […] The CILK technology makes multicore programming quite straightforward […] Our implementation of MD6 in CILK used the layer-by-layer approach (so it assumes that the input message is available all at once). It processes each layer in turn, but uses parallelism to process a layer efficiently.”
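To make the layer-by-layer idea concrete, here is a minimal Python sketch of a tree hash that finishes one layer before starting the next and computes all nodes within a layer in parallel. It is not the MD6 code: `compress` is a SHA-256 stand-in for the MD6 compression function, and `CHUNK_SIZE`, `FAN_IN`, `hash_layer` and `tree_hash` are names and values made up for illustration. A real tree hash would also mix each node's position into the compression input, which is left out here.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 64  # bytes per leaf chunk (illustrative value, not an MD6 parameter)
FAN_IN = 4       # child nodes combined into each parent (assumed for this sketch)


def compress(data: bytes) -> bytes:
    """Stand-in compression function (SHA-256), NOT the MD6 one."""
    return hashlib.sha256(data).digest()


def hash_layer(nodes, pool):
    """Build the next layer: every group of FAN_IN nodes becomes one parent.

    The groups are independent, so they are handed to the pool together.
    """
    groups = [b"".join(nodes[i:i + FAN_IN]) for i in range(0, len(nodes), FAN_IN)]
    return list(pool.map(compress, groups))


def tree_hash(message: bytes, workers: int = 4) -> bytes:
    """Hash a message that is available all at once, one layer at a time."""
    layer = [message[i:i + CHUNK_SIZE]
             for i in range(0, len(message), CHUNK_SIZE)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            layer = hash_layer(layer, pool)  # finish this layer before the next
            if len(layer) == 1:
                return layer[0]


if __name__ == "__main__":
    print(tree_hash(b"x" * 10_000).hex())
```

The sketch only shows the structure. Threads in CPython will not necessarily make it faster, whereas the Cilk version runs the per-layer work on separate cores.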

The full MD6 application has over 3,000 lines of code. Adding two Cilk keywords (at the bottom of the code snippet below) was sufficient to multicore-enable the algorithm. MD6 is recursive, and adding the Cilk++ keywords exposes a great deal of parallelism.
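As a rough illustration of why a recursive hash exposes parallelism so easily, the Python sketch below uses a fork-join pattern: each call starts its child subtrees (the "spawn" step), waits for all of them (the "sync" step), then combines their results. This is not the Cilk Arts code and does not show which keywords were added or where. `compress` is a SHA-256 stand-in, and `LEAF_BYTES`, `FAN_IN`, `hash_subtree` and `root_hash` are hypothetical names chosen for this example.

```python
import hashlib
import threading

LEAF_BYTES = 256  # ranges at or below this size are hashed directly (illustrative)
FAN_IN = 4        # subtrees per node (assumed for this sketch)


def compress(data: bytes) -> bytes:
    """Stand-in compression function (SHA-256), NOT the MD6 one."""
    return hashlib.sha256(data).digest()


def hash_subtree(message, lo, hi, out, slot):
    """Hash message[lo:hi] recursively and store the digest in out[slot]."""
    if hi - lo <= LEAF_BYTES:
        out[slot] = compress(message[lo:hi])
        return
    step = -(-(hi - lo) // FAN_IN)  # ceiling division: bytes per child range
    children = [None] * FAN_IN
    threads = []
    for k in range(FAN_IN):
        c_lo = min(lo + k * step, hi)
        c_hi = min(c_lo + step, hi)
        # "spawn": each child subtree runs independently of its siblings
        t = threading.Thread(target=hash_subtree,
                             args=(message, c_lo, c_hi, children, k))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()  # "sync": wait until every child digest is ready
    out[slot] = compress(b"".join(children))


def root_hash(message: bytes) -> bytes:
    out = [None]
    hash_subtree(message, 0, len(message), out, 0)
    return out[0]


if __name__ == "__main__":
    print(root_hash(b"x" * 4096).hex())
```

One thread per subtree is only workable for small inputs. A work-stealing runtime such as Cilk's schedules spawned calls onto a fixed set of workers instead.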

…At the time of this writing, the >1GB/sec throughput achieved on a 16-core system with the Cilkified version is a world record for MD6!

This Cilk blog entry closes with a challenge:

Would you like to push the envelope further? Here’s the Cilkified MD6 (Linux) implementation – run it on the biggest shared-memory system you can find, and let us know the reported throughput number (and system specs)!