Greg Pfister looks at why now may be the right time for accelerators to stick

I was up late two nights ago and happened to see Greg Pfister’s latest post go live. It’s an interesting piece that starts with a snapshot of the elephant that everyone likes to ignore: accelerators aren’t a new idea (they go back a very long way, to companies like FPS and earlier), and if they are so great, why haven’t they already taken over the world?

Nonstandard software that never quite works with your OS release; disappointing results when you find out you’re going 200 times faster – on 5% of the whole problem; lethargic data transfer whose overhead squanders the performance; narrow applicability that might exactly hit your specific problem, or might not when you hit the details; difficult integration into system management and software development processes; and a continual need for painful upgrades to the next, greatest version with its different nonstandard software and new hardware features; etc.

But he does think that things may be changing, and he outlines some of his reasons in the post: programming standards, ease of connection into commodity systems, and the network effect, something I’ve talked about here before.

Beyond companies making accelerators, there are a collection of companies who are accelerator arms dealers – they live by making technology that’s particularly good for creating accelerators, like semi-custom single-chip systems with your own specified processing blocks and/or instructions. Some names: Cavium, Freescale Semiconductor, Infineon, LSI Logic, Raza Microelectronics, STMicroelectronics, Teja, Tensilica, Britestream. That’s not to leave out FPGA vendors who make custom hardware simple by providing chips that are seas of gates and functions you can electrically wire up as you like.

Why now? Greg has an interesting theory that resonates with me:

Until recently, everybody has had to run a Red Queen’s race with general purpose hardware. There’s no point in obtaining an accelerator if by the time you convince your IT organization to allow it, order it, receive it, get it installed, and modify your software to use it, you could have gone faster by just sitting there on your butt, doing nothing, and getting a later generation general-purpose system. When the general-purpose system has gotten twice as fast, for example, the effective value of your accelerator has halved.

What the graph shows is this: Suppose you buy an accelerator that does something 10 times faster than the fastest general-purpose “commodity” system does, today. Now, assume GP systems increase in speed as they have over the last couple of decades, a 45% CAGR. After only two years, you’re only 5 times faster. The value of your investment in that accelerator has been halved. After four years, it’s nearly divided by 5. After five years, it’s worthless; it’s actually slower than a general purpose system.

But expected single-thread performance improvements are only in the range of 10-15% per year these days, so an accelerator holds its advantage over general-purpose hardware for much longer than it used to. Makes sense.
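To make the arithmetic concrete, here is a minimal sketch of the compounding model described above. It assumes a fixed accelerator that starts 10 times faster and a general-purpose system that improves at a constant annual rate. The function names are mine, not Greg's. Under those assumptions a 45% CAGR leaves the accelerator about 4.8 times faster after two years and about 2.3 times faster after four. Break-even comes a little after six years, slightly later than the five years quoted, which may reflect different assumptions behind the original graph. At 10-15% a year, the same 10x head start lasts roughly 16 to 24 years.

```python
import math


def relative_speedup(initial_speedup, cagr, years):
    """Accelerator speed relative to a GP system that gets (1 + cagr) times faster each year.

    Assumes the accelerator itself never improves (a simplifying assumption).
    """
    return initial_speedup / (1.0 + cagr) ** years


def break_even_years(initial_speedup, cagr):
    """Years until the GP system matches the fixed accelerator."""
    return math.log(initial_speedup) / math.log(1.0 + cagr)


if __name__ == "__main__":
    for cagr in (0.45, 0.15, 0.10):
        print(f"GP growth {cagr:.0%} per year:")
        for years in (2, 4, 5):
            ratio = relative_speedup(10.0, cagr, years)
            print(f"  after {years} years the accelerator is {ratio:.2f}x faster")
        print(f"  break-even after {break_even_years(10.0, cagr):.1f} years")
```

The model deliberately ignores accelerator vendors shipping faster parts of their own, so treat it as a lower bound on how long the advantage lasts.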


  1. It is a great article. I think with accelerators, there really needs to be a focus on the long-term roadmap. Take the current trend with GPUs. They are already an add-in card to begin with. They may provide 10x to 100x improvements depending on the application, and even as general-purpose commodity hardware improves to close the gap, the GPU industry is still improving too. With APIs like CUDA, you can be fairly confident that your application today will work on the GPUs of tomorrow, and the same goes for OpenCL. With OpenCL, though, the expectation is that you will have a general framework that allows access to numerous accelerator devices and their future incarnations.

    ASICs are really where things may still be at issue, but as long as a developer has a roadmap for their product, they should be able to stay relevant, unless that roadmap involves a product that will be beaten out by, like you say, more general-purpose hardware over the next few years.

    FPGAs, though, are a bit of an interesting bag of tricks. I’ve been looking at FPGAs for years, and have worked on the implementation of an FPGA platform for high-speed data acquisition and signal processing. The problem you end up running into down the line is changes in pin counts, gates, registers, etc. It can be difficult to have an upgrade path unless you are working from higher-level interfaces that do all of the loop unrolling and population of logic for you. Even then, when you upgrade the hardware, you have to upgrade the software to a new version number with a new bin for the updated FPGA board. Nontrivial, but not impossible to have an upgrade path to look towards.