Why Hardware Acceleration Is The Next Battleground In Processor Design


In this special guest feature, Theodore Omtzigt from Stillwater Supercomputing writes that as workloads specialize due to scale, hardware accelerated solutions will continue to be cheaper than approaches that utilize general purpose components.

Theodore Omtzigt from Stillwater Supercomputing

Tech decision-makers and company leaders often say they value cost, performance, and efficiency, but following through requires determination and business acumen. Only the most sophisticated leaders can steer transformations like the migration of IT to the cloud or the shift from internal combustion engines to electric motors, because doing so demands deft maneuvering around the many pitfalls and distractions of a fast-moving tech industry.

At the end of the day, only a handful of business leaders are able to make the complex decisions required to move their industry forward and make their company or products more efficient. But those who have done so are now major players.

In the world of information technology (IT), companies like Amazon, Google, and Baidu have already shifted away from general-purpose CPUs and GPUs toward more efficient hardware accelerators, prompting a flood of venture capital into previously unfashionable chip startups. Hundreds of millions of dollars have been poured into Graphcore, Wave Computing, and Cerebras, with early movers already being acquired: Intel bought Nervana in 2016 for $350M.

Much of this activity is driven by the AI boom, but the benefits of efficiency come from scale, and the Internet of Everything (IoE) is pushing scale in every direction.

Against this backdrop, hardware acceleration is a key piece of the efficiency puzzle—and companies like Huawei understand this well.

Last October, Huawei announced two AI chips: the Ascend 310, a hardware accelerator for AI edge computing, and the Ascend 910, a hardware accelerator for cloud-based AI training services. Then, in January of this year, the company announced the industry's highest-performance Advanced RISC Machine (ARM)-based chip. These chips, each pairing a general-purpose processor with one or more energy-efficient compute accelerators, are designed not only to advance computing for big data and distributed storage, but also to deliver higher computing performance at lower power.

Together, they complete a full stack of hardware accelerated products for mobile, edge, and cloud services.

But Huawei isn’t the only organization making major changes. Other examples can be found in media processing, security, blockchain, and sensor fusion for autonomous vehicles.

The common thread is that all these applications have global scale. Organizations that deploy efficient solutions for these applications will be richly rewarded. To make your strategic technology plan future-proof, your systems need to do more with less.

There’s power in specialization

As the cloud providers have demonstrated, there is tremendous business value derived from specialization.

Google’s Tensor Processing Unit (TPU), optimized for AI workloads, is two orders of magnitude more efficient than a general-purpose device. Put simply, if your business delivers AI services and pays $1.00 for IT capacity, Google pays just $0.01 for that same capacity.

It’s impossible to be competitive when your cost to deliver a service is 100x your competitor’s. And in the age of IoE, this specialization translates into serious business opportunities across the spectrum of information technologies.
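The arithmetic behind that claim can be sketched in a few lines. This is a back-of-envelope illustration using the dollar figures from the example above; the efficiency gap and prices are illustrative, not published benchmarks or pricing.

```python
# Back-of-envelope cost comparison: a specialized accelerator that is
# two orders of magnitude more efficient than a general-purpose device
# delivers the same unit of IT capacity at 1/100th of the cost.

EFFICIENCY_GAP = 100  # ~two orders of magnitude (illustrative figure)

def cost_per_unit_capacity(general_purpose_cost, efficiency_gap=EFFICIENCY_GAP):
    """Cost of the same IT capacity delivered on specialized hardware."""
    return general_purpose_cost / efficiency_gap

general = 1.00                                  # $1.00 on general-purpose hardware
specialized = cost_per_unit_capacity(general)   # $0.01 on a specialized accelerator
print(f"General purpose: ${general:.2f} / specialized: ${specialized:.2f}")
# -> General purpose: $1.00 / specialized: $0.01
```

The point of the calculation is not the exact ratio but its order of magnitude: at 100x, no amount of operational tuning on general-purpose hardware closes the gap.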

For example:

  • Security solutions for public spaces are dominated by video, a highly specialized workload. Specialized video processors can deliver the same 100x cost benefit that Google enjoys for AI workloads.
  • Internet of Things (IoT) applications deploy large networks of sensors to gather information about traffic flows, device operations, and customer behavior. The sensor fusion algorithms that interpret this data are highly structured, and custom-tailored hardware accelerators can reduce the cost of a smart IoT gateway by orders of magnitude.
  • Autonomous vehicles, even at low levels of autonomy, generate roughly 2TB of data per hour, and this will grow to hundreds of TB per hour as autonomy levels rise. This workload combines the video processing and sensor fusion described above, but here the data must be processed in real time to have any value at all. Specialized hardware accelerators are the only available solution for this level of computation.
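The data rates in the last bullet translate into sustained-throughput requirements that a real-time pipeline must keep up with. A quick conversion makes the scale concrete (a sketch: the 2TB/hr figure is from the text, the 200TB/hr case is one point in the "hundreds of TB per hour" range, and decimal units are assumed):

```python
# Convert the autonomous-vehicle data rates quoted above into sustained
# throughput. Assumes decimal units: 1 TB = 10**12 bytes, 1 MB = 10**6 bytes.

SECONDS_PER_HOUR = 3600

def tb_per_hour_to_mb_per_second(tb_per_hour):
    """Sustained throughput in MB/s for a given data rate in TB/hr."""
    bytes_per_hour = tb_per_hour * 10**12
    return bytes_per_hour / SECONDS_PER_HOUR / 10**6

low_autonomy = tb_per_hour_to_mb_per_second(2)     # ~2 TB/hr today
high_autonomy = tb_per_hour_to_mb_per_second(200)  # hundreds of TB/hr later
print(f"Low autonomy:  {low_autonomy:.0f} MB/s")   # ~556 MB/s
print(f"High autonomy: {high_autonomy:.0f} MB/s")  # ~55,556 MB/s
```

Even the low-autonomy case demands over half a gigabyte per second of sustained processing per vehicle; at hundreds of TB per hour, the throughput requirement is far beyond what a general-purpose processor can handle in real time.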

The general availability of always-on communication has altered the IT landscape dramatically

The volume of information is set to grow exponentially, but the cost of processing it must stay flat. That’s why, fundamentally, processing needs to become more efficient to support the economics of the IoT.

Early movers in the industry—like Google, Microsoft, and Baidu—buoyed by the windfall provided by AI, have demonstrated that hardware specialization can deliver this leap in efficiency improvements. As computation becomes increasingly embedded in smartphones and autonomous vehicles, efficiency will be the competitive differentiator.

The hardware cost of a system is proportional to the number of transistors it integrates, and a design that does more with the same transistors yields a better product or service.

So, what does this mean for CIOs?

If you’re a CIO who manages integrations of third-party hardware and software, be aware of new hardware acceleration technologies that can reduce the cost of service delivery by orders of magnitude. If your competitor can leverage these technologies, they will be able to deliver the same service at a significant cost advantage.

If you’re a CIO who develops your own systems and applications, on the other hand, understand that if your system depends on application software that requires general purpose devices, a competitor that uses application software that takes advantage of custom-tailored hardware accelerators will be more efficient—and will be able to deliver the same service at lower cost.

We are entering a new phase of the IT industry in which computers are disappearing into the cloud and into embedded applications, and our business metrics need to shift from transactions per second to total cost of service. As workloads specialize due to scale, there will always be hardware-accelerated solutions that are cheaper than approaches built on general-purpose components.

In any service-oriented business, the lowest cost wins, and the lowest-cost solutions will always be those tailored to the workload.

Theodore Omtzigt is the founder of Stillwater Supercomputing, a company devoted to building the next generation platform for computational science and engineering. Stillwater believes that the 21st century belongs to the computational scientist and that many important innovations will be driven by computational models. It seeks to aid in that quest with a high productivity environment for data mining, quantitative finance, statistics, and computational science.
