MLCommons Launches MLPerf Tiny AI Inference Benchmark


Today, open engineering consortium MLCommons released a new benchmark, MLPerf Tiny Inference, to measure the inference performance of trained neural networks on low-power devices in small form factors.

MLPerf Tiny v0.5 is MLCommons’ first inference benchmark suite for embedded device machine learning, a growing field in which AI-driven sensor data analytics is performed in real time, close to where the data resides. The new benchmark suite addresses a variety of use cases that involve “tiny” neural networks, typically 100 kB and below, that process sensor data such as audio and vision to provide endpoint intelligence.

The first v0.5 round included five submissions from academic institutions, industry organizations, and national labs, producing 17 peer-reviewed results, according to the organization. Submissions this round included software and hardware innovations from Latent AI, Syntiant, PengCheng Labs, Columbia, UCSD, CERN, and Fermilab. Results can be found at:

MLPerf Tiny Inference enables reporting and comparison of embedded ML devices, systems, and software. Developed in partnership with the Embedded Microprocessor Benchmark Consortium (EEMBC), the benchmark consists of four ML tasks encompassing use of microphone and camera sensors with embedded devices:

  • Keyword Spotting (KWS), which uses a neural network that detects keywords from a spectrogram;
  • Visual Wake Words (VWW), a binary image classification task for determining the presence of a person in an image;
  • Tiny Image Classification (IC), a small image classification benchmark with 10 classes; and
  • Anomaly Detection (AD), which uses a neural network to identify abnormalities in machine operating sounds.

KWS has several use cases in endpoint consumer devices, such as earbuds and virtual assistants. VWW has use cases in, for instance, in-home security monitoring. IC has myriad use cases in smart video recognition applications. AD has several applications in industrial manufacturing, for tasks such as predictive maintenance, asset tracking, and monitoring.

“To understand progress and advance innovation, particularly in edge computing, the ML industry needs benchmarks,” said Peter Torelli, President of EEMBC. “Creating new metrics and measurement across neural networks and a variety of form factors is challenging, and we’re thrilled to partner with MLCommons to make MLPerf Tiny a reality.”

“The goal of MLPerf is to measure performance for machine learning across the full spectrum of systems – from microwatts to megawatts,” said Professor Vijay Janapa Reddi of Harvard University and MLPerf Tiny Inference working group chair. “This new benchmark will bring intelligence to devices like wearables, thermostats, and cameras, and further MLCommons’ mission to accelerate machine learning innovation to benefit everyone.”

With the addition of MLPerf Tiny, MLCommons said it covers the full range of machine learning inference benchmarks, from cloud and datacenter benchmarks that consume kilowatts of power to tiny IoT devices that consume only a few milliwatts. The MLPerf Tiny v0.5 inference benchmarks were created over the last 18 months by representatives from Harvard University, EEMBC, CERN, Columbia, Digital Catapult, Fermilab, Google, Infineon, Latent AI, ON Semiconductor, Peng Cheng Laboratories, Qualcomm, Renesas, SambaNova Systems, Silicon Labs, STMicroelectronics, Synopsys, Syntiant, UCSD, and Voicemed.

The MLPerf Tiny working group recently submitted a paper to the NeurIPS benchmarks and datasets track that provides information about the design and implementation of the benchmark suite. Additional information about the MLPerf Tiny Inference benchmarks is available at the GitHub repository.