SAN JOSE — April 16, 2024 — NeuReality, an AI infrastructure technology company, today announced the release of a software developer portal and demo for installing its software stack and APIs. The company said the announcement marks a milestone since the delivery of its 7nm AI inference server-on-a-chip, the NR1 NAPU, and the bring-up of […]
In-Memory Computing Could Be an AI Inference Breakthrough
[CONTRIBUTED THOUGHT PIECE] In-memory computing promises to revolutionize AI inference. Given the rapid adoption of generative AI, it makes sense to pursue a new approach that reduces cost and power consumption by moving compute into the memory itself, cutting the data movement that dominates both, while improving performance.
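The appeal is easiest to see in a toy model. In a resistive crossbar, the canonical in-memory compute fabric, weights stay put as cell conductances and the multiply-accumulate happens where they are stored, so weight traffic largely disappears. The NumPy sketch below is purely illustrative: the array shape, the values, and the idealized noise-free crossbar are assumptions for exposition, not any vendor's design.

```python
# Illustrative sketch of crossbar-style in-memory computing (idealized,
# noise-free; not any vendor's design). Weights are programmed once as
# cell conductances G; inputs arrive as row voltages V; each column's
# summed current is a dot product, computed where the weights live.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(256, 64))   # conductances stored in the array
activations = rng.normal(size=256)     # voltages driven onto the rows

# Per column j, Ohm's law (I = V * G) plus Kirchhoff's current law give
# the multiply-accumulate "for free": I_j = sum_i V[i] * G[i, j].
column_currents = np.array(
    [np.sum(activations * weights[:, j]) for j in range(weights.shape[1])]
)

# A conventional accelerator computes the same product, but only after
# streaming all 256 x 64 weights out of DRAM through the cache hierarchy.
assert np.allclose(column_currents, activations @ weights)
```

In a real device the conductances are quantized and noisy and the column currents must be digitized, which is where much of the engineering effort in this approach goes.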
Industry Heavyweights Form Ultra Ethernet Consortium for HPC and AI
SAN FRANCISCO – July 19, 2023 – A host of industry heavyweights have formed the Ultra Ethernet Consortium (UEC), intended to promote “industry-wide cooperation to build a complete Ethernet-based communication stack architecture for high-performance networking” for HPC and AI workloads, the new group said. Founding members include AMD, Arista, Broadcom, Cisco, Eviden (an Atos Business), […]
MLCommons: Latest MLPerf AI Benchmark Results Show Machine Learning Inference Advances
SAN FRANCISCO – September 8, 2022 – Today, the open engineering consortium MLCommons announced results from MLPerf Inference v2.1, which analyzes the performance of inference — the application of a trained machine learning model to new data. Inference allows for the intelligent enhancement of a vast array of applications and systems. Here are the results and […]
MLCommons Launches MLPerf Tiny AI Inference Benchmark
Today, the open engineering consortium MLCommons released a new benchmark suite, MLPerf Tiny Inference, to measure trained neural network AI inference performance for low-power devices in small form factors. MLPerf Tiny v0.5 is MLCommons's first inference benchmark suite for embedded-device machine learning, a growing field in which AI-driven sensor data analytics is performed in real time, close […]
LeapMind Unveils Efficiera Ultra Low-Power AI Inference Accelerator IP
Today LeapMind announced Efficiera, an ultra-low-power AI inference accelerator IP, along with related products, for companies that design ASIC and FPGA circuits. Efficiera will enable customers to develop cost-effective, low-power edge devices and accelerate time-to-market for custom devices featuring AI capabilities. “This product enables the inclusion of deep learning capabilities in various edge devices that are technologically limited by power consumption and cost, such as consumer appliances (household electrical goods), industrial machinery (construction equipment), surveillance cameras, and broadcasting equipment as well as miniature machinery and robots with limited heat dissipation capabilities.”
Gyrfalcon Acceleration Chips Speed SolidRun AI Inference Server
Today SolidRun introduced a new Arm-based AI inference server optimized for the edge. Highly scalable and modular, the Janux GS31 supports today’s leading neural network frameworks and can be configured with up to 128 Gyrfalcon Lightspeeur SPR2803 AI acceleration chips for unrivaled inference performance on today’s most complex video AI models. “While GPU-based inference servers have seen significant traction for cloud-based applications, there is a growing need for edge-optimized solutions that offer powerful AI inference with less latency than cloud-based solutions. Working with Gyrfalcon and utilizing their industry-proven ASICs has allowed us to create a powerful, cost-effective solution for deploying AI at the Edge that offers seamless scalability.”
NVIDIA Tops MLPerf AI Inference Benchmarks
Today NVIDIA posted the fastest results on new benchmarks measuring the performance of AI inference workloads in data centers and at the edge — building on the company’s equally strong position in recent benchmarks measuring AI training. “NVIDIA topped all five benchmarks for both data center-focused scenarios (server and offline), with Turing GPUs providing the highest performance per processor among commercially available entries.”
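For context, the two data-center scenarios measure different things: offline supplies every query up front and scores pure throughput, while server delivers queries at random intervals and scores tail latency against a fixed bound. The Python sketch below illustrates the distinction in miniature; it is not the official MLPerf LoadGen harness, and the 2 ms model stub, 200-query run, and 10 ms latency bound are assumptions chosen for illustration.

```python
# Minimal illustration of MLPerf's offline vs. server measurement styles.
# NOT the official LoadGen harness; the model stub, query count, and
# latency bound are illustrative assumptions.
import random
import statistics
import time

def infer(query):
    time.sleep(0.002)  # stand-in for a real model's forward pass (~2 ms)
    return query

def offline_throughput(queries):
    """Offline: all queries are available up front; score is samples/sec."""
    start = time.perf_counter()
    for q in queries:
        infer(q)
    return len(queries) / (time.perf_counter() - start)

def server_tail_latency(queries, bound_s=0.010):
    """Server: queries arrive at random intervals; score is p99 vs. a bound."""
    latencies = []
    for q in queries:
        time.sleep(random.expovariate(200))  # ~5 ms mean arrival gap
        t0 = time.perf_counter()
        infer(q)
        latencies.append(time.perf_counter() - t0)
    p99 = statistics.quantiles(latencies, n=100)[98]
    return p99, p99 <= bound_s

queries = list(range(200))
print(f"offline: {offline_throughput(queries):.0f} samples/s")
p99, meets_bound = server_tail_latency(queries)
print(f"server p99: {p99 * 1000:.1f} ms (meets 10 ms bound: {meets_bound})")
```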
New MLPerf Benchmark Measures Machine Learning Inference Performance
Today a consortium involving over 40 leading companies and university researchers introduced MLPerf Inference v0.5, the first industry-standard machine learning benchmark suite for measuring system performance and power efficiency. “Our goal is to create common and relevant metrics to assess new machine learning software frameworks, hardware accelerators, and cloud and edge computing platforms in real-life situations,” said David Kanter, co-chair of the MLPerf inference working group. “The inference benchmarks will establish a level playing field that even the smallest companies can use to compete.”
Qualcomm to Bring Power-Efficient AI Inference to the Cloud
Today Qualcomm announced that it is bringing the company’s artificial intelligence expertise to the cloud with the Qualcomm Cloud AI 100. “Our all new Qualcomm Cloud AI 100 accelerator will significantly raise the bar for AI inference processing relative to any combination of CPUs, GPUs, and/or FPGAs used in today’s data centers,” said Keith Kressin, senior vice president, product management, Qualcomm Technologies, Inc.