Dell Unveils Xeon 6 Servers, New Storage Appliances and Software for HPC-AI

Dell Technologies (NYSE: DELL) today made announcements across its data center server, storage and data protection portfolios. Here’s a rundown of the new offerings: Dell PowerEdge R470, R570, R670 and R770 ….

Vultr Announces Early Availability of NVIDIA HGX B200

WEST PALM BEACH, Fla. — Vultr, a privately-held cloud infrastructure company, today announced it offers early access to the NVIDIA HGX B200. Vultr Cloud GPU, accelerated by NVIDIA HGX B200, will provide training and inference support for enterprises looking to scale AI-native applications via Vultr’s 32 cloud data center regions worldwide. “We are pleased to […]

Axelera AI Wins EuroHPC Grant of up to €61.6M for AI Chiplet Development

AI hardware maker Axelera AI has unveiled Titania, which the company described as a high-performance, low-power and scalable AI inference chiplet. Part of the EuroHPC JU’s effort to develop a supercomputing ….

Lambda Launches Inference API

Dec. 13, 2024 — AI company Lambda today announced its Inference API, which the company said enables access to LLMs through a serverless AI for “a fraction of a cent.” The company said Lambda Inference API offers low-cost, scalable AI inference with such models as Meta’s recently released Llama 3.3 70B Instruct (FP8) at $0.20 […]

Cerebras Claims Fastest AI Inference

AI compute company Cerebras Systems today announced what it said is the fastest AI inference solution. Cerebras Inference delivers 1,800 tokens per second for Llama3.1 8B and 450 tokens per second for Llama3.1 70B, according to the company, making it 20 times faster than GPU-based solutions in hyperscale clouds.

NeuReality Launches Developer Portal for NR1 AI Inference Platform 

SAN JOSE — April 16, 2024 — NeuReality, an AI infrastructure technology company, announced today the release of a software developer portal and demo for installation of its software stack and APIs. The company said the announcement marks a milestone since delivery of its 7nm AI inference server-on-a-chip, the NR1 NAPU, and bring up of […]

HPC News Bytes 20240226: Intel Foundry Bash, Nvidia Earnings and AI Inference, HPC in Space, ISC 2024

A happy Monday of Leap Year Week to you! We offer a rapid run-through of the latest in HPC-AI, including: Intel Foundry bash, Gelsinger talks up the “Systems Foundry Era,” Wall Street hangs on Nvidia earnings, AI Training vs Inference, Digitial In-Memory Computing for inference efficiency, HPC in space, ISC 2024.

In-Memory Computing Could Be an AI Inference Breakthrough

[CONTRIBUTED THOUGHT PIECE] In-memory computing promises to revolutionize AI inference. Given the rapid adoption of generative AI, it makes sense to pursue a new approach to reduce cost and power consumption by bringing compute in memory and improving performance.

Accelerating AI Inference for High-throughput Experiments

An upgrade to the ALCF AI Testbed will help accelerate data-intensive experimental research. The Argonne Leadership Computing Facility’s (ALCF) AI Testbed—which aims to help evaluate the usability and performance of machine learning-based high-performance computing (HPC) applications on next-generation accelerators—has been upgraded to include Groq’s inference-driven AI systems, designed to accelerate the time-to-solution for complex science problems. […]

Kickstart Your Business to the Next Level with AI Inferencing

{SPONSORED GUEST ARTICLE] Check out this article form HPE (with NVIDIA.) The need to accelerate AI initiatives is real and widespread across all industries. The ability to integrate and deploy AI inferencing with pre-trained models can reduce development time with scalable secure solutions….