Lambda Launches Inference API

Dec. 13, 2024 — AI company Lambda today announced its Inference API, which the company said enables access to LLMs through a serverless API for “a fraction of a cent.” The company said the Lambda Inference API offers low-cost, scalable AI inference with such models as Meta’s recently released Llama 3.3 70B Instruct (FP8) at $0.20 […]
@HPCpodcast Industry View: Penguin Solutions on Big HPC-AI Deployment Challenges and How to Overcome Them

In this “Industry View” episode of the @HPCpodcast, special guest Ryan Smith joins Shahin and Doug to discuss the vexing challenges of implementing HPC-class AI systems in a managed services model. We examine ….
Meta FAIR and VSParticle Launch Catalyst Database Designed to Accelerate Clean Energy Transition

Delft, 20th November 2024: In a bid to accelerate the transition to clean energy in the fight against climate change, VSParticle (VSP) – a Dutch nanotechnology engineering company – today announced the first results from a collaboration with Meta’s Fundamental AI Research (FAIR) team, and the University of Toronto (UofT). The collaboration brings together VSP’s […]
Penguin Solutions’ Big Cluster Expertise Extends to Powerhouse Services for Big AI Deployments

[SPONSORED GUEST ARTICLE] These are the times that try the souls of IT staffs tasked with deploying AI at scale… Effectively implementing AI at scale involves a whole host of design, build, deployment, and management considerations. It requires ….
Cerebras Claims Fastest AI Inference

AI compute company Cerebras Systems today announced what it said is the fastest AI inference solution. Cerebras Inference delivers 1,800 tokens per second for Llama 3.1 8B and 450 tokens per second for Llama 3.1 70B, according to the company, making it 20 times faster than GPU-based solutions in hyperscale clouds.
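As a rough sanity check on the arithmetic in the claim above (a sketch using only the figures quoted in the announcement, not Cerebras’ own benchmarking methodology, and assuming the 20x comparison applies to the 8B model), the implied per-user throughput of the GPU-cloud baseline can be derived by dividing the stated token rate by the stated speedup:

```python
# Figures quoted in the Cerebras announcement above.
llama_8b_tps = 1800   # tokens/sec, Llama 3.1 8B on Cerebras Inference
llama_70b_tps = 450   # tokens/sec, Llama 3.1 70B on Cerebras Inference
claimed_speedup = 20  # "20 times faster than GPU-based solutions"

# Implied GPU-cloud baseline if the 20x claim refers to the 8B model.
implied_gpu_baseline_8b = llama_8b_tps / claimed_speedup
print(implied_gpu_baseline_8b)  # 90.0 tokens/sec
```

The teaser does not say which model the 20x figure is measured against, so the baseline above is illustrative only.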
Open Compute Project Foundation and Hyperscalers to Trial Low-Carbon ‘Green Concrete’

AUSTIN, Texas, Aug. 20, 2024 — Today, the Open Compute Project Foundation (OCP) announces a collaboration to test development and deployment of low-embodied carbon concrete or “green concrete.” While numerous emerging technologies exist to achieve production of low carbon concrete, adoption has not yet scaled. This proactive and collaborative demonstration project is an important step towards […]
Nvidia AI Foundry for Custom Llama 3.1 Generative AI Models

Nvidia today announced its AI Foundry service and NIM inference microservices for generative AI with Meta’s Llama 3.1 collection of models, also introduced today. The company said its AI Foundry allows organizations to create custom “supermodels” for their ….
Ultra Accelerator Link Group for Data Center AI Connectivity Formed: AMD, Broadcom, Cisco, Google, HPE, Intel, Meta and Microsoft

BEAVERTON, Ore.– AMD, Broadcom, Cisco, Google, Hewlett Packard Enterprise (HPE), Intel, Meta and Microsoft today announced they have aligned to develop a new industry standard dedicated to advancing high-speed, low-latency communication for scale-up AI system links in data centers. Called the Ultra Accelerator Link (UALink), this initial group will define and establish an […]
Meta VR Project Developers Turn to Liqid UltraStack for NVIDIA GPU-Packed Dell Servers

[SPONSORED GUEST ARTICLE] Meta (formerly Facebook) has a corporate culture of aggressive technology adoption, particularly in the area of AI and AI-related technologies, such as the GPUs that drive AI workloads. Members of a virtual reality research project were in need of greater GPU-driven compute ….
HPC News Bytes 20240415: Intel Gaudi 3, Meta’s MTIA Chip, Easing GPU Shortages, AI Category Theory, China’s Growth Strategy

Happy Tax Day to you! Here’s a quick (6:23) romp through recent news from the world of HPC-AI, including: Intel’s Gaudi 3 GPU and Xeon-6 CPU, Meta’s new accelerator chip, GPU shortage easing….