Amid swirling reports of a multi-billion-dollar capital infusion offer from an investment firm, along with a reported acquisition attempt, Intel proceeded today with launches of the Xeon 6 CPU and the Gaudi 3 AI accelerator, both chips aimed at exploding demand for AI compute and at competitors Nvidia and AMD.
Intel said the Xeon 6 (“Granite Rapids”) with Performance-cores (P-cores) delivers twice the performance of its predecessor and features increased core count, double the memory bandwidth, and AI acceleration capabilities embedded in its cores.
The Gaudi 3 AI Accelerator designed for large-scale generative AI offers 64 Tensor processor cores (TPCs) and eight matrix multiplication engines (MMEs) to handle deep neural network computations.
If competing with Nvidia and AMD chips weren’t enough, Intel also is slaying dragons on the corporate front. A report emerged Sunday that investment firm Apollo Global Management has offered to invest $5 billion in the company, sending Intel shares higher. Separately, it was reported last week that phone chip giant Qualcomm approached Intel with an acquisition offer. The Apollo offer is seen by some as an endorsement of Intel’s strategy under CEO Pat Gelsinger, who took charge of the company in early 2021, as well as a response to Qualcomm’s possible takeover.
Gaudi 3 includes 128 gigabytes of HBM2e memory for training and inference, and twenty-four 200-gigabit Ethernet ports for scalable networking. Gaudi 3 also offers compatibility with the PyTorch programming framework and with Hugging Face transformer and diffuser models, according to the company.
Intel said its new accelerator offers favorable cost/performance compared with the Nvidia H100 GPU, delivering, for example, roughly 1.09x inference throughput and 1.8x performance per dollar on Meta’s Llama 3 8B large language model. Intel did not have comparison data for Gaudi 3 versus Blackwell, Nvidia’s top GPU scheduled for shipment during the first half of next year.
Since the launch of OpenAI’s ChatGPT in November 2022 ignited the generative AI market, demand for GPUs used for training and inferencing on large language models has far exceeded supply. Intel and AMD are striving to catch up with Nvidia, which has a roughly 90 percent GPU market share.
This has left Intel with two challenges: Gaudi performance validation by a major customer, and procurement of sufficient supply of the chips from TSMC, the Taiwan-based chip foundry that leads in GPU manufacturing.
At a media pre-announcement event last week, Intel declined to reveal projected Gaudi shipment numbers, saying only that it is comfortable with expected volume.
The company also announced a collaboration with IBM to deploy Gaudi 3 on IBM Cloud — specifically on Watsonx, IBM’s generative AI and scientific data platform — a joint effort the companies said is intended to lower the total cost of ownership to leverage and scale AI.
“Our focus is inferencing right now,” Rohit Badlaney, GM of IBM Cloud Product and Industry Platforms, told us at the Intel event in Hillsboro, OR, last week. “So we’ve been very focused on inferencing use cases around cost performance, across security and compliance. That’s where we do well.”
He said IBM has evaluated Gaudi 3 against Nvidia and AMD GPUs, and while he declined to provide specifics, he said its cost/performance is “in the same range.”
“We’ve tested all the accelerators, we do extensive work with Nvidia, we do extensive work with AMD, and we are excited by (Gaudi). And we’ve tested (it on) Llama, the Mistral AI model, and our own Granite models, and it (Gaudi) really differentiates for our Granite models, which is why we’re excited. We’re going to start rolling it out and keep co-creating with (Intel) and co-inventing with them.”
Intel also shared Gaudi endorsement quotes from arcee.ai, Astera Labs and Dell Technologies.
“Demand for AI is leading to a massive transformation in the data center, and the industry is asking for choice in hardware, software, and developer tools,” said Justin Hotard, Intel executive vice president and general manager of the Data Center and Artificial Intelligence Group. “With our launch of Xeon 6 with P-cores and Gaudi 3 AI accelerators, Intel is enabling an open ecosystem that allows our customers to implement all of their workloads with greater performance, efficiency, and security.”
As for Xeon, it is established as a tandem processor alongside GPUs handling AI workloads. Intel said 73 percent of GPU-accelerated servers use Xeon as the host CPU. The company partners with such OEMs as Dell Technologies, Lenovo and Supermicro to develop co-engineered systems for AI deployments. The company also said Dell is co-engineering retrieval-augmented generation (RAG)-based solutions leveraging Gaudi 3 and Xeon 6.
“Transitioning gen AI solutions from prototypes to production-ready systems presents challenges in real-time monitoring, error handling, logging, security and scalability,” the company said. “Intel addresses these challenges through co-engineering efforts with OEMs and partners to deliver production-ready RAG solutions.”
These solutions, built on the Open Platform for Enterprise AI (OPEA), integrate OPEA-based microservices into a scalable RAG system optimized for Xeon and Gaudi systems, and are designed to allow customers to integrate applications from Kubernetes, Red Hat OpenShift AI and Red Hat Enterprise Linux AI, the company said.
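At its core, the RAG pattern these solutions implement is straightforward: retrieve the documents most relevant to a query, then hand them to a language model as grounding context. A minimal, hardware-agnostic sketch of that flow (the keyword-overlap retriever and template “generator” below are toy stand-ins for illustration, not Intel’s or OPEA’s actual implementation):

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# The retriever here scores documents by simple word overlap; a real
# system would use a vector store, and generate() would call an LLM
# (e.g. one served from Gaudi-based infrastructure).

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query; return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for an LLM call: prepend retrieved context to the prompt."""
    prompt = "Context:\n" + "\n".join(context) + f"\nQuestion: {query}"
    return prompt  # a real system would send this prompt to the model

corpus = [
    "Gaudi 3 offers 128 GB of HBM2e memory.",
    "Xeon 6 doubles memory bandwidth over its predecessor.",
    "RAG grounds model answers in retrieved documents.",
]
docs = retrieve("How much memory does Gaudi 3 offer?", corpus)
answer_prompt = generate("How much memory does Gaudi 3 offer?", docs)
```

The production versions Intel describes wrap each of these steps (embedding, retrieval, generation) in separate OPEA microservices so they can be scaled and monitored independently.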
Intel also said its Tiber Developer Cloud portfolio offers business solutions to handle access, cost, complexity, security, efficiency and scalability across AI, cloud and edge environments. Tiber provides preview systems of Intel Xeon 6 for tech evaluation and testing. Additionally, select customers will gain early access to Intel Gaudi 3 for validating AI model deployments, with Gaudi 3 clusters to begin rolling out next quarter for large-scale production deployments.
New service offerings include SeekrFlow, an end-to-end AI platform from Seekr for developing trusted AI applications. Updates feature the Gaudi software’s newest release and Jupyter notebooks loaded with PyTorch 2.4 and Intel oneAPI and AI Tools 2024.2, which include new AI acceleration capabilities and support for Xeon 6 processors.