Industry Heavyweights Form Ultra Ethernet Consortium for HPC and AI


SAN FRANCISCO – July 19, 2023 – A host of industry heavyweights have formed the Ultra Ethernet Consortium (UEC), intended to promote “industry-wide cooperation to build a complete Ethernet-based communication stack architecture for high-performance networking” for HPC and AI workloads, the new group said.

Founding members include AMD, Arista, Broadcom, Cisco, Eviden (an Atos Business), HPE, Intel, Meta and Microsoft.

“This isn’t about overhauling Ethernet,” said Dr. J Metz, Chair of the Ultra Ethernet Consortium and AMD technical director – systems design. “It’s about tuning Ethernet to improve efficiency for workloads with specific performance requirements. We’re looking at every layer — from the physical all the way through the software layers — to find the best way to improve efficiency and performance at scale.”

The consortium said it will work on minimizing communication stack changes while maintaining and promoting Ethernet interoperability. The UEC’s technical goals are to develop specifications, APIs and source code to define:

  1. Protocols, electrical and optical signaling characteristics, application program interfaces and/or data structures for Ethernet communications

  2. Link-level and end-to-end network transport protocols to extend or replace existing link and transport protocols

  3. Link-level and end-to-end congestion, telemetry and signaling mechanisms; each of the foregoing suitable for artificial intelligence, machine learning and high-performance computing environments

  4. Software, storage, management and security constructs to facilitate a variety of workloads and operating environments

The UEC said it will take a systematic approach, building modular, compatible, interoperable layers that are tightly integrated to improve performance for demanding workloads. The founding companies are seeding the consortium with contributions in four working groups: Physical Layer, Link Layer, Transport Layer and Software Layer.

The UEC's announcement included quotes from industry analysts, executives and consortium-member companies.

“Many HPC and AI users are finding it difficult to obtain the full performance from their systems due to weaknesses in the system interconnect capabilities,” said Dr. Earl Joseph, CEO of Hyperion Research. “It’s also difficult for users to integrate and learn multiple new or different solutions. It’s exciting to see this impressive group of leading companies work together to create a new common higher-performance interconnect solution. Buyers in the HPC and AI areas have very demanding workloads, which the Ultra Ethernet Consortium (UEC) approach could greatly help improve interoperability, performance and capabilities.”

“Today there are no standard, vendor-neutral data center networking solutions that focus on performance at scale for parallel applications,” said Addison Snell, CEO of Intersect360 Research. “Because the majority of data centers are Ethernet-based, having extensible solutions driven by UEC will make scalability more straightforward and accessible. The companies involved in UEC are capable of developing consistent Ethernet solutions that scale from single connections to the largest supercomputers and hyperscale data centers.”

“There has been an ongoing discussion, dare I say battle, over the best networking to use for infrastructure supporting the training and inference of large language models for generative AI,” said Karl Freund, founder and principal analyst at Cambrian-AI Research. “Some companies have been shifting to Ethernet-based networking, preferring its ease of installation and use. The UEC initiative will be a welcome addition to the AI community.”

UEC is a Joint Development Foundation project hosted by The Linux Foundation. The consortium said it will begin accepting applications for new members in Q4 2023. More information can be found at ultraethernet.org.