By Peter Carson, Lightmatter
The Next Great Leap in AI Performance
The AI revolution has been fueled by a decade-long, million-fold increase in computational capability, driven by enhanced processor efficiency, reduced precision computing, and multidimensional parallelism.
However, interconnect bandwidth has not kept pace, leaving powerful processors often waiting for data and hindering system efficiency. This disparity, illustrated in Figure 1, is a limiting factor in AI progress.
Training advanced AI models demands ever-larger scale-up domains, the high-bandwidth core of AI infrastructure. For instance, a leading AI accelerator platform typically uses 72 compute chips (XPU packages), each with two dies, electrically interconnected within a hyperscale data center rack at 7.2 Terabits per second (Tbps). Plans for denser, 144-XPU configurations, with four dies per package, are nearing the power and thermal limits of a single rack.
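As a back-of-envelope sanity check on these figures, the aggregate scale-up bandwidth of such a rack can be computed directly. This sketch reads the 7.2 Tbps figure as per-XPU bandwidth, which is an assumption; the text does not break the figure down further.

```python
# Rough aggregate scale-up bandwidth for the rack configuration quoted above.
# Assumption: the 7.2 Tbps figure is per XPU package.
xpus = 72
tbps_per_xpu = 7.2

aggregate_tbps = xpus * tbps_per_xpu  # total scale-up bandwidth in the rack
print(f"Aggregate scale-up bandwidth: {aggregate_tbps:.1f} Tbps")
# → Aggregate scale-up bandwidth: 518.4 Tbps
```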
The shift in AI development towards Mixture of Experts (MoE) models further exacerbates this challenge. In MoE models, “experts” or sub-networks are selectively activated by a router to process different input types, allowing for efficient scaling and specialization. As the number and complexity of experts in MoE models grow, often deployed on separate XPUs, scale-up domains are rapidly expanding and are expected to soon exceed the capabilities of electrical interconnects.
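The routing idea can be made concrete with a toy top-k router: each token is scored against every expert, and only the highest-scoring experts are activated. This is an illustrative sketch only; the names, shapes, and the use of a random matrix in place of learned weights are assumptions, not any vendor's implementation.

```python
import numpy as np

# Toy top-k Mixture-of-Experts router (illustrative sketch, not production code).
rng = np.random.default_rng(0)

num_experts, d_model, top_k = 8, 16, 2
router_weights = rng.normal(size=(d_model, num_experts))  # learned in a real model

def route(token_embedding):
    """Return indices and softmax weights of the top-k experts for one token."""
    logits = token_embedding @ router_weights          # one score per expert
    top = np.argsort(logits)[-top_k:]                  # pick the k best-scoring experts
    w = np.exp(logits[top] - logits[top].max())        # numerically stable softmax
    return top, w / w.sum()

experts, weights = route(rng.normal(size=d_model))
print(experts, weights)  # only 2 of the 8 experts are activated for this token
```

Because each expert may live on a separate XPU, every routed token's activations must cross the scale-up fabric, which is why growing expert counts translate directly into interconnect bandwidth demand.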
Re-Defining Scale-Up With Optics
The approximate one-meter reach limitation of passive electrical interconnects effectively restricts scale-up domains to a single rack. To overcome this obstacle, the industry is developing solutions that bring optical input/output (I/O) interfaces closer to XPU and switch silicon (referred to in this article as “ASICs”). Optical interconnects offer several advantages over copper, including increased bandwidth through wavelength-division multiplexing (WDM), superior signal integrity and bandwidth scaling free of EMI and crosstalk, longer reach (hundreds of meters or more) at low energy per bit, and enhanced resilience via optical circuit switching.
From Pluggables to Co-Packaged Optics

Figure 2: Evolution of Optical Interconnects
Inside the data center, pluggable optical transceivers primarily connect switches from rack-to-rack across multiple network tiers in the scale-out domain. This signal path traverses multiple hops and electrical-optical conversions, resulting in higher latency and lower bandwidth and power efficiency than required within the scale-up domain.
Through advances in nanophotonics, electro-optical integration and packaging, silicon photonics has matured, moving beyond pluggables into higher-performance applications like co-packaged optics (CPO). Laser disaggregation has been crucial for enabling in-package optics, also known as the optical engine (OE), by isolating the light source from the heat generated by XPU and switch ASICs.
Figure 2 illustrates each generational advancement, based on the nature of the OE’s photonic-electronic interface (from 2D planar to 3D stacked) and its proximity to the ASIC – factors that, together, improve bandwidth, area and power efficiency.
To clarify the key distinctions among current in-package OE alternatives, the following comparison provides detailed descriptions, aiming to cut through the inherent complexity of silicon photonics integration and packaging technology and the often ambiguous terminology surrounding these systems.

As illustrated in Figure 3, 2D co-packaged optics (CPO) solutions typically comprise a photonic integrated circuit (PIC) and an electronic integrated circuit (EIC) bonded together into a single, pre-packaged optical engine (OE) module. To accommodate multiple OE modules, each is fanned out up to tens of millimeters from the ASIC on the substrate, consuming significant package area. The ASIC and OE communicate across the substrate via SerDes, which drives the optical modulator (depicted by the pink dot). However, each module's optical and electrical I/O interfaces are placed along the chip edge, or "shoreline," limiting its bi-directional (bidi) bandwidth to around 6 Tbps. Furthermore, heat generated by the EIC's SerDes beneath the PIC presents significant thermal management challenges.

As shown in Figure 4, 2D optical chiplets form the OE by integrating into the PIC both UCIe macros, which provide die-to-die (D2D) connectivity, and SerDes macros, which modulate the optics. Since no separate OE package or fan-out is needed, the chiplet can be mounted closer to the ASIC on its organic substrate, saving area compared to the 2D CPO approach. However, because their I/O interfaces remain shoreline-bound, Gen 3 optics do not offer meaningful bandwidth improvements over 2D CPO. In addition, the monolithic integration of CMOS blocks (UCIe and SerDes) with the photonic devices in larger manufacturing process nodes (e.g., 45 nm) is suboptimal in area and power efficiency and limits bandwidth scalability.
Re-Thinking I/O Design with 3D Photonics
Shoreline limitations will leave conventional optical interconnects short of 2027-2028 XPU scale-up bandwidth requirements, which are expected to exceed 50 Tbps to keep pace with High Bandwidth Memory (HBM) bandwidth growth. A hallmark of 3D photonics is its "edgeless I/O," which enables electrical SerDes signals to interface with photonics virtually anywhere on the chip surface. This unlocks significantly higher bandwidth and radix (physical connection density) per millimeter of chip edge, in contrast to the traditional shoreline-bound solutions depicted in Figure 5.

Another key benefit of this 3D architecture is that the SerDes signals directly drive the optics over the shortest possible distance (tens of microns), enabling the lowest I/O latency and power consumption, at less than 3 picojoules per bit (pJ/bit).
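To put the efficiency figure in context, a quick calculation shows what a large I/O budget would dissipate at this energy per bit. The 3 pJ/bit bound comes from the text; using it against a 100 Tbps bandwidth (the Gen 4 in-package figure cited later in the article) is this sketch's own pairing.

```python
# I/O power dissipation at a given energy-per-bit.
# 3 pJ/bit is the upper bound quoted in the text; pairing it with 100 Tbps
# of in-package bandwidth is an assumption made for illustration.
energy_pj_per_bit = 3.0
bandwidth_tbps = 100.0

bits_per_second = bandwidth_tbps * 1e12
power_watts = bits_per_second * energy_pj_per_bit * 1e-12
print(f"{power_watts:.0f} W for {bandwidth_tbps:.0f} Tbps at {energy_pj_per_bit} pJ/bit")
# → 300 W for 100 Tbps at 3.0 pJ/bit
```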
Realization of these advantages requires much more than advanced integration and packaging techniques. It takes substantial investment and innovation across nanophotonic components, electro-optical integration, WDM of up to 16 λ on a single fiber, dense laser arrays with sufficient power for the optical bandwidth, and optimization of power efficiency, thermal management, and software control.
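One way to see why 16 λ WDM matters is to count the fibers needed to escape a given package bandwidth. The wavelength count comes from the text; the per-wavelength line rate below is a hypothetical assumption, since the article does not specify one.

```python
import math

# Fiber count needed to escape a package at a target bandwidth.
# 16 wavelengths per fiber is from the text; the 112 Gbps per-wavelength
# line rate is a hypothetical value chosen for illustration.
wavelengths_per_fiber = 16
gbps_per_wavelength = 112

fiber_tbps = wavelengths_per_fiber * gbps_per_wavelength / 1000
fibers_needed = math.ceil(100 / fiber_tbps)  # for 100 Tbps of escape bandwidth
print(f"{fiber_tbps:.3f} Tbps per fiber -> {fibers_needed} fibers for 100 Tbps")
# → 1.792 Tbps per fiber -> 56 fibers for 100 Tbps
```

Under these assumptions, multiplexing 16 wavelengths onto each fiber cuts the required fiber count sixteen-fold versus one wavelength per fiber, which is what makes 100+ Tbps of package escape practical.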

Fourth-generation (Gen 4) photonics implementations include 3D CPO and active 3D photonic interposer (3D Interposer). Unlike conventional interposers, an active 3D interposer enables signal regeneration and reconfiguration within its internal waveguide network. In both Gen 4 solutions, electronic dies are 3D-stacked on the PIC, providing up to 100+ Tbps of in-package bi-directional (bidi) connectivity (i.e., Tx+Rx). The key distinctions between these Gen 4 approaches are as follows:
Figure 6 illustrates 3D CPO, where the OE comprises an EIC 3D-stacked on a PIC. Through-Silicon Vias (TSVs) in the PIC handle power and electro-optical communications. The OE can integrate with the ASIC using standard or advanced packaging technology. For the latter, the ASIC is mounted on its own silicon interposer, connecting to the PIC via a D2D interface. Microring optical modulators (MRMs) in the PIC (shown as pink dots) are directly driven by SerDes signals from the EIC. This architecture removes shoreline constraints, allowing SerDes to be placed anywhere on the EIC, dramatically increasing bidi bandwidth to tens of Tbps and freeing up shoreline for higher radix (fiber port density). Sixteen-wavelength (16 λ) bi-directional WDM enables massive package bandwidth escape, and optical circuit switching in the PIC enables networking flexibility and resilience.
A 3D Interposer, depicted in Figure 7, consists of multiple reticle-sized PICs, with optically-stitched waveguides and TSVs for power and electro-optical communications. Rather than placing ASICs beside the PIC, all electronic dies in the package are 3D-stacked on top of the PIC. This enables SerDes interfaces virtually anywhere on the PIC surface tens of microns above the MRMs (pink dots), eliminating the need for a dedicated D2D or I/O chiplet between the PIC and ASIC. The result is support for massive die complexes and over 100 Tbps of bidi bandwidth, addressing the most area- and power-efficient implementation for advanced XPUs and switches. A reconfigurable waveguide network allows for rerouting of traffic, for example, when an individual XPU chip fails.
Charting the New Interconnect Landscape
Major advances in optical connectivity, culminating with 3D photonics, have produced unprecedented bandwidth gains and other enhancements, such as optical circuit switching. Table 1 summarizes the key attributes of current photonic interconnects across the PIC categories, delineated by OE integration approach, independent of ASIC integration options.

With bandwidths of tens to 100+ Tbps spanning millimeter to kilometer distances, 3D photonics offers a unified scale-up fabric within the XPU package, between XPUs, and across racks. This new era in bandwidth scaling promises to close the gap between compute and connectivity performance. The resulting expansion of scale-up clusters to thousands of nodes will dramatically accelerate AI model training, enabling the next exponential leap in compute performance and efficiency that will power the future of artificial intelligence.

Peter Carson is Director, Product Marketing, Lightmatter, a company that merges photonics and computing.
[1] Sources: Epoch AI, 2025; Visual Capitalist, 2025; Lightmatter analysis of publicly available compute performance and interconnect bandwidth data for vendor products, 2025.
[2] Used as D2D interface between ASIC and EIC and to modulate optics in PIC.
[3] No dedicated D2D required. ASIC’s SerDes can be used to directly modulate optics in PIC.
[4] OE-ASIC integration via substrate is 2D-like. 2.5D packaging used for ASIC-memory die integration.



