Chiplet Interconnect Standard

UCIe – Universal Chiplet Interconnect Express

UCIe is the open industry standard for die-to-die connectivity inside a package. Published in 2022 by a consortium including Intel, AMD, Arm, Qualcomm, Samsung, and TSMC, it defines the physical bumps, electrical signaling, and upper-layer protocol handshake that allow chiplets from different vendors to interoperate — the PCIe moment for the chiplet era.

Published March 2022 · 30+ consortium members · Up to 94 GB/s/mm (Advanced Package) · PCIe 6 + CXL 3 over D2D
Motivation

Why Chiplets? The End of the Monolithic Die

Moore's Law scaling costs are soaring, and yield loss on large dies makes monolithic integration increasingly impractical for complex SoCs.

Yield Problem
Yield drops roughly exponentially with die area. An 800 mm² monolithic GPU may yield 60%, but splitting it into two 400 mm² dies connected by UCIe can push combined yield above 85% — delivering far better wafer economics. A worked yield example appears at the end of this section.
Process Node Mismatch
Compute cores benefit from leading-edge nodes (2 nm, 3 nm), but analog, SerDes, and memory controllers do not. Chiplets allow each function to use the optimal process — mix N3 logic with N16 analog, no compromise needed.
Reuse & Time-to-Market
A validated SRAM or PHY chiplet can be reused across multiple products. With UCIe standardization, a chiplet from IP vendor A can plug into a package designed by SoC vendor B without custom interface design — analogous to PCIe plug-and-play.
Bandwidth Wall
Off-package bandwidth through a PCIe-attached interface tops out around ~130 GB/s (PCIe 5.0 x16). On-package chiplet interconnects via UCIe achieve hundreds of GB/s at roughly a picojoule per bit — critical for AI accelerators that demand memory bandwidth in the terabytes-per-second range.
Real-world adoption: AMD's EPYC "Genoa" uses chiplets (CCDs + IOD) connected by internal Infinity Fabric. Intel's Ponte Vecchio GPU uses 47 chiplets with EMIB and Foveros. UCIe standardizes the interface so future chiplets from any vendor can interoperate.
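The wafer-economics argument can be made concrete with the simple Poisson defect model Y = exp(−A·D0). The sketch below is illustrative only: the defect density D0 is an assumed value chosen so the monolithic die lands near the 60% figure quoted above, not a foundry number.

```python
# Hypothetical yield comparison using the Poisson defect model Y = exp(-A * D0).
# D0 is an assumed defect density, not data from any foundry or from the UCIe spec.
import math

def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies expected to be defect-free under the Poisson model."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)

D0 = 0.064  # assumed defects per cm^2, picked so an 800 mm^2 die lands near 60%

monolithic  = poisson_yield(800, D0)  # the whole 800 mm^2 die must be defect-free
per_chiplet = poisson_yield(400, D0)  # each 400 mm^2 chiplet yields independently
print(f"800 mm2 monolithic die yield : {monolithic:.0%}")  # ~60%
print(f"400 mm2 chiplet die yield    : {per_chiplet:.0%}")  # ~77%
# Chiplets are tested before assembly (known-good-die), so a defect scraps one
# small die rather than an entire 800 mm^2 die, improving cost per good mm^2.
```

The exact crossover depends on defect density, test cost, and assembly yield; the point is that splitting a large die moves it down the exponential.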
Key Numbers

UCIe at a Glance

• 2022 — UCIe 1.0 published
• 30+ — consortium members
• 16 GB/s/mm — bandwidth density, Standard Package
• 94 GB/s/mm — bandwidth density, Advanced Package
• 3 — supported protocol mappings (PCIe / CXL / Streaming)
• <1 pJ/bit — energy target, Advanced Package
System View

Chiplet Package Architecture

Multiple dies sit side-by-side on an interposer or organic substrate; UCIe links bridge the die-to-die gaps entirely within a single IC package.


Fig 1 — Three chiplets on a common package substrate, connected by UCIe die-to-die links. Each chiplet can be manufactured on a different process node.

Architecture

UCIe 3-Layer Stack

UCIe mirrors the layered philosophy of PCIe — each layer has a well-defined responsibility and a standardized interface to the layer above and below.


Fig 2 — UCIe 3-layer stack. FDI (Flit-aware DIE Interface) separates the Protocol and D2D Adapter layers; RDI (Raw DIE Interface) separates the D2D Adapter and Physical layers. These interfaces enable IP from different vendors to interoperate.

Protocol Layer
Hosts the upper-level protocol: PCIe 5.0/6.0, CXL 2.0/3.0, or a raw Streaming interface. Responsible for generating and terminating protocol packets (TLPs for PCIe, flits for CXL/PCIe 6.0). This layer is protocol-aware and talks to the D2D Adapter via the FDI.
Die-to-Die (D2D) Adapter
The intelligence of the UCIe stack. Handles link training and initialization, lane scrambling (PRBS-based), optional FEC (Reed-Solomon), retiming for clock domain crossing, cyclic redundancy check, and credit-based flow control between dies. Connects to PHY via RDI.
Physical Layer
The bump interface and analog signaling circuitry. Defines bump pitch (25 µm standard, ≤10 µm advanced), differential AC-coupled signaling, forwarded clock distribution, and the bump map layout. The PHY is the only layer that differs between Standard and Advanced packaging.
FDI & RDI Interfaces
FDI (Flit-aware DIE Interface) is the logical boundary between Protocol and D2D Adapter — passes flits and link management signals. RDI (Raw DIE Interface) is the boundary between D2D Adapter and Physical Layer. Both are standardized, enabling separate sourcing of protocol IP and PHY IP.
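As a mental model of the layering, the toy sketch below passes one flit from a Protocol Layer through a D2D Adapter to a PHY, with the FDI and RDI boundaries marked at the method calls. All class and method names here are invented for illustration; the real FDI and RDI are signal-level interfaces defined by the specification, not software APIs.

```python
# Toy model of the 3-layer split. Names are invented for this sketch only;
# FDI and RDI are hardware interfaces in the UCIe spec, not Python methods.
from dataclasses import dataclass

@dataclass
class Flit:
    payload: bytes          # protocol content, e.g. a CXL or PCIe 6 flit
    crc: int | None = None  # integrity check added by the D2D Adapter

class PhysicalLayer:
    """Drives the bump array (here it just reports what would be transmitted)."""
    def rdi_tx(self, symbols: bytes) -> None:
        print(f"PHY: driving {len(symbols)} bytes onto the bump array")

class D2DAdapter:
    """Adds a CRC (toy stand-in) and forwards raw symbols to the PHY over RDI."""
    def __init__(self, phy: PhysicalLayer):
        self.phy = phy
    def fdi_tx(self, flit: Flit) -> None:          # FDI boundary (from Protocol Layer)
        flit.crc = sum(flit.payload) & 0xFFFF
        self.phy.rdi_tx(flit.payload + flit.crc.to_bytes(2, "little"))  # RDI boundary

class ProtocolLayer:
    """Generates protocol flits (PCIe 6 / CXL 3 / Streaming) and hands them to FDI."""
    def __init__(self, adapter: D2DAdapter):
        self.adapter = adapter
    def send(self, payload: bytes) -> None:
        self.adapter.fdi_tx(Flit(payload))

ProtocolLayer(D2DAdapter(PhysicalLayer())).send(b"example CXL.mem flit")
```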
Packaging

Standard Package vs Advanced Package

UCIe defines two physical packaging tiers. The bump pitch dictates bandwidth density and determines which packaging technology is required; a rough shoreline-bandwidth calculation follows the side-by-side comparison below.


Fig 3 — Standard Package (25 µm bump pitch) vs Advanced Package (≤10 µm). Smaller pitch means more bumps per mm, yielding ~6× higher bandwidth density. Advanced packages require silicon interposers, EMIB, or hybrid bonding technology.

Standard Package
Conventional organic substrate or leadframe
  • Bump pitch: 25 µm
  • Bandwidth density: up to 16 GB/s/mm
  • Max data rate: 16 GT/s per bump
  • Packaging: FCBGA, organic substrates
  • Cost: lower — uses mature packaging infrastructure
  • Use cases: chiplets with moderate bandwidth needs, mainstream SoCs
Advanced Package
Silicon interposer, EMIB, or hybrid bonding
  • Bump pitch: ≤10 µm (hybrid bonding: ~1 µm)
  • Bandwidth density: up to 94 GB/s/mm
  • Max data rate: 32 GT/s per bump
  • Packaging: 2.5D Si interposer, Intel EMIB, TSMC SoIC
  • Cost: higher — requires advanced foundry packaging
  • Use cases: AI/HPC chiplets, GPU stacking, CPU + memory on package
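To put the two density figures in context, the sketch below converts GB/s per mm of die edge ("shoreline") into aggregate link bandwidth, using the density numbers quoted above; the 5 mm edge length is an arbitrary example, not a UCIe requirement.

```python
# Aggregate die-to-die bandwidth available along a given length of die edge,
# using the bandwidth-density figures quoted above. The 5 mm shoreline is an
# arbitrary illustrative value, not something mandated by UCIe.
DENSITY_GBPS_PER_MM = {"Standard Package": 16.0, "Advanced Package": 94.0}

def shoreline_bandwidth_gbps(package: str, edge_mm: float) -> float:
    return DENSITY_GBPS_PER_MM[package] * edge_mm

for pkg in DENSITY_GBPS_PER_MM:
    print(f"{pkg}: 5 mm of die edge -> ~{shoreline_bandwidth_gbps(pkg, 5.0):.0f} GB/s")
# 94 / 16 ~= 5.9, the "~6x higher" density figure cited for Advanced Package.
```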
Protocol Support

Supported Upper-Layer Protocols

UCIe's Protocol Layer carries existing, well-defined protocols rather than defining a new one — it reuses PCIe and CXL to minimize adoption friction.

PCIe 5.0 / 6.0
The industry's universal I/O protocol. PCIe 5 uses 128b/130b encoding at 32 GT/s; PCIe 6 uses PAM4 signaling with flit mode at 64 GT/s. Over UCIe, PCIe traffic traverses a die-to-die link instead of a slot connector — same software stack, new physical medium (per-lane line-rate arithmetic is sketched after the key insight below).
CXL 2.0 / 3.0
Compute Express Link for cache-coherent CPU–accelerator communication. CXL.cache, CXL.mem, and CXL.io run on top of PCIe PHY. Over UCIe, AI accelerators or memory expanders can attach coherently to the CPU chiplet on the same package.
Streaming Interface
A raw, low-latency, vendor-defined protocol channel. Allows proprietary fabric (AXI streaming, Infinity Fabric, NVLink-like) to traverse a UCIe physical link. Enables custom chiplet topologies while still using standardized packaging and PHY.
Key insight: UCIe does not invent a new protocol. It wraps existing protocols (PCIe, CXL) in a standardized die-to-die physical layer. This means existing PCIe and CXL software stacks work unchanged — only the physical transport changes from a PCIe slot to a bump array on the same package.
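The per-lane line rates behind the two PCIe generations named above work out as follows. This is raw signaling arithmetic using the published encoding efficiencies; it ignores flit, header, and flow-control overheads, which reduce delivered throughput further.

```python
# Per-lane line-rate arithmetic for PCIe 5.0 vs PCIe 6.0 as carried over UCIe.
# Only the line encoding is accounted for; protocol overheads are ignored.
def lane_gbytes_per_s(gt_per_s: float, encoding_efficiency: float) -> float:
    return gt_per_s * encoding_efficiency / 8.0

gen5 = lane_gbytes_per_s(32, 128 / 130)  # PCIe 5.0: NRZ, 128b/130b line encoding
gen6 = lane_gbytes_per_s(64, 1.0)        # PCIe 6.0: PAM4, flit mode (no line encoding)
print(f"PCIe 5.0 lane: ~{gen5:.2f} GB/s")   # ~3.9 GB/s
print(f"PCIe 6.0 lane: ~{gen6:.2f} GB/s")   # ~8.0 GB/s
# Over UCIe the same flits cross bump lanes instead of a slot, so this math is
# unchanged; only the physical transport differs.
```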
Performance

Bandwidth & Signaling Specs

Parameter | Standard Package | Advanced Package
Bump Pitch | 25 µm | ≤10 µm
Max Data Rate per Bump | 16 GT/s | 32 GT/s
Bandwidth Density | ~16 GB/s/mm | ~94 GB/s/mm
Signaling | Differential, AC-coupled | Differential, AC-coupled / DC (hybrid bonding)
Clock | Forwarded clock per module | Forwarded clock per module
Packaging Technology | Organic substrate, FCBGA | Silicon interposer, EMIB, SoIC, Foveros
Energy Efficiency | ~2 pJ/bit typical | <1 pJ/bit target
FEC | Optional (Reed-Solomon) | Optional (Reed-Solomon)
Link Width | 64-bit module × N modules | 64-bit module × N modules
Latency | ~2–4 ns (PHY + D2D) | ~1–2 ns (PHY + D2D)
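The energy-efficiency row translates directly into link power at a given throughput. The sketch below uses the pJ/bit figures from the table; the 500 GB/s operating point is an arbitrary example, not a spec value.

```python
# Link power implied by the pJ/bit figures in the table above. The 500 GB/s
# throughput is an arbitrary example operating point, not a UCIe spec number.
def link_power_watts(bandwidth_gbytes_per_s: float, pj_per_bit: float) -> float:
    bits_per_second = bandwidth_gbytes_per_s * 1e9 * 8
    return bits_per_second * pj_per_bit * 1e-12

for label, pj in (("Standard (~2 pJ/bit)", 2.0), ("Advanced (<1 pJ/bit target)", 1.0)):
    print(f"{label}: 500 GB/s costs ~{link_power_watts(500, pj):.0f} W")
# Roughly 8 W vs 4 W at this rate: why sub-pJ/bit matters for AI packages that
# move terabytes per second between chiplets.
```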
Comparison

UCIe vs Other Die-to-Die Standards

Standard | Organization | Open? | Protocol Layer | Max BW Density | Status
UCIe 1.0 | UCIe Consortium | Open | PCIe 5/6, CXL 2/3, Streaming | 94 GB/s/mm | Published 2022
Intel AIB | Intel (Open Domain-Specific Architecture) | Partially open | Vendor-defined | ~2 TB/s/mm² (areal) | ODSA-licensed
BoW (Bunch of Wires) | Open Compute Project | Open | None (raw parallel) | ~128 GB/s/mm | Niche adoption
HBM (High Bandwidth Memory) | JEDEC | JEDEC standard | Memory-only | ~1 TB/s per stack | Widely deployed
NVLink-C2C | NVIDIA | Proprietary | NVLink | ~900 GB/s total | NVIDIA only
Infinity Fabric (IF) | AMD | Proprietary | AMD-defined | ~500 GB/s internal | AMD only
Initialization

UCIe Link Training Sequence

Before data can flow, the D2D Adapter performs a structured link initialization handshake — similar in spirit to PCIe LTSSM but optimized for the on-package environment.

Step | State | What Happens
1 | Reset | Both sides hold the PHY in reset; bump drivers inactive.
2 | Detect | Electrical detect — verifies receiver termination is present on the bump pins.
3 | Initialize | Clock forwarding starts; D2D Adapters lock to the forwarded clock.
4 | Lane Repair | Optional: identify defective bump lanes (due to packaging defects) and remap around them. Critical for advanced-packaging yield.
5 | Data Calibration | Eye-diagram scan; per-lane DFE/CTLE adjustment; PRBS bit-error-rate check.
6 | Link Up | Scrambling enabled; FEC (if used) activated; RDI signals to the D2D Adapter that the PHY is ready.
7 | Protocol Active | D2D Adapter signals FDI-ready to the Protocol Layer; PCIe/CXL configuration-space enumeration begins over the UCIe link.
Lane Repair is a UCIe feature aimed at advanced packaging. Because bumps at 10 µm pitch can have manufacturing defects, the D2D Adapter can remap around a small number of failed bumps during link training — improving assembly yield without reworking or scrapping the package. A minimal state-machine sketch of the training flow follows.
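The sequence reads naturally as a small state machine. The sketch below mirrors the states in the table, but the pass/fail checks are simplified stand-ins, not the spec's actual exit conditions.

```python
# Simplified walk through the link-training table above. State names follow the
# table; the checks are illustrative stand-ins for the real training conditions.
from enum import Enum, auto

class LinkState(Enum):
    RESET = auto(); DETECT = auto(); INITIALIZE = auto(); LANE_REPAIR = auto()
    CALIBRATION = auto(); LINK_UP = auto(); PROTOCOL_ACTIVE = auto()

def train_link(receiver_present: bool, bad_lanes: set[int], spare_lanes: int) -> LinkState:
    # Steps 1-2: leave Reset, perform electrical detect of receiver termination.
    if not receiver_present:
        return LinkState.DETECT          # stuck: no partner detected on the bumps
    # Step 3: clock forwarding starts; both D2D Adapters lock to the forwarded clock.
    # Step 4: Lane Repair, remap defective bump lanes onto spares if possible.
    if len(bad_lanes) > spare_lanes:
        return LinkState.LANE_REPAIR     # stuck: more failures than spare lanes
    # Steps 5-6: eye scan / PRBS BER check, then scrambling and (optional) FEC on.
    # Step 7: FDI-ready is raised to the Protocol Layer; enumeration can begin.
    return LinkState.PROTOCOL_ACTIVE

print(train_link(receiver_present=True, bad_lanes={13}, spare_lanes=2))
```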
Ecosystem

UCIe Consortium & Real-World Adoption

Intel
Founding member and lead contributor. Intel's EMIB (Embedded Multi-die Interconnect Bridge) and Foveros 3D stacking are UCIe-compatible packaging technologies. Ponte Vecchio GPU (Xe-HPC) uses 47 chiplets. Intel Meteor Lake (2023) is the first Intel consumer SoC with a chiplet architecture.
AMD
Founding member. AMD's EPYC "Genoa" and "Bergamo" CPUs already use chiplet architecture (CCDs + IOD) connected by Infinity Fabric. Future products are expected to migrate the inter-chiplet interface toward UCIe for multi-vendor compatibility.
Arm
Founding member. Arm is defining UCIe-compatible interfaces for future Arm Neoverse compute chiplets and the Arm Total Design ecosystem. The goal is to allow semiconductor companies to build Arm-based SoCs from pre-validated UCIe chiplets.
Qualcomm
Founding member. Qualcomm is exploring chiplet-based Snapdragon designs that use UCIe to reduce time-to-market and cost — for example, server-class Oryon CPU chiplets and modem chiplets mixed at the package level.
TSMC
Founding member providing the packaging technology. TSMC's SoIC (System on Integrated Chips) platform using chip-on-wafer bonding is a key advanced-package technology for UCIe, enabling sub-10 µm bump pitch for maximum bandwidth density.
Samsung, Google, Meta, Microsoft
All founding/early members. Cloud hyperscalers (Google, Meta, Microsoft) are driving UCIe adoption for custom AI accelerator chiplets — they can source compute chiplets from one vendor and I/O chiplets from another, assembled into a single package.
FAQ

Frequently Asked Questions

What is UCIe and why does it matter?
UCIe (Universal Chiplet Interconnect Express) is an open standard published in March 2022 by a consortium of 30+ companies including Intel, AMD, Arm, Qualcomm, Samsung, and TSMC. It defines a standardized die-to-die interface so chiplets from different vendors and foundries can interoperate on the same package. It matters because it enables a chiplet marketplace — just as PCIe allowed any vendor's GPU to plug into any PC, UCIe allows any vendor's compute chiplet to connect to any I/O chiplet, reducing design cost and time-to-market.

What are the three layers of the UCIe stack?
1. Protocol Layer — hosts PCIe 5/6, CXL 2/3, or a Streaming protocol; this layer is protocol-aware and generates/terminates packets.
2. Die-to-Die (D2D) Adapter — handles link training, scrambling, optional FEC, retiming, and flow control.
3. Physical Layer — manages the bump array, differential AC-coupled signaling, and the forwarded clock.
The FDI interface sits between the Protocol Layer and the D2D Adapter; the RDI interface sits between the D2D Adapter and the PHY.

When should I choose Standard vs Advanced Package?
Choose Standard Package (25 µm pitch, up to 16 GB/s/mm) if your die-to-die bandwidth requirement is modest and you want conventional organic-substrate packaging with mature supply chains. Choose Advanced Package (≤10 µm pitch, up to 94 GB/s/mm) if you need the highest possible bandwidth density — typically for AI/HPC accelerators, on-package DRAM-like memory chiplets, or CPU core clusters that need near-HBM bandwidth without an external memory slot. Advanced packaging requires silicon interposers (TSMC CoWoS), Intel EMIB, or hybrid bonding (TSMC SoIC), which adds cost and process complexity.

Does UCIe replace PCIe or CXL?
No. UCIe is a physical die-to-die transport, not a protocol. It carries PCIe and CXL traffic on top of its physical layer. Think of UCIe as defining the cable and connector, while PCIe/CXL are the protocols spoken over that cable. PCIe continues to be used for chip-to-board connections (slots, M.2, etc.); UCIe extends PCIe semantics to chip-to-chip communication within a package.

What is Lane Repair and why does it matter?
Lane Repair is a UCIe link-training feature that allows the D2D Adapter to identify defective bump lanes (caused by micro-bump defects at the tight pitches of advanced packaging) and remap around the faults during initialization. This is critical for advanced packaging at 10 µm pitch, where individual bump yield is a real concern. By tolerating a small number of defective bumps, Lane Repair significantly improves overall chiplet-assembly yield; a toy remapping example follows.
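Conceptually, the repair shifts logical lanes onto spare physical bumps past the defects. The sketch below is illustrative only; the 68-bump/64-lane split and the remap scheme are assumptions for the example, not the spec's actual spare-lane budget or muxing.

```python
# Toy lane-repair remap: logical lanes skip defective physical bumps and land on
# spares. The 68-bump / 64-lane split is an illustrative assumption, not a spec value.
def repair_map(physical_bumps: int, logical_lanes: int,
               defective: set[int]) -> dict[int, int] | None:
    """Return a logical-lane -> physical-bump mapping, or None if unrepairable."""
    good = [p for p in range(physical_bumps) if p not in defective]
    if len(good) < logical_lanes:
        return None                      # more failed bumps than spares: no link
    return {lane: good[lane] for lane in range(logical_lanes)}

mapping = repair_map(physical_bumps=68, logical_lanes=64, defective={5, 40})
print("logical lane 5 now rides physical bump", mapping[5])  # -> 6, shifted past the defect
```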

How does UCIe relate to HBM?
HBM is a stacked-DRAM standard (JEDEC) with a very wide parallel interface (1024 bits per stack) designed specifically for high-bandwidth memory access. UCIe is a general-purpose die-to-die interface that can carry any protocol, including PCIe and CXL. They are complementary: an AI accelerator chiplet might connect to its compute partner via UCIe (carrying CXL) and to HBM memory dies via the HBM interface simultaneously — different interfaces for different roles on the same package.

What is FDI?
FDI stands for Flit-aware DIE Interface. It is the standardized logical boundary between the Protocol Layer and the Die-to-Die Adapter in UCIe. FDI carries protocol flits (fixed-size data units used by PCIe 6.0 and CXL 3.0) and link-management control signals between the two layers. Because FDI is standardized, a company can independently source a PCIe 6 Protocol Layer IP core from one vendor and a UCIe D2D Adapter IP from another, and they will interoperate via FDI without custom integration work.