Question 1

What is the CCIX protocol?

Accepted Answer

CCIX (Cache Coherent Interconnect for Accelerators) is an open industry standard that extends PCIe with cache coherency between host CPUs and accelerators (GPUs, FPGAs, ML accelerators). It lets a CPU and an accelerator share one cache coherency domain, so both see a consistent view of memory without software-managed cache flushing. CCIX runs over the standard PCIe physical layer (Gen3/Gen4) and carries its coherency messages in an extended Transaction Layer Packet (TLP) format.

Question 2

What is the difference between CCIX and CXL?

Accepted Answer

Both CCIX and CXL (Compute Express Link) add cache coherency on top of PCIe, but they differ in origin and adoption. CCIX was developed by an industry consortium (members include ARM, AMD, Xilinx, and Qualcomm) and predates CXL. CXL started as an Intel-led effort and is now governed by the broader CXL Consortium; it defines three sub-protocols (CXL.io, CXL.cache, CXL.mem) and has become the dominant standard with stronger industry backing. As of 2024, CXL has largely superseded CCIX for new designs, though CCIX deployments still exist in ARM-based servers and FPGA accelerators.

Question 3

How does CCIX achieve cache coherency?

Accepted Answer

CCIX uses a snoop-based coherency protocol. When an accelerator reads a cache line, it sends a read request to the host. The host's Home Node (HN) snoops the CPU cache to check for a dirty copy, forces a writeback if one exists, and then supplies the data to the accelerator. For writes, the accelerator first acquires exclusive ownership with a ReadUnique or MakeUnique transaction, which invalidates all other copies. The protocol supports the MOESI states (Modified, Owned, Exclusive, Shared, Invalid).
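The read and write paths described above can be sketched as a toy model of a Home Node tracking a single cache line. This is a simplified illustration rather than the CCIX wire protocol: the `HomeNode` class, its method names, and the agent names are hypothetical, the Owned state is left out, and a dirty holder is simply written back and downgraded to Shared.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


class HomeNode:
    """Toy point of coherency for one cache line (hypothetical model)."""

    def __init__(self, memory_value):
        self.memory = memory_value
        self.caches = {}  # agent name -> (State, cached value)

    def state(self, agent):
        return self.caches.get(agent, (State.I, None))[0]

    def read_shared(self, requester):
        """Coherent read: snoop the other caches, then supply the data."""
        if self.state(requester) is not State.I:
            return self.caches[requester][1]  # hit in the requester's own cache
        sharers = False
        for agent, (st, value) in list(self.caches.items()):
            if st is State.I:
                continue
            if st is State.M:
                self.memory = value  # the snoop forces a writeback of dirty data
            self.caches[agent] = (State.S, self.memory)
            sharers = True
        self.caches[requester] = (State.S if sharers else State.E, self.memory)
        return self.memory

    def read_unique(self, requester):
        """Read for ownership: invalidate every other copy."""
        for agent, (st, value) in list(self.caches.items()):
            if st is State.M:
                self.memory = value  # write back dirty data before handing over
            if agent != requester:
                self.caches[agent] = (State.I, None)
        self.caches[requester] = (State.E, self.memory)
        return self.memory

    def local_write(self, agent, value):
        """A write is only legal while the agent holds the line exclusively."""
        if self.state(agent) not in (State.M, State.E):
            raise PermissionError(f"{agent} must acquire ownership first")
        self.caches[agent] = (State.M, value)


hn = HomeNode(memory_value=10)
hn.read_unique("cpu")
hn.local_write("cpu", 42)       # CPU holds a dirty copy; memory is stale
data = hn.read_shared("accel")  # HN snoops the CPU and forces the writeback
```

After the last call, the accelerator sees the CPU's latest value and both agents hold the line in the Shared state, which is the behaviour the answer describes for a coherent read.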

Question 4

What are the main CCIX transaction types?

Accepted Answer

CCIX transaction types mirror the ARM CHI protocol: ReadNoSnoop (non-coherent read), ReadOnce (transient coherent read), ReadUnique (read for ownership, with write intent), MakeUnique (acquire ownership without data), WriteNoSnoop (non-coherent write), WriteUnique (coherent write), Evict (evict a clean line), and Snoop (HN-initiated snoop to an RN cache). Here the Request Node (RN) is the accelerator and the Home Node (HN) is the CPU/SoC system memory controller.

Question 5

What chips use CCIX?

Accepted Answer

CCIX was adopted by AMD EPYC 'Rome' and 'Milan' CPUs (CCIX-capable PCIe slots), Xilinx Alveo FPGAs (U280, U250), Marvell ThunderX2 ARM server CPUs, Ampere Altra ARM server CPUs, Arm Neoverse-based SoCs, and various AI accelerator ASICs. In practice, many of these platforms have since shifted toward CXL for new-generation designs.

Property	CCIX	CXL	CAPI (IBM)
Founded	2016, CCIX Consortium	2019, Intel-led	2013, IBM
Physical layer	PCIe Gen3/Gen4	PCIe Gen5/Gen6	PCIe Gen3+
Protocol basis	ARM CHI-like	CXL.cache, CXL.mem, CXL.io	IBM POWER architecture
Coherency model	Full MOESI snoop-based	Full coherency (CXL.cache)	Full coherency
Memory semantic	Host memory + device memory	CXL.mem for device memory	Host memory
Industry status (2026)	Legacy / limited new adoption	Dominant standard	IBM Power-only
Key adopters	AMD EPYC, Xilinx FPGAs, ARM servers	Intel, AMD, NVIDIA, Samsung, all major SoC	IBM Power servers

Node Type	Role	Example
RN-F (Request Node - Full)	Fully coherent agent — has a cache, participates in snoops	GPU with cache, FPGA compute engine
RN-I (Request Node - I/O)	Non-caching agent: issues reads and writes but holds no cache, so it is never snooped	DMA engine, I/O device
HN-F (Home Node - Full)	Point of coherency: receives requests, issues snoops, orders transactions	CPU LLC controller, SoC interconnect (NoC)
HN-I (Home Node - I/O)	Manages I/O address space, non-coherent	Peripheral fabric controller
SN (Slave Node)	DRAM controller — serves data to HN	DDR5 memory controller
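
The node roles in the table above can be captured as a small lookup. The sketch below is illustrative only: the `NodeRole` dataclass, its field names, and the instance names (`gpu0`, `dma0`, and so on) are assumptions made for this example, not identifiers from the CCIX or CHI specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRole:
    issues_requests: bool     # can start read/write transactions
    has_coherent_cache: bool  # holds cache lines, so it can be snooped
    is_home: bool             # receives requests and orders them


ROLES = {
    "RN-F": NodeRole(issues_requests=True, has_coherent_cache=True, is_home=False),
    "RN-I": NodeRole(issues_requests=True, has_coherent_cache=False, is_home=False),
    "HN-F": NodeRole(issues_requests=False, has_coherent_cache=False, is_home=True),
    "HN-I": NodeRole(issues_requests=False, has_coherent_cache=False, is_home=True),
    "SN": NodeRole(issues_requests=False, has_coherent_cache=False, is_home=False),
}


def snoop_targets(topology, requester):
    """Return the nodes a home node would need to snoop for `requester`.

    `topology` maps an instance name to its node type. Only fully coherent
    request nodes (RN-F) hold cached copies, and the requester is skipped.
    """
    return sorted(
        name
        for name, node_type in topology.items()
        if ROLES[node_type].has_coherent_cache and name != requester
    )


# Hypothetical system: two caching accelerators, one DMA engine, one home, one DRAM.
topology = {"gpu0": "RN-F", "fpga0": "RN-F", "dma0": "RN-I", "llc": "HN-F", "ddr": "SN"}
targets = snoop_targets(topology, "gpu0")
```

In this example a request from `gpu0` only requires snooping `fpga0`, because the DMA engine (RN-I) has no cache to query.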

Transaction	Direction	Coherency	Purpose
ReadNoSnoop	RN → HN	Non-coherent	DMA-style read, no cache involvement
ReadOnce	RN → HN	Coherent, transient	Read data once, don't cache long-term
ReadShared	RN → HN	Coherent Shared	Read and cache in Shared state
ReadUnique	RN → HN	Exclusive (write intent)	Read with intent to modify — invalidates other copies
MakeUnique	RN → HN	Exclusive (upgrade)	Acquire ownership without a data transfer, e.g. before overwriting the whole line
WriteNoSnoop	RN → HN	Non-coherent	Non-coherent write to memory
WriteUnique	RN → HN	Coherent write	Coherent write from an agent that does not hold the line; other cached copies are invalidated
Evict	RN → HN	Cache management	Notify HN that a clean Shared line is being evicted
SnpShared	HN → RN	Snoop	HN-initiated: downgrade cache line to Shared
SnpUnique	HN → RN	Snoop	HN-initiated: invalidate cache line (for new exclusive owner)
SnpCleanInvalid	HN → RN	Snoop	HN-initiated: writeback dirty data and invalidate
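
As a rough guide to reading the table above, the following sketch picks a request type from the properties of a memory access. The function name `choose_request` and its parameters are hypothetical, and the decision rules are a simplification of the table, not a statement of how any particular CCIX implementation selects opcodes.

```python
# Direction of each transaction, taken from the table above.
DIRECTION = {
    "ReadNoSnoop": "RN->HN", "ReadOnce": "RN->HN", "ReadShared": "RN->HN",
    "ReadUnique": "RN->HN", "MakeUnique": "RN->HN", "WriteNoSnoop": "RN->HN",
    "WriteUnique": "RN->HN", "Evict": "RN->HN",
    "SnpShared": "HN->RN", "SnpUnique": "HN->RN", "SnpCleanInvalid": "HN->RN",
}


def choose_request(op, coherent=True, keep_in_cache=True, needs_data=True):
    """Pick a request-node transaction for one access (illustrative rules).

    op            -- "read" or "write"
    coherent      -- False for DMA-style accesses that bypass the caches
    keep_in_cache -- whether the requester wants to hold the line afterwards
    needs_data    -- for writes: False if ownership alone is enough
    """
    if op == "read":
        if not coherent:
            return "ReadNoSnoop"
        return "ReadShared" if keep_in_cache else "ReadOnce"
    if op == "write":
        if not coherent:
            return "WriteNoSnoop"
        if not keep_in_cache:
            return "WriteUnique"
        return "ReadUnique" if needs_data else "MakeUnique"
    raise ValueError(f"unknown operation: {op!r}")


request = choose_request("write", needs_data=False)
```

Here `request` is `"MakeUnique"`, since the access is a coherent, cached write that needs ownership but no data.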

State	Meaning	Can Read?	Can Write?	Must Writeback?
M — Modified	Only copy, dirty (differs from memory)	Yes	Yes	Yes (on eviction)
O — Owned	Dirty, shared with others — owner must supply data on snoop	Yes	No	Yes
E — Exclusive	Only copy, clean (matches memory)	Yes	Yes (silent → M)	No
S — Shared	Clean, multiple caches may hold	Yes	No (must upgrade)	No
I — Invalid	Not present in cache	No (must fetch)	No	No
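
The permission columns of the MOESI table above translate directly into a lookup. The sketch below encodes them and adds a helper showing what a cache has to do on a local write; the names `MOESI` and `write_action` and the returned labels are chosen for this example only.

```python
# One entry per state: (can_read, can_write, must_writeback), as in the table.
MOESI = {
    "M": (True, True, True),
    "O": (True, False, True),
    "E": (True, True, False),
    "S": (True, False, False),
    "I": (False, False, False),
}


def write_action(state):
    """Describe what a cache must do before a local write in `state`."""
    can_read, can_write, _ = MOESI[state]
    if can_write:
        # M stays M; E moves to M silently, with no message to the home node.
        return "write-hit" if state == "M" else "silent-upgrade-to-M"
    # O and S hold readable data but need ownership; I must fetch the line too.
    return "request-ownership" if can_read else "fetch-with-ownership"


def dirty_states():
    """States whose data must be written back when the line is evicted."""
    return sorted(s for s, (_, _, writeback) in MOESI.items() if writeback)


actions = {state: write_action(state) for state in "MOESI"}
```

Only M and O carry a writeback obligation, which is why evicting an E or S line can be done without transferring data.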

Layer	CCIX Definition	Standard Equivalent
Application	Coherent memory transactions	Custom per use case
Transaction Layer	CCIX TLP extensions (over PCIe TLP)	PCIe TLP + CCIX header
Data Link Layer	PCIe DLLP (unchanged)	PCIe standard
Physical Layer	PCIe Gen3/Gen4 SerDes	PCIe standard

CCIX Protocol Explained
Cache Coherent Interconnect for Accelerators

What is CCIX?

CCIX vs CXL vs CAPI

CCIX Architecture: Key Nodes

CCIX Transaction Types

Cache Coherency States (MOESI)

CCIX Use Cases

CCIX Protocol Stack

Frequently Asked Questions
