Signal definitions, timing diagrams, state machines, and RTL implementation for AXI4, AHB, APB, AXI-Stream, PCIe, DDR, CXL, UCIe, JTAG, I2C, SPI, and I3C.
Modern SoCs (System-on-Chip) contain dozens of IP blocks — CPUs, GPU cores, DMA engines, memory controllers, peripherals, communication interfaces — all connected by standardized bus protocols. The protocol defines the electrical signaling, the handshake rules, the timing requirements, the error handling, and the arbitration behavior. Every chip that goes to silicon must implement these protocols correctly or the entire system fails.
ARM's AMBA (Advanced Microcontroller Bus Architecture) defines the most widely used family of on-chip bus protocols. APB (Advanced Peripheral Bus) handles low-bandwidth, low-frequency peripherals like UARTs, timers, and GPIO. AHB (Advanced High-performance Bus) handles mid-speed masters like DMA engines and instruction caches. AXI4 (Advanced eXtensible Interface) handles high-bandwidth, out-of-order transfers between CPU cores, GPU, and memory controllers. AXI-Stream is a simpler streaming variant for data pipes with no address. Each protocol occupies a different bandwidth/latency tradeoff in the SoC interconnect hierarchy, and most real chips use several of them at once, connected through bus bridges that translate between them.
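AXI4's handshake mechanism is simple but has a critical rule: once the source of a channel asserts VALID, it must hold VALID (and the accompanying payload) stable until the handshake completes, that is, until a rising clock edge on which READY is also high. The source is the master on the address and write-data channels and the slave on the read-data and write-response channels. Withdrawing VALID early can cause the destination to miss or corrupt the transfer. The source must also not wait for READY before asserting VALID. Conversely, the destination may freely assert and de-assert READY at any time; doing so creates no obligation. A transfer occurs exactly on the rising clock edge when both VALID and READY are high. This asymmetry (strict VALID behavior, loose READY behavior) is fundamental to the protocol's deadlock-free properties, because the two sides can never end up waiting on each other. AXI4 also supports out-of-order completion through transaction IDs: responses to transactions with different IDs may return in any order, so a short access can complete ahead of an earlier high-latency memory read, improving SoC throughput.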
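The VALID/READY rule can be sketched as a small trace checker. This is a minimal behavioral model in Python, not RTL or a real VIP; the function name `check_handshake` and the trace format (one `(valid, ready)` pair sampled per rising clock edge) are assumptions made for illustration. It checks only that VALID is held until the handshake, not that the payload stays stable.

```python
from typing import List, Tuple

def check_handshake(trace: List[Tuple[int, int]]) -> Tuple[List[int], List[int]]:
    """Scan (valid, ready) samples, one per rising clock edge.

    Returns (transfer_cycles, violation_cycles):
      - a transfer happens in every cycle where VALID and READY are both high
      - a violation is a cycle where VALID has dropped although the previous
        cycle had VALID high without a completed handshake
    """
    transfers: List[int] = []
    violations: List[int] = []
    pending = False  # True if VALID was high last cycle and READY was low
    for cycle, (valid, ready) in enumerate(trace):
        if pending and not valid:
            violations.append(cycle)  # VALID withdrawn before READY was seen
        if valid and ready:
            transfers.append(cycle)
        pending = bool(valid and not ready)
    return transfers, violations

# Cycles 1-3: VALID waits two cycles for READY, then transfers (legal).
# Cycles 5-6: VALID is raised and dropped without a handshake (illegal).
trace = [(0, 0), (1, 0), (1, 0), (1, 1), (0, 1), (1, 0), (0, 0)]
print(check_handshake(trace))  # ([3], [6])
```

Note that READY toggling while VALID is low (cycle 4) is not flagged, which mirrors the asymmetry described above.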
PCI Express (PCIe) is the dominant high-speed chip-to-chip and chip-to-device interconnect in servers, workstations, and laptops. It uses serial differential pairs (lanes) at speeds from 2.5 GT/s (Gen 1) to 64 GT/s (Gen 6), with a 3-layer architecture (Physical, Data Link, Transaction). CXL (Compute Express Link) builds on the PCIe Gen 5/6 PHY to add cache coherence, allowing the CPU's cache coherency protocol to extend to attached accelerators and memory devices. UCIe (Universal Chiplet Interconnect Express) goes further, standardizing the die-to-die interconnect within a multi-chiplet package, so that chiplets designed by different companies on different process nodes can communicate at high bandwidth over very short in-package links.
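To put the GT/s figures in perspective, the following sketch converts a per-lane transfer rate into an approximate one-direction payload bandwidth. The Gen 1 and Gen 6 rates come from the text above; the intermediate rates and the line-encoding efficiencies (8b/10b for Gen 1 and 2, 128b/130b for Gen 3 to 5) are added here as background, and the function name is hypothetical. Packet headers, flow control, and the Gen 6 FLIT/FEC overhead are deliberately not modeled, so real throughput is lower.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency per generation.
# Assumption: only line encoding is accounted for; protocol overhead is ignored.
PCIE_GENS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
    6: (64.0, 1.0),        # PAM4 signaling; FLIT/FEC overhead not modeled
}

def lane_bandwidth_gbytes(gen: int, lanes: int = 1) -> float:
    """Approximate one-direction bandwidth in GB/s for a link of `lanes` lanes."""
    rate_gt, efficiency = PCIE_GENS[gen]
    return rate_gt * efficiency / 8 * lanes  # 8 bits per byte

for gen in PCIE_GENS:
    print(f"Gen {gen} x16: {lane_bandwidth_gbytes(gen, 16):.1f} GB/s")
```

A Gen 1 x1 link works out to 0.25 GB/s under these assumptions, and each later generation roughly doubles the per-lane figure.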
An RTL engineer implementing an AXI4 slave must understand the protocol well enough to handle back-pressure correctly, never withdraw RVALID or BVALID before the handshake completes, and generate the correct response on the BRESP/RRESP signals. A verification engineer writing a UVM environment must model the protocol's valid/ready handshake accurately enough to catch corner cases, such as a slave that asserts READY one cycle before it can actually accept the data. Protocol bugs (deadlocks, incorrect RESP values, wrong burst counts) are among the hardest bugs to debug in silicon because they are timing-dependent and require specific traffic patterns to manifest. Understanding the protocol at the signal level, not just the API level, is the difference between writing correct RTL and hoping the VIP catches your bugs.
From low-bandwidth peripheral buses to high-speed interconnects.
Writing RTL that implements a protocol is only half the work. Verifying that the implementation is compliant with the specification — across all legal sequences of VALID and READY, all error injection scenarios, all burst types, and all boundary conditions — requires a structured verification environment. Protocol verification failures are among the most expensive bugs in ASIC design because they often manifest only in specific traffic patterns that are not covered by unit-level directed tests.
A Verification IP (VIP) is a reusable SystemVerilog UVM agent that drives and monitors a protocol interface. The AXI4 VIP drives randomized burst transfers with constrained-random READY de-assertion patterns, verifies the master never illegally withdraws VALID, checks that BRESP and RRESP carry the correct response codes, and monitors out-of-order transaction ID reuse. The APB VIP verifies that every transfer begins with a setup phase of exactly one cycle (PSEL high, PENABLE low) and that PENABLE is asserted in the very next cycle to start the access phase. Using a commercial or open-source VIP eliminates weeks of hand-writing a protocol monitor from the specification and gives the verification team a monitor that has been validated against the spec independently of the DUT being tested.
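The APB phase check a monitor performs can be sketched in a few lines. This is a simplified Python model, not a UVM component: the function name `check_apb` and the trace format (one `(psel, penable, pready)` tuple per clock) are assumptions, and only the phase sequencing is checked (IDLE, one-cycle SETUP, ACCESS held until PREADY). Address and data stability are not modeled.

```python
from typing import List, Tuple

def check_apb(trace: List[Tuple[int, int, int]]) -> List[Tuple[int, str]]:
    """Check APB phase ordering on (psel, penable, pready) samples.

    Returns a list of (cycle, message) for every sequencing error found.
    """
    errors: List[Tuple[int, str]] = []
    prev = "IDLE"  # what the previous cycle requires of this one
    for cycle, (psel, penable, pready) in enumerate(trace):
        # Classify the current cycle from PSEL and PENABLE alone.
        if not psel:
            phase = "IDLE"
        elif not penable:
            phase = "SETUP"
        else:
            phase = "ACCESS"

        if prev == "SETUP" and phase != "ACCESS":
            errors.append((cycle, "setup phase not followed by access phase"))
        elif prev == "ACCESS_WAIT" and phase != "ACCESS":
            errors.append((cycle, "access phase abandoned before PREADY"))
        elif prev == "IDLE" and phase == "ACCESS":
            errors.append((cycle, "access phase without a setup phase"))

        if phase == "ACCESS":
            # A completed access returns to IDLE; the next transfer needs a new setup.
            prev = "IDLE" if pready else "ACCESS_WAIT"
        else:
            prev = phase
    return errors

# One transfer with a wait state, then a back-to-back transfer: no errors.
good = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0, 0), (1, 1, 1), (0, 0, 0)]
print(check_apb(good))  # []
```

A trace that holds the setup phase for two cycles, or that raises PENABLE together with PSEL, is reported with the cycle number at which the sequence went wrong.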
Formal verification tools like Cadence JasperGold and Synopsys VC Formal can prove that a protocol property holds for all possible inputs without simulation. For AXI4, a formal app encodes the VALID-must-not-de-assert rule as an SVA (SystemVerilog Assertion) property and exhaustively proves it against the RTL model. Formal is particularly powerful for protocol compliance because the property space is bounded (the number of legal AXI4 transaction types and response combinations is finite), while the space of traffic sequences that simulation would have to cover is practically infinite. Formal can catch the "one-in-a-million" traffic pattern that simulation misses and that causes a functional failure at silicon bring-up.
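The difference between exhaustive and sampled checking can be illustrated with a toy model. The sketch below is not how a formal tool works internally (those use symbolic engines rather than enumeration), and the two master models are hypothetical. It simply enumerates every READY sequence up to a given depth and checks the VALID-must-not-de-assert property on each one.

```python
from itertools import product
from typing import Callable, Optional, Tuple

State = Tuple[int, int]  # (valid, consecutive stalled cycles)

def good_master(state: State, ready: int) -> State:
    """Hypothetical master that always keeps VALID high."""
    valid, waited = state
    return (1, 0) if (valid and ready) else (1, waited + 1)

def buggy_master(state: State, ready: int) -> State:
    """Hypothetical master that gives up after being stalled for too long."""
    valid, waited = state
    if valid and ready:
        return (1, 0)          # handshake done, offer the next beat
    if valid and waited >= 3:
        return (0, 0)          # bug: VALID withdrawn without a handshake
    return (1, waited + 1) if valid else (1, 0)

def find_counterexample(step: Callable[[State, int], State],
                        depth: int) -> Optional[Tuple[int, ...]]:
    """Try every READY sequence of length `depth`; return the first failing prefix."""
    for readys in product((0, 1), repeat=depth):
        state: State = (1, 0)
        for cycle, ready in enumerate(readys):
            nxt = step(state, ready)
            # Property: VALID high and READY low implies VALID high next cycle.
            if state[0] and not ready and not nxt[0]:
                return readys[:cycle + 1]
            state = nxt
    return None

print(find_counterexample(good_master, 6))   # None
print(find_counterexample(buggy_master, 6))  # (0, 0, 0, 0)
```

The buggy model only fails after four consecutive stalled cycles, so a search of depth 3 finds nothing. That is the same reason random simulation with short stalls can miss such a bug, and why the depth (or a full proof) matters.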
Beyond normal error-free traffic, a production verification environment injects errors and stress conditions to test the DUT's error handling. For AXI4, this includes injecting SLVERR and DECERR on BRESP/RRESP to verify the master propagates errors correctly to software. For PCIe, it includes bit-error injection at the data link layer to verify LCRC retry mechanisms. For I2C, it includes injecting clock stretching to verify the master handles slow slave responses without timing out or corrupting subsequent transactions. Error injection requires the VIP to be configurable beyond its default error-free mode, a feature that commercial VIPs include but that hand-written monitors often omit, leaving error handling paths untested until silicon.
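Bit-error injection against a CRC-protected link can be sketched as follows. This is a heavily simplified model of the idea, not of the PCIe data link layer: `zlib.crc32` stands in for the real LCRC computation, sequence numbers and ACK/NAK packets are reduced to a loop, and all function names are hypothetical.

```python
import zlib

def make_frame(payload: bytes) -> bytes:
    """Append a 32-bit CRC to the payload (stand-in for the real LCRC)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def frame_ok(frame: bytes) -> bool:
    """Receiver-side check: recompute the CRC and compare."""
    payload, crc = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == crc

def flip_bit(frame: bytes, bit_index: int) -> bytes:
    """Inject a single-bit error at the given bit position."""
    corrupted = bytearray(frame)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)

def send_with_retry(payload: bytes, corrupt_attempts: set, max_attempts: int = 4):
    """Send a frame, corrupting the attempts listed in `corrupt_attempts`.

    Returns (delivered_payload_or_None, attempts_used).
    """
    frame = make_frame(payload)  # the sender keeps this copy for replay
    for attempt in range(1, max_attempts + 1):
        on_wire = flip_bit(frame, 3) if attempt in corrupt_attempts else frame
        if frame_ok(on_wire):
            return on_wire[:-4], attempt  # receiver accepts and acknowledges
        # CRC mismatch: receiver rejects, sender replays the stored frame
    return None, max_attempts

print(send_with_retry(b"TLP", {1, 2}))  # (b'TLP', 3)
```

A test built on this pattern would assert both outcomes: that the payload arrives intact once a clean attempt gets through, and that the link reports failure rather than delivering corrupt data when every attempt is corrupted.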
Coverage-driven verification (CDV) measures which parts of the protocol's functional space have been exercised by simulation. For AXI4, a functional coverage model defines bins for each burst type (INCR, WRAP, FIXED), each legal burst length (1 to 256 beats for INCR, 2, 4, 8, or 16 for WRAP, and 1 to 16 for FIXED), each transfer size the DUT supports (for example 8, 16, 32, and 64 bits), and each response combination. The simulation harness generates constrained-random traffic until all bins are filled, at which point every scenario the coverage model describes has been exercised at least once. Crossing coverage with the DUT's RTL code coverage (which lines executed) creates a two-dimensional map of tested vs untested combinations. When all functional coverage bins are hit and RTL coverage exceeds 95%, the verification team has strong evidence, though not proof, that the protocol implementation is correct, a bar that directed tests alone rarely achieve.
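The fill-the-bins loop can be sketched in Python. A real environment would use SystemVerilog covergroups and a constraint solver; here the bin names (`single`, `short`, `long`), the function names, and the choice to cross only burst type with burst length are assumptions made to keep the example small. The length limits per burst type are the AXI4 ones (INCR up to 256 beats, WRAP limited to 2, 4, 8, or 16, FIXED up to 16).

```python
import random

BURST_TYPES = ("FIXED", "INCR", "WRAP")
# Hypothetical length bins; a real model would choose bins to match the DUT.
LENGTH_BINS = {"single": range(1, 2), "short": range(2, 17), "long": range(17, 257)}

def legal_lengths(burst: str) -> list:
    """Burst lengths permitted for each AXI4 burst type."""
    if burst == "WRAP":
        return [2, 4, 8, 16]
    if burst == "FIXED":
        return list(range(1, 17))
    return list(range(1, 257))  # INCR

def length_bin(length: int) -> str:
    return next(name for name, r in LENGTH_BINS.items() if length in r)

def reachable_bins() -> set:
    """Cross of burst type x length bin, restricted to legal combinations."""
    return {(b, length_bin(n)) for b in BURST_TYPES for n in legal_lengths(b)}

def run_until_covered(seed: int = 0, max_txns: int = 10_000):
    """Generate constrained-random bursts until every reachable bin is hit."""
    rng = random.Random(seed)
    goal, hit, txns = reachable_bins(), set(), 0
    while hit != goal and txns < max_txns:
        burst = rng.choice(BURST_TYPES)
        length = rng.choice(legal_lengths(burst))  # constraint: legal lengths only
        hit.add((burst, length_bin(length)))
        txns += 1
    return txns, hit == goal

txns, covered = run_until_covered()
print(f"{len(reachable_bins())} bins, covered={covered} after {txns} transactions")
```

Excluding illegal combinations such as a 256-beat WRAP burst from the goal matters: bins that can never be hit would otherwise keep coverage below 100% forever. The rarest bin here (a single-beat INCR burst) dominates the transaction count, which is why real environments bias the randomization toward hard-to-reach bins.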