⚡ Interactive Lab

AXI4 Handshake Lab

Watch the VALID/READY protocol live — scrolling waveforms, independent write & read channels, back-pressure stall simulation, and cycle-accurate efficiency tracking.

[Interactive waveform panel. Signals: Master VALID, Slave READY, Slave Response, Master Resp-Ready. Marker: Handshake cycle. Controls: Speed, Back-pressure. Counters: Transactions, Stall Cycles, Efficiency.]

How AXI4 Handshake Works

The Golden Rule

A transfer completes only when VALID and READY are both HIGH on the same rising clock edge. Either side can stall independently.

1. Master asserts VALID and holds data stable
2. Slave asserts READY when it can accept
3. Transfer completes: both high on the same rising edge
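
The three steps above can be checked cycle by cycle. The sketch below is a behavioral Python model, not RTL; the function name `count_transfers` and the list-of-bits trace format (one entry per rising clock edge) are assumptions made for illustration.

```python
def count_transfers(valid, ready):
    """Return the cycle indices at which a transfer completes.

    valid and ready are equal-length sequences of 0/1 values, one entry
    per rising clock edge (a hypothetical trace format for this sketch).
    """
    if len(valid) != len(ready):
        raise ValueError("traces must cover the same number of cycles")
    # A transfer happens only on edges where both signals are sampled high.
    return [i for i, (v, r) in enumerate(zip(valid, ready)) if v and r]


# Master raises VALID in cycle 1 and holds it; slave raises READY in cycle 3.
valid = [0, 1, 1, 1, 0]
ready = [0, 0, 0, 1, 1]
print(count_transfers(valid, ready))  # [3]
```

READY being high in cycle 4 transfers nothing, because VALID is already low again.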

Write Transaction (3 phases)

Three independent channel handshakes (AW, W, and B) that can overlap in a pipelined design:

AW (Address): master sends AWADDR, AWLEN, AWSIZE
W (Data): master sends WDATA + WSTRB byte-enables
B (Response): slave returns BRESP = OKAY / SLVERR

Read Transaction (2 phases)

Slave holds RVALID low while fetching from memory — this is normal read latency, not an error:

AR (Address): master sends ARADDR, ARLEN, ARSIZE
R (Data): slave returns RDATA + RRESP (may stall)

Back-Pressure Rule

Once VALID is asserted, the sender (the master on AW, W, and AR; the slave on R and B) cannot withdraw it until the handshake completes. The receiver may freely de-assert READY between cycles.
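
A trace checker makes the rule concrete. This is a minimal Python sketch, assuming the same one-sample-per-clock-edge trace format; `check_valid_stable` and the `payload` list (standing in for address, data, or strobe values) are hypothetical names, not part of any AXI tool.

```python
def check_valid_stable(valid, ready, payload):
    """Return the first cycle where the sender breaks the hold rule, or None.

    Rule checked: if VALID was high and READY was low on cycle i-1 (no
    handshake yet), then on cycle i VALID must still be high and the
    payload must be unchanged.
    """
    for i in range(1, len(valid)):
        pending = valid[i - 1] and not ready[i - 1]
        if pending and (not valid[i] or payload[i] != payload[i - 1]):
            return i
    return None


# Legal: VALID and payload are held for two stall cycles, then accepted.
print(check_valid_stable([1, 1, 1, 0], [0, 0, 1, 0], [0xA, 0xA, 0xA, 0x0]))  # None
# Illegal: VALID drops in cycle 1 before any handshake.
print(check_valid_stable([1, 0, 1, 1], [0, 0, 0, 1], [0xA, 0xA, 0xA, 0xA]))  # 1
```

READY toggling never triggers the checker, matching the asymmetry in the rule.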

Enable back-pressure above to watch stall cycles accumulate and bus efficiency drop in real time.
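
The counters can be reproduced offline from a recorded trace. The lab's exact efficiency formula is not stated here, so the sketch below assumes one plausible definition: a stall cycle is any edge with VALID high and READY low, and efficiency is transfers divided by the cycles in which VALID was high. `channel_stats` is a hypothetical helper name.

```python
def channel_stats(valid, ready):
    """Return (transfers, stall_cycles, efficiency) for one channel trace.

    Assumed definitions (not taken from the lab itself):
      transfer   : VALID and READY both high on an edge
      stall      : VALID high, READY low on an edge
      efficiency : transfers / (transfers + stalls), 0.0 if the channel was idle
    """
    transfers = sum(1 for v, r in zip(valid, ready) if v and r)
    stalls = sum(1 for v, r in zip(valid, ready) if v and not r)
    busy = transfers + stalls
    efficiency = transfers / busy if busy else 0.0
    return transfers, stalls, efficiency


# VALID held high for six cycles; the slave accepts on cycles 0, 3, and 4.
valid = [1, 1, 1, 1, 1, 1]
ready = [1, 0, 0, 1, 1, 0]
print(channel_stats(valid, ready))  # (3, 3, 0.5)
```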

Verilog RTL – AXI4-Lite Slave

verilog · axi4_lite_slave.v
// AXI4-Lite slave with a 16 x 32-bit register file.
// Both channels live in one module so the write and read FSMs
// access the same regs array.
module axi4_lite_slave (
    input  wire        aclk,
    input  wire        aresetn,
    // Write address (AW), write data (W), write response (B)
    input  wire        awvalid,
    output reg         awready,
    input  wire [31:0] awaddr,
    input  wire        wvalid,
    output reg         wready,
    input  wire [31:0] wdata,
    input  wire [ 3:0] wstrb,
    output reg         bvalid,
    input  wire        bready,
    output reg  [ 1:0] bresp,
    // Read address (AR), read data (R)
    input  wire        arvalid,
    output reg         arready,
    input  wire [31:0] araddr,
    output reg         rvalid,
    input  wire        rready,
    output reg  [31:0] rdata,
    output reg  [ 1:0] rresp
);
    reg [31:0] regs [0:15];   // shared register file, word index = addr[5:2]

    // ── Write Channel ──────────────────────────────────────────────
    localparam WR_IDLE = 2'd0, WR_ADDR = 2'd1, WR_DATA = 2'd2, WR_RESP = 2'd3;
    reg [ 1:0] wr_state;
    reg [31:0] addr_lat;

    always @(posedge aclk or negedge aresetn) begin
        if (!aresetn) begin
            wr_state <= WR_IDLE;
            awready  <= 1'b0;
            wready   <= 1'b0;
            bvalid   <= 1'b0;
            bresp    <= 2'b00;
        end else begin
            // READY outputs pulse for one cycle unless set again below
            awready <= 1'b0;
            wready  <= 1'b0;
            case (wr_state)
                WR_IDLE: if (awvalid) begin
                    awready  <= 1'b1;      // AW handshake completes in the next cycle
                    addr_lat <= awaddr;    // awaddr is stable while awvalid is held
                    wr_state <= WR_ADDR;
                end
                WR_ADDR: if (wvalid) begin
                    wready   <= 1'b1;      // W handshake completes in the next cycle
                    wr_state <= WR_DATA;
                end
                WR_DATA: begin
                    // wvalid and wready are both high here: commit enabled byte lanes only
                    if (wstrb[0]) regs[addr_lat[5:2]][ 7: 0] <= wdata[ 7: 0];
                    if (wstrb[1]) regs[addr_lat[5:2]][15: 8] <= wdata[15: 8];
                    if (wstrb[2]) regs[addr_lat[5:2]][23:16] <= wdata[23:16];
                    if (wstrb[3]) regs[addr_lat[5:2]][31:24] <= wdata[31:24];
                    bvalid   <= 1'b1;
                    bresp    <= 2'b00;     // OKAY
                    wr_state <= WR_RESP;
                end
                WR_RESP: if (bready) begin
                    bvalid   <= 1'b0;      // B handshake done
                    wr_state <= WR_IDLE;
                end
            endcase
        end
    end

    // ── Read Channel ───────────────────────────────────────────────
    localparam RD_IDLE = 1'b0, RD_DATA = 1'b1;
    reg rd_state;

    always @(posedge aclk or negedge aresetn) begin
        if (!aresetn) begin
            rd_state <= RD_IDLE;
            arready  <= 1'b0;
            rvalid   <= 1'b0;
            rresp    <= 2'b00;
            rdata    <= 32'd0;
        end else begin
            arready <= 1'b0;
            case (rd_state)
                RD_IDLE: if (arvalid) begin
                    arready  <= 1'b1;                 // AR handshake completes in the next cycle
                    rdata    <= regs[araddr[5:2]];
                    rresp    <= 2'b00;                // OKAY
                    rvalid   <= 1'b1;
                    rd_state <= RD_DATA;
                end
                RD_DATA: if (rready) begin
                    rvalid   <= 1'b0;                 // R handshake done
                    rd_state <= RD_IDLE;
                end
            endcase
        end
    end
endmodule

Frequently Asked Questions

What is the AXI4 handshake?
AXI4 (Advanced eXtensible Interface 4) is the AMBA 4 bus protocol used in virtually every modern SoC. Its handshake mechanism uses a VALID/READY pair: the sender asserts VALID when data is ready, the receiver asserts READY when it can accept, and a transfer occurs exactly on the rising clock edge when both are simultaneously high.

Can the sender de-assert VALID before the handshake completes?
No. This is a hard AXI protocol rule. Once the sender asserts VALID, it must hold VALID and all associated channel signals (address, data, strobe, etc.) stable until the handshake completes. The READY side has no such constraint and may freely toggle between cycles.

What is back-pressure?
Back-pressure occurs when a downstream block holds READY low, signaling that it cannot accept data yet. The master stalls (holding VALID high) until the slave is ready. No data is lost; the bus simply pauses. Toggle back-pressure in this lab to see stall cycles accumulate and efficiency drop in real time.

How does AXI4-Lite differ from full AXI4?
AXI4-Lite is a register-access subset: single-beat transfers only, no burst support, a data width fixed at 32 or 64 bits, and typically one outstanding transaction. Full AXI4 adds burst lengths up to 256 beats (AWLEN/ARLEN), multiple outstanding transactions with ID tagging, out-of-order completion, and narrow/unaligned transfers. The VALID/READY handshake protocol is identical in both.

Can reads and writes happen at the same time?
Yes. AXI4 write channels (AW/W/B) and read channels (AR/R) are completely independent. A master may issue a read and a write simultaneously. Full AXI4 also supports multiple outstanding IDs per channel, enabling deep pipelines: a master can issue many reads before receiving any responses, maximising memory bandwidth.

What do the BRESP and RRESP codes mean?
Both are 2-bit response fields: 2'b00 = OKAY (successful transfer), 2'b01 = EXOKAY (exclusive access succeeded), 2'b10 = SLVERR (slave error, such as a peripheral fault), 2'b11 = DECERR (decode error, no slave mapped at that address). Normal register-level accesses return OKAY, and AXI4-Lite does not support EXOKAY.
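
For testbench or log-decoding scripts, the four response encodings map naturally onto an enumeration. The sketch below is illustrative only; the class name `AxiResp` and the helper `is_error` are assumptions, not part of any standard library.

```python
from enum import IntEnum


class AxiResp(IntEnum):
    """2-bit BRESP/RRESP encodings as listed above."""
    OKAY = 0b00    # successful transfer
    EXOKAY = 0b01  # exclusive access succeeded
    SLVERR = 0b10  # slave error
    DECERR = 0b11  # decode error, no slave at that address


def is_error(resp):
    """True if the 2-bit response value signals SLVERR or DECERR."""
    return AxiResp(resp & 0b11) in (AxiResp.SLVERR, AxiResp.DECERR)


print(AxiResp(0b10).name)  # SLVERR
print(is_error(0b00))      # False
print(is_error(0b11))      # True
```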

AXI4 in Real SoC Design — Beyond the Handshake

Outstanding Transactions and ID Tagging

Full AXI4 (not AXI4-Lite) supports multiple outstanding transactions through ID tagging. A master may, for example, issue eight read requests before receiving a single response, each tagged with its own ARID value. The slave may return responses for different IDs out of order, as long as each response carries the matching RID. This out-of-order capability is a large part of AXI4's memory bandwidth efficiency: the master keeps the bus busy issuing new requests rather than sitting idle while responses return. Within a single ID the rule is stricter: responses for the same ARID must return in the order the requests were issued. This constraint forces requests that share an ID to be serialized, including at slaves with in-order completion (like most SRAM controllers), which is why master IDs are carefully partitioned in real interconnect designs. A DMA engine and a CPU core using the same ID would serialize their requests through a shared slave.
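
The per-ID ordering rule is easy to express as a scoreboard check. This Python sketch assumes single-beat reads and uses a hypothetical `tag` value to identify each request (a real testbench would track something like the address); `check_read_ordering` is an illustrative name.

```python
from collections import defaultdict, deque


def check_read_ordering(requests, responses):
    """Check that responses respect per-ID issue order.

    requests  : list of (arid, tag) in the order the master issued them
    responses : list of (rid, tag) in the order the slave returned them
    Responses with different IDs may interleave freely; responses with
    the same ID must arrive in issue order. Unanswered requests are not
    treated as errors in this sketch.
    """
    pending = defaultdict(deque)
    for arid, tag in requests:
        pending[arid].append(tag)
    for rid, tag in responses:
        # The oldest outstanding request for this ID must be the one answered.
        if not pending[rid] or pending[rid][0] != tag:
            return False
        pending[rid].popleft()
    return True


requests = [(0, "a"), (1, "b"), (0, "c")]
print(check_read_ordering(requests, [(1, "b"), (0, "a"), (0, "c")]))  # True
print(check_read_ordering(requests, [(0, "c"), (0, "a"), (1, "b")]))  # False
```

The first response sequence reorders across IDs, which is allowed; the second swaps two responses that share ID 0, which is not.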

AXI4 Burst Types: INCR, WRAP, FIXED

AXI4 supports three burst types, encoded in AWBURST/ARBURST. INCR (incrementing) is the most common: each beat's address increments by the transfer size. A 4-beat INCR burst to address 0x1000 with 32-bit transfers accesses 0x1000, 0x1004, 0x1008, 0x100C. WRAP (wrapping) is used primarily for cache line refills: the address increments but wraps at a boundary aligned to the total burst size (number of beats × bytes per beat), which is always a power of 2. A 4-beat WRAP burst of 32-bit transfers starting at 0x1008 covers 16 bytes, so it wraps at 0x1010 and accesses 0x1008, 0x100C, 0x1000, 0x1004. The cache refill returns the requested word first and fills the rest of the cache line in address-wrap order. FIXED repeats the same address every beat, useful for polling a status register or streaming to a FIFO. Understanding burst types is critical for memory controller RTL design: WRAP bursts require the controller to track a boundary and reset the address mid-burst, which adds a state to the burst FSM.
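
The address sequences above can be generated with a few lines of arithmetic. The following Python sketch is a reference model, not controller RTL; `burst_addresses` and its parameter names are assumptions, and it assumes the start address is aligned to the transfer size.

```python
def burst_addresses(start, beats, size_bytes, burst):
    """Return the address of every beat in a FIXED, INCR, or WRAP burst."""
    if start % size_bytes:
        raise ValueError("this sketch assumes a size-aligned start address")
    if burst == "FIXED":
        return [start] * beats
    if burst == "INCR":
        return [start + i * size_bytes for i in range(beats)]
    if burst == "WRAP":
        if beats not in (2, 4, 8, 16):
            raise ValueError("WRAP bursts must be 2, 4, 8, or 16 beats")
        total = beats * size_bytes        # bytes covered by the whole burst
        lower = (start // total) * total  # wrap boundary at or below start
        return [lower + (start - lower + i * size_bytes) % total
                for i in range(beats)]
    raise ValueError("unknown burst type: " + str(burst))


print([hex(a) for a in burst_addresses(0x1000, 4, 4, "INCR")])
# ['0x1000', '0x1004', '0x1008', '0x100c']
print([hex(a) for a in burst_addresses(0x1008, 4, 4, "WRAP")])
# ['0x1008', '0x100c', '0x1000', '0x1004']
```

The modulo in the WRAP branch is the software equivalent of the mid-burst address reset that a controller's burst FSM has to perform.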

Narrow Transfers and Byte Strobes

AXI4 signals transfers narrower than the bus width with AWSIZE, and WSTRB (write strobe), one bit per byte lane, marks which lanes carry valid data. A 32-bit AXI4 bus has a 4-bit WSTRB. Writing a single byte to address 0x1001 uses WSTRB = 4'b0010 to enable only byte lane 1. The slave must mask its write operation to update only the enabled byte lanes. This mechanism allows a 32-bit bus to serve 8-bit and 16-bit peripherals without requiring separate bus width adapters. For RTL engineers designing a slave register block, WSTRB handling is a common source of bugs: the write logic must apply the strobe on a per-byte basis, not a per-word basis. A common mistake is writing if (wvalid && wready) ctrl_reg <= wdata; which ignores WSTRB completely and corrupts bytes adjacent to the intended write target.
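
A golden model for the per-byte merge is useful when checking a slave's register writes in simulation. This is a minimal Python sketch; `apply_wstrb` and the `lanes` parameter are hypothetical names chosen for this example.

```python
def apply_wstrb(old, wdata, wstrb, lanes=4):
    """Merge wdata into old, updating only the byte lanes enabled in wstrb."""
    result = old
    for lane in range(lanes):
        if (wstrb >> lane) & 1:
            mask = 0xFF << (8 * lane)
            # Clear the enabled lane, then copy that lane from wdata.
            result = (result & ~mask) | (wdata & mask)
    return result


old = 0x11223344
# Single-byte write to lane 1 (for example, address 0x1001 on a 32-bit bus).
print(hex(apply_wstrb(old, 0xAABBCCDD, 0b0010)))  # 0x1122cc44
# Ignoring the strobe would instead overwrite the whole word with 0xaabbccdd.
```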

Interconnect Deadlock Prevention

In an AXI4 interconnect with multiple masters and slaves, circular dependencies can cause deadlock. The canonical deadlock scenario: master A holds write data buffer space waiting for a write response (B channel), while master B holds the address buffer space needed by master A's outstanding address transaction, and the slave's response cannot proceed because the return path is blocked by master A's stall. The AXI specification defines ordering rules and recommends that interconnect components always have capacity to accept the write response channel (B channel) independently of the write data channel (W channel). Interconnect implementations enforce this by sizing per-channel buffers independently rather than sharing a common pool. When designing AXI-connected subsystems, ensuring the B and R channels are never starved is as important as managing VALID/READY timing — a well-behaved handshake at the individual channel level does not prevent deadlock if channel arbitration creates circular dependencies.
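
One way to reason about such circular dependencies is to draw a wait-for graph and look for a cycle. The Python sketch below uses a generic depth-first search; the resource names (`w_buffer`, `b_response`, `aw_buffer`) are hypothetical labels for illustration and do not come from the AXI specification or any particular interconnect.

```python
def find_wait_cycle(waits_for):
    """Return a list of nodes forming a wait-for cycle, or None.

    waits_for maps each resource to the resources it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in waits_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:  # back edge: nxt is already on the current path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(waits_for):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Shared buffering: each resource waits on the next, closing a loop.
shared = {"w_buffer": ["b_response"], "b_response": ["aw_buffer"],
          "aw_buffer": ["w_buffer"]}
print(find_wait_cycle(shared))
# ['w_buffer', 'b_response', 'aw_buffer', 'w_buffer']

# Independent B-channel capacity: the response never waits, so no cycle.
independent = {"w_buffer": ["b_response"], "b_response": [],
               "aw_buffer": ["w_buffer"]}
print(find_wait_cycle(independent))  # None
```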