CDC Synchronizer · 4-Phase Handshake

Req-Ack Synchronizer

Request-Acknowledge CDC Handshake — Architecture · FSM · Timing · Verilog RTL
Type: 4-Phase Handshake
Sync Chain: 2-FF per signal
Data Width: Parameterizable
Latency: ~4–6 destination cycles
Safe For: Multi-bit data CDC
System-Level Architecture
Two unrelated clock domains — only single-bit REQ and ACK cross the boundary through 2-FF synchronizers
[Block diagram. Source domain (CLK_A): data register data[N-1:0], REQ flip-flop req_a set and cleared by the source FSM, and the ACK 2-FF synchronizer (ack_in → FF1_a → FF2_a → ack_sync_a). Destination domain (CLK_B): the REQ 2-FF synchronizer (req_in → FF1_b → FF2_b → req_sync_b), capture register data_b[N-1:0], and ACK flip-flop ack_b set and cleared by the destination FSM. The data bus crosses the CDC boundary without synchronization and is held stable. CLK_A and CLK_B may be any frequency.]
2-FF Synchronizer — Internal Structure
Applied to both REQ (A→B) and ACK (B→A) signal paths — never applied to the data bus
[Diagram: sig_async → FF1 → FF2 → sig_sync, both flip-flops clocked by CLK_DST. FF1_Q may be metastable; the FF2 output is safe to use.]
Why two flip-flops? FF1 may enter metastability, but it has one full clock period of resolution time (Tr) before FF2 samples it. The probability that FF1 is still unresolved decays as e^(−Tr/τ), where τ is a technology-dependent constant (≈ 100 ps here), so FF2 captures a clean 0 or 1 with overwhelming probability.
Rule: apply 2-FF synchronization only to 1-bit signals (REQ, ACK), never to a multi-bit data bus.
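The structure above can be sketched as a small reusable module. This is a minimal sketch, not the page's own implementation: the module name sync_2ff and its port names are illustrative, and the ASYNC_REG attribute spelling is the one used by Xilinx/AMD Vivado, so other tools may need a different attribute or constraint.

```verilog
// Sketch of a 2-FF synchronizer for ONE single-bit signal.
// Module and port names are hypothetical, chosen for this example.
module sync_2ff (
    input  wire clk_dst,   // destination-domain clock
    input  wire rst_dst_n, // active-low asynchronous reset (destination domain)
    input  wire sig_async, // single-bit input launched from another clock domain
    output wire sig_sync   // synchronized copy, usable by clk_dst logic
);

// Vivado-style attribute (assumption: your tool may spell this differently).
// It asks the tool to keep both flops together and not retime them.
(* ASYNC_REG = "TRUE" *) reg ff1, ff2;

always @(posedge clk_dst or negedge rst_dst_n)
    if (!rst_dst_n) {ff2, ff1} <= 2'b00;
    else            {ff2, ff1} <= {ff1, sig_async}; // ff1 samples, ff2 re-samples

assign sig_sync = ff2;

endmodule
```

One instance would be used per crossing signal, for example one for REQ clocked by CLK_B and one for ACK clocked by CLK_A.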
4-Phase Handshake Protocol
Each transfer requires exactly four phase transitions — REQ↑, ACK↑, REQ↓, ACK↓
PHASE 1

SRC asserts REQ

Source locks data onto the bus and asserts req_a ↑. Data is now stable and will not change until the full handshake completes.

PHASE 2

DST sees REQ, asserts ACK

After 2-FF sync delay, req_sync_b ↑ is seen in CLK_B domain. Destination captures data into its local register, then asserts ack_b ↑.

PHASE 3

SRC sees ACK, de-asserts REQ

After 2-FF sync delay, ack_sync_a ↑ is seen in CLK_A domain. Transfer is confirmed. Source de-asserts req_a ↓. Data may now change.

PHASE 4

DST sees REQ low, de-asserts ACK

After 2-FF sync delay, req_sync_b ↓ is seen. Destination de-asserts ack_b ↓. Once the source sees ack_sync_a fall, both sides are back in their idle state and ready for the next transfer.
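The phase ordering can be checked in simulation with a small monitor in the source domain. This is a sketch under stated assumptions: the module name req_ack_order_check is hypothetical, it only watches req_a and ack_sync_a as named in the phases above, and it reports violations with $display rather than stopping the run.

```verilog
// Simulation-only ordering check for the four phases, observed in CLK_A.
// Module name and ports are hypothetical; connect them to req_a and
// ack_sync_a in the source-domain logic.
module req_ack_order_check (
    input wire clk_a,
    input wire rst_a_n,
    input wire req_a,      // raw request from the source FSM
    input wire ack_sync_a  // acknowledge after its 2-FF synchronizer
);

reg req_q, ack_q; // previous-cycle samples, used for edge detection

always @(posedge clk_a or negedge rst_a_n) begin
    if (!rst_a_n) begin
        req_q <= 1'b0;
        ack_q <= 1'b0;
    end else begin
        req_q <= req_a;
        ack_q <= ack_sync_a;

        // Phase 1: REQ may rise only if ACK was low one cycle earlier
        if (req_a && !req_q && ack_q)
            $display("%0t: REQ rose while ACK was still high", $time);

        // Phase 3: REQ may fall only if ACK was high one cycle earlier
        if (!req_a && req_q && !ack_q)
            $display("%0t: REQ fell before ACK was seen", $time);
    end
end

endmodule
```

A mirror-image checker clocked by CLK_B, watching ack_b against req_sync_b, would cover phases 2 and 4 in the same way.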

Timing Waveform
Signal progression across both clock domains — sync delays shown as shaded regions
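[Waveform. Signals shown: CLK_A, data_a, req_a, CLK_B, req_sync_b, ack_b, ack_sync_a, data_b. data_a switches from DATA_OLD to DATA_NEW and stays locked while req_a is high. req_sync_b rises one 2-FF delay after req_a; ack_b then rises and DATA_NEW is captured into data_b in the CLK_B domain (data_b is invalid or old before that). ack_sync_a rises one 2-FF delay after ack_b, after which the source de-asserts REQ. Phases 1 and 2 cover REQ↑ → ACK↑; phases 3 and 4 cover REQ↓ → ACK↓.]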
Controller FSM — Both Domains
Source FSM (CLK_A) and Destination FSM (CLK_B) drive the 4-phase handshake
SOURCE FSM (CLK_A)
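States: IDLE (req_a=0), ASSERT REQ (req_a=1), WAIT ACK (req_a=1), DE-ASSERT (req_a=0)
Transitions: IDLE → ASSERT REQ on send_req; ASSERT REQ → WAIT ACK always; WAIT ACK → DE-ASSERT when ack_sync_a=1; DE-ASSERT → IDLE when ack_sync_a=0 (otherwise wait)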
DESTINATION FSM (CLK_B)
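States: IDLE (ack_b=0), CAPTURE (latch data, ack_b=0), ASSERT ACK (ack_b=1), DE-ASSERT (ack_b=0)
Transitions: IDLE → CAPTURE when req_sync_b=1; CAPTURE → ASSERT ACK always; ASSERT ACK → DE-ASSERT when req_sync_b=0 (otherwise wait); DE-ASSERT → IDLE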
IP Interface — Port List
Top-level ports for the parameterizable Req-Ack synchronizer module
Port           Width  Domain  Dir  Description
clk_a          1      CLK_A   IN   Source clock, rising-edge active
rst_a_n        1      CLK_A   IN   Active-low asynchronous reset in source domain (as coded in the RTL sensitivity lists)
send_req       1      CLK_A   IN   Pulse high for 1 cycle to initiate a transfer; ignored while busy
data_a         N      CLK_A   IN   Data to transfer; sampled into data_latch in the cycle send_req is accepted
transfer_done  1      CLK_A   OUT  High for 1 CLK_A cycle when the handshake completes (ack_sync_a has fallen)
busy           1      CLK_A   OUT  High while a transfer is in progress
clk_b          1      CLK_B   IN   Destination clock, rising-edge active
rst_b_n        1      CLK_B   IN   Active-low asynchronous reset in destination domain
data_b         N      CLK_B   OUT  Captured data in destination domain; valid when data_valid_b is high
data_valid_b   1      CLK_B   OUT  High for 1 CLK_B cycle when data_b has been captured from source
Verilog RTL
Full synthesizable implementation — 2-FF synchronizers + FSMs for both domains
Verilog req_ack_sync.v
module req_ack_sync #(
    parameter DATA_W = 8
) (
    // Source domain (CLK_A)
    input  wire             clk_a, rst_a_n,
    input  wire             send_req,
    input  wire [DATA_W-1:0] data_a,
    output reg              transfer_done,
    output wire             busy,
    // Destination domain (CLK_B)
    input  wire             clk_b, rst_b_n,
    output reg  [DATA_W-1:0] data_b,
    output reg              data_valid_b
);

// ── Source FSM states ───────────────────────────────────────────
localparam [1:0] S_IDLE=2'd0, S_ASSERT=2'd1, S_WAIT=2'd2, S_DEASSERT=2'd3;
reg [1:0] src_state;

// ── Destination FSM states ──────────────────────────────────────
localparam [1:0] D_IDLE=2'd0, D_CAP=2'd1, D_ACK=2'd2, D_DEACK=2'd3;
reg [1:0] dst_state;

// ── Raw handshake signals ───────────────────────────────────────
reg              req_a;      // driven by source FSM
reg              ack_b;      // driven by dest FSM
reg [DATA_W-1:0] data_latch; // locked at send_req, released after done

// ── 2-FF synchronizer: req_a → CLK_B domain ────────────────────
(* ASYNC_REG = "TRUE" *) reg ff1_req_b, ff2_req_b;   // Vivado-style attribute; spelling varies by tool
always @(posedge clk_b or negedge rst_b_n)
    if (!rst_b_n) {ff1_req_b, ff2_req_b} <= 2'b0;
    else          {ff2_req_b, ff1_req_b} <= {ff1_req_b, req_a};
wire req_sync_b = ff2_req_b;

// ── 2-FF synchronizer: ack_b → CLK_A domain ────────────────────
(* ASYNC_REG = "TRUE" *) reg ff1_ack_a, ff2_ack_a;   // Vivado-style attribute; spelling varies by tool
always @(posedge clk_a or negedge rst_a_n)
    if (!rst_a_n) {ff1_ack_a, ff2_ack_a} <= 2'b0;
    else          {ff2_ack_a, ff1_ack_a} <= {ff1_ack_a, ack_b};
wire ack_sync_a = ff2_ack_a;

// ── Source FSM (CLK_A) ──────────────────────────────────────────
assign busy = (src_state != S_IDLE);
always @(posedge clk_a or negedge rst_a_n) begin
    if (!rst_a_n) begin
        src_state <= S_IDLE; req_a <= 0; data_latch <= 0; transfer_done <= 0;
    end else begin
        transfer_done <= 0;
        case (src_state)
            S_IDLE: if (send_req) begin
                        data_latch <= data_a;
                        req_a      <= 1;
                        src_state  <= S_ASSERT;
                    end
            S_ASSERT: src_state <= S_WAIT; // let req_a propagate 1 cycle
            S_WAIT:   if (ack_sync_a) begin
                          req_a     <= 0;
                          src_state <= S_DEASSERT;
                      end
            S_DEASSERT: if (!ack_sync_a) begin // wait for ACK to fall
                            transfer_done <= 1;
                            src_state     <= S_IDLE;
                        end
        endcase
    end
end

// ── Destination FSM (CLK_B) ─────────────────────────────────────
always @(posedge clk_b or negedge rst_b_n) begin
    if (!rst_b_n) begin
        dst_state <= D_IDLE; ack_b <= 0; data_b <= 0; data_valid_b <= 0;
    end else begin
        data_valid_b <= 0;
        case (dst_state)
            D_IDLE: if (req_sync_b) begin
                        data_b    <= data_latch; // data_latch is stable here
                        dst_state <= D_CAP;
                    end
            D_CAP: begin
                       data_valid_b <= 1;
                       ack_b        <= 1;
                       dst_state    <= D_ACK;
                   end
            D_ACK: if (!req_sync_b) begin // REQ has fallen (source de-asserted)
                       dst_state <= D_DEACK;
                   end
            D_DEACK: begin
                         ack_b     <= 0;
                         dst_state <= D_IDLE;
                     end
        endcase
    end
end

endmodule
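A quick way to exercise the module is a small self-checking testbench. The following is a sketch only: the clock periods, the four-word test pattern, and the testbench name are arbitrary choices for illustration, and it assumes the req_ack_sync module exactly as listed above.

```verilog
`timescale 1ns/1ps
// Minimal self-checking testbench sketch for req_ack_sync.
// Clock periods, word count, and data values are arbitrary assumptions.
module req_ack_sync_tb;

localparam DATA_W = 8;

reg               clk_a = 1'b0, clk_b = 1'b0;
reg               rst_a_n = 1'b0, rst_b_n = 1'b0;
reg               send_req = 1'b0;
reg  [DATA_W-1:0] data_a = 0;
wire              transfer_done, busy, data_valid_b;
wire [DATA_W-1:0] data_b;

reg  [DATA_W-1:0] expected;
integer           i, errors, received;

req_ack_sync #(.DATA_W(DATA_W)) dut (
    .clk_a(clk_a), .rst_a_n(rst_a_n), .send_req(send_req), .data_a(data_a),
    .transfer_done(transfer_done), .busy(busy),
    .clk_b(clk_b), .rst_b_n(rst_b_n), .data_b(data_b), .data_valid_b(data_valid_b)
);

always #5.0 clk_a = ~clk_a; // 10 ns period
always #6.5 clk_b = ~clk_b; // 13 ns period, unrelated to clk_a

// Check every word the destination reports as valid
always @(posedge clk_b)
    if (data_valid_b) begin
        received = received + 1;
        if (data_b !== expected) begin
            errors = errors + 1;
            $display("%0t: got %h, expected %h", $time, data_b, expected);
        end
    end

initial begin
    errors = 0; received = 0; expected = 0;
    #40 rst_a_n = 1'b1; rst_b_n = 1'b1;      // release both resets
    for (i = 0; i < 4; i = i + 1) begin
        @(posedge clk_a);
        while (busy) @(posedge clk_a);       // wait until the source is idle
        expected  = 8'hA0 + i;
        data_a   <= 8'hA0 + i;
        send_req <= 1'b1;                    // one-cycle request pulse
        @(posedge clk_a);
        send_req <= 1'b0;
        data_a   <= 0;                       // DUT has copied data_a into data_latch
        @(posedge transfer_done);            // full four-phase handshake finished
    end
    repeat (4) @(posedge clk_b);
    if (errors == 0 && received == 4) $display("PASS: %0d words transferred", received);
    else $display("FAIL: errors=%0d received=%0d", errors, received);
    $finish;
end

endmodule
```

Changing the two clock half-periods is an easy way to try very different frequency ratios without touching the rest of the bench.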
Critical Design Rules
Violating any of these causes hard-to-debug intermittent failures in silicon
🚫
Never sync a multi-bit bus

Each bit resolves independently — you can capture a half-old, half-new bus. Only sync single-bit REQ and ACK.

🔒
Lock data before asserting REQ

Data must be stable on the bus before req_a goes high and must not change until transfer_done is received.

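Minimum 2 flip-flop stages per sync

Never reduce to a 1-FF chain to save area. The second flip-flop gives the first one a full clock period to resolve, and that extra cycle is the MTBF insurance; removing it causes intermittent failures that show up at scale.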
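The MTBF argument can be made concrete with the standard synchronizer failure model. This is a worked sketch under assumed numbers: τ = 100 ps is the value quoted earlier on this page, while the metastability window T_0 = 100 ps, the 100 MHz destination clock, and the 10 MHz data toggle rate are illustrative assumptions, and setup time is ignored.

```latex
\mathrm{MTBF} = \frac{e^{T_r/\tau}}{T_0 \, f_{\mathrm{clk}} \, f_{\mathrm{data}}}

% 2-FF chain: FF1 gets a full 10 ns clock period to resolve
\mathrm{MTBF}_{2\text{-FF}} \approx
  \frac{e^{10\,\mathrm{ns}/100\,\mathrm{ps}}}
       {(10^{-10}\,\mathrm{s})(10^{8}\,\mathrm{s^{-1}})(10^{7}\,\mathrm{s^{-1}})}
  = \frac{e^{100}}{10^{5}\,\mathrm{s^{-1}}}
  \approx 2.7 \times 10^{38}\,\mathrm{s}

% Single FF feeding logic directly: assume only T_r = 1 ns of slack remains
\mathrm{MTBF}_{1\text{-FF}} \approx
  \frac{e^{1\,\mathrm{ns}/100\,\mathrm{ps}}}{10^{5}\,\mathrm{s^{-1}}}
  = \frac{e^{10}}{10^{5}\,\mathrm{s^{-1}}}
  \approx 0.22\,\mathrm{s}
```

Because the resolution time sits in the exponent, the extra flip-flop stage changes the failure interval by dozens of orders of magnitude, not by a constant factor.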

📌
Mark async_reg in synthesis

Annotate the 2-FF chain registers with ASYNC_REG / async_reg so the tool keeps them close in placement and doesn't retime them.

📊
Use for low-throughput transfers

Expect roughly one word every 10 or more clock cycles, because each transfer needs four synchronizer crossings. For bulk streaming data, use a Gray-coded async FIFO; Req-Ack is not designed for continuous flow.
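The figure of roughly ten cycles per word can be checked by counting clock edges in the RTL listing above. This is a rough estimate, assuming no extra cycle is lost to metastability resolution; T_A and T_B are symbols introduced here for the CLK_A and CLK_B periods, and each crossing carries up to one period of phase uncertainty.

```latex
% REQ rise -> ACK rise: 2 sync flops + D_IDLE sample + D_CAP state
3T_B < t_{\mathrm{REQ}\uparrow \to \mathrm{ACK}\uparrow} \le 4T_B

% ACK rise -> REQ fall: 2 sync flops + S_WAIT sample
2T_A < t_{\mathrm{ACK}\uparrow \to \mathrm{REQ}\downarrow} \le 3T_A

% REQ fall -> ACK fall: 2 sync flops + D_ACK sample + D_DEACK state
3T_B < t_{\mathrm{REQ}\downarrow \to \mathrm{ACK}\downarrow} \le 4T_B

% ACK fall -> transfer_done: 2 sync flops + S_DEASSERT sample
2T_A < t_{\mathrm{ACK}\downarrow \to \mathrm{done}} \le 3T_A

% Total per word, measured from req_a rising
4T_A + 6T_B < T_{\mathrm{word}} \le 6T_A + 8T_B
\qquad\Rightarrow\qquad
10T < T_{\mathrm{word}} \le 14T \quad \text{when } T_A = T_B = T
```

The same count shows why a much slower clock on either side dominates the transfer time: each domain contributes several of its own periods to every word.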

Clock ratio doesn't matter

Req-Ack works correctly regardless of CLK_A vs CLK_B frequency ratio — including when one is much faster or completely asynchronous.

Req-Ack vs Other CDC Techniques
Choose the right synchronizer for your use case

✓ Use Req-Ack when

  • Transferring multi-bit data at low rate
  • You need guaranteed acknowledgment
  • Clock domains are completely asynchronous
  • Transfer rate: <1 word per 10 cycles
  • Control path signals crossing domains

✓ Use 2-FF Sync when

  • Crossing a single-bit signal only
  • Enable, reset, or flag signals
  • No multi-bit data involved
  • No acknowledgment needed
  • Fastest solution — 2-cycle latency

✓ Use Async FIFO when

  • High-bandwidth data streaming
  • Producer and consumer run independently
  • Audio, video, network packet streams
  • Back-pressure / flow control needed
  • Throughput of up to 1 word per destination clock cycle