Metastability in Clock Domain Crossing
Three-Stage FIFO & Synchronizer Design

Metastability causes, two-stage vs three-stage synchronizers, MTBF equation, async FIFO with Gray code pointers — with complete Verilog RTL.

[Figure: Two-stage vs three-stage synchronizer. 2-stage sync (standard): ASYNC_IN → FF1 (may be metastable) → FF2 (resolved) → SYNC_OUT; one cycle of resolution time, works for most <500 MHz designs. 3-stage sync (high-speed / safety-critical): ASYNC_IN → FF1 → FF2 → FF3 → SYNC_OUT; two cycles of resolution time, far better MTBF at 1 GHz, +1 cycle latency. All FFs are clocked by the destination clock (CLK_DEST domain).]

What is Metastability?

When a flip-flop samples a signal that is changing within its setup or hold window, the output can enter a metastable state — neither logic 0 nor logic 1 — and remain there for an unpredictable amount of time before resolving to a valid level. This is a fundamental physical phenomenon (not a design bug) caused by the regenerative amplifier behavior of CMOS latches.

In clock domain crossing (CDC), a signal launched from an asynchronous source clock arrives at the receiving flip-flop at an arbitrary phase relative to the destination clock, whichever clock is faster, so setup/hold violations are statistically unavoidable. The solution is synchronization, not elimination.

Critical rule: You cannot prevent metastability from occurring. You can only give it time to resolve. A synchronizer adds one or more clock cycles of latency to allow the metastable flip-flop output to settle before it fans out to combinational logic.

MTBF Equation

$$\text{MTBF} = \frac{e^{T_{resolve}/\tau}}{f_{clk} \times f_{data} \times C_{meta}}$$
| Parameter | Meaning | Typical Value |
|---|---|---|
| T_resolve | Time available for metastability to resolve = T_clk − T_setup | At 1 GHz: 1 ns − 0.05 ns ≈ 0.95 ns per FF stage |
| τ (tau) | FF metastability resolution time constant (process-dependent) | 28nm: ~50 ps, 7nm: ~30–40 ps |
| f_clk | Destination clock frequency | 1 GHz typical |
| f_data | Rate at which the async signal toggles | 100 MHz typical |
| C_meta | Metastability window constant (flip-flop specific) | ~10⁻¹² seconds |
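
As a rough worked example, the typical values from the table can be plugged into the equation. This assumes all of them apply to the same design and that only one resolution period is available (the FF1 → FF2 interval of a 2-stage synchronizer); real flip-flop parameters should come from the library characterization.

```latex
\begin{align*}
\frac{T_{resolve}}{\tau} &= \frac{0.95\ \text{ns}}{50\ \text{ps}} = 19 \\
f_{clk} \times f_{data} \times C_{meta} &= 10^{9} \times 10^{8} \times 10^{-12} = 10^{5}\ \text{s}^{-1} \\
\text{MTBF} &= \frac{e^{19}}{10^{5}\ \text{s}^{-1}}
  \approx \frac{1.8 \times 10^{8}}{10^{5}}\ \text{s}
  \approx 1.8 \times 10^{3}\ \text{s} \approx 30\ \text{minutes}
\end{align*}
```

With these particular numbers a single resolution period is clearly not enough, which is why fast destination clocks push designs toward an extra synchronizer stage.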

2-stage vs 3-stage MTBF: Adding a 3rd synchronizer FF multiplies MTBF by roughly e^(T_clk/τ), ignoring the small setup-time term. At 1 GHz with τ = 50 ps: multiplier = e²⁰ ≈ 5×10⁸. A 2-stage sync with an MTBF of 100 years becomes a 3-stage sync with an MTBF of 5×10¹⁰ years, which is effectively infinite. Use 3-stage when f_clk > 500 MHz or safety requirements demand it.
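
The multiplier follows from the MTBF equation under one simplifying assumption: each extra stage adds one more full T_resolve of settling time, so an N-stage chain has (N − 1)·T_resolve available. The subscript N is notation introduced here for illustration.

```latex
\begin{align*}
\text{MTBF}_N &= \frac{e^{(N-1)\,T_{resolve}/\tau}}{f_{clk} \times f_{data} \times C_{meta}} \\
\frac{\text{MTBF}_3}{\text{MTBF}_2} &= e^{T_{resolve}/\tau}
  \approx e^{T_{clk}/\tau}
  = e^{1\ \text{ns}/50\ \text{ps}} = e^{20} \approx 4.9 \times 10^{8}
\end{align*}
```

Using T_resolve = 0.95 ns instead of the full clock period gives e¹⁹ ≈ 1.8×10⁸, the same order of magnitude.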

Synchronizer Types

verilog
// Two-stage synchronizer (standard)
module sync_2stage (
  input  clk_dst, rst_n, async_in,
  output sync_out
);
  // No combinational logic between FFs. The attribute goes on the registers
  // (not the always block) so synthesis keeps them as a back-to-back pair.
  (* dont_touch = "true" *) reg ff1, ff2;
  always @(posedge clk_dst or negedge rst_n) begin
    if (!rst_n) {ff2, ff1} <= 2'b0;
    else        {ff2, ff1} <= {ff1, async_in};
  end
  assign sync_out = ff2;
endmodule

// Three-stage synchronizer (high-speed / safety-critical)
module sync_3stage (
  input  clk_dst, rst_n, async_in,
  output sync_out
);
  // Attribute on the registers keeps the three-FF chain intact through synthesis
  (* dont_touch = "true" *) reg ff1, ff2, ff3;
  always @(posedge clk_dst or negedge rst_n) begin
    if (!rst_n) {ff3, ff2, ff1} <= 3'b0;
    else        {ff3, ff2, ff1} <= {ff2, ff1, async_in};
  end
  assign sync_out = ff3;
endmodule
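
The two modules differ only in chain length, so they can be folded into one parameterized module. The following is a minimal sketch, not part of the original design: the module name `sync_nstage` and the `STAGES` parameter are hypothetical, and the `dont_touch` attribute is tool-specific.

```verilog
// Hypothetical generalisation of sync_2stage / sync_3stage.
module sync_nstage #(
  parameter STAGES = 2              // number of FFs in the chain, must be >= 2
) (
  input  wire clk_dst,              // destination clock
  input  wire rst_n,                // active-low asynchronous reset
  input  wire async_in,             // signal from the other clock domain
  output wire sync_out              // synchronized output
);
  // chain[0] samples the asynchronous input; chain[STAGES-1] drives the output.
  (* dont_touch = "true" *) reg [STAGES-1:0] chain;

  always @(posedge clk_dst or negedge rst_n) begin
    if (!rst_n) chain <= {STAGES{1'b0}};
    else        chain <= {chain[STAGES-2:0], async_in};  // shift by one stage
  end

  assign sync_out = chain[STAGES-1];
endmodule
```

Instantiating it as `sync_nstage #(.STAGES(3)) u_sync (...)` should behave like `sync_3stage`; leaving the parameter at its default matches `sync_2stage`.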

Async FIFO — Multi-Bit CDC Using Gray Code

For multi-bit data crossing clock domains, synchronizing each bit independently doesn't work: bits that change together can resolve in different destination cycles, so the receiver may capture a mix of old and new values. The solution is an asynchronous FIFO with Gray-coded pointers.

Why Gray code? Only one bit changes between adjacent Gray code values. So when the write pointer is synchronized into the read clock domain, even if the synchronizer captures a metastable value, it resolves to either the old or the new pointer value. Being off by one is safe because the error is always pessimistic (worst case: the FIFO briefly appears more full or more empty than it really is, so no overflow or underflow can result).
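
The one-bit-change property can be checked exhaustively in simulation. The following is a small self-checking sketch; the testbench name and the width `W = 4` are illustrative choices, not part of the original design. It also covers the wrap from the last value back to zero, which is why the depth must be a power of 2.

```verilog
// Self-checking sketch: every step of a W-bit Gray counter flips exactly one bit.
module gray_adjacent_tb;
  localparam W = 4;                 // illustrative pointer width
  integer i, errors;
  reg [W-1:0] g_cur, g_nxt, diff;

  // Same binary-to-Gray mapping used for the FIFO pointers
  function [W-1:0] bin2gray(input [W-1:0] b);
    bin2gray = b ^ (b >> 1);
  endfunction

  initial begin
    errors = 0;
    for (i = 0; i < (1 << W); i = i + 1) begin
      g_cur = bin2gray(i);          // argument is truncated to W bits
      g_nxt = bin2gray(i + 1);      // wraps to 0 after the last value
      diff  = g_cur ^ g_nxt;        // bits that differ between neighbours
      // One-hot test: non-zero, and clearing the lowest set bit leaves zero
      if (diff == 0 || (diff & (diff - 1'b1)) != 0) errors = errors + 1;
    end
    if (errors == 0) $display("PASS: adjacent Gray codes differ in exactly one bit");
    else             $display("FAIL: %0d violations", errors);
    $finish;
  end
endmodule
```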

verilog
// Async FIFO — Gray code pointer synchronization
module async_fifo #(
  parameter DATA_W = 8,
  parameter DEPTH  = 8   // must be a power of 2, and >= 4 for the full-flag bit slice below
) (
  input              wclk, wrst_n, wen,
  input  [DATA_W-1:0] wdata,
  output             wfull,

  input              rclk, rrst_n, ren,
  output [DATA_W-1:0] rdata,
  output             rempty
);
  localparam PTR_W = $clog2(DEPTH) + 1;  // extra bit for full/empty distinction

  reg [DATA_W-1:0] mem [0:DEPTH-1];

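  // Binary pointers (internal)
  reg [PTR_W-1:0] wptr_bin, rptr_bin;

  // Gray code pointers (for crossing). These are registered: a combinational
  // binary-to-Gray XOR can glitch, and a glitch sampled in the other domain
  // would break the one-bit-change guarantee.
  reg [PTR_W-1:0] wptr_gray, rptr_gray;

  // Next binary pointer values: advance only on an accepted write or read
  wire [PTR_W-1:0] wptr_bin_nxt = wptr_bin + (wen && !wfull);
  wire [PTR_W-1:0] rptr_bin_nxt = rptr_bin + (ren && !rempty);

  // Synchronize wptr_gray → rclk domain (2-stage)
  reg [PTR_W-1:0] wptr_sync1, wptr_sync2;
  always @(posedge rclk or negedge rrst_n) begin
    if (!rrst_n) {wptr_sync2, wptr_sync1} <= {(2*PTR_W){1'b0}};
    else         {wptr_sync2, wptr_sync1} <= {wptr_sync1, wptr_gray};
  end

  // Synchronize rptr_gray → wclk domain (2-stage)
  reg [PTR_W-1:0] rptr_sync1, rptr_sync2;
  always @(posedge wclk or negedge wrst_n) begin
    if (!wrst_n) {rptr_sync2, rptr_sync1} <= {(2*PTR_W){1'b0}};
    else         {rptr_sync2, rptr_sync1} <= {rptr_sync1, rptr_gray};
  end

  // Write logic: pointers are reset, the memory array is not
  always @(posedge wclk or negedge wrst_n) begin
    if (!wrst_n) begin
      wptr_bin  <= {PTR_W{1'b0}};
      wptr_gray <= {PTR_W{1'b0}};
    end else begin
      wptr_bin  <= wptr_bin_nxt;
      wptr_gray <= wptr_bin_nxt ^ (wptr_bin_nxt >> 1);
    end
  end
  always @(posedge wclk) if (wen && !wfull) mem[wptr_bin[PTR_W-2:0]] <= wdata;

  // Read logic
  assign rdata = mem[rptr_bin[PTR_W-2:0]];
  always @(posedge rclk or negedge rrst_n) begin
    if (!rrst_n) begin
      rptr_bin  <= {PTR_W{1'b0}};
      rptr_gray <= {PTR_W{1'b0}};
    end else begin
      rptr_bin  <= rptr_bin_nxt;
      rptr_gray <= rptr_bin_nxt ^ (rptr_bin_nxt >> 1);
    end
  end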

  // Full: top two bits of wptr_gray are the inverse of rptr_sync2's, remaining bits equal
  assign wfull  = (wptr_gray == {~rptr_sync2[PTR_W-1:PTR_W-2], rptr_sync2[PTR_W-3:0]});
  // Empty: synchronized wptr equals rptr_gray
  assign rempty = (wptr_sync2 == rptr_gray);
endmodule
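
The Gray-coded full and empty comparisons can look arbitrary next to the familiar binary view ("write pointer is exactly DEPTH ahead" and "pointers are equal"). The following self-checking sketch compares the two views over every pointer pair. The testbench name and signal names are illustrative, and it assumes the default DEPTH = 8 (PTR_W = 4).

```verilog
// Self-checking sketch: Gray-coded full/empty tests vs. the binary definitions.
module gray_full_check_tb;
  localparam PTR_W = 4;                    // DEPTH = 8 plus one wrap bit
  localparam DEPTH = 1 << (PTR_W - 1);
  integer w, r, errors;
  reg [PTR_W-1:0] wb, rb, wg, rg, dist;
  reg full_bin, full_gray, empty_bin, empty_gray;

  function [PTR_W-1:0] bin2gray(input [PTR_W-1:0] b);
    bin2gray = b ^ (b >> 1);
  endfunction

  initial begin
    errors = 0;
    for (w = 0; w < 2*DEPTH; w = w + 1) begin
      for (r = 0; r < 2*DEPTH; r = r + 1) begin
        wb   = w;
        rb   = r;
        dist = wb - rb;                    // occupancy, modulo 2^PTR_W
        wg   = bin2gray(wb);
        rg   = bin2gray(rb);
        // Binary definitions
        full_bin   = (dist == DEPTH);
        empty_bin  = (dist == 0);
        // Gray-coded comparisons, in the same form as the FIFO's flag logic
        full_gray  = (wg == {~rg[PTR_W-1:PTR_W-2], rg[PTR_W-3:0]});
        empty_gray = (wg == rg);
        if (full_bin !== full_gray || empty_bin !== empty_gray)
          errors = errors + 1;
      end
    end
    if (errors == 0) $display("PASS: Gray and binary full/empty tests agree");
    else             $display("FAIL: %0d mismatches", errors);
    $finish;
  end
endmodule
```

The check is purely combinational. It says nothing about synchronizer latency, which only ever makes the real flags pessimistic.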

2-Stage vs 3-Stage: When to Use Which

| Criterion | 2-Stage Sync | 3-Stage Sync |
|---|---|---|
| Destination clock | < 500 MHz | > 500 MHz |
| Latency | 2 cycles | 3 cycles (+1 vs 2-stage) |
| MTBF improvement | Baseline | ~10⁸–10¹⁰× better at 1 GHz |
| Functional safety | Consumer / general purpose | Automotive ASIL-D, aerospace, medical |
| Timing constraint | false path or max_delay on the async input into FF1; FF1→FF2 timed normally | Same as 2-stage; FF1→FF2 and FF2→FF3 both timed normally |

CDC SDC Constraints

tcl / SDC
# Set false path on async input to first sync FF
set_false_path -from [get_clocks CLK_SRC] \
               -to   [get_cells sync_ff1_reg]

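# Optional: bound FF1→FF2 tightly so the two FFs are placed close together,
# leaving most of the cycle for metastability to resolve. No -datapath_only here:
# this is a same-clock path and its hold check must stay active.
set_max_delay 1.5 \
  -from [get_cells {sync_ff1_reg}] \
  -to   [get_cells {sync_ff2_reg}]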

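# For FIFO: bound Gray code pointer skew to one period of the faster clock.
# Period in ns = 1000 / frequency in MHz; CLK_FAST_MHZ is assumed to be set earlier.
set_max_delay -datapath_only [expr {1000.0 / $CLK_FAST_MHZ}] \
  -from [get_cells wptr_gray_reg*] \
  -to   [get_cells wptr_sync1_reg*]
set_max_delay -datapath_only [expr {1000.0 / $CLK_FAST_MHZ}] \
  -from [get_cells rptr_gray_reg*] \
  -to   [get_cells rptr_sync1_reg*]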

Frequently Asked Questions

What is metastability in clock domain crossing?

Metastability occurs when a flip-flop samples a signal changing within its setup/hold window — producing an undefined output that takes unpredictable time to resolve. In CDC, asynchronous signals always risk this. Synchronizers add FF stages to give more time to resolve before the output propagates.

When should I use a three-stage synchronizer?

Use 3-stage when: destination clock >500MHz, automotive/aerospace functional safety (ASIL-D), or process node τ is large. The 3rd FF multiplies MTBF by ~10⁸–10¹⁰ at 1GHz. Trade-off: +1 clock cycle latency.

Why use Gray code in async FIFO?

Gray code changes only 1 bit at a time between adjacent values. When a binary pointer crosses clock domains, multiple bits can change simultaneously, and sampling them mid-transition can corrupt full/empty detection. With Gray code, only 1 bit changes, so the synchronized pointer is always the old or new value (off by 1 at worst), which is safe for FIFO control logic.

What SDC constraint do I need for a synchronizer?

Use set_false_path or set_max_delay -datapath_only from the source clock to the first sync FF. Do NOT set false path between sync FF1 and FF2 — that path must be timed (the hold check on FF1→FF2 is real and must pass in the destination clock domain).
