Metastability causes, two-stage vs three-stage synchronizers, MTBF equation, async FIFO with Gray code pointers — with complete Verilog RTL.
When a flip-flop samples a signal that is changing within its setup or hold window, the output can enter a metastable state — neither logic 0 nor logic 1 — and remain there for an unpredictable amount of time before resolving to a valid level. This is a fundamental physical phenomenon (not a design bug) caused by the regenerative amplifier behavior of CMOS latches.
In clock domain crossing (CDC), signals launched from an unrelated or asynchronous source clock arrive at the receiving flip-flop at an arbitrary phase, so setup/hold violations are statistically unavoidable regardless of which clock is faster. The solution is synchronization, not elimination.
Critical rule: You cannot prevent metastability from occurring. You can only give it time to resolve. A synchronizer adds one or more clock cycles of latency to allow the metastable flip-flop output to settle before it fans out to combinational logic.
| Parameter | Meaning | Typical Value |
|---|---|---|
| T_resolve | Time available for metastability to resolve = T_clk − T_setup | At 1GHz: 1ns − 0.05ns ≈ 0.95ns per FF stage |
| τ (tau) | FF metastability resolution time constant — process-dependent | 28nm: ~50ps, 7nm: ~30–40ps |
| f_clk | Destination clock frequency | 1 GHz typical |
| f_data | Rate at which the async signal toggles | 100 MHz typical |
| C_meta | Metastability window constant — flip-flop specific | ~10⁻¹² seconds |
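The parameters in the table combine in the first-order MTBF model commonly quoted for synchronizers. The form below is a sketch that assumes C_meta plays the role of the metastability window constant (often written T_W or T_0 in flip-flop characterization data), and the worked numbers reuse the table's illustrative values rather than any measured device.

```latex
\[
\mathrm{MTBF} = \frac{e^{\,T_{\mathrm{resolve}}/\tau}}
                     {C_{\mathrm{meta}} \cdot f_{\mathrm{clk}} \cdot f_{\mathrm{data}}}
\]

% Worked example: T_resolve = 0.95 ns, tau = 50 ps, C_meta = 1e-12 s,
% f_clk = 1 GHz, f_data = 100 MHz
\[
\mathrm{MTBF} \approx \frac{e^{0.95\,\mathrm{ns}/50\,\mathrm{ps}}}
                           {10^{-12}\,\mathrm{s} \cdot 10^{9}\,\mathrm{Hz} \cdot 10^{8}\,\mathrm{Hz}}
              = \frac{e^{19}}{10^{5}\,\mathrm{s}^{-1}}
              \approx \frac{1.8 \times 10^{8}}{10^{5}\,\mathrm{s}^{-1}}
              \approx 1.8 \times 10^{3}\,\mathrm{s}
\]
```

With these values a single resolving interval gives an MTBF of only about 30 minutes. Because T_resolve sits in the exponent, every extra interval of resolution time multiplies the result by the same exponential factor, which is why adding a stage is so effective.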
2-stage vs 3-stage MTBF: Adding a 3rd synchronizer FF multiplies MTBF by roughly e^(T_clk/τ). At 1GHz with τ=50ps: multiplier = e²⁰ ≈ 5×10⁸. A 2-stage sync with MTBF of 100 years becomes a 3-stage sync with MTBF of 5×10¹⁰ years, which is effectively infinite. Use 3-stage when f_clk > 500MHz or safety requirements demand it.
```verilog
// Two-stage synchronizer (standard)
module sync_2stage (
    input  clk_dst, rst_n, async_in,
    output sync_out
);
    // No combinational logic between the FFs. The attribute is attached to
    // the registers themselves so synthesis must not split or merge them.
    (* dont_touch = "true" *) reg ff1, ff2;

    always @(posedge clk_dst or negedge rst_n) begin
        if (!rst_n) {ff2, ff1} <= 2'b0;
        else        {ff2, ff1} <= {ff1, async_in};
    end

    assign sync_out = ff2;
endmodule

// Three-stage synchronizer (high-speed / safety-critical)
module sync_3stage (
    input  clk_dst, rst_n, async_in,
    output sync_out
);
    (* dont_touch = "true" *) reg ff1, ff2, ff3;

    always @(posedge clk_dst or negedge rst_n) begin
        if (!rst_n) {ff3, ff2, ff1} <= 3'b0;
        else        {ff3, ff2, ff1} <= {ff2, ff1, async_in};
    end

    assign sync_out = ff3;
endmodule
```
For multi-bit data crossing clock domains, synchronizing each bit independently doesn't work: bits that change together can resolve in different destination cycles, so the receiver can capture a mix of old and new values. The solution is an asynchronous FIFO with Gray-coded pointers.
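Why Gray code? Only one bit changes between adjacent Gray code values. So when the write pointer is synchronized into the read clock domain, even if the synchronizer samples the changing bit mid-transition, it resolves to either the old or the new pointer value. Off by one is safe because the synchronized pointer can only lag the real one: in the worst case the FIFO appears slightly more full (to the writer) or more empty (to the reader) than it really is, which costs a cycle of throughput but never causes overflow or underflow.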
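The single-bit-change property can be checked directly in simulation. The following is a minimal self-checking sketch, assuming a 4-bit pointer; the module name `gray_tb` and the helper functions `bin2gray`, `gray2bin`, and `popcount` are illustrative names, not part of the article's RTL.

```verilog
// Sketch: binary <-> Gray conversion with a self-checking testbench.
// All names here are hypothetical and chosen for illustration only.
module gray_tb;
    localparam W = 4;

    // Binary to Gray: XOR each bit with its more-significant neighbour.
    function [W-1:0] bin2gray(input [W-1:0] b);
        bin2gray = b ^ (b >> 1);
    endfunction

    // Gray to binary: each bit is the XOR of all Gray bits at or above it.
    function [W-1:0] gray2bin(input [W-1:0] g);
        integer i;
        begin
            gray2bin[W-1] = g[W-1];
            for (i = W-2; i >= 0; i = i - 1)
                gray2bin[i] = gray2bin[i+1] ^ g[i];
        end
    endfunction

    // Count set bits, used to measure how many bits differ between two codes.
    function integer popcount(input [W-1:0] v);
        integer i;
        begin
            popcount = 0;
            for (i = 0; i < W; i = i + 1)
                popcount = popcount + v[i];
        end
    endfunction

    integer n, errors;
    reg [W-1:0] cur, nxt;

    initial begin
        errors = 0;
        for (n = 0; n < (1 << W); n = n + 1) begin
            cur = bin2gray(n[W-1:0]);
            nxt = bin2gray(n[W-1:0] + 1'b1);  // wraps from the last code back to 0
            // Exactly one bit must differ between consecutive codes.
            if (popcount(cur ^ nxt) != 1) errors = errors + 1;
            // Converting back must recover the original binary value.
            if (gray2bin(cur) != n[W-1:0]) errors = errors + 1;
        end
        if (errors == 0) $display("PASS: one bit changes per increment");
        else             $display("FAIL: %0d errors", errors);
        $finish;
    end
endmodule
```

The property only holds when the pointer moves by at most one position per source clock, which is why FIFO pointers are a natural fit and arbitrary multi-bit buses are not.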
```verilog
// Async FIFO with Gray-code pointer synchronization
module async_fifo #(
    parameter DATA_W = 8,
    parameter DEPTH  = 8   // must be a power of 2, at least 4
) (
    input               wclk, wrst_n, wen,
    input  [DATA_W-1:0] wdata,
    output              wfull,
    input               rclk, rrst_n, ren,
    output [DATA_W-1:0] rdata,
    output              rempty
);
    localparam PTR_W = $clog2(DEPTH) + 1; // extra bit for full/empty distinction

    reg [DATA_W-1:0] mem [0:DEPTH-1];

    // Binary pointers (internal addressing)
    reg [PTR_W-1:0] wptr_bin, rptr_bin;
    // Gray code pointers (for crossing). They are registered so the other
    // domain never samples a glitching combinational binary-to-Gray output.
    reg [PTR_W-1:0] wptr_gray, rptr_gray;

    wire [PTR_W-1:0] wptr_next = wptr_bin + (wen && !wfull);
    wire [PTR_W-1:0] rptr_next = rptr_bin + (ren && !rempty);

    // Synchronize wptr_gray -> rclk domain (2-stage)
    reg [PTR_W-1:0] wptr_sync1, wptr_sync2;
    always @(posedge rclk or negedge rrst_n) begin
        if (!rrst_n) {wptr_sync2, wptr_sync1} <= {(2*PTR_W){1'b0}};
        else         {wptr_sync2, wptr_sync1} <= {wptr_sync1, wptr_gray};
    end

    // Synchronize rptr_gray -> wclk domain (2-stage)
    reg [PTR_W-1:0] rptr_sync1, rptr_sync2;
    always @(posedge wclk or negedge wrst_n) begin
        if (!wrst_n) {rptr_sync2, rptr_sync1} <= {(2*PTR_W){1'b0}};
        else         {rptr_sync2, rptr_sync1} <= {rptr_sync1, rptr_gray};
    end

    // Write logic: the memory has no reset, the pointers reset to zero
    always @(posedge wclk)
        if (wen && !wfull) mem[wptr_bin[PTR_W-2:0]] <= wdata;

    always @(posedge wclk or negedge wrst_n) begin
        if (!wrst_n) {wptr_bin, wptr_gray} <= {(2*PTR_W){1'b0}};
        else         {wptr_bin, wptr_gray} <= {wptr_next, wptr_next ^ (wptr_next >> 1)};
    end

    // Read logic
    assign rdata = mem[rptr_bin[PTR_W-2:0]];

    always @(posedge rclk or negedge rrst_n) begin
        if (!rrst_n) {rptr_bin, rptr_gray} <= {(2*PTR_W){1'b0}};
        else         {rptr_bin, rptr_gray} <= {rptr_next, rptr_next ^ (rptr_next >> 1)};
    end

    // Full: top two bits of wptr_gray differ from rptr_sync2, rest equal
    assign wfull  = (wptr_gray == {~rptr_sync2[PTR_W-1:PTR_W-2], rptr_sync2[PTR_W-3:0]});
    // Empty: synchronized wptr equals rptr_gray
    assign rempty = (wptr_sync2 == rptr_gray);
endmodule
```
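The full comparison in the FIFO (top two Gray bits inverted, remaining bits equal) is the least obvious line in the design. The sketch below checks it exhaustively against the binary definition of full, where the write pointer is exactly DEPTH entries ahead of the read pointer. It assumes DEPTH = 8 (so 4-bit pointers); the module name `full_check_tb` and its signal names are illustrative.

```verilog
// Sketch: verify the Gray-domain full test against the binary definition.
// Assumes DEPTH = 8, PTR_W = 4. All names here are hypothetical.
module full_check_tb;
    localparam PTR_W = 4;                  // log2(DEPTH) plus one wrap bit
    localparam DEPTH = 1 << (PTR_W - 1);

    function [PTR_W-1:0] bin2gray(input [PTR_W-1:0] b);
        bin2gray = b ^ (b >> 1);
    endfunction

    integer w, r, errors;
    reg [PTR_W-1:0] wbin, rbin, diff, wg, rg;
    reg full_bin, full_gray;

    initial begin
        errors = 0;
        for (w = 0; w < (1 << PTR_W); w = w + 1) begin
            for (r = 0; r < (1 << PTR_W); r = r + 1) begin
                wbin = w[PTR_W-1:0];
                rbin = r[PTR_W-1:0];

                // Binary definition: occupancy (modulo 2^PTR_W) equals DEPTH.
                diff     = wbin - rbin;
                full_bin = (diff == DEPTH);

                // Gray definition, written the same way as in the FIFO.
                wg = bin2gray(wbin);
                rg = bin2gray(rbin);
                full_gray = (wg == {~rg[PTR_W-1:PTR_W-2], rg[PTR_W-3:0]});

                if (full_bin !== full_gray) errors = errors + 1;
            end
        end
        if (errors == 0) $display("PASS: Gray full test matches binary definition");
        else             $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```

The two agree because flipping only the wrap bit of a binary value flips exactly the top two bits of its Gray encoding. The slice `[PTR_W-3:0]` is also the reason the comparison needs a DEPTH of at least 4.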
| Criterion | 2-Stage Sync | 3-Stage Sync |
|---|---|---|
| Destination clock | < 500 MHz | > 500 MHz |
| Latency | 2 cycles | 3 cycles (+1 vs 2-stage) |
| MTBF improvement | Baseline | ~10⁸–10¹⁰× better at 1GHz |
| Functional safety | Consumer / general purpose | Automotive ASIL-D, aerospace, medical |
| Timing constraint | false path / max_delay on source→FF1; FF1→FF2 timed normally | false path / max_delay on source→FF1; FF1→FF2 and FF2→FF3 timed normally |
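```tcl
# Set false path on async input to first sync FF
set_false_path -from [get_clocks CLK_SRC] \
               -to   [get_cells sync_ff1_reg]

# OR: bound the crossing delay instead of ignoring it.
# (-datapath_only is a Vivado XDC option.) The ff1→ff2 path is left
# unconstrained here so it is timed normally in the destination domain.
set_max_delay -datapath_only 1.5 \
    -from [get_clocks CLK_SRC] \
    -to   [get_cells sync_ff1_reg]

# For FIFO: constrain the Gray code pointer synchronizers to one
# fast-clock period in ns (1000 / MHz)
set_max_delay -datapath_only [expr {1000.0 / $CLK_FAST_MHZ}] \
    -from [get_cells wptr_gray_reg*] \
    -to   [get_cells wptr_sync1_reg*]
set_max_delay -datapath_only [expr {1000.0 / $CLK_FAST_MHZ}] \
    -from [get_cells rptr_gray_reg*] \
    -to   [get_cells rptr_sync1_reg*]
```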
Metastability occurs when a flip-flop samples a signal changing within its setup/hold window — producing an undefined output that takes unpredictable time to resolve. In CDC, asynchronous signals always risk this. Synchronizers add FF stages to give more time to resolve before the output propagates.
Use 3-stage when: destination clock >500MHz, automotive/aerospace functional safety (ASIL-D), or process node τ is large. The 3rd FF multiplies MTBF by ~10⁸–10¹⁰ at 1GHz. Trade-off: +1 clock cycle latency.
Gray code changes only 1 bit at a time between adjacent values. When a binary pointer crosses clock domains, multiple bits can change simultaneously, and a partially captured value can corrupt full/empty detection. With Gray code only 1 bit changes, so the synchronized pointer is always the old or new value (off by 1 at worst), which is safe for FIFO control logic.
Use set_false_path or set_max_delay -datapath_only from the source clock to the first sync FF. Do NOT set false path between sync FF1 and FF2 — that path must be timed (the hold check on FF1→FF2 is real and must pass in the destination clock domain).