Topic 05 — RTL Design

Clock Domain Crossing (CDC)

Safely transferring signals across asynchronous clock boundaries is one of the most critical, and most dangerous, challenges in RTL design. A single unsynchronized signal can cause silent, intermittent failures that only appear in silicon at full speed and that ordinary RTL simulation does not reveal. Understanding CDC physics, synchronizer design, and MTBF analysis is essential for every VLSI engineer.

2-FF / 3-FF Synchronizer
MTBF Analysis
Async FIFO
Gray Code Pointers

Why CDC Is Dangerous

Modern SoCs have dozens of clock domains — a CPU core at 2 GHz, a USB PHY at 480 MHz, an audio DAC at 22.579 MHz, a DDR controller at 1066 MHz. Whenever data flows from one domain to another, it crosses an asynchronous boundary.

The danger: if a flip-flop's setup or hold time is violated — which is statistically inevitable when two unrelated clocks interact — the output can become metastable: neither a clean 0 nor a clean 1, but an indeterminate voltage that can propagate through logic and cause unpredictable failures. CDC bugs are especially treacherous because they pass RTL simulation (which uses zero-delay clocks) and only manifest at speed in real silicon.

Root Cause

Setup/Hold Violation

When the source domain changes data too close to a destination clock edge, within the setup or hold window, the destination FF may fail to resolve to a stable value before its output is used. The window is typically 50–200 ps wide, so with unrelated clocks a violation is statistically certain over billions of clock cycles.

Symptom

Intermittent Silicon Failures

CDC bugs cause random, rate-dependent failures. A chip may pass qualification at 25°C but fail at 85°C (slower resolution). Or pass at nominal voltage but fail at min VDD. These failures can take months to diagnose in post-silicon debug.

Why Sim Misses It

RTL Simulation Is Zero-Delay

Standard RTL simulators assume clocks are ideal and propagation is instantaneous. They never model metastability or resolution time. A signal that crosses domains appears to work perfectly in simulation but fails at speed in silicon.

Metastability — What Actually Happens

A flip-flop is a bistable circuit — it has two stable states (0 and 1) separated by an unstable equilibrium point. When the input changes within the setup/hold window, the FF is forced toward this unstable equilibrium and gets stuck there — neither resolving to 0 nor 1.

The probability of remaining metastable decays exponentially with time. The key parameter is τ (tau) — the metastability resolution time constant of the flip-flop, which depends on the semiconductor process and is typically 20–50 ps for modern CMOS. The longer the FF has to resolve before the next stage samples it, the safer the design.

Key Insight:

You cannot eliminate metastability — you can only reduce its probability to an acceptable level by giving the FF sufficient resolution time. The 2-FF synchronizer gives one full destination clock period for resolution, reducing failure probability to near zero.

MTBF Formula — Full Derivation
// Probability of metastability NOT resolving in time T_res:
// P_fail = T_w × f_data × exp(-T_res / τ)
//
// MTBF = 1 / (f_dest × P_fail)
//      = exp(T_res / τ) / (f_dest × f_data × T_w)
//
// Where:
//   T_res = T_clk_dest - T_setup_FF2 - T_cq_FF1   (resolution time)
//   τ     = metastability time constant (~30 ps, process dependent)
//   T_w   = metastability window (~100 ps = T_setup + T_hold)
//   f_dest = destination clock frequency (Hz)
//   f_data = data toggle rate at source (Hz)
//
// For an N-stage synchronizer, the first FF gets (N-1) resolution intervals
// before the final stage's output is used:
//   MTBF_N = exp((N-1) × T_res / τ) / (f_dest × f_data × T_w)
//
// Example: 500 MHz dst clk, 100 MHz data, τ=30ps, T_w=100ps, 2 stages
//   T_res = 2ns - 0.05ns - 0.1ns = 1.85ns
//   MTBF = exp(1.85e-9 / 30e-12) / (500e6 × 100e6 × 100e-12)
//        = exp(61.7) / 5e6 ≈ 1.2 × 10^20 seconds ≈ 4 × 10^12 years ✓
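
The same arithmetic can be checked in a simulator. The block below is a minimal, simulation-only sketch, not a synthesizable design: the module name `mtbf_calc` and its constant names are made up for illustration, and it assumes a simulator that supports the Verilog-2005 math functions `$ln` and `$log10`. It also assumes the convention used in this article that an N-stage chain gives the first flip-flop N-1 resolution intervals. It works in log10 so the very large exponent cannot overflow.

```verilog
// Simulation-only sketch: evaluates log10(MTBF) for 2 and 3 stages.
// Assumption: an N-stage chain gives the first FF (N-1) resolution intervals.
module mtbf_calc;
  // Values from the worked example (all names here are illustrative)
  localparam real F_DEST  = 500.0e6;    // destination clock (Hz)
  localparam real F_DATA  = 100.0e6;    // source data toggle rate (Hz)
  localparam real TAU     = 30.0e-12;   // metastability time constant (s)
  localparam real T_W     = 100.0e-12;  // metastability window (s)
  localparam real T_SETUP = 0.05e-9;    // FF2 setup time (s)
  localparam real T_CQ    = 0.10e-9;    // FF1 clock-to-Q delay (s)

  real    t_res;
  real    log10_mtbf;
  integer n;

  initial begin
    t_res = 1.0 / F_DEST - T_SETUP - T_CQ;   // resolution time per interval
    for (n = 2; n <= 3; n = n + 1) begin
      // log10(exp(x)) = x / ln(10); subtract log10 of the denominator
      log10_mtbf = ((n - 1) * t_res / TAU) / $ln(10.0)
                   - $log10(F_DEST * F_DATA * T_W);
      $display("stages=%0d  MTBF ~ 10^%0.1f seconds", n, log10_mtbf);
    end
    $finish;
  end
endmodule
```

With these inputs the two-stage result should come out near 10^20 seconds, and the third stage adds roughly 27 more orders of magnitude, which is the exponential gain the formula predicts.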

The 2-FF Synchronizer

The 2-FF synchronizer is the fundamental CDC building block for single-bit signals. Two flip-flops are connected in series, both clocked by the destination domain clock. The first FF samples the async input and may go metastable. The second FF samples after one full destination clock period — enough time for the first FF to almost certainly resolve.

Verilog — 2-FF Synchronizer
module sync_2ff #(
  parameter STAGES = 2,   // number of sync stages (2 or 3)
  parameter RESET_VAL = 1'b0
) (
  input  wire clk_dst,    // destination domain clock
  input  wire rst_n_dst, // destination domain async reset
  input  wire data_src,  // async input from source domain
  output wire data_dst   // synchronized output in dst domain
);
  // Shift register: each stage is one FF in the dst domain.
  // ASYNC_REG (Vivado attribute) must sit on the register declaration.
  // It marks the chain as a synchronizer, so the tool does not optimize
  // or retime across these registers and places them close together.
  (* ASYNC_REG = "TRUE" *) reg [STAGES-1:0] sync_chain;

  always @(posedge clk_dst or negedge rst_n_dst) begin
    if (!rst_n_dst)
      sync_chain <= {STAGES{RESET_VAL}};
    else
      sync_chain <= {sync_chain[STAGES-2:0], data_src};
  end

  assign data_dst = sync_chain[STAGES-1];

endmodule

Critical: ASYNC_REG Attribute

Always mark synchronizer flip-flops with (* ASYNC_REG = "TRUE" *) in Vivado or equivalent attributes in your tool. This prevents synthesis from optimizing, merging, or retiming across the synchronizer chain — which would destroy its metastability protection properties.

3-FF Synchronizer — When to Use It

A 3-FF synchronizer adds a third stage, giving the first FF two full destination clock periods to resolve instead of one. This increases MTBF exponentially — typically from millions of years to astronomical numbers.

Condition | Use 2-FF | Use 3-FF
Destination clock ≥ Source clock | ✅ Standard choice | Overkill
Destination clock < Source clock (fast-to-slow) | Marginal: a short pulse may be missed | ✅ Preferred for safety, but short pulses must still be stretched
High-reliability (medical, automotive, aerospace) | May not meet MTBF spec | ✅ Required
Very high destination clock (>1 GHz) | Short T_res, lower MTBF | ✅ Use 3-FF or more
Low-cost FPGA with poor τ | Marginal | ✅ Safer choice
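
Switching to three stages only requires overriding the `STAGES` parameter of the `sync_2ff` module shown earlier. The wrapper below is a hypothetical usage sketch: the names `irq_sync_3ff`, `irq_async`, and `irq_dst` are invented for illustration, and it relies on `sync_2ff` being compiled alongside it.

```verilog
// Hypothetical wrapper: instantiates sync_2ff with three stages
// instead of the default two.
module irq_sync_3ff (
  input  wire clk_dst,     // destination domain clock
  input  wire rst_n_dst,   // destination domain async reset, active low
  input  wire irq_async,   // single-bit level from another clock domain
  output wire irq_dst      // synchronized copy in the destination domain
);
  sync_2ff #(
    .STAGES   (3),         // first FF gets two resolution intervals
    .RESET_VAL(1'b0)
  ) u_irq_sync (
    .clk_dst  (clk_dst),
    .rst_n_dst(rst_n_dst),
    .data_src (irq_async),
    .data_dst (irq_dst)
  );
endmodule
```

The cost is one extra destination clock cycle of latency on the synchronized signal.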

Why You Cannot 2-FF a Bus

A common beginner mistake: applying a 2-FF synchronizer to a multi-bit data bus. This is incorrect and dangerous. Each bit of the bus has its own independent first FF that may go metastable. Different bits can resolve in different destination clock cycles — producing an intermediate value that was never a valid state in the source domain.

Never Do This:

Passing an 8-bit counter directly through a 2-FF synchronizer. When the counter increments from 0x0F to 0x10, five bits change at once. If bit 3 is captured one cycle later than the others, the destination briefly sees 0x18, a value the counter never held. This corrupts state silently.

Signal Type | CDC Technique | Why
Single-bit control (enable, valid) | 2-FF / 3-FF synchronizer | Only one bit, so no coherency issue
Multi-bit data bus | Async FIFO | Guarantees coherent multi-bit capture
Multi-bit pointer/counter | Gray code + 2-FF sync | Gray code changes only 1 bit at a time
Slow-changing config register | Qualified synchronizer or handshake | Data stable for multiple cycles
Pulse (short signal) | Pulse synchronizer | 2-FF may miss fast pulses
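
The "qualified synchronizer" entry can be sketched as follows. This is one possible arrangement, not a definitive implementation: only a 1-bit qualifier goes through the 2-FF chain, and the bus is captured once the synchronized qualifier rises. The module and signal names (`qualified_sync`, `cfg_src`, `cfg_valid_src`, `cfg_dst`) are invented for illustration. The sketch assumes the source holds `cfg_src` stable from before `cfg_valid_src` rises until well after the destination has captured it.

```verilog
// Sketch of a qualified (enable-gated) synchronizer for a slow-changing bus.
// Assumption: cfg_src is stable before cfg_valid_src rises and stays stable
// until the destination domain has captured it.
module qualified_sync #(parameter W = 8) (
  input  wire         clk_dst,
  input  wire         rst_n_dst,
  input  wire [W-1:0] cfg_src,        // multi-bit value from source domain
  input  wire         cfg_valid_src,  // registered level from source domain
  output reg  [W-1:0] cfg_dst
);
  // Only the 1-bit qualifier passes through the synchronizer chain
  (* ASYNC_REG = "TRUE" *) reg [1:0] valid_sync;
  reg valid_q;   // previous synchronized value, for rising-edge detection

  always @(posedge clk_dst or negedge rst_n_dst) begin
    if (!rst_n_dst) begin
      valid_sync <= 2'b00;
      valid_q    <= 1'b0;
      cfg_dst    <= {W{1'b0}};
    end else begin
      valid_sync <= {valid_sync[0], cfg_valid_src};
      valid_q    <= valid_sync[1];
      // Capture once, on the rising edge of the synchronized qualifier.
      // The bus has been stable for more than one clk_dst period by then.
      if (valid_sync[1] && !valid_q)
        cfg_dst <= cfg_src;
    end
  end
endmodule
```

The bus bits themselves never pass through a synchronizer. They are safe to sample only because the protocol guarantees they are not changing at the capture edge.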

Gray Code for Multi-Bit Pointers

Gray code is a binary encoding in which consecutive values differ by exactly one bit. This makes it safe to synchronize a multi-bit counter across clock domains: even if the synchronizer captures the pointer mid-transition, only one bit is changing, so the captured value is always either the old count or the new count, never an invalid intermediate. The Gray value must come straight from a register in the source domain, because combinational conversion logic can glitch on more than one bit.

Verilog — Binary to Gray Code Conversion
// Convert binary counter to Gray code before crossing domain
module bin2gray #(parameter W = 4) (
  input  [W-1:0] bin,
  output [W-1:0] gray
);
  assign gray = bin ^ (bin >> 1);   // G[i] = B[i] XOR B[i+1]
endmodule

// Convert Gray code back to binary (in destination domain)
module gray2bin #(parameter W = 4) (
  input  [W-1:0] gray,
  output [W-1:0] bin
);
  genvar i;
  generate
    // Named block keeps the generate loop legal under Verilog-2001 as well
    for (i = 0; i < W; i = i + 1) begin : g_bit
      assign bin[i] = ^gray[W-1:i];   // XOR of gray[W-1] down to gray[i]
    end
  endgenerate
endmodule
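
Gray-coded pointers are what make an asynchronous FIFO work: each side keeps a binary pointer for addressing, registers a Gray copy, and passes only that Gray copy through a 2-FF synchronizer. The block below is a minimal sketch along those lines, not a production FIFO. The module name, port names, and parameters `W` (data width) and `A` (address bits, assumed to be at least 2) are illustrative choices. Pointers are one bit wider than the address so that full and empty can be told apart.

```verilog
// Minimal async FIFO sketch with Gray-coded pointers (illustrative names).
// Depth is 2**A entries. Assumes A >= 2.
module async_fifo #(parameter W = 8, parameter A = 4) (
  input  wire         wclk, wrst_n, winc,
  input  wire [W-1:0] wdata,
  output wire         wfull,
  input  wire         rclk, rrst_n, rinc,
  output wire [W-1:0] rdata,
  output wire         rempty
);
  reg [W-1:0] mem [0:(1<<A)-1];
  reg [A:0]   wbin, wgray;   // write pointer: binary and registered Gray
  reg [A:0]   rbin, rgray;   // read pointer: binary and registered Gray
  (* ASYNC_REG = "TRUE" *) reg [A:0] rgray_s1, rgray_s2;  // rgray in wclk domain
  (* ASYNC_REG = "TRUE" *) reg [A:0] wgray_s1, wgray_s2;  // wgray in rclk domain

  wire [A:0] wbin_nxt = wbin + 1'b1;
  wire [A:0] rbin_nxt = rbin + 1'b1;

  // Full: Gray pointers match except for the two MSBs, which are inverted
  assign wfull  = (wgray == {~rgray_s2[A:A-1], rgray_s2[A-2:0]});
  // Empty: Gray pointers match exactly
  assign rempty = (rgray == wgray_s2);
  assign rdata  = mem[rbin[A-1:0]];

  always @(posedge wclk)
    if (winc && !wfull) mem[wbin[A-1:0]] <= wdata;

  always @(posedge wclk or negedge wrst_n) begin
    if (!wrst_n) begin
      wbin <= 0; wgray <= 0; rgray_s1 <= 0; rgray_s2 <= 0;
    end else begin
      {rgray_s2, rgray_s1} <= {rgray_s1, rgray};   // 2-FF sync of read pointer
      if (winc && !wfull) begin
        wbin  <= wbin_nxt;
        wgray <= wbin_nxt ^ (wbin_nxt >> 1);       // same as bin2gray
      end
    end
  end

  always @(posedge rclk or negedge rrst_n) begin
    if (!rrst_n) begin
      rbin <= 0; rgray <= 0; wgray_s1 <= 0; wgray_s2 <= 0;
    end else begin
      {wgray_s2, wgray_s1} <= {wgray_s1, wgray};   // 2-FF sync of write pointer
      if (rinc && !rempty) begin
        rbin  <= rbin_nxt;
        rgray <= rbin_nxt ^ (rbin_nxt >> 1);
      end
    end
  end
endmodule
```

Because each side sees the other pointer two clocks late, `wfull` and `rempty` are pessimistic: the FIFO may report full or empty slightly longer than necessary, but it never overflows or underflows as a result.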

Request-Acknowledge Handshake

When data changes infrequently and latency is acceptable, a req/ack handshake provides a fully safe multi-bit CDC mechanism. The source asserts req (synchronized to destination), destination processes data and asserts ack (synchronized back to source). No data can change until the full handshake completes.

Verilog — Req/Ack Handshake CDC
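// Req/ack handshake for a W-bit value (module and port names are illustrative)
module cdc_handshake #(parameter W = 8) (
  input  wire         clk_src, rst_src_n,
  input  wire         send,      // request a transfer (source domain)
  input  wire [W-1:0] data_in,
  input  wire         clk_dst, rst_dst_n,
  output reg  [W-1:0] data_out
);
  reg          req_src, ack_dst;
  reg  [W-1:0] data_reg;
  wire         req_sync, ack_sync;

  // Source domain: assert req, hold data stable until ack received
  always @(posedge clk_src or negedge rst_src_n) begin
    if (!rst_src_n) begin
      req_src <= 1'b0; data_reg <= {W{1'b0}};
    end else if (send && !req_src && !ack_sync) begin
      data_reg <= data_in;   // capture data
      req_src  <= 1'b1;      // assert request
    end else if (ack_sync) begin
      req_src  <= 1'b0;      // deassert after ack seen
    end
  end

  // Destination domain: sync req, capture data, assert ack
  always @(posedge clk_dst or negedge rst_dst_n) begin
    if (!rst_dst_n) begin
      ack_dst <= 1'b0; data_out <= {W{1'b0}};
    end else if (req_sync && !ack_dst) begin
      data_out <= data_reg;  // data_reg is held stable, so it is safe to capture
      ack_dst  <= 1'b1;      // acknowledge
    end else if (!req_sync) begin
      ack_dst  <= 1'b0;      // deassert ack after req gone
    end
  end

  // Sync req into dst domain, sync ack back into src domain.
  // Each synchronizer is reset by the reset of the domain it lands in.
  sync_2ff req_sync_inst (.clk_dst(clk_dst), .rst_n_dst(rst_dst_n), .data_src(req_src), .data_dst(req_sync));
  sync_2ff ack_sync_inst (.clk_dst(clk_src), .rst_n_dst(rst_src_n), .data_src(ack_dst), .data_dst(ack_sync));
endmodule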

Common CDC Bugs That Escape Simulation

Bug

Multi-bit Bus Through 2-FF

Applying a single 2-FF synchronizer across all bits of a data bus. Each bit's first FF resolves independently — bits can land in different destination cycles, creating invalid intermediate values.

Bug

Fan-Out from Sync Output

Using the output of a synchronizer to drive combinational logic before registering. The long combinational path adds delay, reducing T_res for any downstream FF that captures the result.

Bug

Missing Reconvergence

Two related signals (e.g., data and valid) synchronized separately. They may arrive at the destination in different clock cycles — valid asserts before data is stable or vice versa.

Bug

Fast-to-Slow Pulse Loss

A pulse shorter than one destination clock period may not be captured at all by a 2-FF synchronizer. Fast-to-slow crossing requires pulse stretching or a dedicated pulse synchronizer circuit.

Bug

Optimized-Away Synchronizer

Synthesis tools may optimize or retime registers they do not know are synchronizers. Without ASYNC_REG or equivalent constraints, the two FFs may be merged into one, or logic may be inserted between them.

Bug

Reset Synchronizer Missing

Applying a global asynchronous reset directly to flip-flops in multiple clock domains. Different domains come out of reset at different clock edges, causing state machines to start in inconsistent states.
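
A common remedy is a reset synchronizer in each clock domain: the reset asserts asynchronously but is released only on that domain's clock edge. The block below is a minimal sketch of this pattern; the module name `reset_sync` and its port names are illustrative.

```verilog
// Reset synchronizer sketch: asynchronous assertion, synchronous deassertion.
// Instantiate one per clock domain (names are illustrative).
module reset_sync (
  input  wire clk,          // clock of the domain being reset
  input  wire arst_n,       // raw asynchronous reset, active low
  output wire rst_n_sync    // reset to use for this domain's flip-flops
);
  (* ASYNC_REG = "TRUE" *) reg [1:0] rst_chain;

  always @(posedge clk or negedge arst_n) begin
    if (!arst_n)
      rst_chain <= 2'b00;                  // assert immediately, no clock needed
    else
      rst_chain <= {rst_chain[0], 1'b1};   // release after two clk edges
  end

  assign rst_n_sync = rst_chain[1];
endmodule
```

Every flip-flop in the domain then leaves reset on the same clock edge, so state machines in that domain start from a consistent state.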

Frequently Asked Questions

Can I use false path constraints instead of a synchronizer?

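Setting a false path on a CDC crossing tells STA to ignore timing on that path; it does NOT fix metastability. A false path on its own is only sufficient if the signal is truly static during operation (e.g., a config bit loaded once at power-up). Dynamic CDC crossings always need a synchronizer. A timing exception on the path into its first stage is then added on top of the synchronizer, not in place of it.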

How do CDC verification tools work?

Tools like Questa CDC, SpyGlass CDC, and JasperGold CDC analyze the netlist for paths that cross clock domains without synchronizers, check for reconvergence issues, and verify that synchronizers are coded correctly (2-FF chain, ASYNC_REG attributes, no logic between stages).

What is the difference between a pulse synchronizer and a 2-FF sync?

A 2-FF sync may miss a pulse shorter than one destination clock period. A pulse synchronizer converts the pulse into a level toggle using an SR latch or toggle FF in the source domain, synchronizes that level with a 2-FF chain, and then edge-detects it in the destination domain to regenerate a one-cycle pulse. This ensures even short pulses are reliably captured.
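
A toggle-based version of that idea can be sketched as follows. It is one common arrangement rather than the only one, and the module and signal names are invented for illustration. It assumes source pulses are one `clk_src` cycle wide and spaced at least a couple of `clk_dst` periods apart, so each toggle is sampled before the next one arrives.

```verilog
// Toggle-based pulse synchronizer sketch (illustrative names).
// Assumption: consecutive source pulses are at least about two clk_dst
// periods apart, otherwise two toggles can cancel out and be missed.
module pulse_sync (
  input  wire clk_src, rst_n_src,
  input  wire pulse_src,     // one clk_src cycle wide
  input  wire clk_dst, rst_n_dst,
  output wire pulse_dst      // one clk_dst cycle wide
);
  reg toggle_src;            // flips once per source pulse
  (* ASYNC_REG = "TRUE" *) reg [1:0] tgl_sync;
  reg tgl_q;                 // delayed copy for edge detection

  // Source domain: convert each pulse into a level change
  always @(posedge clk_src or negedge rst_n_src) begin
    if (!rst_n_src)     toggle_src <= 1'b0;
    else if (pulse_src) toggle_src <= ~toggle_src;
  end

  // Destination domain: 2-FF synchronizer plus one delay register
  always @(posedge clk_dst or negedge rst_n_dst) begin
    if (!rst_n_dst) begin
      tgl_sync <= 2'b00;
      tgl_q    <= 1'b0;
    end else begin
      tgl_sync <= {tgl_sync[0], toggle_src};
      tgl_q    <= tgl_sync[1];
    end
  end

  // Any change of the synchronized level marks one source pulse
  assign pulse_dst = tgl_sync[1] ^ tgl_q;
endmodule
```

Because the crossing signal is a level that stays put until the next pulse, the destination cannot miss it regardless of how short the original pulse was.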

Does MTBF depend on temperature and voltage?

Yes. Higher temperature slows transistors, increasing τ and reducing resolution speed. Lower VDD also degrades τ. MTBF calculations should be performed at worst-case (max temp, min VDD) with the slowest-corner τ value from the process characterization data.