Advanced CDC Problems

CDC Convergence & Divergence
— The Sneakiest Clock Domain Bugs

Your 2-FF synchronizer is correct. Your metastability math checks out. And your chip still silently corrupts state. These two structural CDC bugs are invisible to ordinary RTL simulation, easy to miss in code review, and liable to surface only in silicon.


Two bugs. Both invisible in simulation.

CDC divergence and CDC convergence are structural design errors, not timing violations. RTL simulation runs with zero or unit delays and does not model metastability, so both bugs pass simulation cleanly. They also pass ordinary RTL lint. They fail in silicon, intermittently, under specific clock frequency and phase combinations, and often only after you've shipped.

This page walks through each problem with a concrete scenario, a timing diagram, the correct fix, and exactly what to say in an interview.

⚠️
Key insight: both bugs share the same root cause — multiple synchronizer chains resolving metastability independently when they should be coordinated. The difference is topology: divergence is one source fanning to multiple destinations; convergence is multiple signals arriving at one destination from separate chains.

CDC Divergence — One Signal, Two Destinations, Two Different Answers

You have a single control signal en generated in CLK_A. Module B (in domain CLK_B) and Module C (in domain CLK_C) both need it. The natural instinct is to add a 2-FF synchronizer in each domain.

Divergence — Structural Bug
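[Figure: en in the CLK_A domain fans out to two independent 2-FF synchronizers (FF1, FF2), one clocked by clk_b feeding Module B and one clocked by clk_c feeding Module C. Danger zone: Module B sees en=1 while Module C sees en=0, so system state is corrupted.]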
Buggy RTL — Divergence
The Bug: Independent Synchronizers into Two Domains
Each domain gets its own 2-FF synchronizer directly from the source signal. This looks safe — every crossing has a synchronizer. But during a metastability event, FF1 in chain B and FF1 in chain C resolve independently.
verilog — WRONG (divergence bug)
// Source domain: CLK_A generates en
// ─────────────────────────────────────────────────────────────────
// WRONG: en fans out to two independent synchronizer chains.
// During metastability: Module B may see en_b=1, Module C en_c=0.

module top (
  input  clk_a, clk_b, clk_c, rst_n,
  input  en_a,           // source signal in CLK_A domain
  output out_b, out_c
);

  // Synchronizer into CLK_B domain
  reg [1:0] sync_b;
  always @(posedge clk_b or negedge rst_n)
    if (!rst_n) sync_b <= 2'b00;
    else        sync_b <= {sync_b[0], en_a}; // ← directly samples en_a

  // Synchronizer into CLK_C domain
  reg [1:0] sync_c;
  always @(posedge clk_c or negedge rst_n)
    if (!rst_n) sync_c <= 2'b00;
    else        sync_c <= {sync_c[0], en_a}; // ← directly samples en_a
                                               //   different clock → may disagree!

  assign out_b = sync_b[1];
  assign out_c = sync_c[1];

endmodule
💀
What fails: When en_a transitions close to a sampling edge, FF1 in either chain can violate setup/hold and resolve to the old or the new value. Say chain B's FF1 resolves high while chain C's FF1 resolves low and only picks up the new value one clk_c cycle later. Nothing in this structure bounds the order in which the two domains see the change, and if en_a is a short pulse, one domain can capture it while the other misses it entirely. Module B acts on the enable while Module C does not, and any state they latch in that window stays inconsistent: the disagreement outlives the event that caused it.
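One way to make this effect visible in plain RTL simulation is to swap the synchronizer for a simulation-only model that injects uncertainty. The sketch below is a crude assumption-laden model, not a vendor cell: the module name sync2_inject, the SEED parameter, and the "old or new at random" policy are all hypothetical. Whenever the input differs from what was seen at the previous clock edge, FF1 randomly captures either the old or the new value, imitating a transition that landed in the setup/hold window.

```verilog
// Simulation-only 2-FF synchronizer with crude metastability injection.
// Hypothetical model: not synthesizable, not a vendor cell.
module sync2_inject #(parameter SEED = 1) (
  input  clk, rst_n,
  input  d,          // asynchronous input
  output q
);
  reg     ff1, ff2;
  reg     d_prev;    // value of d seen at the previous clk edge
  integer seed;
  initial seed = SEED;

  always @(posedge clk or negedge rst_n)
    if (!rst_n) begin
      ff1    <= 1'b0;
      ff2    <= 1'b0;
      d_prev <= 1'b0;
    end else begin
      d_prev <= d;
      if (d !== d_prev)
        // d changed since the last edge: treat this edge as unlucky and
        // let FF1 land on either the old or the new value.
        ff1 <= ($random(seed) & 1) ? d : d_prev;
      else
        ff1 <= d;    // input was stable: ordinary capture
      ff2 <= ff1;
    end

  assign q = ff2;
endmodule
```

Instantiating this model once per destination domain with different SEED values, in place of the two plain chains in the buggy top module, lets out_b and out_c pick up a change in en_a with different latencies from run to run. A level is always captured within one extra cycle, so the model only adds a one-cycle uncertainty; it does not model resolution time or analog behavior.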
Fixed RTL — Hub Synchronization
The Fix: Synchronize Once, Fan Out Within One Domain
Pick one destination domain as the "hub." Synchronize en_a into that domain. Then distribute the already-synchronized signal to every other destination in that same domain, or re-synchronize from the hub into any remaining domains.
verilog — CORRECT (hub synchronization)
// CORRECT: Synchronize en_a → CLK_B first (hub).
// Fan en_b to Module B directly.
// Re-synchronize en_b → CLK_C for Module C.
// Both now agree: they see the SAME source value.

module top (
  input  clk_a, clk_b, clk_c, rst_n,
  input  en_a,
  output out_b, out_c
);

  // Stage 1: Sync en_a → CLK_B (single synchronizer, single resolution)
  reg [1:0] sync_to_b;
  always @(posedge clk_b or negedge rst_n)
    if (!rst_n) sync_to_b <= 2'b00;
    else        sync_to_b <= {sync_to_b[0], en_a};

  wire en_b = sync_to_b[1]; // Now stable in CLK_B domain

  // Stage 2a: Module B uses en_b directly — same clock domain
  assign out_b = en_b;

  // Stage 2b: Re-sync en_b → CLK_C. Source is now a registered CLK_B signal.
  // en_b is still asynchronous to clk_c, so this chain can go metastable too,
  // but CLK_C's view is derived from CLK_B's and can never get ahead of it.
  reg [1:0] sync_to_c;
  always @(posedge clk_c or negedge rst_n)
    if (!rst_n) sync_to_c <= 2'b00;
    else        sync_to_c <= {sync_to_c[0], en_b};

  assign out_c = sync_to_c[1];

endmodule
Why this is correct: en_a is sampled by exactly one synchronizer, the chain targeting CLK_B, so there is a single decision about when the new value arrives. After FF2, en_b is a resolved, glitch-free flop output in the CLK_B domain. The second chain (CLK_C) still needs its two flops because en_b is asynchronous to clk_c, but it samples Module B's view rather than making its own decision about en_a. Module C always sees the change after Module B, within a few clk_c cycles, so the two domains cannot resolve the same edge in conflicting ways.
📌
Golden rule — CDC Divergence: "Synchronize once, distribute many." A synchronized output is a clean digital signal. Fan it as much as you want — but only within one domain, or through a chain of synchronizers, never in parallel from the same unresolved source.
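The rule can also be expressed structurally. The sketch below wraps the 2-FF chain in a small reusable cell and builds the hub topology by instantiating it in series. The names sync2 and en_hub are hypothetical, and the shared asynchronous rst_n simply mirrors the listings above.

```verilog
// Reusable 2-FF synchronizer cell (hypothetical name).
module sync2 (
  input  clk, rst_n,
  input  d,        // asynchronous to clk; should come straight from a flop
  output q         // safe to use in the clk domain
);
  reg [1:0] s;
  always @(posedge clk or negedge rst_n)
    if (!rst_n) s <= 2'b00;
    else        s <= {s[0], d};
  assign q = s[1];
endmodule

// Hub topology: en_a -> CLK_B -> CLK_C, in series, never in parallel.
module en_hub (
  input  clk_b, clk_c, rst_n,
  input  en_a,     // raw source-domain signal, sampled by ONE chain only
  output en_b,     // hub copy, for every consumer in CLK_B
  output en_c      // derived from en_b, for every consumer in CLK_C
);
  sync2 u_a2b (.clk(clk_b), .rst_n(rst_n), .d(en_a), .q(en_b));
  sync2 u_b2c (.clk(clk_c), .rst_n(rst_n), .d(en_b), .q(en_c));
endmodule
```

The cost of the series arrangement is latency: CLK_C sees the change roughly two clk_c cycles after CLK_B does. If that ordering is wrong for the system, the hub should be the other domain.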

CDC Convergence — Two Signals, One Destination, One Race

Two logically coupled signals — a request flag req and an 8-bit data bus data[7:0] — originate in domain CLK_A. Module B in CLK_B needs both. Again, the natural approach is a separate 2-FF synchronizer for each.

Convergence — Structural Bug
[Figure: req and data[7:0] leave CLK_A through independent 2-FF synchronizers (one chain for req, eight bit-wise chains for data) into CLK_B / Module B. Race condition: req_b arrives in cycle N while some data_b bits need until cycle N+1, so Module B reads garbage.]
Buggy RTL — Convergence
The Bug: req and data Synchronized Independently
Both req and each bit of data[7:0] cross the clock boundary through their own separate 2-FF chains. The designer assumes they arrive in the same CLK_B cycle. Nothing in this structure guarantees that.
verilog — WRONG (convergence bug)
// WRONG: req and data synchronized by separate, independent chains.
// req may resolve to valid 1 cycle before data bits have all settled.

module cdc_conv_bug (
  input        clk_a, clk_b, rst_n,
  input        req_a,
  input  [7:0] data_a,
  output       req_b,
  output [7:0] data_b
);

  // req synchronizer
  reg [1:0] req_sync;
  always @(posedge clk_b or negedge rst_n)
    if (!rst_n) req_sync <= 2'b00;
    else        req_sync <= {req_sync[0], req_a};

  assign req_b = req_sync[1];

  // data synchronizer — 8 independent 2-FF chains, one per bit
  reg [7:0] data_s1, data_s2;
  always @(posedge clk_b or negedge rst_n)
    if (!rst_n) begin data_s1 <= 8'h00; data_s2 <= 8'h00; end
    else        begin data_s1 <= data_a; data_s2 <= data_s1; end
                        // ↑ each bit resolves metastability independently!

  assign data_b = data_s2;

  // Module B logic: act when req_b is seen
  reg [7:0] consumed;     // stand-in for whatever Module B does with the word
  always @(posedge clk_b)
    if (req_b)
      consumed <= data_b; // ← data_b may still be a mix of old and new bits!

endmodule
⏱️
The race in slow motion: At cycle T in CLK_A, req_a goes high and data_a is loaded with a new value. If a CLK_B edge lands close to that transition, req and every data bit can violate setup/hold at their own FF1 in CLK_B, and each FF1 resolves on its own.

req chain: FF1 captures the new value at CLK_B edge N. FF2 passes it on at edge N+1. req_b = 1.
data[3] chain: FF1 resolves to the old value at edge N and captures the new one at edge N+1. data_b[3] is correct only after edge N+2.

Module B sees req_b=1 after edge N+1 and reads data_b, but data_b[3] still holds the old value. The word is a mix of old and new bits, so it is corrupted.
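The same race can be reproduced in simulation with a per-bit injection model. This is a hedged sketch under simplifying assumptions: the module name bus_sync_inject and its SEED parameter are hypothetical, and each changing bit simply lands on the old or the new value at random for one cycle.

```verilog
// Simulation-only model of eight independent 2-FF chains.
// Each bit that changed since the previous clk edge is captured as either
// its old or its new value, independently of the other bits.
module bus_sync_inject #(parameter SEED = 7) (
  input        clk, rst_n,
  input  [7:0] d,        // asynchronous multi-bit input
  output [7:0] q
);
  reg [7:0] s1, s2, d_prev;
  integer   seed;
  initial   seed = SEED;

  wire [7:0] changed = d ^ d_prev;   // bits in flight at this edge

  always @(posedge clk or negedge rst_n)
    if (!rst_n) begin
      s1     <= 8'h00;
      s2     <= 8'h00;
      d_prev <= 8'h00;
    end else begin
      d_prev <= d;
      // XOR-ing a changed bit with 1 flips it back to its old value,
      // so a random mask over the changed bits gives an old/new mix.
      s1     <= d ^ (changed & $random(seed));
      s2     <= s1;
    end

  assign q = s2;
endmodule
```

When d steps from one word to another, q can spend one cycle on a value that is neither the old word nor the new one before it settles. A consumer that samples q on a separately synchronized req can land on exactly that cycle.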
Fixed RTL — req-first Handshake Sampling
The Fix: Sync req, Then Sample data
Only synchronize req across the clock boundary. Keep data stable in the source domain until the synchronized req tells us to latch it. Sample data in the destination domain only after the synchronized req edge is confirmed valid.
verilog — CORRECT (req-first sampling)
// CORRECT: Only req crosses the boundary via a 2-FF synchronizer.
// data_a is sampled in the CLK_B domain on the rising edge of the synchronized req.
// By then, data_a has been stable for at least the synchronizer latency, so it is safe to latch.
//
// Rule: source must hold data_a stable until the ack (synchronized back into CLK_A) is seen.

module cdc_conv_fix (
  input        clk_a, clk_b, rst_n,
  input        req_a,
  input  [7:0] data_a,     // held stable in CLK_A until ack
  output       ack_b,      // level in CLK_B: high once data is latched;
                           //   must be synchronized back into CLK_A
  output [7:0] data_captured
);

  // Step 1: Synchronize req_a into CLK_B (single source of truth)
  reg [1:0] req_sync;
  always @(posedge clk_b or negedge rst_n)
    if (!rst_n) req_sync <= 2'b00;
    else        req_sync <= {req_sync[0], req_a};

  wire req_b = req_sync[1];  // only FF2 feeds logic; FF1 may be metastable

  // Delayed copy of req_b, used for edge detection
  reg req_b_d;
  always @(posedge clk_b or negedge rst_n)
    if (!rst_n) req_b_d <= 1'b0;
    else        req_b_d <= req_b;

  wire req_rise = req_b & ~req_b_d;  // high for one CLK_B cycle

  // Step 2: Sample data_a only on the rising edge of req_b.
  //   At this point, req has been synchronized (≥2 CLK_B cycles old).
  //   data_a has been stable in CLK_A for at least those same cycles.
  //   Crossing a stable level never causes metastability.
  reg [7:0] data_latch;
  always @(posedge clk_b or negedge rst_n)
    if (!rst_n)        data_latch <= 8'h00;
    else if (req_rise) data_latch <= data_a; // safe: data_a is stable

  assign data_captured = data_latch;
  assign ack_b         = req_b_d;    // rises on the edge that latches data

endmodule
Why this is correct: Only one bit crosses the CDC boundary through a synchronizer — req_a. The multi-bit data_a is sampled from the source domain in the destination's clock, but only after the synchronized req has told us the data is stable. A value that is already stable at the time of crossing cannot be metastable — metastability only occurs when a signal transitions within the setup/hold window. Since the source holds data_a static until ack, sampling it in CLK_B after the req sync is always safe.
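The whole scheme rests on the source keeping data_a unchanged while a request is outstanding, and nothing in the RTL enforces that. A simulation-only monitor can check it. The sketch below is one possible checker under stated assumptions: the module name hold_checker and its violation output are hypothetical, and ack_a is assumed to be the acknowledge after it has already been synchronized into CLK_A.

```verilog
// Simulation-only monitor for the "hold data stable until ack" rule.
// Hypothetical helper; ack_a is assumed to be already synchronized to clk_a.
module hold_checker (
  input            clk_a, rst_n,
  input            req_a,
  input            ack_a,
  input      [7:0] data_a,
  output reg       violation   // sticky: set once the rule is broken
);
  reg [7:0] data_q;            // data_a as seen at the previous clk_a edge
  reg       busy_q;            // "request pending" at the previous edge

  wire busy = req_a & ~ack_a;  // request raised, not yet acknowledged

  always @(posedge clk_a or negedge rst_n)
    if (!rst_n) begin
      data_q    <= 8'h00;
      busy_q    <= 1'b0;
      violation <= 1'b0;
    end else begin
      data_q <= data_a;
      busy_q <= busy;
      // Pending at two consecutive edges with different data means the
      // source changed data_a while the destination might be sampling it.
      if (busy && busy_q && (data_a !== data_q)) begin
        violation <= 1'b1;
        $display("%0t: data_a changed while a request was pending", $time);
      end
    end
endmodule
```

Bound alongside the source logic in a testbench, the monitor turns a silent protocol violation into a visible simulation message. It checks only the source-side contract; it says nothing about whether the destination samples at the right time.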
Alternative Fix
Alternative: Full REQ/ACK Handshake
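For cases where the surrounding logic cannot hold data_a stable on its own, or when the source needs confirmation before releasing the bus, use a full REQ/ACK handshake with separate synchronizers for each direction. The source side latches the word into a holding register and keeps it there until the acknowledge comes back.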
verilog — Full Handshake with ACK return path
// Full REQ/ACK handshake (four-phase, level-based):
//  1. Source sets req_a=1, holds data_hold stable.
//  2. req_a → 2-FF sync → req_b in CLK_B.
//  3. CLK_B captures data_hold, raises ack_b (a level, not a pulse).
//  4. ack_b → 2-FF sync → ack_a in CLK_A.
//  5. CLK_A de-asserts req_a; ack_b and then ack_a fall in response.
//  6. Once ack_a is low again, the source may send the next word.

module req_ack_handshake (
  input        clk_a, clk_b, rst_n,
  // Source side
  input        send_a,       // pulse: source wants to send
  input  [7:0] data_a,
  output       ready_a,      // source can send next word
  // Destination side
  output       valid_b,      // one-cycle pulse, aligned with data_b
  output [7:0] data_b
);

  // ── Source FSM ──
  reg       req_a;
  wire      ack_a;           // synchronized ack, driven by the ACK sync below
  reg [7:0] data_hold;

  always @(posedge clk_a or negedge rst_n) begin
    if (!rst_n) begin req_a <= 1'b0; data_hold <= 8'h00; end
    else if (send_a && !req_a && !ack_a) begin
      req_a     <= 1'b1;
      data_hold <= data_a;     // latch and hold until ack
    end else if (req_a && ack_a) begin
      req_a <= 1'b0;
    end
  end
  assign ready_a = !req_a && !ack_a;  // idle only after ack has returned low

  // ── REQ sync: CLK_A → CLK_B ──
  reg [1:0] req_s;
  always @(posedge clk_b or negedge rst_n)
    if (!rst_n) req_s <= 2'b00;
    else        req_s <= {req_s[0], req_a};

  wire req_b = req_s[1];

  // ── Destination: latch data on req_b rising edge ──
  reg [7:0] data_cap;
  reg       req_b_r, valid_r;
  always @(posedge clk_b or negedge rst_n) begin
    if (!rst_n) begin req_b_r <= 1'b0; valid_r <= 1'b0; data_cap <= 8'h00; end
    else begin
      req_b_r <= req_b;
      valid_r <= req_b && !req_b_r;  // registered so it lines up with data_cap
      if (req_b && !req_b_r)         // rising edge of req_b
        data_cap <= data_hold;       // safe: held stable in CLK_A
    end
  end
  wire ack_b = req_b_r;              // level ack: a one-cycle pulse could be
                                     //   missed when clk_a is slower than clk_b
  assign valid_b = valid_r;
  assign data_b  = data_cap;

  // ── ACK sync: CLK_B → CLK_A ──
  reg [1:0] ack_s;
  always @(posedge clk_a or negedge rst_n)
    if (!rst_n) ack_s <= 2'b00;
    else        ack_s <= {ack_s[0], ack_b};

  assign ack_a = ack_s[1];

endmodule
📌
Golden rule for CDC Convergence: "Never assume separately synchronized signals arrive together." If req and data cross through independent chains, their valid windows in the destination domain are independent too. Synchronize only the control signal, hold the data stable in the source domain, and sample it when the synchronized req says so. Do not rely on req and data arriving "at the same time," because nothing guarantees they will.

Side-by-Side: Divergence vs Convergence

| Property | CDC Divergence | CDC Convergence |
|---|---|---|
| Topology | 1 source signal → 2+ destination domains | 2+ source signals → 1 destination domain |
| Root cause | Independent metastability resolution in each destination's FF1 | Independent metastability resolution per-bit or per-signal at destination |
| Failure mode | Two modules disagree on the same signal; state latched meanwhile stays inconsistent | Control signal valid before data bus is stable |
| Visible in simulation? | Not in plain RTL simulation (only with explicit metastability modeling) | Not in plain RTL simulation (only with explicit metastability modeling) |
| Fix strategy | Sync to one hub domain first; fan out from there | Sync control only; sample data after control resolves |
| Best structural fix | Hub synchronization | REQ/ACK handshake or req-first sampling |
| Detection tool | SpyGlass CDC, Questa CDC | SpyGlass CDC (multi-domain convergence) |

Q&A — What Interviewers Actually Ask

Q: A control signal generated in CLK_A is needed in both CLK_B and CLK_C. Is a separate 2-FF synchronizer in each destination domain enough?

Answer: No. This is the CDC divergence bug. Each synchronizer chain resolves metastability independently. During a metastability window, domain B's FF1 may resolve to 1 while domain C's FF1 resolves to 0. Both then propagate their resolved values, so the two modules disagree on the same signal, and any state they latch in the meantime stays inconsistent.

The fix: synchronize the signal into one domain first ("hub"). Then either fan out the already-stable output within that domain, or run it through a second synchronizer into any remaining domains. Only the hub synchronizer samples the raw source signal, so every other domain derives its view from the hub's and the views cannot conflict.

Q: req and data[7:0] each cross from CLK_A to CLK_B through their own 2-FF synchronizers. What can go wrong?

Answer: This is the CDC convergence bug. req and every bit of data each resolve metastability independently. It is entirely possible for req to show its new value in destination clock cycle N while some bit of data still shows its old value in that cycle. When the destination samples data upon seeing req valid, it reads a corrupted word.

The fix: synchronize only req through the 2-FF chain. Only after the synchronized req edge is confirmed in CLK_B is it safe to sample data_a — at that point, data_a has been held stable in CLK_A for at least 2 CLK_B cycles (the synchronizer latency), so it cannot be metastable when read. For full reliability, use a REQ/ACK handshake so the source holds data stable until the destination confirms receipt.

Q: Why does RTL simulation miss both bugs?

Answer: RTL simulation uses zero or unit delays and resolves every signal to a definite 0 or 1 in the same simulation timestep. It has no concept of metastability. When a transition meets the setup violation window, the simulator picks 0 or 1 deterministically. It never drives X unless you explicitly model metastability (e.g., inject Xs via a force statement). Both divergence and convergence failures depend on two independent resolution events differing, which plain simulation cannot represent.

CDC static analysis tools (SpyGlass CDC, Questa CDC, JasperGold) model the domain structure and catch these structural violations at the RTL level, before synthesis.

Q: What is the golden rule for avoiding CDC divergence?

Answer: Synchronize once, distribute many.

A synchronized signal is a clean digital value — fan it to as many consumers as you want, but only within one clock domain. If consumers live in different clock domains, run the signal through a chain of synchronizers (one per domain hop), never in parallel from the same unsynchronized source.

Corollary: the pre-synchronized (source-domain) copy of a CDC signal must never fan out to flip-flops in more than one destination domain simultaneously.

Q: How does a CDC tool such as SpyGlass detect divergence and convergence?

Answer: For divergence, SpyGlass identifies every net that crosses a clock domain boundary (a "CDC crossing point"). If that net fans out to flip-flops in two different destination clock domains, each with its own synchronizer, it flags a "DIVERGE" or "multi-domain fan-out" violation (exact rule names vary by tool and version).

For convergence: SpyGlass checks if two or more signals that originated in the same source domain arrive at a single combinational logic cone or flip-flop in a destination domain through independent synchronizer chains. It reports this as a "CONVERGENCE" or "multi-bit CDC" violation, because those chains may resolve at different times.

Both violations are waiveable if the design intent is understood and architecture is safe, but the default policy in most tape-out flows is to treat them as errors.