
Debugging CDC Issues

Systematic debugging of CDC failures. Common failure modes, root cause analysis, metastability detection, and post-silicon investigation.

By EcrioniX · Published June 13, 2026 · ~4100 words · 12 min read

1. Common CDC Failure Modes

Failure Mode 1: Data Corruption (Glitches)

Symptom: Multi-bit data arrives corrupted (bits swapped, garbled).

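Root cause: A multi-bit binary value crosses domains without Gray coding. Several bits change on the same source edge, and the receiver captures some of them before the transition and some after.

Example: The source steps from 0xFF to 0x00, but the receiver captures 0x80 because bit 7 had not yet settled. The source never drove 0x80, so this is not a valid transition.

Fix: Gray-code multi-bit values that change monotonically (counters, pointers) before synchronizing them, so that only one bit changes per step.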
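To make the one-bit-per-step property concrete, here is a minimal Python sketch of the standard reflected Gray code conversion. The function names are illustrative and not taken from any particular library. It compares how many bits flip on a binary step against the same step in Gray code.

```python
def bin_to_gray(value: int) -> int:
    """Convert a non-negative binary integer to its reflected Gray code."""
    return value ^ (value >> 1)


def gray_to_bin(gray: int) -> int:
    """Convert a reflected Gray code back to plain binary."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def bits_changed(a: int, b: int) -> int:
    """Count how many bit positions differ between two values."""
    return bin(a ^ b).count("1")


# The binary step 0x7F -> 0x80 flips eight bits at once.
print(bits_changed(0x7F, 0x80))                            # 8
# The same step in Gray code flips exactly one bit.
print(bits_changed(bin_to_gray(0x7F), bin_to_gray(0x80)))  # 1
```

With a single changing bit, a receiver that samples mid-transition sees either the old value or the new one, never a value the source did not hold.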

Failure Mode 2: Intermittent Data Loss

Symptom: Occasionally, data written to FIFO doesn't appear on read side.

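Root cause: The FIFO write pointer is not properly synchronized into the read domain. The read side compares against a stale or corrupted write pointer and concludes the FIFO is empty.

Fix: Gray-code the FIFO pointers before they cross, and verify that each crossing has a complete dual-FF synchronizer.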

Failure Mode 3: Deadlock (FIFO stuck)

Symptom: FIFO full flag asserted indefinitely, data can't flow.

Root cause: The read pointer synchronized into the write domain is incorrect (the synchronizer failed), so the full flag is computed from a bad value.

Investigation: Check that synchronized read pointer in write domain equals actual read pointer (accounting for 2-cycle sync latency).
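When checking pointer values by hand, it helps to have a reference for what the flags should be. The sketch below models one common way to compute empty and full from Gray-coded pointers that carry one extra wrap bit. The depth (`ADDR_BITS`) and function names are assumptions for illustration, and your FIFO may use a different scheme.

```python
ADDR_BITS = 3  # hypothetical depth: 2**3 = 8 entries, pointers are 4 bits wide


def bin_to_gray(value: int) -> int:
    """Convert a binary pointer to reflected Gray code."""
    return value ^ (value >> 1)


def is_empty(rptr_gray: int, wptr_gray_synced: int) -> bool:
    """Read domain: empty when the read pointer equals the synchronized write pointer."""
    return rptr_gray == wptr_gray_synced


def is_full(wptr_gray: int, rptr_gray_synced: int) -> bool:
    """Write domain: full when the pointers match except for the top two Gray bits."""
    invert_top_two = 0b11 << (ADDR_BITS - 1)
    return wptr_gray == (rptr_gray_synced ^ invert_top_two)


# Actual write pointer is 6, but the read domain still sees the older value 5.
rptr, wptr_actual, wptr_seen = 5, 6, 5
print(is_empty(bin_to_gray(rptr), bin_to_gray(wptr_seen)))    # True
print(is_empty(bin_to_gray(rptr), bin_to_gray(wptr_actual)))  # False
```

In this model a pointer that is merely stale makes the flags pessimistic for a few cycles and then clears. A flag that stays asserted indefinitely points to a synchronized pointer that is wrong, not just late.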

Failure Mode 4: Random Failures (Metastability)

Symptom: System works 99% of the time, but occasionally fails in specific conditions (specific frequency, temperature, voltage).

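Root cause: The CDC synchronizer is insufficient: a single FF or an incomplete dual-FF chain, so metastability does not fully resolve before downstream logic samples the signal.

Fix: Add synchronizer FF stages, verify timing closure on the synchronizer path, and run formal CDC verification.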
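The benefit of an extra stage can be estimated with the widely used exponential MTBF model, MTBF = exp(t_r / tau) / (T_w * f_clk * f_data). The sketch below is a rough calculator under stated assumptions: the resolution-time model (one clock period per extra stage, minus one setup time) is a simplification, and the parameter values in the example are hypothetical, not data for any real process.

```python
import math


def synchronizer_mtbf(f_clk_hz, f_data_hz, tau_s, t_w_s, t_setup_s, stages=2):
    """Estimate synchronizer MTBF in seconds.

    MTBF = exp(t_r / tau) / (T_w * f_clk * f_data), where t_r is the time
    available for a metastable flop to resolve.
    """
    t_clk = 1.0 / f_clk_hz
    # Assumed model: every stage after the first adds one clock period of
    # resolution time; one setup time is subtracted for the sampling flop.
    t_resolve = (stages - 1) * t_clk - t_setup_s
    return math.exp(t_resolve / tau_s) / (t_w_s * f_clk_hz * f_data_hz)


# Hypothetical numbers: 500 MHz destination clock, 50 MHz data toggle rate,
# tau = 20 ps, metastability window T_w = 50 ps, setup time = 50 ps.
params = dict(f_clk_hz=500e6, f_data_hz=50e6, tau_s=20e-12,
              t_w_s=50e-12, t_setup_s=50e-12)
two_stage = synchronizer_mtbf(stages=2, **params)
three_stage = synchronizer_mtbf(stages=3, **params)
print(f"2-stage MTBF: {two_stage:.3e} s")
print(f"3-stage MTBF: {three_stage:.3e} s")
```

Because resolution time sits in the exponent, one extra clock period multiplies MTBF by exp(T_clk / tau). The same term explains why lowering the clock frequency helps. Real values of tau and T_w must come from the flop characterization for your process.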

2. Debugging Flow (Simulation)

Step 1: Identify failing behavior

Step 2: Isolate the CDC crossing

Step 3: Check synchronizer quality

Step 4: Test under stress

3. Debugging Flow (Post-Silicon)

Step 1: Reproduce the failure

Step 2: Add instrumentation

Step 3: Compare with simulation

Step 4: Root cause determination

4. Post-Silicon Metastability Indicators

Signs of metastability issues in silicon:

5. Forensic Analysis: Trace Examination

If you have logic analyzer traces (simulation or real silicon):

  1. Identify CDC crossing point: Where does source signal change?
  2. Check sync stages: Does FF1 output transition correctly after input change?
  3. Check FF2 stability: Is FF2 output stable before downstream logic uses it?
  4. Look for glitches: Brief spikes (< 1 ns) or runt pulses on a synchronizer output can indicate a metastable state
  5. Check timing: Between the signal change and the FF capture edge, is setup or hold time violated?
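The timing check in the last step can be automated once edge timestamps are exported from a trace. The following is a minimal sketch, assuming timestamps are integers in picoseconds and that the setup and hold values are known for the capturing flop. The function name and the sample numbers are hypothetical.

```python
def find_window_violations(data_edges_ps, clock_edges_ps, setup_ps, hold_ps):
    """Return (data_edge, clock_edge) pairs where the data changed inside
    the setup/hold window around a capturing clock edge."""
    violations = []
    for clk in clock_edges_ps:
        for data in data_edges_ps:
            # Window opens setup_ps before the edge and closes hold_ps after it.
            if clk - setup_ps < data < clk + hold_ps:
                violations.append((data, clk))
    return violations


clock_edges = [0, 1000, 2000, 3000]   # capture edges of the destination clock
data_edges = [430, 1980, 3020]        # transitions of the async source signal
print(find_window_violations(data_edges, clock_edges, setup_ps=50, hold_ps=30))
# [(1980, 2000), (3020, 3000)]
```

On a true asynchronous crossing such hits are expected at FF1, which is why the synchronizer exists. The useful finding is a hit at a flop that is not part of a synchronizer, or an FF2 output that moves inside the window of the logic that consumes it.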

6. Common Debugging Mistakes

  • Mistake: Assuming simulation proves metastability safety (it doesn't, unless explicitly injected)
  • Fix: Inject metastability in simulation, use formal verification
  • Mistake: Debugging only at nominal conditions (misses corner cases)
  • Fix: Test at worst-case PVT first (slow-slow), then other corners
  • Mistake: Assuming all CDC violations show up immediately in simulation
  • Fix: Some CDC bugs are probabilistic (MTBF of hours or longer) and need long simulation runs or a formal proof to expose
  • Mistake: Not checking synthesis netlist against RTL
  • Fix: Run a formal equivalence check (e.g., with Conformal) to ensure synthesis didn't break CDC
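As an illustration of what injecting metastability means in a zero-delay simulation, here is a small behavioral model in Python. It is a sketch of the idea rather than a replacement for a simulator's CDC injection feature. The class and function names are hypothetical, and the model assumes FF1 resolves randomly to the old or new value whenever its input changed since the previous edge.

```python
import random


class TwoFlopSync:
    """Behavioral dual-FF synchronizer with optional metastability injection."""

    def __init__(self, rng: random.Random, inject: bool = True):
        self.rng = rng
        self.inject = inject
        self.prev_in = 0   # async input value seen at the previous clock edge
        self.ff1 = 0
        self.ff2 = 0

    def clock(self, d: int) -> int:
        """Apply one destination clock edge and return the new FF2 output."""
        self.ff2 = self.ff1   # FF2 samples the value FF1 held before this edge
        if self.inject and d != self.prev_in:
            # Input changed since the last edge: treat it as a setup/hold
            # violation and let FF1 resolve randomly to the old or new value.
            self.ff1 = self.rng.choice((self.prev_in, d))
        else:
            self.ff1 = d
        self.prev_in = d
        return self.ff2


def edges_until_seen(seed: int, inject: bool = True) -> int:
    """Count clock edges from an input change until FF2 shows the new value."""
    sync = TwoFlopSync(random.Random(seed), inject)
    for _ in range(3):
        sync.clock(0)      # settle with the input held low
    edges = 0
    while True:
        edges += 1
        if sync.clock(1) == 1:
            return edges


print(edges_until_seen(seed=0, inject=False))   # 2
print(sorted({edges_until_seen(seed) for seed in range(50)}))
```

Without injection the model always reports two edges, which is what an ordinary RTL simulation shows. With injection the latency varies between two and three edges, so downstream logic that silently assumes a fixed latency gets exercised.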

7. Emergency Fixes (Band-Aids)

If you're post-silicon and discover a CDC bug:

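  • Reduce frequency: A slower destination clock gives each synchronizer stage more time to resolve, and MTBF grows exponentially with that time budget
  • Add delay: Insert pipeline stages (buffer time before downstream logic uses data)
  • Temperature management: Run at a lower temperature if possible (this often shortens resolution time, but confirm on your process, since some nodes are slower when cold)
  • Add guard band: Operate below the specified maximum frequency, and avoid the low end of the supply voltage range, since lower voltage slows metastability resolution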

These are temporary. Real fix requires redesign and respin.

8. Prevention: Design Review Checklist

  • Every async signal synchronized?
  • Dual-FF or better minimum?
  • Gray code on multi-bit monotonic?
  • FIFO pointers synchronized?
  • Reset synchronized?
  • No combinational logic from async input?
  • Timing constraints set (false paths marked)?
  • CDC lint passed?
  • Formal verification on critical paths?
  • Testbench includes metastability injection?

9. Incident Reporting Template

If CDC bug occurs, document it for future reference:

  • Title: Brief description
  • Symptom: What went wrong (data corruption, deadlock, intermittent)
  • Reproduction conditions: Frequency, temperature, voltage, data patterns
  • Root cause: Which CDC crossing failed and why
  • Fix: RTL change + new constraints + verification plan
  • Prevention: What lint/formal checks would have caught this

10. Checklist: CDC Debugging

  • Simulation first: Reproduce in sim before post-silicon
  • Inject metastability: Random clock skew, setup/hold violations
  • Test all PVT corners: Especially slow-slow
  • Check synchronizer stages: Count FFs, verify dual minimum
  • Examine traces: Look for glitches, wrong timing
  • Formal verification: Prove crossing properties formally, and estimate MTBF bounds from synchronizer characterization data
  • Compare netlist: Ensure synthesis didn't break CDC
  • Document findings: Root cause, fix, prevention

Next (Day 15): Production verification and final checklist.