Animate the metastability window. Watch a 1-FF synchronizer fail and a 2-FF synchronizer rescue it. Simulate pulse and handshake synchronizers to make clock domain crossing (CDC) failures, a leading source of SoC silicon bugs, visible.
Every flip-flop has a setup time (data must be stable before the clock edge) and a hold time (data must stay stable after the clock edge). If a data transition falls inside this setup-hold window, the FF output can become metastable: it hovers between 0 and 1 for an unpredictable duration before settling to either value.
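$$\mathrm{MTBF} = \frac{e^{T_r/\tau}}{f_{clk} \times f_{data} \times T_w}$$
$T_r$ = resolution time available | $\tau$ = metastability time constant | $T_w$ = danger window width | $f_{clk}$ = sampling (destination) clock frequency | $f_{data}$ = data toggle rate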
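To get a feel for how strongly MTBF depends on the resolution time, here is a worked example. The numbers are purely illustrative assumptions for this sketch, not values from any datasheet: τ = 100 ps, Tw = 100 ps, fclk = 100 MHz, fdata = 10 MHz.

```latex
% Assumed values: tau = 100 ps, T_w = 100 ps, f_clk = 100 MHz, f_data = 10 MHz
\begin{align*}
f_{clk} \times f_{data} \times T_w
  &= 10^{8}\,\mathrm{s^{-1}} \times 10^{7}\,\mathrm{s^{-1}} \times 10^{-10}\,\mathrm{s}
   = 10^{5}\,\mathrm{s^{-1}} \\[4pt]
T_r = 1\,\mathrm{ns}: \quad
\mathrm{MTBF} &= \frac{e^{10}}{10^{5}\,\mathrm{s^{-1}}}
   \approx \frac{2.2 \times 10^{4}}{10^{5}}\,\mathrm{s}
   \approx 0.22\,\mathrm{s} \\[4pt]
T_r = 5\,\mathrm{ns}: \quad
\mathrm{MTBF} &= \frac{e^{50}}{10^{5}\,\mathrm{s^{-1}}}
   \approx \frac{5.2 \times 10^{21}}{10^{5}}\,\mathrm{s}
   \approx 5.2 \times 10^{16}\,\mathrm{s}
   \approx 1.6 \times 10^{9}\ \text{years}
\end{align*}
```

Under these assumptions, 4 ns of extra resolution time moves the design from a failure several times per second to one in over a billion years, because the resolution time sits in the exponent.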
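DATA_A crosses from the fast clock domain (CLK_A) to the slow clock domain (CLK_B). FF1_Q is the output of the first FF, which can go metastable. FF2_Q is the output of the second FF, which captures a cleanly resolved value with very high (but not absolute) probability.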
| Property | 1-FF Synchronizer | 2-FF Synchronizer |
|---|---|---|
| Resolution time ($T_r$, idealized) | 1 × $T_{clk}$ | 2 × $T_{clk}$ |
| Failure probability / cycle | ∝ $e^{-T_{clk}/\tau}$ | ∝ $e^{-2T_{clk}/\tau}$ |
| MTBF improvement | Baseline | ~$e^{T_{clk}/\tau}$ × better |
| Latency added | 1 cycle (CLK_B) | 2 cycles (CLK_B) |
| Safe for single-bit? | Risky | Yes ✓ |
| Safe for multi-bit bus? | No ✗ | No — use Gray code or FIFO ✗ |
// 2-FF Synchronizer (single-bit CDC)
// Place both FFs in the DESTINATION clock domain
module sync_2ff #(
    parameter STAGES = 2   // must be >= 2; increase to 3 for very high-speed designs
)(
    input  wire clk_dst,   // destination clock
    input  wire rst_n,     // asynchronous, active-low reset
    input  wire d_src,     // async input from source domain (drive it from a source-domain FF)
    output wire q_sync     // synchronized output
);
    // Synthesis attributes keep the chain from being optimized or retimed.
    // Xilinx attribute shown; Intel alternative:
    // (* altera_attribute = "-name SYNCHRONIZER_IDENTIFICATION FORCED_IF_ASYNCHRONOUS" *)
    (* ASYNC_REG = "TRUE" *) reg [STAGES-1:0] sync_ff;

    always @(posedge clk_dst or negedge rst_n) begin
        if (!rst_n)
            sync_ff <= {STAGES{1'b0}};                // plain-Verilog form of SystemVerilog '0
        else
            sync_ff <= {sync_ff[STAGES-2:0], d_src};  // shift toward the MSB
    end

    assign q_sync = sync_ff[STAGES-1];
endmodule
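As a usage sketch, the module above could be instantiated as follows. The wrapper name `irq_cdc_example` and its port names are hypothetical and chosen only for illustration; the choice of `STAGES = 3` is an assumption standing in for a design whose destination clock is fast enough to want extra margin.

```verilog
// Hypothetical wrapper: brings an asynchronous interrupt line into the
// clk_sys domain through a 3-stage instance of sync_2ff.
module irq_cdc_example (
    input  wire clk_sys,     // destination (system) clock
    input  wire rst_sys_n,   // active-low reset for the clk_sys domain
    input  wire irq_async,   // single-bit signal from another clock domain
    output wire irq_sys      // irq_async, synchronized to clk_sys
);
    sync_2ff #(
        .STAGES (3)          // one extra stage of resolution time
    ) u_irq_sync (
        .clk_dst (clk_sys),
        .rst_n   (rst_sys_n),
        .d_src   (irq_async),
        .q_sync  (irq_sys)
    );
endmodule
```

Each extra stage adds one destination clock cycle of latency, so the stage count is a trade between MTBF and response time.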
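A simple 2-FF synchronizer can miss a pulse when the source clock is faster than the destination clock, because the pulse can rise and fall between two destination clock edges. The toggle approach converts each pulse into a level change that persists until it is sampled. Capture then no longer depends on the clock ratio, provided consecutive pulses are spaced far enough apart (roughly two to three destination clock cycles) for each toggle to be sampled before the next one arrives.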
// Toggle-based Pulse Synchronizer
module pulse_sync (
    input  wire clk_src,    // source clock
    input  wire clk_dst,    // destination clock
    input  wire rst_n,
    input  wire pulse_in,   // single-cycle pulse in clk_src domain
    output wire pulse_out   // pulse in clk_dst domain
);
    // Source domain: toggle on each pulse
    reg toggle_src;
    always @(posedge clk_src or negedge rst_n)
        if (!rst_n)        toggle_src <= 1'b0;
        else if (pulse_in) toggle_src <= ~toggle_src;

    // 2-FF synchronizer to destination clock
    (* ASYNC_REG = "TRUE" *) reg [1:0] sync_ff;
    always @(posedge clk_dst or negedge rst_n)
        if (!rst_n) sync_ff <= 2'b00;
        else        sync_ff <= {sync_ff[0], toggle_src};

    // Edge detect: any transition on sync output = one pulse
    reg sync_prev;
    always @(posedge clk_dst or negedge rst_n)
        if (!rst_n) sync_prev <= 1'b0;
        else        sync_prev <= sync_ff[1];

    assign pulse_out = sync_ff[1] ^ sync_prev;
endmodule
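One way to check the fast-to-slow behavior in simulation is a small testbench like the sketch below. Everything in it is an assumption made for illustration: the testbench name `tb_pulse_sync`, the counters `sent` and `seen`, the clock periods (10 ns source, 34 ns destination), and the pulse spacing. It sends five well-spaced single-cycle pulses and compares the number of pulses counted in the destination domain.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench sketch for pulse_sync (all names are illustrative).
module tb_pulse_sync;
    reg  clk_src  = 1'b0;
    reg  clk_dst  = 1'b0;
    reg  rst_n    = 1'b0;
    reg  pulse_in = 1'b0;
    wire pulse_out;

    integer sent = 0;   // pulses driven in the source domain
    integer seen = 0;   // pulses observed in the destination domain
    integer i;

    always #5  clk_src = ~clk_src;   // 10 ns period (fast source clock)
    always #17 clk_dst = ~clk_dst;   // 34 ns period (slow destination clock)

    pulse_sync dut (
        .clk_src   (clk_src),
        .clk_dst   (clk_dst),
        .rst_n     (rst_n),
        .pulse_in  (pulse_in),
        .pulse_out (pulse_out)
    );

    // Count each destination-domain pulse once per clk_dst cycle
    always @(posedge clk_dst)
        if (pulse_out) seen = seen + 1;

    initial begin
        #40 rst_n = 1'b1;                      // release reset away from clock edges
        for (i = 0; i < 5; i = i + 1) begin
            @(posedge clk_src); pulse_in <= 1'b1; sent = sent + 1;
            @(posedge clk_src); pulse_in <= 1'b0;
            repeat (12) @(posedge clk_src);    // 120 ns idle, more than 3 clk_dst periods
        end
        repeat (4) @(posedge clk_dst);         // let the last toggle propagate
        if (seen == sent) $display("PASS: %0d sent, %0d seen", sent, seen);
        else              $display("FAIL: %0d sent, %0d seen", sent, seen);
        $finish;
    end
endmodule
```

Shrinking the idle gap to one or two source cycles is a quick way to see the spacing limit: back-to-back toggles can cancel before the destination samples them, and pulses are lost.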
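Use a handshake for multi-cycle control or when acknowledgment is required. The request costs about 2 destination clock cycles to synchronize and the acknowledgment about 2 source clock cycles to return, so total latency depends on both clocks; a full four-phase handshake repeats both paths to return to idle. Data must be held stable throughout.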
// Request-Acknowledge Handshake Synchronizer (four-phase)
module handshake_sync (
    // Source domain
    input  wire       clk_src, rst_src_n,
    input  wire       req_in,     // request from source logic
    input  wire [7:0] data_in,    // data to transfer (source must hold it stable until ack_out rises)
    output wire       ack_out,    // acknowledgment back to source
    // Destination domain
    input  wire       clk_dst, rst_dst_n,
    output reg  [7:0] data_out,
    output wire       data_valid  // level: high from capture until REQ is seen de-asserted
);
    // Source: register REQ so the synchronizer is fed by a glitch-free FF
    reg req_src;
    always @(posedge clk_src or negedge rst_src_n)
        if (!rst_src_n)             req_src <= 1'b0;
        else if (req_in & ~ack_out) req_src <= 1'b1;  // assert
        else if (ack_out)           req_src <= 1'b0;  // de-assert on ack

    // 2-FF sync REQ to destination
    (* ASYNC_REG = "TRUE" *) reg [1:0] req_sync;
    always @(posedge clk_dst or negedge rst_dst_n)
        if (!rst_dst_n) req_sync <= 2'b00;
        else            req_sync <= {req_sync[0], req_src};

    // Destination: capture data, assert ACK
    reg ack_dst;
    always @(posedge clk_dst or negedge rst_dst_n)
        if (!rst_dst_n) begin
            ack_dst  <= 1'b0;
            data_out <= 8'h00;
        end else if (req_sync[1] & ~ack_dst) begin
            data_out <= data_in;  // bus is not synchronized: safe only because data_in is held stable
            ack_dst  <= 1'b1;
        end else if (~req_sync[1])
            ack_dst  <= 1'b0;     // REQ dropped: complete the handshake

    assign data_valid = ack_dst;

    // 2-FF sync ACK back to source
    (* ASYNC_REG = "TRUE" *) reg [1:0] ack_sync;
    always @(posedge clk_src or negedge rst_src_n)
        if (!rst_src_n) ack_sync <= 2'b00;
        else            ack_sync <= {ack_sync[0], ack_dst};

    assign ack_out = ack_sync[1];
endmodule
A clock domain crossing (CDC) occurs when a signal generated in one clock domain is sampled by logic clocked by a different, asynchronous clock. Because the two clocks have no fixed phase relationship, the signal can change at any time relative to the capturing clock's setup-hold window, potentially causing metastability, one of the leading sources of silicon bugs.
Metastability can occur when a flip-flop's D input changes within its setup-hold window. The FF output enters an indeterminate state, neither 0 nor 1, and the probability that it is still unresolved after time t decays exponentially ($P \propto e^{-t/\tau}$, where τ ≈ 50–200 ps for modern CMOS). If the output is still undefined when sampled by downstream logic, the result is unpredictable chip behavior that ordinary RTL simulation does not model and therefore will not catch.
A 2-FF synchronizer gives the first FF's possibly metastable output roughly one additional full clock period to resolve before the second FF samples it. Since the probability of remaining metastable decays as $e^{-T_r/\tau}$, adding one $T_{clk}$ of resolution time reduces the failure probability by a factor of $e^{T_{clk}/\tau}$, which can reach hundreds of millions to billions or more when $T_{clk}$ is much larger than τ. The second FF then captures a fully resolved 0 or 1 with extremely high probability, which can push MTBF to billions of years.
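The improvement factor follows from taking the ratio of the two MTBF expressions. The sketch below assumes the second stage adds exactly one clock period of resolution time and uses illustrative values, $T_{clk}$ = 2 ns and τ = 100 ps, which are assumptions chosen for the arithmetic rather than figures for any particular process.

```latex
% Assumed values: T_clk = 2 ns, tau = 100 ps
\begin{align*}
\frac{\mathrm{MTBF}_{\text{2-FF}}}{\mathrm{MTBF}_{\text{1-FF}}}
  &= \frac{e^{(T_r + T_{clk})/\tau} \,/\, (f_{clk} \times f_{data} \times T_w)}
          {e^{T_r/\tau} \,/\, (f_{clk} \times f_{data} \times T_w)}
   = e^{T_{clk}/\tau} \\[4pt]
  &= e^{\,2\,\mathrm{ns} / 100\,\mathrm{ps}} = e^{20} \approx 4.9 \times 10^{8}
\end{align*}
```

The frequency and window terms cancel, so under these assumptions the gain depends only on how many time constants fit into one clock period.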
A pulse synchronizer transfers a single-cycle pulse across clock domains using a toggle-based approach. The source domain toggles a FF on each input pulse, the toggle signal is synchronized to the destination domain through a 2-FF synchronizer, and an XOR gate detects the edge and regenerates the pulse. Use it when the source clock is faster than the destination clock (where a simple 2-FF sync might miss the pulse) or when you need reliable transfer of single-cycle events. Pulses must still be spaced far enough apart for each toggle to be sampled in the destination domain.
Use a handshake synchronizer when you need to: (1) transfer multi-bit control signals where atomic capture matters, (2) ensure the destination has acknowledged receipt before the source sends the next transaction, or (3) transfer data that must remain stable for multiple cycles. The cost is latency: roughly 2 destination clock cycles for the request plus 2 source clock cycles for the acknowledgment on every transfer. For bulk data, use a Gray-coded async FIFO instead.
Putting a separate 2-FF synchronizer on each bit of a multi-bit bus is not safe; this is a common CDC bug. Each bit can go metastable independently and resolve in a different destination clock cycle from its neighbors, creating a combination that never existed in the source domain. For multi-bit signals use Gray code encoding (only 1 bit changes per transition, as in async FIFO pointers), a handshake synchronizer, or an asynchronous FIFO.
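The Gray code option can be sketched with a pair of converters like the ones below. The module names `bin2gray` and `gray2bin` and the `WIDTH` parameter are hypothetical, introduced only for this illustration; they show the standard binary-reflected Gray code, not a specific library's implementation.

```verilog
// Hypothetical helper: binary to Gray code (combinational).
module bin2gray #(
    parameter WIDTH = 4
)(
    input  wire [WIDTH-1:0] bin,
    output wire [WIDTH-1:0] gray
);
    // Each Gray bit is the XOR of a binary bit and its upper neighbor
    assign gray = bin ^ (bin >> 1);
endmodule

// Hypothetical helper: Gray code back to binary (combinational).
module gray2bin #(
    parameter WIDTH = 4
)(
    input  wire [WIDTH-1:0] gray,
    output reg  [WIDTH-1:0] bin
);
    integer i;
    always @* begin
        // Binary bit i is the XOR of Gray bits i and above
        for (i = 0; i < WIDTH; i = i + 1)
            bin[i] = ^(gray >> i);
    end
endmodule
```

For example, the binary step 011 to 100 flips three bits, while the corresponding Gray values 010 and 110 differ in one bit, so a per-bit 2-FF synchronizer sees either the old or the new value. This only holds for values that change by one step at a time (such as FIFO pointers), and the Gray value should be registered in the source domain before it crosses.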