The Two Golden Rules — Memorize These
Golden Rules for Synthesis-Safe RTL
- Combinational logic (always @(*) or always_comb) → use blocking (=). Sequential order of statements correctly computes intermediate values.
- Sequential / clocked logic (always @(posedge clk) or always_ff) → use non-blocking (<=). All flip-flops sample old values and update simultaneously.
- Never mix = and <= inside the same always block. Ever.
- Never use non-blocking assignments in continuous assign statements or functions.
What Each Assignment Type Actually Does
Executes Immediately, In Order
When the simulator reaches a blocking assignment, it evaluates the RHS and immediately updates the LHS variable before executing the next statement. Subsequent statements in the same always block see the updated value. Behaviour is identical to a software programming language assignment.
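The in-order behaviour described above can be seen in a few lines of simulation-only code. This is a minimal sketch; the module name `blocking_demo` and the variables `x` and `y` are made up for illustration.

```verilog
// Simulation-only sketch; names are illustrative, not from the article.
module blocking_demo;
  reg [7:0] x, y;

  initial begin
    x = 8'd1;
    x = x + 8'd1;   // x becomes 2 right away
    y = x * 8'd2;   // reads the updated x, so y becomes 4
    $display("x=%0d y=%0d", x, y);   // prints "x=2 y=4"
  end
endmodule
```

Each statement sees the result of the one before it, exactly as it would in a software language.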
Evaluates Now, Updates Later
When the simulator reaches a non-blocking assignment, it evaluates the RHS immediately using current signal values, then schedules the LHS update for the NBA (Non-Blocking Assignment) region — after all active events at this time step are resolved. All non-blocking updates take effect simultaneously.
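A classic way to see the deferred update is a two-variable swap, which works with `<=` without any temporary. The sketch below is simulation-only, and the names `nba_swap_demo`, `p`, and `q` are assumptions chosen for the example.

```verilog
// Simulation-only sketch; names are illustrative, not from the article.
module nba_swap_demo;
  reg [7:0] p, q;

  initial begin
    p = 8'd3;
    q = 8'd7;
    p <= q;   // RHS sampled now: 7
    q <= p;   // RHS sampled now: 3 (p has not changed yet)
    $display("before NBA: p=%0d q=%0d", p, q);   // still p=3 q=7
    #1;
    $display("after NBA:  p=%0d q=%0d", p, q);   // now p=7 q=3
  end
endmodule
```

Both right-hand sides are read before either left-hand side changes, so the values are exchanged rather than duplicated.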
Active Region → NBA Region
A Verilog simulation time step has two key phases: the Active region (where blocking assignments execute and NBA RHS expressions are evaluated) and the NBA region (where all scheduled non-blocking LHS updates are applied). This two-phase model is what makes <= correctly model synchronous hardware.
The Verilog Event Scheduler
Understanding why non-blocking works requires seeing how the simulator processes events. For both assignment types the RHS expression is evaluated in the Active region; only the LHS update of a non-blocking assignment is deferred to the NBA region.
Fig 1 — The Verilog simulation time step. Blocking (=) assignments complete in the Active region. Non-blocking (<=) RHS is read in the Active region but LHS updates happen in the NBA region — making all registers appear to update simultaneously, just like real flip-flops.
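One way to observe the two regions from inside a simulation is to print the same register with `$display` (which runs immediately, in the Active region) and `$strobe` (which runs at the end of the time step, after the NBA updates). The following is a small sketch under that assumption; `region_demo` and `count` are hypothetical names.

```verilog
// Simulation-only sketch; names are illustrative, not from the article.
module region_demo;
  reg       clk   = 1'b0;
  reg [3:0] count = 4'd0;

  always @(posedge clk) begin
    count <= count + 4'd1;                     // update scheduled for NBA region
    $display("active:      count=%0d", count); // prints 0, the old value
    $strobe ("end of step: count=%0d", count); // prints 1, after the NBA update
  end

  initial begin
    #5 clk = 1'b1;   // single rising edge
    #5 $finish;
  end
endmodule
```

The two messages are printed at the same simulation time but report different values, which is the two-phase behaviour the figure describes.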
The 4-Stage Shift Register — Where Everyone Goes Wrong
This is the most famous Verilog example for demonstrating the difference. The goal is a 4-stage shift register: on each clock edge, data propagates one stage forward.
    // WRONG: blocking in clocked always
    always @(posedge clk) begin
      b = a;  // b updated IMMEDIATELY
      c = b;  // sees NEW b, not old b!
      d = c;  // sees NEW c!
      e = d;  // sees NEW d!
    end
    // Result: ALL stages get the value of 'a'
    // in the same clock cycle. It is just
    // one register, not a shift register!

    // CORRECT: non-blocking in clocked always
    always @(posedge clk) begin
      b <= a;  // RHS read: old 'a'
      c <= b;  // RHS read: old 'b'
      d <= c;  // RHS read: old 'c'
      e <= d;  // RHS read: old 'd'
    end
    // Result: all RHS read SIMULTANEOUSLY,
    // then all LHS updated SIMULTANEOUSLY
    // → correct 4-stage shift register ✓
Why the wrong version fails: With blocking assignments, b = a executes completely before c = b. So when c = b runs, b already holds the new value of a. Every stage propagates the same value through in one clock cycle — you get a 1-stage register, not a 4-stage one. In simulation AND in synthesis this is wrong.
Fig 2 — Correct 4-stage shift register waveform using non-blocking (<=). Each stage is delayed by exactly one clock cycle. With blocking (=) all four stages would update to the same value of 'a' in the same clock cycle.
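To reproduce the difference yourself, the two shift registers can be wrapped in modules and driven by the same stimulus. This is a hedged sketch: the module names (`shift_blocking`, `shift_nonblocking`, `shift_tb`), the 10-time-unit clock, and the single-bit data width are all assumptions made for the example.

```verilog
// Sketch only; module names, clock period, and widths are assumptions.
module shift_blocking (input clk, input a, output reg e);
  reg b, c, d;
  always @(posedge clk) begin
    b = a; c = b; d = c; e = d;      // collapses to a single stage
  end
endmodule

module shift_nonblocking (input clk, input a, output reg e);
  reg b, c, d;
  always @(posedge clk) begin
    b <= a; c <= b; d <= c; e <= d;  // four real stages
  end
endmodule

module shift_tb;
  reg  clk = 1'b0;
  reg  a   = 1'b0;
  wire e_blk, e_nba;

  shift_blocking    u_blk (.clk(clk), .a(a), .e(e_blk));
  shift_nonblocking u_nba (.clk(clk), .a(a), .e(e_nba));

  always #5 clk = ~clk;

  initial begin
    repeat (4) @(posedge clk);   // flush the unknown power-up values
    @(negedge clk) a = 1'b1;     // change the input away from the active edge
    repeat (5) begin
      @(negedge clk);            // sample between rising edges
      $display("t=%0t e_blk=%b e_nba=%b", $time, e_blk, e_nba);
    end
    $finish;
  end
endmodule
```

With this stimulus, `e_blk` should go high after the first rising edge that follows the change on `a`, while `e_nba` goes high only after the fourth, matching the waveform in Fig 2.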
The Right Tool for the Right Job
    always @(*) begin  // or always_comb
      // Intermediate values: order matters, = is correct
      // (assumes sum is declared 9 bits wide so the carry-out is visible)
      sum    = a + b;
      carry  = (sum > 8'hFF);            // uses updated sum ✓
      result = carry ? sum[7:0] : sum;   // uses updated carry ✓
    end

    // Priority MUX, correct with blocking:
    always @(*) begin
      out = 8'h00;  // default (prevents latch!)
      if      (sel[2]) out = c;
      else if (sel[1]) out = b;
      else if (sel[0]) out = a;
    end
    // D flip-flop with async reset
    always @(posedge clk or negedge rst_n) begin
      if (!rst_n) q <= 1'b0;
      else        q <= d;
    end

    // Multi-bit register with enable
    always @(posedge clk) begin
      if (rst)     data_r <= 8'h00;
      else if (en) data_r <= data_in;
      // no else needed: hold is implied for a register
    end

    // State machine: always <= for state register
    always @(posedge clk) begin
      if (rst) state <= IDLE;
      else     state <= next_state;  // next_state from comb block
    end
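The state-machine fragment above relies on a separate combinational block to produce `next_state`. One possible shape for the complete pair is sketched below; the state names `RUN` and `DONE`, the inputs `start` and `done_i`, and the output `busy` are hypothetical and only serve to make the example self-contained.

```verilog
// Sketch of a two-block FSM; states and ports are hypothetical.
module fsm_sketch (
  input  wire clk,
  input  wire rst,
  input  wire start,
  input  wire done_i,
  output reg  busy
);
  localparam [1:0] IDLE = 2'd0, RUN = 2'd1, DONE = 2'd2;
  reg [1:0] state, next_state;

  // State register: clocked block, non-blocking only
  always @(posedge clk) begin
    if (rst) state <= IDLE;
    else     state <= next_state;
  end

  // Next-state and output logic: combinational block, blocking only
  always @(*) begin
    next_state = state;   // defaults keep every path assigned (no latch)
    busy       = 1'b0;
    case (state)
      IDLE: if (start) next_state = RUN;
      RUN: begin
        busy = 1'b1;
        if (done_i) next_state = DONE;
      end
      DONE:    next_state = IDLE;
      default: next_state = IDLE;
    endcase
  end
endmodule
```

Keeping the register and the decision logic in separate blocks means each block uses exactly one assignment type, so the mixing question never comes up.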
Pitfalls That Break Simulation and Synthesis
    // RISKY: mixing blocking and non-blocking
    always @(posedge clk) begin
      temp = a + b;  // blocking: temp updates immediately
      out <= temp;   // non-blocking: samples the NEW temp
    end
    // Within this one block the result is well defined
    // (out gets a + b). The hazard is outside it: any other
    // clocked block that reads temp races against this one,
    // and most coding guidelines forbid the style. Avoid it.

    // ALSO WRONG: non-blocking in combinational block
    always @(*) begin
      y <= a & b;  // update deferred to the NBA region
    end
    // Intermediate values assigned this way are stale until
    // the block re-triggers; lint and synthesis tools will
    // likely warn
    // Two always blocks both write 'x' with blocking
    always @(posedge clk) x = a;  // Block 1
    always @(posedge clk) x = b;  // Block 2
    // The final value of 'x' depends on which block
    // the simulator executes last, and IEEE 1364 does
    // not define that order.

    // Switching to non-blocking does NOT fix this:
    always @(posedge clk) x <= a;  // Block 1 schedules
    always @(posedge clk) x <= b;  // Block 2 schedules
    // The last-scheduled update wins, but the scheduling
    // order across different always blocks is still
    // undefined. "Last assignment wins" is only guaranteed
    // for assignments made within a single always block.
    // Synthesis tools typically reject both versions as a
    // multiply-driven signal. Drive each register from
    // exactly one always block.
What Synthesis Tools See
| Context | Assignment Used | Synthesizes To | Correct? |
|---|---|---|---|
| always @(*) | = blocking | Combinational gates (AND, OR, MUX…) | ✓ Correct |
| always @(posedge clk) | <= non-blocking | D flip-flops + any required combo logic | ✓ Correct |
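| always @(posedge clk) | = blocking | Flip-flops, but chained assignments collapse into one stage and cross-block reads race in simulation | ✗ Wrong |
| always @(*) | <= non-blocking | Usually combinational logic with a warning; simulation may see stale intermediate values | ✗ Wrong |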
| always @(*), no default | = blocking (incomplete if/case) | Inferred latch (unintentional) | ⚠ Latch |
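Synthesis tools are forgiving, but don't rely on it. Modern synthesis tools (Synopsys DC, Cadence Genus, Vivado) will accept blocking assignments in clocked blocks and infer flip-flops, often without complaint. Within a single block they follow the blocking semantics, which is why the collapsed shift register above is wrong in the netlist too. Across blocks they build ordinary flip-flops, while RTL simulation of the same code depends on the order in which the simulator runs the blocks. RTL simulation can therefore differ from gate-level simulation, causing sign-off mismatches. Write RTL that is correct in simulation, not just in synthesis.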
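The inferred-latch row of the table above is easiest to understand side by side with its fix. The sketch below assumes a hypothetical module `latch_vs_default` with an enable `en` and an 8-bit input `d`; only the presence or absence of the default assignment differs between the two blocks.

```verilog
// Sketch only; module and port names are hypothetical.
module latch_vs_default (
  input  wire       en,
  input  wire [7:0] d,
  output reg  [7:0] q_latch,
  output reg  [7:0] q_comb
);
  // Incomplete assignment: when en is 0, q_latch must hold its
  // previous value, so synthesis infers a level-sensitive latch.
  always @(*) begin
    if (en) q_latch = d;
  end

  // Default assignment first: every path assigns q_comb,
  // so the result is pure combinational logic.
  always @(*) begin
    q_comb = 8'h00;
    if (en) q_comb = d;
  end
endmodule
```

Most synthesis and lint tools report the first block as an inferred latch, which makes this a quick check to run on any combinational block.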
Side-by-Side Comparison
| Property | Blocking (=) | Non-Blocking (<=) |
|---|---|---|
| When LHS updates | Immediately, before next statement | End of time step (NBA region), simultaneously |
| RHS evaluated | Immediately | Immediately (but LHS deferred) |
| Models | Procedural software variable | Hardware flip-flop / register |
| Use in always @(*) | ✓ Correct | ✗ Avoid |
| Use in always @(posedge clk) | ✗ Avoid | ✓ Correct |
| Mix both in same block | ✗ Avoid: error-prone and forbidden by most guidelines | ✗ Avoid: error-prone and forbidden by most guidelines |
| Intermediate variables | Works correctly (sequential) | Must be careful — old values used |
| Allowed in function | ✓ Yes | ✗ No |
Frequently Asked Questions
The = (blocking) assignment executes immediately and sequentially — the LHS is updated before the next statement in the always block runs, just like a variable assignment in C or Python.
The <= (non-blocking) assignment evaluates the RHS immediately but defers the LHS update to the NBA (Non-Blocking Assignment) region at the end of the current simulation time step. All non-blocking assignments in all always blocks at the same time step take effect simultaneously — correctly modeling how all flip-flops in real hardware clock their outputs at the same edge.
With blocking assignments in a clocked always block, each statement executes before the next. So b = a updates b immediately, then c = b reads the already-updated b. The result is that all stages get the value of the first input (a) in the same clock cycle — you have a 1-stage register that copies to everything at once, not a shift register.
Fix: replace every = with <= in the clocked always block. Non-blocking reads all RHS values simultaneously (old values) then writes all LHS simultaneously — giving you one clock cycle of propagation per stage.
The always block type determines the rule, not what's inside it. If you're inside always @(posedge clk), use <= everywhere inside it: in if-else branches, case statements, and loops. The only exception would be local temporary variables that you use as intermediate calculations (which some styles allow with blocking), but even then mixing is risky and most coding guidelines forbid it entirely.
A race condition occurs when two always blocks triggered at the same simulation time access the same signal through blocking assignments, one writing it while the other reads it or writes it as well. The result depends on which block the simulator happens to evaluate first, which is not guaranteed by the IEEE standard. Different simulators, or even different runs of the same simulator, can produce different results.
Non-blocking assignments prevent this for registers: all RHS expressions are evaluated in the Active region (reading old values), and all LHS updates happen in the NBA region simultaneously. No block can "see" another block's non-blocking update during the Active region — eliminating the ordering dependency entirely.
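The race and its cure can be put next to each other in one small simulation. This is a sketch with made-up names (`race_demo`, `y1`/`y2` for the blocking pair, `z1`/`z2` for the non-blocking pair); each pair is two registers that feed each other from separate always blocks.

```verilog
// Simulation-only sketch; names are illustrative, not from the article.
module race_demo;
  reg clk = 1'b0;
  reg y1 = 1'b0, y2 = 1'b1;   // blocking pair
  reg z1 = 1'b0, z2 = 1'b1;   // non-blocking pair

  // Race: the outcome depends on which block the simulator runs first.
  always @(posedge clk) y1 = y2;
  always @(posedge clk) y2 = y1;

  // No race: both RHS values are sampled before either LHS changes.
  always @(posedge clk) z1 <= z2;
  always @(posedge clk) z2 <= z1;

  initial begin
    #5 clk = 1'b1;   // single rising edge
    #1 $display("blocking:     y1=%b y2=%b", y1, y2);
       $display("non-blocking: z1=%b z2=%b", z1, z2);
    $finish;
  end
endmodule
```

The non-blocking pair always swaps (`z1=1 z2=0`). The blocking pair ends up with both registers equal, and whether they are both 0 or both 1 is up to the simulator.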
Some experienced RTL engineers use blocking for local temporary variables inside clocked blocks to avoid declaring separate always @(*) blocks. For example: temp = a + b; result <= temp; — here temp is a local variable only used within this block, so the blocking update is safe. However, most coding guidelines (including ARM and Intel RTL guidelines) prohibit mixing to avoid confusion. The safest rule for beginners and professionals alike: never mix = and <= in the same always block.
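For readers who meet this style in existing code, the sketch below shows one way it is sometimes written so that the temporary cannot leak: the variable is declared inside a named block, which keeps it out of reach of other always blocks. The module name, the block label `acc`, and the extra input `c` are hypothetical, and this is an illustration of the pattern rather than a recommendation.

```verilog
// Sketch of the "local temporary" style; names are hypothetical.
module local_temp_sketch (
  input  wire       clk,
  input  wire [7:0] a, b, c,
  output reg  [8:0] result
);
  always @(posedge clk) begin : acc
    reg [8:0] temp;        // declared inside the named block
    temp = a + b;          // blocking: intermediate value only
    result <= temp - c;    // non-blocking: the actual register
  end
endmodule
```

Because `temp` is always written before it is read, it does not need to hold a value between clock edges; only `result` carries state from one cycle to the next.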
Both simulation and synthesis sign-off are affected. Synthesis tools infer flip-flops from the structure of clocked always blocks and will accept blocking assignments there, so the mistake can pass synthesis unnoticed. RTL simulation, however, can race whenever another block reads a register that was assigned with blocking, so your testbench results depend on simulator ordering and cannot be trusted. When gate-level (post-synthesis) simulation is run, the netlist can then behave differently from your RTL simulation. This is called a simulation-synthesis mismatch and is a serious sign-off issue in ASIC design. Using non-blocking assignments for clocked logic keeps RTL sim, gate-level sim, and silicon consistent.