RTL Design · Best Practices

RTL Coding Guidelines: Synthesis-Ready Verilog & SystemVerilog

EcrioniX · RTL Design · ~18 min read · Naming · Reset · Latches · Clock Enable · Pragmas

Bad RTL can pass simulation and still burn hours in synthesis, fail timing, or produce silicon bugs that cost a respin. These guidelines reflect what professional ASIC teams typically enforce at code review to prevent the most common synthesis-simulation mismatches and lint failures.

[Figure: RTL to Silicon, Where Coding Mistakes Hide. Flow: RTL code (Verilog / SV) → simulation (passes?) → synthesis (netlist) → gate sim (SDF timing) → silicon (tapeout). Flagged risks: race conditions hidden in sim; latch inference (wrong topology); RTL-gate mismatch (timing violations).]

1. Naming Conventions

Consistent naming makes code review faster and lint tools more effective. These conventions are common across professional ASIC and FPGA teams:

| Signal Type | Convention | Example | Notes |
|---|---|---|---|
| Clock | clk or clk_* | clk, clk_axi, clk_100m | One per clock domain; never rename mid-hierarchy |
| Active-low reset | *_n | rst_n, arst_n, rstn | _n suffix is universal; never use _b (confusing in formal) |
| Active-high reset | rst or *_rst | rst, sys_rst | Less common in ASIC; document clearly |
| Registered signal | *_r or *_reg | data_r, count_reg | Distinguishes FF output from combinational wire |
| Next-state | *_nxt or *_next | state_nxt, addr_next | Combinational input to the DFF |
| Module parameter | UPPER_SNAKE_CASE | DATA_WIDTH, FIFO_DEPTH | Parameters and defines always uppercase |
| Local variable | lowercase_snake_case | byte_count, wr_ptr | Consistent for all ports and internal signals |
| Module name | Match filename | axi_master (file: axi_master.v) | Mismatch breaks tool flows and version control |
| Instantiation | u_* or i_* prefix | u_fifo, u_clk_gen | Makes hierarchy visible in waveform tools |
| Generate block | gen_* | gen_pipeline_stage | Avoids tool-generated cryptic hierarchy names |
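
To make the table concrete, here is a minimal sketch of a module that follows these conventions. The module names pulse_counter and pulse_counter_top, the COUNT_WIDTH parameter, and the pulse_in port are hypothetical and chosen only for illustration.

```systemverilog
// Hypothetical example; the file would be named pulse_counter.sv
module pulse_counter #(
  parameter int COUNT_WIDTH = 8            // UPPER_SNAKE_CASE parameter
) (
  input  logic                   clk,      // clock
  input  logic                   rst_n,    // active-low reset (_n suffix)
  input  logic                   pulse_in,
  output logic [COUNT_WIDTH-1:0] count_r   // registered output (_r suffix)
);
  logic [COUNT_WIDTH-1:0] count_nxt;       // next-state value (_nxt suffix)

  always_comb begin
    count_nxt = count_r;                   // default: hold
    if (pulse_in) count_nxt = count_r + 1'b1;
  end

  always_ff @(posedge clk or negedge rst_n)
    if (!rst_n) count_r <= '0;
    else        count_r <= count_nxt;
endmodule

module pulse_counter_top (
  input  logic       clk, rst_n, pulse_in,
  output logic [7:0] count
);
  // u_ prefix on the instance keeps the hierarchy readable in waveforms
  pulse_counter #(.COUNT_WIDTH(8)) u_pulse_counter (
    .clk      (clk),
    .rst_n    (rst_n),
    .pulse_in (pulse_in),
    .count_r  (count)
  );
endmodule
```

The _r / _nxt pairing makes it obvious at a glance which signal is the flip-flop output and which is its combinational input.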

2. Reset Style

The Industry Standard: Async-Assert, Sync-Deassert

This pattern gives you fast reset assertion (no clock needed) and glitch-free, metastability-safe deassert. Every flip-flop in the design uses the same pattern:

// ✓ CORRECT — async assert, sync deassert via reset synchronizer
// Reset synchronizer (place once per clock domain)
module rst_sync #(parameter STAGES = 2) (
  input  logic clk, arst_n,   // arst_n: asynchronous, active-low
  output logic srst_n         // srst_n: synchronous, active-low
);
  logic [STAGES-1:0] sync_r;
  always_ff @(posedge clk or negedge arst_n)
    if (!arst_n) sync_r <= '0;
    else         sync_r <= {sync_r[STAGES-2:0], 1'b1};
  assign srst_n = sync_r[STAGES-1];
endmodule

// Design flip-flop — use srst_n from the synchronizer above
always_ff @(posedge clk or negedge srst_n)
  if (!srst_n) q <= '0;   // asserts asynchronously with arst_n, releases synchronously to clk
  else         q <= d;
Never — Purely Async Deassert

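With a purely asynchronous deassert, reset can be released at any point relative to the clock. If the release lands close to a clock edge, some FFs leave reset on that edge and others one cycle later (or go metastable), which causes functional corruption in FSMs and counters.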

Never — Gated Clock for Reset

Generating reset by ANDing the clock is a clock gating error that creates glitches. Resets must be routed separately through the reset network.

Correct — Synchronous-Only (FPGA)

For FPGA, pure synchronous reset is acceptable, since FPGA FFs typically have only one set/reset (SR) pin: always_ff @(posedge clk) if (rst) q <= '0; else q <= d;
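
As a slightly fuller sketch of the synchronous-only style, the register below combines a synchronous reset with a clock enable. The module name sync_rst_reg and the en port are assumptions added for illustration; they do not come from the text above.

```systemverilog
// Hypothetical example: synchronous, active-high reset with clock enable
module sync_rst_reg #(parameter int W = 8) (
  input  logic         clk,
  input  logic         rst,     // synchronous, active-high
  input  logic         en,      // clock enable
  input  logic [W-1:0] d,
  output logic [W-1:0] q
);
  // Reset is absent from the sensitivity list, so it is only
  // sampled on the rising clock edge. Reset has priority over enable.
  always_ff @(posedge clk)
    if (rst)     q <= '0;
    else if (en) q <= d;
endmodule
```

Because reset is just another synchronous input here, it is covered by ordinary setup/hold analysis, but it must stay asserted for at least one clock edge to take effect.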

3. Sequential vs Combinational Blocks

Use the SystemVerilog procedural keywords — they enforce correct behavior at the tool level:

// ✓ Sequential: always_ff (tools warn or error if the block does not infer flip-flops)
always_ff @(posedge clk or negedge rst_n) begin
  if (!rst_n)  count <= '0;
  else         count <= count_nxt;
end

// ✓ Combinational: always_comb (automatic sensitivity; tools warn or error on inferred latches)
always_comb begin
  count_nxt = count;          // default assignment prevents latch
  if (enable) count_nxt = count + 1'b1;
end

// ✓ Continuous: assign for simple combinational
assign overflow = (count == MAX_COUNT);

// ✗ AVOID: Verilog always @(*) (states no design intent for tools to check)
always @(*) begin  // an inferred latch is legal here; at best it is a warning in the synthesis log
  if (enable) out = in;  // missing else: LATCH INFERRED
end

Blocking vs Non-Blocking — The Absolute Rule

// ✓ Sequential block: ALWAYS use non-blocking (<=)
always_ff @(posedge clk) begin
  a <= b;    // reads OLD b, writes new a at end of timestep
  c <= a;    // reads OLD a (correct flip-flop chain behavior)
end

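// ✗ WRONG: blocking in a sequential block collapses the chain and invites races
always @(posedge clk) begin
  a = b;     // writes a immediately
  c = a;     // reads NEW a: c gets b in the same cycle, so the two-stage chain becomes one stage;
             // any other always block that reads a now races with this one in simulation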
end

// ✓ Combinational block: ALWAYS use blocking (=)
always_comb begin
  tmp = a & b;   // blocking: immediate, models wire/gate behavior
  out = tmp | c;
end

4. Latch-Free Coding

Unintended latches are one of the most common RTL bugs. They cause timing issues (latches are level-sensitive, not edge-triggered) and are harder to test with scan-based DFT.

The Default Assignment Pattern

// ✗ LATCH INFERRED — output not assigned in every path
always_comb begin
  if (sel == 2'b00) out = a;  // what is out when sel == 2'b01, 10, 11?
  if (sel == 2'b01) out = b;  // missing else → latch holds last value
end

// ✓ FIX 1 — default assignment at top
always_comb begin
  out = '0;            // default: covers all unspecified paths
  case (sel)
    2'b00: out = a;
    2'b01: out = b;
    2'b10: out = c;
    2'b11: out = d;
  endcase
end

// ✓ FIX 2 — unique case (SV): all cases covered, mutually exclusive
always_comb begin
  unique case (sel)
    2'b00: out = a;
    2'b01: out = b;
    2'b10: out = c;
    2'b11: out = d;
  endcase
end

// ✓ FIX 3 — priority case (SV): overlapping OK, priority given to first match
always_comb begin
  priority case (1'b1)
    req[0]: grant = 3'b001;
    req[1]: grant = 3'b010;
    req[2]: grant = 3'b100;
    default: grant = 3'b000;
  endcase
end

5. Clock Enable Pattern

Never gate the clock signal directly with combinational logic in RTL — this creates glitches that corrupt flip-flop state. Use the clock enable (CE) pattern instead:

// ✗ WRONG — gated clock: glitch if cond changes while clk=1
wire gated_clk = clk & cond;        // NEVER do this in RTL
always @(posedge gated_clk) q <= d;

// ✓ CORRECT — clock enable: synthesizes to FF with CE pin
always_ff @(posedge clk or negedge rst_n) begin
  if (!rst_n)       q <= '0;
  else if (enable)  q <= d;  // enable maps to flip-flop CE pin
end

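// ICG (Integrated Clock Gate): inserted by the synthesis tool, not hand-instantiated
// Insertion is normally controlled by the synthesis tool's clock-gating settings;
// power-intent files (UPF/CPF) describe power domains, isolation, and retention.
// You write the CE pattern; the tool inserts the ICG library cell automatically.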

// ✓ Parameterized register bank with CE — common pattern
module reg_bank #(parameter int W = 32, parameter int D = 8) (
  input  logic                 clk,
  input  logic [$clog2(D)-1:0] wr_addr, rd_addr,  // address width follows D
  input  logic [W-1:0]         wr_data,
  input  logic                 wr_en,
  output logic [W-1:0]         rd_data
);
  logic [W-1:0] mem [D];
  // Storage is intentionally not reset, so no reset port is declared
  always_ff @(posedge clk)
    if (wr_en) mem[wr_addr] <= wr_data;  // CE pattern on each write
  assign rd_data = mem[rd_addr];         // asynchronous (combinational) read
endmodule

6. Parameterization & Portability

// ✓ Parameterize widths and depths — never hardcode magic numbers
module fifo #(
  parameter int DATA_WIDTH = 8,
  parameter int DEPTH      = 16,
  parameter int PTR_W      = $clog2(DEPTH) + 1  // auto-computed
)(
  input  logic             clk, rst_n, wr_en, rd_en,
  input  logic [DATA_WIDTH-1:0] wr_data,
  output logic [DATA_WIDTH-1:0] rd_data,
  output logic             full, empty
);
  logic [DATA_WIDTH-1:0] mem [DEPTH];
  logic [PTR_W-1:0] wr_ptr, rd_ptr;
  // ... MSB trick for full/empty detection ...
endmodule

// ✓ Use localparam for derived constants inside modules
localparam ADDR_W = $clog2(DEPTH);  // avoids magic number in bit slices

// ✗ AVOID — hardcoded widths break on parameter changes
logic [3:0] ptr;  // breaks when DEPTH changes from 16 to 32
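
The full/empty logic is elided in the FIFO skeleton above. One common way to realize an MSB-based scheme is sketched below; it may differ from what the original author had in mind. It assumes a single clock domain and that DEPTH is a power of two and at least 2. The module name sync_fifo is hypothetical.

```systemverilog
// Sketch only: assumes DEPTH is a power of two (>= 2), single clock domain
module sync_fifo #(
  parameter int DATA_WIDTH = 8,
  parameter int DEPTH      = 16
) (
  input  logic                  clk, rst_n, wr_en, rd_en,
  input  logic [DATA_WIDTH-1:0] wr_data,
  output logic [DATA_WIDTH-1:0] rd_data,
  output logic                  full, empty
);
  localparam int ADDR_W = $clog2(DEPTH);
  localparam int PTR_W  = ADDR_W + 1;      // one extra wrap bit (the MSB)

  logic [DATA_WIDTH-1:0] mem [DEPTH];
  logic [PTR_W-1:0]      wr_ptr, rd_ptr;

  // Identical pointers: empty.
  // Same address bits but different wrap bit: full.
  assign empty = (wr_ptr == rd_ptr);
  assign full  = (wr_ptr[ADDR_W-1:0] == rd_ptr[ADDR_W-1:0]) &&
                 (wr_ptr[PTR_W-1]    != rd_ptr[PTR_W-1]);

  // Pointers are reset; writes when full and reads when empty are ignored
  always_ff @(posedge clk or negedge rst_n)
    if (!rst_n) begin
      wr_ptr <= '0;
      rd_ptr <= '0;
    end else begin
      if (wr_en && !full)  wr_ptr <= wr_ptr + 1'b1;
      if (rd_en && !empty) rd_ptr <= rd_ptr + 1'b1;
    end

  // Storage itself is not reset
  always_ff @(posedge clk)
    if (wr_en && !full) mem[wr_ptr[ADDR_W-1:0]] <= wr_data;

  assign rd_data = mem[rd_ptr[ADDR_W-1:0]];
endmodule
```

Every width here is derived from DEPTH through localparam, so changing DEPTH from 16 to 32 needs no other edits.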

7. Synthesis Pragmas & Attributes

Pragmas guide the synthesis tool without affecting RTL simulation. Use sparingly — prefer explicit coding over pragma workarounds.

| Pragma / Attribute | Tool | Effect | When to Use |
|---|---|---|---|
| /* synthesis keep */ | Generic | Prevents net from being optimized away | Debug probe signals you need in waveforms |
| (* dont_touch="true" *) | Vivado / DC | Prevents logic optimization on cell/net | Timing-critical cells, reset synchronizers |
| /* synopsys full_case */ | Synopsys DC | Treats case as exhaustive (no else-branch logic) | Only when you are certain and want smaller area |
| /* synopsys parallel_case */ | Synopsys DC | Synthesizes case as mux-tree not priority mux | When all branches truly mutually exclusive (RISKY if wrong) |
| (* max_fanout=16 *) | Vivado / DC | Limits fanout, forces buffer insertion | High-fanout control signals in FPGA |
| (* shreg_extract="no" *) | Vivado | Prevents shift register → SRL inference | When you need full FF chain for timing control |
| /* pragma translate_off */ | Generic | Excludes code block from synthesis | Simulation-only checkers, $display, initial blocks |
| (* ram_style="block" *) | Vivado | Forces BRAM inference for memory | Large memories that should use dedicated BRAM |

// ✓ Reset synchronizer — mark dont_touch so optimizer leaves it alone
(* dont_touch = "true" *)
logic [1:0] rst_sync_r;
always_ff @(posedge clk or negedge arst_n)
  if (!arst_n) rst_sync_r <= 2'b00;
  else         rst_sync_r <= {rst_sync_r[0], 1'b1};

// ✓ Simulation-only block excluded from synthesis
/* pragma translate_off */
initial begin
  $display("FIFO depth = %0d, width = %0d", DEPTH, DATA_WIDTH);
end
/* pragma translate_on */

// ✗ RISKY: full_case/parallel_case are seen by synthesis only, never by simulation
// If the case is NOT exhaustive, synthesis treats the missing values as don't-care,
// while simulation keeps the previous value. The result is an RTL-gate mismatch.
// Prefer an explicit default: branch instead.
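
A minimal sketch of the explicit-default alternative follows. The module name op_decode and its ports are hypothetical; the point is only that the unused encoding is given a defined value that simulation and synthesis both see.

```systemverilog
// Hypothetical example: explicit default instead of a full_case pragma
module op_decode (
  input  logic [1:0] op,
  input  logic [7:0] a, b,
  output logic [7:0] result
);
  always_comb begin
    case (op)
      2'b00:   result = a + b;
      2'b01:   result = a - b;
      2'b10:   result = a & b;
      default: result = '0;   // op == 2'b11: same behavior in RTL sim and in gates
    endcase
  end
endmodule
```

The default branch costs a little logic compared with a don't-care, but it removes the class of mismatch that the pragma can introduce.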

8. Common Pitfalls & RTL Anti-Patterns

| Anti-Pattern | What Goes Wrong | Fix |
|---|---|---|
| Incomplete sensitivity list: always @(a) but reads b | Simulation re-evaluates the block only when a changes, while synthesis builds logic that also responds to b, so the two disagree | Use always_comb (sensitivity is inferred automatically) |
| Initial blocks in RTL | Ignored by ASIC synthesis, so signals start undefined in silicon (many FPGA tools do honor initial values) | Use reset to initialize all state; keep initial in testbenches |
| integer in port declarations | integer synthesizes as 32-bit signed, giving an unexpected width | Use logic [N-1:0] with explicit width everywhere |
| Implicit wire declarations | A typo creates a new implicit 1-bit wire instead of an error | Add `default_nettype none at top of every file |
| posedge clk in two always blocks driving the same signal | Multi-driven net: synthesis may error out or silently pick one driver; lint and formal detect it | Each signal has exactly one always block driver. Use mux logic, not multiple drivers |
| Comparing different widths without cast | Verilog auto-extends the narrower operand, which can mask bugs in equality checks | Always size constants explicitly: 8'hFF, not an unsized 'hFF (unsized literals are at least 32 bits wide); in SV, cast with 8'(expr) |
| for loops with variable bound | Some synthesis tools cannot unroll a loop with non-constant upper bound | Loop bounds must resolve to constants at elaboration. Use parameters. |
| X-propagation: x ? a : b | With an X condition the ternary yields X only on bits where a and b differ, and an if statement silently takes the else branch; silicon resolves to 0 or 1, so simulation and hardware can disagree | Avoid X as condition. Initialize all state. Use 4-state simulation with Xprop mode. |
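
The width-mismatch row is easy to underestimate, so here is a small sketch of how an unsized literal can silently change a comparison. The module name width_compare and its signals are hypothetical. It relies on the standard Verilog rule that operands are extended to the width of the widest operand in the expression before the operators are applied.

```systemverilog
// Hypothetical example: unsized vs sized literal in an equality check
module width_compare (
  input  logic [7:0] data,
  output logic       all_ones_bad,
  output logic       all_ones_good
);
  // 'h0 is unsized (at least 32 bits), so data is zero-extended to that
  // width BEFORE the inversion. The upper bits invert to 1, and the
  // comparison can never be true, even when data == 8'hFF.
  assign all_ones_bad  = (~data == 'h0);

  // A sized literal keeps the whole expression 8 bits wide,
  // so this is true exactly when data == 8'hFF.
  assign all_ones_good = (~data == 8'h00);
endmodule
```

Most lint tools report the first form as a width mismatch, which is one more reason to keep lint clean rather than waive such warnings.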

9. Pre-Synthesis RTL Checklist

Run through this before every lint / synthesis run

Frequently Asked Questions

Should I use synchronous or asynchronous reset?

ASIC: async-assert / sync-deassert is the standard. Assertion is immediate and needs no clock; deassertion is synchronized to the clock, so it is metastability-safe. It requires a 2-stage reset synchronizer per clock domain. FPGA: synchronous reset is usually preferred because FPGA FFs typically have a single set/reset (SR) pin and an async reset can complicate timing in the routing fabric. The cardinal rule either way: never gate the reset signal with logic in RTL; reset must come from a properly synchronized source.

How do I prevent latch inference in RTL?

Three techniques: (1) Always write a default assignment at the top of every always_comb block before any if/case. (2) Ensure every output is assigned in every branch of every if/case. (3) Use unique case or priority case in SystemVerilog to state that the listed cases are complete, and keep the logic in always_comb so that synthesis and lint tools report any latch that is still inferred. If you need to hold a value across cycles, use a flip-flop (always_ff), not a latch. In this flow the only intentional latches sit inside ICG (Integrated Clock Gate) cells, which are never coded directly in RTL.

What is the clock enable pattern and why use it instead of gated clocks?

The CE pattern writes always_ff with an if (enable) condition, which synthesis maps to the flip-flop's dedicated CE input (or to a feedback mux where the cell has none). This is glitch-free because the clock itself is untouched: enable is sampled at the clock edge like any other synchronous input, so it cannot create a runt pulse. Gating the clock signal with combinational logic (clk AND cond) creates glitches whenever cond changes while clk=1. These glitches are hard to detect in RTL simulation, which is zero-delay, and can corrupt flip-flop state in silicon. Clock gating for power reduction is done by synthesis inserting ICG (Integrated Clock Gating) library cells, never by the RTL designer directly.

What does "`default_nettype none" do?

By default, Verilog implicitly declares any undeclared identifier as a 1-bit wire. This means a typo in a signal name silently creates a new net rather than producing an error. `default_nettype none disables this, so any undeclared identifier becomes a compile error. This catches typos, missing port connections, and copy-paste errors that would otherwise silently propagate. Put it at the top of every RTL file before the module declaration. The directive stays in effect for every file compiled afterwards, so some teams put `resetall before it to clear directives left over from earlier files, and restore `default_nettype wire at the end of the file so that third-party code compiled later is unaffected.
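
A minimal file skeleton using the directive might look like the sketch below. The module name nettype_demo and its signals are hypothetical. Restoring `default_nettype wire at the end is a common convention, not something the language requires.

```systemverilog
`default_nettype none          // undeclared identifiers are now compile errors

// Hypothetical example module
module nettype_demo (
  // With the default net type disabled, input ports are declared
  // with an explicit net kind (wire) to stay portable across tools.
  input  wire logic a, b,
  output logic      y
);
  logic a_and_b;
  assign a_and_b = a & b;
  assign y       = a_and_b;    // a typo such as a_and_bb would fail to compile
endmodule

`default_nettype wire          // restore the default for files compiled later
```

Without the first directive, the mistyped name in the comment would quietly become a floating 1-bit wire and y would be undriven.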

When is it safe to use full_case and parallel_case pragmas?

full_case tells synthesis the case is exhaustive, so it can treat unlisted values as don't-care. parallel_case tells synthesis all branches are mutually exclusive, so it can implement a mux-tree instead of a priority mux. Both are safe only when the assertion holds for every value the hardware can actually see, not just the values exercised in simulation. If the case is not actually full/parallel and you add the pragma, synthesis generates smaller logic, but simulation ignores the pragma and keeps the original hold or priority behavior, causing an RTL-gate mismatch that shows up in gate-level simulation. Best practice: prefer explicit coding (unique case, default: branch) over pragmas. Reserve pragmas for legacy Verilog where SV keywords aren't available, and always verify with gate-level sim.