Register Transfer Level (RTL) design is the front-end of the ASIC flow — where hardware behavior is described in Verilog or SystemVerilog before synthesis converts it into gates. This guide covers everything from module structure to synthesis-safe FSM coding.
Every design in Verilog is organized as a module — the fundamental building block of HDL design, equivalent to a black box with ports.
Ports declare the interface. input drives into the module, output drives out, inout is bidirectional (used for bus interfaces).
wire is a net driven continuously. reg is a variable holding its last assigned value. Despite the name, reg does not always infer a flip-flop.
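As a minimal sketch of that last point, the hypothetical module below (its name and ports are illustrative, not from this guide) declares two reg outputs. Under the usual synthesis rules, only the one assigned on a clock edge becomes a flip-flop.

```verilog
// Hypothetical module for illustration only; names are assumptions
module wire_vs_reg (
    input  wire clk,
    input  wire a,
    input  wire b,
    output wire y_net,   // net driven by a continuous assignment
    output reg  y_comb,  // reg, but purely combinational
    output reg  y_ff     // reg that does infer a flip-flop
);
    // Continuous assignment: y_net follows a & b at all times
    assign y_net = a & b;

    // reg assigned in a combinational block: an AND gate, no storage
    always @(*) begin
        y_comb = a & b;
    end

    // reg assigned on a clock edge: a flip-flop capturing a & b
    always @(posedge clk) begin
        y_ff <= a & b;
    end
endmodule
```

Here y_net and y_comb should describe the same gate; the choice between wire and reg only reflects whether the signal is assigned by assign or inside an always block.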
Use parameter to make modules reusable with configurable widths, depths, or modes — avoiding duplicate code for different bus sizes.
module reg_file #(
    parameter WIDTH = 8,
    parameter DEPTH = 16
)(
    input  wire                     clk,
    input  wire                     rst_n,
    input  wire                     wr_en,
    input  wire [$clog2(DEPTH)-1:0] wr_addr,
    input  wire [WIDTH-1:0]         wr_data,
    input  wire [$clog2(DEPTH)-1:0] rd_addr,
    output reg  [WIDTH-1:0]         rd_data
);
    reg [WIDTH-1:0] mem [0:DEPTH-1];
    // Loop index declared at module scope: plain Verilog does not allow
    // declarations inside an unnamed begin/end block
    integer i;
    // Write port
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            for (i = 0; i < DEPTH; i = i + 1)
                mem[i] <= {WIDTH{1'b0}};
        end else if (wr_en) begin
            mem[wr_addr] <= wr_data;
        end
    end
    // Read port (async)
    always @(*) begin
        rd_data = mem[rd_addr];
    end
endmodule
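To show how the parameters are overridden at instantiation, here is a hedged sketch of a wrapper around reg_file. The wrapper name reg_file_top, the instance name u_reg_file, and the chosen sizes are assumptions made for illustration.

```verilog
// Hypothetical wrapper: instantiates reg_file as a 32-entry x 32-bit file
module reg_file_top (
    input  wire        clk,
    input  wire        rst_n,
    input  wire        wr_en,
    input  wire [4:0]  wr_addr,   // $clog2(32) = 5 address bits
    input  wire [31:0] wr_data,
    input  wire [4:0]  rd_addr,
    output wire [31:0] rd_data
);
    // Named parameter override: WIDTH and DEPTH replace the defaults (8, 16)
    reg_file #(
        .WIDTH (32),
        .DEPTH (32)
    ) u_reg_file (
        .clk     (clk),
        .rst_n   (rst_n),
        .wr_en   (wr_en),
        .wr_addr (wr_addr),
        .wr_data (wr_data),
        .rd_addr (rd_addr),
        .rd_data (rd_data)
    );
endmodule
```

Named overrides are generally safer than positional ones because they keep working if the parameter order in reg_file changes.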
The distinction between blocking (=) and non-blocking (<=) assignments is the most critical one in Verilog RTL. Mixing the two incorrectly produces simulations that diverge from synthesized hardware.
| Property | Blocking ( = ) | Non-Blocking ( <= ) |
|---|---|---|
| Execution order | Sequential — each line waits for the previous | Parallel — RHS evaluated together, LHS updated together |
| Hardware target | Combinational logic (assign, always @(*)) | Sequential logic (always @(posedge clk)) |
| Race condition risk | High if used in clocked blocks | None — designed for clocked blocks |
| Synthesis inference | Wire / combinational logic | Flip-flop register |
| Simulation model | Immediate update in the active region | Update deferred to NBA (Non-Blocking Assignment) region |
Always use <= inside clocked always blocks. Always use = inside combinational always @(*) blocks. Never mix both in the same always block.
always @(posedge clk) begin
    a <= b; // non-blocking
    b <= a; // swaps a and b
end
// Both RHS evaluated before any LHS update
// Result: a gets old b, b gets old a
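always @(posedge clk) begin
    a = b; // blocking: a updated now
    b = a; // b gets new a, not old a
end
// Both end up with the original value of b, so no swap occurs
// Other clocked blocks that read a or b race with this one,
// which is where simulation and synthesis can diverge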
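The two behaviours can be checked in simulation with a small testbench. This is a sketch only: the module name swap_tb and the signal names are hypothetical, and the code is simulation-only (it uses delays and an initial block, so it is not synthesizable).

```verilog
`timescale 1ns/1ps
// Hypothetical simulation-only testbench comparing the two assignment styles
module swap_tb;
    reg clk = 1'b0;
    reg a_nb, b_nb;   // updated with non-blocking assignments
    reg a_bl, b_bl;   // updated with blocking assignments

    always #5 clk = ~clk;   // free-running clock, first rising edge at t = 5

    // Non-blocking: both right-hand sides are sampled, then both targets update
    always @(posedge clk) begin
        a_nb <= b_nb;
        b_nb <= a_nb;
    end

    // Blocking: a_bl changes first, so b_bl reads the new a_bl
    always @(posedge clk) begin
        a_bl = b_bl;
        b_bl = a_bl;
    end

    initial begin
        a_nb = 1'b0; b_nb = 1'b1;
        a_bl = 1'b0; b_bl = 1'b1;
        @(posedge clk);
        #1;   // let the clocked blocks settle before printing
        $display("non-blocking: a=%b b=%b", a_nb, b_nb);  // a=1 b=0 (swapped)
        $display("blocking:     a=%b b=%b", a_bl, b_bl);  // a=1 b=1 (no swap)
        $finish;
    end
endmodule
```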
RTL design uses two always block templates — one for pure combinational logic, one for clocked registers. Keeping them separate is a synthesis best practice.
// Combinational MUX
always @(*) begin
    case (sel)
        2'b00:   out = in0;
        2'b01:   out = in1;
        2'b10:   out = in2;
        default: out = in3; // REQUIRED
    endcase
end
// Equivalent continuous assign
assign out = (sel == 2'b00) ? in0 :
             (sel == 2'b01) ? in1 :
             (sel == 2'b10) ? in2 : in3;
// D Flip-Flop with sync reset
always @(posedge clk) begin
    if (rst_n == 1'b0)
        q <= 1'b0;
    else
        q <= d;
end
// With clock enable
always @(posedge clk) begin
    if (!rst_n)
        q <= 1'b0;
    else if (en)
        q <= d;
    // else q holds: no latch,
    // because this is clocked
end
Use always @(*) for combinational logic: the simulator automatically tracks all signals read inside. Never manually write sensitivity lists for combinational blocks; missing a signal causes simulation-synthesis mismatch.
Unintended latches are one of the most common RTL bugs. They are level-sensitive storage elements that synthesis tools create when a combinational block has incomplete output assignments.
always @(*) begin
    if (en) begin
        out = data_in; // only assigned when en=1
    end
    // When en=0, 'out' not assigned
    // → synthesis infers a latch to hold value
end
always @(*) begin
    out = 1'b0; // default at top
    if (en) begin
        out = data_in; // overrides when en=1
    end
    // All paths assign 'out' → pure combo
end
| Cause | Example | Fix |
|---|---|---|
| Incomplete if without else | if (sel) y = a; | Add else y = 0; or default at top |
| case without default | case(op) ... endcase | Add default: y = 0; |
| Output not assigned all paths | Output only in one branch | Assign default before the if/case |
| Missing signal in sensitivity list | always @(a) — misses b | Use always @(*) |
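The "assign a default before the if/case" fix from the table matters most when one block drives several outputs. The sketch below assumes a hypothetical 2-bit opcode decoder; the module name, ports, and opcode meanings are invented for illustration.

```verilog
// Hypothetical decoder: every output gets a default before the case
module op_decode (
    input  wire [1:0] op,
    output reg        rd_en,
    output reg        wr_en,
    output reg        err
);
    always @(*) begin
        // Defaults first: each output is assigned on every path
        rd_en = 1'b0;
        wr_en = 1'b0;
        err   = 1'b0;
        case (op)
            2'b00:   ;               // no-op: defaults apply
            2'b01:   rd_en = 1'b1;
            2'b10:   wr_en = 1'b1;
            default: err   = 1'b1;   // 2'b11 and any X/Z value
        endcase
    end
endmodule
```

Without the three default assignments, each branch would have to assign all three outputs explicitly, and forgetting one would be expected to infer a latch on that output.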
FSMs are the backbone of control logic in RTL. The two-always-block template (state register + next-state/output logic) is the industry standard for synthesis-clean FSMs.
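Mealy: The output depends on the current state and the inputs, so it is computed in the combinational next-state block. It can respond one cycle earlier than a Moore output, but it can glitch when the inputs glitch.
Moore: The output depends only on the current state and is typically registered, which makes it cleaner and glitch-free. Preferred for most ASIC control logic. It may require an extra state but is more robust.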
// State encoding: use parameter for readability
parameter IDLE  = 2'b00,
          FETCH = 2'b01,
          EXEC  = 2'b10,
          DONE  = 2'b11;
reg [1:0] state, next_state;
// Block 1: State register (sequential)
always @(posedge clk or negedge rst_n) begin
    if (!rst_n)
        state <= IDLE;
    else
        state <= next_state;
end
// Block 2: Next-state + output logic (combinational)
always @(*) begin
    next_state = state; // default: stay in state
    out = 1'b0;         // default output
    case (state)
        IDLE: begin
            if (start)
                next_state = FETCH;
        end
        FETCH: begin
            next_state = EXEC;
        end
        EXEC: begin
            out = 1'b1;
            if (done)
                next_state = DONE;
        end
        DONE: begin
            next_state = IDLE;
        end
        default: next_state = IDLE;
    endcase
end
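To make the Mealy/Moore difference concrete, here is a sketch that reuses the same four states and adds three outputs. The module name fsm_outputs and the output names busy_moore, finish_mealy, and busy_reg are assumptions for illustration, not part of the template above.

```verilog
// Hypothetical FSM showing Moore, Mealy, and registered output styles
module fsm_outputs (
    input  wire clk,
    input  wire rst_n,
    input  wire start,
    input  wire done,
    output reg  busy_moore,    // Moore: function of state only
    output reg  finish_mealy,  // Mealy: function of state and input
    output reg  busy_reg       // registered output: driven by a flip-flop
);
    localparam IDLE = 2'b00, FETCH = 2'b01, EXEC = 2'b10, DONE = 2'b11;
    reg [1:0] state, next_state;

    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) state <= IDLE;
        else        state <= next_state;
    end

    always @(*) begin
        next_state   = state;
        busy_moore   = 1'b0;
        finish_mealy = 1'b0;
        case (state)
            IDLE:    if (start) next_state = FETCH;
            FETCH:   next_state = EXEC;
            EXEC: begin
                busy_moore   = 1'b1;
                finish_mealy = done;  // reacts to 'done' in the same cycle
                if (done) next_state = DONE;
            end
            DONE:    next_state = IDLE;
            default: next_state = IDLE;
        endcase
    end

    // Registered from next_state so it lines up with state == EXEC
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) busy_reg <= 1'b0;
        else        busy_reg <= (next_state == EXEC);
    end
endmodule
```

finish_mealy follows any glitch on done while the FSM is in EXEC, whereas busy_reg only changes on a clock edge.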
Binary: Uses ⌈log₂(N)⌉ FFs for N states. More complex next-state logic. Good for large FSMs in area-constrained ASIC designs.
One-hot: One FF per state. Simplest next-state logic: just check one bit. Preferred for FPGAs and high-speed ASIC control paths.
Gray: Adjacent states differ by one bit, which reduces glitches and switching activity. Used in counters crossing clock domains.
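As an illustration, the three encodings for a four-state machine (IDLE, FETCH, EXEC, DONE) might be written as below. The module name, the B_/O_/G_ prefixes, and the free-running state sequence are assumptions made for this sketch; only the Gray constants are exercised.

```verilog
// Hypothetical module: three encodings of the same four states
module encoding_examples (
    input  wire       clk,
    input  wire       rst_n,
    output reg  [1:0] state_gray
);
    // Binary: 2 FFs for 4 states
    localparam [1:0] B_IDLE = 2'b00, B_FETCH = 2'b01,
                     B_EXEC = 2'b10, B_DONE  = 2'b11;
    // One-hot: 4 FFs, exactly one bit set per state
    localparam [3:0] O_IDLE = 4'b0001, O_FETCH = 4'b0010,
                     O_EXEC = 4'b0100, O_DONE  = 4'b1000;
    // Gray: 00 -> 01 -> 11 -> 10 -> 00 changes one bit per transition
    localparam [1:0] G_IDLE = 2'b00, G_FETCH = 2'b01,
                     G_EXEC = 2'b11, G_DONE  = 2'b10;

    // Free-running Gray-coded sequence through the four states
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            state_gray <= G_IDLE;
        else begin
            case (state_gray)
                G_IDLE:  state_gray <= G_FETCH;
                G_FETCH: state_gray <= G_EXEC;
                G_EXEC:  state_gray <= G_DONE;
                G_DONE:  state_gray <= G_IDLE;
                default: state_gray <= G_IDLE;
            endcase
        end
    end
endmodule
```

The one-bit-per-transition property only holds for the transitions the FSM actually takes, so a Gray encoding has to be chosen with the state graph in mind.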
RTL written without these rules may simulate correctly but synthesize into hardware that behaves differently — or fail timing closure.
| Rule | What to Do | Why |
|---|---|---|
| Clocking | One clock per always block | Avoids undefined behavior with multiple edges |
| Reset style | Synchronous reset preferred | Easier timing closure; async reset needs special care in CDC |
| Assignments | <= in clocked, = in combinational | Matches hardware semantics; prevents sim/synth mismatch |
| Sensitivity list | Use always @(*) for combo | Auto-updates, prevents incomplete sensitivity list bugs |
| Delays | No #delay in RTL | Delays are ignored by synthesis; simulation-only construct |
| Latches | Always assign defaults | Prevents unintended latch inference |
| Initial blocks | Avoid in synthesizable RTL | Not supported in all synthesis flows; use reset instead |
| Case statements | Always include default | Covers unused encodings and prevents latch creation |
| Arithmetic | Mind bit-width on overflow | Synthesis matches the declared width — truncation can corrupt values |
| Generate | Use generate for repetitive structures | Keeps code scalable and readable; synthesizes correctly |
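The arithmetic rule can be seen with a small adder. The module add_width and its port names are hypothetical; the sketch relies on the standard Verilog rule that operands are extended to the width of the assignment target before the addition.

```verilog
// Hypothetical adder: same expression, two result widths
module add_width (
    input  wire [7:0] a,
    input  wire [7:0] b,
    output wire [7:0] sum_trunc,  // 8-bit result: carry out is dropped
    output wire [8:0] sum_full    // 9-bit result: carry out is kept
);
    // Evaluated at 8 bits: 200 + 100 wraps to 44
    assign sum_trunc = a + b;

    // Evaluated at 9 bits because the target is 9 bits wide: 200 + 100 = 300
    assign sum_full  = a + b;
endmodule
```

Declaring the result one bit wider than the operands is usually the simplest way to keep the carry of an addition.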
The scenarios below contrast the correct and incorrect coding pattern for common RTL situations and show what hardware each infers.
always @(posedge clk) begin
    q1 <= d;  // FF
    q2 <= q1; // FF, second pipeline stage
end
// q1 and q2 are two pipeline stages
// Infers: 2 flip-flops
always @(posedge clk) begin
    q1 = d;  // q1 updated NOW
    q2 = q1; // gets new q1, not old
end
// q2 immediately equals d: no pipeline!
// Infers: two FFs that both capture d, not the intended two stages
Non-blocking assignments model pipelined registers correctly. Both RHS values are sampled before any LHS is updated — this is the fundamental flip-flop behavior.
always @(*) begin
    y = 1'b0; // default
    if (sel)
        y = data;
end
// All paths assign y → pure combinational
// Infers: MUX, no latch
always @(*) begin
    if (sel)
        y = data;
    // No else: y not assigned when sel=0
end
// y must "remember" its last value
// Infers: latch (level-sensitive storage)
Latches cause timing closure problems — they are transparent, not edge-triggered, and STA tools cannot easily constrain them. Always assign a default.
always @(*) begin
    next_state = IDLE; // default
    case (state)
        IDLE:    if (req) next_state = ACTIVE;
        ACTIVE:  next_state = DONE;
        DONE:    next_state = IDLE;
        default: next_state = IDLE; // safety
    endcase
end
// No latch: all states covered
always @(*) begin
    case (state)
        IDLE:   next_state = ACTIVE;
        ACTIVE: next_state = DONE;
        DONE:   next_state = IDLE;
        // no default!
    endcase
end
// Undefined states → latch inferred
// Or: X-propagation in simulation
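The default assignment before the case statement and the default branch inside it serve different purposes: the first guarantees that next_state is assigned on every path, so no latch is inferred, and the second sends any unused or illegal state encoding back to a known state. Use both for a fully robust FSM.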
always @(posedge clk or negedge rst_n) begin
    if (!rst_n)
        q <= 1'b0; // async reset
    else
        q <= d;
end
// rst_n in sensitivity list → async behavior
// Reset takes effect immediately (no clock needed)
always @(posedge clk) begin
    if (!rst_n)
        q <= 1'b0; // sync reset
    else
        q <= d;
end
// rst_n NOT in sensitivity list → sync behavior
// Reset only takes effect on clock edge
// Easier timing closure: preferred in ASIC
Both are valid. Sync reset is easier to close timing on. Async reset is useful when the chip must reset without a running clock (power-on, brownout). In CDC designs, async reset must be synchronized before release.
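One common way to synchronize the release of an async reset is a two-flop reset synchronizer. The following is a minimal sketch under that assumption; the module name reset_sync and the signal names arst_n and rst_n_sync are hypothetical.

```verilog
// Hypothetical reset synchronizer: asynchronous assert, synchronous release
module reset_sync (
    input  wire clk,
    input  wire arst_n,      // raw asynchronous reset, active low
    output wire rst_n_sync   // reset to distribute inside the clk domain
);
    reg [1:0] sync_ff;

    always @(posedge clk or negedge arst_n) begin
        if (!arst_n)
            sync_ff <= 2'b00;              // assert immediately, no clock needed
        else
            sync_ff <= {sync_ff[0], 1'b1}; // release ripples through two FFs
    end

    // Goes high two clock edges after arst_n is released
    assign rst_n_sync = sync_ff[1];
endmodule
```

Downstream flip-flops would then use rst_n_sync in their negedge reset branch, so that they all leave reset on a clock edge rather than at an arbitrary time.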
Answers to the most frequently asked RTL design questions in VLSI interviews and design reviews.
wire is a net: it must be driven at all times by a continuous assignment or module output port. reg is a variable that retains its last assigned value between always block executions. In synthesis, a reg assigned inside always @(posedge clk) infers a flip-flop, while a reg assigned inside always @(*) infers combinational logic (a wire). The name "reg" is misleading: it does not necessarily mean a hardware register.
Blocking (=) executes sequentially: each statement completes before the next starts. Non-blocking (<=) evaluates all right-hand sides first, then updates all left-hand sides simultaneously at the end of the time step. Use = in combinational blocks and <= in clocked blocks. Mixing them in the same clocked block causes simulation-synthesis mismatch and race conditions.
To avoid unintended latches, assign every output on every path of a combinational block: set a default at the top, or give every if an else and every case a default branch. Tools like Synopsys Design Compiler will warn you about latch inference; treat these warnings as errors.