Models all 32 banks (8 bank groups × 4 banks) per HBM3 pseudo-channel. Tracks open/precharged state per bank and enforces inter-bank timing: tRRDs, tRRDl, tCCDs, tCCDl, tFAW and tWTR. Fully synthesizable Verilog.
HBM3 divides its 32 banks per pseudo-channel into 8 bank groups (BG0–BG7), each containing 4 banks (B0–B3). This is not just an addressing scheme — it unlocks tighter CAS-to-CAS spacing.
Consecutive CAS commands to different bank groups need only tCCDs = 4 cycles (2 ns). Consecutive CAS commands within the same bank group must wait the longer tCCDl = 8 cycles (4 ns), because the banks in a group share internal data paths. A scheduler that interleaves across BGs can therefore keep the data bus nearly continuously busy and maximise throughput.
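The spacing rule can be sketched as a small combinational helper. This is an illustrative fragment only: the module name `cas_gap_sel` and its ports are hypothetical and not part of hbm3_bank_fsm, and it assumes the usual convention that the "s" (short) value applies across bank groups and the "l" (long) value within one group.

```verilog
// Hypothetical helper: picks the minimum CAS-to-CAS gap for the next command.
// Defaults are the 2 GHz cycle counts quoted in this section.
module cas_gap_sel #(
    parameter tCCDs = 4,  // cycles, consecutive CAS to different bank groups
    parameter tCCDl = 8   // cycles, consecutive CAS within one bank group
)(
    input  wire [2:0] i_prev_bg,  // bank group of the previous CAS
    input  wire [2:0] i_next_bg,  // bank group of the candidate CAS
    output wire [4:0] o_min_gap   // required spacing in controller cycles
);
    // Same group: long spacing. Different group: short spacing.
    assign o_min_gap = (i_prev_bg == i_next_bg) ? tCCDl[4:0] : tCCDs[4:0];
endmodule
```

A scheduler that always has a pending request for a different bank group can issue a CAS every tCCDs cycles; it only pays tCCDl when it has to stay in the same group.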
| Parameter | Symbol | Cycles (2 GHz) | Time | Applies To |
|---|---|---|---|---|
| ACT-to-ACT, diff BG | tRRDs | 4 | 2 ns | Two ACTs targeting different bank groups |
| ACT-to-ACT, same BG | tRRDl | 8 | 4 ns | Two ACTs targeting the same bank group |
| CAS-to-CAS, diff BG | tCCDs | 4 | 2 ns | RD/WR back-to-back, different bank groups |
| CAS-to-CAS, same BG | tCCDl | 8 | 4 ns | RD/WR back-to-back, same bank group |
| Four-Activate Window | tFAW | 32 | 16 ns | Max 4 ACTs in any rolling 32-cycle window |
| Write-to-Read, diff BG | tWTRs | 8 | 4 ns | WR then RD on different bank groups |
| Write-to-Read, same BG | tWTRl | 16 | 8 ns | WR then RD on same bank group |
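The cycle counts in the table are each time divided by the 0.5 ns clock period. A minimal sketch of that conversion as a Verilog constant function follows; the module name `hbm3_timing_cycles`, the function `ps_to_cycles`, and the `*_CYC` names are hypothetical and exist only for illustration.

```verilog
// Hypothetical helper module: converts the table's times into cycle counts.
module hbm3_timing_cycles;
    localparam integer CLK_PERIOD_PS = 500; // 2 GHz controller clock

    // Round a time in picoseconds up to whole clock cycles.
    function integer ps_to_cycles(input integer t_ps);
        begin
            ps_to_cycles = (t_ps + CLK_PERIOD_PS - 1) / CLK_PERIOD_PS;
        end
    endfunction

    localparam integer tRRDs_CYC = ps_to_cycles(2000);   // 2 ns  -> 4 cycles
    localparam integer tRRDl_CYC = ps_to_cycles(4000);   // 4 ns  -> 8 cycles
    localparam integer tFAW_CYC  = ps_to_cycles(16000);  // 16 ns -> 32 cycles
    localparam integer tWTRl_CYC = ps_to_cycles(8000);   // 8 ns  -> 16 cycles

    initial $display("tRRDs=%0d tRRDl=%0d tFAW=%0d tWTRl=%0d",
                     tRRDs_CYC, tRRDl_CYC, tFAW_CYC, tWTRl_CYC);
endmodule
```

Rounding up keeps a constraint safe when a time is not an exact multiple of the clock period, which matters if the controller clock is changed from 2 GHz.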
| Port | Dir | Width | Description |
|---|---|---|---|
| i_clk / i_rst_n | in | 1 | Clock / active-low synchronous reset |
| i_cmd_act/rd/wr/pre | in | 1 each | ACTIVATE / READ / WRITE / PRECHARGE command pulses |
| i_cmd_prea | in | 1 | PRECHARGE ALL — closes every open bank in one command |
| i_cmd_ref | in | 1 | ALL-BANK REFRESH — all 32 banks enter refresh state |
| i_bg_sel[2:0] | in | 3 | Target bank group (0–7) |
| i_ba_sel[1:0] | in | 2 | Target bank within group (0–3) |
| o_bank_active | out | 1 | Selected bank has an open row (ACTIVE state) |
| o_bank_idle | out | 1 | Selected bank is precharged (IDLE state) |
| o_act_allowed | out | 1 | ACT permitted: tRRDs + tRRDl + tFAW all satisfied |
| o_cas_allowed | out | 1 | CAS permitted: tCCDs + tCCDl + tWTR all satisfied |
| o_banks_open[31:0] | out | 32 | Bitmap of all open banks (bit = BG*4 + BA) |
| o_open_count[4:0] | out | 5 | Number of currently open banks (5 bits hold 0–31, so a count of 32 wraps to 0) |
// ================================================================
// hbm3_bank_fsm.v
// HBM3 Bank & Bank-Group State Machine — Phase 1 · Module 2
// 8 Bank Groups x 4 Banks = 32 banks per pseudo-channel
// Enforces tRRDs/tRRDl, tCCDs/tCCDl, tFAW, tWTRs/tWTRl
// Synthesizable Verilog — EcrioniX HBM3 Controller Series
// ================================================================
module hbm3_bank_fsm #(
// Defaults calibrated for 2 GHz controller clock (0.5 ns/cycle)
// Suffix s = short (different bank groups), l = long (same bank group)
parameter tRRDs = 4, // ACT-to-ACT, diff BG 2 ns
parameter tRRDl = 8, // ACT-to-ACT, same BG 4 ns
parameter tCCDs = 4, // CAS-to-CAS, diff BG 2 ns
parameter tCCDl = 8, // CAS-to-CAS, same BG 4 ns
parameter tFAW = 32, // Four-Activate Window 16 ns
parameter tWTRs = 8, // Write-to-Read, diff BG 4 ns
parameter tWTRl = 16 // Write-to-Read, same BG 8 ns
)(
input wire i_clk,
input wire i_rst_n, // Active-low synchronous reset
// Command inputs (one-hot pulses from scheduler)
input wire i_cmd_act, // ACTIVATE
input wire i_cmd_rd, // READ
input wire i_cmd_wr, // WRITE
input wire i_cmd_pre, // PRECHARGE single bank
input wire i_cmd_prea, // PRECHARGE ALL banks
input wire i_cmd_ref, // ALL-BANK REFRESH
// Target address
input wire [2:0] i_bg_sel, // Bank group select (0-7)
input wire [1:0] i_ba_sel, // Bank address within group (0-3)
// Status outputs
output wire o_bank_active, // Selected bank is ACTIVE (row open)
output wire o_bank_idle, // Selected bank is IDLE (precharged)
output wire o_act_allowed, // ACT may be issued to selected bank
output wire o_cas_allowed, // RD/WR may be issued to selected bank
output wire [31:0] o_banks_open, // Bitmap: bit[BG*4+BA] = 1 if ACTIVE
output wire [4:0] o_open_count // Count of currently open banks
);
// -- Bank state encoding ------------------------------------------
localparam [1:0]
BS_IDLE = 2'd0, // Precharged, ready for ACT
BS_ACTIVE = 2'd1, // Row open, ready for RD/WR
BS_REFRESH = 2'd2; // Refresh in progress
// Per-bank state array [BG][Bank]
reg [1:0] bstate [0:7][0:3];
// Per-BG timing counters (same-BG constraints, long values)
reg [4:0] bg_act_cnt [0:7]; // tRRDl countdown per BG
reg [4:0] bg_cas_cnt [0:7]; // tCCDl countdown per BG
reg [4:0] bg_wtr_cnt [0:7]; // tWTRl countdown per BG
// Global timing counters (cross-BG constraints, short values)
reg [4:0] gl_act_cnt; // tRRDs countdown
reg [4:0] gl_cas_cnt; // tCCDs countdown
reg [4:0] gl_wtr_cnt; // tWTRs countdown
// Four-Activate Window (tFAW)
// Saturating counter: max 4 ACTs allowed in tFAW cycles
reg [2:0] act_in_faw; // ACTs in current window (0-4)
reg [5:0] faw_timer; // Reloads on each ACT
integer i, j;
// -- Sequential: bank states and timing counters ------------------
always @(posedge i_clk) begin
if (!i_rst_n) begin
for (i = 0; i < 8; i = i+1) begin
bg_act_cnt[i] <= 5'd0;
bg_cas_cnt[i] <= 5'd0;
bg_wtr_cnt[i] <= 5'd0;
for (j = 0; j < 4; j = j+1)
bstate[i][j] <= BS_IDLE;
end
gl_act_cnt <= 5'd0;
gl_cas_cnt <= 5'd0;
gl_wtr_cnt <= 5'd0;
act_in_faw <= 3'd0;
faw_timer <= 6'd0;
end else begin
// -- Decrement all per-BG timers -------------------------
for (i = 0; i < 8; i = i+1) begin
if (bg_act_cnt[i] > 0) bg_act_cnt[i] <= bg_act_cnt[i] - 1;
if (bg_cas_cnt[i] > 0) bg_cas_cnt[i] <= bg_cas_cnt[i] - 1;
if (bg_wtr_cnt[i] > 0) bg_wtr_cnt[i] <= bg_wtr_cnt[i] - 1;
end
if (gl_act_cnt > 0) gl_act_cnt <= gl_act_cnt - 1;
if (gl_cas_cnt > 0) gl_cas_cnt <= gl_cas_cnt - 1;
if (gl_wtr_cnt > 0) gl_wtr_cnt <= gl_wtr_cnt - 1;
// -- FAW sliding window timer ----------------------------
// Each ACT reloads the timer; once it reaches zero, the count drains by one per cycle
if (faw_timer > 0) begin
faw_timer <= faw_timer - 1;
end else if (act_in_faw > 0) begin
act_in_faw <= act_in_faw - 1; // oldest ACT exits the window
end
// -- ACTIVATE --------------------------------------------
// Same-BG spacing uses the long value (tRRDl); the global counter
// enforces the short value (tRRDs) against every other bank group.
if (i_cmd_act && o_act_allowed &&
bstate[i_bg_sel][i_ba_sel] == BS_IDLE) begin
bstate[i_bg_sel][i_ba_sel] <= BS_ACTIVE;
bg_act_cnt[i_bg_sel] <= tRRDl[4:0] - 1;
gl_act_cnt <= tRRDs[4:0] - 1;
faw_timer <= tFAW[5:0] - 1;
if (act_in_faw < 4) act_in_faw <= act_in_faw + 1;
end
// -- READ ------------------------------------------------
if (i_cmd_rd && bstate[i_bg_sel][i_ba_sel] == BS_ACTIVE) begin
bg_cas_cnt[i_bg_sel] <= tCCDl[4:0] - 1; // same BG: long
gl_cas_cnt <= tCCDs[4:0] - 1; // any BG: short
end
// -- WRITE -----------------------------------------------
if (i_cmd_wr && bstate[i_bg_sel][i_ba_sel] == BS_ACTIVE) begin
bg_cas_cnt[i_bg_sel] <= tCCDl[4:0] - 1; // same BG: long
gl_cas_cnt <= tCCDs[4:0] - 1; // any BG: short
bg_wtr_cnt[i_bg_sel] <= tWTRl[4:0] - 1; // same BG: long
gl_wtr_cnt <= tWTRs[4:0] - 1; // any BG: short
end
// -- PRECHARGE single bank -------------------------------
if (i_cmd_pre && bstate[i_bg_sel][i_ba_sel] == BS_ACTIVE)
bstate[i_bg_sel][i_ba_sel] <= BS_IDLE;
// -- PRECHARGE ALL ---------------------------------------
if (i_cmd_prea) begin
for (i = 0; i < 8; i = i+1)
for (j = 0; j < 4; j = j+1)
if (bstate[i][j] == BS_ACTIVE)
bstate[i][j] <= BS_IDLE;
end
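// -- ALL-BANK REFRESH ------------------------------------
// NOTE: only reset returns banks from BS_REFRESH to BS_IDLE in this
// module; no refresh-completion timer is modelled here.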
if (i_cmd_ref) begin
for (i = 0; i < 8; i = i+1)
for (j = 0; j < 4; j = j+1)
bstate[i][j] <= BS_REFRESH;
end
end
end
// -- Combinational outputs ----------------------------------------
assign o_bank_active = (bstate[i_bg_sel][i_ba_sel] == BS_ACTIVE);
assign o_bank_idle = (bstate[i_bg_sel][i_ba_sel] == BS_IDLE);
// ACT allowed: same-BG tRRDl met AND global tRRDs met AND FAW < 4
assign o_act_allowed = (bg_act_cnt[i_bg_sel] == 0) &&
(gl_act_cnt == 0) &&
(act_in_faw < 4);
// CAS allowed: tCCDs + tCCDl + tWTR all satisfied
assign o_cas_allowed = (bg_cas_cnt[i_bg_sel] == 0) &&
(gl_cas_cnt == 0) &&
(bg_wtr_cnt[i_bg_sel] == 0) &&
(gl_wtr_cnt == 0);
// All-banks open bitmap (synthesises as 32 comparators)
genvar gi, gj;
generate
for (gi = 0; gi < 8; gi = gi+1)
for (gj = 0; gj < 4; gj = gj+1)
assign o_banks_open[gi*4 + gj] =
(bstate[gi][gj] == BS_ACTIVE);
endgenerate
// Popcount of o_banks_open (o_open_count)
// NOTE: the 5-bit result holds 0-31; with all 32 banks open it wraps to 0
reg [4:0] cnt_tmp;
integer k;
always @(*) begin
cnt_tmp = 5'd0;
for (k = 0; k < 32; k = k+1)
cnt_tmp = cnt_tmp + o_banks_open[k];
end
assign o_open_count = cnt_tmp;
endmodule
The act_in_faw counter uses a simplified sliding window: it increments on each ACT and decrements once faw_timer has expired. This is a conservative tFAW model: once 4 ACTs are counted, no more are allowed until the oldest exits the window. For a cycle-accurate model, use a 4-entry timestamp FIFO.
// ================================================================
// tb_hbm3_bank_fsm.sv
// Self-checking testbench for hbm3_bank_fsm
// Uses reduced timing parameters for simulation speed
// ================================================================
`timescale 1ns/1ps
module tb_hbm3_bank_fsm;
localparam tRRDs = 3;
localparam tRRDl = 5;
localparam tCCDs = 3;
localparam tCCDl = 5;
localparam tFAW = 12;
localparam tWTRs = 4;
localparam tWTRl = 6;
reg i_clk=0, i_rst_n=0;
reg i_cmd_act=0, i_cmd_rd=0, i_cmd_wr=0;
reg i_cmd_pre=0, i_cmd_prea=0, i_cmd_ref=0;
reg [2:0] i_bg_sel=0;
reg [1:0] i_ba_sel=0;
wire o_bank_active, o_bank_idle, o_act_allowed, o_cas_allowed;
wire [31:0] o_banks_open;
wire [4:0] o_open_count;
hbm3_bank_fsm #(
.tRRDs(tRRDs),.tRRDl(tRRDl),.tCCDs(tCCDs),
.tCCDl(tCCDl),.tFAW(tFAW),.tWTRs(tWTRs),.tWTRl(tWTRl)
) dut (.*);
always #0.25 i_clk = ~i_clk;
// Helpers: stimulus changes and checks happen around the falling edge,
// so the DUT samples stable inputs on the rising edge.
task automatic pulse(ref reg sig);
    @(negedge i_clk); sig = 1;   // held across exactly one rising edge
    @(negedge i_clk); sig = 0;
endtask
task set_addr(input [2:0] bg, input [1:0] ba);
    i_bg_sel = bg; i_ba_sel = ba;
    #0.05;                       // let combinational outputs settle
endtask
integer pass=0, fail=0;
// got/exp are 32 bits wide so bitmap and count checks compare every bit
task check(input string name, input logic [31:0] got, input logic [31:0] exp);
    if (got===exp) begin $display(" PASS: %s",name); pass++; end
    else begin $display(" FAIL: %s (got=%0h exp=%0h)",name,got,exp); fail++; end
endtask
// -- SVA Assertions -----------------------------------------------
// ACT must target an idle (precharged) bank
property p_act_needs_idle;
    @(posedge i_clk) i_cmd_act |-> o_bank_idle;
endproperty
assert property(p_act_needs_idle) else
    $error("ASSERT FAIL: ACT issued to a bank that is not idle");
// CAS on idle bank is illegal
property p_cas_needs_active;
    @(posedge i_clk) (i_cmd_rd || i_cmd_wr) |-> o_bank_active;
endproperty
assert property(p_cas_needs_active) else
    $error("ASSERT FAIL: CAS issued without active row");
// FAW counter must never exceed 4
property p_faw_limit;
    @(posedge i_clk) disable iff (!i_rst_n) dut.act_in_faw <= 4;
endproperty
assert property(p_faw_limit) else
    $error("ASSERT FAIL: more than 4 ACTs counted in the FAW window");
// -- Tests --------------------------------------------------------
initial begin
    $dumpfile("tb_hbm3_bank_fsm.vcd");
    $dumpvars(0, tb_hbm3_bank_fsm);
    i_rst_n = 0; repeat(4) @(posedge i_clk);
    @(negedge i_clk); i_rst_n = 1;
    // TEST 1: ACT BG0/B0
    $display("\n[TEST 1] ACTIVATE BG0/B0");
    set_addr(0,0);
    check("Bank idle before ACT", o_bank_idle, 1'b1);
    check("Bank not active before ACT", o_bank_active, 1'b0);
    pulse(i_cmd_act);
    check("Bank active after ACT", o_bank_active, 1'b1);
    check("BG0 bitmap bit set", o_banks_open[0], 1'b1);
    // TEST 2: tRRDl, same BG ACT must wait the long spacing
    $display("\n[TEST 2] tRRDl enforcement: same BG");
    set_addr(0,1); // BG0 B1
    check("act_allowed low right after ACT (same BG)", o_act_allowed, 1'b0);
    repeat(tRRDs) @(negedge i_clk);
    check("act_allowed still low after only tRRDs (same BG)", o_act_allowed, 1'b0);
    repeat(tRRDl-tRRDs) @(negedge i_clk);
    check("act_allowed after tRRDl (same BG)", o_act_allowed, 1'b1);
    // TEST 3: tRRDs, different BG only waits the short spacing
    $display("\n[TEST 3] tRRDs enforcement: different BG");
    pulse(i_cmd_act);  // ACT BG0/B1 (still selected, idle)
    set_addr(1,0);     // BG1
    check("act_allowed low in tRRDs window (diff BG)", o_act_allowed, 1'b0);
    repeat(tRRDs) @(negedge i_clk);
    check("act_allowed after tRRDs (diff BG)", o_act_allowed, 1'b1);
    // TEST 4: tFAW, 4 ACTs then blocked
    $display("\n[TEST 4] Four-Activate Window (tFAW)");
    repeat(tFAW+4) @(negedge i_clk); // let the earlier ACTs drain from the window
    for (int g = 1; g <= 4; g++) begin
        set_addr(g[2:0], 2'd0);      // BG1..BG4, bank 0
        pulse(i_cmd_act);
        repeat(tRRDs) @(negedge i_clk);
    end
    set_addr(5,0); // 5th ACT should be blocked
    check("5th ACT blocked by tFAW", o_act_allowed, 1'b0);
    check("open_count = 6", o_open_count, 5'd6);
    repeat(tFAW) @(negedge i_clk);
    check("ACT allowed after tFAW expires", o_act_allowed, 1'b1);
    // TEST 5: PRECHARGE ALL
    $display("\n[TEST 5] PRECHARGE ALL");
    pulse(i_cmd_prea);
    check("All banks closed after PREA", o_banks_open, 32'd0);
    check("open_count = 0", o_open_count, 5'd0);
    // TEST 6: tCCDl, same BG CAS spacing
    $display("\n[TEST 6] tCCDl CAS spacing: same BG");
    set_addr(0,0); pulse(i_cmd_act);
    pulse(i_cmd_rd);
    check("cas_allowed low after first CAS", o_cas_allowed, 1'b0);
    repeat(tCCDl) @(negedge i_clk);
    check("cas_allowed after tCCDl (same BG)", o_cas_allowed, 1'b1);
    // TEST 7: Write-to-Read tWTRl, same BG
    $display("\n[TEST 7] tWTRl write-to-read: same BG");
    pulse(i_cmd_wr);
    check("Read blocked by tWTRl after WR", o_cas_allowed, 1'b0);
    repeat(tWTRl) @(negedge i_clk);
    check("Read allowed after tWTRl", o_cas_allowed, 1'b1);
    // Summary
    $display("\n========================================");
    $display(" RESULTS: %0d PASS | %0d FAIL", pass, fail);
    if (fail==0) $display(" ALL TESTS PASSED ✅");
    else $display(" FAILURES DETECTED ❌");
    $display("========================================");
    $finish;
end
endmodule
A bank group is a cluster of DRAM banks that share local I/O circuitry and internal data paths. Having 8 independent BGs lets the scheduler pipeline CAS commands across groups with the tight tCCDs spacing (4 cycles) rather than waiting the full tCCDl (8 cycles) required within one group. More BGs means more parallelism and higher effective bandwidth.
tFAW (Four Activate Window) limits peak current draw. Each ACTIVATE command charges the bitlines of an entire row, which is a high-current event. If too many ACTs are issued in quick succession, the DRAM's internal charge pump and VDD rails can droop, causing bit errors. tFAW is a rolling power budget: never more than 4 row activations in any 16 ns window.
After a WRITE, the DRAM still has to move the incoming data through its internal data path into the open row. tWTR is the minimum delay before a READ command may follow, so that the read does not collide with write data still in flight. The delay is longer within the same bank group (tWTRl), where both commands use the group's shared data path, than across bank groups (tWTRs). The DQ bus also changes direction in this interval, from controller-driven to DRAM-driven.
Module 1 and Module 2 run in parallel, and both must grant permission. Module 1 tracks per-bank timing (tRCD, tRAS, tRP, CL): it knows whether THIS bank's row is ready. Module 2 tracks inter-bank timing (tRRD, tCCD, tFAW, tWTR): it knows whether the BUS and CHARGE PUMP are ready. The scheduler ANDs Module 1's ready output with Module 2's o_act_allowed or o_cas_allowed before issuing any command.
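The AND-gating can be sketched as a few lines of glue logic. This is only an illustration under stated assumptions: the module name `cmd_gate`, the request inputs, and the `i_m1_*` ready signals are hypothetical, since Module 1's actual port names are not given here. Only `o_act_allowed` and `o_cas_allowed` come from hbm3_bank_fsm.

```verilog
// Hypothetical glue logic between the scheduler and the two timing modules.
module cmd_gate (
    input  wire i_req_act,         // scheduler wants to issue ACT (assumed name)
    input  wire i_req_cas,         // scheduler wants to issue RD or WR (assumed name)
    input  wire i_m1_act_ready,    // assumed Module 1 output: per-bank ACT timing met
    input  wire i_m1_cas_ready,    // assumed Module 1 output: per-bank CAS timing met
    input  wire i_m2_act_allowed,  // o_act_allowed from hbm3_bank_fsm
    input  wire i_m2_cas_allowed,  // o_cas_allowed from hbm3_bank_fsm
    output wire o_issue_act,       // drive i_cmd_act with this
    output wire o_issue_cas        // qualify i_cmd_rd / i_cmd_wr with this
);
    // A command goes out only when both modules grant it.
    assign o_issue_act = i_req_act & i_m1_act_ready & i_m2_act_allowed;
    assign o_issue_cas = i_req_cas & i_m1_cas_ready & i_m2_cas_allowed;
endmodule
```

Keeping the gate purely combinational means the grant and the command pulse line up in the same cycle, which is what the registered counters in Module 2 expect.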
The o_banks_open bitmap does not change in the same cycle as the ACT pulse. Because bstate is registered, it updates on the clock edge that samples the ACT, so the bitmap always reflects the state after the last rising edge. The scheduler sees o_bank_active go high in the cycle following the ACT pulse, which is the correct pipeline timing: the row is being activated during that cycle.
The tFAW model is a conservative approximation. The act_in_faw counter increments on each ACT and faw_timer reloads from tFAW. Entries start to leave the window only once the timer has expired. Because the timer restarts on every ACT, this over-constrains compared with a precise timestamp FIFO (it can hold off an ACT longer than tFAW strictly requires), but it is always safe and synthesizes more cleanly. Module 18 (Integration) will include the precise 4-entry FIFO version.
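One way to build the precise window is sketched below. It is not the Module 18 implementation: the module name `faw_slots` and its ports are hypothetical, and it uses four per-ACT countdown slots in a circular buffer instead of literal timestamps, which gives the same result. Each ACT occupies one slot for tFAW cycles, and the write pointer always sits on the oldest slot.

```verilog
// Hypothetical exact tFAW tracker: four countdown slots in a circular buffer.
module faw_slots #(
    parameter tFAW = 32            // window length in cycles (must fit in 6 bits)
)(
    input  wire i_clk,
    input  wire i_rst_n,           // active-low synchronous reset
    input  wire i_act,             // one-cycle pulse: an ACT is being issued
    output wire o_faw_ok           // high when a further ACT fits in the window
);
    reg [5:0] slot [0:3];          // cycles left until each recorded ACT ages out
    reg [1:0] wr_ptr;              // next slot to overwrite = oldest recorded ACT
    integer s;

    // If the oldest of the last four ACTs has aged out, fewer than four
    // ACTs fall inside the current tFAW window.
    assign o_faw_ok = (slot[wr_ptr] == 6'd0);

    always @(posedge i_clk) begin
        if (!i_rst_n) begin
            for (s = 0; s < 4; s = s + 1)
                slot[s] <= 6'd0;
            wr_ptr <= 2'd0;
        end else begin
            // Age every live slot by one cycle.
            for (s = 0; s < 4; s = s + 1)
                if (slot[s] != 6'd0) slot[s] <= slot[s] - 6'd1;
            // Record an accepted ACT; this assignment overrides the decrement above.
            if (i_act && o_faw_ok) begin
                slot[wr_ptr] <= tFAW[5:0] - 6'd1;
                wr_ptr       <= wr_ptr + 2'd1;
            end
        end
    end
endmodule
```

With this structure a fifth ACT is allowed exactly tFAW cycles after the first of the previous four, instead of waiting for the whole window to go quiet as the simplified counter does.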