The integration hub for one HBM3 pseudo-channel: it instantiates the timing FSM, bank FSM, refresh controller, and page policy, and wires them together with an 8-state command sequencing FSM.
An HBM3 stack has 16 pseudo-channels, each a logically independent 64-bit memory interface. Every PC has its own bank array, command bus, data bus, and refresh timer. The hbm3_pc_ctrl module is the central coordinator for one pseudo-channel — it doesn't implement timing or banking logic itself; it orchestrates the specialised sub-modules built in Modules 1–9.
Think of it as a CPU's memory controller dispatch unit. Just as an out-of-order CPU has separate issue queues, execution units, and reorder buffers that the dispatcher coordinates, the PC controller has separate timing logic (Module 1), bank state (Module 2), refresh scheduling (Module 3), page policy (Module 4), and request arbitration (Module 9). The PC controller connects these components and runs the final command-issue FSM.
The module also exposes o_state_dbg for post-silicon debug.

A full controller instantiates hbm3_pc_ctrl once per pseudo-channel, plus a top-level arbiter that multiplexes host traffic across them. This article covers the single-PC module in isolation.

Each sub-module has a clearly defined interface role inside the PC controller:
| Sub-Module | Role | Key Signals In | Key Signals Out |
|---|---|---|---|
| Module 1 — Timing FSM | Per-bank timing enforcement | cmd_act, cmd_rd, cmd_wr, cmd_pre | act_allowed, cas_allowed, pre_allowed |
| Module 2 — Bank FSM | 32-bank open/closed state | cmd_act, cmd_pre, bg, ba | banks_open[31:0], row_open[14:0] |
| Module 3 — Refresh Ctrl | tREFI countdown + REF issue | ref_done (from FSM) | ref_req |
| Module 4 — Page Policy | Hit/miss/close decision | req_addr, banks_open, row_open | policy_hit, policy_miss, policy_close |
| Module 9 — Scheduler | FR-FCFS request ordering | req_valid/wr/addr/data, banks_open, any_hit, ref_req | cmd_act/rd/wr/pre, cmd_bg/ba/row/col, cmd_data |
The wiring follows a producer-consumer pattern:
- Bank FSM to scheduler: banks_open for hit detection.
- Timing FSM to scheduler: the act_allowed / cas_allowed gate before actually issuing.
- Bank FSM to page policy: banks_open + row_open to decide whether the next request is a hit/miss.
- Refresh controller to PC controller: Module 3 asserts ref_req, and the PC controller FSM handles the refresh sequence.

The heart of the PC controller is an 8-state FSM that sequences every DRAM command in the correct order and with correct interlocks:
| State | Encoding | Description |
|---|---|---|
| IDLE | 4'd0 | No active transaction. Poll scheduler for next command. |
| DECODE | 4'd1 | Latch selected request. Decode address. Read page policy output. |
| ACTIVATE | 4'd2 | Issue ACT command. Wait tRCD cycles for row to open. |
| READ | 4'd3 | Issue RD command. Wait CL cycles for data return. |
| WRITE | 4'd4 | Issue WR command. Wait CWL cycles. Write data committed. |
| PRECHARGE | 4'd5 | Issue PRE command. Wait tRP cycles. Bank returns to idle. |
| REFRESH | 4'd6 | Issue REF. Wait tRFC (350 ns). All banks refreshed. |
| RETURN | 4'd7 | Assert o_rd_valid, drive o_rd_data to host. |
The FSM path depends on the page policy decision captured in DECODE state:
Page hit: the target bank is already activated to the correct row. The controller skips ACT entirely:
T=0: IDLE — scheduler presents valid read, same row as open bank
T=1: DECODE — policy_hit=1, latch {bg,ba,col}, check cas_allowed
T=2: READ — issue o_cmd_rd, start CL counter (14 cycles)
T=3–15: READ — waiting CL
T=16: RETURN — assert o_rd_valid, drive o_rd_data[127:0]
T=17: IDLE — ready for next request
Page miss to an idle bank (the controller issues ACT, then RD):
T=0: IDLE — new read, bank is precharged (idle)
T=1: DECODE — policy_miss=1 (bank empty), check act_allowed
T=2: ACTIVATE — issue o_cmd_act, start tRCD counter (14 cycles)
T=3–15: ACTIVATE — waiting tRCD
T=16: READ — issue o_cmd_rd, start CL (14 cycles)
T=17–29: READ — waiting CL
T=30: RETURN — assert o_rd_valid, drive o_rd_data
T=31: IDLE
Page conflict, bank open to the wrong row (PRE, then ACT, then RD):
T=0: IDLE — new read, bank open to WRONG row
T=1: DECODE — policy_close=1, must precharge first
T=2: PRECHARGE — issue o_cmd_pre, wait tRP (14 cycles)
T=3–15: PRECHARGE — waiting tRP
T=16: ACTIVATE — issue o_cmd_act, wait tRCD (14 cycles)
T=17–29: ACTIVATE — waiting tRCD
T=30: READ — issue o_cmd_rd, wait CL (14 cycles)
T=31–43: READ — waiting CL
T=44: RETURN — o_rd_valid + o_rd_data
T=45: IDLE
Refresh preemption:
T=0: any state — ref_req asserted by Module 3
T=1: FSM completes current command (if mid-sequence, finish it)
T=N: if banks_open != 0: PRECHARGE all, wait tRP
T=N+14: REFRESH — issue o_cmd_ref (not in port list; handled internally)
T=N+15 to N+714: REFRESH — waiting tRFC = 700 cycles (350 ns @ 2 GHz)
T=N+715: IDLE — ref_done asserted, Module 3 clears ref_req
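The cycle counts in these traces follow directly from the timing parameters. As a rough sketch (the module name pc_latency_sketch and its localparam names are illustrative and not part of the controller), the nominal wait per access path can be tallied as below. It ignores the IDLE and DECODE cycles and any interlock stalls, so it gives a lower bound rather than an exact trace length.

```verilog
`timescale 1ns/1ps
// Illustrative only: tallies the nominal wait cycles for each access path.
module pc_latency_sketch;
  // Same defaults as the hbm3_pc_ctrl parameters
  localparam integer TRCD = 14;
  localparam integer TRP  = 14;
  localparam integer TCL  = 14;
  localparam integer TRFC = 700;
  localparam real    TCK_NS = 0.5;   // one cycle at 2 GHz

  localparam integer HIT_CYC      = TCL;               // RD only
  localparam integer MISS_CYC     = TRCD + TCL;        // ACT + RD
  localparam integer CONFLICT_CYC = TRP + TRCD + TCL;  // PRE + ACT + RD

  initial begin
    $display("page hit      : %0d cycles (%0.1f ns)", HIT_CYC,      HIT_CYC * TCK_NS);
    $display("page miss     : %0d cycles (%0.1f ns)", MISS_CYC,     MISS_CYC * TCK_NS);
    $display("page conflict : %0d cycles (%0.1f ns)", CONFLICT_CYC, CONFLICT_CYC * TCK_NS);
    $display("refresh       : %0d cycles (%0.1f ns)", TRFC,         TRFC * TCK_NS);
    $finish;
  end
endmodule
```

With the default parameters this reports 14, 28 and 42 cycles for the three read paths (7, 14 and 21 ns at 2 GHz), matching the tRCD, tRP and CL waits shown in the traces.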
No command exits the PC controller without passing its interlocks. Three independent guards exist, and every guard that applies to a command must be HIGH in the cycle the command is issued:
| Guard | Source | Blocks | Reason |
|---|---|---|---|
| act_allowed | Module 1 (Timing FSM) | ACT | tRP/tRRD/tFAW not yet satisfied |
| cas_allowed | Module 1 (Timing FSM) | RD/WR | tRCD/tCCD/tWTR not yet satisfied |
| !bank_conflict | Local (banks_in_flight reg) | Any cmd to busy bank | Previous command to same bank still in-flight |
The FSM's ACTIVATE and READ/WRITE states loop (hold state, no command output) until the guards for the pending command pass in the same cycle. This simple stall mechanism ensures correctness without complex handshake protocols between sub-modules.
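// Issue gates used in the ACT and CAS states: Module 1 timing guard
// plus the local bank-conflict check
wire act_gate = act_allowed_w && !banks_in_flight[cur_bank_id];
// CAS is gated on timing only: banks_in_flight is set by this
// transaction's own ACT, so testing it here would stall forever
wire cas_gate = cas_allowed_w;

// In ACTIVATE state:
S_ACTIVATE: begin
    if (act_gate) begin
        cmd_act_r <= 1'b1;                         // one-cycle ACT pulse
        wait_cnt  <= TRCD;                         // counted down in S_READ / S_WRITE
        state     <= cur_is_wr ? S_WRITE : S_READ;
        banks_in_flight[cur_bank_id] <= 1'b1;
    end
    // else: stay in S_ACTIVATE, output nothing
end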
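One way to check this stall behaviour in simulation is a small assertion module. The sketch below is an example built on assumptions, not part of the original design: the module name pc_issue_gate_chk and its port names are invented here, and it assumes the command pulses are registered one cycle after the gate is sampled, as in the FSM excerpt above.

```verilog
// Sketch of a bindable checker. Port names are assumptions chosen to
// mirror the internal nets of hbm3_pc_ctrl.
module pc_issue_gate_chk (
  input logic clk,
  input logic rst_n,
  input logic cmd_act, cmd_rd, cmd_wr,   // registered one-cycle command pulses
  input logic act_allowed, cas_allowed   // Module 1 timing guards
);
  // The command register is loaded on the edge where the gate is sampled,
  // so the pulse is visible one cycle later and the guard is read with $past().
  a_act_gated: assert property (@(posedge clk) disable iff (!rst_n)
      cmd_act |-> $past(act_allowed))
    else $error("ACT issued while act_allowed was low");

  a_cas_gated: assert property (@(posedge clk) disable iff (!rst_n)
      (cmd_rd || cmd_wr) |-> $past(cas_allowed))
    else $error("RD/WR issued while cas_allowed was low");
endmodule

// Possible hookup (assumes the net names used inside hbm3_pc_ctrl):
// bind hbm3_pc_ctrl pc_issue_gate_chk u_chk (
//   .clk(i_clk), .rst_n(i_rst_n),
//   .cmd_act(cmd_act_r), .cmd_rd(cmd_rd_r), .cmd_wr(cmd_wr_r),
//   .act_allowed(act_allowed_w), .cas_allowed(cas_allowed_w));
```

Keeping the checks in a separate module leaves the synthesizable RTL untouched; the bind statement is only needed in the simulation build.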
The PC controller instantiates all sub-modules as black boxes and runs the command sequencing FSM. Sub-module ports use the same i_/o_ convention.
// hbm3_pc_ctrl.v — HBM3 Pseudo-Channel Controller (Module 10)
// Integrates: timing FSM (M1), bank FSM (M2), refresh ctrl (M3),
// page policy (M4), scheduler (M9)
// EcrioniX HBM3 Controller Build · Phase 3
`timescale 1ns/1ps
`default_nettype none
module hbm3_pc_ctrl #(
parameter TRCD = 14, // cycles — row-to-column delay
parameter TRP = 14, // cycles — precharge time
parameter TCL = 14, // cycles — CAS read latency
parameter TCWL = 8, // cycles — CAS write latency
parameter TRFC = 700 // cycles — refresh recovery (350 ns @ 2 GHz)
)(
input wire i_clk,
input wire i_rst_n,
// Which pseudo-channel this instance serves (0–15)
input wire [3:0] i_pc_id,
// Host request interface
input wire i_req_valid,
input wire i_req_wr,
input wire [33:0] i_req_addr,
input wire [127:0] i_req_data,
input wire [15:0] i_req_mask,
output wire o_req_ready,
// Host read data return
output reg [127:0] o_rd_data,
output reg o_rd_valid,
output reg o_wr_done,
// Status
output wire o_busy,
output wire [3:0] o_state_dbg
);
// -----------------------------------------------------------------------
// Internal wires — sub-module interconnect
// -----------------------------------------------------------------------
// Timing FSM (Module 1) outputs
wire act_allowed_w;
wire cas_allowed_w;
wire pre_allowed_w;
// Bank FSM (Module 2) outputs
wire [31:0] banks_open_w;
wire [14:0] row_open_w; // row open in selected bank (simplified)
wire any_hit_w;
// Refresh controller (Module 3) outputs
wire ref_req_w;
// Page policy (Module 4) outputs
wire policy_hit_w;
wire policy_miss_w;
wire policy_close_w;
// Scheduler (Module 9) outputs
wire sched_cmd_act;
wire sched_cmd_rd;
wire sched_cmd_wr;
wire sched_cmd_pre;
wire [2:0] sched_cmd_bg;
wire [1:0] sched_cmd_ba;
wire [14:0] sched_cmd_row;
wire [4:0] sched_cmd_col;
wire [127:0] sched_cmd_data;
wire sched_req_ready;
// Command outputs to timing FSM and bank FSM
reg cmd_act_r, cmd_rd_r, cmd_wr_r, cmd_pre_r, cmd_ref_r;
reg [2:0] cmd_bg_r;
reg [1:0] cmd_ba_r;
reg [14:0] cmd_row_r;
reg [4:0] cmd_col_r;
reg [127:0] cmd_data_r;
// Read data captured from DRAM (simulated: echo cmd_data with CL delay)
// In real silicon this comes from the PHY RX FIFO
reg [127:0] rd_data_pipe [0:15];
reg [3:0] rd_pipe_ptr;
reg rd_pipe_valid [0:15];
// -----------------------------------------------------------------------
// Sub-module instantiations (black boxes — full RTL in earlier modules)
// -----------------------------------------------------------------------
hbm3_timing_fsm #(
.TRCD(TRCD), .TRP(TRP), .TCL(TCL), .TCWL(TCWL)
) u_timing (
.i_clk (i_clk),
.i_rst_n (i_rst_n),
.i_cmd_act (cmd_act_r),
.i_cmd_rd (cmd_rd_r),
.i_cmd_wr (cmd_wr_r),
.i_cmd_pre (cmd_pre_r),
.i_cmd_bg (cmd_bg_r),
.i_cmd_ba (cmd_ba_r),
.o_act_allowed (act_allowed_w),
.o_cas_allowed (cas_allowed_w),
.o_pre_allowed (pre_allowed_w)
);
hbm3_bank_fsm u_bank (
.i_clk (i_clk),
.i_rst_n (i_rst_n),
.i_cmd_act (cmd_act_r),
.i_cmd_pre (cmd_pre_r),
.i_cmd_rd (cmd_rd_r),
.i_cmd_wr (cmd_wr_r),
.i_cmd_bg (cmd_bg_r),
.i_cmd_ba (cmd_ba_r),
.i_cmd_row (cmd_row_r),
.o_banks_open (banks_open_w),
.o_row_open (row_open_w),
.o_any_hit (any_hit_w)
);
hbm3_refresh_ctrl u_ref (
.i_clk (i_clk),
.i_rst_n (i_rst_n),
.i_ref_done (cmd_ref_r), // note: pulses when REF is issued, not when tRFC expires
.o_ref_req (ref_req_w)
);
hbm3_page_policy u_policy (
.i_clk (i_clk),
.i_rst_n (i_rst_n),
.i_req_addr (i_req_addr),
.i_banks_open(banks_open_w),
.i_row_open (row_open_w),
.o_hit (policy_hit_w),
.o_miss (policy_miss_w),
.o_close (policy_close_w)
);
hbm3_scheduler #(
.RQ_DEPTH(16), .WQ_DEPTH(16),
.WQ_HWM(12), .WQ_LWM(4)
) u_sched (
.i_clk (i_clk),
.i_rst_n (i_rst_n),
.i_req_valid (i_req_valid),
.i_req_addr (i_req_addr),
.i_req_wr (i_req_wr),
.i_req_data (i_req_data),
.i_req_mask (i_req_mask),
.o_req_ready (sched_req_ready),
.i_banks_open (banks_open_w),
.i_any_hit (any_hit_w),
.i_act_allowed (act_allowed_w),
.i_cas_allowed (cas_allowed_w),
.i_ref_req (ref_req_w),
.o_cmd_act (sched_cmd_act),
.o_cmd_rd (sched_cmd_rd),
.o_cmd_wr (sched_cmd_wr),
.o_cmd_pre (sched_cmd_pre),
.o_cmd_bg (sched_cmd_bg),
.o_cmd_ba (sched_cmd_ba),
.o_cmd_row (sched_cmd_row),
.o_cmd_col (sched_cmd_col),
.o_cmd_data (sched_cmd_data),
.o_rd_queue_depth (),
.o_wr_queue_depth (),
.o_wr_drain ()
);
assign o_req_ready = sched_req_ready;
// -----------------------------------------------------------------------
// Command Sequencing FSM
// -----------------------------------------------------------------------
localparam [3:0]
S_IDLE = 4'd0,
S_DECODE = 4'd1,
S_ACTIVATE = 4'd2,
S_READ = 4'd3,
S_WRITE = 4'd4,
S_PRECHARGE = 4'd5,
S_REFRESH = 4'd6,
S_RETURN = 4'd7;
reg [3:0] state;
reg [9:0] wait_cnt;
reg cur_is_wr;
reg [4:0] cur_bank_id; // {bg[2:0], ba[1:0]}
reg [14:0] cur_row;
reg [4:0] cur_col;
reg [127:0] cur_wdata;
reg policy_hit_r, policy_close_r;
reg [31:0] banks_in_flight;
assign o_state_dbg = state;
assign o_busy = (state != S_IDLE) || ref_req_w;
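// Issue gates: Module 1 timing guard plus the local bank-conflict check.
// CAS is gated on timing only: banks_in_flight[cur_bank_id] is set by this
// transaction's own ACT, so testing it in S_READ/S_WRITE would deadlock.
wire act_gate = act_allowed_w && !banks_in_flight[cur_bank_id];
wire cas_gate = cas_allowed_w;
wire pre_gate = pre_allowed_w;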
// Sequencing FSM: single clocked process with asynchronous active-low reset
always @(posedge i_clk or negedge i_rst_n) begin
if (!i_rst_n) begin
state <= S_IDLE;
wait_cnt <= 10'd0;
banks_in_flight <= 32'd0;
cmd_act_r <= 0; cmd_rd_r <= 0; cmd_wr_r <= 0;
cmd_pre_r <= 0; cmd_ref_r <= 0;
cmd_bg_r <= 0; cmd_ba_r <= 0;
cmd_row_r <= 0; cmd_col_r <= 0; cmd_data_r <= 0;
o_rd_data <= 128'd0; o_rd_valid <= 0; o_wr_done <= 0;
cur_is_wr <= 0; cur_bank_id <= 0; cur_row <= 0;
cur_col <= 0; cur_wdata <= 0;
policy_hit_r <= 0; policy_close_r <= 0;
end else begin
// Default pulse clears
cmd_act_r <= 0; cmd_rd_r <= 0; cmd_wr_r <= 0;
cmd_pre_r <= 0; cmd_ref_r <= 0;
o_rd_valid <= 0; o_wr_done <= 0;
case (state)
// --------------------------------------------------------
S_IDLE: begin
if (ref_req_w) begin
// Refresh has highest priority
if (|banks_open_w) begin
// Must precharge all before refresh
cmd_pre_r <= 1'b1;
wait_cnt <= TRP;
state <= S_PRECHARGE;
end else begin
wait_cnt <= TRFC;
cmd_ref_r <= 1'b1;
state <= S_REFRESH;
end
end else if (sched_cmd_act || sched_cmd_rd ||
sched_cmd_wr || sched_cmd_pre) begin
// Scheduler has a command ready — latch and decode
cur_bank_id <= {sched_cmd_bg, sched_cmd_ba};
cur_row <= sched_cmd_row;
cur_col <= sched_cmd_col;
cur_wdata <= sched_cmd_data;
cur_is_wr <= sched_cmd_wr && !sched_cmd_rd;
cmd_bg_r <= sched_cmd_bg;
cmd_ba_r <= sched_cmd_ba;
policy_hit_r <= policy_hit_w;
policy_close_r <= policy_close_w;
state <= S_DECODE;
end
end
// --------------------------------------------------------
S_DECODE: begin
// One pipeline stage for page policy to settle
if (policy_hit_r) begin
// Row already open — skip ACT
state <= cur_is_wr ? S_WRITE : S_READ;
end else if (policy_close_r) begin
// Wrong row open — must PRE first
if (pre_gate) begin
cmd_pre_r <= 1'b1;
cmd_bg_r <= cur_bank_id[4:2];
cmd_ba_r <= cur_bank_id[1:0];
wait_cnt <= TRP;
state <= S_PRECHARGE;
end
end else begin
// Bank idle — go straight to ACT
state <= S_ACTIVATE;
end
end
// --------------------------------------------------------
S_ACTIVATE: begin
if (act_gate) begin
cmd_act_r <= 1'b1;
cmd_bg_r <= cur_bank_id[4:2];
cmd_ba_r <= cur_bank_id[1:0];
cmd_row_r <= cur_row;
wait_cnt <= TRCD; // counted down in S_READ / S_WRITE
banks_in_flight[cur_bank_id] <= 1'b1;
state <= cur_is_wr ? S_WRITE : S_READ;
end
// else: stall on timing interlock
end
// --------------------------------------------------------
S_READ: begin
if (wait_cnt > 0) begin
wait_cnt <= wait_cnt - 1;
end else if (cas_gate) begin
cmd_rd_r <= 1'b1;
cmd_col_r <= cur_col;
wait_cnt <= TCL;
state <= S_RETURN;
end
end
// --------------------------------------------------------
S_WRITE: begin
if (wait_cnt > 0) begin
wait_cnt <= wait_cnt - 1;
end else if (cas_gate) begin
cmd_wr_r <= 1'b1;
cmd_col_r <= cur_col;
cmd_data_r <= cur_wdata;
o_wr_done <= 1'b1; // one-cycle pulse; asserted at WR issue in this model
banks_in_flight[cur_bank_id] <= 1'b0;
wait_cnt <= TCWL;
state <= S_IDLE;
end
end
// --------------------------------------------------------
S_PRECHARGE: begin
if (wait_cnt > 0) begin
wait_cnt <= wait_cnt - 1;
end else begin
banks_in_flight[cur_bank_id] <= 1'b0;
if (ref_req_w) begin
// After PRE, go to refresh
cmd_ref_r <= 1'b1;
wait_cnt <= TRFC;
state <= S_REFRESH;
end else begin
// After PRE, activate for miss
state <= S_ACTIVATE;
end
end
end
// --------------------------------------------------------
S_REFRESH: begin
if (wait_cnt > 0)
wait_cnt <= wait_cnt - 1;
else begin
cmd_ref_r <= 1'b0;
state <= S_IDLE;
end
end
// --------------------------------------------------------
S_RETURN: begin
if (wait_cnt > 0) begin
wait_cnt <= wait_cnt - 1;
end else begin
// Read data arrives from PHY (simulated)
o_rd_data <= cur_wdata; // in RTL: from PHY RX FIFO
o_rd_valid <= 1'b1;
banks_in_flight[cur_bank_id] <= 1'b0;
state <= S_IDLE;
end
end
default: state <= S_IDLE;
endcase
end
end
endmodule
// tb_hbm3_pc_ctrl.sv — Testbench for hbm3_pc_ctrl (Module 10)
// Tests: page-hit read, page-miss read, write, refresh preemption
// SVA assertions check state sequencing and interlock correctness
// EcrioniX HBM3 Controller Build
`timescale 1ns/1ps
`default_nettype none
module tb_hbm3_pc_ctrl;
// DUT ports
logic clk, rst_n;
logic [3:0] pc_id;
logic req_valid, req_wr;
logic [33:0] req_addr;
logic [127:0] req_data;
logic [15:0] req_mask;
logic req_ready;
logic [127:0] rd_data;
logic rd_valid;
logic wr_done;
logic busy;
logic [3:0] state_dbg;
// DUT instantiation
hbm3_pc_ctrl #(
.TRCD(14), .TRP(14), .TCL(14), .TCWL(8), .TRFC(700)
) dut (
.i_clk (clk),
.i_rst_n (rst_n),
.i_pc_id (pc_id),
.i_req_valid (req_valid),
.i_req_wr (req_wr),
.i_req_addr (req_addr),
.i_req_data (req_data),
.i_req_mask (req_mask),
.o_req_ready (req_ready),
.o_rd_data (rd_data),
.o_rd_valid (rd_valid),
.o_wr_done (wr_done),
.o_busy (busy),
.o_state_dbg (state_dbg)
);
// Clock: 2 GHz (0.5 ns period)
initial clk = 0;
always #0.25 clk = ~clk;
// Cycle counter for logging
int cycle_cnt;
always @(posedge clk) cycle_cnt++;
// ---------------------------------------------------------------
// SVA: rd_data must contain no X/Z bits in the cycle rd_valid is asserted
property rd_valid_data;
@(posedge clk) disable iff (!rst_n)
rd_valid |-> !$isunknown(rd_data);
endproperty
assert property (rd_valid_data)
else $error("[SVA] rd_valid asserted but rd_data is X");
// SVA: ACT state must transition to READ or WRITE, not back to IDLE
// (Encoded: state 2 = ACTIVATE, must reach 3=READ or 4=WRITE)
property act_leads_to_cas;
@(posedge clk) disable iff (!rst_n)
(state_dbg === 4'd2) |-> ##[1:30] (state_dbg === 4'd3 || state_dbg === 4'd4);
endproperty
assert property (act_leads_to_cas)
else $error("[SVA] ACTIVATE state did not reach READ or WRITE within 30 cycles");
// SVA: REFRESH must last at least TRFC cycles (700), then return to IDLE
property refresh_duration;
@(posedge clk) disable iff (!rst_n)
$rose(state_dbg === 4'd6) |-> (state_dbg === 4'd6)[*700] ##[1:100] (state_dbg === 4'd0);
endproperty
assert property (refresh_duration)
else $warning("[SVA] REFRESH exit timing unexpected");
// SVA: o_busy must be high whenever not in IDLE
property busy_when_active;
@(posedge clk) disable iff (!rst_n)
(state_dbg !== 4'd0) |-> busy;
endproperty
assert property (busy_when_active)
else $error("[SVA] busy not asserted in non-IDLE state");
// ---------------------------------------------------------------
// Task: issue single read
task do_read(input [33:0] addr);
@(posedge clk);
req_valid <= 1; req_wr <= 0;
req_addr <= addr; req_data <= 0; req_mask <= 16'hFFFF;
@(posedge clk) req_valid <= 0;
// Wait for rd_valid
@(posedge clk iff rd_valid);
$display("[cyc %0d] READ complete: addr=%h data=%h", cycle_cnt, addr, rd_data);
endtask
// Task: issue single write
task do_write(input [33:0] addr, input [127:0] data);
@(posedge clk);
req_valid <= 1; req_wr <= 1;
req_addr <= addr; req_data <= data; req_mask <= 16'hFFFF;
@(posedge clk) req_valid <= 0;
repeat(30) @(posedge clk);
$display("[cyc %0d] WRITE submitted: addr=%h", cycle_cnt, addr);
endtask
initial begin
$dumpfile("tb_hbm3_pc_ctrl.vcd");
$dumpvars(0, tb_hbm3_pc_ctrl);
$display("=== hbm3_pc_ctrl testbench start ===");
cycle_cnt = 0;
rst_n = 0; pc_id = 4'd0;
req_valid = 0; req_wr = 0; req_addr = 0;
req_data = 0; req_mask = 0;
repeat(8) @(posedge clk);
rst_n = 1;
repeat(4) @(posedge clk);
// TEST 1: Page-empty read (ACT + RD path)
$display("[T1] Page-empty read — expect ACTIVATE then READ");
do_read(34'h0000_0100);
// TEST 2: Write to a new address (test write path)
$display("[T2] Write to new address");
do_write(34'h0002_0200, 128'hCAFE_BABE_DEAD_BEEF_0000_1111_2222_3333);
// TEST 3: Back-to-back reads (issued sequentially: the FSM handles one
// request at a time, and do_read is a static task that cannot be forked safely)
$display("[T3] Back-to-back reads");
do_read(34'h0004_0400);
do_read(34'h0006_0600);
// TEST 4: Wait for automatic refresh
$display("[T4] Waiting for refresh cycle (Module 3 will assert ref_req)");
// tREFI = ~7800 cycles; wait long enough for at least one refresh
repeat(8200) @(posedge clk);
$display("[T4] Refresh window passed");
// TEST 5: Verify state returns to IDLE after refresh
@(posedge clk iff state_dbg === 4'd0);
$display("[T5] PASS: FSM returned to IDLE after refresh");
repeat(20) @(posedge clk);
$display("=== All testbench scenarios complete ===");
$finish;
end
endmodule
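The do_read task in the testbench waits for rd_valid without a limit, so a stalled FSM hangs the simulation instead of failing it. A bounded wait is one way around that. The following is a minimal standalone sketch: the module tb_timeout_sketch, the task wait_rd_valid, and the 100-cycle limit are hypothetical, and a forked process stands in for the DUT.

```verilog
`timescale 1ns/1ps
// Standalone sketch of a bounded wait. All names are illustrative.
module tb_timeout_sketch;
  logic clk = 0;
  logic rd_valid = 0;
  always #0.25 clk = ~clk;   // 2 GHz, as in the main testbench

  // Wait for rd_valid for at most max_cycles clock edges.
  // ok is 1 if rd_valid was seen, 0 on timeout.
  task automatic wait_rd_valid(input int max_cycles, output bit ok);
    ok = 0;
    for (int i = 0; i < max_cycles; i++) begin
      @(posedge clk);
      if (rd_valid) begin
        ok = 1;
        return;
      end
    end
  endtask

  initial begin
    bit ok;
    // Stand-in for the DUT: pulse rd_valid for one cycle after 30 cycles
    fork
      begin
        repeat (30) @(posedge clk);
        rd_valid <= 1'b1;
        @(posedge clk);
        rd_valid <= 1'b0;
      end
    join_none

    wait_rd_valid(100, ok);
    if (ok) $display("rd_valid seen before the timeout");
    else    $error("timeout waiting for rd_valid");
    $finish;
  end
endmodule
```

In the real testbench the limit would need to exceed the worst-case path, which includes a pending refresh (tRP + tRFC) ahead of the read.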
| Current State | Condition | Next State | Output |
|---|---|---|---|
| IDLE | ref_req & banks_open | PRECHARGE | cmd_pre |
| IDLE | ref_req & !banks_open | REFRESH | cmd_ref |
| IDLE | sched has command | DECODE | latch cmd |
| IDLE | none | IDLE | — |
| DECODE | policy_hit | READ or WRITE | — |
| DECODE | policy_close | PRECHARGE | cmd_pre |
| DECODE | policy_miss (empty) | ACTIVATE | — |
| ACTIVATE | act_gate & is_rd | READ | cmd_act |
| ACTIVATE | act_gate & is_wr | WRITE | cmd_act |
| ACTIVATE | !act_gate | ACTIVATE | stall |
| READ | wait_cnt > 0 | READ | wait tRCD |
| READ | wait_cnt == 0 & cas_gate | RETURN | cmd_rd |
| WRITE | wait_cnt > 0 | WRITE | wait tRCD |
| WRITE | wait_cnt == 0 & cas_gate | IDLE | cmd_wr, wr_done |
| PRECHARGE | wait_cnt > 0 | PRECHARGE | wait tRP |
| PRECHARGE | wait_cnt == 0 & ref_req | REFRESH | cmd_ref |
| PRECHARGE | wait_cnt == 0 & !ref_req | ACTIVATE | — |
| REFRESH | wait_cnt > 0 | REFRESH | wait tRFC |
| REFRESH | wait_cnt == 0 | IDLE | ref_done |
| RETURN | wait_cnt > 0 | RETURN | wait CL |
| RETURN | wait_cnt == 0 | IDLE | rd_valid, rd_data |
| Port | Dir | Width | Description |
|---|---|---|---|
| i_clk | in | 1 | System clock (2 GHz) |
| i_rst_n | in | 1 | Active-low asynchronous reset |
| i_pc_id | in | 4 | Pseudo-channel index 0–15 |
| i_req_valid | in | 1 | New host request |
| i_req_wr | in | 1 | 1 = write, 0 = read |
| i_req_addr | in | 34 | Full HBM3 address |
| i_req_data | in | 128 | Write data |
| i_req_mask | in | 16 | Byte write enables |
| o_req_ready | out | 1 | Scheduler has queue space |
| o_rd_data | out | 128 | Read data return to host |
| o_rd_valid | out | 1 | Read data valid pulse |
| o_wr_done | out | 1 | Write committed to DRAM |
| o_busy | out | 1 | PC is busy (refresh or active cmd) |
| o_state_dbg | out | 4 | FSM state encoding for debug |
An HBM3 stack contains 16 pseudo-channels, each operating as an independent 64-bit-wide memory interface with its own command bus, data bus, and 32 banks. PCs share the physical I/O pads and power rails but run independent command sequences. Each PC has its own timing FSM, bank FSM, refresh controller, page policy, and scheduler — the hbm3_pc_ctrl module integrates all of these for one PC.
A page hit occurs when the requested row is already activated (open) in the target bank. The controller issues READ/WRITE directly, saving the tRCD latency (14 cycles, about 7 ns at 2 GHz). A page miss to an idle bank pays ACT + CAS = tRCD + CL ≈ 28 cycles. A page miss to a wrong-row bank pays PRE + ACT + CAS = tRP + tRCD + CL ≈ 42 cycles. The page policy module selects which path.
The timing FSM (Module 1) outputs act_allowed and cas_allowed, which go HIGH only when all relevant timing parameters have been satisfied. The PC controller's sequencing FSM stalls (outputs nothing) in the ACTIVATE state until act_allowed is HIGH, and in the READ/WRITE state until cas_allowed is HIGH; the local banks_in_flight check additionally blocks commands to a busy bank. This simple interlock prevents timing violations without complex handshake protocols.
When Module 3 asserts ref_req, the PC controller FSM finishes its current command, issues PRECHARGE-ALL if any banks are open (waits tRP), then issues REF and waits tRFC (700 cycles = 350 ns at 2 GHz). No host requests are served during tRFC. After completion the FSM returns to IDLE and Module 3 de-asserts ref_req.
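To put the refresh cost in perspective, the fraction of time a pseudo-channel spends unavailable can be estimated from tRFC and tREFI. The sketch below assumes tREFI is about 7800 cycles, the approximate figure quoted in the testbench comment; the module and localparam names are illustrative.

```verilog
// Rough estimate of refresh overhead. TREFI is an assumed, approximate value.
module refresh_overhead_sketch;
  localparam integer TRFC  = 700;    // cycles, hbm3_pc_ctrl default
  localparam integer TRP   = 14;     // cycles, hbm3_pc_ctrl default
  localparam integer TREFI = 7800;   // cycles, approximate (assumption)

  initial begin
    // Best case: all banks already closed when ref_req arrives
    $display("refresh only       : %0.2f %%", 100.0 * TRFC / TREFI);
    // Worst case in this FSM: precharge first, then refresh
    $display("precharge + refresh: %0.2f %%", 100.0 * (TRP + TRFC) / TREFI);
    $finish;
  end
endmodule
```

Under these assumptions roughly 9% of each refresh interval is spent with the pseudo-channel blocked, which is why the scheduler sees o_busy during refresh.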
In a 16-PC HBM3 controller running at 2 GHz, timing bugs are nearly impossible to reproduce on a logic analyser. The 4-bit o_state_dbg encodes the FSM state, allowing an ILA, scan chain, or JTAG debug fabric to capture which PC was in which state at the moment of a failure. This is critical for post-silicon bring-up, yield analysis, and power regression testing.
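For simulation logs or a debug script, the 4-bit encoding is easier to read once it is mapped back to state names. A minimal sketch follows, using the encodings from the state table; the function name state_name and the wrapper module are hypothetical.

```verilog
// Illustrative decoder for o_state_dbg. Encodings follow the state table.
module state_name_sketch;
  function automatic string state_name(input logic [3:0] s);
    case (s)
      4'd0:    return "IDLE";
      4'd1:    return "DECODE";
      4'd2:    return "ACTIVATE";
      4'd3:    return "READ";
      4'd4:    return "WRITE";
      4'd5:    return "PRECHARGE";
      4'd6:    return "REFRESH";
      4'd7:    return "RETURN";
      default: return "UNKNOWN";   // encodings 8-15 are unused
    endcase
  endfunction

  initial begin
    // Print the mapping for every defined code plus one unused code
    for (int i = 0; i < 9; i++)
      $display("%0d -> %s", i, state_name(i[3:0]));
    $finish;
  end
endmodule
```

In the testbench, the same function could be called on state_dbg inside a $display to log state transitions by name instead of by number.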