Phase 3 · Module 10 of 15

HBM3 Pseudo-Channel Controller

The integration hub for one HBM3 pseudo-channel: it instantiates the timing FSM, bank FSM, refresh controller, page policy, and scheduler, and wires them together with an 8-state command sequencing FSM.

hbm3_pc_ctrl.v · tb_hbm3_pc_ctrl.sv · Synthesizable RTL · JEDEC JESD238 · Module Integration

The PC Controller Role — Integration Layer

An HBM3 stack has 16 pseudo-channels, each a logically independent 64-bit memory interface. Every PC has its own bank array, command bus, data bus, and refresh timer. The hbm3_pc_ctrl module is the central coordinator for one pseudo-channel — it doesn't implement timing or banking logic itself; it orchestrates the specialised sub-modules built in Modules 1–9.

Think of it as a CPU's memory controller dispatch unit. Just as an out-of-order CPU has separate issue queues, execution units, and reorder buffers that the dispatcher coordinates, the PC controller has separate timing logic (Module 1), bank state (Module 2), refresh scheduling (Module 3), page policy (Module 4), and request arbitration (Module 9). The PC controller connects these components and runs the final command-issue FSM.

What the PC Controller Does

A full HBM3 controller instantiates 16 copies of hbm3_pc_ctrl, one per pseudo-channel, plus a top-level arbiter that multiplexes host traffic across them. This article covers the single-PC module in isolation.

Sub-Module Connections — How Modules 1–4 and 9 Wire Into the PC Controller

Each sub-module has a clearly defined interface role inside the PC controller:

| Sub-Module | Role | Key Signals In | Key Signals Out |
|---|---|---|---|
| Module 1 — Timing FSM | Per-bank timing enforcement | cmd_act, cmd_rd, cmd_wr, cmd_pre | act_allowed, cas_allowed, pre_allowed |
| Module 2 — Bank FSM | 32-bank open/closed state | cmd_act, cmd_pre, bg, ba | banks_open[31:0], row_open[14:0] |
| Module 3 — Refresh Ctrl | tREFI countdown + REF issue | ref_done (from FSM) | ref_req |
| Module 4 — Page Policy | Hit/miss/close decision | req_addr, banks_open, row_open | policy_hit, policy_miss, policy_close |
| Module 9 — Scheduler | FR-FCFS request ordering | req_valid/wr/addr/data, banks_open, any_hit, ref_req | cmd_act/rd/wr/pre, cmd_bg/ba/row/col, cmd_data |

Signal Flow

The wiring follows a producer-consumer pattern:

  1. Host request arrives at PC controller input.
  2. PC controller forwards to Scheduler (Mod 9). Scheduler also consumes Module 2's banks_open for hit detection.
  3. Scheduler outputs a selected command. PC controller FSM checks Module 1's act_allowed / cas_allowed gate before actually issuing.
  4. Issued command is fed back to Module 2 (bank state update) and Module 1 (timing counters start).
  5. Module 4 (page policy) observes the same banks_open + row_open to decide whether the next request is a hit/miss.
  6. Module 3 runs independently on a counter, asserts ref_req, and the PC controller FSM handles the refresh sequence.
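The hit/miss/close decision in step 5 can be sketched as a purely combinational comparison. This is a minimal sketch, not Module 4's actual RTL: the module name `page_classify_sketch` and all of its ports are hypothetical, and it assumes the open row of the addressed bank has already been selected, as the simplified `row_open[14:0]` signal suggests.

```verilog
// Hypothetical sketch of a hit/miss/close classifier (not the Module 4 RTL).
// Assumes i_row_open already holds the open row of the addressed bank.
module page_classify_sketch (
    input  wire [31:0] i_banks_open,  // one bit per bank, 1 = a row is open
    input  wire [4:0]  i_bank_id,     // {bg[2:0], ba[1:0]} of the request
    input  wire [14:0] i_req_row,     // row the request targets
    input  wire [14:0] i_row_open,    // row currently open in that bank
    output wire        o_hit,         // bank open, same row: CAS directly
    output wire        o_miss,        // bank idle: ACT, then CAS
    output wire        o_close        // bank open, other row: PRE, ACT, CAS
);
    wire bank_open = i_banks_open[i_bank_id];
    wire row_match = (i_row_open == i_req_row);

    assign o_hit   =  bank_open &&  row_match;
    assign o_miss  = !bank_open;
    assign o_close =  bank_open && !row_match;
endmodule
```

Exactly one of the three outputs is high for any input, which is what lets the sequencing FSM treat them as a three-way branch.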

Command Sequencing FSM — All 8 States

The heart of the PC controller is an 8-state FSM that sequences every DRAM command in the correct order and with correct interlocks:

| State | Encoding | Description |
|---|---|---|
| IDLE | 4'd0 | No active transaction. Poll scheduler for next command. |
| DECODE | 4'd1 | Latch selected request. Decode address. Read page policy output. |
| ACTIVATE | 4'd2 | Issue ACT command. Wait tRCD cycles for row to open. |
| READ | 4'd3 | Issue RD command. Wait CL cycles for data return. |
| WRITE | 4'd4 | Issue WR command. Wait CWL cycles. Write data committed. |
| PRECHARGE | 4'd5 | Issue PRE command. Wait tRP cycles. Bank returns to idle. |
| REFRESH | 4'd6 | Issue REF. Wait tRFC (350 ns). All banks refreshed. |
| RETURN | 4'd7 | Assert o_rd_valid, drive o_rd_data to host. |

State Transitions

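The FSM path depends on the page policy decision latched as the FSM enters DECODE. On policy_hit it goes straight from DECODE to READ or WRITE. On policy_miss (bank idle) it goes DECODE → ACTIVATE → READ or WRITE. On policy_close (wrong row open) it goes DECODE → PRECHARGE → ACTIVATE → READ or WRITE. Reads then finish in RETURN; writes go back to IDLE once WR has been issued.

The DECODE state exists to keep combinational paths short. Address decoding and page-policy outputs are registered here, so the subsequent ACTIVATE/READ/WRITE states see only stable, registered inputs. This matters at 2 GHz, where the clock period is 500 ps and the time left for logic between registers, after clock-to-Q delay and setup margin, is smaller still (under 300 ps here).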
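To make the registering step concrete, here is a minimal sketch of a DECODE-stage register slice. It is illustrative only: the module name `decode_reg_sketch`, the `i_latch` strobe, and in particular the bit positions of the column, bank, bank-group, and row fields are assumptions, since the article does not define the address mapping.

```verilog
// Hypothetical DECODE-stage register slice. The address bit layout is an
// assumption for illustration; the article does not define the mapping.
module decode_reg_sketch (
    input  wire        i_clk,
    input  wire        i_rst_n,
    input  wire        i_latch,      // high for one cycle when a request is accepted
    input  wire [33:0] i_req_addr,
    output reg  [4:0]  o_col,
    output reg  [1:0]  o_ba,
    output reg  [2:0]  o_bg,
    output reg  [14:0] o_row
);
    always @(posedge i_clk or negedge i_rst_n) begin
        if (!i_rst_n) begin
            o_col <= 5'd0;
            o_ba  <= 2'd0;
            o_bg  <= 3'd0;
            o_row <= 15'd0;
        end else if (i_latch) begin
            o_col <= i_req_addr[4:0];    // assumed column bits
            o_ba  <= i_req_addr[6:5];    // assumed bank bits
            o_bg  <= i_req_addr[9:7];    // assumed bank-group bits
            o_row <= i_req_addr[24:10];  // assumed row bits
        end
    end
endmodule
```

Downstream states read only the registered fields, so a late-arriving `i_req_addr` cannot lengthen the path into the command outputs.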

Request Flow Walkthroughs

Case A — Page Hit Read (Best Case, ~14 cycles total)

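The target bank is already activated to the correct row, so the controller skips ACT entirely. The ~14-cycle figure in the heading counts CL only; the flow below also shows the IDLE and DECODE cycles in front of it: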

Flow
T=0:  IDLE    — scheduler presents valid read, same row as open bank
T=1:  DECODE  — policy_hit=1, latch {bg,ba,col}, check cas_allowed
T=2:  READ    — issue o_cmd_rd, start CL counter (14 cycles)
T=3–15: READ  — waiting CL
T=16: RETURN  — assert o_rd_valid, drive o_rd_data[127:0]
T=17: IDLE    — ready for next request

Case B — Page Miss Read (Empty Bank, ~28 cycles)

Flow
T=0:  IDLE    — new read, bank is precharged (idle)
T=1:  DECODE  — policy_miss=1 (bank empty), check act_allowed
T=2:  ACTIVATE — issue o_cmd_act, start tRCD counter (14 cycles)
T=3–15: ACTIVATE — waiting tRCD
T=16: READ    — issue o_cmd_rd, start CL (14 cycles)
T=17–29: READ — waiting CL
T=30: RETURN  — assert o_rd_valid, drive o_rd_data
T=31: IDLE

Case C — Page Miss Read (Wrong Row Open, ~42 cycles)

Flow
T=0:  IDLE    — new read, bank open to WRONG row
T=1:  DECODE  — policy_close=1, must precharge first
T=2:  PRECHARGE — issue o_cmd_pre, wait tRP (14 cycles)
T=3–15: PRECHARGE — waiting tRP
T=16: ACTIVATE — issue o_cmd_act, wait tRCD (14 cycles)
T=17–29: ACTIVATE — waiting tRCD
T=30: READ    — issue o_cmd_rd, wait CL (14 cycles)
T=31–43: READ — waiting CL
T=44: RETURN  — o_rd_valid + o_rd_data
T=45: IDLE

Case C is the worst-case latency path. On a heavily thrashed working set (many rows, small cache) this path dominates. The page policy module (Module 4) uses adaptive open/close selection to minimise Case C occurrences by predicting future access patterns.
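The headline cycle counts for Cases A to C follow directly from the timing parameters. The sketch below recomputes them from the article's default values; the module name `latency_sketch` and the `CLK_PS` constant are hypothetical, and the totals deliberately exclude the IDLE and DECODE cycles that appear at the start of each flow.

```verilog
// Back-of-envelope read latency for Cases A, B, and C using the article's
// default parameters. Excludes the IDLE and DECODE cycles shown in the flows.
module latency_sketch;
    localparam integer TRCD   = 14;   // cycles, row-to-column delay
    localparam integer TRP    = 14;   // cycles, precharge time
    localparam integer TCL    = 14;   // cycles, CAS read latency
    localparam integer CLK_PS = 500;  // 2 GHz clock period in picoseconds

    localparam integer LAT_HIT      = TCL;               // Case A: RD only
    localparam integer LAT_EMPTY    = TRCD + TCL;        // Case B: ACT + RD
    localparam integer LAT_CONFLICT = TRP + TRCD + TCL;  // Case C: PRE + ACT + RD

    initial begin
        $display("Case A: %0d cycles (%0d ps)", LAT_HIT,      LAT_HIT      * CLK_PS);
        $display("Case B: %0d cycles (%0d ps)", LAT_EMPTY,    LAT_EMPTY    * CLK_PS);
        $display("Case C: %0d cycles (%0d ps)", LAT_CONFLICT, LAT_CONFLICT * CLK_PS);
    end
endmodule
```

With the defaults this gives 14, 28, and 42 cycles, or 7 ns, 14 ns, and 21 ns at 2 GHz.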

Case D — Refresh Sequence

Flow
T=0:  any state — ref_req asserted by Module 3
T=1:  FSM completes current command (if mid-sequence, finish it)
T=N:  if banks_open != 0: PRECHARGE all, wait tRP
T=N+14: REFRESH — issue o_cmd_ref (not in port list; handled internally)
T=N+15 to N+714: REFRESH — waiting tRFC = 700 cycles (350 ns @ 2 GHz)
T=N+715: IDLE — ref_done asserted, Module 3 clears ref_req
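For orientation, the request side of this handshake can be sketched as a free-running interval counter. This is not the Module 3 RTL, which the article treats as a black box: the module name `refresh_req_sketch` and its internals are hypothetical, and `TREFI = 7800` is only the approximate cycle count quoted in the testbench.

```verilog
// Hypothetical sketch of a tREFI countdown (not the Module 3 RTL).
// TREFI = 7800 cycles is the approximate figure used by the testbench.
module refresh_req_sketch #(
    parameter TREFI = 7800
)(
    input  wire i_clk,
    input  wire i_rst_n,
    input  wire i_ref_done,   // pulse from the sequencing FSM
    output reg  o_ref_req     // held high until i_ref_done arrives
);
    reg [15:0] cnt;

    always @(posedge i_clk or negedge i_rst_n) begin
        if (!i_rst_n) begin
            cnt       <= TREFI - 1;
            o_ref_req <= 1'b0;
        end else if (cnt == 16'd0) begin
            // Interval elapsed: raise the request and restart the countdown.
            cnt       <= TREFI - 1;
            o_ref_req <= 1'b1;
        end else begin
            // The counter free-runs so refreshes stay evenly spaced even
            // when the FSM is slow to service a request.
            cnt <= cnt - 16'd1;
            if (i_ref_done)
                o_ref_req <= 1'b0;
        end
    end
endmodule
```

Keeping `o_ref_req` asserted until `i_ref_done` arrives is what allows the sequencing FSM to finish its current command before starting the refresh sequence.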

Interlock Logic — Three Guards Before Every Command

Every command leaving the PC controller is gated by three guard signals. A command needs its own timing guard and the bank-conflict guard HIGH in the same cycle: ACT requires act_allowed and !bank_conflict, while RD/WR requires cas_allowed and !bank_conflict.

| Guard | Source | Blocks | Reason |
|---|---|---|---|
| act_allowed | Module 1 (Timing FSM) | ACT | tRP/tRRD/tFAW not yet satisfied |
| cas_allowed | Module 1 (Timing FSM) | RD/WR | tRCD/tCCD/tWTR not yet satisfied |
| !bank_conflict | Local (banks_in_flight reg) | Any cmd to busy bank | Previous command to same bank still in-flight |

The FSM's ACTIVATE and READ/WRITE states loop (hold state, no command output) until the guards for the pending command pass in the same cycle. This simple stall mechanism ensures correctness without complex handshake protocols between sub-modules.

Verilog
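// Issue gates used in the ACTIVATE and READ/WRITE states
wire act_gate = act_allowed_w && !banks_in_flight[cur_bank_id];
// On the miss/close path (policy_hit_r = 0) this transaction's own ACT set
// banks_in_flight, so only the hit path checks it. Without this exception
// the FSM would block its own RD/WR and never leave S_READ/S_WRITE.
wire cas_gate = cas_allowed_w &&
                (!banks_in_flight[cur_bank_id] || !policy_hit_r);

// In ACTIVATE state:
S_ACTIVATE: begin
    if (act_gate) begin
        cmd_act_r <= 1'b1;
        wait_cnt  <= TRCD;
        state     <= cur_is_wr ? S_WRITE : S_READ;
        banks_in_flight[cur_bank_id] <= 1'b1;
    end
    // else: stay in S_ACTIVATE, output nothing
end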

PC Controller Block Diagram

[Block diagram: hbm3_pc_ctrl sub-module integration for one pseudo-channel. The host interface (req_valid/wr, req_addr[33:0], req_data[127:0], rd_data/rd_valid) feeds the Module 9 scheduler (FR-FCFS, RQ/WQ). Module 1 (act/cas/pre_allowed), Module 2 (banks_open[31:0], row_open[14:0]), Module 3 (ref_req), and Module 4 (hit/miss/close) feed the command sequencing FSM and its interlock gate, which drives ACT/RD/WR/PRE/REF on the HBM3 CA bus through the DRAM PHY, with DQ data on the per-pseudo-channel 64-bit bus. Debug/status outputs: o_state_dbg[3:0], o_busy.]

Full Verilog Source — hbm3_pc_ctrl.v

The PC controller instantiates all sub-modules as black boxes and runs the command sequencing FSM. Sub-module ports use the same i_/o_ convention.

Verilog
// hbm3_pc_ctrl.v — HBM3 Pseudo-Channel Controller (Module 10)
// Integrates: timing FSM (M1), bank FSM (M2), refresh ctrl (M3),
//             page policy (M4), scheduler (M9)
// EcrioniX HBM3 Controller Build · Phase 3

`timescale 1ns/1ps
`default_nettype none

module hbm3_pc_ctrl #(
    parameter TRCD = 14,   // cycles — row-to-column delay
    parameter TRP  = 14,   // cycles — precharge time
    parameter TCL  = 14,   // cycles — CAS read latency
    parameter TCWL = 8,    // cycles — CAS write latency
    parameter TRFC = 700   // cycles — refresh recovery (350 ns @ 2 GHz)
)(
    input  wire        i_clk,
    input  wire        i_rst_n,

    // Which pseudo-channel this instance serves (0–15)
    input  wire [3:0]  i_pc_id,

    // Host request interface
    input  wire        i_req_valid,
    input  wire        i_req_wr,
    input  wire [33:0] i_req_addr,
    input  wire [127:0] i_req_data,
    input  wire [15:0] i_req_mask,
    output wire        o_req_ready,

    // Host read data return
    output reg  [127:0] o_rd_data,
    output reg          o_rd_valid,
    output reg          o_wr_done,

    // Status
    output wire        o_busy,
    output wire [3:0]  o_state_dbg
);

// -----------------------------------------------------------------------
// Internal wires — sub-module interconnect
// -----------------------------------------------------------------------

// Timing FSM (Module 1) outputs
wire act_allowed_w;
wire cas_allowed_w;
wire pre_allowed_w;

// Bank FSM (Module 2) outputs
wire [31:0] banks_open_w;
wire [14:0] row_open_w;       // row open in selected bank (simplified)
wire        any_hit_w;

// Refresh controller (Module 3) outputs
wire ref_req_w;

// Page policy (Module 4) outputs
wire policy_hit_w;
wire policy_miss_w;
wire policy_close_w;

// Scheduler (Module 9) outputs
wire        sched_cmd_act;
wire        sched_cmd_rd;
wire        sched_cmd_wr;
wire        sched_cmd_pre;
wire [2:0]  sched_cmd_bg;
wire [1:0]  sched_cmd_ba;
wire [14:0] sched_cmd_row;
wire [4:0]  sched_cmd_col;
wire [127:0] sched_cmd_data;
wire        sched_req_ready;

// Command outputs to timing FSM and bank FSM
reg  cmd_act_r, cmd_rd_r, cmd_wr_r, cmd_pre_r, cmd_ref_r;
reg  [2:0]  cmd_bg_r;
reg  [1:0]  cmd_ba_r;
reg  [14:0] cmd_row_r;
reg  [4:0]  cmd_col_r;
reg  [127:0] cmd_data_r;

// Read data captured from DRAM (simulated: echo cmd_data with CL delay)
// In real silicon this comes from the PHY RX FIFO
reg  [127:0] rd_data_pipe [0:15];
reg  [3:0]   rd_pipe_ptr;
reg          rd_pipe_valid [0:15];

// -----------------------------------------------------------------------
// Sub-module instantiations (black boxes — full RTL in earlier modules)
// -----------------------------------------------------------------------

hbm3_timing_fsm #(
    .TRCD(TRCD), .TRP(TRP), .TCL(TCL), .TCWL(TCWL)
) u_timing (
    .i_clk      (i_clk),
    .i_rst_n    (i_rst_n),
    .i_cmd_act  (cmd_act_r),
    .i_cmd_rd   (cmd_rd_r),
    .i_cmd_wr   (cmd_wr_r),
    .i_cmd_pre  (cmd_pre_r),
    .i_cmd_bg   (cmd_bg_r),
    .i_cmd_ba   (cmd_ba_r),
    .o_act_allowed (act_allowed_w),
    .o_cas_allowed (cas_allowed_w),
    .o_pre_allowed (pre_allowed_w)
);

hbm3_bank_fsm u_bank (
    .i_clk      (i_clk),
    .i_rst_n    (i_rst_n),
    .i_cmd_act  (cmd_act_r),
    .i_cmd_pre  (cmd_pre_r),
    .i_cmd_rd   (cmd_rd_r),
    .i_cmd_wr   (cmd_wr_r),
    .i_cmd_bg   (cmd_bg_r),
    .i_cmd_ba   (cmd_ba_r),
    .i_cmd_row  (cmd_row_r),
    .o_banks_open  (banks_open_w),
    .o_row_open    (row_open_w),
    .o_any_hit     (any_hit_w)
);

hbm3_refresh_ctrl u_ref (
    .i_clk      (i_clk),
    .i_rst_n    (i_rst_n),
    .i_ref_done (cmd_ref_r),
    .o_ref_req  (ref_req_w)
);

hbm3_page_policy u_policy (
    .i_clk       (i_clk),
    .i_rst_n     (i_rst_n),
    .i_req_addr  (i_req_addr),
    .i_banks_open(banks_open_w),
    .i_row_open  (row_open_w),
    .o_hit       (policy_hit_w),
    .o_miss      (policy_miss_w),
    .o_close     (policy_close_w)
);

hbm3_scheduler #(
    .RQ_DEPTH(16), .WQ_DEPTH(16),
    .WQ_HWM(12),   .WQ_LWM(4)
) u_sched (
    .i_clk         (i_clk),
    .i_rst_n       (i_rst_n),
    .i_req_valid   (i_req_valid),
    .i_req_addr    (i_req_addr),
    .i_req_wr      (i_req_wr),
    .i_req_data    (i_req_data),
    .i_req_mask    (i_req_mask),
    .o_req_ready   (sched_req_ready),
    .i_banks_open  (banks_open_w),
    .i_any_hit     (any_hit_w),
    .i_act_allowed (act_allowed_w),
    .i_cas_allowed (cas_allowed_w),
    .i_ref_req     (ref_req_w),
    .o_cmd_act     (sched_cmd_act),
    .o_cmd_rd      (sched_cmd_rd),
    .o_cmd_wr      (sched_cmd_wr),
    .o_cmd_pre     (sched_cmd_pre),
    .o_cmd_bg      (sched_cmd_bg),
    .o_cmd_ba      (sched_cmd_ba),
    .o_cmd_row     (sched_cmd_row),
    .o_cmd_col     (sched_cmd_col),
    .o_cmd_data    (sched_cmd_data),
    .o_rd_queue_depth (),
    .o_wr_queue_depth (),
    .o_wr_drain    ()
);

assign o_req_ready = sched_req_ready;

// -----------------------------------------------------------------------
// Command Sequencing FSM
// -----------------------------------------------------------------------
localparam [3:0]
    S_IDLE      = 4'd0,
    S_DECODE    = 4'd1,
    S_ACTIVATE  = 4'd2,
    S_READ      = 4'd3,
    S_WRITE     = 4'd4,
    S_PRECHARGE = 4'd5,
    S_REFRESH   = 4'd6,
    S_RETURN    = 4'd7;

reg [3:0]  state, state_next;
reg [9:0]  wait_cnt;
reg        cur_is_wr;
reg [4:0]  cur_bank_id;   // {bg[2:0], ba[1:0]}
reg [14:0] cur_row;
reg [4:0]  cur_col;
reg [127:0] cur_wdata;
reg        policy_hit_r, policy_close_r;
reg [31:0] banks_in_flight;

assign o_state_dbg = state;
assign o_busy = (state != S_IDLE) || ref_req_w;

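// Issue gates: timing guard plus bank-conflict guard
wire act_gate = act_allowed_w && !banks_in_flight[cur_bank_id];
// On the miss/close path (policy_hit_r = 0) this transaction's own ACT set
// banks_in_flight, so only the hit path checks it. Without this exception
// the FSM would block its own RD/WR and never leave S_READ/S_WRITE.
wire cas_gate = cas_allowed_w &&
                (!banks_in_flight[cur_bank_id] || !policy_hit_r);
wire pre_gate = pre_allowed_w;

// Sequential FSM: state, wait counter, and single-cycle command pulses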
always @(posedge i_clk or negedge i_rst_n) begin
    if (!i_rst_n) begin
        state          <= S_IDLE;
        wait_cnt       <= 10'd0;
        banks_in_flight <= 32'd0;
        cmd_act_r <= 0; cmd_rd_r <= 0; cmd_wr_r <= 0;
        cmd_pre_r <= 0; cmd_ref_r <= 0;
        cmd_bg_r  <= 0; cmd_ba_r <= 0;
        cmd_row_r <= 0; cmd_col_r <= 0; cmd_data_r <= 0;
        o_rd_data  <= 128'd0; o_rd_valid <= 0; o_wr_done <= 0;
        cur_is_wr  <= 0; cur_bank_id <= 0; cur_row <= 0;
        cur_col    <= 0; cur_wdata  <= 0;
        policy_hit_r <= 0; policy_close_r <= 0;
    end else begin
        // Default pulse clears
        cmd_act_r <= 0; cmd_rd_r <= 0; cmd_wr_r <= 0;
        cmd_pre_r <= 0; cmd_ref_r <= 0;
        o_rd_valid <= 0; o_wr_done <= 0;

        case (state)
            // --------------------------------------------------------
            S_IDLE: begin
                if (ref_req_w) begin
                    // Refresh has highest priority
                    if (|banks_open_w) begin
                        // Must precharge all before refresh
                        cmd_pre_r <= 1'b1;
                        wait_cnt  <= TRP;
                        state     <= S_PRECHARGE;
                    end else begin
                        wait_cnt <= TRFC;
                        cmd_ref_r <= 1'b1;
                        state     <= S_REFRESH;
                    end
                end else if (sched_cmd_act || sched_cmd_rd ||
                             sched_cmd_wr || sched_cmd_pre) begin
                    // Scheduler has a command ready — latch and decode
                    cur_bank_id    <= {sched_cmd_bg, sched_cmd_ba};
                    cur_row        <= sched_cmd_row;
                    cur_col        <= sched_cmd_col;
                    cur_wdata      <= sched_cmd_data;
                    cur_is_wr      <= sched_cmd_wr && !sched_cmd_rd;
                    cmd_bg_r       <= sched_cmd_bg;
                    cmd_ba_r       <= sched_cmd_ba;
                    policy_hit_r   <= policy_hit_w;
                    policy_close_r <= policy_close_w;
                    state          <= S_DECODE;
                end
            end

            // --------------------------------------------------------
            S_DECODE: begin
                // One pipeline stage for page policy to settle
                if (policy_hit_r) begin
                    // Row already open — skip ACT
                    state <= cur_is_wr ? S_WRITE : S_READ;
                end else if (policy_close_r) begin
                    // Wrong row open — must PRE first
                    if (pre_gate) begin
                        cmd_pre_r <= 1'b1;
                        cmd_bg_r  <= cur_bank_id[4:2];
                        cmd_ba_r  <= cur_bank_id[1:0];
                        wait_cnt  <= TRP;
                        state     <= S_PRECHARGE;
                    end
                end else begin
                    // Bank idle — go straight to ACT
                    state <= S_ACTIVATE;
                end
            end

            // --------------------------------------------------------
            S_ACTIVATE: begin
                if (act_gate) begin
                    cmd_act_r <= 1'b1;
                    cmd_bg_r  <= cur_bank_id[4:2];
                    cmd_ba_r  <= cur_bank_id[1:0];
                    cmd_row_r <= cur_row;
                    wait_cnt  <= TRCD;
                    banks_in_flight[cur_bank_id] <= 1'b1;
                    state     <= cur_is_wr ? S_WRITE : S_READ;
                end
                // else: stall — timing interlock
            end

            // --------------------------------------------------------
            S_READ: begin
                if (wait_cnt > 0) begin
                    wait_cnt <= wait_cnt - 1;
                end else if (cas_gate) begin
                    cmd_rd_r  <= 1'b1;
                    cmd_col_r <= cur_col;
                    wait_cnt  <= TCL;
                    state     <= S_RETURN;
                end
            end

            // --------------------------------------------------------
            S_WRITE: begin
                if (wait_cnt > 0) begin
                    wait_cnt <= wait_cnt - 1;
                end else if (cas_gate) begin
                    cmd_wr_r   <= 1'b1;
                    cmd_col_r  <= cur_col;
                    cmd_data_r <= cur_wdata;
                    o_wr_done  <= 1'b1;  // one-cycle pulse when WR is issued
                    banks_in_flight[cur_bank_id] <= 1'b0;
                    wait_cnt   <= TCWL;
                    state      <= S_IDLE;
                end
            end

            // --------------------------------------------------------
            S_PRECHARGE: begin
                if (wait_cnt > 0) begin
                    wait_cnt <= wait_cnt - 1;
                end else begin
                    banks_in_flight[cur_bank_id] <= 1'b0;
                    if (ref_req_w) begin
                        // After PRE, go to refresh
                        cmd_ref_r <= 1'b1;
                        wait_cnt  <= TRFC;
                        state     <= S_REFRESH;
                    end else begin
                        // After PRE, activate for miss
                        state <= S_ACTIVATE;
                    end
                end
            end

            // --------------------------------------------------------
            S_REFRESH: begin
                if (wait_cnt > 0)
                    wait_cnt <= wait_cnt - 1;
                else begin
                    cmd_ref_r <= 1'b0;
                    state     <= S_IDLE;
                end
            end

            // --------------------------------------------------------
            S_RETURN: begin
                if (wait_cnt > 0) begin
                    wait_cnt <= wait_cnt - 1;
                end else begin
                    // Read data arrives from PHY (simulated)
                    o_rd_data  <= cur_wdata; // in RTL: from PHY RX FIFO
                    o_rd_valid <= 1'b1;
                    banks_in_flight[cur_bank_id] <= 1'b0;
                    state      <= S_IDLE;
                end
            end

            default: state <= S_IDLE;
        endcase
    end
end

endmodule

SystemVerilog Testbench with SVA Assertions

SystemVerilog
// tb_hbm3_pc_ctrl.sv — Testbench for hbm3_pc_ctrl (Module 10)
// Tests: page-hit read, page-miss read, write, refresh preemption
// SVA assertions check state sequencing and interlock correctness
// EcrioniX HBM3 Controller Build

`timescale 1ns/1ps
`default_nettype none

module tb_hbm3_pc_ctrl;

    // DUT ports
    logic        clk, rst_n;
    logic [3:0]  pc_id;
    logic        req_valid, req_wr;
    logic [33:0] req_addr;
    logic [127:0] req_data;
    logic [15:0] req_mask;
    logic        req_ready;
    logic [127:0] rd_data;
    logic        rd_valid;
    logic        wr_done;
    logic        busy;
    logic [3:0]  state_dbg;

    // DUT instantiation
    hbm3_pc_ctrl #(
        .TRCD(14), .TRP(14), .TCL(14), .TCWL(8), .TRFC(700)
    ) dut (
        .i_clk       (clk),
        .i_rst_n     (rst_n),
        .i_pc_id     (pc_id),
        .i_req_valid (req_valid),
        .i_req_wr    (req_wr),
        .i_req_addr  (req_addr),
        .i_req_data  (req_data),
        .i_req_mask  (req_mask),
        .o_req_ready (req_ready),
        .o_rd_data   (rd_data),
        .o_rd_valid  (rd_valid),
        .o_wr_done   (wr_done),
        .o_busy      (busy),
        .o_state_dbg (state_dbg)
    );

    // Clock: 2 GHz (0.5 ns period)
    initial clk = 0;
    always #0.25 clk = ~clk;

    // Cycle counter for logging
    int cycle_cnt;
    always @(posedge clk) cycle_cnt++;

    // ---------------------------------------------------------------
    // SVA: rd_data must contain no X/Z bits in the cycle rd_valid is asserted
    property rd_valid_data;
        @(posedge clk) disable iff (!rst_n)
        rd_valid |-> !$isunknown(rd_data);
    endproperty
    assert property (rd_valid_data)
        else $error("[SVA] rd_valid asserted but rd_data is X");

    // SVA: ACT state must transition to READ or WRITE, not back to IDLE
    // (Encoded: state 2 = ACTIVATE, must reach 3=READ or 4=WRITE)
    property act_leads_to_cas;
        @(posedge clk) disable iff (!rst_n)
        (state_dbg === 4'd2) |-> ##[1:30] (state_dbg === 4'd3 || state_dbg === 4'd4);
    endproperty
    assert property (act_leads_to_cas)
        else $error("[SVA] ACTIVATE state did not reach READ or WRITE within 30 cycles");

    // SVA: REFRESH must return to IDLE 700 to 800 cycles after entry (TRFC = 700)
    property refresh_duration;
        @(posedge clk) disable iff (!rst_n)
        $rose(state_dbg === 4'd6) |-> ##[700:800] (state_dbg === 4'd0);
    endproperty
    assert property (refresh_duration)
        else $warning("[SVA] REFRESH exit timing unexpected");

    // SVA: o_busy must be high whenever not in IDLE
    property busy_when_active;
        @(posedge clk) disable iff (!rst_n)
        (state_dbg !== 4'd0) |-> busy;
    endproperty
    assert property (busy_when_active)
        else $error("[SVA] busy not asserted in non-IDLE state");

    // ---------------------------------------------------------------
    // Task: issue single read
    task do_read(input [33:0] addr);
        @(posedge clk);
        req_valid <= 1; req_wr <= 0;
        req_addr  <= addr; req_data <= 0; req_mask <= 16'hFFFF;
        @(posedge clk) req_valid <= 0;
        // Wait for rd_valid
        @(posedge clk iff rd_valid);
        $display("[cyc %0d] READ complete: addr=%h data=%h", cycle_cnt, addr, rd_data);
    endtask

    // Task: issue single write
    task do_write(input [33:0] addr, input [127:0] data);
        @(posedge clk);
        req_valid <= 1; req_wr <= 1;
        req_addr  <= addr; req_data <= data; req_mask <= 16'hFFFF;
        @(posedge clk) req_valid <= 0;
        repeat(30) @(posedge clk);
        $display("[cyc %0d] WRITE submitted: addr=%h", cycle_cnt, addr);
    endtask

    initial begin
        $dumpfile("tb_hbm3_pc_ctrl.vcd");
        $dumpvars(0, tb_hbm3_pc_ctrl);
        $display("=== hbm3_pc_ctrl testbench start ===");
        cycle_cnt = 0;
        rst_n = 0; pc_id = 4'd0;
        req_valid = 0; req_wr = 0; req_addr = 0;
        req_data = 0; req_mask = 0;
        repeat(8) @(posedge clk);
        rst_n = 1;
        repeat(4) @(posedge clk);

        // TEST 1: Page-empty read (ACT + RD path)
        $display("[T1] Page-empty read — expect ACTIVATE then READ");
        do_read(34'h0000_0100);

        // TEST 2: Write to a new address (test write path)
        $display("[T2] Write to new address");
        do_write(34'h0002_0200, 128'hCAFE_BABE_DEAD_BEEF_0000_1111_2222_3333);

        // TEST 3: Back-to-back reads. The tasks are static and drive the same
        // request signals, so they are called in sequence rather than forked.
        $display("[T3] Back-to-back reads");
        do_read(34'h0004_0400);
        do_read(34'h0006_0600);

        // TEST 4: Wait for automatic refresh
        $display("[T4] Waiting for refresh cycle (Module 3 will assert ref_req)");
        // tREFI = ~7800 cycles; wait long enough for at least one refresh
        repeat(8200) @(posedge clk);
        $display("[T4] Refresh window passed");

        // TEST 5: Verify state returns to IDLE after refresh
        @(posedge clk iff state_dbg === 4'd0);
        $display("[T5] PASS: FSM returned to IDLE after refresh");

        repeat(20) @(posedge clk);
        $display("=== All testbench scenarios complete ===");
        $finish;
    end

endmodule

FSM State Transition Table

| Current State | Condition | Next State | Output |
|---|---|---|---|
| IDLE | ref_req & banks_open | PRECHARGE | cmd_pre |
| IDLE | ref_req & !banks_open | REFRESH | cmd_ref |
| IDLE | sched has command | DECODE | latch cmd |
| IDLE | none | IDLE | - |
| DECODE | policy_hit | READ or WRITE | - |
| DECODE | policy_close | PRECHARGE | cmd_pre |
| DECODE | policy_miss (empty) | ACTIVATE | - |
| ACTIVATE | act_gate & is_rd | READ | cmd_act |
| ACTIVATE | act_gate & is_wr | WRITE | cmd_act |
| ACTIVATE | !act_gate | ACTIVATE | stall |
| READ | wait_cnt > 0 | READ | wait tRCD |
| READ | wait_cnt == 0 & cas_gate | RETURN | cmd_rd |
| WRITE | wait_cnt > 0 | WRITE | wait tRCD |
| WRITE | wait_cnt == 0 & cas_gate | IDLE | cmd_wr, wr_done |
| PRECHARGE | wait_cnt > 0 | PRECHARGE | wait tRP |
| PRECHARGE | wait_cnt == 0 & ref_req | REFRESH | cmd_ref |
| PRECHARGE | wait_cnt == 0 & !ref_req | ACTIVATE | - |
| REFRESH | wait_cnt > 0 | REFRESH | wait tRFC |
| REFRESH | wait_cnt == 0 | IDLE | ref_done |
| RETURN | wait_cnt > 0 | RETURN | wait CL |
| RETURN | wait_cnt == 0 | IDLE | rd_valid, rd_data |

Port Reference Table

| Port | Dir | Width | Description |
|---|---|---|---|
| i_clk | in | 1 | System clock (2 GHz) |
| i_rst_n | in | 1 | Active-low asynchronous reset |
| i_pc_id | in | 4 | Pseudo-channel index 0–15 |
| i_req_valid | in | 1 | New host request |
| i_req_wr | in | 1 | 1 = write, 0 = read |
| i_req_addr | in | 34 | Full HBM3 address |
| i_req_data | in | 128 | Write data |
| i_req_mask | in | 16 | Byte write enables |
| o_req_ready | out | 1 | Scheduler has queue space |
| o_rd_data | out | 128 | Read data return to host |
| o_rd_valid | out | 1 | Read data valid pulse |
| o_wr_done | out | 1 | Write committed to DRAM |
| o_busy | out | 1 | PC is busy (refresh or active cmd) |
| o_state_dbg | out | 4 | FSM state encoding for debug |

Frequently Asked Questions

What is a pseudo-channel in HBM3?

An HBM3 stack contains 16 pseudo-channels, each operating as an independent 64-bit-wide memory interface with its own command bus, data bus, and 32 banks. PCs share the physical I/O pads and power rails but run independent command sequences. Each PC has its own timing FSM, bank FSM, refresh controller, page policy, and scheduler — the hbm3_pc_ctrl module integrates all of these for one PC.

What is the difference between a page hit and a page miss?

A page hit occurs when the requested row is already activated (open) in the target bank. The controller issues READ/WRITE directly, saving the tRCD latency (14 cycles, which is 7 ns at 2 GHz). A page miss to an idle bank pays ACT + CAS = tRCD + CL = 28 cycles with the default parameters. A page miss to a wrong-row bank pays PRE + ACT + CAS = tRP + tRCD + CL = 42 cycles. The page policy module selects which path.

How do timing interlocks work between sub-modules?

The timing FSM (Module 1) outputs act_allowed and cas_allowed, which go HIGH only when all relevant timing parameters have been satisfied. The PC controller's sequencing FSM stalls in ACTIVATE until act_allowed is HIGH and in READ/WRITE until cas_allowed is HIGH, in both cases also requiring that the target bank has no conflicting command in flight. While stalled it outputs nothing. This simple interlock prevents timing violations without complex handshake protocols.
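A single timing constraint of this kind reduces to a down-counter that holds its "allowed" output low for a fixed number of cycles. The sketch below illustrates that idea only; it is not the Module 1 RTL. The module name `timing_guard_sketch` and its ports are hypothetical, and a real timing FSM would AND several such windows (tRP, tRRD, tFAW, and so on) per bank.

```verilog
// Hypothetical single-constraint timing guard (not the Module 1 RTL).
// After i_start is sampled, o_allowed stays low for DELAY cycles.
module timing_guard_sketch #(
    parameter DELAY = 14
)(
    input  wire i_clk,
    input  wire i_rst_n,
    input  wire i_start,     // pulse: the command that opens the window was issued
    output wire o_allowed    // high once the window has elapsed
);
    reg [9:0] cnt;

    always @(posedge i_clk or negedge i_rst_n) begin
        if (!i_rst_n)
            cnt <= 10'd0;
        else if (i_start)
            cnt <= DELAY;          // open (or restart) the blocking window
        else if (cnt != 10'd0)
            cnt <= cnt - 10'd1;    // count down to zero, then hold
    end

    assign o_allowed = (cnt == 10'd0);
endmodule
```

For example, pulsing `i_start` on PRE with `DELAY` set to tRP yields a signal that could serve as one term of act_allowed.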

What happens during a refresh cycle?

When Module 3 asserts ref_req, the PC controller FSM finishes its current command, issues PRECHARGE-ALL if any banks are open (waits tRP), then issues REF and waits tRFC (700 cycles = 350 ns at 2 GHz). No host requests are served during tRFC. After completion the FSM returns to IDLE and Module 3 de-asserts ref_req.

Why does the PC controller expose o_state_dbg?

In a 16-PC HBM3 controller running at 2 GHz, timing bugs are nearly impossible to reproduce on a logic analyser. The 4-bit o_state_dbg encodes the FSM state, allowing an ILA, scan chain, or JTAG debug fabric to capture which PC was in which state at the moment of a failure. This is critical for post-silicon bring-up, yield analysis, and power regression testing.
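In simulation, the same encoding is easier to read in logs when it is mapped back to state names. The helper below is a small sketch for testbench use; the module and function names are hypothetical, and the encodings are taken from the state table in this article. Each name is padded with spaces to a fixed nine characters so it fits a plain Verilog vector.

```verilog
// Testbench helper sketch: map o_state_dbg to a readable, space-padded name.
// Encodings follow the state table in this article.
module state_name_sketch;
    function [8*9-1:0] state_name;   // nine ASCII characters
        input [3:0] s;
        begin
            case (s)
                4'd0:    state_name = "IDLE     ";
                4'd1:    state_name = "DECODE   ";
                4'd2:    state_name = "ACTIVATE ";
                4'd3:    state_name = "READ     ";
                4'd4:    state_name = "WRITE    ";
                4'd5:    state_name = "PRECHARGE";
                4'd6:    state_name = "REFRESH  ";
                4'd7:    state_name = "RETURN   ";
                default: state_name = "UNKNOWN  ";
            endcase
        end
    endfunction

    initial begin
        // Example use: print the name for encoding 5.
        $display("state 5 = %s", state_name(4'd5));
    end
endmodule
```

In a testbench the function could be called with `state_dbg` inside a `$display` on every state change.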