⚡ Phase 1 · Module 3 of 4

HBM3 Refresh Controller

tREFI watchdog counter, refresh debt accumulation, all-bank and per-bank refresh scheduling, tRFC blocking window, and collision arbitration. Fully synthesizable Verilog with self-checking SV testbench.

📁 hbm3_refresh_ctrl.v · 🧪 tb_hbm3_refresh_ctrl.sv · ✅ Synthesizable RTL · tREFI = 7800 cy · tRFC = 440 cy · JEDEC JESD238

Why DRAM Needs Periodic Refresh

DRAM stores each bit as charge on a tiny capacitor — typically 10–30 fF. Unlike SRAM (which uses a cross-coupled latch), DRAM capacitors are not self-refreshing. Transistor junction leakage continuously drains the charge at a rate of roughly 1–10 fA per cell. At room temperature, the charge decays to an unreadable level within 32–64 ms; at higher temperatures (e.g. 85 °C), retention drops to around 16 ms.

To prevent data loss, the memory controller must periodically issue a REFRESH command, and over one retention interval those commands together must read and rewrite every row in the array. JEDEC specifies the average spacing between refresh commands as tREFI. In HBM3 at 2 GHz, tREFI = 7,800 clock cycles (3.9 µs). Every row must be refreshed at least once per retention interval. Assuming a 32 ms interval, one command every 3.9 µs gives roughly 8,200 commands per interval, so with HBM3's 32,768 rows per bank each command has to cover about four rows per bank.
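
The arithmetic behind these figures can be checked in a few lines. This is an illustrative Python sketch, not part of the RTL. The 32 ms retention window is an assumption taken from the lower end of the range quoted earlier, and the constant names are hypothetical.

```python
# Refresh-interval arithmetic at a 2 GHz controller clock.
CLOCK_HZ = 2_000_000_000   # 2 GHz, from the text
TREFI_CYCLES = 7_800       # tREFI in clock cycles, from the text
ROWS_PER_BANK = 32_768     # rows per bank, from the text
RETENTION_S = 32e-3        # ASSUMPTION: 32 ms retention window

# One tREFI period expressed in seconds.
trefi_s = TREFI_CYCLES / CLOCK_HZ

# How many refresh commands fit into one retention window at that spacing.
commands_per_window = RETENTION_S / trefi_s

# How many rows each command must cover so every row is visited once.
rows_per_command = ROWS_PER_BANK / commands_per_window

print(f"tREFI                      = {trefi_s * 1e6:.1f} us")
print(f"refresh commands per window = {commands_per_window:.0f}")
print(f"rows covered per command    = {rows_per_command:.2f}")
```

Under that assumption each refresh command has to cover about four rows per bank for every row to be visited inside the window.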

HBM3 operates in pseudo-channels, each with its own independent refresh controller. The refresh signal does not need to be coordinated across pseudo-channels, which allows for refresh staggering to avoid simultaneous power spikes across the die stack.

The Refresh Penalty

Every refresh command forces the memory array offline for the refresh cycle time (tRFC). During tRFC, no ACT, RD, WR, or PRE can be issued to the affected banks. The HBM3 specification defines two refresh modes with different tradeoffs:

| Mode | Banks Refreshed | Duration | Array Unavailability |
|---|---|---|---|
| ABR (All-Bank Refresh) | All 32 banks | tRFC = 440 cy (220 ns) | 100% (full blackout) |
| PBR (Per-Bank Refresh) | 1 of 32 banks | tRFCpb = 140 cy (70 ns) | 3.1% (other 31 banks stay accessible) |

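The choice between ABR and PBR is made outside this module: the listing has no mode-select port (no i_pbr_mode input), so it raises both o_ref_req (ABR) and o_pbr_req (PBR) and leaves the scheduler to decide which request to acknowledge. A common policy is to use PBR during sustained traffic and fall back to ABR during idle periods where the latency penalty is irrelevant.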

ABR vs PBR — All-Bank Refresh vs Per-Bank Refresh

ABR — All-Bank Refresh

One REFRESH command refreshes all 32 banks simultaneously. The entire pseudo-channel stalls for 440 cycles (220 ns). Simple to implement — the tREFI counter fires, the controller issues one REF, waits tRFC, resumes. Latency spike is predictable and easy to hide with write buffers. Best for bursty or idle workloads where occasional 220 ns stalls are acceptable.

PBR — Per-Bank Refresh

One PBR command targets a single bank for 140 cycles (70 ns). The other 31 banks remain active. The scheduler must track which banks need refresh (32-bit bitmap) and service each within tREFI. Full PBR requires 32 × 140 = 4,480 cycles per tREFI interval — 57% of tREFI. Delivers 3.1× lower average unavailability but demands precise per-bank scheduling.
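
The percentages quoted for the two modes follow directly from the cycle counts. The Python sketch below reproduces them; the variable names are hypothetical and the calculation assumes exactly one refresh per bank per tREFI, as the text describes.

```python
# Refresh overhead per tREFI interval, in controller clock cycles.
TREFI = 7_800   # refresh interval
TRFC = 440      # all-bank refresh cycle time
TRFCPB = 140    # per-bank refresh cycle time
BANKS = 32

# ABR: every bank is blocked for tRFC once per interval.
abr_blocked_fraction = TRFC / TREFI

# PBR: 32 commands per interval, each occupying one bank for tRFCpb.
pbr_total_cycles = BANKS * TRFCPB
pbr_budget_fraction = pbr_total_cycles / TREFI

# PBR: any single bank is blocked for only one tRFCpb per interval.
pbr_bank_blocked_fraction = TRFCPB / TREFI

# Share of the array that is offline while one PBR is in flight.
banks_offline_share = 1 / BANKS

# Per-bank unavailability ratio between the two modes.
improvement = abr_blocked_fraction / pbr_bank_blocked_fraction

print(f"ABR blocked fraction      : {abr_blocked_fraction:.2%}")
print(f"PBR cycles per interval   : {pbr_total_cycles}")
print(f"PBR share of tREFI        : {pbr_budget_fraction:.1%}")
print(f"PBR per-bank blocked      : {pbr_bank_blocked_fraction:.2%}")
print(f"banks offline during PBR  : {banks_offline_share:.1%}")
print(f"ABR/PBR per-bank ratio    : {improvement:.2f}x")
```

The 57% figure is the share of each interval during which some bank is refreshing, while the 3.1× figure compares how long any one bank is blocked under each mode.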

For a high-bandwidth memory system like HBM3, PBR is strongly preferred during active traffic. The 31 available banks mean a scheduler can almost always find a non-refreshing bank to service, hiding the refresh latency entirely behind ongoing transactions.

JEDEC requires that when operating in PBR mode, each bank must receive its own refresh within tREFI. A controller that issues ABR to "catch up" after PBR misses is non-compliant. Choose one mode and stick to it unless transitioning via the defined mode-switch sequence.

tREFI Watchdog and Refresh Debt Accumulation

The core of the refresh controller is a 13-bit watchdog counter that counts down from 7,799 to zero, giving a period of 7,800 cycles. When it reaches zero, it reloads and raises an internal expiry pulse that drives o_ref_req. If the timing FSM or scheduler cannot accept the refresh immediately (e.g., a critical write burst is finishing), the controller records the missed interval in a refresh debt counter instead of stalling.
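
A cycle-level reference model makes the watchdog period easy to check. The Python sketch below mirrors the countdown in the Verilog listing, which reloads to TREFI - 1 so that the period is exactly 7,800 cycles. The class and method names are hypothetical and the model covers only the counter, not the request or debt logic.

```python
class RefiWatchdog:
    """Cycle-level model of a free-running tREFI down-counter."""

    def __init__(self, trefi=7_800):
        self.trefi = trefi
        self.count = trefi - 1  # reload value, mirroring TREFI - 1 in the RTL

    def tick(self):
        """Advance one clock; return True on the cycle the counter wraps."""
        if self.count == 0:
            self.count = self.trefi - 1
            return True
        self.count -= 1
        return False


wd = RefiWatchdog()
# Record the clock cycles (1-based) on which the watchdog expires.
expiry_cycles = [cycle for cycle in range(1, 3 * 7_800 + 1) if wd.tick()]
print(expiry_cycles)
```

Because the counter never pauses, expiries land on exact multiples of the period regardless of when refreshes are acknowledged.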

Refresh Postponement Rules

JEDEC allows up to 8 refresh commands to be postponed (the postponement window). This gives the scheduler up to 8 × 3.9 µs = 31.2 µs of freedom to avoid issuing refresh at the worst possible moment. However, that freedom shrinks as debt accumulates:

| Debt Level | o_ref_count[3:0] | o_ref_urgent | Scheduler Action |
|---|---|---|---|
| 0–3 refreshes owed | 0x0 – 0x3 | 0 | Normal: can postpone for timing-critical ops |
| 4–7 refreshes owed | 0x4 – 0x7 | 1 | Urgent: must prioritize refresh over new requests |
| 8 refreshes owed | 0x8 | 1 (critical) | Mandatory: cannot issue any ACT/RD/WR until serviced |

When i_ref_ack pulses (the timing FSM has accepted a refresh), the debt counter decrements and the tRFC blocking window begins. The watchdog counter does not pause during tRFC — it keeps counting so the next tREFI deadline is correctly tracked.

The refresh debt counter caps at 8. Once at 8, additional tREFI expirations do not increment further — instead, o_ref_urgent stays asserted and the scheduler must not issue commands until the debt drains below 8.
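
The debt rules can be captured in a small next-state function. This Python sketch follows the behavior described here and in the Verilog listing (increment on an unserviced expiry, decrement on an acknowledge, no change when both coincide). The function names are hypothetical.

```python
DEBT_MAX = 8      # postponement cap
DEBT_URGENT = 4   # urgency threshold


def next_debt(debt, expired, ack):
    """Return the debt after one clock, given watchdog expiry and ack flags."""
    if expired and not ack:
        return min(debt + 1, DEBT_MAX)  # saturate at the cap
    if ack and not expired and debt > 0:
        return debt - 1                 # one refresh paid down
    return debt                         # idle, or expiry and ack cancel out


def is_urgent(debt):
    """Urgency flag as a pure function of the current debt."""
    return debt >= DEBT_URGENT


debt = 0
history = []
for _ in range(10):  # ten missed intervals in a row
    debt = next_debt(debt, expired=True, ack=False)
    history.append(debt)

debt_after_ack = next_debt(debt, expired=False, ack=True)
print(history, debt_after_ack)
```

Ten missed intervals leave the counter pinned at the cap, and a single acknowledge brings it back to 7, which is still in the urgent band.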

Refresh Scheduling — Postpone and Priority Logic

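The refresh controller does not directly issue DRAM commands. It generates request signals that the command scheduler arbitrates against read/write traffic. The key outputs are o_ref_req (ABR request), o_pbr_req and o_pbr_target (PBR request and target bank), o_refreshing (blocking window active), and o_ref_count with o_ref_urgent (debt level and urgency).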

Collision Arbitration

A refresh collision occurs when a refresh request arrives while a bank is in the middle of an ACT→RD/WR sequence. The scheduler has three options:

| Situation | Response | Cost |
|---|---|---|
| Bank idle when refresh fires | Issue REF immediately | tRFC only |
| Bank active, read/write pending, debt < 4 | Finish current command, then refresh | tRFC + small delay |
| Bank active, debt = 8 (critical) | Abort read/write, issue PRE, then REF | tRP + tRFC (full penalty) |

The refresh controller itself does not make this arbitration decision — it merely signals urgency via o_ref_urgent and o_ref_count. The scheduler (covered in Module 5) implements the policy.
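
The table above can be read as a small decision function on the scheduler side. The Python sketch below is one possible encoding, not the policy of any particular scheduler. The names are hypothetical, and the handling of debt levels 4 to 7 with an active bank is an assumption, since the table does not list that case.

```python
from enum import Enum


class Action(Enum):
    REFRESH_NOW = "issue REF immediately"
    FINISH_THEN_REFRESH = "finish current command, then REF"
    PRECHARGE_THEN_REFRESH = "abort, issue PRE, then REF"


def arbitrate(bank_active: bool, debt: int) -> Action:
    """Pick a response to a refresh request from bank state and debt."""
    if not bank_active:
        return Action.REFRESH_NOW
    if debt >= 8:
        # Critical debt: pay tRP + tRFC rather than wait any longer.
        return Action.PRECHARGE_THEN_REFRESH
    # debt < 4 is taken from the table.
    # ASSUMPTION: debt 4-7 also lets the in-flight command finish,
    # while the scheduler stops accepting new requests for the bank.
    return Action.FINISH_THEN_REFRESH


for active, debt in [(False, 0), (True, 2), (True, 6), (True, 8)]:
    print(active, debt, arbitrate(active, debt).value)
```

Keeping the policy in one function like this makes it straightforward to swap in a stricter rule for the 4 to 7 band if the scheduler requires it.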

Timing Diagram — tREFI → Refresh → tRFC Block

[Figure: HBM3 refresh timing, tREFI watchdog → ref_req → ref_ack → tRFC block. Traces: clk, tREFI cnt, o_ref_req, i_ref_ack, o_refreshing. The tREFI counter runs 7800 → 0 (3.9 µs / 7800 cy), expires, and reloads for the next interval; o_refreshing marks the tRFC window of 440 cycles (220 ns).]
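
The tRFC portion of this sequence can be reproduced with a short cycle-level model. The Python sketch below mirrors the blocking-window counter in the Verilog listing (load TRFC - 1 on acknowledge, count down, drop the flag when the counter is already zero). The function name and the chosen acknowledge cycle are hypothetical.

```python
TRFC = 440  # all-bank refresh cycle time in clock cycles


def refreshing_trace(ack_cycle, total_cycles):
    """Return the o_refreshing value after each clock edge."""
    refreshing = False
    rfc_cnt = 0
    trace = []
    for cycle in range(total_cycles):
        ack = cycle == ack_cycle
        if ack and not refreshing:
            rfc_cnt = TRFC - 1      # start the countdown
            refreshing = True
        elif refreshing:
            if rfc_cnt == 0:
                refreshing = False  # window closes
            else:
                rfc_cnt -= 1
        trace.append(refreshing)
    return trace


trace = refreshing_trace(ack_cycle=10, total_cycles=500)
high_cycles = sum(trace)
print(high_cycles, trace.index(True), len(trace) - 1 - trace[::-1].index(True))
```

The flag goes high on the edge that samples the acknowledge and stays high for exactly 440 edges, which is the duration the testbench checks in Test 5.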

Timing Parameters Reference

| Parameter | Symbol | Cycles (2 GHz) | Time | Description |
|---|---|---|---|---|
| Refresh interval | tREFI | 7,800 | 3.9 µs | Average interval between refresh commands |
| Refresh cycle time (ABR) | tRFC | 440 | 220 ns | All-bank refresh unavailability window |
| Per-bank refresh cycle time | tRFCpb | 140 | 70 ns | Per-bank refresh unavailability for targeted bank |
| Max refresh postponement | tREFW (debt) | 8 × tREFI | 31.2 µs | Maximum accumulated refresh debt before mandatory service |
| Refresh urgent threshold | | 4 × tREFI | 15.6 µs | Point at which scheduler must prioritize refresh |
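
Each row of the table is a cycle count divided by the clock rate, which a short script can confirm. This Python sketch assumes the 2 GHz clock stated in the table header. The helper name and dictionary keys are hypothetical.

```python
CLOCK_GHZ = 2.0  # controller clock, cycles per nanosecond


def cycles_to_ns(cycles):
    """Convert a cycle count at CLOCK_GHZ into nanoseconds."""
    return cycles / CLOCK_GHZ


TREFI = 7_800
params_cycles = {
    "tREFI": TREFI,
    "tRFC": 440,
    "tRFCpb": 140,
    "max postponement (8 x tREFI)": 8 * TREFI,
    "urgent threshold (4 x tREFI)": 4 * TREFI,
}

# Build the time column of the table from the cycle column.
params_ns = {name: cycles_to_ns(cy) for name, cy in params_cycles.items()}

for name, ns in params_ns.items():
    print(f"{name:30s} {params_cycles[name]:>6d} cy  {ns:>8.1f} ns")
```

Retargeting the controller to a different clock rate only requires changing CLOCK_GHZ and re-deriving the cycle counts from the nanosecond values.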

Port Reference — hbm3_refresh_ctrl

| Port | Dir | Width | Description |
|---|---|---|---|
| i_clk | in | 1 | System clock (2 GHz) |
| i_rst_n | in | 1 | Active-low synchronous reset |
| i_ref_ack | in | 1 | Pulse: timing FSM accepted ABR refresh, start tRFC |
| i_pbr_ack | in | 1 | Pulse: per-bank refresh accepted, bank identified by i_pbr_bank |
| i_pbr_bank[4:0] | in | 5 | Which bank (0–31) was just refreshed via PBR |
| o_ref_req | out | 1 | ABR refresh request: tREFI watchdog has expired or debt is outstanding |
| o_pbr_req | out | 1 | PBR refresh request: at least one bank needs refresh |
| o_pbr_target[4:0] | out | 5 | Lowest-numbered bank still awaiting PBR (fixed ascending priority) |
| o_refreshing | out | 1 | tRFC blocking window active after i_ref_ack; no ACT/RD/WR allowed (tRFCpb is not timed by this module) |
| o_ref_count[3:0] | out | 4 | Current refresh debt (0–8); 8 = critical, must service now |
| o_ref_urgent | out | 1 | Asserted one cycle after debt reaches ≥ 4; scheduler must prioritize refresh |

Verilog Source — hbm3_refresh_ctrl.v

verilog · hbm3_refresh_ctrl.v
// ================================================================
//  hbm3_refresh_ctrl.v
//  HBM3 Refresh Controller  —  Phase 1 · Module 3
//  tREFI watchdog, refresh debt counter, ABR/PBR scheduling,
//  tRFC blocking window, per-bank refresh bitmap tracking
//  Synthesizable Verilog — EcrioniX HBM3 Controller Series
// ================================================================
`timescale 1ns/1ps
`default_nettype none

module hbm3_refresh_ctrl (
    input  wire        i_clk,
    input  wire        i_rst_n,

    // --- Refresh Acknowledgements from Timing FSM ---
    input  wire        i_ref_ack,        // ABR refresh accepted — start tRFC
    input  wire        i_pbr_ack,        // PBR accepted — clear target bank bit
    input  wire [4:0]  i_pbr_bank,       // Bank refreshed via PBR (0-31)

    // --- Refresh Request Outputs to Scheduler ---
    output reg         o_ref_req,        // ABR needed (tREFI expired)
    output reg         o_pbr_req,        // PBR needed (any bank bit set)
    output reg  [4:0]  o_pbr_target,     // Lowest unrefreshed bank number
    output reg         o_refreshing,     // tRFC blocking window active
    output reg  [3:0]  o_ref_count,      // Refresh debt (0-8)
    output reg         o_ref_urgent      // Debt >= 4: scheduler must prioritize
);

// ================================================================
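//  Timing Parameters (2 GHz clock)
//  Note: TRFCPB is declared for reference only; no logic below
//  times the per-bank tRFCpb window.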
// ================================================================
localparam TREFI    = 13'd7800;   // 3.9 us  — refresh interval
localparam TRFC     = 9'd440;     // 220 ns  — all-bank refresh cycle
localparam TRFCPB   = 8'd140;     // 70 ns   — per-bank refresh cycle
localparam DEBT_MAX = 4'd8;       // JEDEC max postpone limit
localparam DEBT_URG = 4'd4;       // Urgency threshold

// ================================================================
//  tREFI Watchdog Counter (13 bits, 7800-cycle period: 7799 down to 0)
// ================================================================
reg [12:0] r_refi_cnt;
reg        r_refi_expired;

always @(posedge i_clk) begin
    if (!i_rst_n) begin
        r_refi_cnt     <= TREFI - 1;
        r_refi_expired <= 1'b0;
    end else begin
        r_refi_expired <= 1'b0;
        if (r_refi_cnt == 13'd0) begin
            r_refi_cnt     <= TREFI - 1;
            r_refi_expired <= 1'b1;
        end else begin
            r_refi_cnt <= r_refi_cnt - 13'd1;
        end
    end
end

// ================================================================
//  Refresh Debt Counter (0 to 8)
// ================================================================
always @(posedge i_clk) begin
    if (!i_rst_n) begin
        o_ref_count <= 4'd0;
    end else begin
        if (r_refi_expired && !i_ref_ack) begin
            // tREFI expired without service — accumulate debt (cap at 8)
            if (o_ref_count < DEBT_MAX)
                o_ref_count <= o_ref_count + 4'd1;
        end else if (!r_refi_expired && i_ref_ack && o_ref_count > 4'd0) begin
            // Refresh acknowledged — pay down debt
            o_ref_count <= o_ref_count - 4'd1;
        end
        // Both expired and ack in same cycle: net zero change
    end
end

// ================================================================
//  ABR Request and Urgent Flags
// ================================================================
always @(posedge i_clk) begin
    if (!i_rst_n) begin
        o_ref_req    <= 1'b0;
        o_ref_urgent <= 1'b0;
    end else begin
        // ref_req asserted any time debt > 0 OR watchdog just fired
        o_ref_req    <= (o_ref_count > 4'd0) | r_refi_expired;
        o_ref_urgent <= (o_ref_count >= DEBT_URG);
    end
end

// ================================================================
//  tRFC Blocking Window Counter
// ================================================================
reg [8:0] r_rfc_cnt;

always @(posedge i_clk) begin
    if (!i_rst_n) begin
        r_rfc_cnt    <= 9'd0;
        o_refreshing <= 1'b0;
    end else begin
        if (i_ref_ack && !o_refreshing) begin
            // ABR accepted — start tRFC countdown
            r_rfc_cnt    <= TRFC - 9'd1;
            o_refreshing <= 1'b1;
        end else if (o_refreshing) begin
            if (r_rfc_cnt == 9'd0) begin
                o_refreshing <= 1'b0;
            end else begin
                r_rfc_cnt <= r_rfc_cnt - 9'd1;
            end
        end
    end
end

// ================================================================
//  Per-Bank Refresh Bitmap (32 banks)
//  Bit[n] = 1 means bank n needs a PBR refresh this interval
// ================================================================
reg [31:0] r_pbr_bitmap;
integer    pbr_i;

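// Find lowest set bit (priority encoder).
// The default of 31 covers both "only bit 31 set" and "no bits set";
// o_pbr_req tells the two cases apart.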
reg [4:0] r_pbr_next;
always @(*) begin
    r_pbr_next = 5'd31;
    for (pbr_i = 30; pbr_i >= 0; pbr_i = pbr_i - 1)
        if (r_pbr_bitmap[pbr_i]) r_pbr_next = pbr_i[4:0];
end

always @(posedge i_clk) begin
    if (!i_rst_n) begin
        r_pbr_bitmap <= 32'hFFFF_FFFF; // all banks need refresh on reset
        o_pbr_req    <= 1'b0;
        o_pbr_target <= 5'd0;
    end else begin
        // On tREFI expiry, reload all bits (new refresh interval)
        if (r_refi_expired)
            r_pbr_bitmap <= 32'hFFFF_FFFF;

        // Clear bit for bank that just received PBR
        if (i_pbr_ack)
            r_pbr_bitmap[i_pbr_bank] <= 1'b0;

        // Update outputs
        o_pbr_req    <= |r_pbr_bitmap;
        o_pbr_target <= r_pbr_next;
    end
end

endmodule
`default_nettype wire

SystemVerilog Testbench — tb_hbm3_refresh_ctrl.sv

systemverilog · tb_hbm3_refresh_ctrl.sv
// ================================================================
//  tb_hbm3_refresh_ctrl.sv
//  Self-checking testbench for hbm3_refresh_ctrl
//  5 Tests: basic refresh, debt accumulation, critical debt (8),
//           PBR sequence, tRFC blocking enforcement
// ================================================================
`timescale 1ns/1ps

module tb_hbm3_refresh_ctrl;

// ---- DUT Ports ----
logic        clk, rst_n;
logic        ref_ack, pbr_ack;
logic [4:0]  pbr_bank;
wire         ref_req, pbr_req, refreshing, ref_urgent;
wire  [4:0]  pbr_target;
wire  [3:0]  ref_count;

hbm3_refresh_ctrl dut (
    .i_clk       (clk),
    .i_rst_n     (rst_n),
    .i_ref_ack   (ref_ack),
    .i_pbr_ack   (pbr_ack),
    .i_pbr_bank  (pbr_bank),
    .o_ref_req   (ref_req),
    .o_pbr_req   (pbr_req),
    .o_pbr_target(pbr_target),
    .o_refreshing(refreshing),
    .o_ref_count (ref_count),
    .o_ref_urgent(ref_urgent)
);

// 2 GHz clock: 0.5 ns period
initial clk = 0;
always #0.25 clk = ~clk;

// ---- Helpers ----
int pass_cnt = 0, fail_cnt = 0;

task check(input string name, input logic got, input logic exp);
    if (got === exp) begin
        $display("  PASS  %s", name); pass_cnt++;
    end else begin
        $display("  FAIL  %s: got=%0b exp=%0b", name, got, exp); fail_cnt++;
    end
endtask

task tick(input int n = 1);
    repeat(n) @(posedge clk);
    #0.1;
endtask

task reset_dut;
    rst_n = 0; ref_ack = 0; pbr_ack = 0; pbr_bank = 0;
    tick(4);
    rst_n = 1;
    tick(2);
endtask

// ================================================================
//  SVA Assertions
// ================================================================
// ref_count must never exceed 8
property debt_cap;
    @(posedge clk) disable iff (!rst_n)
    ref_count <= 4'd8;
endproperty
assert property (debt_cap) else $error("SVA FAIL: ref_count exceeded 8");

// o_ref_urgent is a registered output, so it follows ref_count >= 4
// one cycle later; use non-overlapping implication (|=>)
property urgent_correct;
    @(posedge clk) disable iff (!rst_n)
    (ref_count >= 4'd4) |=> ref_urgent;
endproperty
assert property (urgent_correct) else $error("SVA FAIL: ref_urgent not set one cycle after count>=4");

// refreshing must de-assert within 441 cycles of ref_ack
property rfc_window;
    @(posedge clk) disable iff (!rst_n)
    $rose(ref_ack) |-> ##[440:441] !refreshing;
endproperty
assert property (rfc_window) else $error("SVA FAIL: tRFC window wrong duration");

// ================================================================
//  TEST 1 — Basic ABR Refresh Cycle
// ================================================================
task test1_basic_refresh;
    $display("\n[TEST 1] Basic ABR Refresh Cycle");
    reset_dut();
    // Wait for watchdog to expire (7800 cycles)
    tick(7800);
    tick(2); // propagation
    check("ref_req after tREFI", ref_req, 1'b1);
    // Acknowledge refresh
    ref_ack = 1; tick(1); ref_ack = 0;
    tick(2);
    check("refreshing after ack", refreshing, 1'b1);
    check("ref_count after ack", ref_count == 4'd0, 1'b1);
    // Wait for tRFC to expire
    tick(440);
    check("refreshing clear after tRFC", refreshing, 1'b0);
endtask

// ================================================================
//  TEST 2 — Debt Accumulation to 4 (Urgent)
// ================================================================
task test2_debt_to_4;
    $display("\n[TEST 2] Debt Accumulation to 4 (Urgent)");
    reset_dut();
    // Miss 4 refresh intervals
    repeat(4) begin
        tick(7800);
        tick(2); // let counter reload
    end
    tick(2);
    check("ref_count == 4", ref_count == 4'd4, 1'b1);
    check("ref_urgent asserted", ref_urgent, 1'b1);
    // Service one refresh
    ref_ack = 1; tick(1); ref_ack = 0;
    tick(2);
    check("debt decremented to 3", ref_count == 4'd3, 1'b1);
    check("ref_urgent cleared", ref_urgent, 1'b0);
endtask

// ================================================================
//  TEST 3 — Critical Debt (8 Postponements)
// ================================================================
task test3_debt_critical;
    $display("\n[TEST 3] Critical Debt — 8 Postponements");
    reset_dut();
    // Miss 9 refresh intervals: one more than the cap, to confirm saturation at 8
    repeat(9) begin
        tick(7800);
        tick(2);
    end
    tick(2);
    check("ref_count capped at 8", ref_count == 4'd8, 1'b1);
    check("ref_urgent critical", ref_urgent, 1'b1);
    check("ref_req asserted", ref_req, 1'b1);
endtask

// ================================================================
//  TEST 4 — PBR Sequence (3 banks)
// ================================================================
task test4_pbr_sequence;
    $display("\n[TEST 4] PBR Bank Sequence");
    reset_dut();
    tick(5);
    check("PBR bitmap full on reset", pbr_req, 1'b1);
    check("PBR target starts at 0", pbr_target == 5'd0, 1'b1);
    // Refresh bank 0
    pbr_bank = 5'd0; pbr_ack = 1; tick(1); pbr_ack = 0;
    tick(2);
    check("target advances to 1", pbr_target == 5'd1, 1'b1);
    // Refresh banks 1 and 2
    pbr_bank = 5'd1; pbr_ack = 1; tick(1); pbr_ack = 0; tick(2);
    pbr_bank = 5'd2; pbr_ack = 1; tick(1); pbr_ack = 0; tick(2);
    check("target at 3 after 0-2 done", pbr_target == 5'd3, 1'b1);
    check("pbr_req still set", pbr_req, 1'b1);
endtask

// ================================================================
//  TEST 5 — tRFC Blocking Window
// ================================================================
task test5_rfc_blocking;
    $display("\n[TEST 5] tRFC Blocking Window Duration");
    reset_dut();
    tick(7800); tick(2);
    ref_ack = 1; tick(1); ref_ack = 0;
    tick(1);
    check("refreshing starts immediately", refreshing, 1'b1);
    tick(438); // 1 + 438 = 439 cycles into tRFC
    check("refreshing still high at 439", refreshing, 1'b1);
    tick(1);  // cycle 440
    check("refreshing clears at 440", refreshing, 1'b0);
endtask

// ================================================================
//  Main
// ================================================================
initial begin
    $dumpfile("dump.vcd"); $dumpvars(0, tb_hbm3_refresh_ctrl);
    test1_basic_refresh();
    test2_debt_to_4();
    test3_debt_critical();
    test4_pbr_sequence();
    test5_rfc_blocking();
    $display("\n========================================");
    $display("  RESULTS: %0d PASS  /  %0d FAIL", pass_cnt, fail_cnt);
    $display("========================================");
    $finish;
end

endmodule

Frequently Asked Questions

Why does DRAM need periodic refresh?

DRAM stores data as charge on leaky capacitors. Without refresh, the charge decays below the sense amplifier threshold within 32–64 ms at room temperature. The controller must issue REFRESH commands at an average rate of one per tREFI (3.9 µs in HBM3) so that every row is rewritten before its charge is lost.

What is the difference between ABR and PBR refresh?

ABR (All-Bank Refresh) refreshes all 32 banks simultaneously in tRFC=440 cycles (220 ns). The entire pseudo-channel stalls. PBR (Per-Bank Refresh) targets one bank in tRFCpb=140 cycles (70 ns), leaving 31 banks accessible. PBR reduces average memory unavailability by 3.1× but requires per-bank tracking and scheduling across 32 banks per tREFI interval.

What is refresh debt and why is it capped at 8?

Refresh debt is the count of tREFI intervals that passed without a refresh being serviced. JEDEC allows up to 8 refresh commands to be postponed so the scheduler can delay refresh during critical bursts. Beyond 8 postponements, rows risk exceeding their retention time and data corruption becomes possible, so the debt is hard-capped at 8 and the controller keeps o_ref_urgent asserted until the scheduler services it.

What happens during the tRFC blocking window?

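After i_ref_ack is asserted in ABR mode, all 32 banks are unavailable for tRFC cycles; in PBR mode only the targeted bank is unavailable, for tRFCpb cycles. In this listing o_refreshing is raised by i_ref_ack and held for tRFC, so it gates the scheduler during all-bank refresh: no ACT, RD, WR, or PRE can be issued. The per-bank tRFCpb window is not timed by this module, so it has to be tracked outside it. The tREFI watchdog continues counting during tRFC, so the next deadline is tracked correctly even while the array is busy.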

How does the PBR priority encoder select the next target bank?

The refresh controller maintains a 32-bit bitmap where each bit represents one bank. After every tREFI expiry, all 32 bits are set (every bank needs refresh). When a bank receives PBR service (i_pbr_ack + i_pbr_bank), its bit is cleared. The priority encoder selects the lowest-numbered bank with its bit still set. This is a fixed ascending priority rather than a true round-robin: banks are serviced in the order 0, 1, 2, ... 31 within an interval, and every bank is reached provided all 32 acknowledgements arrive before the next tREFI reload sets the bitmap again.
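
The bitmap and encoder can be modelled with plain integer operations. This Python sketch follows the selection rule in the Verilog listing, including its default of bank 31 when no bit is set. The function names are hypothetical.

```python
BANKS = 32
ALL_PENDING = (1 << BANKS) - 1  # every bank needs refresh


def next_target(bitmap):
    """Lowest-numbered pending bank; 31 when nothing is pending, as in the RTL."""
    if bitmap == 0:
        return BANKS - 1
    lowest_bit = bitmap & -bitmap        # isolate the lowest set bit
    return lowest_bit.bit_length() - 1   # convert one-hot value to index


def ack_bank(bitmap, bank):
    """Clear the pending bit of a bank that has just been refreshed."""
    return bitmap & ~(1 << bank)


bitmap = ALL_PENDING
order = []
for _ in range(4):
    target = next_target(bitmap)
    order.append(target)
    bitmap = ack_bank(bitmap, target)

print(order, hex(bitmap))
```

Because the default target collides with a genuine request for bank 31, a consumer of this model should qualify the target with a "any bit set" check, which is the role o_pbr_req plays in the RTL.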