tREFI watchdog counter, refresh debt accumulation, all-bank and per-bank refresh scheduling, tRFC blocking window, and collision arbitration. Fully synthesizable Verilog with self-checking SV testbench.
DRAM stores each bit as charge on a tiny capacitor — typically 10–30 fF. Unlike SRAM (which uses a cross-coupled latch), DRAM capacitors are not self-refreshing. Transistor junction leakage continuously drains the charge at a rate of roughly 1–10 fA per cell. At room temperature, the charge decays to an unreadable level within 32–64 ms; at higher temperatures (e.g. 85 °C), retention drops to around 16 ms.
To prevent data loss, the memory controller must periodically issue REFRESH commands, each of which reads and rewrites a group of rows in the array. JEDEC specifies the average interval between refresh commands as tREFI. In HBM3 at 2 GHz, tREFI = 7,800 clock cycles (3.9 µs). Every row must be refreshed at least once per retention interval, so the commands issued during one retention window must together cover all 32,768 rows of each bank; HBM3 spaces those commands 3.9 µs apart on average.
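The relationship between tREFI, the retention window, and the row count can be checked with a short Python sketch. The 32 ms window is an assumption taken from the lower end of the retention range quoted earlier, and the helper names are illustrative only.

```python
# Values quoted in the text.
CLOCK_HZ = 2_000_000_000   # 2 GHz controller clock
TREFI_CYCLES = 7_800       # tREFI in clock cycles
ROWS_PER_BANK = 32_768     # rows per bank

# ASSUMPTION: a 32 ms retention window (lower end of the 32-64 ms range).
RETENTION_MS = 32


def cycles_to_us(cycles: int) -> float:
    """Convert clock cycles to microseconds at CLOCK_HZ."""
    return cycles * 1e6 / CLOCK_HZ


def commands_per_window(retention_ms: int = RETENTION_MS) -> int:
    """Whole tREFI slots that fit in one retention window (integer math)."""
    window_cycles = retention_ms * CLOCK_HZ // 1000
    return window_cycles // TREFI_CYCLES


def rows_per_command(retention_ms: int = RETENTION_MS) -> float:
    """Rows each refresh must cover so all rows are visited once per window."""
    return ROWS_PER_BANK / commands_per_window(retention_ms)


print(f"tREFI               = {cycles_to_us(TREFI_CYCLES):.2f} us")
print(f"commands per window = {commands_per_window()}")
print(f"rows per command    = {rows_per_command():.2f}")
```

Under these assumptions each refresh command has to cover about four rows per bank; with a 64 ms window the figure would be about two.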
Every refresh command forces the memory array offline for the refresh cycle time (tRFC). During tRFC, no ACT, RD, WR, or PRE can be issued to the affected banks. The HBM3 specification defines two refresh modes with different tradeoffs:
| Mode | Banks Refreshed | Duration | Array Unavailability |
|---|---|---|---|
| ABR — All-Bank Refresh | All 32 banks | tRFC = 440 cy (220 ns) | 100% — full blackout |
| PBR — Per-Bank Refresh | 1 of 32 banks | tRFCpb = 140 cy (70 ns) | 3.1% — other 31 banks stay accessible |
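The choice between ABR and PBR is set by i_pbr_mode. The module listed in this section has no i_pbr_mode port: it drives o_ref_req and o_pbr_req independently, so the mode selection presumably happens outside it. Most modern HBM3 controllers default to PBR during sustained traffic and fall back to ABR during idle periods where the latency penalty is irrelevant.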
One REFRESH command refreshes all 32 banks simultaneously. The entire pseudo-channel stalls for 440 cycles (220 ns). Simple to implement — the tREFI counter fires, the controller issues one REF, waits tRFC, resumes. Latency spike is predictable and easy to hide with write buffers. Best for bursty or idle workloads where occasional 220 ns stalls are acceptable.
One PBR command targets a single bank for 140 cycles (70 ns). The other 31 banks remain active. The scheduler must track which banks need refresh (32-bit bitmap) and service each within tREFI. Full PBR requires 32 × 140 = 4,480 cycles of refresh commands per tREFI interval, about 57% of tREFI, but each command blocks only one bank. Any single bank is blocked for 140 cycles instead of 440, about 3.1× less, at the cost of precise per-bank scheduling.
For a high-bandwidth memory system like HBM3, PBR is strongly preferred during active traffic. The 31 available banks mean a scheduler can almost always find a non-refreshing bank to service, hiding the refresh latency entirely behind ongoing transactions.
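The ABR and PBR figures quoted above follow directly from the timing values. The sketch below reproduces them; the function names are hypothetical and the numbers are the ones given in the text.

```python
TREFI = 7_800      # refresh interval, cycles
TRFC_AB = 440      # all-bank refresh cycle time, cycles
TRFC_PB = 140      # per-bank refresh cycle time, cycles
NUM_BANKS = 32


def abr_stall_fraction() -> float:
    """Fraction of each tREFI during which every bank is blocked under ABR."""
    return TRFC_AB / TREFI


def pbr_command_budget() -> int:
    """Total PBR command time needed to refresh all banks once."""
    return NUM_BANKS * TRFC_PB


def pbr_budget_fraction() -> float:
    """Share of tREFI occupied by PBR commands (one bank blocked at a time)."""
    return pbr_command_budget() / TREFI


def per_bank_improvement() -> float:
    """How much shorter one bank's blocked time is under PBR than under ABR."""
    return TRFC_AB / TRFC_PB


print(f"ABR stall per tREFI : {abr_stall_fraction():.1%}")
print(f"PBR command budget  : {pbr_command_budget()} cycles "
      f"({pbr_budget_fraction():.0%} of tREFI)")
print(f"Per-bank improvement: {per_bank_improvement():.1f}x")
```

The 57% budget is time spent issuing refresh commands, not time the array is dark: during each 140-cycle slot only one of the 32 banks is unavailable.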
The core of the refresh controller is a 13-bit watchdog counter that counts down from 7,799 to zero, giving a period of 7,800 cycles. When it reaches zero, it raises o_ref_req and reloads. If the timing FSM or scheduler cannot accept the refresh immediately (e.g., a critical write burst is finishing), the controller increments a refresh debt counter instead of waiting.
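The watchdog's reload behavior can be illustrated with a small cycle-level model in Python. This is a behavioral sketch, not RTL; the class name `RefiWatchdog` is hypothetical, and the load value of period minus one is chosen so that one full period spans exactly 7,800 ticks.

```python
class RefiWatchdog:
    """Behavioral model of a reloading down-counter (hypothetical helper)."""

    def __init__(self, period: int = 7_800) -> None:
        self.period = period
        # Load period - 1 so that reaching zero and reloading takes
        # exactly `period` clock ticks.
        self.count = period - 1

    def tick(self) -> bool:
        """Advance one clock; return True on the tick where the counter reloads."""
        if self.count == 0:
            self.count = self.period - 1
            return True
        self.count -= 1
        return False


wd = RefiWatchdog()
expiries = [cycle for cycle in range(1, 3 * 7_800 + 1) if wd.tick()]
print(expiries)  # ticks on which the watchdog fired
```

The counter never pauses, so expiries stay on a fixed 7,800-cycle grid regardless of when each refresh is actually serviced.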
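JEDEC allows up to 8 consecutive refreshes to be postponed (the postponement window; "pull-in" is the opposite case, where refreshes are issued early). This gives the scheduler up to 8 × 3.9 µs = 31.2 µs of freedom to avoid issuing refresh at the worst possible moment. However, the more refreshes are owed, the less of that freedom remains: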
| Debt Level | o_ref_count[3:0] | o_ref_urgent | Scheduler Action |
|---|---|---|---|
| 0–3 refreshes owed | 0x0 – 0x3 | 0 | Normal — can postpone for timing-critical ops |
| 4–7 refreshes owed | 0x4 – 0x7 | 1 | Urgent — must prioritize refresh over new requests |
| 8 refreshes owed | 0x8 | 1 (critical) | Mandatory — cannot issue any ACT/RD/WR until serviced |
When i_ref_ack pulses (the timing FSM has accepted a refresh), the debt counter decrements and the tRFC blocking window begins. The watchdog counter does not pause during tRFC — it keeps counting so the next tREFI deadline is correctly tracked.
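The debt rules can be summarized as a small state-update function. The Python sketch below is a behavioral model under the thresholds given in the table (urgent at 4, mandatory at 8); the function names and the string labels are illustrative only.

```python
DEBT_MAX = 8      # maximum postponed refreshes
DEBT_URGENT = 4   # threshold at which refresh must be prioritized


def next_debt(debt: int, expired: bool, ack: bool) -> int:
    """Return the refresh debt after one clock edge.

    expired: the tREFI watchdog fired this cycle
    ack:     a refresh was accepted this cycle
    """
    if expired and not ack:
        return min(debt + 1, DEBT_MAX)   # accumulate, saturating at the cap
    if ack and not expired and debt > 0:
        return debt - 1                  # pay down one owed refresh
    return debt                          # both or neither: no net change


def urgency(debt: int) -> str:
    """Map a debt level to the scheduler action class from the table."""
    if debt >= DEBT_MAX:
        return "mandatory"
    if debt >= DEBT_URGENT:
        return "urgent"
    return "normal"


debt = 0
for _ in range(5):                       # five expiries with no service
    debt = next_debt(debt, expired=True, ack=False)
print(debt, urgency(debt))
debt = next_debt(debt, expired=False, ack=True)
debt = next_debt(debt, expired=False, ack=True)
print(debt, urgency(debt))
```

Saturating at 8 keeps the counter within 4 bits, but it also means the model cannot tell how far past the limit a stalled scheduler has gone.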
When the debt reaches 8, o_ref_urgent stays asserted and the scheduler must not issue commands until the debt drains below 8.
The refresh controller does not directly issue DRAM commands; it generates request signals that the command scheduler arbitrates against read/write traffic. The key outputs are o_ref_req, o_pbr_req, and o_pbr_target[4:0].
A refresh collision occurs when a refresh request arrives while a bank is in the middle of an ACT→RD/WR sequence. The scheduler has three options:
| Situation | Response | Cost |
|---|---|---|
| Bank idle when refresh fires | Issue REF immediately | tRFC only |
| Bank active, read/write pending, debt < 4 | Finish current command, then refresh | tRFC + small delay |
| Bank active, debt = 8 (critical) | Abort read/write, issue PRE, then REF | tRP + tRFC (full penalty) |
The refresh controller itself does not make this arbitration decision — it merely signals urgency via o_ref_urgent and o_ref_count. The scheduler (covered in Module 5) implements the policy.
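One way a scheduler might turn the collision table into a decision function is sketched below in Python. The table does not say what happens when a bank is active and the debt is between 4 and 7, so the middle branch is an assumption based on the "prioritize refresh over new requests" rule; the enum and function names are hypothetical and do not describe the Module 5 scheduler.

```python
from enum import Enum


class RefreshAction(Enum):
    REFRESH_NOW = "issue REF immediately"
    FINISH_THEN_REFRESH = "finish current command, then REF"
    BLOCK_NEW_THEN_REFRESH = "finish current command, accept no new requests, then REF"
    ABORT_PRE_REFRESH = "abort read/write, issue PRE, then REF"


def arbitrate(bank_active: bool, debt: int) -> RefreshAction:
    """Pick a response to a refresh request from bank state and debt level."""
    if not bank_active:
        return RefreshAction.REFRESH_NOW          # cost: tRFC only
    if debt >= 8:
        return RefreshAction.ABORT_PRE_REFRESH    # cost: tRP + tRFC
    if debt >= 4:
        # ASSUMPTION: urgent but not critical, so let the in-flight command
        # complete and hold back new requests until the refresh is done.
        return RefreshAction.BLOCK_NEW_THEN_REFRESH
    return RefreshAction.FINISH_THEN_REFRESH      # cost: tRFC + small delay


for active, debt in [(False, 0), (True, 2), (True, 5), (True, 8)]:
    print(active, debt, arbitrate(active, debt).value)
```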
| Parameter | Symbol | Cycles (2 GHz) | Time | Description |
|---|---|---|---|---|
| Refresh interval | tREFI | 7,800 | 3.9 µs | Average interval between refresh commands (individual refreshes may be postponed) |
| Refresh cycle time (ABR) | tRFC | 440 | 220 ns | All-bank refresh unavailability window |
| Per-bank refresh cycle time | tRFCpb | 140 | 70 ns | Per-bank refresh unavailability for targeted bank |
| Max refresh postponement | tREFW (debt) | 62,400 (8 × tREFI) | 31.2 µs | Maximum accumulated refresh debt before mandatory service |
| Refresh urgent threshold | n/a | 31,200 (4 × tREFI) | 15.6 µs | Point at which scheduler must prioritize refresh |
| Port | Dir | Width | Description |
|---|---|---|---|
| i_clk | in | 1 | System clock (2 GHz) |
| i_rst_n | in | 1 | Active-low synchronous reset |
| i_ref_ack | in | 1 | Pulse: timing FSM accepted ABR refresh, start tRFC |
| i_pbr_ack | in | 1 | Pulse: per-bank refresh accepted, bank identified by i_pbr_bank |
| i_pbr_bank[4:0] | in | 5 | Which bank (0–31) was just refreshed via PBR |
| o_ref_req | out | 1 | ABR refresh request — tREFI watchdog has expired |
| o_pbr_req | out | 1 | PBR refresh request — at least one bank needs refresh |
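| o_pbr_target[4:0] | out | 5 | Lowest-numbered bank still awaiting PBR in the current interval (fixed ascending order) |
| o_refreshing | out | 1 | Blocking window active: no ACT/RD/WR allowed. In the listing below it is started by i_ref_ack and lasts tRFC; a tRFCpb window is not tracked |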
| o_ref_count[3:0] | out | 4 | Current refresh debt (0–8); 8 = critical, must service now |
| o_ref_urgent | out | 1 | Asserted when debt ≥ 4 — scheduler must prioritize refresh |
// ================================================================
// hbm3_refresh_ctrl.v
// HBM3 Refresh Controller — Phase 1 · Module 3
// tREFI watchdog, refresh debt counter, ABR/PBR scheduling,
// tRFC blocking window, per-bank refresh bitmap tracking
// Synthesizable Verilog — EcrioniX HBM3 Controller Series
// ================================================================
`timescale 1ns/1ps
`default_nettype none
module hbm3_refresh_ctrl (
input wire i_clk,
input wire i_rst_n,
// --- Refresh Acknowledgements from Timing FSM ---
input wire i_ref_ack, // ABR refresh accepted — start tRFC
input wire i_pbr_ack, // PBR accepted — clear target bank bit
input wire [4:0] i_pbr_bank, // Bank refreshed via PBR (0-31)
// --- Refresh Request Outputs to Scheduler ---
output reg o_ref_req, // ABR needed (tREFI expired)
output reg o_pbr_req, // PBR needed (any bank bit set)
output reg [4:0] o_pbr_target, // Lowest unrefreshed bank number
output reg o_refreshing, // tRFC blocking window active
output reg [3:0] o_ref_count, // Refresh debt (0-8)
output reg o_ref_urgent // Debt >= 4: scheduler must prioritize
);
// ================================================================
// Timing Parameters (2 GHz clock)
// ================================================================
localparam TREFI = 13'd7800; // 3.9 us — refresh interval
localparam TRFC = 9'd440; // 220 ns — all-bank refresh cycle
localparam TRFCPB = 8'd140; // 70 ns: per-bank refresh cycle (declared for reference; unused by the logic below)
localparam DEBT_MAX = 4'd8; // JEDEC max postpone limit
localparam DEBT_URG = 4'd4; // Urgency threshold
// ================================================================
// tREFI Watchdog Counter (13 bits, counts down from 7800)
// ================================================================
reg [12:0] r_refi_cnt;
reg r_refi_expired;
always @(posedge i_clk) begin
if (!i_rst_n) begin
r_refi_cnt <= TREFI - 1;
r_refi_expired <= 1'b0;
end else begin
r_refi_expired <= 1'b0;
if (r_refi_cnt == 13'd0) begin
r_refi_cnt <= TREFI - 1;
r_refi_expired <= 1'b1;
end else begin
r_refi_cnt <= r_refi_cnt - 13'd1;
end
end
end
// ================================================================
// Refresh Debt Counter (0 to 8)
// ================================================================
always @(posedge i_clk) begin
if (!i_rst_n) begin
o_ref_count <= 4'd0;
end else begin
if (r_refi_expired && !i_ref_ack) begin
// tREFI expired without service — accumulate debt (cap at 8)
if (o_ref_count < DEBT_MAX)
o_ref_count <= o_ref_count + 4'd1;
end else if (!r_refi_expired && i_ref_ack && o_ref_count > 4'd0) begin
// Refresh acknowledged — pay down debt
o_ref_count <= o_ref_count - 4'd1;
end
// Both expired and ack in same cycle: net zero change
end
end
// ================================================================
// ABR Request and Urgent Flags
// ================================================================
always @(posedge i_clk) begin
if (!i_rst_n) begin
o_ref_req <= 1'b0;
o_ref_urgent <= 1'b0;
end else begin
// ref_req asserted any time debt > 0 OR watchdog just fired
o_ref_req <= (o_ref_count > 4'd0) | r_refi_expired;
o_ref_urgent <= (o_ref_count >= DEBT_URG);
end
end
// ================================================================
// tRFC Blocking Window Counter
// ================================================================
reg [8:0] r_rfc_cnt;
always @(posedge i_clk) begin
if (!i_rst_n) begin
r_rfc_cnt <= 9'd0;
o_refreshing <= 1'b0;
end else begin
if (i_ref_ack && !o_refreshing) begin
// ABR accepted — start tRFC countdown
r_rfc_cnt <= TRFC - 9'd1;
o_refreshing <= 1'b1;
end else if (o_refreshing) begin
if (r_rfc_cnt == 9'd0) begin
o_refreshing <= 1'b0;
end else begin
r_rfc_cnt <= r_rfc_cnt - 9'd1;
end
end
end
end
// ================================================================
// Per-Bank Refresh Bitmap (32 banks)
// Bit[n] = 1 means bank n needs a PBR refresh this interval
// ================================================================
reg [31:0] r_pbr_bitmap;
integer pbr_i;
// Find lowest set bit (priority encoder)
reg [4:0] r_pbr_next;
always @(*) begin
r_pbr_next = 5'd31;
for (pbr_i = 30; pbr_i >= 0; pbr_i = pbr_i - 1)
if (r_pbr_bitmap[pbr_i]) r_pbr_next = pbr_i[4:0];
end
always @(posedge i_clk) begin
if (!i_rst_n) begin
r_pbr_bitmap <= 32'hFFFF_FFFF; // all banks need refresh on reset
o_pbr_req <= 1'b0;
o_pbr_target <= 5'd0;
end else begin
// On tREFI expiry, reload all bits (new refresh interval)
if (r_refi_expired)
r_pbr_bitmap <= 32'hFFFF_FFFF;
// Clear bit for bank that just received PBR
if (i_pbr_ack)
r_pbr_bitmap[i_pbr_bank] <= 1'b0;
// Update outputs
o_pbr_req <= |r_pbr_bitmap;
o_pbr_target <= r_pbr_next;
end
end
endmodule
`default_nettype wire
// ================================================================
// tb_hbm3_refresh_ctrl.sv
// Self-checking testbench for hbm3_refresh_ctrl
// 5 Tests: basic refresh, debt accumulation, critical debt (8),
// PBR sequence, tRFC blocking enforcement
// ================================================================
`timescale 1ns/1ps
module tb_hbm3_refresh_ctrl;
// ---- DUT Ports ----
logic clk, rst_n;
logic ref_ack, pbr_ack;
logic [4:0] pbr_bank;
wire ref_req, pbr_req, refreshing, ref_urgent;
wire [4:0] pbr_target;
wire [3:0] ref_count;
hbm3_refresh_ctrl dut (
.i_clk (clk),
.i_rst_n (rst_n),
.i_ref_ack (ref_ack),
.i_pbr_ack (pbr_ack),
.i_pbr_bank (pbr_bank),
.o_ref_req (ref_req),
.o_pbr_req (pbr_req),
.o_pbr_target(pbr_target),
.o_refreshing(refreshing),
.o_ref_count (ref_count),
.o_ref_urgent(ref_urgent)
);
// 2 GHz clock: 0.5 ns period
initial clk = 0;
always #0.25 clk = ~clk;
// ---- Helpers ----
int pass_cnt = 0, fail_cnt = 0;
task check(input string name, input logic got, input logic exp);
if (got === exp) begin
$display(" PASS %s", name); pass_cnt++;
end else begin
$display(" FAIL %s: got=%0b exp=%0b", name, got, exp); fail_cnt++;
end
endtask
task tick(input int n = 1);
repeat(n) @(posedge clk);
#0.1;
endtask
task reset_dut;
rst_n = 0; ref_ack = 0; pbr_ack = 0; pbr_bank = 0;
tick(4);
rst_n = 1;
tick(2);
endtask
// ================================================================
// SVA Assertions
// ================================================================
// ref_count must never exceed 8
property debt_cap;
@(posedge clk) disable iff (!rst_n)
ref_count <= 4'd8;
endproperty
assert property (debt_cap) else $error("SVA FAIL: ref_count exceeded 8");
// o_ref_urgent is registered from o_ref_count, so it lags by one cycle:
// ref_urgent must be 1 on the cycle after ref_count >= 4 is sampled
property urgent_correct;
@(posedge clk) disable iff (!rst_n)
(ref_count >= 4'd4) |=> ref_urgent;
endproperty
assert property (urgent_correct) else $error("SVA FAIL: ref_urgent not set one cycle after count>=4");
// refreshing must de-assert within 441 cycles of ref_ack
property rfc_window;
@(posedge clk) disable iff (!rst_n)
$rose(ref_ack) |-> ##[440:441] !refreshing;
endproperty
assert property (rfc_window) else $error("SVA FAIL: tRFC window wrong duration");
// ================================================================
// TEST 1 — Basic ABR Refresh Cycle
// ================================================================
task test1_basic_refresh;
$display("\n[TEST 1] Basic ABR Refresh Cycle");
reset_dut();
// Wait for watchdog to expire (7800 cycles)
tick(7800);
tick(2); // propagation
check("ref_req after tREFI", ref_req, 1'b1);
// Acknowledge refresh
ref_ack = 1; tick(1); ref_ack = 0;
tick(2);
check("refreshing after ack", refreshing, 1'b1);
check("ref_count after ack", ref_count == 4'd0, 1'b1);
// Wait for tRFC to expire
tick(440);
check("refreshing clear after tRFC", refreshing, 1'b0);
endtask
// ================================================================
// TEST 2 — Debt Accumulation to 4 (Urgent)
// ================================================================
task test2_debt_to_4;
$display("\n[TEST 2] Debt Accumulation to 4 (Urgent)");
reset_dut();
// Miss 4 refresh intervals
repeat(4) begin
tick(7800);
tick(2); // let counter reload
end
tick(2);
check("ref_count == 4", ref_count == 4'd4, 1'b1);
check("ref_urgent asserted", ref_urgent, 1'b1);
// Service one refresh
ref_ack = 1; tick(1); ref_ack = 0;
tick(2);
check("debt decremented to 3", ref_count == 4'd3, 1'b1);
check("ref_urgent cleared", ref_urgent, 1'b0);
endtask
// ================================================================
// TEST 3 — Critical Debt (8 Postponements)
// ================================================================
task test3_debt_critical;
$display("\n[TEST 3] Critical Debt — 8 Postponements");
reset_dut();
// Miss 9 refresh intervals: the 9th expiry must leave the debt capped at 8
repeat(9) begin
tick(7800);
tick(2);
end
tick(2);
check("ref_count capped at 8", ref_count == 4'd8, 1'b1);
check("ref_urgent critical", ref_urgent, 1'b1);
check("ref_req asserted", ref_req, 1'b1);
endtask
// ================================================================
// TEST 4 — PBR Sequence (3 banks)
// ================================================================
task test4_pbr_sequence;
$display("\n[TEST 4] PBR Bank Sequence");
reset_dut();
tick(5);
check("PBR bitmap full on reset", pbr_req, 1'b1);
check("PBR target starts at 0", pbr_target == 5'd0, 1'b1);
// Refresh bank 0
pbr_bank = 5'd0; pbr_ack = 1; tick(1); pbr_ack = 0;
tick(2);
check("target advances to 1", pbr_target == 5'd1, 1'b1);
// Refresh banks 1 and 2
pbr_bank = 5'd1; pbr_ack = 1; tick(1); pbr_ack = 0; tick(2);
pbr_bank = 5'd2; pbr_ack = 1; tick(1); pbr_ack = 0; tick(2);
check("target at 3 after 0-2 done", pbr_target == 5'd3, 1'b1);
check("pbr_req still set", pbr_req, 1'b1);
endtask
// ================================================================
// TEST 5 — tRFC Blocking Window
// ================================================================
task test5_rfc_blocking;
$display("\n[TEST 5] tRFC Blocking Window Duration");
reset_dut();
tick(7800); tick(2);
ref_ack = 1; tick(1); ref_ack = 0;
tick(1);
check("refreshing starts immediately", refreshing, 1'b1);
tick(438); // 1 + 438 = 439 cycles into tRFC
check("refreshing still high at 439", refreshing, 1'b1);
tick(1); // cycle 440
check("refreshing clears at 440", refreshing, 1'b0);
endtask
// ================================================================
// Main
// ================================================================
initial begin
$dumpfile("dump.vcd"); $dumpvars(0, tb_hbm3_refresh_ctrl);
test1_basic_refresh();
test2_debt_to_4();
test3_debt_critical();
test4_pbr_sequence();
test5_rfc_blocking();
$display("\n========================================");
$display(" RESULTS: %0d PASS / %0d FAIL", pass_cnt, fail_cnt);
$display("========================================");
$finish;
end
endmodule
DRAM stores data as charge on leaky capacitors. Without refresh, the charge decays below the sense amplifier threshold within 32–64 ms at room temperature. The controller must issue a REFRESH command on average once per tREFI (3.9 µs in HBM3) so that every row is rewritten before the charge is lost.
ABR (All-Bank Refresh) refreshes all 32 banks simultaneously in tRFC=440 cycles (220 ns). The entire pseudo-channel stalls. PBR (Per-Bank Refresh) targets one bank in tRFCpb=140 cycles (70 ns), leaving 31 banks accessible. PBR reduces average memory unavailability by 3.1× but requires per-bank tracking and scheduling across 32 banks per tREFI interval.
Refresh debt is the count of tREFI intervals that passed without a refresh being serviced. JEDEC allows up to 8 consecutive refresh operations to be postponed so the scheduler can delay refresh during critical bursts (issuing refreshes early is the separate "pull-in" case). Beyond 8 postponements, retention time is exceeded and data corruption becomes possible, so the debt is hard-capped at 8 and the controller asserts urgent override signals.
After i_ref_ack is asserted, all 32 banks (in ABR mode) or the targeted bank (in PBR mode) are unavailable for tRFC/tRFCpb cycles. The o_refreshing signal gates the scheduler: no ACT, RD, WR, or PRE can be issued while it is high. In the module as listed, o_refreshing is started only by i_ref_ack and counts tRFC; the tRFCpb window for a single bank is not tracked there. The tREFI watchdog continues counting during tRFC, so the next deadline is tracked correctly even while the array is busy.
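The length of the blocking window can be checked with a cycle-level model. The Python sketch below mirrors the start-on-ack, count-down-to-zero behavior described for the ABR window; the class name `RfcWindow` is hypothetical and the model is behavioral, not RTL.

```python
class RfcWindow:
    """Behavioral model of the tRFC blocking window (hypothetical helper)."""

    def __init__(self, trfc: int = 440) -> None:
        self.trfc = trfc
        self.count = 0
        self.refreshing = False

    def tick(self, ack: bool) -> bool:
        """Advance one clock edge; return the refreshing flag after that edge."""
        if ack and not self.refreshing:
            # Refresh accepted: open the window and load the countdown.
            self.count = self.trfc - 1
            self.refreshing = True
        elif self.refreshing:
            if self.count == 0:
                self.refreshing = False   # window closes
            else:
                self.count -= 1
        return self.refreshing


win = RfcWindow()
trace = [win.tick(ack=True)] + [win.tick(ack=False) for _ in range(449)]
print(sum(trace))  # number of cycles the flag was high
```

An ack that arrives while the window is already open is ignored by this model, so back-to-back refreshes need the scheduler to wait for the flag to drop first.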
The refresh controller maintains a 32-bit bitmap where each bit represents one bank. After every tREFI expiry, all 32 bits are set (every bank needs refresh). When a bank receives PBR service (i_pbr_ack + i_pbr_bank), its bit is cleared. The priority encoder selects the lowest-numbered bank with its bit still set, so banks are serviced in a fixed ascending order (bank 0, 1, 2, ... 31). No bank starves as long as all 32 are serviced within each tREFI interval; if the bitmap is reloaded before the sweep finishes, selection restarts at bank 0.
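The bitmap and its lowest-set-bit selection can be modeled with integer bit operations. The Python sketch below is illustrative only: the helper names are hypothetical, and it returns `None` for an empty bitmap, whereas a hardware encoder has to drive some default bank number.

```python
from typing import Optional

NUM_BANKS = 32
ALL_BANKS = (1 << NUM_BANKS) - 1   # every bank pending


def lowest_pending_bank(bitmap: int) -> Optional[int]:
    """Index of the lowest set bit, or None if no bank is pending."""
    if bitmap == 0:
        return None
    # bitmap & -bitmap isolates the lowest set bit.
    return (bitmap & -bitmap).bit_length() - 1


def clear_bank(bitmap: int, bank: int) -> int:
    """Mark one bank as refreshed."""
    return bitmap & ~(1 << bank) & ALL_BANKS


# Sweep a full interval: banks come out in ascending order.
bitmap = ALL_BANKS
order = []
while bitmap:
    bank = lowest_pending_bank(bitmap)
    order.append(bank)
    bitmap = clear_bank(bitmap, bank)
print(order[:4], "...", order[-1])

# Reloading before the sweep finishes restarts selection at bank 0.
partial = clear_bank(clear_bank(ALL_BANKS, 0), 1)
print(lowest_pending_bank(partial), lowest_pending_bank(ALL_BANKS))
```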