Phase 2 · Module 7 of 12

HBM3 ECC Engine (SECDED)

Single Error Correct, Double Error Detect: 32-bit data protected by 6 Hamming parity bits plus one overall parity bit, forming a 39-bit inline codeword. Syndrome decoder, automatic bit-flip correction, and double-error flag in synthesizable Verilog.


1. Why HBM3 Needs ECC

DRAM cells are capacitors. They leak charge, they can be struck by alpha particles from packaging materials, and cosmic-ray neutrons cause soft errors at a measurable rate. The metric used here is BER (Bit Error Rate): for modern HBM3 at 64 Gb it sits around 10^-12 to 10^-14 raw errors per bit per hour. That sounds tiny until you have a GPU with 96 GB of HBM3 (about 7.7 × 10^11 bits): at 10^-12 you statistically see a raw bit flip roughly once an hour of continuous use.
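
The arithmetic behind that estimate can be checked directly: the expected number of raw flips per hour is the bit count times the per-bit rate. The sketch below assumes decimal gigabytes and the rates quoted above; the function name is purely illustrative.

```python
def expected_flips_per_hour(capacity_gb: float, ber_per_bit_hour: float) -> float:
    """Expected raw bit flips per hour for a memory of capacity_gb gigabytes."""
    bits = capacity_gb * 1e9 * 8          # assumption: decimal gigabytes
    return bits * ber_per_bit_hour


for rate in (1e-12, 1e-14):
    flips = expected_flips_per_hour(96, rate)
    print(f"rate={rate:g}: {flips:.5f} flips/hour, one every {1 / flips:.1f} hours")
```

At 10^-12 this gives about 0.77 expected flips per hour, or one roughly every 1.3 hours; at 10^-14 the interval stretches to roughly 130 hours.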

Without ECC that flip is silent data corruption. With SECDED ECC the controller catches and corrects it in hardware, transparently. The application never sees the error; the OS may log a corrected-error counter.

HBM3 uses inline ECC. Unlike DDR5, which adds separate ECC DRAM devices to the bus, HBM3 stores its ECC bits in the same die array as data. Every 32-bit data burst carried over a pseudo-channel also carries 7 ECC bits, forming a 39-bit codeword. The PHY reads all 39 bits and passes them to this engine before forwarding corrected data to the memory subsystem.

Key numbers: 32 data bits + 7 ECC bits = 39 bits total per pseudo-channel per access. ECC overhead = 7/39 ≈ 18%. This is the price HBM3 pays for hardware-corrected reliability.

2. SECDED Theory — Hamming Distance

The Hamming distance between two binary strings is the number of bit positions where they differ. Error-correcting codes work by ensuring every valid codeword is sufficiently far from every other valid codeword.

SECDED requires d=4. The key insight: with d=4, a single-bit error moves the received word to distance 1 from exactly one valid codeword (the original) and distance 3+ from all others — so correction is unambiguous. A double-bit error moves the received word to distance 2 from the nearest valid codewords — not correctable, but detectable because it cannot be confused with a single-bit error from another codeword.
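
To make the distance argument concrete, here is a minimal sketch of the Hamming distance computation on integers. The example words are arbitrary; the only property used is that the all-zero word is a valid codeword of any linear code.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


codeword = 0                          # all-zero word, valid in any linear code
one_flip = codeword ^ 0b0000100       # single-bit error
two_flips = codeword ^ 0b0010100      # double-bit error

print(hamming_distance(codeword, one_flip))
print(hamming_distance(codeword, two_flips))
```

A decoder for a d=4 code can always trace the first word back to its codeword, while the second can only be flagged.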

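The extra bit that pushes from d=3 to d=4 is the overall parity bit P7, an XOR of all 38 other bits in the codeword. After decoding there are four cases. If syndrome = 0 and overall parity is even → no error. If syndrome ≠ 0 and overall parity is odd → single-bit error, correct it. If syndrome = 0 and overall parity is odd → the overall parity bit itself flipped; the data is intact. If syndrome ≠ 0 and overall parity is even → double-bit error, flag it uncorrectable.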
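
The decision rule can be written as a small truth-table function. This is a sketch, not the RTL: the names are hypothetical, and it includes the corner case where the syndrome is zero but overall parity is odd, which is conventionally read as a flip of the overall parity bit itself.

```python
def classify(syndrome: int, overall_parity_odd: bool) -> str:
    """Classify a received word from its Hamming syndrome and overall parity."""
    if syndrome == 0 and not overall_parity_odd:
        return "no_error"
    if overall_parity_odd:
        # syndrome == 0 here means the overall parity bit itself flipped
        return "single_error"
    return "double_error"             # syndrome != 0 with even overall parity


for s, odd in [(0, False), (5, True), (0, True), (3, False)]:
    print(s, odd, classify(s, odd))
```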

3. Hamming Code Construction

Parity Bit Positions (1-indexed)

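In a Hamming code, parity bits occupy positions that are powers of 2: 1, 2, 4, 8, 16, 32 (the next power, 64, falls outside a 39-bit codeword). Every other position up to 38 carries a data bit, and position 39 holds the overall parity bit. For our 39-bit codeword the positions are: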

Codeword Position   Role             Label
1                   Parity           P1
2                   Parity           P2
3                   Data             D0
4                   Parity           P4
5                   Data             D1
6                   Data             D2
7                   Data             D3
8                   Parity           P8
9–15                Data             D4–D10
16                  Parity           P16
17–31               Data             D11–D25
32                  Parity           P32
33–38               Data             D26–D31
39                  Overall parity   P7 (P_all)
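
The position table follows mechanically from the power-of-two rule, so it can be regenerated rather than memorized. The sketch below is an illustrative helper (the function name and string labels are assumptions chosen to match the table).

```python
def codeword_layout(n_positions: int = 39) -> dict:
    """Map each 1-indexed codeword position to its role label."""
    layout = {}
    data_index = 0
    for pos in range(1, n_positions):          # positions 1..38
        if pos & (pos - 1) == 0:               # power of two → Hamming parity slot
            layout[pos] = f"P{pos}"
        else:                                  # everything else carries data
            layout[pos] = f"D{data_index}"
            data_index += 1
    layout[n_positions] = "P_all"              # last position: overall parity
    return layout


for pos, label in codeword_layout().items():
    print(f"position {pos:2d} -> cw[{pos - 1}] : {label}")
```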

Which Bits Each Parity Covers

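Each Hamming parity bit sits at a power-of-two position 2^j and covers every codeword position (1–38) whose binary representation has bit j set:

P1 (bit 0): positions 1, 3, 5, ..., 37
P2 (bit 1): positions 2–3, 6–7, 10–11, 14–15, 18–19, 22–23, 26–27, 30–31, 34–35, 38
P4 (bit 2): positions 4–7, 12–15, 20–23, 28–31, 36–38
P8 (bit 3): positions 8–15, 24–31
P16 (bit 4): positions 16–31
P32 (bit 5): positions 32–38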

Each parity bit is set so that the XOR of all bits it covers (including itself) equals zero in a correct codeword. During decoding, re-computing these XORs over the received data gives the syndrome.
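
The coverage rule is easy to enumerate in software. The following sketch lists, for each Hamming parity position, the codeword positions it checks; the helper name is hypothetical.

```python
def covered_positions(parity_pos: int, last_pos: int = 38) -> list:
    """Codeword positions (1-indexed) checked by the parity bit at parity_pos."""
    return [p for p in range(1, last_pos + 1) if p & parity_pos]


for k in range(6):
    p = 1 << k                                  # 1, 2, 4, 8, 16, 32
    print(f"P{p}: {covered_positions(p)}")
```

Each list includes the parity position itself, which is why the XOR over a full list is zero for a valid codeword.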

4. Data Flow Diagram

[Figure: data flow. Write path: DATA IN i_enc_data[31:0] (32 bits) → ENCODER (XOR tree, 7 parity bits) → CODEWORD o_enc_codeword[38:0] (39 bits) → DRAM. Read path: DRAM → SYNDROME CHECK (7-bit syndrome, XOR of received vs. calculated parity) → CORRECTED o_dec_data[31:0] plus SEC/DED flags. Stage labels: encode, write, read, correct.]

5. Encoder Logic

The encoder takes i_enc_data[31:0] and computes 6 Hamming parity bits plus one overall parity bit. Each of the 6 parity bits is the XOR of specific data bits — those whose codeword positions (after inserting parity slots) have the corresponding power-of-2 bit set.

With data bits D0–D31 placed in the 32 non-parity slots of positions 1–38, each parity bit covers the following data bits:

Parity Bit   Position   Data bits covered (D-index)
P1           1          D0,D1,D3,D4,D6,D8,D10,D11,D13,D15,D17,D19,D21,D23,D25,D26,D28,D30
P2           2          D0,D2,D3,D5,D6,D9,D10,D12,D13,D16,D17,D20,D21,D24,D25,D27,D28,D31
P4           4          D1,D2,D3,D7,D8,D9,D10,D14,D15,D16,D17,D22,D23,D24,D25,D29,D30,D31
P8           8          D4,D5,D6,D7,D8,D9,D10,D18,D19,D20,D21,D22,D23,D24,D25
P16          16         D11,D12,D13,D14,D15,D16,D17,D18,D19,D20,D21,D22,D23,D24,D25
P32          32         D26,D27,D28,D29,D30,D31
P_all        39         XOR of all 38 bits (P1..P32 + D0..D31)

The 39-bit output codeword layout places parity bits at their power-of-2 positions and data bits at the remaining positions. In zero-indexed o_enc_codeword[38:0]: index 0 = position 1 (P1), index 1 = position 2 (P2), index 2 = position 3 (D0), index 3 = position 4 (P4), and so on.
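
A bit-accurate software model is a convenient way to cross-check the encoder tables. The sketch below is a hypothetical Python reference model, not part of the Verilog source; it assumes the same layout as described above, with bit index i of the returned integer holding codeword position i + 1.

```python
def secded_encode(data: int) -> int:
    """Return a 39-bit SECDED codeword for a 32-bit data word."""
    assert 0 <= data < 1 << 32
    cw = 0
    d = 0
    for pos in range(1, 39):
        if pos & (pos - 1):                      # not a power of two → data slot
            cw |= ((data >> d) & 1) << (pos - 1)
            d += 1
    for k in range(6):                           # Hamming parity at 1, 2, 4, 8, 16, 32
        p = 1 << k
        parity = 0
        for pos in range(1, 39):
            if pos & p:                          # position is covered by this parity bit
                parity ^= (cw >> (pos - 1)) & 1
        cw |= parity << (p - 1)
    overall = bin(cw).count("1") & 1             # XOR of the 38 lower bits
    return cw | (overall << 38)


print(f"{secded_encode(0xDEADBEEF):039b}")
```

Because the parity slots are still zero when each parity bit is computed, every XOR runs over the covered data bits only, matching the per-parity data lists in the table.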

6. Full Verilog Source

Verilog — hbm3_ecc_engine.v
// ============================================================
// hbm3_ecc_engine.v
// SECDED (Single Error Correct, Double Error Detect) ECC
// for HBM3 pseudo-channel: 32-bit data + 7 parity = 39-bit
// codeword.  Inline ECC — no separate ECC DRAM needed.
// Phase 2, Module 7 — EcrioniX HBM3 Controller Series
// ============================================================
module hbm3_ecc_engine (
    input  wire        i_clk,
    input  wire        i_rst_n,

    // Encode path
    input  wire        i_enc_valid,          // encode request
    input  wire [31:0] i_enc_data,           // 32-bit data in
    output reg  [38:0] o_enc_codeword,       // 39-bit codeword out

    // Decode path
    input  wire        i_dec_valid,          // decode request
    input  wire [38:0] i_dec_codeword,       // received 39-bit word
    output reg  [31:0] o_dec_data,           // corrected data
    output reg         o_single_err,         // 1 = 1-bit error corrected
    output reg         o_double_err,         // 1 = 2-bit error, uncorrectable
    output reg  [5:0]  o_err_pos             // bit position of error (0=none)
);

// ─────────────────────────────────────────────────────────────
// Internal wires
// ─────────────────────────────────────────────────────────────
// Codeword layout (1-indexed position → 0-indexed array index):
// Pos 1  → cw[0]  : P1
// Pos 2  → cw[1]  : P2
// Pos 3  → cw[2]  : D[0]
// Pos 4  → cw[3]  : P4
// Pos 5  → cw[4]  : D[1]
// Pos 6  → cw[5]  : D[2]
// Pos 7  → cw[6]  : D[3]
// Pos 8  → cw[7]  : P8
// Pos 9  → cw[8]  : D[4]
// Pos 10 → cw[9]  : D[5]
// Pos 11 → cw[10] : D[6]
// Pos 12 → cw[11] : D[7]
// Pos 13 → cw[12] : D[8]
// Pos 14 → cw[13] : D[9]
// Pos 15 → cw[14] : D[10]
// Pos 16 → cw[15] : P16
// Pos 17 → cw[16] : D[11]
// Pos 18 → cw[17] : D[12]
// Pos 19 → cw[18] : D[13]
// Pos 20 → cw[19] : D[14]
// Pos 21 → cw[20] : D[15]
// Pos 22 → cw[21] : D[16]
// Pos 23 → cw[22] : D[17]
// Pos 24 → cw[23] : D[18]
// Pos 25 → cw[24] : D[19]
// Pos 26 → cw[25] : D[20]
// Pos 27 → cw[26] : D[21]
// Pos 28 → cw[27] : D[22]
// Pos 29 → cw[28] : D[23]
// Pos 30 → cw[29] : D[24]
// Pos 31 → cw[30] : D[25]
// Pos 32 → cw[31] : P32
// Pos 33 → cw[32] : D[26]
// Pos 34 → cw[33] : D[27]
// Pos 35 → cw[34] : D[28]
// Pos 36 → cw[35] : D[29]
// Pos 37 → cw[36] : D[30]
// Pos 38 → cw[37] : D[31]
// Pos 39 → cw[38] : P_all (overall parity)

// ─────────────────────────────────────────────────────────────
// ENCODER
// ─────────────────────────────────────────────────────────────
wire [5:0]  enc_p;      // 6 Hamming parity bits
wire        enc_pall;   // overall parity
wire [37:0] cw_pre;     // codeword before overall parity

// P1: covers positions with bit0=1: 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37
// → data bits at those positions: D0,D1,D3,D4,D6,D8,D10,D11,D13,D15,D17,D19,D21,D23,D25,D26,D28,D30
assign enc_p[0] = i_enc_data[0]  ^ i_enc_data[1]  ^ i_enc_data[3]  ^ i_enc_data[4]  ^
                   i_enc_data[6]  ^ i_enc_data[8]  ^ i_enc_data[10] ^ i_enc_data[11] ^
                   i_enc_data[13] ^ i_enc_data[15] ^ i_enc_data[17] ^ i_enc_data[19] ^
                   i_enc_data[21] ^ i_enc_data[23] ^ i_enc_data[25] ^ i_enc_data[26] ^
                   i_enc_data[28] ^ i_enc_data[30];

// P2: covers positions with bit1=1: 2,3,6,7,10,11,14,15,18,19,22,23,26,27,30,31,34,35,38
assign enc_p[1] = i_enc_data[0]  ^ i_enc_data[2]  ^ i_enc_data[3]  ^ i_enc_data[5]  ^
                   i_enc_data[6]  ^ i_enc_data[9]  ^ i_enc_data[10] ^ i_enc_data[12] ^
                   i_enc_data[13] ^ i_enc_data[16] ^ i_enc_data[17] ^ i_enc_data[20] ^
                   i_enc_data[21] ^ i_enc_data[24] ^ i_enc_data[25] ^ i_enc_data[27] ^
                   i_enc_data[28] ^ i_enc_data[31];

// P4: covers positions with bit2=1: 4-7,12-15,20-23,28-31,36-38
assign enc_p[2] = i_enc_data[1]  ^ i_enc_data[2]  ^ i_enc_data[3]  ^ i_enc_data[7]  ^
                   i_enc_data[8]  ^ i_enc_data[9]  ^ i_enc_data[10] ^ i_enc_data[14] ^
                   i_enc_data[15] ^ i_enc_data[16] ^ i_enc_data[17] ^ i_enc_data[22] ^
                   i_enc_data[23] ^ i_enc_data[24] ^ i_enc_data[25] ^ i_enc_data[29] ^
                   i_enc_data[30] ^ i_enc_data[31];

// P8: covers positions with bit3=1: 8-15,24-31
assign enc_p[3] = i_enc_data[4]  ^ i_enc_data[5]  ^ i_enc_data[6]  ^ i_enc_data[7]  ^
                   i_enc_data[8]  ^ i_enc_data[9]  ^ i_enc_data[10] ^ i_enc_data[18] ^
                   i_enc_data[19] ^ i_enc_data[20] ^ i_enc_data[21] ^ i_enc_data[22] ^
                   i_enc_data[23] ^ i_enc_data[24] ^ i_enc_data[25];

// P16: covers positions with bit4=1: 16-31
assign enc_p[4] = i_enc_data[11] ^ i_enc_data[12] ^ i_enc_data[13] ^ i_enc_data[14] ^
                   i_enc_data[15] ^ i_enc_data[16] ^ i_enc_data[17] ^ i_enc_data[18] ^
                   i_enc_data[19] ^ i_enc_data[20] ^ i_enc_data[21] ^ i_enc_data[22] ^
                   i_enc_data[23] ^ i_enc_data[24] ^ i_enc_data[25];

// P32: covers positions with bit5=1: 32-38
assign enc_p[5] = i_enc_data[26] ^ i_enc_data[27] ^ i_enc_data[28] ^ i_enc_data[29] ^
                   i_enc_data[30] ^ i_enc_data[31];

// Assemble pre-parity codeword (no P_all yet)
assign cw_pre = {
    i_enc_data[31], i_enc_data[30], i_enc_data[29], i_enc_data[28],
    i_enc_data[27], i_enc_data[26], enc_p[5],
    i_enc_data[25], i_enc_data[24], i_enc_data[23], i_enc_data[22],
    i_enc_data[21], i_enc_data[20], i_enc_data[19], i_enc_data[18],
    i_enc_data[17], i_enc_data[16], i_enc_data[15], i_enc_data[14],
    i_enc_data[13], i_enc_data[12], i_enc_data[11], enc_p[4],
    i_enc_data[10], i_enc_data[9],  i_enc_data[8],  i_enc_data[7],
    i_enc_data[6],  i_enc_data[5],  i_enc_data[4],  enc_p[3],
    i_enc_data[3],  i_enc_data[2],  i_enc_data[1],  enc_p[2],
    i_enc_data[0],  enc_p[1],       enc_p[0]
};

// Overall parity = XOR of all 38 bits
assign enc_pall = ^cw_pre;

always @(posedge i_clk or negedge i_rst_n) begin
    if (!i_rst_n)
        o_enc_codeword <= 39'd0;
    else if (i_enc_valid)
        o_enc_codeword <= {enc_pall, cw_pre};
end

// ─────────────────────────────────────────────────────────────
// DECODER — syndrome computation
// ─────────────────────────────────────────────────────────────
wire [5:0]  synd;           // 6-bit Hamming syndrome
wire        synd_pall;      // overall parity of received word
wire        any_err;
wire        sec;            // single error correctable
wire        ded;            // double error detected
wire [37:0] rx_cw;          // received codeword bits 37:0
wire        rx_pall;        // received overall parity bit (cw[38])

assign rx_cw   = i_dec_codeword[37:0];
assign rx_pall = i_dec_codeword[38];

// Recompute Hamming parities over received bits, XOR with received parity bits
assign synd[0] = rx_cw[0] ^ rx_cw[2] ^ rx_cw[4] ^ rx_cw[6] ^ rx_cw[8]  ^
                  rx_cw[10] ^ rx_cw[12] ^ rx_cw[14] ^ rx_cw[16] ^ rx_cw[18] ^
                  rx_cw[20] ^ rx_cw[22] ^ rx_cw[24] ^ rx_cw[26] ^ rx_cw[28] ^
                  rx_cw[30] ^ rx_cw[32] ^ rx_cw[34] ^ rx_cw[36];

assign synd[1] = rx_cw[1] ^ rx_cw[2] ^ rx_cw[5] ^ rx_cw[6] ^ rx_cw[9]  ^
                  rx_cw[10] ^ rx_cw[13] ^ rx_cw[14] ^ rx_cw[17] ^ rx_cw[18] ^
                  rx_cw[21] ^ rx_cw[22] ^ rx_cw[25] ^ rx_cw[26] ^ rx_cw[29] ^
                  rx_cw[30] ^ rx_cw[33] ^ rx_cw[34] ^ rx_cw[37];

assign synd[2] = rx_cw[3] ^ rx_cw[4] ^ rx_cw[5] ^ rx_cw[6] ^ rx_cw[11] ^
                  rx_cw[12] ^ rx_cw[13] ^ rx_cw[14] ^ rx_cw[19] ^ rx_cw[20] ^
                  rx_cw[21] ^ rx_cw[22] ^ rx_cw[27] ^ rx_cw[28] ^ rx_cw[29] ^
                  rx_cw[30] ^ rx_cw[35] ^ rx_cw[36] ^ rx_cw[37];

assign synd[3] = rx_cw[7]  ^ rx_cw[8]  ^ rx_cw[9]  ^ rx_cw[10] ^ rx_cw[11] ^
                  rx_cw[12] ^ rx_cw[13] ^ rx_cw[14] ^ rx_cw[23] ^ rx_cw[24] ^
                  rx_cw[25] ^ rx_cw[26] ^ rx_cw[27] ^ rx_cw[28] ^ rx_cw[29] ^ rx_cw[30];

assign synd[4] = rx_cw[15] ^ rx_cw[16] ^ rx_cw[17] ^ rx_cw[18] ^ rx_cw[19] ^
                  rx_cw[20] ^ rx_cw[21] ^ rx_cw[22] ^ rx_cw[23] ^ rx_cw[24] ^
                  rx_cw[25] ^ rx_cw[26] ^ rx_cw[27] ^ rx_cw[28] ^ rx_cw[29] ^ rx_cw[30];

assign synd[5] = rx_cw[31] ^ rx_cw[32] ^ rx_cw[33] ^ rx_cw[34] ^
                  rx_cw[35] ^ rx_cw[36] ^ rx_cw[37];

// Overall parity check (XOR of all 39 received bits)
assign synd_pall = ^i_dec_codeword;  // should be 0 if no error or even-count error

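assign any_err = |synd;
// Odd overall parity → single-bit error. If synd is also 0, the flipped bit is
// P_all itself (cw[38]): the data needs no correction, but the error is reported.
assign sec     = synd_pall;
assign ded     = any_err & ~synd_pall;   // even parity + syndrome → double error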

// ─────────────────────────────────────────────────────────────
// Error correction — flip the bit at position synd (1-indexed)
// synd is the error position in the 1-indexed Hamming space
// Map it back to data bit index
// ─────────────────────────────────────────────────────────────
wire [38:0] corrected_cw;
genvar gi;
generate
    for (gi = 0; gi < 39; gi = gi + 1) begin : g_flip
        assign corrected_cw[gi] = (sec && (synd == (gi+1))) ?
                                   ~i_dec_codeword[gi] : i_dec_codeword[gi];
    end
endgenerate

// Extract data bits from corrected codeword
wire [31:0] ext_data;
assign ext_data = {
    corrected_cw[37], corrected_cw[36], corrected_cw[35], corrected_cw[34],
    corrected_cw[33], corrected_cw[32], corrected_cw[30], corrected_cw[29],
    corrected_cw[28], corrected_cw[27], corrected_cw[26], corrected_cw[25],
    corrected_cw[24], corrected_cw[23], corrected_cw[22], corrected_cw[21],
    corrected_cw[20], corrected_cw[19], corrected_cw[18], corrected_cw[17],
    corrected_cw[16], corrected_cw[14], corrected_cw[13], corrected_cw[12],
    corrected_cw[11], corrected_cw[10], corrected_cw[9],  corrected_cw[8],
    corrected_cw[6],  corrected_cw[5],  corrected_cw[4],  corrected_cw[2]
};

always @(posedge i_clk or negedge i_rst_n) begin
    if (!i_rst_n) begin
        o_dec_data   <= 32'd0;
        o_single_err <= 1'b0;
        o_double_err <= 1'b0;
        o_err_pos    <= 6'd0;
    end else if (i_dec_valid) begin
        o_dec_data   <= ext_data;
        o_single_err <= sec;
        o_double_err <= ded;
        o_err_pos    <= sec ? synd : 6'd0;
    end
end

endmodule

7. Syndrome Table (First 16 Entries)

The 6-bit syndrome value directly equals the 1-indexed error position in the codeword. Syndrome 0 = no error. When the overall parity check is also odd, the syndrome gives the exact position to flip.

Syndrome [5:0]   Error Position   Codeword Bit   Maps To
000000           None             -              No error
000001           1                cw[0]          P1 (parity bit)
000010           2                cw[1]          P2 (parity bit)
000011           3                cw[2]          D[0]
000100           4                cw[3]          P4 (parity bit)
000101           5                cw[4]          D[1]
000110           6                cw[5]          D[2]
000111           7                cw[6]          D[3]
001000           8                cw[7]          P8 (parity bit)
001001           9                cw[8]          D[4]
001010           10               cw[9]          D[5]
001011           11               cw[10]         D[6]
001100           12               cw[11]         D[7]
001101           13               cw[12]         D[8]
001110           14               cw[13]         D[9]
001111           15               cw[14]         D[10]
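
The syndrome-to-position mapping can be exercised with a small software decoder. This is a hypothetical Python sketch under the same layout assumptions (bit index i holds position i + 1). Two choices are conventions of this sketch rather than documented behavior of the RTL: it reports position 39 when the overall parity bit itself flipped, and it marks odd-parity syndromes above 38 as uncorrectable.

```python
DATA_POSITIONS = [p for p in range(1, 39) if p & (p - 1)]   # 32 non-power-of-two slots


def secded_decode(cw: int):
    """Return (data, status, err_pos) for a received 39-bit codeword."""
    syndrome = 0
    for pos in range(1, 39):
        if (cw >> (pos - 1)) & 1:
            syndrome ^= pos                     # XOR of the positions holding a 1
    parity_odd = bin(cw).count("1") & 1         # overall check across all 39 bits
    if parity_odd and syndrome > 38:
        status, err_pos = "uncorrectable", 0    # assumption: 3+ flips, no valid position
    elif parity_odd:
        status = "single"
        err_pos = syndrome if syndrome else 39  # zero syndrome → P_all itself flipped
        if syndrome:
            cw ^= 1 << (syndrome - 1)           # flip the faulty bit back
    elif syndrome:
        status, err_pos = "double", 0
    else:
        status, err_pos = "ok", 0
    data = 0
    for i, pos in enumerate(DATA_POSITIONS):
        data |= ((cw >> (pos - 1)) & 1) << i
    return data, status, err_pos


good = (1 << 38) | 0b111                        # codeword for data = 1 (D0 at position 3)
print(secded_decode(good))
print(secded_decode(good ^ (1 << 2)))           # flip position 3, i.e. D[0]
print(secded_decode(good ^ 0b11))               # flip positions 1 and 2
```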

8. Port Reference Table

Port             Dir   Width   Description
i_clk            In    1       System clock
i_rst_n          In    1       Active-low asynchronous reset
i_enc_valid      In    1       Encode request: encode i_enc_data this cycle
i_enc_data       In    32      Raw 32-bit data to protect
o_enc_codeword   Out   39      39-bit SECDED codeword (data + 7 ECC bits), registered
i_dec_valid      In    1       Decode request: evaluate i_dec_codeword this cycle
i_dec_codeword   In    39      Received codeword from DRAM (may have errors)
o_dec_data       Out   32      Corrected data output, registered
o_single_err     Out   1       High when a single-bit error was detected and corrected
o_double_err     Out   1       High when a double-bit error is detected (uncorrectable)
o_err_pos        Out   6       Codeword position (1–38) of the corrected bit; 0 otherwise

9. SystemVerilog Testbench

SystemVerilog — tb_hbm3_ecc_engine.sv
// ============================================================
// tb_hbm3_ecc_engine.sv
// Self-checking testbench with SVA assertions
// Tests: no-error, single-bit error at every position, DED
// ============================================================
`timescale 1ns/1ps
module tb_hbm3_ecc_engine;

logic        clk, rst_n;
logic        enc_valid;
logic [31:0] enc_data;
logic [38:0] enc_cw;

logic        dec_valid;
logic [38:0] dec_cw;
logic [31:0] dec_data;
logic        single_err, double_err;
logic [5:0]  err_pos;

hbm3_ecc_engine dut (
    .i_clk(clk), .i_rst_n(rst_n),
    .i_enc_valid(enc_valid), .i_enc_data(enc_data),
    .o_enc_codeword(enc_cw),
    .i_dec_valid(dec_valid), .i_dec_codeword(dec_cw),
    .o_dec_data(dec_data),
    .o_single_err(single_err), .o_double_err(double_err),
    .o_err_pos(err_pos)
);

// 500 MHz clock
initial clk = 0;
always #1 clk = ~clk;

// SVA: one cycle after enc_valid, the codeword must contain no X/Z bits
property p_enc_latency;
    @(posedge clk) enc_valid |=> !$isunknown(enc_cw);
endproperty
assert property(p_enc_latency) else
    $error("ENC: codeword X after enc_valid");

// SVA: one cycle after dec_valid, the decoded data must contain no X/Z bits
property p_dec_latency;
    @(posedge clk) dec_valid |=> !$isunknown(dec_data);
endproperty
assert property(p_dec_latency) else
    $error("DEC: data X after dec_valid");

// SVA: single_err and double_err must be mutually exclusive
property p_exclusive;
    @(posedge clk) !(single_err && double_err);
endproperty
assert property(p_exclusive) else
    $error("ECC: single_err and double_err both asserted!");

task automatic encode_and_check;
    input [31:0] data;
    output [38:0] cw_out;
    begin
        @(negedge clk);
        enc_valid = 1; enc_data = data;
        @(posedge clk); #0.1;
        enc_valid = 0;
        @(posedge clk); #0.1;
        cw_out = enc_cw;
    end
endtask

task automatic decode_cw;
    input [38:0] cw;
    begin
        @(negedge clk);
        dec_valid = 1; dec_cw = cw;
        @(posedge clk); #0.1;
        dec_valid = 0;
        @(posedge clk); #0.1;
    end
endtask

integer i, pass, fail;
logic [38:0] cw_good;

initial begin
    pass = 0; fail = 0;
    rst_n = 0; enc_valid = 0; dec_valid = 0;
    enc_data = 0; dec_cw = 0;
    repeat(4) @(posedge clk);
    rst_n = 1;
    @(posedge clk);

    // ── Test 1: No error ──────────────────────────────────────
    encode_and_check(32'hDEAD_BEEF, cw_good);
    decode_cw(cw_good);
    if (dec_data === 32'hDEAD_BEEF && !single_err && !double_err) begin
        $display("PASS: No-error decode: data=0x%08X", dec_data); pass++;
    end else begin
        $display("FAIL: No-error decode: got=0x%08X sec=%0b ded=%0b",
                  dec_data, single_err, double_err); fail++;
    end

    // ── Test 2: Single-bit errors at all 39 positions ─────────
    encode_and_check(32'hA5A5_A5A5, cw_good);
    for (i = 0; i < 39; i++) begin
        decode_cw(cw_good ^ (39'd1 << i));
        if (single_err && !double_err && dec_data === 32'hA5A5_A5A5) begin
            pass++;
        end else begin
            $display("FAIL: SEC at pos %0d: dec=0x%08X sec=%b ded=%b pos=%0d",
                      i, dec_data, single_err, double_err, err_pos);
            fail++;
        end
    end
    $display("SEC sweep: %0d/39 passed", pass-1); // -1 for test1

    // ── Test 3: Double-bit error ───────────────────────────────
    encode_and_check(32'h1234_5678, cw_good);
    decode_cw(cw_good ^ 39'h3);  // flip bits 0 and 1
    if (double_err && !single_err) begin
        $display("PASS: DED detected"); pass++;
    end else begin
        $display("FAIL: DED not flagged: sec=%0b ded=%0b", single_err, double_err);
        fail++;
    end

    $display("═══ RESULTS: PASS=%0d  FAIL=%0d ═══", pass, fail);
    if (fail === 0) $display("ALL TESTS PASSED");
    $finish;
end
endmodule
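
The testbench above injects only one double-bit pattern. Since the code is linear, the syndrome depends only on the error pattern, so all 741 two-bit patterns can be swept quickly in software before committing simulation time. The sketch below is a hypothetical Python cross-check, not a replacement for the SystemVerilog test; it uses the all-zero word as a known-valid codeword.

```python
from itertools import combinations


def check(word: int):
    """Return (syndrome, overall_parity_odd) for a 39-bit word (bit i = position i+1)."""
    syndrome = 0
    for pos in range(1, 39):
        if (word >> (pos - 1)) & 1:
            syndrome ^= pos
    return syndrome, bin(word).count("1") & 1


def sweep_double_errors(codeword: int = 0) -> int:
    """Count the double-bit error patterns that would be flagged as DED."""
    flagged = 0
    for i, j in combinations(range(39), 2):
        syndrome, odd = check(codeword ^ (1 << i) ^ (1 << j))
        if syndrome != 0 and not odd:            # DED condition
            flagged += 1
    return flagged


print(sweep_double_errors(), "double-bit patterns flagged")
```

A natural extension of the testbench would loop over the same index pairs and assert double_err for each.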

10. Frequently Asked Questions

Why does HBM3 use inline ECC instead of a separate ECC DRAM?

HBM3 stores ECC bits in the same memory array as data (inline ECC), unlike DDR5, which uses separate ECC DRAM devices. Inline ECC lets HBM3 maintain its wide-bus efficiency without needing extra dies. The trade-off is that about 18% of stored bits are spent on ECC (7 of every 39, or about 22% relative to the 32 data bits), but the latency savings from avoiding a separate ECC transaction dominate at HBM3 speeds.

What is SECDED and what Hamming distance does it require?

SECDED (Single Error Correct, Double Error Detect) requires a minimum Hamming distance of 4. With d=4, any single-bit error leaves the received word at distance 1 from the correct codeword and at least 3 from every other, so it is correctable. Any double-bit error leaves it at distance 2 from the original and at least 2 from every other valid codeword, so it is detectable but not correctable. The overall parity bit P7 is what pushes the minimum distance from 3 to 4 and disambiguates correctable from uncorrectable errors.

How many parity bits are needed for 32-bit data?

Standard Hamming codes require r parity bits where 2^r >= m + r + 1, with m = 32 data bits. Solving: r=6 gives 2^6=64 >= 32+6+1=39. So 6 Hamming parity bits suffice. Adding the 7th overall parity bit converts SEC to SECDED, for a total codeword width of 39 bits (32 data + 7 ECC).
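
The inequality can be solved by simple search. A minimal sketch (the function name is illustrative):

```python
def hamming_parity_bits(m: int) -> int:
    """Smallest r with 2**r >= m + r + 1 for m data bits."""
    r = 1
    while (1 << r) < m + r + 1:
        r += 1
    return r


for m in (8, 16, 32, 64):
    r = hamming_parity_bits(m)
    print(f"m={m}: r={r} Hamming bits, {m + r + 1}-bit SECDED codeword")
```

For m = 32, r = 5 fails because 2^5 = 32 < 38, which is why six Hamming bits are the minimum.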

What does the syndrome value represent after decoding?

The syndrome is a 6-bit value computed by XOR-ing the received parity bits against freshly recomputed parity over the received data bits. Syndrome = 0 with even overall parity means no error. A non-zero syndrome whose value falls within the codeword range (1–38), combined with odd overall parity, means a single-bit error at that exact position: correct it by flipping that bit. A zero syndrome with odd overall parity means the overall parity bit itself flipped and the data is intact. A non-zero syndrome with even overall parity means a double-bit error has been detected and cannot be corrected.

Is this ECC engine synthesizable for ASIC and FPGA?

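Yes. The module uses only XOR trees and registered outputs: no RAM, no variable-bound loops, no unsupported constructs. It synthesizes cleanly in Synopsys Design Compiler for ASIC and in Vivado/Quartus for FPGA. The XOR trees are shallow: the widest Hamming check has 19 inputs (depth 5 in two-input XOR gates) and the 39-input overall parity check has depth 6. On the decode path the syndrome compare and correction mux add a few more logic levels, so whether the design closes timing at 2 GHz depends on the target process and should be confirmed with static timing analysis. The generate loop for bit-flip correction produces 39 parallel 2:1 muxes.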
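
As a rough way to reason about logic depth, the number of two-input XOR levels in a balanced tree is the ceiling of log2 of the fan-in. The sketch below applies that to a few checks from the decoder; it assumes balanced two-input gates, whereas a real synthesis tool may pick wider cells or a different structure.

```python
def xor_tree_depth(n_inputs: int) -> int:
    """Levels of two-input XOR gates in a balanced tree with n_inputs leaves."""
    return (n_inputs - 1).bit_length() if n_inputs > 1 else 0


# Fan-in counted from the decoder equations: synd[0] has 19 terms,
# synd[5] has 7 terms, and the overall parity check spans all 39 bits.
fan_in = {"synd[0]": 19, "synd[5]": 7, "synd_pall": 39}
for name, n in fan_in.items():
    print(f"{name}: {n} inputs -> depth {xor_tree_depth(n)}")
```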