DFT Day 6 — LBIST: PRPG, MISR, STUMPS Architecture & Aliasing

Why Built-In Self-Test?

External ATPG-based testing requires an ATE (Automatic Test Equipment) machine — a multi-million-dollar instrument that connects to the chip's test pins, applies patterns at speed, and compares responses. ATE is available during production test at the fab or test house, but once a chip is deployed in the field, ATE is no longer available. If the chip needs to be tested after deployment — for safety, reliability, or diagnostics — you need another approach.

Built-In Self-Test (BIST) solves this by embedding the test circuitry inside the chip itself:

Pattern generator on-chip — no external ATE needed to supply test vectors
Response compactor on-chip — no external comparator needed; the chip evaluates its own pass/fail
Test control on-chip — a simple pin or JTAG instruction triggers the self-test autonomously

The result: the chip can test itself anywhere, anytime — at power-on, periodically in the field, or on demand via a service port. This enables several critical use cases:

Use Case	Description	Standard
Power-On Self-Test (POST)	Chip tests itself before entering functional operation; catches shipping/storage damage	—
Automotive Periodic Test	Safety-critical ECU runs BIST every N milliseconds; detects wear-out and transient faults	ISO 26262 ASIL-D
Field Diagnostic	Service technician triggers BIST via OBD/JTAG to localise a field failure	—
Cost Reduction	Reduce ATE time at production test by running simpler on-chip BIST first; use ATE only for top-off patterns	—
Space / Mil-Aero	In-flight or in-orbit self-test where ATE is impossible	MIL-STD-883

For Logic BIST (LBIST), the circuit under test is the combinational logic and flip-flops of the digital core — the same logic tested by scan ATPG. LBIST uses the existing scan chain infrastructure (scan flip-flops, scan-in/scan-out) and adds a pattern generator (PRPG) and response compactor (MISR).

LFSR — Linear Feedback Shift Register

The Linear Feedback Shift Register (LFSR) is the fundamental hardware building block of LBIST. An N-bit LFSR is an N-bit shift register where one or more stages are XOR-fed back to the input, creating a pseudo-random binary sequence that cycles through 2^N − 1 unique states before repeating (state 0 is excluded in a standard LFSR; an all-zero state locks the register).

LFSR Structure and Primitive Polynomials

The feedback tap positions are determined by a primitive polynomial over GF(2) — a polynomial that generates a maximal-length sequence of 2^N − 1 states. Common primitive polynomials:

N (bits)	Primitive Polynomial	Sequence Length	Common Use
4	x⁴ + x³ + 1	15	Short test, teaching
8	x⁸ + x⁶ + x⁵ + x⁴ + 1	255	Small LBIST kernels
16	x¹⁶ + x¹⁵ + x¹³ + x⁴ + 1	65,535	MISR compactors
32	x³² + x²² + x² + x¹ + 1	~4.3 × 10⁹	Full LBIST PRPG

The key property: a maximal-length LFSR visits every non-zero N-bit state exactly once per period. This gives excellent input space coverage with minimal hardware: just N flip-flops and a handful of XOR gates.

Verilog — 8-bit LFSR (Maximal Length, Galois Form)

// 8-bit maximal-length LFSR
// Primitive polynomial: x^8 + x^6 + x^5 + x^4 + 1
// Galois form: the MSB q[7] is XORed into the inputs of q[6], q[5], q[4] and shifted into q[0]
module lfsr_8bit (
  input        clk, rst_n, en,
  output reg [7:0] q
);
  wire feedback = q[7];  // MSB feeds back

  always @(posedge clk or negedge rst_n) begin
    if (!rst_n)
      q <= 8'hFF;          // seed (must be non-zero)
    else if (en) begin
      q[7] <= q[6];
      q[6] <= q[5] ^ feedback;  // tap at x^6
      q[5] <= q[4] ^ feedback;  // tap at x^5
      q[4] <= q[3] ^ feedback;  // tap at x^4
      q[3] <= q[2];
      q[2] <= q[1];
      q[1] <= q[0];
      q[0] <= feedback;
    end
  end
endmodule
// Sequence length: 2^8 - 1 = 255 unique states
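
As a cross-check of the Verilog above, the same Galois update can be modelled in a few lines of Python and its period counted. This is a minimal sketch, assuming the model is bit-for-bit equivalent to the RTL; the names `lfsr8_step`, `lfsr8_period` and `TAP_MASK` are illustrative and not part of the original design.

```python
# Software model of the 8-bit Galois LFSR (x^8 + x^6 + x^5 + x^4 + 1).
# TAP_MASK has bits 6, 5, 4 and 0 set: the stages that receive the MSB feedback.
TAP_MASK = 0x71

def lfsr8_step(q):
    """Advance the LFSR by one clock and return the next 8-bit state."""
    feedback = (q >> 7) & 1
    q = (q << 1) & 0xFF
    return q ^ TAP_MASK if feedback else q

def lfsr8_period(seed=0xFF):
    """Count clocks until the state returns to the seed."""
    state, count = lfsr8_step(seed), 1
    while state != seed:
        state = lfsr8_step(state)
        count += 1
    return count

print(lfsr8_period())        # 255 for this primitive polynomial
print(lfsr8_step(0x00))      # 0: the all-zero state locks up
```

The second print shows why the seed must be non-zero: with no 1 in the register, the feedback never fires.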

PRPG — Pseudo-Random Pattern Generator

In LBIST, the LFSR is used as a Pseudo-Random Pattern Generator (PRPG). Each clock cycle, the LFSR advances to its next state — a new pseudo-random N-bit word. These bit sequences drive the scan chain inputs, effectively loading pseudo-random test patterns into the circuit under test.

The PRPG operates as follows in the LBIST flow:

LBIST controller initialises the LFSR with a known seed (non-zero start state)
For each test pattern: the LFSR shifts L times to load L bits into each scan chain (where L = scan chain length, not the LFSR width N)
One capture cycle applies the functional clock — the circuit responds and captures data into scan FFs
Scan chain outputs are collected by the MISR
Repeat for the required number of patterns (typically 10,000–50,000)
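
The shift step in the list above can be sketched as follows. This is only an illustration under stated assumptions: a single scan chain, the LFSR MSB used as the serial scan-in bit, and a PRPG that holds its state during capture. None of these choices come from the original text, and `prpg_patterns` is a hypothetical helper name.

```python
TAP_MASK = 0x71  # Galois feedback mask for x^8 + x^6 + x^5 + x^4 + 1

def lfsr8_step(q):
    fb = (q >> 7) & 1
    q = (q << 1) & 0xFF
    return q ^ TAP_MASK if fb else q

def prpg_patterns(seed, chain_len, n_patterns):
    """Return n_patterns scan-chain images, each a list of chain_len bits.

    Index 0 is the flop nearest scan-in; the last index is nearest scan-out.
    """
    state = seed
    patterns = []
    for _ in range(n_patterns):
        chain = [0] * chain_len
        for _ in range(chain_len):
            scan_in = (state >> 7) & 1          # assumed serial output: LFSR MSB
            chain = [scan_in] + chain[:-1]      # shift one position toward scan-out
            state = lfsr8_step(state)
        patterns.append(chain)
    return patterns

for pattern in prpg_patterns(seed=0x01, chain_len=8, n_patterns=2):
    print(pattern)
# [1, 0, 0, 0, 0, 0, 0, 0]
# [1, 1, 0, 0, 0, 1, 1, 0]
```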

The number of patterns needed to achieve target coverage depends on the design's fault complexity. For a typical 1–10 million gate design, 10,000–50,000 patterns achieve approximately 85–92% stuck-at coverage — acceptable for field test, though lower than what external ATPG achieves.

Why Not 100% Coverage with PRPG?

Random patterns are excellent at detecting most faults but statistically miss some hard-to-detect faults (random-pattern-resistant faults). External ATPG uses deterministic algorithms to target specific faults; PRPG just generates patterns and hopes they happen to sensitise each fault. Coverage typically plateaus around 85–92%: beyond that point, adding more random patterns yields sharply diminishing returns.

MISR — Multiple Input Signature Register

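The Multiple Input Signature Register (MISR) is the response compactor of LBIST, also called a signature analyser. Strictly speaking it is a time compactor: it folds many cycles of response data into one word, whereas a space compactor (an XOR tree) reduces the number of parallel outputs. As each captured response is shifted out, the scan chain outputs (potentially thousands of bits per pattern) are XOR-compacted into the MISR's running signature. The MISR accumulates the effect of all captured responses across all patterns into a single N-bit signature word.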

The MISR is structurally similar to an LFSR, but it has multiple inputs (one per scan chain output) XOR-fed into specific stages. The result is a running hash of all captured response data:

MISR[i](t+1) = MISR[i-1](t) ⊕ scan_out_i(t) ⊕ (feedback(t) if stage i is a tap position)
Final MISR state = compact signature of ALL captured responses
Fault-free gold signature = pre-simulated expected MISR value
Pass/Fail: MISR_final == gold_signature?

Verilog — 16-bit MISR with 4 Parallel Scan Inputs

// 16-bit MISR: response (signature) compactor for LBIST
// Primitive polynomial: x^16 + x^15 + x^13 + x^4 + 1
// 4 parallel scan chain outputs feed into MISR stages
module misr_16bit (
  input         clk, rst_n, capture_en,
  input  [3:0]  scan_out,    // one bit per scan chain
  output reg [15:0] signature
);
  wire fb = signature[15];

  always @(posedge clk or negedge rst_n) begin
    if (!rst_n)
      signature <= 16'hFFFF;
    else if (capture_en) begin
      // Shift with feedback taps (x^16 + x^15 + x^13 + x^4 + 1)
      signature[15] <= signature[14] ^ fb ^ scan_out[3];  // x^15 tap
      signature[14] <= signature[13];
      signature[13] <= signature[12] ^ fb ^ scan_out[2];  // x^13 tap
      signature[12] <= signature[11];
      signature[11] <= signature[10];
      signature[10] <= signature[9];
      signature[9]  <= signature[8];
      signature[8]  <= signature[7];
      signature[7]  <= signature[6];
      signature[6]  <= signature[5];
      signature[5]  <= signature[4];
      signature[4]  <= signature[3] ^ fb ^ scan_out[1];  // x^4 tap
      signature[3]  <= signature[2];
      signature[2]  <= signature[1];
      signature[1]  <= signature[0];
      signature[0]  <= fb ^ scan_out[0];                  // x^0 tap (constant term)
    end
  end
endmodule
// Gold signature pre-computed by fault-free simulation
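
A behavioural model of the same register can help when pre-computing or debugging signatures. The sketch below mirrors the Verilog stage by stage; `misr16_step`, `misr16_signature` and the response words are made-up names and data for illustration only.

```python
POLY_MASK = 0xA011   # feedback lands on stages 15, 13, 4 and 0
# scan_out[3..0] are XORed into stages 15, 13, 4 and 0, as in the Verilog
INPUT_STAGES = {3: 15, 2: 13, 1: 4, 0: 0}

def misr16_step(sig, scan_out):
    """One clock of the MISR with a 4-bit scan_out word."""
    fb = (sig >> 15) & 1
    nxt = (sig << 1) & 0xFFFF
    if fb:
        nxt ^= POLY_MASK
    for chain, stage in INPUT_STAGES.items():
        if (scan_out >> chain) & 1:
            nxt ^= 1 << stage
    return nxt

def misr16_signature(responses, init=0xFFFF):
    """Compact a sequence of 4-bit response words into one 16-bit signature."""
    sig = init
    for word in responses:
        sig = misr16_step(sig, word)
    return sig

good = [0x3, 0xA, 0x0, 0xF, 0x5]     # made-up fault-free response stream
bad = list(good)
bad[2] ^= 0x4                        # one response bit flipped by a fault
print(hex(misr16_signature(good)), hex(misr16_signature(bad)))
```

A single flipped bit always changes the signature, because the register update is linear and invertible; aliasing needs at least two error bits that cancel.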

Aliasing — The False-Pass Risk

Aliasing is the key risk in MISR-based response compaction. It occurs when a faulty circuit produces exactly the same MISR signature as the fault-free (good) circuit — the LBIST declares the faulty chip as passing. This is a test escape: a defective chip passes LBIST and ships to the customer.

Aliasing Probability Mathematics

For a well-designed N-bit MISR using a primitive polynomial, the aliasing probability per fault is:

P(aliasing) = 2^(-N)
For N=16: P = 2^(-16) ≈ 1.5 × 10^(-5) → too high for production
For N=32: P = 2^(-32) ≈ 2.3 × 10^(-10) → negligible
For N=48: P = 2^(-48) ≈ 3.6 × 10^(-15) → safety-critical applications
For N=64: P = 2^(-64) ≈ 5.4 × 10^(-20) → ultra-high-reliability

To a good approximation, the aliasing probability is independent of the number of test patterns: for long response streams it depends only on the MISR width N. This follows from the linear algebraic structure of the MISR, under the usual assumption that all error patterns are equally likely. A 32-bit MISR gives aliasing probability ≈ 2.3 × 10⁻¹⁰, meaning that out of every billion faulty chips tested, about 0.23 are expected to escape undetected due to aliasing. For practical production volumes, this is negligible.
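
The 2^(-N) figure can be checked exhaustively on a register small enough to enumerate. The sketch below uses a hypothetical 4-bit single-input signature register with the x⁴ + x³ + 1 polynomial from the table. Because the register is linear, a faulty stream aliases exactly when its error stream (faulty XOR good) compacts to zero from a zero start state, so it is enough to count those error streams.

```python
from itertools import product

def sisr4(bits):
    """Single-input 4-bit signature register, x^4 + x^3 + 1, zero start state."""
    sig = 0
    for b in bits:
        fb = (sig >> 3) & 1
        sig = (sig << 1) & 0xF
        if fb:
            sig ^= 0x9          # feedback into stages 3 and 0
        sig ^= b                # serial input enters stage 0
    return sig

def aliasing_fraction(stream_len):
    """Fraction of non-zero error streams that compact to the zero signature."""
    aliased = sum(
        1
        for err in product((0, 1), repeat=stream_len)
        if any(err) and sisr4(err) == 0
    )
    return aliased / (2 ** stream_len - 1)

for n in (8, 12, 16):
    print(n, aliasing_fraction(n), 2 ** -4)
```

The fraction is 15/255 for 8-bit streams and moves closer to 2⁻⁴ = 0.0625 as the stream grows, which is the small-scale version of the claim that aliasing depends on register width rather than pattern count.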

Industry Standard

32-bit MISR is standard in most commercial LBIST implementations. Safety-critical automotive (ISO 26262 ASIL-D) and space applications may use 48 or 64-bit MISRs to further reduce aliasing probability. Never use a 16-bit MISR for production test: its 1-in-65,536 aliasing probability is too high.

STUMPS Architecture

STUMPS (Self-Test Using MISR and Parallel Shift register sequence generator) is the industry-standard LBIST architecture, introduced by Bardell and McAnney at IBM in 1982 and still the dominant approach in modern LBIST tools (Siemens Tessent LBIST, Synopsys DFT Compiler BIST).

The STUMPS architecture has four key components:

LFSR / PRPG — generates pseudo-random bit sequences (the test patterns)
Phase Shifter — distributes and decorrelates LFSR outputs to scan chain inputs
Scan Chains — the existing DFT scan infrastructure (no change needed)
MISR / Space Compactor — compresses scan chain outputs into a running signature

[Figure: STUMPS LBIST Architecture, block diagram (LFSR/PRPG → Phase Shifter → Scan Chains → MISR)]

Why the Phase Shifter Is Essential

The LFSR generates consecutive bits that are correlated — adjacent outputs of the LFSR shift register are just time-shifted versions of each other. If adjacent scan chain inputs receive correlated patterns, many faults that require uncorrelated stimuli on adjacent scan chains are undetectable — effectively invisible to the pseudo-random test.

The phase shifter is a network of XOR gates that mixes different LFSR stage outputs to create decorrelated patterns for each scan chain input. Each scan chain input receives an XOR combination of 2–4 LFSR stages, which produces the same pseudo-random sequence at a different (ideally widely separated) phase, so neighbouring chains look approximately statistically independent. This restores the independence that makes pseudo-random patterns effective for broad fault coverage.
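
The correlation problem and the XOR remedy can be observed on the 8-bit LFSR from earlier. The sketch below is illustrative only: `stream` and `delay_of` are hypothetical helpers, and the XOR selection q[0] ^ q[3] is an arbitrary example rather than a recommended phase shifter design. It measures how many cycles each bit stream lags behind the stream of stage q[0].

```python
TAP_MASK = 0x71  # Galois feedback mask for x^8 + x^6 + x^5 + x^4 + 1

def lfsr8_step(q):
    fb = (q >> 7) & 1
    q = (q << 1) & 0xFF
    return q ^ TAP_MASK if fb else q

def stream(select_mask, seed=0xFF, length=255):
    """Bit stream formed by XORing the LFSR stages selected by select_mask."""
    out, state = [], seed
    for _ in range(length):
        out.append(bin(state & select_mask).count("1") & 1)
        state = lfsr8_step(state)
    return out

def delay_of(candidate, reference):
    """Cyclic delay d with candidate[t] == reference[t - d], or None if no match."""
    n = len(reference)
    for d in range(n):
        if all(candidate[t] == reference[(t - d) % n] for t in range(n)):
            return d
    return None

ref = stream(0b0000_0001)                    # stage q[0] alone
print(delay_of(stream(0b0000_0010), ref))    # q[1]: lags q[0] by exactly 1 cycle
print(delay_of(stream(0b0000_1001), ref))    # q[0] ^ q[3]: same sequence, other phase
```

Adjacent stages differ by a single cycle, which is the correlation described above. The XOR of two stages is still the same maximal-length sequence, just moved to another phase; a real phase shifter picks its XOR selections so that the offsets between chains are large.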

LBIST Flow and Timing

The complete LBIST operational sequence, controlled by the on-chip LBIST controller:

Entry: Assert LBIST mode via test pin or JTAG BIST instruction. Normal functional operation halts. Scan enable (SE) is controlled by the LBIST controller, not external test equipment.
Seed load: Load the LFSR with its initial seed value (from a ROM or hardcoded constant). Initialise the MISR to the all-ones state.
Pattern loop (repeat N_patterns times):
- Shift phase: LFSR advances L cycles (L = scan chain length) → L bits loaded into each scan chain via the phase shifter → scan chains are filled with one pseudo-random pattern
- Capture phase: one functional clock cycle applied (SE=0) → circuit evaluates → scan FFs capture response
- Compaction phase: scan chain outputs shifted into MISR (SE=1, MISR accumulates). In practice this unload overlaps with the shift phase of the next pattern.
Signature compare: LBIST controller compares final MISR value against pre-computed gold signature stored in ROM. Assert PASS or FAIL output.
Exit: Deassert LBIST mode. Resume normal functional operation.
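
The whole sequence can be exercised end to end in a small behavioural model. Everything below is a sketch under stated assumptions: four scan chains of eight flops, a made-up circuit under test in `capture`, hypothetical phase-shifter selections in `PHASE_MASKS`, and an injected single-bit response error standing in for a fault effect. For clarity the three phases run one after another rather than overlapping unload with the next load.

```python
TAP_MASK = 0x71                          # PRPG: x^8 + x^6 + x^5 + x^4 + 1
MISR_MASK = 0xA011                       # MISR: x^16 + x^15 + x^13 + x^4 + 1
PHASE_MASKS = (0x01, 0x12, 0x24, 0x88)   # hypothetical phase-shifter XOR selections
INPUT_STAGE = (0, 4, 13, 15)             # MISR stage fed by each chain
N_CHAINS, CHAIN_LEN = 4, 8

def parity(x):
    return bin(x).count("1") & 1

def lfsr8_step(q):
    fb = (q >> 7) & 1
    q = (q << 1) & 0xFF
    return q ^ TAP_MASK if fb else q

def misr16_step(sig, word):
    fb = (sig >> 15) & 1
    sig = (sig << 1) & 0xFFFF
    if fb:
        sig ^= MISR_MASK
    for c, stage in enumerate(INPUT_STAGE):
        sig ^= ((word >> c) & 1) << stage
    return sig

def capture(chains):
    """Toy circuit: each flop captures (own AND neighbour chain) XOR next flop."""
    return [
        [(chains[c][i] & chains[(c + 1) % N_CHAINS][i]) ^ chains[c][(i + 1) % CHAIN_LEN]
         for i in range(CHAIN_LEN)]
        for c in range(N_CHAINS)
    ]

def run_lbist(n_patterns, flip=None):
    """Return the final signature; flip=(pattern, chain, bit) injects one error."""
    prpg, misr = 0xFF, 0xFFFF                          # seed load
    chains = [[0] * CHAIN_LEN for _ in range(N_CHAINS)]
    for pat in range(n_patterns):
        for _ in range(CHAIN_LEN):                     # shift phase
            for c in range(N_CHAINS):
                chains[c] = [parity(prpg & PHASE_MASKS[c])] + chains[c][:-1]
            prpg = lfsr8_step(prpg)
        chains = capture(chains)                       # capture phase
        if flip is not None and flip[0] == pat:
            chains[flip[1]][flip[2]] ^= 1              # injected response error
        for _ in range(CHAIN_LEN):                     # compaction phase
            word = sum(chains[c][-1] << c for c in range(N_CHAINS))
            chains = [[0] + ch[:-1] for ch in chains]
            misr = misr16_step(misr, word)
    return misr

gold = run_lbist(20)                                   # stands in for the ROM value
print("PASS" if run_lbist(20) == gold else "FAIL")                    # PASS
print("PASS" if run_lbist(20, flip=(7, 2, 3)) == gold else "FAIL")    # FAIL
```

The second run fails because a single erroneous response bit can never alias in a linear, invertible signature register.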

LBIST Test Time Estimation

Total LBIST cycles = N_patterns × (chain_length + 1)
Example: 20,000 patterns, chain length = 500 FFs
→ Total cycles = 20,000 × 501 = 10,020,000 cycles
At 100 MHz scan frequency:
→ Test time = 10,020,000 / 100,000,000 ≈ 0.1 seconds
At 500 MHz scan frequency (aggressive):
→ Test time ≈ 20 milliseconds
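
The same estimate as a small helper, useful for trying other pattern counts and chain lengths. The function names are illustrative; the formula is the one given above and ignores seed load and the final signature compare.

```python
def lbist_cycles(n_patterns, chain_length):
    """Shift cycles plus one capture cycle per pattern."""
    return n_patterns * (chain_length + 1)

def lbist_time_s(n_patterns, chain_length, scan_hz):
    """Estimated LBIST run time in seconds at the given scan clock frequency."""
    return lbist_cycles(n_patterns, chain_length) / scan_hz

print(lbist_cycles(20_000, 500))             # 10020000
print(lbist_time_s(20_000, 500, 100e6))      # about 0.1002 s
print(lbist_time_s(20_000, 500, 500e6))      # about 0.02004 s
```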

LBIST vs External Scan ATPG: Comparison

Property	LBIST	External Scan ATPG
Pattern source	On-chip LFSR/PRPG	Off-chip ATE
Response analysis	On-chip MISR (signature)	Off-chip ATE comparator (full response)
ATE cost	None (no ATE needed)	High ($1M–$10M+ for full-speed ATE)
Stuck-at coverage	85–92% (limited by random patterns)	98–99.5% (deterministic ATPG)
Transition coverage	70–85% (random may miss delay faults)	92–97% (with LOC/LOS at-speed)
Fault diagnosis	Limited (MISR signature hides per-pattern failure data)	Possible (full response available)
Test time per chip	Fast (simple controller, no ATE comm)	Slower (ATE communication overhead)
Area overhead	Medium (LFSR + phase shifter + MISR)	Low (only scan chain insertion)
Field testability	Yes — chip tests itself	No — ATE must be available
Primary use	Field test, POST, automotive periodic test	Production wafer/final test

Random-Pattern-Resistant Faults and Top-off Patterns

The theoretical coverage ceiling of PRPG-based LBIST is not 100%. Some faults are random-pattern-resistant (RPR) — they require very specific input combinations that are statistically unlikely to appear in a pseudo-random sequence.

What Makes a Fault RPR?

A fault is RPR when its detection requires all of the following simultaneously:

A specific sensitisation condition on many primary inputs (e.g., all 32 inputs of a wide AND gate must be at specific values)
A specific propagation condition through the combinational cloud
Specific side-input values to keep other paths at non-controlling values

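If the combined probability of all these conditions being met by a random pattern is p, the expected number of detections in n patterns is n × p. At p = 1/10,000, 50,000 LBIST patterns give only about 5 expected detections, and the margin disappears quickly as p drops: at p = 10⁻⁶ the fault has less than a 5% chance of being detected at all.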
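
The arithmetic behind this can be sketched with two small functions, assuming each pattern detects the fault independently with probability p (a simplification, since consecutive LFSR patterns are not truly independent). The function names are illustrative.

```python
import math

def expected_detections(p, n_patterns):
    """Mean number of patterns that detect a fault with per-pattern probability p."""
    return p * n_patterns

def detect_probability(p, n_patterns):
    """Probability of at least one detection: 1 - (1 - p)^n, computed stably."""
    return -math.expm1(n_patterns * math.log1p(-p))

n = 50_000
print(expected_detections(1e-4, n))      # about 5
print(detect_probability(1e-4, n))       # about 0.993
print(detect_probability(1e-6, n))       # about 0.049
print(detect_probability(2 ** -32, n))   # about 1.2e-05 (32-input AND, all inputs at 1)
```

The last line corresponds to the wide AND gate example: with a per-pattern probability of 2⁻³², 50,000 patterns leave the fault almost certainly undetected.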

Common RPR fault examples:

Faults on a wide fan-in AND/OR gate (detection needs all inputs, or all other inputs, at non-controlling values simultaneously)
Faults deep inside arithmetic units (adder carry chains) requiring specific carry propagation states
Faults near the output of a parity tree (need specific XOR combination of many inputs)

Top-off Patterns: The Solution

Top-off patterns are a small set of deterministic ATPG patterns (typically 100–1,000 patterns) specifically generated to target the RPR faults identified after LBIST simulation. The ATPG tool runs a simulation to find which faults are not adequately covered by LBIST, then generates deterministic patterns for just those faults.
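
The identify-then-target flow can be sketched on a toy circuit: an 8-input AND gate driven directly by the 8-bit LFSR state. The fault list, the 16-pattern budget and the helper names are all hypothetical, and the exhaustive search in `top_off_pattern` merely stands in for a real deterministic ATPG engine.

```python
TAP_MASK = 0x71  # Galois feedback mask for x^8 + x^6 + x^5 + x^4 + 1

def lfsr8_step(q):
    fb = (q >> 7) & 1
    q = (q << 1) & 0xFF
    return q ^ TAP_MASK if fb else q

def and8(inputs, fault=None):
    """8-input AND; fault is None, ("in", i, v) or ("out", v) for stuck-at-v."""
    bits = [(inputs >> i) & 1 for i in range(8)]
    if fault and fault[0] == "in":
        bits[fault[1]] = fault[2]
    y = int(all(bits))
    if fault and fault[0] == "out":
        y = fault[1]
    return y

FAULTS = [("out", 0), ("out", 1)] + [("in", i, v) for i in range(8) for v in (0, 1)]

def undetected_after_lbist(seed, n_patterns):
    """Fault-simulate the pseudo-random patterns; return faults never detected."""
    remaining, state = set(FAULTS), seed
    for _ in range(n_patterns):
        remaining = {f for f in remaining if and8(state, f) == and8(state)}
        state = lfsr8_step(state)
    return remaining

def top_off_pattern(fault):
    """Exhaustive search as a stand-in for deterministic ATPG on this tiny circuit."""
    for pattern in range(256):
        if and8(pattern, fault) != and8(pattern):
            return pattern
    return None

hard = undetected_after_lbist(seed=0x01, n_patterns=16)
top_off = sorted({top_off_pattern(f) for f in hard})
print(len(hard), "faults left;", len(top_off), "top-off patterns")
# 17 faults left; 9 top-off patterns
```

Only the output stuck-at-1 fault is caught by the 16 pseudo-random patterns; the other 17 need the all-ones pattern or a pattern with exactly one zero, and nine deterministic patterns cover them all.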

Top-off patterns are stored in a small ROM on-chip or applied via scan during LBIST using a stored pattern mode. The combined LBIST + top-off approach achieves near-ATPG coverage with much lower ATE cost:

Combined LBIST Strategy

LBIST (50,000 random patterns) ≈ 90% stuck-at coverage
+ Top-off patterns (500 deterministic) ≈ additional 8%
= Combined coverage ≈ 98% — approaching full ATPG quality, at field-test cost.

Interview FAQ: LBIST

What is the difference between PRPG and MISR in LBIST?

PRPG (Pseudo-Random Pattern Generator) is the stimulus generator — an LFSR that produces pseudo-random bit sequences to load into scan chains as test patterns. It is the input side of LBIST. MISR (Multiple Input Signature Register) is the response compactor — it collects scan chain outputs from all capture cycles and computes a running XOR signature that fingerprints the circuit's response. It is the output side of LBIST. PRPG drives patterns in; MISR compresses responses out into a single signature for pass/fail comparison against a pre-computed gold value.

What is aliasing and why does it matter in LBIST?

Aliasing occurs when a faulty circuit's MISR response compacts to exactly the same signature as the fault-free circuit, causing LBIST to incorrectly declare the faulty chip as passing (a test escape). For an N-bit MISR using a primitive polynomial, the aliasing probability per fault is approximately 2^(-N). For N=32 this is ~2.3 × 10^(-10), negligible in practice. For N=16 it is 1 in 65,536, which is too high for production. Aliasing probability is essentially independent of the number of patterns. Solution: always use a wide MISR (32 bits minimum; 48+ for safety-critical).

What is the STUMPS architecture and why use a phase shifter?

STUMPS (Self-Test Using MISR and Parallel Shift register sequence generator) is the standard LBIST architecture: LFSR/PRPG → Phase Shifter → Scan Chains → MISR. The phase shifter is a network of XOR gates that decorrelates adjacent LFSR stage outputs before distributing them to individual scan chain inputs. Adjacent LFSR stages are time-shifted versions of the same sequence and are therefore highly correlated. Without the phase shifter, adjacent scan chains receive correlated patterns, drastically reducing fault detection efficiency (many faults require uncorrelated inputs on neighbouring chains). The phase shifter breaks this correlation, restoring approximate pseudo-random independence across all scan chain inputs.

When would you use LBIST instead of scan ATPG?

LBIST is preferred over external scan ATPG when ATE is unavailable or too expensive. Key scenarios: (1) Power-on self-test, where the chip tests itself before functional operation; (2) Automotive periodic test, where safety-critical ECUs targeting ISO 26262 ASIL-D commonly run self-test periodically in the field, at an interval set by the system's fault-tolerant time interval; (3) Field diagnostics, where the chip is deployed and no ATE is accessible; (4) High-volume cost reduction, where LBIST does the basic screening and ATPG is used only for top-off. External ATPG is preferred for production test where maximum fault coverage, fault diagnosis capability, and at-speed transition fault testing are required.

What is a random-pattern-resistant fault?

A random-pattern-resistant (RPR) fault requires a very specific input combination to sensitise — a combination that pseudo-random patterns are statistically unlikely to produce. Examples: faults at wide AND/OR gate outputs (all other inputs must be at non-controlling values simultaneously), faults deep in adder carry chains, or faults in parity trees. The detection probability per random pattern is very low (e.g., 2^(-32) for a 32-input AND gate), so even 50,000 LBIST patterns may never detect it. Solution: identify RPR faults via post-LBIST simulation, then generate deterministic top-off ATPG patterns to target them.
