Design for Testability
Shipping millions of chips without testing each one is not an option: a single process defect can leave a wire stuck at a fixed value, and functional simulation of the design can never reveal a defect that only exists in one physical die. DFT techniques provide structured access to internal state, enabling efficient manufacturing test at production scale.
1. Why DFT is Non-Negotiable
A modern SoC has billions of transistors. A single dust particle, chemical contamination, or lithography variation can create a defect in a chip whose design passed RTL simulation cleanly, causing that chip to fail in the field. Without DFT, internal nodes are inaccessible: you only see primary I/O pins.
Internal nodes are not accessible via chip pins. A fault on an interior wire may never propagate to an observable output under normal functional patterns.
Certain internal states require specific input sequences to reach. Without DFT, it may take billions of random patterns to exercise a single node.
A faulty chip shipped to a customer can cost 10–1000× more to fix than catching it at wafer test. Industry targets 95–99% fault coverage.
ATE (Automatic Test Equipment) time costs $200–$500/hour. DFT reduces test time by enabling parallel multi-site testing with short scan patterns.
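To put a rough number on the controllability point above, consider a hypothetical internal node that goes to 1 only when a 32-bit value matches one specific constant. The 32-bit width and the function name are assumptions chosen for illustration. Under uniformly random inputs each pattern hits the node with probability 2^-32, so the expected number of patterns before the first hit is 1/p:

```python
from fractions import Fraction

def expected_random_patterns(match_bits: int) -> int:
    """Expected number of uniformly random patterns needed to hit one
    specific value of a `match_bits`-wide comparison (geometric mean 1/p)."""
    p_hit = Fraction(1, 2 ** match_bits)
    return int(1 / p_hit)

print(expected_random_patterns(32))  # 4294967296, about 4.3 billion
```

With scan access the same node is reached by shifting the required value straight into the flip-flops that feed it.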
2. Fault Models
ATPG tools don't test for "all possible defects" — they target specific fault models that approximate common manufacturing failures.
| Fault Model | What It Models | Coverage Target | Test Method |
|---|---|---|---|
| Stuck-At-0 (SA0) | Net permanently tied to logic 0 (short to GND) | 95–99% | Static scan |
| Stuck-At-1 (SA1) | Net permanently tied to logic 1 (short to VDD) | 95–99% | Static scan |
| Transition Fault | Net changes correctly but slower than spec (delay defect) | 90–95% | At-speed scan (launch/capture) |
| Bridging Fault | Unintended short between two nets | 85–90% | IDDQ + scan |
| Open Fault | Broken wire — net floats or has no driver | 80–90% | Scan with X-state analysis |
| Cell Internal Fault | Defects inside a cell (intra-cell) | 90%+ | Cell-aware ATPG |
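The stuck-at rows in the table can be illustrated with a small fault-injection sketch. The circuit (`y = (a AND b) OR c`), the net name `n1`, and the function names are hypothetical and chosen only for illustration. A pattern detects a fault when the good and faulty circuits produce different outputs.

```python
from itertools import product

def circuit(a, b, c, stuck=None):
    """y = (a AND b) OR c. `stuck` forces the internal net n1 to 0 or 1."""
    n1 = a & b
    if stuck is not None:
        n1 = stuck          # inject the stuck-at fault on n1
    return n1 | c

def detecting_patterns(stuck):
    """All (a, b, c) input patterns whose output differs under the fault."""
    return [p for p in product((0, 1), repeat=3)
            if circuit(*p) != circuit(*p, stuck=stuck)]

print(detecting_patterns(0))  # n1 SA0: [(1, 1, 0)]
print(detecting_patterns(1))  # n1 SA1: [(0, 0, 0), (0, 1, 0), (1, 0, 0)]
```

Note the two requirements visible in the results: the pattern must drive the net to the opposite of the stuck value (activation), and `c` must be 0 so the OR gate lets the difference through (propagation).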
3. Scan Chain Architecture
Scan insertion replaces standard flip-flops with scan flip-flops (SFFs), each a D flip-flop with a 2:1 MUX added on the D input. When scan_enable (SE) = 1, the MUX selects the scan data input (SI); when SE = 0, it selects the functional data input (D). All SFFs are stitched into a chain: the scan output (SO, normally the Q output) of FF[n] connects to SI of FF[n+1].
Scan Flip-Flop RTL
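As a behavioural stand-in for the RTL, the mux-D scan cell described above can be sketched in Python. This is a software model, not synthesizable code; the class and method names are illustrative assumptions, and SO is assumed to be the same signal as Q.

```python
class ScanFlipFlop:
    """Behavioural model of a mux-D scan flip-flop."""

    def __init__(self):
        self.q = 0

    def clock(self, d, si, se):
        """One rising clock edge: capture SI when SE=1, otherwise D."""
        self.q = si if se else d
        return self.q

    @property
    def so(self):
        # Assumption: scan-out is simply the Q output.
        return self.q


ff = ScanFlipFlop()
ff.clock(d=0, si=1, se=1)   # scan mode: SI is captured
print(ff.q)                 # 1
ff.clock(d=0, si=1, se=0)   # functional mode: D is captured
print(ff.q)                 # 0
```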
Three-Phase Test Procedure
Every scan test pattern follows three phases. The shift-in phase loads the test pattern; the capture phase exercises the circuit under test; the shift-out phase reads the response for comparison against the expected value.
| Phase | SE | Clock Pulses | What Happens |
|---|---|---|---|
| Shift In | 1 (scan) | N (chain length) | ATPG test pattern serially loaded into all scan FFs |
| Capture | 0 (func) | 1 (or 2 for at-speed) | Circuit operates functionally; logic response captured into FFs |
| Shift Out | 1 (scan) | N (chain length) | Captured response shifted out to SO; compared to golden expected |
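The three phases in the table can be walked through with a minimal simulation of a 3-FF scan chain. The combinational logic between the flip-flops (`good_logic`) and the injected defect (`faulty_logic`) are hypothetical examples, as are all function names.

```python
def shift(chain, si):
    """One clock with SE=1: returns the bit seen at SO and the new state."""
    so = chain[-1]
    return so, [si] + chain[:-1]

def good_logic(q):
    q0, q1, q2 = q
    return [q1 & q2, q0 | q2, q0 ^ q1]   # next-state (D) values

def faulty_logic(q):
    q0, q1, q2 = q
    return [0, q0 | q2, q0 ^ q1]         # D input of FF0 stuck-at-0

def run_scan_test(pattern, logic):
    chain = [0] * len(pattern)
    for bit in reversed(pattern):        # Shift In: N clocks, SE=1
        _, chain = shift(chain, bit)
    chain = logic(chain)                 # Capture: 1 clock, SE=0
    response = []
    for _ in range(len(pattern)):        # Shift Out: N clocks, SE=1
        so, chain = shift(chain, 0)
        response.append(so)
    return response[::-1]                # reorder as [FF0, FF1, FF2]

pattern = [0, 1, 1]
print(run_scan_test(pattern, good_logic))    # [1, 1, 1] (golden)
print(run_scan_test(pattern, faulty_logic))  # [0, 1, 1] (mismatch on FF0)
```

On a real tester the shift-out of one pattern is normally overlapped with the shift-in of the next, so the cost per pattern is close to N + 1 clocks rather than 2N + 1; the sketch keeps the phases separate for clarity.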
5. Automatic Test Pattern Generation (ATPG)
ATPG software (Siemens Tessent, formerly Mentor; Synopsys TestMAX ATPG, formerly TetraMAX) algorithmically generates test vectors that detect stuck-at and transition faults. For each undetected fault, ATPG works backward from the fault site to determine what values must be loaded into the scan chain to activate the fault, then sensitizes a path forward so the error propagates to an observable point (a scan flip-flop that is later shifted out through SO, or a primary output).
Key metric: Fault coverage = (Faults detected) / (Total possible faults). Industry-standard target is ≥ 97% stuck-at fault coverage for consumer products, ≥ 99% for automotive (ISO 26262) and aerospace chips.
Untestable faults — faults that ATPG cannot generate patterns for due to circuit structure — should be justified and documented. Common causes include redundant logic (both branches give same output), blocking gates (output always 0 regardless of input), and DFT rule violations.
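The coverage metric and the effect of redundant logic can be seen in a small exhaustive fault simulation. The circuit below, `y = a OR (a AND b)`, is a deliberately redundant hypothetical example (it reduces to `y = a`), and the net and function names are assumptions for illustration. Real ATPG tools use far more efficient algorithms than brute-force comparison.

```python
from itertools import product

NETS = ("a", "b", "n", "y")

def simulate(a, b, fault=None):
    """y = a OR (a AND b). `fault` is (net, stuck_value) or None."""
    def net(name, value):
        return fault[1] if fault and fault[0] == name else value
    a = net("a", a)
    b = net("b", b)
    n = net("n", a & b)
    return net("y", a | n)

def fault_coverage(patterns):
    """Return (detected / total faults, list of undetected faults)."""
    faults = [(name, v) for name in NETS for v in (0, 1)]
    undetected = [flt for flt in faults
                  if all(simulate(a, b) == simulate(a, b, flt)
                         for a, b in patterns)]
    return (len(faults) - len(undetected)) / len(faults), undetected

exhaustive = list(product((0, 1), repeat=2))
print(fault_coverage(exhaustive))
# (0.625, [('b', 0), ('b', 1), ('n', 0)])
print(fault_coverage([(0, 0), (1, 0)])[0])  # 0.625 with only two patterns
```

Even exhaustive patterns leave three faults undetected, because the `a AND b` branch never changes the output. These are the kind of structurally untestable faults that need to be justified rather than chased with more patterns.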
6. Built-In Self Test (BIST)
BIST embeds test generation and response analysis directly on-chip, largely removing the dependence on expensive ATE for memory testing and reducing logic test cost for embedded processors. The two primary types are memory BIST (MBIST) and logic BIST (LBIST):
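Memory BIST (MBIST): a dedicated controller applies march algorithms (March C-, March LR) to embedded SRAM, testing for stuck-at, coupling, and address-decoder faults. ROMs, which cannot be written, are typically checked by reading out their contents and compressing them into a signature. Almost all chips with embedded memories use MBIST.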
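A minimal software sketch of March C- follows, run against a toy memory model with an optional stuck-at cell. The `Sram` class and `run_march` function are hypothetical names; a real MBIST controller is a hardware state machine, and this only mirrors its read/write sequence.

```python
class Sram:
    """Toy bit-wide memory; `stuck` maps address -> stuck value."""
    def __init__(self, size, stuck=None):
        self.cells = [0] * size
        self.stuck = stuck or {}

    def write(self, addr, bit):
        self.cells[addr] = self.stuck.get(addr, bit)

    def read(self, addr):
        return self.stuck.get(addr, self.cells[addr])

# March C-: {any(w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); any(r0)}
# The first and last elements may run in either direction; "up" is used here.
MARCH_C_MINUS = [
    ("up",   [("w", 0)]),
    ("up",   [("r", 0), ("w", 1)]),
    ("up",   [("r", 1), ("w", 0)]),
    ("down", [("r", 0), ("w", 1)]),
    ("down", [("r", 1), ("w", 0)]),
    ("up",   [("r", 0)]),
]

def run_march(mem, size, elements=MARCH_C_MINUS):
    """Return a list of (address, expected, got) read mismatches."""
    failures = []
    for direction, ops in elements:
        addrs = range(size) if direction == "up" else range(size - 1, -1, -1)
        for addr in addrs:
            for op, bit in ops:
                if op == "w":
                    mem.write(addr, bit)
                elif mem.read(addr) != bit:
                    failures.append((addr, bit, mem.read(addr)))
    return failures

print(run_march(Sram(8), 8))                 # [] for a fault-free memory
print(run_march(Sram(8, stuck={3: 1}), 8))   # every "read 0" at address 3 fails
```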
Logic BIST (LBIST): a PRPG (pseudo-random pattern generator) applies pseudo-random patterns to the logic, usually through the scan chains; a MISR (multiple-input signature register) compresses the responses into a signature. A mismatch against the golden signature indicates a fault.
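The PRPG/MISR pairing can be sketched with a 4-bit LFSR. Everything here is an illustrative assumption: the register width, the feedback taps (bit 3 XOR bit 2, which gives a period of 15), the seed, and the toy circuit under test `cut`. Production LBIST uses much wider registers and drives scan chains rather than a 4-bit function.

```python
def lfsr_step(state):
    """One step of a 4-bit Fibonacci LFSR, feedback = bit3 XOR bit2."""
    feedback = ((state >> 3) ^ (state >> 2)) & 1
    return ((state << 1) | feedback) & 0xF

def cut(x, fault=False):
    """Toy 4-input, 4-output circuit under test; `fault` sticks bit 0 at 0."""
    b0 = x & (x >> 1) & 1
    b1 = ((x >> 1) | (x >> 2)) & 1
    b2 = ((x >> 2) ^ (x >> 3)) & 1
    b3 = (x >> 3) & 1
    if fault:
        b0 = 0
    return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3)

def run_lbist(circuit, n_patterns=15, seed=0b0001):
    """Apply PRPG patterns and fold each response into the MISR."""
    prpg, misr = seed, 0
    for _ in range(n_patterns):
        response = circuit(prpg)
        misr = lfsr_step(misr) ^ response   # shift, then XOR in parallel
        prpg = lfsr_step(prpg)
    return misr

golden = run_lbist(cut)
observed = run_lbist(lambda x: cut(x, fault=True))
print("PASS" if observed == golden else "FAIL")  # FAIL
```

Because the MISR compresses many responses into a few bits, a faulty response stream can occasionally produce the golden signature (aliasing); wider signature registers make this less likely.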
7. JTAG Boundary Scan (IEEE 1149.1)
JTAG adds a boundary scan cell at each digital I/O pin of the chip. A standard 4-wire Test Access Port (TAP: TCK, TMS, TDI, TDO, plus an optional TRST reset) and its controller allow testing of board-level interconnect, detecting opens and shorts between chips on a PCB without physical probing.
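The TAP controller is a 16-state machine defined by IEEE 1149.1 and stepped by the value of TMS on each rising TCK edge. The sketch below encodes that transition table; the dictionary and function names are illustrative, and data movement on TDI/TDO is not modelled.

```python
# state -> (next state if TMS=0, next state if TMS=1)
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}

def tap_walk(state, tms_bits):
    """Apply a sequence of TMS values, one per TCK rising edge."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state

# Five TCK cycles with TMS=1 reach Test-Logic-Reset from any state.
print(all(tap_walk(s, [1] * 5) == "Test-Logic-Reset" for s in TAP_NEXT))  # True
# From reset, TMS = 0,1,0,0 enters Shift-DR to move data from TDI to TDO.
print(tap_walk("Test-Logic-Reset", [0, 1, 0, 0]))  # Shift-DR
```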
8. DFT-Friendly RTL Coding Rules
DFT insertion is largely automated, but RTL violations create untestable faults, blocked scan paths, or ATPG runtime explosion. Avoid these common issues:
- Use synchronous resets whenever possible — asynchronous resets create extra ATPG exemptions and need special handling in scan mode.
- Never gate the clock with combinational logic in RTL — always use ICG cells. Gated clocks become multiple clock domains and complicate scan stitching.
- Bring scan_enable and test_mode out as dedicated top-level ports; never derive them from functional logic.
- Avoid tri-state buses on internal signals; they create X-propagation in ATPG that blocks fault detection.
- Don't use feedback loops with no scan access — loops must be broken by at least one scan FF for ATPG to work.
- Don't use internally-generated clocks (ring oscillators, divided clocks) for core logic — ATPG cannot control them in scan mode.
- Isolate memories behind MUX-based test interfaces — direct memory reads/writes in scan mode can corrupt state.
- Keep combinational paths between scan FFs short; excessive logic depth makes at-speed (transition) patterns harder to close on timing and increases ATPG effort.