
Structural vs Functional Testing

Every chip that leaves the fab must be tested. But there are two fundamentally different philosophies of testing — one checks what was built, the other checks what it does. Understanding both is essential for any DFT or verification engineer.

[Figure: two-column overview]
STRUCTURAL TESTING ("Was the chip built correctly?"): stuck-at faults (SA0 / SA1) · transition faults (slow-to-rise/fall) · path delay faults (timing violations) · bridging faults (short between wires) · ATPG (auto test patterns) · scan chains (DFT infrastructure). Target: 99%+ fault coverage · ATE at wafer/package.
FUNCTIONAL TESTING ("Does the chip do the right thing?"): MBIST (memory self-test) · LBIST (logic self-test) · boundary scan (JTAG, IEEE 1149.1) · functional vectors (use-case patterns) · IDDQ testing (quiescent current) · post-silicon board-level validation. Target: spec compliance · used at board / system level.
Both test types are essential — structural tests catch manufacturing defects; functional tests verify correct behaviour. Production chips use both.
Fundamental Concept
The Core Difference — One Question Changes Everything
🔩 Structural Testing
  • Question: "Was this chip manufactured without defects?"
  • Models: Physical defects — shorts, opens, resistive bridges
  • Patterns: Mathematically generated by ATPG tools
  • Infrastructure: Requires scan insertion during DFT
  • When: Wafer sort (pre-package) + final test (post-package)
  • Tools: ATE (Teradyne, Advantest) + ATPG (TetraMAX, Modus)
  • Metric: Fault coverage % (target ≥ 99%)
  • Does NOT verify: Functional correctness, protocol compliance, system-level behaviour
⚡ Functional Testing
  • Question: "Does this chip behave as specified?"
  • Models: Application use cases and corner scenarios
  • Patterns: Derived from design spec, RTL simulation, or BIST hardware
  • Infrastructure: BIST engines, JTAG, or board-level test harness
  • When: Post-silicon bring-up, board-level test, field diagnostics
  • Tools: BIST FSM on-chip, oscilloscopes, protocol analyzers
  • Metric: Test coverage against specification requirements
  • Does NOT reliably catch: Stuck wires, bridge defects (those are the job of structural tests)
Property | Structural Testing | Functional Testing
Goal | Detect manufacturing defects | Verify design correctness & spec compliance
Fault model | Stuck-at, transition, path delay, bridging | Specification failures, protocol violations, data errors
Pattern source | ATPG tool (automatic) | Design spec, simulation, BIST engine
DFT required? | Yes: scan chains, test pins | Optional (BIST) or no (external functional ATE)
Test speed | Slow scan shift; capture runs at-speed only for transition and path delay patterns | At-speed or functional speed
Test time | Milliseconds (high vector parallelism) | Seconds to minutes (full boot, SW execution)
What it misses | Functional bugs, spec violations | Low-level defects (shorts, resistive bridges)
Used at | Wafer sort, package test (ATE floor) | Board bring-up, system integration, field diagnostics
Industry metric | Fault coverage % (99%+ target) | Functional test coverage vs. spec
Typical cost | $0.10–$5.00 per chip (ATE time) | Higher: board, SW, longer test time
Deep Dive — Structural
Structural Test Types & Fault Models

A fault model is a mathematical abstraction of what a physical manufacturing defect looks like at the logic level. The four primary fault models cover the vast majority of real silicon defects.

📌
Most Common
Stuck-At Fault (SAF)
A wire is permanently fixed at logic 0 (SA0) or logic 1 (SA1) regardless of what drives it. Caused by a short circuit to GND (SA0) or VDD (SA1). The industry-standard fault model since the 1970s. ATPG generates patterns that:

1. Activate the fault: set the stuck node to its opposite value.
2. Propagate the fault effect through logic to an observable output.
3. Observe the wrong value at a scan flop or primary output.
SA0 · SA1 · 99%+ target
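The activate / propagate / observe steps can be illustrated with a toy fault simulation. The sketch below assumes a hypothetical three-input circuit, y = (a AND b) OR c, with the fault placed on the internal net n1; it is an exhaustive search for illustration only, not how a production ATPG tool searches.

```python
from itertools import product

def circuit(a, b, c, stuck=None):
    """y = (a AND b) OR c. `stuck` forces internal net n1 to 0 (SA0) or 1 (SA1)."""
    n1 = a & b
    if stuck is not None:
        n1 = stuck
    return n1 | c

def detecting_patterns(stuck):
    """All (a, b, c) inputs whose output differs between good and faulty circuit."""
    return [p for p in product((0, 1), repeat=3)
            if circuit(*p) != circuit(*p, stuck=stuck)]

sa0_tests = detecting_patterns(0)   # needs n1=1 (activate) and c=0 (propagate)
sa1_tests = detecting_patterns(1)   # needs n1=0 (activate) and c=0 (propagate)
print("n1 SA0 detected by:", sa0_tests)
print("n1 SA1 detected by:", sa1_tests)
```

Only one pattern, (1, 1, 0), detects n1 SA0: a and b must both be 1 to drive n1 to the opposite of the stuck value, and c must be 0 so the OR gate lets the wrong value reach y.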
⏱️
Timing Defects
Transition Fault (TF)
A wire can toggle but does so too slowly — it fails to complete a 0→1 (slow-to-rise) or 1→0 (slow-to-fall) transition within the clock period. Caused by resistive contacts, high-resistance opens, or marginal drive strength. Requires at-speed testing (launch-capture with two clocks) since slow transitions are invisible at reduced test frequency.
slow-to-rise · slow-to-fall · at-speed
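Why a slow transition escapes a slow test clock can be shown with simple arithmetic. This is a minimal sketch with made-up delay numbers (0.8 ns for a healthy path, 1.6 ns for a defective one); the function names are illustrative and not taken from any tool.

```python
GOOD_DELAY_NS = 0.8   # hypothetical defect-free path delay
SLOW_DELAY_NS = 1.6   # hypothetical delay with a resistive defect (slow-to-rise)

def captured_value(old, new, path_delay_ns, capture_period_ns):
    """The capture flop sees the new value only if the transition arrives in time."""
    return new if path_delay_ns <= capture_period_ns else old

def detects(capture_period_ns):
    """True if good and defective paths capture different values for a 0->1 launch."""
    good = captured_value(0, 1, GOOD_DELAY_NS, capture_period_ns)
    bad = captured_value(0, 1, SLOW_DELAY_NS, capture_period_ns)
    return good != bad

slow_clock = detects(capture_period_ns=20.0)  # 50 MHz test clock
at_speed = detects(capture_period_ns=1.0)     # 1 GHz functional clock
print(slow_clock, at_speed)  # False True
```

With a 20 ns capture window both paths settle in time and the defect is invisible; with a 1 ns window the defective path still holds the old value when the capture clock arrives.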
🛤️
Critical Path
Path Delay Fault (PDF)
Tests the cumulative delay along a specific combinational path through multiple gates. Even if every individual gate is fast enough, their combined delay on the critical path may exceed the clock period. Path delay testing requires full functional-speed clocking and two-cycle launch-capture patterns (LOS: launch-on-shift or LOC: launch-on-capture). Detects deep submicron timing failures invisible to stuck-at tests.
critical path · LOS · LOC · setup time
🔗
Physical Shorts
Bridging Fault (BF)
Two adjacent wires are shorted together due to a particle or lithography defect. The shorted wires create a logic conflict — one wire trying to drive 0 and the other 1 — that may resolve as AND or OR depending on transistor drive strengths. Bridging faults are harder to detect than stuck-at: the fault effect depends on both the victim and aggressor wire values simultaneously.
AND-bridging · OR-bridging · wired-AND
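The dependence on both wire values can be seen in a small model. The sketch below assumes the two idealised resolutions named above (wired-AND and wired-OR); real bridges may resolve to intermediate voltages, which this model ignores.

```python
from itertools import product

def bridged(a, b, kind):
    """Both shorted nets take the same resolved value."""
    v = (a & b) if kind == "wired-AND" else (a | b)
    return v, v

def observable_patterns(kind):
    """Driven (a, b) pairs for which at least one net reads a wrong value."""
    return [(a, b) for a, b in product((0, 1), repeat=2)
            if bridged(a, b, kind) != (a, b)]

def corrupted_net(a, b, kind):
    """Name of the net whose value is overridden by the bridge, or None."""
    va, vb = bridged(a, b, kind)
    if va != a:
        return "a"
    if vb != b:
        return "b"
    return None

and_tests = observable_patterns("wired-AND")
or_tests = observable_patterns("wired-OR")
print(and_tests, or_tests)
print(corrupted_net(0, 1, "wired-AND"), corrupted_net(0, 1, "wired-OR"))  # b a
```

The bridge only shows up when the two nets are driven to opposite values, and which net carries the error depends on the resolution type, so a test must also propagate the correct one of the two nets to an observation point.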
🔋
Leakage Defects
IDDQ Testing
Measures the quiescent supply current (IDDQ) when the chip is in a static state (no toggling). A defect-free CMOS gate draws near-zero current when static. Resistive bridges, gate oxide shorts, and parasitic transistors create unexpected current paths that elevate IDDQ by 10–1000×. IDDQ testing catches defects that logic-value tests can miss, particularly resistive bridges and gate oxide shorts.
quiescent current · oxide short · CMOS
🔬
Advanced Node
Cell-Aware Fault (CAF)
At advanced nodes (7nm and below), defects can occur inside a standard cell, not just on the wires connecting cells. Cell-aware ATPG uses transistor-level models of each library cell to detect intra-cell defects (an extra contact inside a NAND gate, for example) that are invisible to conventional stuck-at models. It is widely used at these nodes to meet DPPM (defective parts per million) targets.
7nm and below · intra-cell · DPPM

The Scan Chain — DFT Infrastructure for Structural Tests

Structural tests need to control and observe every flip-flop in the design. In normal operation, most flip-flops are buried deep inside the chip — unreachable from the primary I/O pins. The scan chain solves this by converting all flip-flops into a giant shift register.

Pin Added to Each FF | Function | Active When
Scan In (SI) | Serial data input for shifting test patterns into the flop | SE = 1 (scan mode)
Scan Enable (SE) | Mux select: 0 = functional path, 1 = scan path | Controlled by test controller
Scan Out (SO) | Serial output; connects to SI of the next flop in the chain | SE = 1 (shift phase)
SystemVerilog · scan flip-flop model
// Scan-enabled D flip-flop (SDFF)
// The DFT insertion step replaces ordinary flops with these automatically
module scan_dff (
  input  logic clk,
  input  logic rst_n,
  input  logic d,          // functional data input
  input  logic si,         // scan serial input
  input  logic se,         // scan enable: 1=scan mode, 0=functional
  output logic q,
  output logic so          // scan serial output (= q, feeds next FF)
);
  logic d_mux;
  assign d_mux = se ? si : d;   // mux: functional vs scan path
  assign so = q;                 // scan output is FF output

  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) q <= 1'b0;
    else        q <= d_mux;
  end
endmodule

// In a design with N flip-flops, they are chained:
// scan_in → FF[0].si → FF[0].so → FF[1].si → ... → FF[N-1].so → scan_out
// Pattern shift-in: apply N clocks with SE=1
// Capture: apply 1 clock with SE=0 (functional capture)
// Shift-out: apply N clocks with SE=1, sample scan_out
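The shift / capture / shift sequence in the comments above can be modelled in a few lines of Python. This is a behavioural sketch only: the `ScanChain` class and the `next_state` logic (each flop captures the XOR of itself and its left neighbour, wrapping around) are hypothetical stand-ins for a real design.

```python
class ScanChain:
    def __init__(self, n):
        self.q = [0] * n                      # q[0] is the flop nearest scan_in

    def shift(self, scan_in_bit):
        """One clock with SE=1: every flop takes its upstream neighbour's value."""
        scan_out_bit = self.q[-1]
        self.q = [scan_in_bit] + self.q[:-1]
        return scan_out_bit

    def capture(self, logic):
        """One clock with SE=0: flops load their functional D inputs."""
        self.q = logic(self.q)

def next_state(q):
    """Hypothetical combinational logic between the flops (ring of XORs)."""
    return [q[i] ^ q[i - 1] for i in range(len(q))]

def run_pattern(chain, pattern, logic):
    clocks = 0
    for bit in reversed(pattern):             # last bit first, so q ends up == pattern
        chain.shift(bit)
        clocks += 1
    loaded = list(chain.q)
    chain.capture(logic)
    clocks += 1
    outs = []
    for _ in pattern:
        outs.append(chain.shift(0))
        clocks += 1
    return loaded, outs[::-1], clocks         # reverse: q[-1] comes out first

loaded, response, clocks = run_pattern(ScanChain(4), [1, 1, 0, 0], next_state)
print(loaded, response, clocks)
```

For a 4-flop chain the pattern costs 4 + 1 + 4 = 9 clocks. On a tester the shift-out of one pattern overlaps the shift-in of the next, so the steady-state cost is closer to N + 1 clocks per pattern.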

ATPG Flow — Automatic Test Pattern Generation

01
Scan-inserted netlist input
The ATPG tool reads the post-scan-insertion gate-level netlist (for example from Synopsys DFT Compiler or Cadence Genus). All flip-flops are now scan flops. Scan chains are stitched.
02
Fault list generation
Tool creates a list of all possible faults based on the fault model (stuck-at: 2 faults per node; transition: 2 per node; path delay: one per critical path). A design with 1M nets has ~2M stuck-at faults.
03
Pattern generation (D-algorithm / PODEM / FAN)
For each undetected fault, ATPG uses Boolean satisfiability or search algorithms to find input values that activate the fault and propagate its effect to an observable output. Some faults are redundant (provably undetectable) and are excluded.
04
Fault simulation & coverage report
Each generated pattern is fault-simulated against the full fault list to count how many additional faults each pattern detects. The process continues until the coverage target is met or no more patterns can be generated.
05
Pattern delivery to ATE
Patterns are output in STIL (Standard Test Interface Language) or WGL format for loading into ATE (Teradyne UltraFLEX, Advantest T2000). The ATE applies the patterns to the physical chip at wafer sort and package test.
ATPG Tool | Vendor | Fault Models Supported
TetraMAX (TMAX) | Synopsys | Stuck-at, transition, path delay, IDDQ, cell-aware
Modus | Cadence | Stuck-at, transition, path delay, cell-aware, bridge
FastScan / Tessent | Siemens (Mentor) | Stuck-at, transition, path delay, IDDQ, cell-aware
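Steps 02 to 04 of the flow (fault list, pattern selection, fault simulation with coverage) can be sketched on a toy circuit. The example assumes the hypothetical circuit y = (a AND b) OR c and simply walks all input patterns in order, keeping those that detect a new fault; real tools use targeted algorithms such as PODEM rather than enumeration.

```python
from itertools import product

NETS = ["a", "b", "c", "n1", "y"]

def simulate(pattern, fault=None):
    """Evaluate y = (a AND b) OR c; `fault` is (net, stuck_value) or None."""
    def f(net, value):
        return fault[1] if fault and fault[0] == net else value
    a, b, c = (f(net, v) for net, v in zip("abc", pattern))
    n1 = f("n1", a & b)
    return f("y", n1 | c)

# Step 02: two stuck-at faults (SA0, SA1) per net
fault_list = [(net, v) for net in NETS for v in (0, 1)]

# Steps 03-04: keep a pattern only if it detects a not-yet-detected fault
undetected = set(fault_list)
kept_patterns = []
for pattern in product((0, 1), repeat=3):
    good = simulate(pattern)
    newly = {flt for flt in undetected if simulate(pattern, flt) != good}
    if newly:
        kept_patterns.append(pattern)
        undetected -= newly                   # fault dropping

coverage = 100.0 * (len(fault_list) - len(undetected)) / len(fault_list)
print(len(fault_list), "faults,", len(kept_patterns), "patterns,", coverage, "% coverage")
```

Dropping detected faults from the working list is what keeps fault simulation tractable: each later pattern is only simulated against the faults that are still open.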
Deep Dive — Functional
Functional Test Types

Functional tests verify that a chip behaves correctly under real operating conditions. Unlike structural tests that use artificial patterns, functional tests use real application vectors — or self-test engines that mimic real operation.

MBIST — Memory Built-In Self Test

Every SoC contains multiple memories (SRAMs, ROMs, register files). Memories are particularly prone to manufacturing faults (stuck bits, coupling faults, address decoder faults). MBIST embeds a hardware FSM inside the chip that tests all memories autonomously — no ATE required.

MBIST Algorithm | What It Tests | Pattern Written/Read
March C− | Stuck-at, transition, coupling faults (most common) | ↑W0, ↑R0W1, ↑R1W0, ↓R0W1, ↓R1W0, ↓R0
Checkerboard | Adjacent-cell coupling (alternating 0/1 pattern) | 0101…/1010… written, read back, inverted, read again
Walking 1s / 0s | Address decoder faults, bit-line opens | Single 1 (or 0) walked through all addresses
GALPAT | All coupling faults (exhaustive but slow) | Each cell flipped, all others read; O(N²) complexity
March B | Stuck-at, transition, and some linked coupling faults | Longer march sequence (17 operations per cell)
SystemVerilog · March C- algorithm in hardware FSM
// MBIST controller: March C- on a 256×8 SRAM
// 6 march elements: ↑W0, ↑R0W1, ↑R1W0, ↓R0W1, ↓R1W0, ↓R0
// Timing is simplified: each read-verify and its write are shown in the same
// cycle. A real controller splits them to match the SRAM's read latency.
module mbist_ctrl #(
  parameter  int DEPTH = 256,
  parameter  int WIDTH = 8,
  localparam int AW    = $clog2(DEPTH)   // address width derived from DEPTH
)(
  input  logic             clk, rst_n, bist_start,
  output logic             bist_done, bist_pass,
  // SRAM interface (widths follow the parameters)
  output logic [AW-1:0]    mem_addr,
  output logic [WIDTH-1:0] mem_wdata,
  output logic             mem_we,
  input  logic [WIDTH-1:0] mem_rdata
);
  typedef enum logic [2:0] {
    IDLE, OP0_W0, OP1_R0W1, OP2_R1W0,
    OP3_R0W1_DN, OP4_R1W0_DN, OP5_R0, DONE
  } state_t;

  state_t state;
  logic [AW-1:0] addr;
  logic fail_flag;

  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      state <= IDLE; bist_done <= 1'b0; bist_pass <= 1'b0;
      addr <= '0; fail_flag <= 1'b0;
      mem_we <= 1'b0; mem_addr <= '0; mem_wdata <= '0;  // reset SRAM outputs too
    end else begin
      mem_we <= 1'b0;                                    // default: no write
      case (state)
        IDLE: if (bist_start) begin state <= OP0_W0; addr <= '0; end

        OP0_W0: begin   // ↑W0: write 0 to all addresses (ascending)
          mem_we <= 1'b1; mem_wdata <= '0; mem_addr <= addr;
          if (addr == DEPTH-1) begin state <= OP1_R0W1; addr <= '0; end
          else addr <= addr + 1'b1;
        end

        OP1_R0W1: begin // ↑R0W1: read 0, write 1 (ascending)
          mem_addr <= addr;
          if (mem_rdata != '0) fail_flag <= 1'b1;  // read verify: expect all zeros
          mem_we <= 1'b1; mem_wdata <= '1;         // then write all ones
          if (addr == DEPTH-1) begin state <= OP2_R1W0; addr <= '0; end
          else addr <= addr + 1'b1;
        end
        // ... remaining operations follow same pattern ...

        DONE: begin bist_done <= 1'b1; bist_pass <= ~fail_flag; end
        default: ;      // elided states: no action in this excerpt
      endcase
    end
  end
endmodule
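To see the full six-element sequence catch a defect, the algorithm can be run in software against a memory model. This is a behavioural sketch, not a model of the RTL above: `FaultyRam` is a hypothetical one-bit-wide memory with an optional stuck-at cell.

```python
class FaultyRam:
    """Bit-wide memory with an optional stuck-at cell (fault-injection model)."""
    def __init__(self, depth, stuck=None):
        self.cells = [0] * depth
        self.stuck = stuck                    # (address, value) or None

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        if self.stuck and self.stuck[0] == addr:
            return self.stuck[1]              # defective cell ignores what was written
        return self.cells[addr]

def march_c_minus(ram, depth):
    """Run ↑W0, ↑R0W1, ↑R1W0, ↓R0W1, ↓R1W0, ↓R0; return True if every read matches."""
    up, down = range(depth), range(depth - 1, -1, -1)
    elements = [                              # (address order, expected read, value to write)
        (up, None, 0), (up, 0, 1), (up, 1, 0),
        (down, 0, 1), (down, 1, 0), (down, 0, None),
    ]
    ok = True
    for order, expect, write in elements:
        for addr in order:
            if expect is not None and ram.read(addr) != expect:
                ok = False
            if write is not None:
                ram.write(addr, write)
    return ok

good_pass = march_c_minus(FaultyRam(256), 256)
sa1_pass = march_c_minus(FaultyRam(256, stuck=(5, 1)), 256)
sa0_pass = march_c_minus(FaultyRam(256, stuck=(200, 0)), 256)
print(good_pass, sa1_pass, sa0_pass)  # True False False
```

Every cell is read expecting both 0 and 1, in both address orders, which is why a cell stuck at either value fails at least one read.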

LBIST — Logic Built-In Self Test

LBIST tests the chip's logic (not memories) by generating pseudo-random test patterns on-chip and compressing the outputs into a signature that is compared against a golden value.

Component | Full Name | Function
PRPG | Pseudo-Random Pattern Generator | LFSR that generates pseudo-random patterns and feeds them into the scan chains as test stimuli
MISR | Multiple-Input Signature Register | XOR-based compactor at the scan chain outputs; compresses millions of response bits into a 32-bit or 64-bit signature
Golden Signature | Expected MISR output | Pre-computed from a defect-free simulation; stored in ROM or OTP memory on-chip
LBIST Controller | FSM managing the test | Controls PRPG seed, number of patterns, capture mode, MISR read-out, and pass/fail comparison
LBIST at power-on (ISO 26262): Automotive chips built to ISO 26262 functional-safety requirements commonly run LBIST every time the car starts, before the ECU is allowed to control safety systems. If the LBIST signature doesn't match, the chip reports a fault and the system enters safe mode. The check is enforced in hardware and runs without software involvement.
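The PRPG, MISR, and golden-signature comparison can be sketched as follows. Everything here is illustrative: the 8-bit width, the tap positions, the seed, and the `cut` function standing in for the logic under test are assumptions, and the "defect" is modelled as a single flipped response bit.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1
TAPS = (7, 5, 4, 3)   # hypothetical taps; bit 7 is tapped so the update is invertible

def lfsr_step(state):
    """Fibonacci LFSR: shift left, feed the XOR of the tapped bits into bit 0."""
    fb = 0
    for t in TAPS:
        fb ^= (state >> t) & 1
    return ((state << 1) | fb) & MASK

def cut(pattern):
    """Stand-in for the circuit under test: any deterministic function works."""
    return (pattern ^ (pattern >> 1) ^ 0x3C) & MASK

def run_lbist(seed, n_patterns, error_at=None):
    prpg, misr = seed, 0
    for i in range(n_patterns):
        response = cut(prpg)
        if i == error_at:
            response ^= 0x01                  # inject a single-bit response error
        misr = lfsr_step(misr) ^ response     # fold the response into the signature
        prpg = lfsr_step(prpg)                # next pseudo-random pattern
    return misr

golden = run_lbist(0x5A, 200)                 # "defect-free simulation"
rerun = run_lbist(0x5A, 200)
faulty = run_lbist(0x5A, 200, error_at=17)
print(rerun == golden, faulty == golden)  # True False
```

Because the MISR update is linear and invertible, a single wrong response bit always changes the final signature. Multi-bit errors can in principle cancel (aliasing), which is one reason production MISRs are 32 or 64 bits wide.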

Boundary Scan — JTAG IEEE 1149.1

Boundary scan adds a shift register cell around every I/O pin of every chip on a PCB. By daisy-chaining these cells through the standard 4-pin JTAG interface (TCK, TMS, TDI, TDO), a test controller can drive signals onto the board and observe responses — without any physical probe access to internal nodes.

JTAG Pin | Name | Function
TCK | Test Clock | Drives the boundary scan shift register; typically 1–100 MHz
TMS | Test Mode Select | Navigates the TAP (Test Access Port) state machine, which has 16 states
TDI | Test Data In | Serial input for shifting data/instructions into the scan register
TDO | Test Data Out | Serial output from the last cell in the chain; carries the observed data
TRST* | Test Reset (optional) | Asynchronously resets the TAP state machine

What boundary scan tests: Board-level interconnect — solder joints between chips, trace opens/shorts, chip-to-chip connectivity. Also used for in-system programming (flash loading), debugging (via ARM Debug Access Port extensions), and field diagnostics.
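The 16-state TAP controller mentioned in the TMS row is small enough to write out as a transition table. The sketch below encodes the IEEE 1149.1 state diagram as a Python dict; the `walk` helper is an illustrative name, and TMS is sampled once per TCK rising edge.

```python
TAP = {  # state: (next state if TMS=0, next state if TMS=1)
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}

def walk(state, tms_bits):
    """Apply a sequence of TMS values, one per TCK edge, and return the final state."""
    for bit in tms_bits:
        state = TAP[state][bit]
    return state

to_shift_dr = walk("Test-Logic-Reset", [0, 1, 0, 0])
to_shift_ir = walk("Test-Logic-Reset", [0, 1, 1, 0, 0])
print(to_shift_dr, to_shift_ir)  # Shift-DR Shift-IR
```

A useful property of the diagram is that holding TMS high for five TCK cycles reaches Test-Logic-Reset from any state, which is why TRST* can be optional.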

Functional ATE Vectors

Functional ATE vectors apply real application patterns (boot sequences, instruction execution, DMA transfers, protocol handshakes) to the packaged chip on the ATE floor. These complement structural tests by catching specification failures that structural patterns cannot detect.

Production Test Strategy
Where Each Test Type Is Applied
Test Stage | Test Types Used | What's Caught | Cost/Chip
Wafer Sort (EWS) | Structural: stuck-at, transition, IDDQ | Gross manufacturing defects before packaging; saves packaging cost on bad die | Low (ms/chip)
Package Test (Final) | Structural: all faults + functional: MBIST, LBIST, boundary scan | Defects introduced during packaging, final quality gate | Medium ($0.10–$5)
System Bring-up | Functional: BIST, JTAG, board-level | Board assembly defects, chip-to-chip interface issues, firmware compatibility | High (hours/board)
Power-on (In-field) | Functional: LBIST (automotive), MBIST | Latent defects that develop over time (aging); confirms chip is healthy before safety-critical operation | Minimal (seconds)
Failure Analysis | All types + physical FA (FIB, SEM) | Root cause of field returns; feeds back to yield improvement | High (days/sample)
Industry standard: Production silicon uses both structural AND functional testing in sequence. Structural tests (ATPG) run first at the ATE — they are fast (milliseconds) and catch the vast majority of manufacturing defects. Functional tests run after — they are slower but verify the chip works as a system. Skipping either type increases the risk of shipping defective parts to customers.
Key Metric
Fault Coverage — How to Measure & Hit 99%

Fault Coverage (FC) = (Detected Faults + Potentially Detected Faults) / (Total Modelled Faults − Redundant Faults) × 100%

Fault Category | Symbol | Meaning
Detected | DT | At least one pattern activates and observes this fault; definitely tested
Potentially Detected | PD | Pattern activates the fault but the effect may be masked; possibly tested
Undetected | UD | No pattern found that detects this fault; ATPG either gave up or has not yet proven the fault redundant
Redundant | RE | Mathematically proven to be undetectable: the defect cannot change the observable output for any input. Excluded from coverage calculation.
Not Controlled / Not Observed | NC/NO | Fault cannot be detected because the node can't be set (NC) or the effect can't be observed (NO); often X-state sources or black boxes
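A worked example of the coverage calculation, with redundant faults excluded from the denominator as the table notes. The fault counts are invented for illustration; commercial tools report several related metrics and may weight potentially detected faults differently.

```python
def fault_coverage(detected, potentially_detected, total, redundant=0):
    """Coverage in percent over the faults that are not proven redundant."""
    testable = total - redundant
    return 100.0 * (detected + potentially_detected) / testable

# Hypothetical ATPG report for a design with ~1M nets (2 stuck-at faults per net)
counts = {"total": 2_000_000, "DT": 1_960_000, "PD": 4_000, "RE": 10_000}

fc = fault_coverage(counts["DT"], counts["PD"], counts["total"], counts["RE"])
fc_raw = fault_coverage(counts["DT"], counts["PD"], counts["total"])
remaining = counts["total"] - counts["RE"] - counts["DT"] - counts["PD"]
print(f"coverage excluding redundant: {fc:.2f}%")
print(f"coverage over all faults:     {fc_raw:.2f}%")
print(f"faults still to close (UD/NC/NO): {remaining}")
```

In this made-up report, 26,000 undetected faults stand between 98.69% and full coverage, which is the list the improvement steps below would work through.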

How to Improve Stuck-At Coverage When You're Stuck Below 99%

1
Identify hard-to-test structures
Run ATPG and analyse the undetected/untestable fault report. Most coverage gaps trace to asynchronous resets, black-box modules, clock gating cells, or test-mode lockouts. The ATPG tool lists each UD fault with a reason code.
2
Add test points (TPs)
Insert observation test points (an extra scan flop, often fed by an XOR of several hard-to-observe nodes) near nodes that can't be observed. Insert control test points (a MUX fed by a scan flop) at hard-to-control nodes. These add area but directly improve coverage of otherwise untestable logic.
3
Fix X-sources
Uninitialized flops, memories read before write, and analog outputs driving digital logic create X-values that mask fault effects. Constrain X-sources in the ATPG model, add tie-offs, or use X-bounding techniques. X-state handling is the single biggest coverage killer in complex SoCs.
4
Increase scan chain count / reduce chain length
More scan chains = more parallel shift-in paths = better pattern concurrency and shorter test time. Target 500–2000 flops per chain. Too-long chains cause test-time explosion; too-short chains waste I/O pins.
5
Add EDT / compression
EDT (Embedded Deterministic Test) adds an on-chip decompressor between the scan I/O and the internal scan chains, allowing 50–100× pattern compression — same coverage, 50–100× less ATE time, dramatically lower test cost per chip.
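The effect of chain length (step 4) and compression (step 5) on tester time can be estimated with back-of-envelope arithmetic. The figures below (1M flops, 10,000 patterns, 50 MHz shift clock, 8 tester channels) are hypothetical, and the model assumes the pattern count stays the same with compression, which is a simplification.

```python
import math

def scan_test_time_s(n_flops, n_chains, n_patterns, shift_mhz):
    """Approximate scan test time: one shift per flop in the longest chain,
    plus one capture clock, per pattern (unload overlaps the next load)."""
    chain_len = math.ceil(n_flops / n_chains)
    cycles = n_patterns * (chain_len + 1)
    return cycles / (shift_mhz * 1e6)

FLOPS, PATTERNS, SHIFT_MHZ = 1_000_000, 10_000, 50

t_plain = scan_test_time_s(FLOPS, 8, PATTERNS, SHIFT_MHZ)         # 8 long chains
t_compressed = scan_test_time_s(FLOPS, 800, PATTERNS, SHIFT_MHZ)  # 800 short internal chains
speedup = t_plain / t_compressed
print(f"{t_plain:.2f} s vs {t_compressed:.4f} s, about {speedup:.0f}x faster")
```

Under these assumptions the same 8 tester channels feed 800 internal chains that are each 100× shorter, so shift time per pattern falls by roughly the same factor.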
Interview Preparation
Structural vs Functional Testing — Top Interview Q&A

These questions appear at Qualcomm, Intel, NVIDIA, Texas Instruments, Broadcom, MediaTek, and semiconductor test companies targeting DFT engineers and digital design engineers.

Q1. What is the difference between structural and functional testing?

Structural testing targets manufacturing defects using abstract fault models (stuck-at, transition, bridging). Patterns are generated automatically by ATPG. It does not care what the chip is supposed to do — only whether physical wires and gates are intact. Metric: fault coverage %.

Functional testing verifies that the chip meets its specification by applying real application stimuli, or self-test engines that mimic real operation, and checking outputs. Its strength is catching behavioural failures that fault-model-driven patterns do not target. Examples: MBIST (does this memory read back what was written?), functional ATE vectors (does the processor execute ADD correctly?).

Key interview point: A chip can pass 100% structural coverage and still be functionally broken if there's a design bug. A chip can also pass all functional tests but have a latent defect that causes early field failure — which structural testing would catch.

Q2. Why do we need scan chains? Can't we test without them?

Without scan chains, internal flip-flops are only reachable through long chains of combinational logic from primary inputs. To set a specific flip-flop to a known state, you might need to apply hundreds of clock cycles of specific input patterns — and then you can only observe its effect after propagating through more logic to a primary output.

Scan chains make every flip-flop directly controllable (shift in any value) and directly observable (shift out its captured value) with only 2×N clock cycles regardless of design depth (N = chain length). This transforms an exponentially hard test problem into a linear one.

The area overhead is typically 5–10% for the added scan muxes, but the alternative — untestable or uneconomically testable silicon — is far worse.

Q3. What is the difference between stuck-at and transition faults?

Stuck-at fault (SAF): A wire is permanently fixed at 0 or 1 — it cannot change regardless of what logic drives it. Modelled as a DC defect (short to rail). Detected by applying the opposite value and checking if the output changes. Tests can run at slow speed — just need to set the stuck node to the opposite value.

Transition fault (TF): A wire CAN change, but does so too slowly. It fails to complete a 0→1 or 1→0 transition within the clock period. Modelled as a resistive open or weak drive. Requires at-speed testing (launch at functional frequency) because a slow-to-rise fault is invisible at reduced test frequency.

In practice, stuck-at tests are run first (they're cheaper — no speed constraints). Transition tests are added to catch timing-related manufacturing defects that stuck-at misses.

Q4. What is ATPG and how does it work?

ATPG (Automatic Test Pattern Generation) automatically generates input patterns that detect manufacturing faults. Given a scan-inserted netlist, ATPG:

1. Lists all faults — for stuck-at: 2 faults per net (SA0, SA1), ~2M faults for a 1M-net design.
2. For each fault, finds a test: set the stuck node to its opposite value (activate), trace the fault effect through gates to a scan flop or primary output (propagate), and verify the observable value is wrong (observe).
3. Uses algorithms like D-algorithm, PODEM, or FAN — essentially Boolean satisfiability searches through the circuit.
4. Some faults are redundant (no test exists) — ATPG proves this and excludes them.
5. Outputs patterns in STIL/WGL format for ATE.

Tools: Synopsys TetraMAX, Cadence Modus, Siemens Tessent.

Q5. What is fault coverage and what is the typical target?

Fault coverage = (Detected + Potentially Detected faults) / (Total faults − Redundant faults) × 100%

Industry targets vary by application; 99% or higher stuck-at coverage is the usual production goal.

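The gap between 95% and 99% matters a lot at scale: the share of untested faults drops from 5% to 1%, a 5× reduction. As a deliberately pessimistic illustration, if every untested fault turned into a field return, 1 billion shipped chips at $5 warranty cost per return would cost $250M at 95% coverage and $50M at 99%, a saving of $200M.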
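A less pessimistic way to relate coverage to escapes is the Williams-Brown defect-level approximation, DL = 1 − Y^(1 − FC), where Y is process yield and FC is fault coverage as a fraction. The sketch below applies it to the 95% versus 99% comparison; the 80% yield is an assumed figure, and the model itself is a simplification that treats faults as equally likely and independent.

```python
def defect_level(yield_fraction, fault_coverage):
    """Williams-Brown estimate of the fraction of shipped parts that are defective."""
    return 1.0 - yield_fraction ** (1.0 - fault_coverage)

YIELD = 0.80   # hypothetical process yield

dppm_95 = defect_level(YIELD, 0.95) * 1e6
dppm_99 = defect_level(YIELD, 0.99) * 1e6
ratio = dppm_95 / dppm_99
print(f"95% coverage: {dppm_95:.0f} DPPM")
print(f"99% coverage: {dppm_99:.0f} DPPM")
print(f"reduction: {ratio:.1f}x")
```

At this yield the estimated escape rate falls by roughly a factor of five when coverage moves from 95% to 99%, and full coverage drives the estimate to zero.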

Q6. What is BIST and when would you prefer it over external ATE?

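BIST (Built-In Self Test) embeds the test engine on the chip itself, eliminating the need for external ATE for those specific tests. Prefer BIST when the test must run without a tester attached, for example self-test of embedded memories or power-on LBIST in automotive parts.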

Drawback of BIST: it adds die area (PRPG, MISR, controller). Also, LBIST coverage is typically lower than deterministic ATPG because pseudo-random patterns rarely hit random-pattern-resistant faults, which need very specific input combinations.

Q7. What is the purpose of IDDQ testing?

IDDQ (quiescent supply current) testing measures the chip's DC current while all inputs are held static (no switching). A defect-free CMOS circuit draws near-zero static current because complementary P-channel and N-channel transistors never form a DC path simultaneously in steady state.

Defects that create extra current paths, such as gate oxide shorts (GOS), resistive bridges between VDD and VSS, and parasitic transistors, cause IDDQ to be 10× to 1000× higher than normal. These defects often don't create a stuck-at or transition fault but will cause premature field failure due to oxide breakdown or thermal degradation.

IDDQ is particularly valuable for: detecting high-resistance bridges, gate oxide reliability screening, and identifying chips that are structurally "pass" but are reliability risks.

Q8. What is EDT (Embedded Deterministic Test) and why is it used?

EDT is a test compression technique that reduces ATE test time by roughly 50–100× without significantly reducing fault coverage. Instead of shifting full test patterns directly into scan chains, EDT adds two on-chip modules:

Decompressor: Expands a small number of ATE channels (e.g., 8 channels) into hundreds of scan chains simultaneously using XOR logic. A single ATE shift cycle fills many chains in parallel.

Compactor: At the scan output, XOR-compresses responses from hundreds of chains into a few ATE channels for observation.

Result: the ATE drives 8 channels while the chip shifts 800 short internal chains in parallel. Each chain is about 100× shorter than it would be with only 8 chains, so every pattern shifts in roughly 1/100th the time and test cost per chip drops accordingly. Scan compression of this kind is standard on advanced-node chips; without it, test time at 7nm would be economically prohibitive.

Structural vs Functional Testing — Why Both Are Non-Negotiable

When a chip rolls off the production line, two fundamental questions must be answered before it can be shipped to a customer. The first is physical: was this chip built correctly? The fabrication process, despite its precision, introduces defects — a stray particle on a photomask, a slightly misaligned via, a marginal implant dose — that can make a wire permanently stuck at one logic level, or create a short between two adjacent metal lines. Structural testing exists to catch these manufacturing defects with mathematical certainty, using fault models that abstract physical reality into logical behaviour.

The second question is behavioural: does this chip do what the designer intended? A chip can be manufactured perfectly — every wire intact, every gate functioning — and still be wrong if the RTL had a bug. A FIFO that fills to 255 entries instead of 256 before asserting full. An AXI slave that holds READY low for one cycle too long under back-pressure. A divider that returns the correct quotient but the wrong remainder when the dividend is zero. Functional testing catches these design errors through application-realistic stimuli that structural patterns would never generate.

The complementary nature of both test philosophies

Neither test type subsumes the other. Structural tests are fast, automatic, and mathematically rigorous about coverage of physical defects — but they are deliberately ignorant of what the chip is supposed to compute. Functional tests are specification-aware and system-realistic — but they cover only the scenarios a human thought to test. Together they form the complete quality gate between the fab and the customer.

In modern SoC design, DFT engineers work alongside RTL designers from the very first design stage — not as an afterthought. Scan insertion, BIST engine placement, and testability analysis happen concurrently with functional design, because design choices that improve RTL elegance often hurt testability, and the cost of low fault coverage at ATE is paid in every chip that reaches a customer with a latent defect.