What is DFT Sign-off?
DFT sign-off is the formal gate that a chip must pass before its test strategy is approved for production. It is analogous to timing sign-off (STA) in the physical design flow: just as no chip ships with unclosed timing violations, no chip ships without meeting its DFT coverage and quality targets. The DFT sign-off review is conducted by the DFT team, reviewed by the design team lead, and approved by the test engineering manager before the design is released to the foundry for tape-out.
Sign-off answers a deceptively simple question: "If we run this test on the ATE, will we reliably find all meaningful defects while not incorrectly rejecting good dies?" Answering it requires demonstrating adequate fault coverage, acceptable pattern count and test time, power compliance during test, ATE pattern readiness, and a clear plan for handling field returns.
DFT sign-off is not a formality: it is the binding quality commitment made before silicon investment. Dropping coverage from 99% to 98% doubles the fraction of undetected faults, which can roughly double the DPPM shipped to customers and trigger expensive field recalls.
Fault Coverage Metrics
Fault coverage is the primary DFT metric. It quantifies what fraction of all modeled fault sites are detected by the test pattern set. Different fault models capture different physical defect mechanisms, so coverage is reported separately for each model.
The distinction between "untestable" and "undetected" is critical for sign-off. Untestable faults (reported by ATPG tools under classes such as unused, tied, blocked, and redundant) are excluded from the denominator because no test pattern can ever detect them. Undetected faults (faults the tool could detect with more patterns or run time) remain in the denominator and drag down coverage. Sign-off engineers must justify every untestable fault with an analysis showing that it cannot cause a functional failure.
| Fault Type | Coverage Formula | Industry Target | Tool Report Name |
|---|---|---|---|
| Stuck-At (SA) | SA_detected / SA_testable | >99% (99.5%+ tier-1) | Stuck-at fault coverage |
| Transition Fault (TF) | TF_detected / TF_testable | 92–96% | Transition fault coverage (at-speed) |
| Path Delay (PD) | PD_detected / PD_testable | >90% critical paths | Path delay coverage |
| Bridging | BR_detected / BR_testable | >85% (where modeled) | Bridging fault coverage |
| Cell Aware (CA) | CA_detected / CA_testable | >95% (advanced nodes) | Cell-aware coverage |
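The coverage formula in the table, and the effect of moving faults between the undetected and untestable buckets, can be sketched in a few lines of Python. The fault counts are made-up round numbers, and `coverage_pct` is an illustrative helper, not a tool API.

```python
def coverage_pct(detected: int, undetected: int) -> float:
    """Coverage as detected / testable, where testable = detected + undetected.

    Untestable faults are deliberately absent from the signature: they are
    excluded from the denominator.
    """
    testable = detected + undetected
    if testable == 0:
        raise ValueError("no testable faults")
    return 100.0 * detected / testable


# Hypothetical fault list: 1,000,000 faults in total.
detected = 985_000
undetected = 10_000
untestable = 5_000  # shown for completeness; never enters the formula

before = coverage_pct(detected, undetected)

# Reclassifying 4,000 undetected faults as untestable shrinks the denominator
# and raises the reported coverage without detecting a single extra fault.
after = coverage_pct(detected, undetected - 4_000)

print(f"before reclassification: {before:.2f}%")
print(f"after reclassification:  {after:.2f}%")
```

With these numbers the reported coverage crosses the 99% line purely through reclassification, which is why reviewers ask for a justification of every untestable fault.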
For advanced nodes (7 nm and below), cell-aware ATPG is increasingly required. Unlike stuck-at or transition fault models that treat each gate as a black box, cell-aware ATPG uses a detailed transistor-level fault model for each cell in the library. This catches intra-cell defects (broken transistors inside a complex gate) that traditional models miss. The cell-aware fault models are characterized per library cell and are typically supplied with the cell library or produced by running a characterization flow on it; the ATPG tool then generates patterns against those models.
DPPM — Defective Parts Per Million
DPPM (Defective Parts Per Million) is the quality metric that customers care about. It measures how many defective chips escape the test and reach the customer's hands, expressed per million chips shipped. DPPM directly correlates with warranty costs, field reliability, and customer trust.
DPPM Calculation
Near 100% coverage, every fraction of a percent matters because DPPM scales roughly with the fraction of faults left undetected. Going from 99% to 99.5% halves the undetected fraction, and with it the DPPM. Going from 99.5% to 99.9% cuts it by a further factor of five. This is why tier-1 companies chase 99.5%+ coverage aggressively: each remaining tenth of a percent removes a large share of the escapes that are left.
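The formula for this subsection did not survive in the text above. One commonly used approximation is the Williams-Brown defect level model, DL = 1 − Y^(1 − T), where Y is process yield and T is fault coverage as a fraction. The sketch below assumes that model and a yield of 90% chosen purely for illustration; neither is confirmed as the calculation the author intended.

```python
def dppm_williams_brown(yield_fraction: float, coverage: float) -> float:
    """Estimate shipped DPPM with the Williams-Brown model: DL = 1 - Y**(1 - T)."""
    if not (0.0 < yield_fraction <= 1.0 and 0.0 <= coverage <= 1.0):
        raise ValueError("yield must be in (0, 1] and coverage in [0, 1]")
    defect_level = 1.0 - yield_fraction ** (1.0 - coverage)
    return defect_level * 1_000_000


ASSUMED_YIELD = 0.90  # illustrative value, not taken from the text

for cov in (0.98, 0.99, 0.995, 0.999):
    dppm = dppm_williams_brown(ASSUMED_YIELD, cov)
    print(f"coverage {cov:.1%}: ~{dppm:.0f} DPPM")
```

Under this model, DPPM near full coverage is very nearly proportional to 1 − T, so the ratio between two coverage points is close to the ratio of their undetected fractions.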
Automotive chips (for example, parts used in ISO 26262 ASIL-D systems) are typically held to DPPM targets below 1, which requires not only 99.9%+ stuck-at coverage but also high transition fault coverage, cell-aware patterns, IDDQ screening, and often burn-in test. Consumer chips can typically accept 10–100 DPPM depending on the application and price point.
Test Escape Analysis
A test escape is a defective die that passes production test and is shipped to a customer. Test escapes are the worst possible DFT outcome: they mean real defects reached the field, often causing system failures long after sale.
Root Causes of Test Escapes
- Uncovered faults: The defect corresponds to a fault that is in the "undetected" category — the ATPG tool ran out of time or patterns before covering it.
- Fault model mismatch: The physical defect is a resistive open or a partial bridge — it behaves differently from a clean stuck-at-0 or stuck-at-1 model. The stuck-at patterns cannot reliably detect it.
- Pattern starvation near 99% boundary: The last 0.5–1% of faults often require many more patterns than the first 98%. If the ATPG run was time-limited, this long tail may be underserved.
- ATE translation errors: The STIL-to-ATE format conversion introduced timing or logic errors in the patterns. The patterns work in simulation but fail on silicon.
- Marginal defects at parametric limits: A borderline defect does not affect DC behavior but causes failures at speed or at low temperature. Stuck-at patterns, which are normally applied with slow capture clocks, do not find it; only at-speed transition or path-delay patterns do.
Escape Analysis Process
When a field return is analyzed (physical failure analysis, or PFA), the defect location and type are fed back to the DFT team. The team asks: "Is this defect covered by our pattern set?" If not, the pattern set is augmented. This feedback loop is called test escape analysis and is a key input to the DFT roadmap for the next tape-out.
ATPG Efficiency Metrics
Fault coverage alone does not define a good test strategy. A test with 99.5% coverage but 100,000 patterns is impractical: with a 100 MHz shift clock and scan chains around 2,000 flops long, it would take about two seconds per die, against a budget usually measured in hundreds of milliseconds. DFT sign-off also requires demonstrating that the pattern set is efficient: coverage achieved with minimal patterns in minimal time.
| Metric | Definition | Typical Target | Impact if Missed |
|---|---|---|---|
| Pattern Count (SA) | Number of stuck-at test patterns | <5,000 (compressed) | Test time too long; ATE memory exceeded |
| Pattern Count (TF) | Number of transition fault patterns | <10,000 (compressed) | At-speed test time balloons |
| Compression Ratio | Internal scan chains / external scan channels (EDT) | 32×–128× | ATE memory / test time limit |
| ATPG Run Time | Wall-clock time to generate patterns | <24 hours | Delays tape-out schedule |
| Fault Simulation Accuracy | Simulation-vs-silicon coverage correlation | >98% match | Coverage claims are unreliable |
| Untestable % (SA) | Fraction of SA faults classified untestable | <2% | Coverage ceiling too low; must justify each |
# ── Stuck-At Fault Coverage Report ──────────────────────────
Total Faults       : 4,823,412
Detected Faults    : 4,774,178
Untestable Faults  : 43,110 (0.89%)
Undetected Faults  : 6,124 (0.13%)
Fault Coverage     : 99.87% (detected / testable)
Test Coverage      : 98.98% (detected / total)
# ── Pattern Metrics ──────────────────────────────────────────
SA Patterns        : 3,847 (EDT compressed, 64x)
TF Patterns (LOC)  : 6,211 (at-speed, 128x)
Scan Chains        : 128 (EDT channels: 2)
Chain Length (max) : 1,924 FFs
# ── Test Time Estimate ───────────────────────────────────────
SA Test Time       : 74 ms (at 100 MHz shift, 64x compression)
TF Test Time       : 121 ms (at-speed capture, 128x compression)
Total Estimated    : 195 ms (with overhead)
# ── Sign-off Status ──────────────────────────────────────────
PASS: SA coverage >99% threshold  [99.87% PASS]
PASS: TF coverage >92% threshold  [94.12% PASS]
PASS: Pattern count <10,000       [SA:3847 TF:6211 PASS]
PASS: Test time <500ms budget     [195ms PASS]
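The stuck-at test time in the report can be cross-checked with a first-order estimate: each pattern needs roughly one shift cycle per flop in the longest chain. The sketch ignores capture cycles, the final unload, and tester overhead, so treat it as a sanity check rather than the tool's own calculation; `scan_test_time_ms` is an illustrative name.

```python
def scan_test_time_ms(patterns: int, max_chain_length: int, shift_mhz: float) -> float:
    """First-order scan test time: patterns * shift cycles per pattern / shift clock."""
    if shift_mhz <= 0:
        raise ValueError("shift frequency must be positive")
    shift_cycles = patterns * max_chain_length
    seconds = shift_cycles / (shift_mhz * 1e6)
    return seconds * 1e3


# Values taken from the report above.
sa_ms = scan_test_time_ms(patterns=3_847, max_chain_length=1_924, shift_mhz=100.0)
tf_ms = scan_test_time_ms(patterns=6_211, max_chain_length=1_924, shift_mhz=100.0)

print(f"SA shift time estimate: {sa_ms:.1f} ms")
print(f"TF shift time estimate: {tf_ms:.1f} ms")
```

The stuck-at estimate lands on the reported 74 ms. The transition estimate comes out slightly under the reported 121 ms, which is plausible given that capture cycles are not modeled here.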
ATE — Automatic Test Equipment
Once patterns are generated and sign-off metrics are satisfied, the test must be applied to real silicon on production ATE. Automatic Test Equipment is specialized hardware that sits at the heart of semiconductor manufacturing: virtually every chip passes through an ATE before shipping.
What ATE Does
ATE performs three fundamental functions:
- Apply test stimulus: Drive test vectors onto the chip's input pins with precise timing (picosecond-level edge placement) and at speed (from 100 MHz up to several GHz for high-speed interfaces).
- Measure response: Capture the chip's output on every pin and compare it to the expected response to reach a pass/fail decision.
- Sort dies: Pass dies continue to packaging. Fail dies are marked (inked) on the wafer map. Bins classify fail modes (SA fail, TF fail, leakage fail, functional fail) for yield analysis.
| ATE Platform | Vendor | Target Segment | Pin Count | Speed |
|---|---|---|---|---|
| Ultraflex / UltraFLEXplus | Teradyne | SoC, mobile, data center | Up to 1024 | Up to 3.2 GHz |
| J750 | Teradyne | Mixed-signal, MCU, consumer | Up to 1024 | 200 MHz digital |
| T2000 | Advantest | Memory, SoC | Up to 2048 | Up to 800 MHz |
| V93000 (SmarTest) | Advantest | High-speed digital, automotive | Up to 1024+ | Up to 6.4 GHz |
ATE Resources and Constraints
ATE resources directly constrain the test strategy. The key constraints are:
- Pin count (channels): The chip's scan-in, scan-out, and control pins must fit within the ATE's available channels. Large SoCs may require multiple ATE channel cards.
- Pattern memory: ATE stores test patterns in on-board memory. Uncompressed pattern sets for large SoCs can exceed ATE memory. EDT compression is essential to fit patterns within memory budgets (typically 64 MB to 512 MB per tester).
- Timing accuracy: At-speed tests require ps-level edge placement accuracy to reliably apply and measure clock edges at 100 MHz–1 GHz.
- Power supply channels: Multi-VDD chips require multiple independent supply channels with current monitoring (for IDDQ testing).
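The pattern-memory constraint above can be checked with a rough data-volume estimate. This sketch assumes one stimulus bit per input channel and one expected-response bit per output channel on every shift cycle, and ignores control pins, masking data, and tester-specific vector encoding. The helper names, the 2-in/2-out channel split, and the 64 MB budget are illustrative assumptions.

```python
def scan_data_megabytes(patterns: int, max_chain_length: int,
                        in_channels: int, out_channels: int) -> float:
    """Rough scan data volume in MB: one bit per channel per shift cycle."""
    bits = patterns * max_chain_length * (in_channels + out_channels)
    return bits / 8 / 1e6


def fits_in_memory(volume_mb: float, memory_mb: float) -> bool:
    """True when the estimated pattern volume fits the tester's memory budget."""
    return volume_mb <= memory_mb


MEMORY_BUDGET_MB = 64.0  # assumed budget for illustration

# Compressed: 2 scan-in and 2 scan-out channels (assumed split).
compressed_mb = scan_data_megabytes(3_847, 1_924, in_channels=2, out_channels=2)

# Same pattern count driven through 128 direct scan-in/scan-out pairs.
uncompressed_mb = scan_data_megabytes(3_847, 1_924, in_channels=128, out_channels=128)

for label, mb in (("compressed", compressed_mb), ("uncompressed", uncompressed_mb)):
    verdict = "fits" if fits_in_memory(mb, MEMORY_BUDGET_MB) else "exceeds budget"
    print(f"{label}: {mb:.1f} MB ({verdict})")
```

Real testers usually specify vector memory per pin rather than as a single pool, so a production check would compare vector depth against the per-pin limit of the target platform.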
Tester Formats — WGL, STIL, VCD
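ATPG tools store patterns in their own internal formats. Before the patterns can run on ATE, they must be written out and converted into a format the tester accepts. This translation step is usually called pattern conversion (the term "pattern retargeting" more often refers to mapping core-level patterns to chip level in hierarchical DFT), and it is one of the final steps in DFT sign-off.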
| Format | Standard / Source | Description | Used By |
|---|---|---|---|
| STIL | IEEE 1450 | Standard Test Interface Language — portable, ASCII, describes waveforms + vectors + scan chain topology | All ATE flows; intermediate standard |
| WGL | TSSI (de facto standard) | Waveform Generation Language: originally developed by TSSI; written by most ATPG tools, including Synopsys TetraMAX, and widely supported in Teradyne flows | Teradyne J750, UltraFLEX |
| AVC | Advantest (ASCII Vector) | Advantest-native ASCII format for T2000 and V93000 platforms | Advantest T2000, V93000 |
| VCD | IEEE 1364 | Value Change Dump — simulation dump format; sometimes used as pattern source but not a primary ATE format | Simulation, debug |
| CTL | IEEE 1450.6 | Core Test Language: a STIL extension that describes a core's test modes, ports, and patterns for reuse in hierarchical DFT | Hierarchical DFT, IP integration |
The conversion from STIL to ATE-native format is not simply a syntax translation. Timing must be re-specified in the ATE's timing resolution (typically in ps). Scan chain waveforms must be mapped to the ATE's available pin timing domains. Any ATE-unsupported features (e.g., complex multi-clock capture) must be re-implemented in the ATE's test program language. This step is validated by running the ATE patterns in an ATE simulator and comparing responses against the ATPG simulation.
Test Time on ATE
ATE time is a direct manufacturing cost. Every second a die sits under the tester probes adds to its cost of goods sold (COGS). For high-volume chips, even small amounts of wasted test time add up: at a tester rate of $300/hour, 10 ms of extra test time across 100 million dies costs roughly $83,000 a year, and the figure scales linearly with both volume and tester rate.
A simple example shows why EDT compression is economically critical. Suppose the uncompressed pattern set needs 100 ms of scan test time per die. A 64× compression ratio reduces that to 1.56 ms, a 64× reduction in ATE cost. At an ATE operating cost of $300/hour ($0.083/second), the difference between compressed and uncompressed is:
- Uncompressed: 100 ms × $0.083/s = $0.0083 per die (ATE fraction)
- Compressed: 1.56 ms × $0.083/s = $0.00013 per die
At 100 million dies per year, this is an $830,000 vs $13,000 annual ATE cost difference: an $817,000 saving from compression alone.
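The arithmetic above can be reproduced with a short script. It uses the figures from the text ($300/hour, 100 ms uncompressed, 64× compression, 100 million dies per year); `ate_cost_per_die` is an illustrative helper name. Because the script keeps the unrounded rate of $300/3600 per second instead of $0.083, its totals come out slightly higher than the rounded figures quoted above.

```python
def ate_cost_per_die(test_time_s: float, rate_per_hour: float) -> float:
    """ATE cost attributed to one die: test time multiplied by the tester rate."""
    return test_time_s * rate_per_hour / 3600.0


RATE_PER_HOUR = 300.0        # $/hour, from the text
DIES_PER_YEAR = 100_000_000  # from the text
UNCOMPRESSED_S = 0.100       # 100 ms, from the text
COMPRESSION = 64             # 64x, from the text

compressed_s = UNCOMPRESSED_S / COMPRESSION

annual_uncompressed = ate_cost_per_die(UNCOMPRESSED_S, RATE_PER_HOUR) * DIES_PER_YEAR
annual_compressed = ate_cost_per_die(compressed_s, RATE_PER_HOUR) * DIES_PER_YEAR
annual_saving = annual_uncompressed - annual_compressed

print(f"uncompressed: ${annual_uncompressed:,.0f} per year")
print(f"compressed:   ${annual_compressed:,.0f} per year")
print(f"saving:       ${annual_saving:,.0f} per year")
```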
DFT Sign-off Checklist
The following is the standard DFT sign-off checklist used at tape-out. Every item must be verified and documented before the design is released to the foundry.