Why Power Matters in DFT
Every VLSI engineer learns early that scan shift consumes far more power than normal functional operation. The gap is not marginal: scan shift on production SoCs is commonly reported to cause 2–5× the switching activity of the chip's worst-case functional workload. For a chip designed with a 1 W thermal envelope, running the scan chains at full speed can briefly draw 2–5 W. The consequences are severe:
- IR drop: The surge in dynamic current causes the supply voltage to sag at distant corners of the die. If IR drop exceeds 10% of VDD, flip-flops near the affected region may miscapture, turning good logic into a test fail — even though the silicon has no defect.
- Electromigration (EM): Unusually high current densities in power-grid metal lines during scan can exceed EM limits, degrading the chip's long-term reliability even if it does not fail at test time.
- Thermal stress: Concentrated switching in a specific region heats that region faster than the package can dissipate. Scan patterns that repeatedly toggle the same logic cluster can produce localized hot spots.
- Test-induced yield loss (TIYL): The most insidious effect. A chip with no real defects fails the test purely because scan-induced stress (IR drop, thermal) temporarily pushes it outside its operating window. TIYL reduces the apparent yield without any wafer-processing problem.
Design flows at 7 nm and below commonly treat power-aware DFT as a sign-off requirement rather than an optional refinement. DFT engineers must demonstrate that switching activity during scan stays within the power grid's safe operating window before the design is taped out.
Scan shift is essentially random data streaming through every flip-flop on every clock edge. Without deliberate constraints, the toggle rate approaches 50% — the statistical average for uncorrelated binary data. Functional logic rarely reaches even 20% toggle rate because data values are correlated and many paths are idle.
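The 50% figure can be checked with a small simulation. The sketch below is only a statistical illustration: the function names and the "repeat the previous bit unless a flip occurs" model of correlated functional data are assumptions made here, not a model of any particular design.

```python
import random

def toggle_rate(bits):
    """Fraction of adjacent bit pairs that differ (one toggle per differing pair)."""
    if len(bits) < 2:
        return 0.0
    toggles = sum(a != b for a, b in zip(bits, bits[1:]))
    return toggles / (len(bits) - 1)

def random_stream(n, rng):
    """Uncorrelated bits, standing in for unconstrained scan shift data."""
    return [rng.randint(0, 1) for _ in range(n)]

def correlated_stream(n, flip_prob, rng):
    """Assumed model of correlated data: each bit repeats the previous one
    unless a flip occurs with probability flip_prob."""
    bits = [rng.randint(0, 1)]
    for _ in range(n - 1):
        flip = rng.random() < flip_prob
        bits.append(bits[-1] ^ int(flip))
    return bits

rng = random.Random(0)
scan_rate = toggle_rate(random_stream(100_000, rng))
func_rate = toggle_rate(correlated_stream(100_000, 0.15, rng))
print(f"uncorrelated: {scan_rate:.3f}, correlated: {func_rate:.3f}")
```

The uncorrelated stream settles near one half, while the correlated stream tracks whatever flip probability is assumed for it.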
X-States and X-Masking
In digital simulation, logic values are not just 0 or 1. They can also be X — an unknown or don't-care value. X-states arise from multiple sources in real designs:
- Uninitialized memories: SRAMs and register files have indeterminate content at power-on.
- Tri-state buses: Bus outputs floating when no driver is enabled.
- Power-gated domains: Flip-flops inside a shut-off power domain output X after the domain is powered down.
- Clock domain crossings (CDC): Metastability in synchronizers can produce X at simulation time.
- Functional don't-cares: Some outputs are genuinely not important in certain operating modes.
X-states are toxic to DFT in two distinct ways. First, in LBIST with a MISR (Multiple Input Signature Register): the MISR accumulates a compressed signature of all scan outputs over many capture cycles. If a single X propagates into the MISR, it corrupts the signature, and the chip shows a mismatching signature even if no real fault is present: a false fail. Second, in response compaction: EDT and other spatial compactors XOR several scan chain outputs into each output channel. If any chain feeding a channel carries an X in a given shift cycle, the XOR result is also X, and fault effects arriving from the other chains in that cycle can no longer be observed.
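To see why a single X is enough to ruin a signature, here is a minimal three-valued MISR model. The 4-bit width, the tap position, and the response data are illustrative assumptions, and the model is pessimistic in that an X never cancels.

```python
X = None  # unknown value, as in 3-valued logic simulation

def xor3(a, b):
    """XOR over {0, 1, X}: any unknown operand makes the result unknown."""
    return X if a is X or b is X else a ^ b

def misr_step(state, inputs, taps):
    """One MISR clock: shift with LFSR feedback, then XOR in the parallel inputs."""
    feedback = state[-1]
    nxt = []
    for i, data in enumerate(inputs):
        bit = feedback if i == 0 else state[i - 1]
        if i in taps:
            bit = xor3(bit, feedback)
        nxt.append(xor3(bit, data))
    return nxt

def signature(responses, width=4, taps=(1,)):
    """Compress a list of width-bit response words into one signature."""
    state = [0] * width
    for word in responses:
        state = misr_step(state, word, taps)
    return state

good = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0],
        [0, 0, 1, 0], [1, 0, 0, 1], [0, 1, 0, 1]]
with_x = [row[:] for row in good]
with_x[1][2] = X  # one unknown bit in one capture cycle

print(signature(good))
print(signature(with_x))
```

Once the unknown enters the register it circulates through the feedback path, so the final signature contains X positions no matter how many clean cycles follow.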
What X-Masking Does
X-masking selectively blocks scan chain outputs that are known to contain X values before they enter the signature accumulator or compactor. The mask acts as a qualifier: a '0' mask bit silences that scan output (the value is ignored), while a '1' mask bit passes the output through for comparison.
X-Masking Implementation
There are two primary implementation approaches, each with different tradeoffs in hardware cost versus ATE memory.
Hardware X-Masking
A dedicated AND-mask register is added — one flip-flop per scan output. During the mask-shift phase, the ATE loads the mask values into this register. Each scan output is then ANDed with its mask bit before reaching the compactor or MISR. A '0' mask bit completely suppresses that chain output. Hardware X-masking is the most robust approach because it is pattern-independent — the mask register is loaded once per test segment, not once per pattern.
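The AND-mask behavior can be sketched in a few lines. The names `and_mask` and `apply_mask` and the example data are hypothetical; the point is only that a '0' mask bit yields a known 0 even when the chain output is X.

```python
X = None  # unknown value

def and_mask(bit, mask_bit):
    """AND gate over {0, 1, X}: a 0 mask bit forces a known 0 at the output."""
    if mask_bit == 0:
        return 0
    return bit  # mask_bit == 1 passes 0, 1, or X unchanged

def apply_mask(scan_outputs, mask):
    """Mask one shift cycle's worth of scan chain outputs."""
    return [and_mask(b, m) for b, m in zip(scan_outputs, mask)]

# Three unload cycles for four chains; chain 2 is fed by an X source.
cycles = [[1, 0, X, 1], [0, 1, X, 0], [1, 1, 0, 1]]
mask = [1, 1, 0, 1]  # loaded once for the whole test segment

masked = [apply_mask(c, mask) for c in cycles]
print(masked)
```

The cost is visible in the last cycle: chain 2 delivered a valid 0 there, but a static mask cannot distinguish it, so everything that chain observes during the segment is given up.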
Software X-Masking (ATPG-based)
The ATPG tool's fault simulation engine identifies, for each test pattern, which scan output bits will contain X values when the pattern is applied to the design. The mask bits are encoded as part of the test pattern itself and are included in the ATE stimulus file. The ATE applies the mask bits to the mask register just before unloading each scan chain.
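A rough sketch of per-pattern mask derivation follows, assuming the simulated unload data is already available as nested lists. The data layout and the function name are invented for illustration, and masking is done per chain per pattern, matching the granularity used in the comparison table.

```python
X = None  # unknown value

def per_pattern_masks(simulated_unloads):
    """simulated_unloads[p][c] is the list of bits chain c unloads for pattern p.
    Returns masks[p][c]: 1 = observe the chain, 0 = mask it for that pattern."""
    masks = []
    for chains in simulated_unloads:
        masks.append([0 if any(b is X for b in bits) else 1 for bits in chains])
    return masks

unloads = [
    [[0, 1, 1], [1, X, 0], [0, 0, 1]],  # pattern 0: chain 1 captures an X
    [[1, 1, 0], [0, 1, 0], [1, 0, 0]],  # pattern 1: no X anywhere
    [[X, 0, 1], [1, 1, 1], [0, X, X]],  # pattern 2: chains 0 and 2 capture X
]
masks = per_pattern_masks(unloads)
mask_bits_on_ate = sum(len(m) for m in masks)
print(masks, mask_bits_on_ate)
```

Unlike a static mask, pattern 1 keeps every chain observable. The price is that the mask data grows with patterns times chains, which is the ATE memory impact noted in the table.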
| Method | How It Works | Hardware Cost | ATE Memory Impact | Best For |
|---|---|---|---|---|
| Hardware X-masking | AND-mask register per scan output; loaded by ATE before test | 1 FF per scan output | Low (mask is static per segment) | LBIST, EDT compaction |
| Software X-masking | ATPG encodes mask bits into pattern; ATE shifts them per pattern | None | High (mask bits per pattern per chain) | Stuck-at ATPG with few X sources |
| X-tolerance (EDT) | EDT compactor tolerates a bounded fraction of X values in the scan outputs without corrupting the compacted response | EDT hardware already present | None extra | EDT-compressed designs |
Power-Aware ATPG
Standard ATPG has one goal: detect as many faults as possible with as few patterns as possible. It has no awareness of how much power those patterns consume when shifted through the scan chains. Power-aware ATPG adds a second objective: ensure that the switching activity during every shift clock stays below a specified toggle rate threshold.
Toggle Rate Constraint
A typical constraint is: "No more than 15% of all scan cells may transition (0→1 or 1→0) on any single shift clock." The ATPG tool tracks, for each tentative don't-care fill assignment, the resulting toggle count increment. When the cumulative toggle count for a shift clock would exceed the threshold, the tool backtracks and selects a different fill value — one that doesn't cause an additional transition.
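The effect of fill choice on toggles can be illustrated with a toy test cube. This sketch uses adjacent fill (copying the nearest specified bit) as one common low-power fill heuristic, and counts transitions between neighboring bits of the scan load as a simple proxy for shift toggles. It is not the per-shift-clock bookkeeping a real ATPG tool performs, and the cube is made up.

```python
import random

def transitions(bits):
    """Number of adjacent bit pairs that differ in the scan load."""
    return sum(a != b for a, b in zip(bits, bits[1:]))

def adjacent_fill(cube):
    """Fill each don't-care ('X') with the nearest specified bit to its left;
    leading don't-cares take the first specified bit."""
    first = next((c for c in cube if c != "X"), "0")
    filled, last = [], first
    for c in cube:
        if c != "X":
            last = c
        filled.append(int(last))
    return filled

def random_fill(cube, rng):
    """Fill don't-cares with random bits, as unconstrained ATPG typically does."""
    return [rng.randint(0, 1) if c == "X" else int(c) for c in cube]

cube = "1XXXX0XXXXXX1XXX0XXXXXXX1XXXXXXX"  # 5 care bits out of 32
low = adjacent_fill(cube)
rnd = random_fill(cube, random.Random(1))

limit = 0.15
low_rate = transitions(low) / (len(cube) - 1)
rnd_rate = transitions(rnd) / (len(cube) - 1)
print(f"adjacent fill: {low_rate:.3f}, random fill: {rnd_rate:.3f}, limit: {limit}")
```

Adjacent fill leaves only the transitions that the care bits themselves force, so it is the floor for this cube; any other fill can only match or exceed it.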
Low-Capture-Power Constraints
The capture cycle — the single functional clock applied after loading the test pattern — is often the highest-power moment in a scan test. All the loaded pattern values propagate through the combinational logic simultaneously. Power-aware ATPG can apply low-capture-power constraints that limit the number of simultaneously switching primary inputs and internal flip-flop outputs during capture. This reduces the instantaneous current spike at the capture edge.
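One simple way to quantify capture activity is to compare the state loaded by scan with the state captured afterwards. The helper names and the 20% limit below are illustrative assumptions, and only flip-flop transitions are counted, not combinational switching.

```python
def capture_toggle_rate(loaded, captured):
    """Fraction of flip-flops whose value changes at the capture edge."""
    if len(loaded) != len(captured):
        raise ValueError("state vectors must have equal length")
    flips = sum(a != b for a, b in zip(loaded, captured))
    return flips / len(loaded)

def within_capture_budget(loaded, captured, limit=0.20):
    """True when the capture toggle rate does not exceed the limit."""
    return capture_toggle_rate(loaded, captured) <= limit

loaded = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
quiet = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]   # 2 of 10 flip-flops change
noisy = [0, 1, 0, 1, 0, 1, 1, 1, 1, 1]   # 5 of 10 flip-flops change

print(within_capture_budget(loaded, quiet), within_capture_budget(loaded, noisy))
```

A pattern like `noisy` would be rejected or regenerated with different fill values under such a constraint.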
    ## TetraMAX power-aware ATPG constraints
    ## NOTE: command and option names are reproduced as given and vary by
    ## tool and release; confirm them in the tool's command reference.

    ## Limit shift switching activity to 15% of chain length
    set_dft_signal -type ScanClock -view existing_dft \
        -port clk -timing {45 55}

    ## Low power shift constraint
    set_atpg_constraint -shift_power_limit 15    ; # 15% max toggle rate
    set_atpg_constraint -capture_power_limit 20  ; # 20% max capture toggle

    ## Tessent equivalent
    set_pattern_filtering -low_power_shift on
    set_pattern_filtering -shift_power_factor 0.15
    set_pattern_filtering -capture_power_factor 0.20

    ## Run ATPG with power constraint
    run_atpg -effort high
The tradeoff: applying power constraints increases pattern count. When the tool cannot use its preferred don't-care fill (because it would cause too many toggles), it may need a separate pattern to cover faults that could have been combined. Industry experience shows a 15–20% power limit increases pattern count by 10–40% compared to unconstrained ATPG. This is an acceptable tradeoff given the alternative is TIYL.
Toggle Rate Analysis
Before applying power-aware ATPG, DFT engineers run toggle rate analysis to characterize the baseline switching activity of the scan chains under the nominal (unconstrained) pattern set. This analysis reports:
- Per-pattern average toggle rate across all chains
- Per-scan-cell toggle count (identifies "hot" cells that transition frequently)
- Maximum single-shift-clock toggle count (the worst-case IR drop moment)
- Total energy estimate per pattern (useful for thermal budgeting)
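The first three report items can be reproduced on a toy chain by simulating the shift itself. This is a minimal sketch for a single chain that starts at all zeros; the function name and report keys are invented here, and the energy estimate is omitted because it needs per-cell capacitance data.

```python
def shift_toggle_report(chain_len, patterns):
    """Shift each pattern into one scan chain (initially all zeros) and
    collect toggle statistics. Each pattern is a bit list of length chain_len."""
    state = [0] * chain_len
    per_cell = [0] * chain_len
    per_pattern_rate = []
    worst_clock = 0
    for pattern in patterns:
        pattern_toggles = 0
        # The bit destined for the last cell has to enter the chain first.
        for bit in reversed(pattern):
            new_state = [bit] + state[:-1]
            changed = [i for i in range(chain_len) if state[i] != new_state[i]]
            for i in changed:
                per_cell[i] += 1
            worst_clock = max(worst_clock, len(changed))
            pattern_toggles += len(changed)
            state = new_state
        # Normalize by cells x shift clocks to get an average toggle rate.
        per_pattern_rate.append(pattern_toggles / (chain_len * len(pattern)))
    return {
        "per_pattern_rate": per_pattern_rate,
        "per_cell_toggles": per_cell,
        "worst_clock_toggles": worst_clock,
    }

report = shift_toggle_report(4, [[1, 0, 1, 0], [1, 1, 1, 1]])
print(report)
```

Even the all-ones pattern costs toggles here, because it has to displace the alternating pattern already sitting in the chain; the previous pattern's contents matter as much as the new one's.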
| Design Node | Typical Unconstrained Toggle Rate | Industry Target | Impact if Exceeded |
|---|---|---|---|
| 28 nm and above | 35–45% | <25% | IR drop, minor TIYL risk |
| 16/14 nm FinFET | 40–50% | <20% | Significant IR drop, EM risk |
| 7/5 nm | 45–55% | <15% | High TIYL risk, thermal excursion |
| 3 nm and below | 50%+ | <10–12% | Mandatory constraint; test without it fails sign-off |
Scan Segmentation for Power Domains
Multi-VDD SoC designs partition the chip into several power domains: regions that can be powered up or down independently of one another via power switches (PSW). This is the foundational technique for reducing leakage in idle blocks. The challenge for DFT: a scan chain that crosses a power domain boundary is physically broken when the downstream domain is off.
The Segmentation Solution
The DFT tool (Tessent Shell, Synopsys DFT Compiler) automatically segments scan chains per power domain when given a UPF/CPF power intent file. Each segment is a self-contained mini scan chain confined to one domain. Each segment is accessed through the chip's scan-in/scan-out pins, which the ATE drives, rather than through on-chip daisy-chaining across domain boundaries.
When Domain B is powered off, the DFT controller deasserts its scan enable (SE_B) and bypasses or skips that segment. Only Domain A's scan segment is shifted. Test coverage for Domain B is achieved in a separate power-on test mode where Domain B is brought up.
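The bookkeeping behind segmentation is straightforward. The sketch below groups scan cells by power domain and selects the segments that may be shifted in a given power state; the cell names, domain labels, and function names are hypothetical.

```python
from collections import defaultdict

def segment_by_domain(cells):
    """cells: list of (cell_name, power_domain) in original chain order.
    Returns {domain: [cell names]}, one scan segment per domain."""
    segments = defaultdict(list)
    for name, domain in cells:
        segments[domain].append(name)
    return dict(segments)

def active_segments(segments, powered_on):
    """Segments that may be shifted; powered-off domains are skipped."""
    return {d: cells for d, cells in segments.items() if d in powered_on}

cells = [("ff_a0", "A"), ("ff_b0", "B"), ("ff_a1", "A"), ("ff_b1", "B")]
segments = segment_by_domain(cells)

print(active_segments(segments, powered_on={"A"}))
print(active_segments(segments, powered_on={"A", "B"}))
```

The second call corresponds to the separate test mode in which Domain B is brought up so its segment can be shifted and its logic covered.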
UPF-Aware DFT
The Unified Power Format (UPF, IEEE 1801) is the industry-standard way to express power intent — power domains, supply nets, power switches, isolation cells, level shifters, and retention registers. DFT tools that read UPF files are called UPF-aware, and they adjust their scan insertion and ATPG behavior accordingly.
Isolation Cells
At the boundary of a power-gated domain, isolation cells clamp outputs to a safe value (0 or 1) when the domain is off. During ATPG, the tool must model isolation cell behavior: when Domain B is off, its outputs to Domain A are clamped to the isolation value (not X). This allows ATPG to generate correct patterns for logic in Domain A that receives signals from Domain B through isolation cells.
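The difference between a clamped value and an X can be shown with a three-valued AND gate sitting in Domain A. This is a behavioral sketch with invented names, not how an ATPG tool represents isolation cells internally.

```python
X = None  # unknown value

def isolation_cell(domain_output, domain_on, clamp_value):
    """Pass the signal while the source domain is on; otherwise drive the clamp."""
    return domain_output if domain_on else clamp_value

def and3(a, b):
    """AND over {0, 1, X}: a known 0 dominates an unknown input."""
    if a == 0 or b == 0:
        return 0
    if a is X or b is X:
        return X
    return 1

from_domain_b = X  # Domain B is powered off, so its raw output is unknown

unisolated = and3(1, from_domain_b)
clamped_high = and3(1, isolation_cell(from_domain_b, domain_on=False, clamp_value=1))
clamped_low = and3(1, isolation_cell(from_domain_b, domain_on=False, clamp_value=0))
print(unisolated, clamped_high, clamped_low)
```

With the clamp modeled, the gate output in Domain A is a known value, so patterns that depend on it remain valid while Domain B is off.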
Level Shifters
Signals crossing between VDD1 and VDD2 supply domains pass through level shifters that translate voltage levels. Level shifters are modeled as transparent buffers (functionally) during ATPG, but their internal structure must be tested too — typically via ATPG patterns that set their inputs to 0 and 1 and verify propagation through the shifted output.
Retention Registers
Retention registers are flip-flops with a shadow latch (balloon latch) that preserves data through power-down events. When the domain powers down, a SAVE signal transfers the FF's state to the shadow latch. When the domain powers back up, a RESTORE signal copies it back. For DFT:
- Retention FFs must be included in the scan chain (they need to be tested like any other FF).
- The SAVE/RESTORE control must be inactive during scan shift — otherwise, asserting SAVE would transfer scan data to the shadow latch, corrupting both.
- ATPG generates specific patterns to test the retention path itself: SAVE, power-down, RESTORE, and verify the retained value.
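The retention test sequence in the last bullet can be expressed as a small behavioral model. The class and method names are hypothetical, and the model ignores timing and control-signal polarity.

```python
X = None  # unknown value

class RetentionFlop:
    """Behavioral model: main flip-flop plus an always-on shadow latch."""

    def __init__(self):
        self.q = X          # main flip-flop, loses state on power-down
        self.shadow = X     # shadow (balloon) latch, keeps state
        self.powered = True

    def load(self, value):
        if self.powered:
            self.q = value

    def save(self):
        if self.powered:
            self.shadow = self.q

    def power_down(self):
        self.powered = False
        self.q = X

    def power_up(self):
        self.powered = True  # q stays unknown until RESTORE

    def restore(self):
        if self.powered:
            self.q = self.shadow

def retention_test(value, do_save=True):
    """SAVE, power-down, power-up, RESTORE, then check the retained value."""
    ff = RetentionFlop()
    ff.load(value)
    if do_save:
        ff.save()
    ff.power_down()
    ff.power_up()
    ff.restore()
    return ff.q == value

print(retention_test(0), retention_test(1), retention_test(1, do_save=False))
```

Running the sequence for both 0 and 1 matters: a shadow latch stuck at one value would pass the test for that value only.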
Clock Gating in DFT
Clock gating is the primary dynamic power reduction technique in synchronous designs. An ICG (Integrated Clock Gate) cell — a latch-based gate that suppresses the clock when the enable is deasserted — sits between the clock distribution network and the flip-flops in a given cluster. For test, clock gating creates a serious problem: if a clock gate is closed during scan shift, the FFs it feeds will not receive any shift clock pulses. Those FFs become invisible to the scan chain — they cannot be loaded or unloaded.
Test Mode Override
The solution is simple in concept but must be implemented carefully: during test mode, the scan enable (SE) signal is connected into the clock gate's enable path so that SE=1 forces the clock gate open. The standard approach is to OR the ICG enable with scan_enable:
    // Integrated Clock Gate (ICG) with test override
    module icg_cell (
        input  EN,    // functional enable
        input  SE,    // scan enable (test mode)
        input  CLK,   // clock in
        output GCLK   // gated clock out
    );
        reg en_latch;

        // Latch: transparent when CLK=0 (level-sensitive hold)
        always @(*) begin
            if (~CLK)
                en_latch = EN | SE;  // SE forces gate open in test mode
        end

        assign GCLK = en_latch & CLK;
    endmodule

    // In the design hierarchy, ICG is instantiated as:
    //   icg_cell u_icg (.EN(func_en), .SE(scan_enable), .CLK(clk), .GCLK(gated_clk));
    // During scan shift: scan_enable=1 → GCLK follows CLK → all FFs receive shift clocks
    // During functional: scan_enable=0 → GCLK gated by func_en as normal
A missing clock-gate test override is a common DFT bug. The symptom is a chain of FFs that appear stuck: scan-out shows a constant value regardless of what is shifted into scan_in. The DRC (Design Rule Check) step in scan insertion tools catches this and flags every ICG that lacks an SE override as an untestable clock gate violation.
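The stuck-chain symptom is easy to reproduce in a cycle-level model. The sketch below assumes a single ICG feeding the whole chain and uses invented parameter names; it abstracts away the latch and models only whether each shift clock gets through.

```python
def shift_chain(chain, scan_in_bits, func_en, scan_enable, has_se_override):
    """Shift bits into a chain whose clock comes from one ICG.
    The chain advances only on clocks that the gate lets through."""
    state = list(chain)
    for bit in scan_in_bits:
        gate_open = bool(func_en) or (bool(scan_enable) and has_se_override)
        if gate_open:
            state = [bit] + state[:-1]
    return state

initial = [0, 0, 0, 0]
stimulus = [1, 0, 1, 1]

with_override = shift_chain(initial, stimulus, func_en=0, scan_enable=1,
                            has_se_override=True)
without_override = shift_chain(initial, stimulus, func_en=0, scan_enable=1,
                               has_se_override=False)
print(with_override, without_override)
```

Without the override the chain keeps its initial contents, which is what a tester sees as a constant value on scan-out.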
EDT Low-Power Features
Tessent's EDT (Embedded Deterministic Test) architecture, from Siemens EDA (formerly Mentor Graphics), includes built-in low-power features beyond what standard ATPG provides. Understanding these features is important for DFT engineers using EDT-compressed scan in low-power SoCs.
Correlated Pattern Generation
EDT's decompressor generates multiple scan chain inputs from a small number of ATE channels through a linear decompression network. The decompressor introduces correlations between what different chains receive. Low-power EDT exploits this: by choosing decompressor seeds that produce correlated bit streams, adjacent scan cells (which are often in the same clock domain) receive similar values. Since similar adjacent values mean fewer transitions, the correlated seeds naturally reduce switching activity without sacrificing fault coverage.
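To make the idea of linear expansion concrete, here is a deliberately simplified combinational XOR expander. It is not EDT's actual decompressor, which is sequential, and the tap assignments are invented for illustration. It shows only how a few channel bits fan out to more chains and why the chain values end up correlated.

```python
def decompress(channel_bits, xor_taps):
    """channel_bits: bits from the ATE channels for one shift cycle.
    xor_taps[c]: indices of the channels XORed together to drive chain c."""
    chain_bits = []
    for taps in xor_taps:
        value = 0
        for t in taps:
            value ^= channel_bits[t]
        chain_bits.append(value)
    return chain_bits

# Three channels expanded to six chains (hypothetical wiring).
xor_taps = [(0,), (1,), (2,), (0, 1), (1, 2), (0, 1, 2)]

a = [1, 0, 1]
b = [0, 1, 1]
out_a = decompress(a, xor_taps)
out_b = decompress(b, xor_taps)
out_sum = decompress([x ^ y for x, y in zip(a, b)], xor_taps)
print(out_a, out_b, out_sum)
```

Chain 3 always equals chain 0 XOR chain 1, so the six chains can never be loaded with fully independent values. Linearity (the output for `a XOR b` equals the XOR of the two outputs) is what lets ATPG solve for channel bits that satisfy a given set of care bits.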
X-Tolerance in EDT
Because X values appear in captured responses, EDT handles them on the output side, in the compactor, rather than in the decompressor. The compactor is X-tolerant up to a design-time parameter called the X-tolerance level. If the design has, say, 5% of scan bits containing X values, and the EDT X-tolerance is set to 8%, the compactor can accommodate all X sources without requiring explicit per-pattern X-masking. This simplifies the test flow significantly.
| EDT Feature | What It Does | Power Benefit | Coverage Impact |
|---|---|---|---|
| Correlated seeds | Adjacent chains get similar values from decompressor | 15–30% toggle rate reduction | Slight increase in pattern count |
| X-tolerance | Compactor absorbs X values; no extra masking needed | Avoids mask register power overhead | None (coverage maintained) |
| Low-power shift (LPSHIFT) | Insert shift-pauses; shift slowly in high-power zones | Thermal peak reduction | None (coverage same, test time longer) |
| Segment gating | Power-gate inactive EDT segments during shift | 30–50% power reduction for the gated segment | None |
The combination of power-aware ATPG constraints, EDT correlated decompression, and scan segmentation can reduce test-mode power consumption to within 1.2–1.5× of functional power, a dramatic improvement over the 2–5× of unconstrained scan.