DFT Course · Day 09 of 12

DFT for Low Power
X-Masking · Power-Aware ATPG · Scan Segmentation

By EcrioniX · Updated June 2026 · ~50 min read

Why Power Matters in DFT

Every VLSI engineer learns early that scan shift consumes far more power than normal functional operation. The gap is not marginal: measurements on production SoCs consistently show that scan shift causes 2–5× the switching activity of the chip's worst-case functional workload. For a chip designed with a 1 W thermal envelope, running the scan chains at full speed can briefly draw 3–5 W. The consequences are severe: excessive IR drop on the power grid, electromigration (EM) stress, thermal excursions, and good dies failing under test conditions they would never see in the field.

Modern SoC design rules at 7 nm and below mandate power-aware DFT as a sign-off requirement — not an optional refinement. DFT engineers must demonstrate that switching activity during scan stays within the power grid's safe operating window before the design is taped out.

Key Insight

Scan shift is essentially random data streaming through every flip-flop on every clock edge. Without deliberate constraints, the toggle rate approaches 50% — the statistical average for uncorrelated binary data. Functional logic rarely reaches even 20% toggle rate because data values are correlated and many paths are idle.

Switching Activity: Functional vs Scan Shift
[Figure: bar chart of toggle rate (%). Functional: 10%. Scan (unconstrained): 50%. Scan (30% cap): 30%. LP-ATPG mode: 15%. A danger region is marked on the chart.]

X-States and X-Masking

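In digital simulation, logic values are not just 0 or 1. They can also be X, an unknown or don't-care value. X-states arise from multiple sources in real designs, including uninitialized memories, power-gated domains, tri-state buses, and clock-domain crossing elements.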

X-states are toxic to DFT in two distinct ways. First, in LBIST with a MISR (Multiple Input Signature Register): the MISR accumulates a compressed signature of all scan outputs over many capture cycles. If a single X propagates into the MISR, it corrupts the signature. The chip will show a mismatching signature even if no real faults are present, which is a false fail. Second, in response compaction: EDT and other compactors combine many scan chain outputs into a few tester channels. If an X appears on a scan output that the compactor is combining, the combined observation is unknown and any fault effects merged with it are lost.

Rule of Thumb: Never allow an X source to propagate to a scan output (scan_out) or a compactor input without explicitly masking or blocking it. Unmasked X values guarantee false failures and wasted ATE time.

What X-Masking Does

X-masking selectively blocks scan chain outputs that are known to contain X values before they enter the signature accumulator or compactor. The mask acts as a qualifier: a '0' mask bit silences that scan output (the value is ignored), while a '1' mask bit passes the output through for comparison.

X-Masking Architecture
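[Figure: Scan Out[0] and Scan Out[1] (carrying X) each pass through an AND gate driven by Mask Reg [1,0]. Scan Out[0] is passed as valid to the MISR, Scan Out[1] is masked to 0, and the MISR produces a clean signature.]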
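The masking behavior described above can be sketched in a few lines of Python. This is a toy three-valued model, not a description of any real MISR: the two-bit register width, the feedback tap, and the names `and_mask`, `xor3`, `misr_step`, and `run_misr` are all assumptions made for illustration.

```python
X = "X"  # unknown value in three-valued simulation


def and_mask(value, mask_bit):
    """AND a scan output with its mask bit; a 0 mask forces a known 0."""
    if mask_bit == 0:
        return 0      # output suppressed, even if it carried an X
    return value      # passed through unchanged (may still be X)


def xor3(a, b):
    """Three-valued XOR: any X input makes the result X."""
    if a == X or b == X:
        return X
    return a ^ b


def misr_step(state, inputs):
    """One clock of a small MISR: rotate with feedback, then XOR in inputs.

    The tap position is an arbitrary choice for this sketch, not a
    recommended polynomial.
    """
    feedback = state[-1]
    shifted = [feedback] + state[:-1]
    shifted[1] = xor3(shifted[1], feedback)  # hypothetical feedback tap
    return [xor3(s, i) for s, i in zip(shifted, inputs)]


def run_misr(responses, mask):
    """Compress per-cycle scan outputs into a signature, applying the mask."""
    state = [0] * len(mask)
    for outputs in responses:
        masked = [and_mask(v, m) for v, m in zip(outputs, mask)]
        state = misr_step(state, masked)
    return state


# Two scan outputs over three unload cycles; Scan Out[1] carries X twice.
responses = [[1, X], [0, 1], [1, X]]
unmasked_signature = run_misr(responses, mask=[1, 1])  # X spreads to every bit
masked_signature = run_misr(responses, mask=[1, 0])    # fully known signature
```

In this sketch a single X on one output turns the whole unmasked signature unknown within two clocks, while the masked run stays fully known. Note the cost: masking Scan Out[1] for the whole run also discards its one valid bit in the second cycle.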

X-Masking Implementation

There are two primary implementation approaches, each with different tradeoffs in hardware cost versus ATE memory.

Hardware X-Masking

A dedicated AND-mask register is added — one flip-flop per scan output. During the mask-shift phase, the ATE loads the mask values into this register. Each scan output is then ANDed with its mask bit before reaching the compactor or MISR. A '0' mask bit completely suppresses that chain output. Hardware X-masking is the most robust approach because it is pattern-independent — the mask register is loaded once per test segment, not once per pattern.

Software X-Masking (ATPG-based)

The ATPG tool's fault simulation engine identifies, for each test pattern, which scan output bits will contain X values when the pattern is applied to the design. The mask bits are encoded as part of the test pattern itself and are included in the ATE stimulus file. The ATE applies the mask bits to the mask register just before unloading each scan chain.

| Method | How It Works | Hardware Cost | ATE Memory Impact | Best For |
|---|---|---|---|---|
| Hardware X-masking | AND-mask register per scan output; loaded by ATE before test | 1 FF per scan output | Low (mask is static per segment) | LBIST, EDT compaction |
| Software X-masking | ATPG encodes mask bits into pattern; ATE shifts them per pattern | None | High (mask bits per pattern per chain) | Stuck-at ATPG with few X sources |
| X-tolerance (EDT) | EDT compactor tolerates a bounded number of X values without corrupting the observed response | EDT hardware already present | None extra | EDT-compressed designs |
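The ATE memory difference between the two masking styles can be made concrete with a small sketch. It assumes the granularity given in the table (one mask bit per chain, either per pattern or per segment); the function names and the two-pattern example data are hypothetical.

```python
X = "X"  # unknown value reported by fault simulation


def per_pattern_masks(simulated_responses):
    """Software-style masks: for each pattern, one bit per chain.

    A chain gets mask bit 0 if its unload for that pattern contains any X.
    """
    return [[0 if X in chain_bits else 1 for chain_bits in pattern]
            for pattern in simulated_responses]


def static_mask(simulated_responses):
    """Hardware-style mask: a chain is masked if it shows X in any pattern."""
    masks = per_pattern_masks(simulated_responses)
    return [min(bits) for bits in zip(*masks)]


def mask_storage_bits(num_patterns, num_chains, per_pattern):
    """Mask bits the tester must store for one test segment."""
    return num_patterns * num_chains if per_pattern else num_chains


# Two patterns, two chains, three cells per chain (illustrative data only).
responses = [
    [[0, 1, 1], [1, X, 0]],   # pattern 0: chain 1 unloads an X
    [[1, 1, 0], [0, 0, 1]],   # pattern 1: both chains clean
]
software = per_pattern_masks(responses)   # [[1, 0], [1, 1]]
hardware = static_mask(responses)         # [1, 0]
```

The per-pattern masks keep chain 1 observable in pattern 1, at the price of storing a mask for every pattern. The static mask needs only one bit per chain but gives up every observation on chain 1.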

Power-Aware ATPG

Standard ATPG has one goal: detect as many faults as possible with as few patterns as possible. It has no awareness of how much power those patterns consume when shifted through the scan chains. Power-aware ATPG adds a second objective: ensure that the switching activity during every shift clock stays below a specified toggle rate threshold.

Toggle Rate Constraint

A typical constraint is: "No more than 15% of all scan cells may transition (0→1 or 1→0) on any single shift clock." The ATPG tool tracks, for each tentative don't-care fill assignment, the resulting toggle count increment. When the cumulative toggle count for a shift clock would exceed the threshold, the tool backtracks and selects a different fill value — one that doesn't cause an additional transition.
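One simple way to see why don't-care fill controls shift power is adjacent fill, where each unspecified bit repeats the nearest preceding care bit. The sketch below is a simplified stand-in for the threshold-driven fill described above, not any tool's actual algorithm; the function names are assumptions.

```python
import random


def adjacent_fill(pattern):
    """Fill don't-care bits ('X') by repeating the preceding care bit.

    Leading don't-cares copy the first care bit, or 0 if there is none.
    """
    care = [b for b in pattern if b != "X"]
    last = care[0] if care else 0
    filled = []
    for b in pattern:
        if b != "X":
            last = b
        filled.append(last)
    return filled


def random_fill(pattern, rng):
    """Fill don't-care bits with random values, as unconstrained ATPG might."""
    return [rng.randint(0, 1) if b == "X" else b for b in pattern]


def transitions(bits):
    """Count adjacent-bit differences in a scan load.

    Each difference becomes a toggle that travels down the chain as the
    pattern is shifted in.
    """
    return sum(a != b for a, b in zip(bits, bits[1:]))


# Three care bits (1, 0, 1) and seven don't-cares.
pattern = [1, "X", "X", 0, "X", "X", "X", 1, "X", "X"]
low_power = adjacent_fill(pattern)   # [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]
baseline = random_fill(pattern, random.Random(0))
```

The care bits alone force two transitions in this pattern, and adjacent fill adds none. Random fill can never do better and usually does worse. The cost is that the don't-care bits are no longer free for detecting additional faults, which is where the pattern count increase comes from.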

Low-Capture-Power Constraints

The capture cycle — the single functional clock applied after loading the test pattern — is often the highest-power moment in a scan test. All the loaded pattern values propagate through the combinational logic simultaneously. Power-aware ATPG can apply low-capture-power constraints that limit the number of simultaneously switching primary inputs and internal flip-flop outputs during capture. This reduces the instantaneous current spike at the capture edge.

ATPG Constraint — Power-Aware Settings (Synopsys TetraMAX / Tessent)
## Illustrative power-aware ATPG settings. Command and option names differ
## between tools and releases; check them against the tool reference manual.

## Scan clock definition
set_dft_signal -type ScanClock -view existing_dft \
    -port clk -timing {45 55}

## TetraMAX-style low-power constraints
set_atpg_constraint -shift_power_limit 15   ; # max 15% of scan cells toggling per shift clock
set_atpg_constraint -capture_power_limit 20  ; # max 20% toggling at capture

## Tessent-style equivalent (a real script uses one tool's commands, not both)
set_pattern_filtering -low_power_shift on
set_pattern_filtering -shift_power_factor 0.15
set_pattern_filtering -capture_power_factor 0.20

## Run ATPG with the power constraints applied
run_atpg -effort high

The tradeoff: applying power constraints increases pattern count. When the tool cannot use its preferred don't-care fill (because it would cause too many toggles), it may need a separate pattern to cover faults that could have been combined. Industry experience shows a 15–20% power limit increases pattern count by 10–40% compared to unconstrained ATPG. This is an acceptable tradeoff given that the alternative is test-induced yield loss (TIYL): good dies failing on the tester because of IR drop or heating that only occurs during scan.

Toggle Rate Analysis

Before applying power-aware ATPG, DFT engineers run toggle rate analysis to characterize the baseline switching activity of the scan chains under the nominal (unconstrained) pattern set. This analysis reports the per-clock and average toggle rates, defined as:

Toggle Rate (per shift clock) = (Transitions on that clock) ÷ (Total scan cells)
Average Toggle Rate = (Sum of all transitions across all shift clocks) ÷ (Total cells × Total shift clocks)

Target: Average Toggle Rate < 20%; Peak Toggle Rate (any single clock) < 30%

| Design Node | Typical Unconstrained Toggle Rate | Industry Target | Impact if Exceeded |
|---|---|---|---|
| 28 nm and above | 35–45% | <25% | IR drop, minor TIYL risk |
| 16/14 nm FinFET | 40–50% | <20% | Significant IR drop, EM risk |
| 7/5 nm | 45–55% | <15% | High TIYL risk, thermal excursion |
| 3 nm and below | 50%+ | <10–12% | Mandatory constraint; test without it fails sign-off |
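The per-clock and average toggle rate definitions above can be computed directly from a chain's contents after each shift clock. The sketch below models a single four-cell chain; the function names and the example data are assumptions, and a real analysis would run over every chain and pattern in the set.

```python
def shift_states(initial, scan_in_bits):
    """Simulate one scan chain; index 0 is the cell nearest scan-in.

    Returns the chain contents before shifting and after each shift clock.
    """
    state = list(initial)
    states = [state]
    for bit in scan_in_bits:
        state = [bit] + state[:-1]   # every cell takes its neighbor's value
        states.append(state)
    return states


def per_clock_toggle_rates(states):
    """Transitions on each shift clock divided by total scan cells."""
    n = len(states[0])
    return [sum(a != b for a, b in zip(prev, cur)) / n
            for prev, cur in zip(states, states[1:])]


def average_toggle_rate(states):
    """Sum of all transitions divided by (cells x shift clocks)."""
    rates = per_clock_toggle_rates(states)
    return sum(rates) / len(rates)


# Worst case: alternating data shifted into a chain that starts at all zeros.
alternating = shift_states([0, 0, 0, 0], [1, 0, 1, 0])
# Best case for a load of ones: a single edge travels down the chain.
constant = shift_states([0, 0, 0, 0], [1, 1, 1, 1])

peak_alternating = max(per_clock_toggle_rates(alternating))   # 1.0
avg_alternating = average_toggle_rate(alternating)            # 0.625
avg_constant = average_toggle_rate(constant)                  # 0.25
```

The alternating load peaks at 100% on the last clock, far above the 30% peak target, while the constant load never exceeds 25% on any clock. That gap is what the power-aware fill is exploiting.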

Scan Segmentation for Power Domains

Multi-VDD SoC designs partition the chip into several power domains: regions that can be powered up or down independently via power switches (PSW). This is the foundational technique for reducing leakage in idle blocks. The challenge for DFT: a scan chain that crosses a power domain boundary is physically broken when the downstream domain is off.

Critical Rule: Scan chains must never cross a power domain boundary without explicit isolation and segmentation logic. An unmanaged cross-domain chain will produce all-X scan responses when the downstream domain is power-gated, causing 100% false-fail on every test.

The Segmentation Solution

The DFT tool (Tessent Shell, Synopsys DFT Compiler) automatically segments scan chains per power domain when given a UPF/CPF power intent file. Each segment is a self-contained mini scan chain confined to one domain. The segments connect through the ATE's scan-in/scan-out pins — not through on-chip daisy-chaining across domain boundaries.
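The crossing rule and the per-domain segmentation can be sketched as a simple check over a chain's cell order. This is only an illustration of the idea, not how Tessent Shell or DFT Compiler implement it; the cell names, domain names, and function names are hypothetical.

```python
def find_domain_crossings(chain, cell_domain):
    """Return consecutive cell pairs that sit in different power domains."""
    return [(a, b) for a, b in zip(chain, chain[1:])
            if cell_domain[a] != cell_domain[b]]


def segment_by_domain(cells, cell_domain):
    """Group cells into one segment per domain, preserving their order."""
    segments = {}
    for cell in cells:
        segments.setdefault(cell_domain[cell], []).append(cell)
    return segments


def active_segments(segments, powered_on):
    """Segments that may be shifted given the set of powered domains."""
    return {d: cells for d, cells in segments.items() if d in powered_on}


# Hypothetical power intent: which domain each scan cell belongs to.
cell_domain = {"ff0": "PD_A", "ff1": "PD_A", "ff2": "PD_B",
               "ff3": "PD_B", "ff4": "PD_A"}

# An unsegmented chain stitched in placement order crosses the boundary twice.
chain = ["ff0", "ff1", "ff2", "ff3", "ff4"]
violations = find_domain_crossings(chain, cell_domain)

# After segmentation, each segment stays inside one domain.
segments = segment_by_domain(chain, cell_domain)
shiftable = active_segments(segments, powered_on={"PD_A"})
```

With PD_B powered off, only the PD_A segment remains shiftable, and it contains no cell whose scan input depends on an unpowered flip-flop.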

Scan Segmentation — Two Power Domains
[Figure: Power Domain A (always-on) holds a four-flip-flop scan segment with SE_A gating (AND). Power Domain B (power-gated) holds a two-flip-flop segment that is bypassed while the domain is off. A domain boundary separates the two.]

When Domain B is powered off, the DFT controller deasserts its scan enable (SE_B) and bypasses or skips that segment. Only Domain A's scan segment is shifted. Test coverage for Domain B is achieved in a separate power-on test mode where Domain B is brought up.

UPF-Aware DFT

The Unified Power Format (UPF, IEEE 1801) is the industry-standard way to express power intent — power domains, supply nets, power switches, isolation cells, level shifters, and retention registers. DFT tools that read UPF files are called UPF-aware, and they adjust their scan insertion and ATPG behavior accordingly.

Isolation Cells

At the boundary of a power-gated domain, isolation cells clamp outputs to a safe value (0 or 1) when the domain is off. During ATPG, the tool must model isolation cell behavior: when Domain B is off, its outputs to Domain A are clamped to the isolation value (not X). This allows ATPG to generate correct patterns for logic in Domain A that receives signals from Domain B through isolation cells.

Level Shifters

Signals crossing between VDD1 and VDD2 supply domains pass through level shifters that translate voltage levels. Level shifters are modeled as transparent buffers (functionally) during ATPG, but their internal structure must be tested too — typically via ATPG patterns that set their inputs to 0 and 1 and verify propagation through the shifted output.

Retention Registers

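Retention registers are flip-flops with a shadow latch (balloon latch) that preserves data through power-down events. When the domain powers down, a SAVE signal transfers the FF's state to the shadow latch. When the domain powers back up, a RESTORE signal copies it back. For DFT, three points matter: retention registers must be included in the scan chain like any other flip-flop; the SAVE and RESTORE controls must be held inactive during scan shift and capture, since asserting either would corrupt the scan data; and ATPG must model the shadow latch so that patterns can exercise the save, power-off, restore sequence.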
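A small behavioral model makes the save and restore sequence, and the hazard of asserting RESTORE during shift, easier to follow. The class below is a sketch under simplifying assumptions: SAVE and RESTORE are modeled as method calls rather than level or edge-sensitive pins, and the class and method names are hypothetical.

```python
X = "X"  # unknown state


class RetentionFF:
    """Flip-flop with an always-on shadow latch (behavioral sketch)."""

    def __init__(self):
        self.q = X            # main flip-flop, lost at power-down
        self.shadow = X       # shadow latch, survives power-down
        self.powered = True

    def load(self, d):
        """Clock a value into the main flip-flop (functional or scan shift)."""
        if self.powered:
            self.q = d

    def save(self):
        """SAVE: copy the main flip-flop into the shadow latch."""
        if self.powered:
            self.shadow = self.q

    def restore(self):
        """RESTORE: copy the shadow latch back into the main flip-flop."""
        if self.powered:
            self.q = self.shadow

    def power_down(self):
        self.powered = False
        self.q = X            # main state is lost; shadow is retained

    def power_up(self):
        self.powered = True   # q stays unknown until restored or reloaded


# Retention test sequence: save, power off, power on, restore, verify.
ff = RetentionFF()
ff.load(1)
ff.save()
ff.power_down()
state_while_off = ff.q        # unknown
ff.power_up()
ff.restore()
retained = ff.q               # 1, recovered from the shadow latch

# Hazard: RESTORE asserted during scan shift overwrites the shifted bit.
ff.load(0)                    # scan shift loads a 0
ff.restore()                  # stale shadow value replaces it
corrupted = ff.q              # 1, not the 0 that was shifted in
```

The second half of the example is why the test-mode logic has to hold the retention controls inactive: the shifted 0 is silently replaced by the stale shadow value.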

Clock Gating in DFT

Clock gating is the primary dynamic power reduction technique in synchronous designs. An ICG (Integrated Clock Gate) cell — a latch-based gate that suppresses the clock when the enable is deasserted — sits between the clock distribution network and the flip-flops in a given cluster. For test, clock gating creates a serious problem: if a clock gate is closed during scan shift, the FFs it feeds will not receive any shift clock pulses. Those FFs become invisible to the scan chain — they cannot be loaded or unloaded.

Test Mode Override

The solution is simple in concept but must be implemented carefully: during test mode, the scan enable (SE) signal is connected into the clock gate's enable path so that SE=1 forces the clock gate open. The standard approach is to OR the ICG enable with scan_enable:

Clock Gate Test Override — Verilog
// Integrated Clock Gate (ICG) with test override
module icg_cell (
  input  EN,          // functional enable
  input  SE,          // scan enable (test mode)
  input  CLK,         // clock in
  output GCLK         // gated clock out
);
  reg en_latch;
  // Latch: transparent when CLK=0 (level-sensitive hold)
  always @(*) begin
    if (~CLK) en_latch = EN | SE;  // SE forces gate open in test mode
  end
  assign GCLK = en_latch & CLK;
endmodule

// In the design hierarchy, ICG is instantiated as:
// icg_cell u_icg (.EN(func_en), .SE(scan_enable), .CLK(clk), .GCLK(gated_clk));
// During scan shift: scan_enable=1 → GCLK follows CLK → all FFs receive shift clocks
// During functional: scan_enable=0 → GCLK gated by func_en as normal

Missing clock gate test override is a common DFT bug. The symptom is a chain of FFs that appear stuck — they shift in a constant value regardless of scan_in. The DRC (Design Rule Check) step in scan insertion tools catches this and flags every ICG that lacks SE override as an untestable clock gate violation.
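The effect of the override, and of leaving it out, can be checked with a sample-by-sample Python model of the latch-based gate. This mirrors the Verilog above under one added assumption: the latch starts at 0 rather than unknown. The function names are hypothetical.

```python
def simulate_icg(clk_wave, en_wave, se_wave):
    """Latch-based clock gate with scan-enable override.

    The latch is transparent while CLK is low and holds while CLK is high,
    so the gated clock cannot glitch when the enable changes mid-pulse.
    """
    en_latch = 0  # assumption: latch starts closed rather than unknown
    gclk = []
    for clk, en, se in zip(clk_wave, en_wave, se_wave):
        if clk == 0:
            en_latch = en | se      # SE forces the gate open
        gclk.append(en_latch & clk)
    return gclk


def count_pulses(wave):
    """Count rising edges, treating the signal as low before the first sample."""
    previous = [0] + wave[:-1]
    return sum(1 for a, b in zip(previous, wave) if a == 0 and b == 1)


clk = [0, 1] * 4                    # four clock pulses
off = [0] * len(clk)
on = [1] * len(clk)

# Functional enable low, scan enable high: all four shift clocks get through.
with_override = count_pulses(simulate_icg(clk, off, on))

# Same gate with SE tied low (the missing-override bug): no shift clocks.
without_override = count_pulses(simulate_icg(clk, off, off))
```

With the functional enable low, the gate passes all four pulses only when SE is high. With SE tied low the downstream flip-flops never clock, which matches the stuck-chain symptom described above.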

EDT Low-Power Features

Mentor Tessent's EDT (Embedded Deterministic Test) architecture includes built-in low-power features beyond what standard ATPG provides. Understanding these features is important for DFT engineers using EDT-compressed scan in low-power SoCs.

Correlated Pattern Generation

EDT's decompressor generates multiple scan chain inputs from a small number of ATE channels through a linear decompression network. The decompressor introduces correlations between what different chains receive. Low-power EDT exploits this: by choosing decompressor seeds that produce correlated bit streams, adjacent scan cells (which are often in the same clock domain) receive similar values. Since similar adjacent values mean fewer transitions, the correlated seeds naturally reduce switching activity without sacrificing fault coverage.

X-Tolerance in EDT

EDT's output compactor is X-tolerant up to a design-time parameter called the X-tolerance level. X values arise in the captured responses, so they are handled on the compactor side rather than in the decompressor. If the design has, say, 5% of scan bits containing X values, and the EDT X-tolerance is set to 8%, the compactor can accommodate all X sources without requiring explicit per-pattern X-masking. This simplifies the test flow significantly.

| EDT Feature | What It Does | Power Benefit | Coverage Impact |
|---|---|---|---|
| Correlated seeds | Adjacent chains get similar values from decompressor | 15–30% toggle rate reduction | Slight increase in pattern count |
| X-tolerance | Compactor absorbs X values; no extra masking needed | Avoids mask register power overhead | None (coverage maintained) |
| Low-power shift (LPSHIFT) | Insert shift-pauses; shift slowly in high-power zones | Thermal peak reduction | None (coverage same, test time longer) |
| Segment gating | Power-gate inactive EDT segments during shift | 30–50% power for gated segment | None |

The combination of power-aware ATPG constraints, EDT correlated decompression, and scan segmentation can reduce test-mode power consumption to within 1.2–1.5× of functional power — a dramatic improvement over unconstrained scan's 3–5× overhead.

Interview FAQ — DFT for Low Power

Why does scan shift cause higher switching activity than functional operation?
During functional operation, flip-flop transitions are driven by the actual workload. Many FFs hold their value between cycles (data correlations, idle paths), so the average toggle rate is 5–15%. During scan shift, the scan chain is essentially a long shift register carrying pseudo-random test pattern data. Because ATPG patterns are designed to exercise all fault sites — not to minimize transitions — adjacent bits in the shifted data are uncorrelated. Statistically, uncorrelated binary data causes a 50% toggle rate per shift clock (each bit has a 50% chance of being different from its predecessor). Without constraints, scan shift reaches 40–50% toggle rate, which is 2–5× higher than functional. This drives proportionally higher dynamic power, causing IR drop and thermal stress.
What is X-masking and when is it needed?
X-masking selectively blocks scan chain outputs that contain X (unknown) values before those outputs reach a compactor or MISR. X-states come from uninitialized memories, power-gated domains, tri-state buses, or clock-domain crossing elements. If X values reach an MISR in LBIST, they corrupt the signature and cause false fails. If X values reach an EDT compactor, they invalidate combined patterns. X-masking is needed whenever X sources propagate to scan output pins and cannot be eliminated by other means (test-mode initialization, power sequencing, or X-bounding). Hardware X-masking uses an AND-mask register; software X-masking has the ATPG tool encode mask bits into each pattern.
How does power-aware ATPG limit switching during scan?
Power-aware ATPG adds a toggle rate constraint to the ATPG engine's don't-care fill algorithm. When filling unspecified (don't-care) scan bit positions, the tool tracks how many transitions the fill would cause on each shift clock. If a fill assignment would push the toggle count above the threshold (e.g., 15% of chain length), the tool selects a different fill value — one that does not add a transition. This is possible because don't-care bits can be assigned 0 or 1 without affecting fault detection; the tool simply assigns whichever value produces fewer transitions. Power-aware fill increases pattern count by 10–40% because some faults that could have been combined in one pattern (using free don't-care bits) now require separate patterns with constrained fills.
What is scan segmentation and why is it used in multi-power-domain designs?
Scan segmentation divides the full scan chain into independent sub-chains (segments), each confined to a single power domain. In multi-VDD / UPF designs, a power domain can be fully shut off during test (power switch open, domain supply at 0 V). If a scan chain crosses into a shut-off domain, the chain is physically broken — shift data is lost and the output is all-X. Scan segmentation solves this by never connecting chains across domain boundaries on-chip. Each segment has its own scan-in, scan-out, and scan-enable path. The DFT controller (or ATE) activates each segment only when its power domain is on. Isolation cells at domain boundaries prevent powered domains from being corrupted by floating outputs from off domains.
How do retention registers affect DFT?
Retention registers are flip-flops with a shadow latch that preserves state through power-off events. For DFT, they must be included in the scan chain — they are testable FFs like any other. The key constraint is that the SAVE and RESTORE control signals must be held inactive during scan shift and capture; asserting SAVE during shift would transfer the shifting data into the shadow latch, and asserting RESTORE would overwrite the shifted data — both corrupt the scan test. The DFT tool must tie these controls appropriately in test mode. ATPG also needs to model the retention register's internal structure to generate patterns that test the shadow path: specifically, patterns that SAVE a value, simulate power-off, RESTORE, and then verify the retained value propagated correctly through the retention latch.