
Clock Gating and ICG Cells

Clock gating is the single most impactful power reduction technique available in RTL — it eliminates dynamic power by stopping the clock to idle flip-flops. But a naive AND gate on the clock creates glitches that corrupt data. The Integrated Clock Gating (ICG) cell solves this with one elegant trick.

1. Why Gate the Clock?

In a synchronous design, every flip-flop toggles its internal nodes on every clock edge — whether it needs to capture new data or not. The clock tree and flip-flop internals account for 30–45% of total chip dynamic power in a typical SoC. Most of this switching is wasted on cycles where the register holds a stable value.

Dynamic Power Formula

P = α·C·V²·f

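Clock gating reduces the activity factor α, the fraction of cycles in which a net actually switches. Gating the clock for 80% of cycles removes roughly 80% of the clock-related dynamic power of that register and its gated clock branch. Leakage and data-path switching are not affected.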
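As a rough worked example of the formula, take a gated clock branch with purely illustrative numbers (assumptions, not measurements): C = 2 pF of switched clock capacitance, V = 0.8 V, f = 1 GHz. An ungated clock net switches every cycle, so α = 1; gating it off for 80% of cycles gives α = 0.2.

```latex
\begin{align*}
P_{\text{ungated}} &= \alpha \, C \, V^{2} f
  = 1.0 \times (2\,\text{pF}) \times (0.8\,\text{V})^{2} \times (1\,\text{GHz})
  = 1.28\,\text{mW} \\
P_{\text{gated}} &= 0.2 \times (2\,\text{pF}) \times (0.8\,\text{V})^{2} \times (1\,\text{GHz})
  = 0.256\,\text{mW} \\
\text{Saving} &= 1 - \frac{P_{\text{gated}}}{P_{\text{ungated}}} = 1 - 0.2 = 80\%
\end{align*}
```

The saving scales linearly with the fraction of cycles gated because C, V and f are unchanged; only α moves.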

Typical Savings

20–40% reduction in total chip dynamic power from clock gating alone, depending on enable activity factors. Mobile SoCs achieve 40%+ in idle modes.

What Gets Gated

When the clock is gated, both the clock tree buffers downstream of the gating cell and the internal clock nodes of the flip-flops stop switching, and both contribute to the savings. Buffers upstream of the gating cell keep toggling.

2. The Glitch Problem with Simple AND Gating

The intuitive approach — just AND the clock with an enable signal — creates a critical reliability hazard. If the enable signal changes while the clock is HIGH, the AND gate output changes immediately, creating a narrow spurious clock pulse. This glitch causes the downstream flip-flops to sample incorrect data at an unintended moment.

Never gate the clock combinatorially in RTL. The statement assign gated_clk = clk & en; is a functional time bomb. Any glitch on en during the HIGH phase creates a spurious edge that corrupts register state — a bug that is nearly impossible to catch in RTL simulation but manifests in silicon.

✕ Dangerous — glitch-prone
    // NEVER do this: combinational clock gating
    assign gated_clk = clk & en;
    // If en glitches while clk=1 →
    //   spurious rising edge on gated_clk
    //   → FF captures wrong data
✓ Correct — ICG inference
    // Let synthesis insert the ICG cell
    always_ff @(posedge clk)
      if (en) reg_out <= data_in;
    // Synthesis sees: conditional clock enable
    // → inserts ICG cell automatically
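The AND-gate hazard can be reproduced in simulation if the enable is deliberately moved in the middle of the HIGH phase, something zero-delay RTL stimulus normally never does. The testbench below is a minimal sketch under that assumption; the module name, signal names and delays are all hypothetical.

```systemverilog
`timescale 1ns/1ps
module tb_and_gate_glitch;
  logic clk = 1'b0;
  logic en  = 1'b0;
  logic gated_clk;
  int   gated_edges = 0;

  // Hazardous pattern under test: combinational clock gating
  assign gated_clk = clk & en;

  // 10 ns period: rising edges at 5, 15, 25 ns; falling edges at 10, 20 ns
  always #5 clk = ~clk;

  // Count every rising edge that the downstream flip-flops would see
  always @(posedge gated_clk) begin
    gated_edges++;
    $display("t=%0t: rising edge on gated_clk while clk=%b", $time, clk);
  end

  initial begin
    #7  en = 1'b1;   // t = 7 ns: clk is HIGH, en rises mid-phase
    #1  en = 1'b0;   // t = 8 ns: en falls again, leaving a 1 ns runt pulse
    #20 $display("rising edges seen on gated_clk: %0d", gated_edges);
    $finish;
  end
endmodule
```

With this stimulus the only rising edge on gated_clk is the 1 ns runt pulse at 7 ns, which does not line up with any real clock edge. That is exactly the kind of pulse a flip-flop may or may not capture in silicon.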

3. The ICG Cell — How It Works

An Integrated Clock Gating (ICG) cell combines a level-sensitive (transparent) latch with an AND gate. The latch is transparent when CLK is LOW, capturing the enable value. When CLK goes HIGH, the latch becomes opaque — the enable value is frozen regardless of any glitches on the enable input.

| CLK Phase | Latch State | Enable Input Changes? | ICG Output (GCLK) |
|---|---|---|---|
| CLK = 0 | Transparent: EN propagates to Q | Captured safely (no output) | 0 (CLK is LOW, no rising edge possible) |
| CLK = 0 → 1 (rising edge) | Opaque: Q frozen | Has no effect | GCLK = Q_latch (stable, glitch-free) |
| CLK = 1 | Opaque: Q frozen | Has no effect on output | GCLK = CLK (full pulse if Q=1) |
| CLK = 1 → 0 (falling edge) | Transparent again | New EN captured | GCLK = 0 (CLK went LOW) |

The result: GCLK only ever changes at CLK falling edges or CLK rising edges — never due to glitches on EN. The downstream flip-flops see only clean, full-width clock pulses or silence.
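The latch-plus-AND behaviour can be sketched as a small behavioural model. This is a simulation-only illustration with hypothetical names (icg_model, en_latched), not a replacement for the technology library's ICG cell.

```systemverilog
// Behavioural sketch of a latch-based ICG for rising-edge flip-flops.
// Assumption: active-low latch followed by an AND gate, as described in the text.
module icg_model (
  input  logic clk,
  input  logic en,
  output logic gclk
);
  logic en_latched;

  // Transparent while clk is LOW, holds its value while clk is HIGH
  always_latch begin
    if (!clk) en_latched <= en;
  end

  // en_latched cannot change while clk is HIGH, so gclk is either
  // a full-width copy of the clk pulse or stays LOW for the whole cycle
  assign gclk = clk & en_latched;
endmodule
```

In a real flow the library cell is used instead, so that the latch and AND gate are placed and characterised as one unit. Library cells also commonly add a test-enable pin, which this sketch leaves out.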

4. RTL Coding for ICG Inference

Modern synthesis tools (Synopsys DC, Cadence Genus) automatically detect clock gating opportunities and insert ICG cells when clock gating optimization is enabled. The RTL pattern that triggers ICG insertion:

    // Pattern 1: Conditional enable on always_ff
    always_ff @(posedge clk or negedge rst_n) begin
      if (!rst_n)  data_reg <= '0;
      else if (en) data_reg <= data_in;   // ICG inferred here
    end

    // Pattern 2: Wide bus with single enable
    always_ff @(posedge clk)
      if (load_en) coeff_reg[63:0] <= coeff_in;   // one ICG gates all 64 flops = big savings

    // Synopsys DC commands to enable:
    //   set_clock_gating_style -sequential_cell latch
    //   compile -gate_clock

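Timing constraint: The enable signal must meet timing at the ICG cell's latch data input. Because the latch is transparent while the clock is LOW and closes on the rising edge, the enable must settle before that rising edge (setup) and remain stable just after it (hold). STA tools report this as a clock gating setup/hold check. The ICG usually sits upstream of the flip-flops in the clock tree, so its clock edge arrives earlier than theirs, which often makes the enable path tighter than an ordinary register-to-register path. If the enable path is too slow, pipeline the enable by one cycle.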
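One way to pipeline the enable is to register it before it reaches the gated register, so the path into the ICG starts at a flip-flop output. The sketch below assumes the surrounding logic can produce the enable one cycle ahead of the data it qualifies; the module and signal names (pipelined_enable, en_early, en_q) are hypothetical.

```systemverilog
module pipelined_enable #(
  parameter int W = 8
) (
  input  logic         clk,
  input  logic         en_early,   // assumed: enable computed one cycle ahead
  input  logic [W-1:0] data_in,
  output logic [W-1:0] data_reg
);
  logic en_q;

  // Stage 1: register the slow enable; this flop absorbs the long logic path
  always_ff @(posedge clk)
    en_q <= en_early;

  // Stage 2: the inferred ICG is now driven directly by a flop output
  always_ff @(posedge clk)
    if (en_q) data_reg <= data_in;
endmodule
```

The extra register delays the enable by one cycle, so this only works if the data is delayed to match or the enable really is available early. Otherwise the register would load on the wrong cycle.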

5. Hierarchical Clock Gating

Effective clock gating is applied at multiple levels of hierarchy for maximum savings. A top-level gate can disable an entire subsystem; sub-module gates add finer granularity within active subsystems.

    // Level 1: Top-level, disable entire DSP when core_en=0
    always_ff @(posedge clk)
      if (core_en) dsp_state <= next_state;

    // Level 2: Sub-module, gate only the filter bank
    always_ff @(posedge clk)
      if (core_en && filter_en) filt_coeff <= coeff_in;

    // Level 3: Register-level, gate individual accumulators
    always_ff @(posedge clk)
      if (core_en && filter_en && acc_en) accum <= accum + product;


FAQ

What is an ICG cell, and how does it differ from a simple AND gate?
An ICG (Integrated Clock Gating) cell contains a level-sensitive latch followed by an AND gate. The latch samples the enable signal only when the clock is LOW and holds it stable when the clock is HIGH. A simple AND gate passes any glitch on the enable signal directly to the output whenever the clock is HIGH, creating spurious clock pulses that corrupt flip-flop data.

How much power does clock gating save?
Clock gating eliminates clock-related dynamic power on all clock tree branches and flip-flops that are gated OFF. In typical SoC designs, 20–40% of total dynamic power is saved through clock gating. The savings depend on the enable activity factor: an enable signal that is LOW 80% of the time saves roughly 80% of the gated network's clock power.

How do I write RTL so that synthesis infers an ICG cell?
Synthesis tools recognize the pattern 'always_ff @(posedge clk) if (enable) register <= data;' and automatically insert an ICG cell on the clock path to that register. You must enable clock gating optimization (e.g., 'compile -gate_clock' in Synopsys DC) and ensure the enable signal meets timing at the ICG latch's data input, settling before the rising clock edge that closes the latch.

What is hierarchical clock gating?
Hierarchical clock gating applies ICG cells at multiple levels of hierarchy. The top-level gate saves power on the entire clock sub-tree when a subsystem is idle; sub-module gates provide finer granularity and further reduce power within active subsystems.