DAY 10 · ADVANCED STA

Clock Tree Analysis — Skew, Jitter, Latency, CPPR

By EcrioniX · Updated June 2026

The clock network is the backbone of every synchronous design. Every flip-flop’s timing requirement is measured relative to when its clock edge actually arrives — and that arrival time is determined by the clock tree. Skew, jitter, and latency aren’t abstractions: they directly eat into your setup and hold margins. Understanding how STA models the clock tree is essential for reading any real timing report.

1. Clock latency components

The total delay from the clock oscillator to a flip-flop’s clock pin is the sum of two latency types:

Source latency

The delay from the off-chip clock source (PLL output, oscillator) to the clock input port of the chip. This is modelled in STA as a constant delay applied at the clock definition point. It represents board trace delay plus any I/O buffer delay between the oscillator and the chip’s pad.

Network latency (insertion delay)

The delay from the clock input port through the on-chip clock tree (buffers, clock gates, spine drivers) to each flip-flop’s clock pin. This is the dominant component in most designs. Post-CTS, the actual network latency is computed by propagating clock edges through the physical clock tree, using the cell delays and the extracted parasitics (SPEF).

Clock latency modelling in STA
    ## Pre-CTS: use ideal clock with estimated total latency

    ## Source latency (board + I/O buffer delay = 0.2 ns)
    set_clock_latency -source 0.2 [get_clocks clk_core]

    ## Network latency estimate (total CTS insertion delay ~0.6 ns)
    set_clock_latency 0.6 [get_clocks clk_core]

    ## Clock uncertainty (pre-CTS: includes estimated skew + jitter)
    set_clock_uncertainty -setup 0.25 [get_clocks clk_core]
    set_clock_uncertainty -hold 0.05 [get_clocks clk_core]

    ## Post-CTS: use propagated clock mode (actual delays from SPEF)
    ## This replaces the estimated network latency with real values
    set_propagated_clock [get_clocks clk_core]

    ## Post-CTS: update uncertainty to jitter-only (skew is now in propagated delays)
    set_clock_uncertainty -setup 0.08 [get_clocks clk_core]
    set_clock_uncertainty -hold 0.05 [get_clocks clk_core]
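The arithmetic behind the two modes can be sketched in a few lines of Python. This is only an illustration: the source and network latency values come from the constraints above, while the per-stage tree delays and the function names are hypothetical and stand in for what the STA tool would compute from the real clock tree.

```python
# Values from the pre-CTS constraints above, in ns.
SOURCE_LATENCY = 0.2            # set_clock_latency -source
NETWORK_LATENCY_ESTIMATE = 0.6  # set_clock_latency (pre-CTS estimate)

# Hypothetical post-CTS stage delays for one sink (not from the article).
TREE_STAGE_DELAYS = [0.21, 0.19, 0.17]


def ideal_arrival(edge_time, source_latency, network_latency):
    """Clock arrival at a CK pin when the clock is ideal (pre-CTS)."""
    return edge_time + source_latency + network_latency


def propagated_arrival(edge_time, source_latency, stage_delays):
    """Clock arrival when network latency comes from the real tree (post-CTS)."""
    return edge_time + source_latency + sum(stage_delays)


pre_cts = ideal_arrival(0.0, SOURCE_LATENCY, NETWORK_LATENCY_ESTIMATE)
post_cts = propagated_arrival(0.0, SOURCE_LATENCY, TREE_STAGE_DELAYS)
print(f"pre-CTS arrival:  {pre_cts:.3f} ns")
print(f"post-CTS arrival: {post_cts:.3f} ns")
```

In both cases the source latency is still added; only the network portion switches from a single estimate to the sum of real stage delays.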

2. Clock skew

Clock skew is the spatial variation in clock arrival time across different flip-flops in the same clock domain. Two flip-flops driven by the same clock may see the edge arrive at different times because their clock paths differ in routing length and buffer delay.

[Figure: Clock skew effect on setup timing. A clock source drives FF_A and FF_B; compared with the ideal case, the two clock edges arrive with skew = 0.02 ns. Positive skew (A before B): FF_A launches data early, which gives more setup margin on the A→B path and reduces hold margin on the same path.]

| Skew type | Definition | Effect on setup | Effect on hold |
|---|---|---|---|
| Positive (useful) skew | Launch FF clock arrives before capture FF clock | Increases setup margin (data has more time to travel) | Decreases hold margin (data must arrive later) |
| Negative (harmful) skew | Capture FF clock arrives before launch FF clock | Decreases setup margin (effective period is shorter) | Increases hold margin |
| Local skew | Skew between two physically adjacent flip-flops | Critical for timing closure between near cells | Primary cause of hold violations post-CTS |
| Global skew | Maximum skew across the entire clock domain | Used as CTS quality metric; typical target < 50–100 ps | Any global skew is a worst-case bound |
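The setup and hold effects of skew follow directly from the slack equations. The sketch below uses the standard single-cycle setup and hold checks with clock uncertainty left out for simplicity; the 0.02 ns skew matches the figure, and every other number (period, clock-to-Q, data delays, setup and hold times) is a hypothetical value chosen for illustration.

```python
def setup_slack(period, launch_clk, capture_clk, clk_to_q, data_max, t_setup):
    """Setup slack: data must arrive before the next capture edge minus setup time."""
    required = period + capture_clk - t_setup
    arrival = launch_clk + clk_to_q + data_max
    return required - arrival


def hold_slack(launch_clk, capture_clk, clk_to_q, data_min, t_hold):
    """Hold slack: data must not change until the same capture edge plus hold time."""
    arrival = launch_clk + clk_to_q + data_min
    required = capture_clk + t_hold
    return arrival - required


# Hypothetical path parameters, in ns.
PERIOD, CLK_TO_Q, DATA_MAX, DATA_MIN = 2.0, 0.10, 1.60, 0.08
T_SETUP, T_HOLD = 0.05, 0.03
LAUNCH_CLK = 0.50
SKEW = 0.02  # positive skew: capture clock arrives 0.02 ns after launch clock

base_setup = setup_slack(PERIOD, LAUNCH_CLK, LAUNCH_CLK, CLK_TO_Q, DATA_MAX, T_SETUP)
base_hold = hold_slack(LAUNCH_CLK, LAUNCH_CLK, CLK_TO_Q, DATA_MIN, T_HOLD)
skewed_setup = setup_slack(PERIOD, LAUNCH_CLK, LAUNCH_CLK + SKEW, CLK_TO_Q, DATA_MAX, T_SETUP)
skewed_hold = hold_slack(LAUNCH_CLK, LAUNCH_CLK + SKEW, CLK_TO_Q, DATA_MIN, T_HOLD)

print(f"setup slack: {base_setup:.3f} -> {skewed_setup:.3f} ns")
print(f"hold slack:  {base_hold:.3f} -> {skewed_hold:.3f} ns")
```

With these numbers, positive skew moves setup slack up by the skew amount and hold slack down by the same amount, which is the trade-off summarised in the first table row.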

3. Clock jitter types

Jitter is temporal uncertainty in clock edge arrival. Unlike skew (spatial), jitter varies from cycle to cycle. STA models jitter conservatively as a worst-case window:

Period jitter

The deviation of any single clock cycle from the ideal period. If the nominal period is 1.000 ns and the period jitter is ±15 ps, any given cycle may be as short as 0.985 ns or as long as 1.015 ns. STA uses the maximum period jitter as part of clock uncertainty. Setup checks must assume the shortest cycle, so period jitter always reduces setup margin.

Cycle-to-cycle jitter

The change in period between two consecutive cycles. Important for PLL-based designs where the VCO frequency may fluctuate. Cycle-to-cycle jitter determines how much the clock period varies over two adjacent cycles — relevant for flip-flops that might capture data one cycle early or late.

Long-term jitter (accumulated)

The maximum deviation of any clock edge from its ideal position over a long measurement window. Less relevant for digital STA but important for PLL characterisation and spread-spectrum clocking analysis.
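The three definitions can be compared on one set of measurements. This is a minimal sketch assuming a short list of measured clock periods; the sample values and function names are hypothetical, and real jitter characterisation uses far longer captures.

```python
from itertools import accumulate


def period_jitter(periods, nominal):
    """Largest deviation of any single period from the nominal period."""
    return max(abs(p - nominal) for p in periods)


def cycle_to_cycle_jitter(periods):
    """Largest change in period between two consecutive cycles."""
    return max(abs(b - a) for a, b in zip(periods, periods[1:]))


def long_term_jitter(periods, nominal):
    """Largest deviation of any edge from its ideal position over the window."""
    edges = accumulate(periods)  # actual time of edge 1, 2, 3, ...
    return max(abs(t - n * nominal) for n, t in enumerate(edges, start=1))


# Hypothetical measured periods in ns, nominal period 1.000 ns.
NOMINAL = 1.000
PERIODS = [1.000, 1.015, 0.985, 1.005, 0.995]

pj = period_jitter(PERIODS, NOMINAL)
c2c = cycle_to_cycle_jitter(PERIODS)
ltj = long_term_jitter(PERIODS, NOMINAL)
print(f"period jitter:         {pj:.3f} ns")
print(f"cycle-to-cycle jitter: {c2c:.3f} ns")
print(f"long-term jitter:      {ltj:.3f} ns")
```

In this sample the cycle-to-cycle figure is twice the period jitter because a long cycle is immediately followed by a short one.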

Jitter modelling in STA
    ## Clock uncertainty = jitter + remaining skew margin

    ## Pre-CTS: uncertainty includes both jitter and estimated skew
    set_clock_uncertainty -setup 0.20 [get_clocks clk_core]
    ## Breakdown: 0.10 (estimated skew) + 0.08 (period jitter) + 0.02 (margin)

    ## Post-CTS: skew is in the propagated delays; only jitter remains
    set_clock_uncertainty -setup 0.10 [get_clocks clk_core]
    ## Breakdown: 0.08 (period jitter) + 0.02 (process/margin)

    ## For inter-clock paths (two PLLs from different sources, larger jitter)
    set_clock_uncertainty -setup 0.25 \
        -from [get_clocks clk_pll_a] \
        -to [get_clocks clk_pll_b]

    ## For paths on the same PLL (correlated jitter, smaller uncertainty)
    set_clock_uncertainty -setup 0.08 \
        -from [get_clocks clk_core] \
        -to [get_clocks clk_div2] ;# same PLL, correlated jitter
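The breakdowns in the comments above are simple sums, which makes them easy to script. The helper below is a hypothetical sketch (the function name and the linear-sum budget are assumptions based on the breakdown shown) that adds the components and formats the matching `set_clock_uncertainty` line.

```python
def setup_uncertainty_cmd(clock, jitter, margin, estimated_skew=0.0):
    """Sum the uncertainty components and format the SDC constraint line.

    Pass estimated_skew only pre-CTS; post-CTS the skew is already in the
    propagated clock delays, so it is left at zero.
    """
    total = estimated_skew + jitter + margin
    return f"set_clock_uncertainty -setup {total:.2f} [get_clocks {clock}]"


# Component values from the breakdown comments above, in ns.
pre_cts_cmd = setup_uncertainty_cmd("clk_core", jitter=0.08, margin=0.02,
                                    estimated_skew=0.10)
post_cts_cmd = setup_uncertainty_cmd("clk_core", jitter=0.08, margin=0.02)

print(pre_cts_cmd)
print(post_cts_cmd)
```

Keeping the components separate makes it clear which part of the budget disappears after CTS.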

4. CPPR with a worked example

Let’s trace CPPR through a specific example. Two flip-flops (FF_L = launch, FF_C = capture) share the first two clock tree buffers (BUF_1 and BUF_2), then diverge (BUF_L feeds FF_L, BUF_C feeds FF_C).

CPPR worked example — numbers
    ## Clock tree structure:
    ## CLK_PAD → BUF_1 → BUF_2 → BUF_L → FF_L/CK (launch)
    ##                 → BUF_2 → BUF_C → FF_C/CK (capture)
    ## Shared path: CLK_PAD → BUF_1 → BUF_2

    ## Nominal delays (ns):
    ## BUF_1: 0.120 ns, BUF_2: 0.180 ns (shared)
    ## BUF_L: 0.090 ns (launch-only)
    ## BUF_C: 0.085 ns (capture-only)
    ## OCV derating (5%): late = ×1.05, early = ×0.95

    ## ── Without CPPR (raw OCV, setup analysis) ──
    ## Launch clock: all cells derated LATE
    ## BUF_1: 0.120×1.05 = 0.126
    ## BUF_2: 0.180×1.05 = 0.189
    ## BUF_L: 0.090×1.05 = 0.095
    ## Total launch clock: 0.410 ns

    ## Capture clock: all cells derated EARLY (including the shared ones)
    ## BUF_1: 0.120×0.95 = 0.114
    ## BUF_2: 0.180×0.95 = 0.171
    ## BUF_C: 0.085×0.95 = 0.081
    ## Total capture clock: 0.366 ns

    ## Skew used in STA: capture - launch = 0.366 - 0.410 = -0.044 ns
    ## (Negative skew = pessimistic)

    ## ── With CPPR ──
    ## CPPR correction = pessimism on shared segments
    ## BUF_1 shared pessimism: |0.126 - 0.114| = 0.012 ns
    ## BUF_2 shared pessimism: |0.189 - 0.171| = 0.018 ns
    ## Total CPPR correction: 0.030 ns

    ## CPPR-adjusted skew: -0.044 + 0.030 = -0.014 ns (less pessimistic)
    ## CPPR slack recovery: 0.030 ns (30 ps recovered!)

    ## In a timing report, this appears as:
    ## clock reconvergence pessimism removal: +0.030 ns
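The same worked example can be reproduced with a short script. This is a sketch of the arithmetic only, using the buffer delays and 5% derates from the example; the variable names are illustrative, and a real STA tool derives the common path from the netlist rather than from hand-written dictionaries. Because the script does not round intermediate values, its skew figures differ from the hand-rounded ones in the last digit.

```python
LATE, EARLY = 1.05, 0.95  # 5% OCV derates from the example

# Nominal buffer delays in ns, split by where they sit in the clock tree.
shared = {"BUF_1": 0.120, "BUF_2": 0.180}
launch_only = {"BUF_L": 0.090}
capture_only = {"BUF_C": 0.085}

# Setup analysis without CPPR: the whole launch path is derated late and
# the whole capture path is derated early, including the shared buffers.
launch_clock = sum(d * LATE for d in {**shared, **launch_only}.values())
capture_clock = sum(d * EARLY for d in {**shared, **capture_only}.values())
raw_skew = capture_clock - launch_clock

# CPPR credit: the late/early difference on the shared buffers only.
cppr_credit = sum(d * LATE - d * EARLY for d in shared.values())
adjusted_skew = raw_skew + cppr_credit

print(f"launch clock:  {launch_clock:.4f} ns")
print(f"capture clock: {capture_clock:.4f} ns")
print(f"raw skew:      {raw_skew:.4f} ns")
print(f"CPPR credit:   {cppr_credit:.4f} ns")
print(f"adjusted skew: {adjusted_skew:.4f} ns")
```

The CPPR credit depends only on the shared buffers, so lengthening the common path or widening the derates increases the slack that CPPR recovers.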

5. CTS goals and metrics

Clock Tree Synthesis (CTS) optimises the clock network to achieve specific targets. These targets directly affect STA results:

| CTS metric | Typical target (28 nm) | Effect if violated |
|---|---|---|
| Global skew | < 80 ps (0.08 ns) | Setup margin reduced by skew amount; timing violations across chip |
| Local skew | < 30 ps (0.03 ns) | Hold violations between adjacent flip-flops |
| Insertion delay | 0.3–0.8 ns (depends on die size) | Less impact on timing directly; affects power via clock switching |
| Max transition on CK pins | < 0.15 ns | High slew on clock inputs degrades flip-flop setup/hold window |
| Clock power | 20–40% of total power budget | Thermal issues if clock buffers are over-sized |

6. report_clock_tree

Analysing the clock tree in PrimeTime
    ## Full clock tree report for one clock
    report_clock_tree -clock clk_core -setup

    ## Report shows:
    ## Clock Name: clk_core
    ## Period: 2.000 ns
    ## Waveform: {0 1}
    ## Sources: 1 (CLK_PAD)
    ## Sinks: 12,847 (flip-flop CK pins)
    ## Min Insertion Delay: 0.421 ns (fastest FF)
    ## Max Insertion Delay: 0.623 ns (slowest FF)
    ## Global Skew: 0.202 ns <- (bad: target is <0.08 ns)
    ## Local Skew: 0.055 ns <- (ok)

    ## Report per-cell clock arrival statistics
    report_clock_timing -type skew -nworst 20 ;# find worst skew pairs
    report_clock_timing -type latency -clock clk_core \
        -show_latency_components

    ## Check which nets drive FF clock pins
    report_net -connections [get_nets -hierarchical -filter {is_clock_net==true}]

    ## Check max transition on clock nets
    report_constraint -max_transition -clock_path -verbose
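The Global Skew line in the report is the difference between the maximum and minimum insertion delay. A minimal sketch of that check follows, assuming the per-sink latencies have already been extracted into a dictionary; the sink names and the middle value are hypothetical, while the minimum, maximum, and 0.08 ns target come from the report above.

```python
GLOBAL_SKEW_TARGET = 0.08  # ns, target quoted in the report above

# Hypothetical sink names; min and max latencies match the report (ns).
sink_latency = {
    "u_a/ff_reg/CK": 0.421,  # fastest flip-flop
    "u_b/ff_reg/CK": 0.540,
    "u_c/ff_reg/CK": 0.623,  # slowest flip-flop
}


def global_skew(latencies):
    """Global skew = max insertion delay - min insertion delay."""
    values = latencies.values()
    return max(values) - min(values)


skew = global_skew(sink_latency)
meets_target = skew < GLOBAL_SKEW_TARGET
print(f"global skew: {skew:.3f} ns, meets target: {meets_target}")
```

Here the 0.202 ns result is well above the target, which is why the report flags it.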

7. Pre-CTS vs Post-CTS STA comparison

| Aspect | Pre-CTS STA | Post-CTS STA |
|---|---|---|
| Clock delays | Ideal (estimated via set_clock_latency) | Propagated (actual tree delays from SPEF) |
| Clock uncertainty | Large (includes estimated skew, 0.15–0.3 ns) | Small (jitter only, 0.05–0.15 ns) |
| Hold violations | Rare (ideal clocks don’t show local skew) | Common (real skew appears; hold closure needed) |
| CPPR | Not applied (ideal clock has no common path) | Essential (removes shared-buffer pessimism) |
| Accuracy | Approximate; used for synthesis guidance | Sign-off quality with correct parasitics |

Always run post-CTS STA with propagated clocks

Pre-CTS ideal clocks do not reflect real clock tree delays. Post-CTS STA with set_propagated_clock is required for valid timing closure because: (1) actual insertion delays may differ from estimates by 10–30%; (2) local skew causing hold violations only appears with real tree delays; (3) CPPR correction is only meaningful with a real common clock path structure.

Day 10 Key Takeaways

Frequently Asked Questions

What is the difference between clock source latency and network latency?

Source latency is the off-chip delay from the clock oscillator to the chip’s clock input pad (board trace + I/O buffer). Network latency is the on-chip delay from the clock pad through the clock tree buffers to each flip-flop’s clock pin. Pre-CTS, both are modelled with set_clock_latency. Post-CTS, the network latency is computed by propagating the clock edge through the actual routed clock tree.

What is clock skew?

Skew is the spatial difference in clock arrival time between two flip-flops in the same clock domain. Positive skew (the launch FF clock arrives before the capture FF clock) increases setup margin but reduces hold margin. CTS aims to minimise global skew (< 80 ps is a typical target) to avoid both setup violations from harmful skew and hold violations from local skew variations.

What is the difference between period jitter and cycle-to-cycle jitter?

Period jitter is the maximum deviation of any single clock period from the ideal period; it bounds how much shorter than nominal any one cycle can be. Cycle-to-cycle jitter is the maximum change between two adjacent periods, that is, how much the period shifts from one cycle to the next. Both are modelled in STA via clock uncertainty. Period jitter directly limits the maximum achievable clock frequency.

How does CPPR work with a specific example?

If two flip-flops share clock buffers BUF_1 (0.120 ns) and BUF_2 (0.180 ns), OCV with 5% derating makes BUF_1 = 0.126 ns (late for launch) and 0.114 ns (early for capture). CPPR recognises that BUF_1 cannot be simultaneously slow and fast: it removes the 0.012 ns pessimism from that buffer. For BUF_2 it removes another 0.018 ns. Total CPPR recovery = 0.030 ns — slack that would have been wasted without CPPR.

What are the typical post-CTS clock uncertainty values?

Post-CTS clock uncertainty drops to jitter-only because actual skew is now captured in propagated clock delays. Typical values: setup uncertainty 0.05–0.15 ns (period jitter + margin), hold uncertainty 0.03–0.08 ns. Pre-CTS setup uncertainty was 0.15–0.30 ns because it had to estimate skew. After CTS, update set_clock_uncertainty to the smaller post-CTS values before timing sign-off.
