Everything in STA revolves around timing paths. A timing path is a directed route through the circuit from a startpoint to an endpoint. The STA tool enumerates every possible path, computes the total delay along each one, and checks whether the signal arrives within the required window. Understanding how paths are traced, how arrival time is computed, and what makes a path critical is the foundation for all timing closure work.
A timing path always runs from a startpoint to an endpoint. STA defines these precisely:
| Role | Valid elements | What STA does here |
|---|---|---|
| Startpoint | Clock pin of a flip-flop (CK), primary input port | Initialises arrival time; data launches on this clock edge |
| Endpoint | Data pin of a flip-flop (D), primary output port | Checks timing; computes slack = required − arrival |
Between startpoint and endpoint, the path passes through combinational logic: AND gates, OR gates, muxes, adders, and the interconnect (wires) between them. STA traverses this logic as a directed acyclic graph (DAG), accumulating delay at every node.
The data launches from the flip-flop on the clock edge. STA models this as: at the clock edge, a new value appears at Q after clock-to-Q delay (Tclk2q). The path then starts propagating from Q through combinational logic. So the startpoint is the CK pin, and Tclk2q is the first delay segment in the path.
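Every delay in a timing path comes from one of two sources: cell delay, the delay through a logic cell from an input pin to its output pin, and net delay, the delay through the interconnect (wire) between a driver pin and a load pin.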
Pre-layout STA uses estimated net delays (wireload models). Post-layout STA uses actual extracted parasitics — this is why post-layout timing differs from synthesis timing.
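STA propagates arrival time forward from every startpoint using a simple rule at each node: the arrival time at the output of an arc equals the arrival time at its input plus the delay of that arc (arrival_out = arrival_in + delay).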
For a gate with multiple inputs (e.g. AND2), STA takes the worst arrival time across all inputs when computing the output arrival time — because the output cannot be driven until all inputs have settled.
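The propagation rule can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular STA tool is implemented: the function name `propagate_arrival`, the dictionary-based graph, and the AND2 numbers are all hypothetical, and only the worst-case (latest) arrival used for setup analysis is modelled.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def propagate_arrival(arcs, start_arrivals):
    """Return the latest arrival time (ns) at every pin of a timing graph.

    arcs: dict mapping (from_pin, to_pin) -> arc delay in ns.
    start_arrivals: dict mapping startpoint pins -> initial arrival time.
    """
    # Collect the fan-in of every pin so the sorter visits drivers first.
    preds = {}
    for src, dst in arcs:
        preds.setdefault(dst, set()).add(src)
        preds.setdefault(src, set())

    arrival = {}
    for pin in TopologicalSorter(preds).static_order():
        if preds[pin]:
            # Latest input arrival plus its arc delay wins.
            arrival[pin] = max(arrival[p] + arcs[(p, pin)] for p in preds[pin])
        else:
            # A pin with no fan-in is a startpoint (assumed 0.0 if not given).
            arrival[pin] = start_arrivals.get(pin, 0.0)
    return arrival


# Hypothetical AND2: input A arrives at 0.80 ns, input B at 1.10 ns,
# and both input-to-output arcs take 0.35 ns.
arcs = {("A", "Z"): 0.35, ("B", "Z"): 0.35}
arrival = propagate_arrival(arcs, {"A": 0.80, "B": 1.10})
print(round(arrival["Z"], 2))  # 1.45, set by the later input B
```

Processing the pins in topological order guarantees that every input arrival is final before the gate output is computed, which is why the graph must be acyclic.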
1. Propagate clock arrival from the source (PLL/oscillator) through the clock buffer tree to the CK pin of the launch FF. This is Tlaunch_clk.
2. Add Tclk2q (the FF’s own output delay after the clock edge). Arrival at Q = Tlaunch_clk + Tclk2q.
3. For each cell and net in the path, add cell delay and net delay. At multi-input gates, take the latest arriving input.
4. The final accumulated value is the data arrival time at the capture FF’s D pin.
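For setup analysis, the required time at the endpoint D pin is the latest the data can arrive and still be captured reliably: required (setup) = Tclk + Tcapture_clk − Tsu, where Tclk is the clock period, Tcapture_clk is the clock arrival at the capture FF’s CK pin, and Tsu is the library setup time. Setup slack = required − arrival.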
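For hold analysis, the required time is the earliest the data is allowed to arrive (it must not arrive before the capture FF has safely stored the previous value): required (hold) = Tcapture_clk + Thold, measured against the same clock edge that launched the data. Hold slack = arrival − required, using the earliest (minimum-delay) arrival.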
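The setup and hold checks can be written as two small functions. This is a sketch under simplifying assumptions (single-cycle path, no clock uncertainty or derating); the function names are hypothetical. The setup numbers are taken from the FF1-to-FF2 worked example in this section, while the hold time (0.05 ns) and the minimum-delay arrival (1.20 ns) are made-up values for illustration only.

```python
def setup_slack(arrival, period, capture_latency, t_setup):
    """Setup slack in ns: required time (latest legal arrival) minus arrival."""
    required = period + capture_latency - t_setup
    return required - arrival


def hold_slack(arrival, capture_latency, t_hold):
    """Hold slack in ns: arrival minus required time (earliest legal arrival)."""
    required = capture_latency + t_hold
    return arrival - required


# Setup: values from the worked example (4.00 ns period, 0.60 ns capture
# clock latency, 0.15 ns setup time, 1.91 ns data arrival).
print(round(setup_slack(1.91, 4.00, 0.60, 0.15), 2))  # 2.54

# Hold: hypothetical 0.05 ns hold time and 1.20 ns earliest data arrival.
print(round(hold_slack(1.20, 0.60, 0.05), 2))  # 0.55
```

Note the opposite sign conventions: setup slack is positive when data arrives early enough, hold slack is positive when data arrives late enough.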
Worked example: setup check on a register-to-register path from FF1 to FF2.

    Startpoint : FF1/CK  (clock pin of FF1)
    Endpoint   : FF2/D   (data pin of FF2)

    -- Clock path (launch) --
    Clock source                    :  0.00 ns
    Clock buffer tree to FF1 (CK)   :  0.50 ns   <- T_launch_clk
    FF1 clock-to-Q (T_clk2q)        :  0.20 ns
    Subtotal (data starts here)     :  0.70 ns

    -- Data path --
    Net:  FF1/Q  -> AND2/A          :  0.10 ns
    Cell: AND2 (A->Z)               :  0.35 ns
    Net:  AND2/Z -> OR2/A           :  0.12 ns
    Cell: OR2 (A->Z)                :  0.28 ns
    Net:  OR2/Z  -> INV/A           :  0.08 ns
    Cell: INV (A->ZN)               :  0.18 ns
    Net:  INV/ZN -> FF2/D           :  0.10 ns
    Data Arrival Time at FF2/D      :  1.91 ns

    -- Required time (setup) --
    Next clock edge                 :  4.00 ns   (250 MHz clock)
    Clock buffer tree to FF2 (CK)   :  0.60 ns   <- T_capture_clk
    Setup time (T_su) [library]     : -0.15 ns
    Required Time                   :  4.45 ns

    Setup Slack = 4.45 - 1.91 = +2.54 ns (PASS)

The critical path is the path with the worst (most negative) setup slack — the path that most constrains the maximum clock frequency. Once fixed, the next-worst path becomes the new critical path. This is why timing closure is iterative.
Typical report_timing invocations for finding critical paths:

    ## Top 20 worst setup paths across all endpoints
    report_timing -max_paths 20 \
                  -delay_type max \
                  -sort_by slack

    ## Top 10 worst paths in a specific clock domain
    report_timing -max_paths 10 \
                  -delay_type max \
                  -group clk_core

    ## Show full path detail for the single worst path
    report_timing -delay_type max \
                  -nworst 1 \
                  -path_type full_clock_expanded

    ## Quick WNS/TNS summary
    report_timing_summary

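For readers unfamiliar with the summary figures, the sketch below shows how WNS (worst negative slack) and TNS (total negative slack) relate to per-endpoint slacks. It is an illustration only, not a parser for any tool's report: the function name `wns_tns` is hypothetical, and every endpoint and slack value other than FF2/D is invented.

```python
def wns_tns(slacks):
    """Return (WNS, TNS) in ns for an iterable of endpoint slacks.

    WNS is the single worst slack; TNS sums only the negative slacks.
    """
    slacks = list(slacks)
    wns = min(slacks)
    tns = sum(s for s in slacks if s < 0)
    return wns, tns


# Hypothetical endpoint slacks; only FF2/D comes from the worked example.
endpoint_slacks = {"FF2/D": 2.54, "FF7/D": -0.12, "FF9/D": -0.30, "out_a": 0.05}

# Sorting by slack puts the critical path's endpoint first.
worst_first = sorted(endpoint_slacks.items(), key=lambda kv: kv[1])
print(worst_first[0])  # ('FF9/D', -0.3)

wns, tns = wns_tns(endpoint_slacks.values())
print(wns, round(tns, 2))  # -0.3 -0.42
```

WNS tells you how far the worst path is from passing; TNS indicates how widespread the violations are, which helps judge how many closure iterations remain.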
| Path Type | Startpoint | Endpoint | Constrained by |
|---|---|---|---|
| Reg-to-Reg | FF clock pin (CK) | FF data pin (D) | Clock period (create_clock) |
| Input-to-Reg | Primary input port | FF data pin (D) | set_input_delay |
| Reg-to-Output | FF clock pin (CK) | Primary output port | set_output_delay |
| Input-to-Output | Primary input port | Primary output port | set_input_delay + set_output_delay |
Paths between asynchronous clocks (no defined phase relationship) cannot be timed meaningfully. Declare such clocks with set_clock_groups -asynchronous so that STA does not check the paths between them, and verify the crossings with CDC analysis instead. Paths explicitly excluded with set_false_path are ignored. Paths with set_multicycle_path N have their setup check made against N × Tclk instead of 1 × Tclk. Day 4 (SDC) covers these constraints in detail.
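The effect of a multicycle constraint on the setup check can be illustrated with the numbers from the FF1-to-FF2 worked example. This is a sketch that covers the setup side only; the function name `setup_required` and the choice of N = 2 are hypothetical.

```python
def setup_required(period, capture_latency, t_setup, cycles=1):
    """Latest legal arrival (ns) at the capture D pin for an N-cycle path."""
    return cycles * period + capture_latency - t_setup


arrival = 1.91  # data arrival at FF2/D from the worked example
for n in (1, 2):
    required = setup_required(4.00, 0.60, 0.15, cycles=n)
    print(n, round(required, 2), round(required - arrival, 2))
# 1 4.45 2.54
# 2 8.45 6.54
```

Each additional cycle moves the capture edge one full clock period later, so the setup slack grows by exactly Tclk per cycle.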
Use report_timing -sort_by slack to find the worst paths quickly in PrimeTime.

A startpoint is where STA begins tracing a timing path. Valid startpoints are the clock pin (CK) of a flip-flop, which is the most common case, and primary input ports. The STA tool initialises the arrival time at each startpoint and propagates it forward through the combinational logic to the endpoint.
An endpoint is where STA checks timing and computes slack. Valid endpoints are the data pin (D) of a flip-flop and primary output ports. Every endpoint must have non-negative setup slack and non-negative hold slack for the design to meet timing.
A timing arc is the elementary delay between two pins — either through a logic cell (cell arc) or through a wire (net arc). Cell arcs are defined in the Liberty library and depend on input slew and output load. Net arcs are computed from parasitic RC values extracted from the physical layout (SPEF).
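To make the dependence of a cell arc on input slew and output load concrete, the sketch below interpolates a delay from a small two-dimensional lookup table. It assumes the table-based (NLDM-style) delay model commonly found in Liberty libraries; the function name `lookup_delay`, the table values, and the units are hypothetical and do not come from any real library.

```python
from bisect import bisect_right


def lookup_delay(slews, loads, table, slew, load):
    """Bilinearly interpolate a 2-D delay table at (slew, load).

    slews, loads: ascending index values; table[i][j] is the delay at
    slews[i], loads[j]. Points outside the table are extrapolated from
    the nearest pair of index values.
    """
    def bracket(axis, x):
        # Index of the lower neighbour, clamped so that i + 1 stays valid.
        i = bisect_right(axis, x) - 1
        return min(max(i, 0), len(axis) - 2)

    i, j = bracket(slews, slew), bracket(loads, load)
    ts = (slew - slews[i]) / (slews[i + 1] - slews[i])
    tl = (load - loads[j]) / (loads[j + 1] - loads[j])
    low = table[i][j] * (1 - tl) + table[i][j + 1] * tl
    high = table[i + 1][j] * (1 - tl) + table[i + 1][j + 1] * tl
    return low * (1 - ts) + high * ts


# Hypothetical 2x2 table: input slew in ns, output load in pF, delay in ns.
slews = [0.05, 0.20]
loads = [0.01, 0.05]
table = [[0.10, 0.20],
         [0.14, 0.26]]

# Midway between both pairs of index points.
print(round(lookup_delay(slews, loads, table, 0.125, 0.03), 3))  # 0.175
```

Because the input slew at each gate depends on the previous stage, slew has to be propagated along the path together with arrival time for the lookups to be meaningful.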
A logic gate cannot produce a stable output until all its inputs have settled. Therefore the output arrival time is determined by the last input to arrive. STA conservatively uses the worst (latest) input arrival time to compute the output arrival time, ensuring all paths are checked under the worst case.
Pre-layout (pre-route) STA estimates wire delays using wireload models — statistical estimates based on fanout and design size. Post-layout STA uses actual parasitic RC values extracted from the placed-and-routed netlist (SPEF). Post-layout timing is more accurate and is the sign-off standard; pre-layout timing gives faster but less precise feedback during synthesis and early floorplanning.