
Physical Design Flow

Physical design transforms a synthesized gate-level netlist into a manufacturable GDSII layout. Every decision — where to place each cell, how to route every wire, how to distribute the clock — directly determines timing, power, area, and yield of the final chip.

1. Physical Design Flow Overview

The physical design (PD) flow is a sequential series of transformations, each adding more geometric detail to the abstract netlist. Modern PD tools (Cadence Innovus, Synopsys IC Compiler II) execute these stages with iterative feedback loops:

Step 1: Import & Design Setup

Read netlist (Verilog), technology files (LEF/TECH), library timing (Liberty), and constraints (SDC). Initialize the die area.

Step 2: Floorplanning

Define die and core area. Place macros (memories, IPs). Establish power rings and trunk routes. Set aspect ratio and utilization target (typically 70–80%).

Step 3: Power Planning

Create the power grid: VDD/GND rings, stripes, and rails. Analyze IR drop. Ensure sufficient current delivery without voltage droop at any cell.

Step 4: Placement

Place standard cells in rows within the core area. Global placement minimizes wirelength; detailed placement legalizes cells to grid-aligned positions.

Step 5: Clock Tree Synthesis (CTS)

Insert clock buffers and inverters to distribute the clock with balanced latency to all flip-flop clock pins, minimizing skew within target (typically < 50 ps).

Step 6: Routing

Connect all nets with metal wires obeying DRC rules. Global routing assigns nets to routing regions; detailed routing assigns exact tracks and vias.

Step 7: Sign-off Verification

DRC, LVS, post-route STA with extracted parasitics (SPEF), IR drop analysis, EM (electromigration) check, and final timing sign-off.

Step 8: GDSII Tape-out

Stream out the verified layout in GDSII format to the foundry for mask generation and fabrication.

2. Floorplanning

Floorplanning is the highest-impact stage — decisions made here echo through every subsequent step. Key objectives:

Die vs. Core

The die includes I/O pads and ESD structures around the perimeter. The core (interior) holds standard cells, macros, and power structures.

Macro Placement

Memories and hard IPs placed at edges or corners to minimize routing congestion and maintain abutment with power stripes.

Utilization

Core utilization = (cell area) / (core area). 70–80% is typical — higher causes routing congestion; lower wastes die area.

Halos & Blockages

Placement blockages around macros prevent cells from being placed in electrically sensitive regions or routing channels.
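
As a rough illustration of the utilization formula above, the sketch below sizes a core from a total cell area, a target utilization, and an aspect ratio. The function names and example numbers are hypothetical, and aspect ratio is assumed here to mean height divided by width. Real floorplanners also account for macros, halos, and row/site granularity, which this ignores.

```python
import math

def core_utilization(cell_area_um2: float, core_area_um2: float) -> float:
    """Core utilization = (cell area) / (core area)."""
    return cell_area_um2 / core_area_um2

def core_dimensions(cell_area_um2: float, target_utilization: float,
                    aspect_ratio: float = 1.0) -> tuple[float, float]:
    """Return (width, height) of a core that hits the target utilization.

    aspect_ratio is assumed to be height / width (an illustrative convention).
    """
    core_area = cell_area_um2 / target_utilization
    width = math.sqrt(core_area / aspect_ratio)
    height = aspect_ratio * width
    return width, height

# Hypothetical design: 0.7 mm^2 of standard cells at a 70% utilization target.
width, height = core_dimensions(700_000.0, 0.70, aspect_ratio=1.0)
print(f"core is roughly {width:.0f} um x {height:.0f} um")
```

Raising the target toward 80% shrinks the core for the same cell area, which is the area-versus-congestion trade-off described under Utilization.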

3. Placement

Placement determines where each standard cell lives within the core area. The problem is NP-hard — tools use heuristic algorithms to approximate the global optimum.

| Stage | Goal | Algorithm | Constraint |
| --- | --- | --- | --- |
| Global Placement | Minimize total wirelength (HPWL) | Analytical / simulated annealing | Cells can overlap |
| Legalization | Move cells to legal rows without overlap | Abacus / Tetris | No overlaps, on-grid |
| Detailed Placement | Local optimization: timing, routability | Swap, shift, mirror | Legal positions only |
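
To make the legalization step more concrete, here is a heavily simplified, single-row greedy sketch in the spirit of Tetris-style legalization. It is not the algorithm of any particular tool: the function name, the tuple layout, and the assumption that cell widths are multiples of the site width are all illustrative, and it does not check the row's right edge. Abacus goes further by also shifting already-placed cells to reduce total movement.

```python
def legalize_row(cells, site_width, row_start=0.0):
    """Greedy single-row legalization (illustrative only).

    cells: list of (name, desired_x, width) from global placement.
    Returns {name: legal_x} with no overlaps and x snapped to placement sites.
    """
    placed = {}
    cursor = row_start  # leftmost x that is still free in this row
    for name, desired_x, width in sorted(cells, key=lambda c: c[1]):
        # Snap the desired position to the nearest site boundary.
        sites = round((desired_x - row_start) / site_width)
        snapped = row_start + sites * site_width
        # Never move left of the free cursor, so cells cannot overlap.
        x = max(snapped, cursor)
        placed[name] = x
        cursor = x + width
    return placed

# Hypothetical overlapping global-placement result: "a" and "b" collide.
legal = legalize_row([("a", 0.3, 2.0), ("b", 1.1, 1.0), ("c", 5.6, 1.0)],
                     site_width=1.0)
print(legal)
```

In this example cell "b" is pushed right until it clears "a", while "c" only snaps to its nearest site.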

Timing-driven placement: Modern tools weight net criticality during global placement — critical-path nets are shortened aggressively at the cost of non-critical wirelength. This is controlled by placement constraints derived from SDC timing budgets.
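
The wirelength objective and the effect of criticality weighting can be sketched as follows. HPWL (half-perimeter wirelength) is the half-perimeter of the bounding box around a net's pins. The net names, pin coordinates, and the weight of 3.0 are made-up values for illustration; real tools derive weights from timing slack in ways not shown here.

```python
def hpwl(pins):
    """Half-perimeter wirelength of one net: bounding-box width + height."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def weighted_hpwl(nets, weights=None):
    """Sum of HPWL over all nets, each scaled by its criticality weight.

    nets: {net_name: [(x, y), ...]}; weights default to 1.0 per net.
    """
    weights = weights or {}
    return sum(weights.get(name, 1.0) * hpwl(pins)
               for name, pins in nets.items())

# Hypothetical two-net design.
nets = {
    "n_crit": [(0, 0), (30, 10)],
    "n_data": [(5, 5), (15, 25), (10, 0)],
}
total = weighted_hpwl(nets)                       # plain HPWL objective
weighted = weighted_hpwl(nets, {"n_crit": 3.0})   # critical net weighted up
print(total, weighted)
```

With the higher weight, any move that shortens `n_crit` reduces the objective three times as much as the same reduction on `n_data`, which is why critical nets end up shorter at the expense of others.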

4. Clock Tree Synthesis (CTS)

After placement, the clock distribution network is built. Ideally the clock edge reaches every flip-flop at the same instant; even 100 ps of skew on a 1 GHz clock (1000 ps period) consumes 10% of the cycle budget. CTS inserts clock buffers and inverters to balance arrival times across all sinks.

Key CTS Metrics

Skew

Max difference in clock arrival time between any two FFs in the same clock domain. Target: typically under 50–100 ps, depending on the clock period.

Latency

Total delay from clock source to sink FF. Keeping latency matched across related (synchronous) clock domains is critical for timing paths that cross between them.

Insertion Delay

Total buffer chain delay added by the CTS. Larger designs with more sinks need longer chains → higher latency.
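
The metrics above reduce to simple arithmetic on per-sink clock arrival times. The sketch below assumes a hypothetical dictionary of arrival times in picoseconds (the flip-flop names and values are invented) and reports skew both in absolute terms and as a fraction of the clock period.

```python
def clock_metrics(arrival_ps, period_ps):
    """Summarize latency and skew for one clock domain.

    arrival_ps: {sink_name: clock arrival time in ps, measured from the source}.
    """
    max_latency = max(arrival_ps.values())
    min_latency = min(arrival_ps.values())
    skew = max_latency - min_latency
    return {
        "max_latency_ps": max_latency,
        "min_latency_ps": min_latency,
        "global_skew_ps": skew,
        "skew_fraction_of_period": skew / period_ps,
    }

# Hypothetical arrival times for three sinks on a 1 GHz (1000 ps) clock.
arrivals = {"ff_a": 420.0, "ff_b": 455.0, "ff_c": 520.0}
metrics = clock_metrics(arrivals, period_ps=1000.0)
print(metrics)
```

Here the 100 ps spread between `ff_a` and `ff_c` is 10% of the 1000 ps period.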

H-Tree Topology

The H-tree distributes the clock via H-shaped branches of equal wire length at each level. Each split point has identical wire delay to both children, achieving near-zero geometric skew.
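
A toy model can show why a symmetric H-tree starts with zero skew and how a buffer on one branch unbalances it. Everything here is an assumption for illustration: a binary tree with the same wire delay on every branch of a given level, a fixed 50 ps per inserted buffer, and branches identified by tuples of 0/1 choices from the root. Load-dependent delay, slew, and on-chip variation are ignored.

```python
from itertools import product

BUFFER_DELAY_PS = 50.0  # assumed delay added by each inserted buffer

def sink_latencies(level_wire_delay_ps, buffers=None):
    """Latency of every sink in a symmetric binary clock tree.

    level_wire_delay_ps[i] is the wire delay of each branch at level i.
    buffers maps a branch path such as (0,) or (0, 1) to a buffer count.
    """
    buffers = buffers or {}
    depth = len(level_wire_delay_ps)
    latencies = {}
    for sink in product((0, 1), repeat=depth):
        total = 0.0
        for level in range(depth):
            branch = sink[: level + 1]  # branch taken at this level
            total += level_wire_delay_ps[level]
            total += buffers.get(branch, 0) * BUFFER_DELAY_PS
        latencies[sink] = total
    return latencies

def global_skew(latencies):
    """Largest difference in arrival time between any two sinks."""
    return max(latencies.values()) - min(latencies.values())

wire = [40.0, 20.0]  # hypothetical per-level wire delays in ps
balanced = sink_latencies(wire)
one_sided = sink_latencies(wire, {(0,): 1})             # buffer on left trunk
rebalanced = sink_latencies(wire, {(0,): 1, (1,): 1})   # matching buffer on right
print(global_skew(balanced), global_skew(one_sided), global_skew(rebalanced))
```

Adding the matching buffer restores zero skew but raises the latency of every sink, which is the skew-versus-insertion-delay trade-off CTS has to manage.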

5. Routing

Routing connects all nets with metal wires on the available metal layers. Modern designs use 10–15 metal layers. Lower layers (M1–M3) handle local connections; upper layers (M6–M10+) handle global signals and power distribution.

| Stage | What It Does | Output |
| --- | --- | --- |
| Global Routing | Assigns nets to coarse routing regions (G-cells); estimates congestion | G-cell routing guide |
| Track Assignment | Assigns global routes to specific metal tracks | Track-assigned routes |
| Detailed Routing | Determines exact wire shapes, vias, and connections obeying all DRC rules | Full GDSII-ready layout |
| Search and Repair | Iteratively fixes remaining DRC violations | DRC-clean layout |
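
The congestion estimate produced during global routing can be approximated with a small sketch: route each two-pin net as an L-shape over a grid of G-cells, count how many nets cross each G-cell, and flag cells whose demand exceeds capacity. The L-shaped routing, the single per-cell capacity number, and the sample nets are simplifying assumptions; production routers model per-layer, per-direction edge capacities and rip up and reroute congested nets.

```python
from collections import Counter

def l_route(src, dst):
    """G-cells visited by a horizontal-then-vertical L-shaped route."""
    (x0, y0), (x1, y1) = src, dst
    step_x = 1 if x1 >= x0 else -1
    step_y = 1 if y1 >= y0 else -1
    cells = [(x, y0) for x in range(x0, x1 + step_x, step_x)]
    cells += [(x1, y) for y in range(y0 + step_y, y1 + step_y, step_y)]
    return cells

def congestion_overflow(nets, capacity):
    """Return {gcell: demand - capacity} for every over-subscribed G-cell."""
    demand = Counter()
    for src, dst in nets:
        demand.update(l_route(src, dst))
    return {cell: d - capacity for cell, d in demand.items() if d > capacity}

# Hypothetical two-pin nets given as (source G-cell, destination G-cell).
nets = [((0, 0), (2, 1)), ((0, 0), (2, 0)), ((1, 0), (1, 2))]
overflow = congestion_overflow(nets, capacity=2)
print(overflow)
```

In this example only G-cell (1, 0) is over capacity, because all three nets pass through it.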

6. Physical Sign-Off

Before tape-out, the layout must pass a rigorous set of checks. All checks run on the final routed layout; DRC and LVS examine the layout geometry and connectivity directly, while post-route STA uses parasitics (SPEF file) extracted from that layout by the extraction tool.

DRC

Design Rule Check — verifies layout geometry satisfies all foundry rules: minimum width, spacing, enclosure, density. Zero violations required for tape-out.

LVS

Layout vs. Schematic — verifies connectivity and device parameters match the netlist. Detects opens, shorts, missing vias, and parameter mismatches.

Post-Route STA

Static Timing Analysis with RC parasitics extracted from the layout. All setup/hold paths must meet timing at all PVT corners.

IR Drop / EM

Power grid IR drop must stay below 5–10% of VDD. Electromigration checks ensure metal/via current density is within limits for 10+ year lifetime.
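
The IR drop limit can be illustrated with a one-dimensional model of a single power stripe fed from one end, where each segment carries the current of every tap downstream of it. The segment resistances, tap currents, supply voltage, and the 5% threshold below are illustrative values only; sign-off tools solve the full power grid with extracted resistances and per-instance currents.

```python
def ir_drop_along_stripe(segment_res_ohm, tap_currents_a):
    """Cumulative voltage drop (V) at each tap of a stripe fed from one end.

    Segment i sits between tap i-1 (or the supply) and tap i, so it carries
    the sum of the currents drawn at taps i, i+1, ...
    """
    drops = []
    drop = 0.0
    for i, resistance in enumerate(segment_res_ohm):
        downstream_current = sum(tap_currents_a[i:])
        drop += resistance * downstream_current
        drops.append(drop)
    return drops

# Hypothetical stripe: three 0.5-ohm segments, 10 mA drawn at each tap.
VDD = 0.8            # assumed supply voltage in volts
LIMIT_FRACTION = 0.05  # assumed budget: 5% of VDD
drops = ir_drop_along_stripe([0.5, 0.5, 0.5], [0.01, 0.01, 0.01])
worst_fraction = max(drops) / VDD
print(f"worst drop {max(drops) * 1000:.1f} mV, "
      f"{worst_fraction * 100:.2f}% of VDD, "
      f"pass={worst_fraction <= LIMIT_FRACTION}")
```

The farthest tap sees the largest drop, which is why stripes are usually fed from both ends and spaced densely in high-current regions.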

FAQ

What is the physical design flow?

The physical design flow converts a synthesized netlist into a manufacturable GDSII layout. The main stages are: floorplanning, power planning, placement, clock tree synthesis (CTS), routing, and sign-off verification (DRC, LVS, STA, IR drop). Modern PD tools run these stages iteratively with timing feedback.

What does clock tree synthesis (CTS) do?

CTS distributes the clock from its source (PLL or clock port) to all flip-flop clock pins with balanced delay and minimal skew. The tool inserts clock buffers and inverters along tree branches, targeting zero skew and bounded latency. Common topologies include H-tree (equal-length branches) and fishbone networks for irregular floorplans.

What is clock skew and how does it affect timing?

Clock skew is the difference in clock arrival time between two flip-flops. Positive skew (the capturing FF gets its clock later than the launching FF) relaxes setup but tightens hold. Negative skew helps hold but worsens setup. Excessive skew in either direction makes timing closure harder: negative skew reduces the effective clock period available for data propagation, while positive skew eats into hold margin.

What is the difference between DRC and LVS?

DRC (Design Rule Check) verifies the layout geometry satisfies foundry process design rules: minimum spacing, width, enclosure, and density. LVS (Layout vs. Schematic) verifies the layout connectivity and device parameters match the post-place-and-route netlist. Both must pass with zero violations before tape-out.
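
The effect of clock skew on setup and hold can be checked with the standard single-cycle slack equations. In this sketch skew is defined as capture-clock arrival minus launch-clock arrival, so a positive value means the capturing flip-flop is clocked later. All delay values are hypothetical picosecond figures, and effects such as clock uncertainty and derating are left out.

```python
def setup_slack(period, skew, t_clk_to_q, t_logic_max, t_setup):
    """Setup slack: data must arrive t_setup before the next capture edge."""
    return (period + skew) - (t_clk_to_q + t_logic_max + t_setup)

def hold_slack(skew, t_clk_to_q, t_logic_min, t_hold):
    """Hold slack: data must stay stable t_hold after the same capture edge."""
    return (t_clk_to_q + t_logic_min) - (t_hold + skew)

# Hypothetical path on a 1 GHz clock; all values in ps.
PERIOD, T_CQ, T_SETUP, T_HOLD = 1000, 80, 50, 40
T_LOGIC_MAX, T_LOGIC_MIN = 850, 30

results = {}
for skew in (-60, 0, 60):
    results[skew] = (
        setup_slack(PERIOD, skew, T_CQ, T_LOGIC_MAX, T_SETUP),
        hold_slack(skew, T_CQ, T_LOGIC_MIN, T_HOLD),
    )
    print(f"skew={skew:+4d} ps  setup slack={results[skew][0]:4d} ps  "
          f"hold slack={results[skew][1]:4d} ps")
```

Moving the skew from 0 to +60 ps improves setup slack by 60 ps and reduces hold slack by the same amount, while -60 ps does the opposite and pushes this path into a setup violation.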