What is the difference between global placement and detailed placement?

Global placement assigns approximate positions to cells, minimizing total wire length while allowing overlaps. Legalization and detailed placement then resolve the overlaps, snap cells to legal placement rows, and perform local swaps to improve timing and wire length. Global placement sets the broad topology; the later steps refine it.

Why is scan chain reordering done after placement?

Scan chain order is determined during synthesis without knowledge of physical positions. After placement, the physical order of flip-flops is known. Reordering the scan chain to match the physical layout shortens scan wiring, often by 5–10×, which reduces routing congestion and improves scan-mode timing.

What is routing congestion overflow in placement?

Routing overflow is the excess routing demand over available routing tracks in a local region. Overflow = max(0, demand - supply). Values above 1–2% indicate congestion hotspots that may prevent the router from completing without DRC violations. Placement tools penalize high-overflow regions to spread cells and reduce demand.

Placement – Chapter 4 | RTL to Silicon

What is legalization in VLSI placement?

Legalization is the process of moving cells from their approximate global placement positions to legal positions: snapped to placement row boundaries, non-overlapping, and within the core area. It must respect row height, site alignment, and power rail orientation.

In this chapter

Global vs Detailed Placement
Timing-Driven Placement
Congestion-Driven Placement
Legalization
Scan Chain Reordering
Placement Blockages & Halos
Interpreting Congestion Maps & Timing
Key Takeaways

1. Global vs Detailed Placement

Placement happens in two main stages. Global placement assigns approximate locations to cells, optimizing for wire length and congestion at a coarse level; cells may overlap at this stage. Detailed placement then legalizes the result (resolving overlaps and snapping cells to legal placement rows) and performs local optimization. The table below breaks these stages into finer steps.

Stage	Objective	Cell overlap?	Method
Global placement	Minimize total wire length (HPWL), spread cells evenly	Allowed (cells are "fuzzy")	Force-directed, analytical (e-PLACE, NTUplace)
Legalization	Resolve overlaps, snap to placement rows	None after this step	Abacus algorithm, greedy row packing
Detailed placement	Local cell swaps to improve timing/WL	None	Simulated annealing, dynamic programming
Post-placement opt.	Fix timing violations found after legalization	None	Cell sizing, buffering, topology changes

Modern tools (Cadence Innovus, Synopsys ICC2) run all of these stages under a single placement-optimization command (place_opt_design in Innovus, place_opt in ICC2), iterating through the phases internally. Understanding the phases helps when debugging placement-related timing or congestion issues.
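
The HPWL objective named in the table can be made concrete with a short sketch. The Python below is illustrative only: the net representation (a list of pin coordinates per net) and the function names are assumptions made for this example, not any tool's API.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def net_hpwl(pins: Iterable[Point]) -> float:
    """Half-perimeter of the bounding box enclosing all pins of one net."""
    xs, ys = zip(*pins)
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets: List[List[Point]]) -> float:
    """Sum of per-net HPWL; nets with fewer than two pins contribute zero."""
    return sum(net_hpwl(net) for net in nets if len(net) >= 2)


nets = [
    [(0.0, 0.0), (3.0, 4.0)],              # 2-pin net: 3 + 4 = 7
    [(1.0, 1.0), (2.0, 5.0), (6.0, 2.0)],  # 3-pin net: 5 + 4 = 9
]
print(total_hpwl(nets))  # 16.0
```

HPWL is only an estimate of routed length. It is popular in global placement because it is cheap to evaluate and update as cells move.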

2. Timing-Driven Placement

Standard wire-length minimization (HPWL) treats all nets equally. Timing-driven placement weights critical nets more heavily — cells on the critical path are pulled closer together to minimize delay on those specific connections.

The key metric is criticality: a net's criticality is proportional to how close its slack is to the worst negative slack (WNS). A net on the critical path has criticality = 1.0; a net with large positive slack has criticality ≈ 0.

Timing-driven placement formula:
Net weight = 1 + α × criticality
where α is the timing weight factor (typically 2–5). Critical nets are weighted up to 6× more than non-critical nets.
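
A small sketch of the weighting formula follows. The text above fixes only the two end points of criticality (1.0 at WNS, about 0 for large positive slack), so the linear mapping in between and the `slack_ceiling_ps` parameter are assumptions made for illustration; real placers use their own criticality models.

```python
def criticality(slack_ps: float, wns_ps: float, slack_ceiling_ps: float) -> float:
    """Assumed linear map: 1.0 at WNS, 0.0 at or above slack_ceiling_ps."""
    if slack_ceiling_ps <= wns_ps:
        raise ValueError("slack_ceiling_ps must be greater than wns_ps")
    raw = (slack_ceiling_ps - slack_ps) / (slack_ceiling_ps - wns_ps)
    return min(1.0, max(0.0, raw))  # clamp to [0, 1]


def net_weight(crit: float, alpha: float = 5.0) -> float:
    """Net weight = 1 + alpha * criticality, as in the formula above."""
    return 1.0 + alpha * crit


wns, ceiling = -100.0, 400.0  # hypothetical values, in ps
for slack in (-100.0, 150.0, 400.0):
    c = criticality(slack, wns, ceiling)
    print(f"slack={slack:7.1f} ps  criticality={c:.2f}  weight={net_weight(c):.2f}")
```

With α = 5, the most critical net gets weight 6 and a net with ample slack gets weight 1, which matches the "up to 6×" ratio stated above.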

After global placement, the placer runs incremental STA using estimated wire delays (based on placed positions). Cells with negative slack are moved closer to their fanin/fanout, trading some wire-length optimality for timing improvement.

# Run placement with timing optimization (schematic syntax; exact command and option names differ between tools and versions)
place_opt \
  -effort high \
  -timing_driven \
  -congestion_effort medium

# After placement: check timing estimate
report_timing \
  -max_paths 10 \
  -path_type full \
  -delay_type max

# Check worst slack and total negative slack
report_design -timing_summary

3. Congestion-Driven Placement

Routing congestion occurs when the demand for routing tracks in a local region exceeds supply. A router that cannot find tracks must detour wires, which increases delay and can cause DRC violations. Placement prevents this by spreading cells to avoid local density hotspots.

How the placer estimates congestion

The die is divided into a routing estimation grid whose tiles are typically 5–10× the standard cell height on a side. For each tile, the placer estimates (a) the number of routing tracks available (supply) and (b) the number of wires that need to pass through (demand, estimated from net cuts). Overflow = max(0, demand − supply). The placer penalizes placements that increase overflow.
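
The per-tile overflow formula can be sketched in a few lines of Python. The grid layout (lists of per-tile track counts) and the way the global percentage is computed here (total overflow divided by total supply) are assumptions for illustration; tools differ in how they aggregate overflow into a single number.

```python
from typing import List, Tuple

Grid = List[List[int]]  # track counts per tile, indexed [row][col]


def tile_overflow(demand: int, supply: int) -> int:
    """Overflow = max(0, demand - supply) for a single tile."""
    return max(0, demand - supply)


def overflow_summary(demand: Grid, supply: Grid) -> Tuple[int, float, List[Tuple[int, int]]]:
    """Return (total overflow in tracks, overflow as % of supply, hotspot tiles)."""
    total_over = 0
    total_supply = 0
    hotspots = []
    for r, (d_row, s_row) in enumerate(zip(demand, supply)):
        for c, (d, s) in enumerate(zip(d_row, s_row)):
            over = tile_overflow(d, s)
            total_over += over
            total_supply += s
            if over > 0:
                hotspots.append((r, c))
    percent = 100.0 * total_over / total_supply if total_supply else 0.0
    return total_over, percent, hotspots


supply = [[10, 10], [10, 10]]
demand = [[8, 12], [10, 7]]
print(overflow_summary(demand, supply))  # (2, 5.0, [(0, 1)])
```

Note that unused tracks in one tile do not cancel overflow in another: the `max(0, ...)` clamp is what makes overflow a measure of local hotspots rather than average utilization.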

# Check congestion after placement
report_placement_congestion

## Example output:
# Routing congestion summary:
# Layer M3 (horizontal): max overflow = 2 tracks at (450, 320)
# Layer M4 (vertical):   max overflow = 0
# Global overflow:        4.2%  ← target < 1% for clean routing

# Visualize congestion heatmap
display_congestion_map -layer M3

# If congestion hotspot found, add local spreading:
refine_placement \
  -focus_area {400 280 500 360} \
  -congestion_effort ultra

4. Legalization

After global placement, cells are at approximate positions and may overlap. Legalization is the process of moving cells to legal positions: snapped to placement row boundaries, non-overlapping, and within the placement area.

Placement rows

Standard cells must sit in predefined placement rows: horizontal stripes across the core area whose height equals the standard cell height (e.g., 0.72 µm in a 7nm technology). Each row has power rails along its top and bottom edges, VDD on one and VSS on the other. Cells in adjacent rows share rails; this "rail-sharing" structure is what makes standard cell layout efficient.

Legalization must also respect:

Site alignment: Cell left edges must align to placement sites (typically 0.09 µm pitch in 7nm)
Orientation: Alternating rows are flipped vertically (MX) so adjacent VDD/VSS rails merge
Well-tap spacing: N-well and P-well tap cells must be within max-spacing rules (prevents latch-up)

Legalization displacement: The quality of global placement determines how much cells move during legalization. A poor global solution forces cells to displace far from their optimal positions, destroying the timing and congestion improvements made in global placement. Displacement > 10% of core edge length is a warning sign.
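
The row, site, and orientation constraints above can be illustrated with a deliberately simple legalizer. This is a greedy left-to-right row packing written for this example, not the Abacus algorithm and not what any commercial tool implements; the `Cell` type, the function name, and the use of integer nanometers are all assumptions. Well-tap insertion is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Cell:
    name: str
    x: int      # global-placement x of the left edge, in nm
    y: int      # global-placement y of the bottom edge, in nm
    width: int  # cell width, in nm


def legalize(cells: List[Cell], num_rows: int, row_height: int,
             site_width: int, core_width: int) -> Dict[str, Tuple[int, int, str]]:
    """Greedy legalization: returns name -> (x, y, orientation)."""
    sites_per_row = core_width // site_width
    next_free = [0] * num_rows  # first unoccupied site index in each row
    placed: Dict[str, Tuple[int, int, str]] = {}
    for cell in sorted(cells, key=lambda c: c.x):  # left-to-right sweep
        w_sites = -(-cell.width // site_width)      # ceiling division
        desired = round(cell.x / site_width)
        best = None
        for row in range(num_rows):
            # Sites left of next_free are treated as used; stay inside the row.
            site = min(max(next_free[row], desired), sites_per_row - w_sites)
            if site < next_free[row]:
                continue  # not enough room left in this row
            cost = abs(site * site_width - cell.x) + abs(row * row_height - cell.y)
            if best is None or cost < best[0]:
                best = (cost, row, site)
        if best is None:
            raise ValueError(f"no legal position for {cell.name}")
        _, row, site = best
        next_free[row] = site + w_sites
        orient = "R0" if row % 2 == 0 else "MX"  # flip alternate rows
        placed[cell.name] = (site * site_width, row * row_height, orient)
    return placed


cells = [Cell("a", 100, 50, 180), Cell("b", 120, 60, 90), Cell("c", 130, 700, 180)]
result = legalize(cells, num_rows=2, row_height=720, site_width=90, core_width=900)
for name, (x, y, orient) in result.items():
    print(name, x, y, orient)
```

In this example cells a and b overlap after global placement; the sketch keeps a near its target and pushes b to the next free sites in the same row, which is exactly the kind of displacement the paragraph above warns about when it grows large.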

5. Scan Chain Reordering Post-Placement

For DFT (Design for Test), flip-flops are connected into one or more scan chains, shift registers that allow test vectors to be loaded serially. The scan chain order is initially determined during synthesis, before the physical positions of the FFs are known; those positions only become available after placement.

If the scan chain order doesn't match the physical order of FFs, the scan-in wire must zigzag across the chip to reach FFs in sequence. This adds significant wire length and can cause timing violations on the scan path (which must meet its own timing constraints during test mode).

# Reorder scan chains to minimize scan wire length post-placement
set_scan_reorder_mode -effort high -optimize scan_wire_length
reorder_scan

# Verify scan chain connectivity after reordering
report_scan_chain -detail

## Before reorder: scan wire length = 18.4 mm  (zigzag)
## After reorder:  scan wire length = 3.1 mm   (physical order)
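
To show why physical ordering shortens the chain, here is a minimal Python sketch that reorders flip-flops with a greedy nearest-neighbor heuristic and compares Manhattan wire length before and after. The heuristic, the function names, and the coordinates are assumptions for illustration; production tools also respect chain partitioning, clock domains, and lockup latches, and the scan-out connection is ignored here.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def manhattan(a: Point, b: Point) -> float:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def chain_length(order: List[str], pos: Dict[str, Point], scan_in: Point) -> float:
    """Total Manhattan length from the scan-in port through the FFs in order."""
    total, here = 0.0, scan_in
    for ff in order:
        total += manhattan(here, pos[ff])
        here = pos[ff]
    return total


def reorder_nearest_neighbor(pos: Dict[str, Point], scan_in: Point) -> List[str]:
    """Greedy heuristic: always hop to the closest unvisited flip-flop."""
    remaining = set(pos)
    order, here = [], scan_in
    while remaining:
        # Ties are broken by name so the result is deterministic.
        nxt = min(remaining, key=lambda ff: (manhattan(here, pos[ff]), ff))
        order.append(nxt)
        remaining.remove(nxt)
        here = pos[nxt]
    return order


pos = {"ff0": (0, 0), "ff1": (10, 0), "ff2": (1, 0), "ff3": (11, 0), "ff4": (2, 0)}
scan_in = (0, 0)
synthesis_order = ["ff0", "ff1", "ff2", "ff3", "ff4"]
physical_order = reorder_nearest_neighbor(pos, scan_in)
print(chain_length(synthesis_order, pos, scan_in))  # 38.0 (zigzag)
print(physical_order, chain_length(physical_order, pos, scan_in))
```

Nearest-neighbor ordering is not optimal in general (the underlying problem resembles the traveling salesman problem), but it captures the main effect: removing the zigzag.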

6. Placement Blockages & Halos

Certain regions of the core must be kept free of standard cells. These are defined using placement blockages:

Blockage type	Effect	Typical use case
Hard blockage	No cells allowed inside the region	Macro keepout, analog isolation zone
Soft blockage	Placer avoids the region but can use it if needed	Preferred keepout (placer may override under congestion)
Partial blockage	Reduce density to a specified % in the region	Buffer zones around macros for routing access
Halo	Automatically follows a macro as it moves	Pre-placement macro margins (applied in floorplan)
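
The difference between the blockage types can be expressed as a small placement check. The sketch below is an assumed model written for this example (the `Rect` and `Blockage` types, the `may_place` function, and the area-based density rule are all hypothetical), not the semantics of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)

    def overlap_area(self, other: "Rect") -> float:
        w = min(self.x2, other.x2) - max(self.x1, other.x1)
        h = min(self.y2, other.y2) - max(self.y1, other.y1)
        return max(0.0, w) * max(0.0, h)


@dataclass(frozen=True)
class Blockage:
    region: Rect
    kind: str                 # "hard", "soft", or "partial"
    max_density: float = 1.0  # used only by "partial" blockages


def halo_region(macro: Rect, margin: float) -> Rect:
    """A halo is derived from the macro outline, so it moves with the macro."""
    return Rect(macro.x1 - margin, macro.y1 - margin,
                macro.x2 + margin, macro.y2 + margin)


def may_place(cell: Rect, blockage: Blockage, used_area: float = 0.0) -> bool:
    """used_area is the cell area already placed inside the blockage region."""
    overlap = cell.overlap_area(blockage.region)
    if overlap == 0.0:
        return True   # cell does not touch the region
    if blockage.kind == "hard":
        return False  # never allowed
    if blockage.kind == "partial":
        limit = blockage.max_density * blockage.region.area()
        return used_area + overlap <= limit
    return True       # soft: allowed; a real placer would add a cost instead


region = Rect(0, 0, 10, 10)
cell = Rect(8, 8, 12, 12)  # overlaps the region by 2 x 2 = 4 area units
print(may_place(cell, Blockage(region, "hard")))                # False
print(may_place(cell, Blockage(region, "partial", 0.5), 47.0))  # False
print(may_place(cell, Blockage(region, "soft")))                # True
```

A halo is modeled here as a region computed from the macro's current outline, for example `Blockage(halo_region(macro, 2.0), "hard")`, which is why it follows the macro when the macro is moved.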

7. Interpreting Congestion Maps & Timing After Placement

Reading a congestion heatmap

Congestion heatmaps display routing overflow as a color gradient (green → yellow → red). A red hotspot at a specific (x, y) location means the number of wires that need to pass through that tile exceeds available routing tracks. Solutions:

Reduce cell density near the hotspot (use refine_placement with local effort)
Add routing blockages on over-demanded layers near the hotspot to redirect demand
Move a nearby macro that's blocking the channel
Increase die size or reduce utilization in that region

Timing after placement

Post-placement timing uses estimated wire delays (from RC estimation based on placed cell positions). This is less accurate than post-route STA but provides an early indicator of timing problems. Key metrics to check:

Metric	Target	Action if violated
WNS (Worst Negative Slack)	≥ 0 ps (or close to 0 with margin)	Resize critical cells, move closer, pipeline
TNS (Total Negative Slack)	0 ps	Fix violating endpoints, starting with those that contribute the most negative slack
Max transition time	< 200 ps (7nm typical)	Upsize driver cell or add buffer
Max capacitance	Per library cell limit	Upsize driver or split net
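
The two slack metrics in the table are simple reductions over per-endpoint slack, as the sketch below shows. The endpoint names and the dictionary input are hypothetical; the convention used here is that the worst slack is reported as-is even when it is positive, which some tools instead report as zero.

```python
from typing import Dict, Tuple


def timing_summary(endpoint_slack_ps: Dict[str, float]) -> Tuple[float, float, int]:
    """Return (worst slack, total negative slack, violating endpoint count)."""
    slacks = list(endpoint_slack_ps.values())
    if not slacks:
        return 0.0, 0.0, 0
    worst = min(slacks)                      # WNS when negative
    negative = [s for s in slacks if s < 0]
    return worst, sum(negative, 0.0), len(negative)


slacks = {
    "u_core/reg_a/D": -50.0,
    "u_core/reg_b/D": -12.5,
    "u_core/reg_c/D": 30.0,
}
print(timing_summary(slacks))  # (-50.0, -62.5, 2)
```

WNS tells you how bad the single worst path is; TNS and the violating endpoint count tell you how widespread the problem is, which is why both are tracked.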

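Post-placement timing is optimistic: Actual post-route wire delays are typically 10–30% higher than estimated, because the router must detour around congestion. The increase applies to each path's wire delay rather than to its slack, so the slack loss can be large on wire-dominated paths: a WNS of −50 ps after placement can degrade to −80 to −100 ps after routing. Fix timing aggressively at the placement stage.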

✅ Chapter 4 Key Takeaways

Placement = Global placement → Legalization → Detailed placement → Post-placement optimization
Timing-driven placement weights critical nets to reduce delay on the critical path
Congestion-driven placement spreads cells to avoid routing overflow; target < 1% global overflow
Legalization snaps cells to placement rows with no overlap; excessive displacement degrades quality
Reorder scan chains post-placement to minimize scan wire length (often 5–10× reduction)
Fix timing aggressively after placement — post-route delays are 10–30% worse than estimated
