Vinod Khera

Community Member
Digital Implementation
Innovus

Unlocking PPA with Innovus: What’s New and How to Unleash it

10 Mar 2026 • 7 minute read

Design teams building low-power silicon face nonstop PPA pressure: reduce dynamic and leakage power, hold or shrink area, and still meet timing on irregular floorplans. The latest Cadence Innovus Implementation System release turns that pressure into predictable wins with upgrades across placement (GigaPlace), optimization (GigaOpt), clocking (CCOpt), routing/closure (New PRO), and AI assistance (Innovus+ AI). At Cadence DSG Tech Day, Manoj Rai from Cadence showcased what has changed, why it matters, and how to apply it, so readers searching for "Innovus PPA improvements" or "what’s new in Innovus" can quickly find clear answers that map to real flow switches. This blog is an excerpt from that session.

Placement that Starts with Power and Timing Realities (GigaPlace)

The latest Innovus Implementation System release significantly enhances GigaPlace with multiple PPA-driven capabilities.

  • Startpoint TNS Method. Beyond endpoint-only costing, GigaPlace also accounts for critical launch flops with worse Q-slack than D-slack by adding startpoint slack to the cost.
  • Total Cost = ∑endpoint_WNS + ∑startpoint_WNS.

This reweights timing force for unbalanced flops so launch–capture pairs converge earlier; be aware that a stronger timing force can move instances and shift the local WL distribution.
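As a rough illustration of the startpoint term, the cost above can be sketched in Python. The `Flop` type, field names, and slack values are invented for illustration; this is not an Innovus API.

```python
# Hypothetical sketch of the startpoint-aware timing cost described above.
from dataclasses import dataclass

@dataclass
class Flop:
    d_slack: float  # slack seen at the capture (D) side
    q_slack: float  # slack seen at the launch (Q) side

def timing_cost(flops):
    """Total Cost = sum of endpoint WNS + sum of startpoint WNS.

    Endpoint-only costing sums negative D-slack; the startpoint term
    adds negative Q-slack for launch flops whose Q-slack is worse,
    so unbalanced launch/capture pairs attract timing force too.
    """
    endpoint = sum(min(f.d_slack, 0.0) for f in flops)
    startpoint = sum(min(f.q_slack, 0.0)
                     for f in flops if f.q_slack < f.d_slack)
    return endpoint + startpoint

# A flop with healthy capture slack but a critical launch path now
# contributes to the cost, where endpoint-only costing ignored it.
flops = [Flop(d_slack=0.05, q_slack=-0.12), Flop(d_slack=-0.03, q_slack=0.10)]
print(timing_cost(flops))  # -0.15
```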

  • Complementing this, the Unbalanced Path‑Based SKP evaluates criticality on both sides of each flop and applies proportional timing weights across the entire critical path, improving WNS/TNS through GlobalPlace/GlobalOpt.
  • GigaPlace also introduces Advanced Pipeline Placement, which automatically collects pure F/F pipelines and balances stage spacing and point-to-point wirelengths. It eliminates skewed pipeline structures and produces smoother point-to-point wirelengths, yielding better data-path symmetry and higher achievable frequencies.

  • A major addition is Integrated Congestion‑Driven Placement (ICDP), replacing earlier padding-based approaches. Padding mainly helps local traffic; ICDP relocates long-net sources/sinks out of hotspots, so through traffic over macros/blockages clears more reliably than with padding alone.
  • Rounding out the improvements, Switching Power Placement (SPP) integrates activity-weighted wirelength directly into the placement cost function. It reduces the wirelengths of high-toggle nets, which helps lower overall switching power. This method is especially effective for designs with highly skewed activity profiles.
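A minimal sketch of an activity-weighted wirelength term of the kind SPP folds into the placement cost. The net coordinates, toggle rates, and the `alpha` scaling below are assumptions for illustration, not tool internals.

```python
# Illustrative activity-weighted wirelength cost, as in switching-power
# placement: high-toggle nets are penalized more per unit length.
def weighted_wirelength(nets, alpha=1.0):
    """Sum of half-perimeter wirelength scaled by (1 + alpha * toggle_rate).

    High-toggle nets dominate the cost, so a placer minimizing it pulls
    their pins together first, cutting switched capacitance.
    """
    total = 0.0
    for pins, toggle_rate in nets:
        xs = [x for x, _ in pins]
        ys = [y for _, y in pins]
        hpwl = (max(xs) - min(xs)) + (max(ys) - min(ys))
        total += hpwl * (1.0 + alpha * toggle_rate)
    return total

# Two nets with identical geometry: the high-activity net (toggle 0.8)
# costs 1.8x its length, the quiet net (toggle 0.05) only 1.05x.
nets = [([(0, 0), (10, 5)], 0.8), ([(0, 0), (10, 5)], 0.05)]
print(weighted_wirelength(nets))  # 42.75
```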

Together, these upgrades make GigaPlace far more time-sensitive, congestion-aware, and power-intelligent, enabling substantial PPA gains across modern high-density designs.

Optimization that Moves Instances, Not Just Numbers (GigaOpt)

To reach the desired PPA, the Innovus Implementation System offers many options, including mega options, new path compaction, and more, as detailed below:

  • Mega options offer an easy, clear way to create a flow/recipe for much better PPA. Timing, power, and area effort can be set in both the LUI and the CUI. Use the explicit knobs to set effort and ROI expectations:

Timing effort:

                      LUI:  setOptMode -opt_timing_effort
                      CUI:  set_db opt_timing_effort standard/high

Power effort:

                      LUI:  setOptMode -opt_power_effort
                      CUI:  set_db opt_power_effort none/low/high/ultrahigh

Area effort:

                      LUI:  setOptMode -opt_area_effort
                      CUI:  set_db opt_area_effort standard/high

  • New path compaction (local placement refinement) tunes the CPR solver by assigning weights to instances with a higher probability of movement, while minimizing side-path impact. It integrates instance-movement transforms with combinational and sequential path balancing, which leads to better placement for critical timing paths. The improved path compaction performs:
    • Better critical path exploration and working set creation.
    • Weight-based prioritization of instances in the working set to guide CPR on movement.
    • Improved core CPR engine for better cost computation for local refinement.
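The weight-based prioritization step above can be sketched as follows. The weighting heuristic (criticality divided by side-path fanout) is an assumption chosen to illustrate "high probability of movement, minimal side-path impact"; it is not the actual CPR cost model.

```python
# Hedged sketch of weight-based instance prioritization for a local
# refinement (path compaction) pass; all names are illustrative.
def prioritize_working_set(instances):
    """Rank (name, worst_slack, side_fanout) tuples by movement weight.

    Weight grows with criticality (how negative the worst path slack is)
    and shrinks with side-path fanout, so the solver moves the instances
    with the best ratio of timing gain to collateral disturbance first.
    """
    def weight(inst):
        name, worst_slack, side_fanout = inst
        criticality = max(-worst_slack, 0.0)
        return criticality / (1.0 + side_fanout)
    return sorted(instances, key=weight, reverse=True)

# u2 is slightly less critical than u1 but has no side paths, so it is
# the safest high-value move and ranks first.
ws = [("u1", -0.10, 8), ("u2", -0.08, 0), ("u3", 0.02, 1)]
print([name for name, *_ in prioritize_working_set(ws)])  # ['u2', 'u1', 'u3']
```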
  • Instead of end‑stage skewing, the Innovus Implementation System applies pervasive global skew throughout the flow to maximize useful skew, reduce upfront power, and create margin for power reclaim while minimizing timing churn.

  • New hold optimizer improves hold TNS, area, and power, and provides an extra performance boost, automatically improving QoR; the session showed clear gains across the designs evaluated.

  • Power optimization: XOR‑tree gating disables clocks when data is stable, while data gating ANDs the D pin with the ICG enable on high‑activity flops. Expect post‑CTS power reclaim to require timing recovery (GlobalOpt) to retain the power gains.
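To see why gating pays off, here is a toy dynamic-power estimate using the standard P = alpha * C * V^2 * f relation. The capacitance, voltage, frequency, and residual toggle fraction are invented numbers, not measured Innovus results.

```python
# Toy estimate of clock-pin switching power before and after gating.
def dynamic_power(toggle_rate, cap_ff, vdd=0.8, freq_mhz=1000.0):
    """P = alpha * C * V^2 * f, returned in microwatts for fF/MHz inputs."""
    return toggle_rate * cap_ff * 1e-15 * vdd**2 * freq_mhz * 1e6 * 1e6

# Ungated, the clock pin toggles every cycle (alpha = 1.0). Gating the
# clock when the enable is low cuts alpha to the fraction of cycles the
# data actually changes (assume 15% here).
ungated = dynamic_power(toggle_rate=1.0, cap_ff=2.0)
gated = dynamic_power(toggle_rate=0.15, cap_ff=2.0)
print(f"{ungated:.3f} uW -> {gated:.3f} uW")  # 1.280 uW -> 0.192 uW
```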

Clocking: Earlier Intelligence, Consistent Behavior, Lower Power (CCOpt)

Clock Concurrent Optimization (CCOpt) is a technology integrated into the Innovus Implementation System.

Early Clock Flow (ECF) V2: ECFV2 improves upon the original ECF by addressing issues such as de-cloning after placement and a lack of timing awareness. It moves de-cloning before placement and adds physical awareness to better guide incremental placement. Additionally, ECFV2 enhances clock clustering and de-clustering for improved clock correlation.

It also enhances activity-driven clock tree synthesis (CTS) by reducing wire length on high-activity nets and optimizing clock network resizing through iterative power-driven resizing. These advancements result in more efficient timing, power management, and overall design performance.

Activity‑Driven CTS V2 and Clock‑Gate Push‑Up

Activity-driven CTS V2 and clock‑gate push‑up together cut activity‑weighted capacitance on the hottest parts of the clock tree by pushing the ICG upstream, logically and physically where allowed. This makes the highly toggling segment shorter and driven through fewer, lower-capacitance elements; even if a quieter branch becomes longer, the overall activity-weighted wire length decreases, and switching power improves.

Activity-driven CTS V2 then resizes under an activity-weighted cost, accepting small total wirelength increases when they lower power in high-activity regions.

Together, they deliver consistent clock‑power reductions with minimal side effects.
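The trade described above (a shorter always-toggling trunk in exchange for a longer gated branch) can be sketched with a simple activity-weighted length metric; the segment lengths and activities below are invented for illustration.

```python
# Sketch of the activity-weighted wire metric behind clock-gate push-up.
def activity_weighted_length(segments):
    """Sum of length_i * activity_i over clock-tree segments."""
    return sum(length * activity for length, activity in segments)

# Before: the ICG sits near the leaves, so a long upstream trunk toggles
# every cycle (activity 1.0) and only a short stub is gated (0.2).
before = activity_weighted_length([(100.0, 1.0), (20.0, 0.2)])
# After push-up: the always-toggling segment shrinks even though the
# gated branch gets longer; the weighted total still drops sharply.
after = activity_weighted_length([(30.0, 1.0), (90.0, 0.2)])
print(before, after)  # 104.0 48.0
```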

H‑Tree New Synthesis Features: More Cell Choices, Less Power

The latest Innovus Implementation System release introduces significant improvements to H-tree synthesis, focused on reducing power consumption and enhancing insertion delay in clock distribution networks. The new H-Tree features now support multiple trunk cells from the same cell family in the trunk section of the tree. These cells can have lower or equal drive strength compared to the original reference H-Tree, allowing the tool to create a more power-efficient trunk while maintaining timing integrity.             

Similarly, Innovus Implementation System now supports multiple options for final (leaf) cells, allowing designers to substitute the traditional buffer with an inverter when polarity constraints allow. This flexibility can reduce one tree level and improve overall insertion delay without compromising correctness.

Best practice guidance: use multiple trunk cells from the same family with a drive equal to or lower than the reference trunk; allow an inverter at the leaf when polarity permits to remove one level. Example commands:

create_flexible_tree -trunk_cell {INVX24 INVX16 INVX8} -final_cell BUFX24
create_flexible_tree -trunk_cell {INVX24 INVX16 INVX8} -final_cell {BUFX24 INVX24}

Post‑Route Optimization: Reimagined (New PRO)

New PRO features overhaul closure after routing by fixing the root causes of late-stage instability and then introducing a staged, timing-accurate optimization ladder. The legacy flow suffered from weak pre‑/post‑route correlation, coarse congestion/SI/topology estimation, a limited trackOpt transform set, and almost no router‑optimizer interplay, causing PPA loss, timing “jumps,” and ecoRoute DRC churn that stretched convergence.

The new PRO addresses this by moving from soft wires (global routes with dRoute-level accuracy, eDR) to hard wires (final detailed routes), letting the optimizer try bigger, smarter changes while seeing near-final parasitics. Concretely, it runs through four stages:

  • Init reclaims inefficient buffer chains and poor layer assignments to relieve congestion and set a better topology.
  • Soft uses eDR (detail‑route‑level) global routes with SI‑based timing to apply large, non‑legal transforms safely.
  • Medium narrows to moderate transforms.
  • Hard/ECO finishes with legal buffering/resizing only.

This reduces post-route timing jumps and ecoRoute DRC churn, and improves pre/post-route correlation.
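The four-stage ladder can be sketched as a simple stage-to-transform lookup. The stage names follow the text; the transform-set labels are illustrative assumptions, not the tool's internal transform names.

```python
# Hedged sketch of the staged PRO optimization ladder described above.
PRO_STAGES = [
    ("init",   {"buffer_chain_reclaim", "layer_reassign"}),
    ("soft",   {"large_nonlegal_transforms"}),  # eDR routes + SI timing
    ("medium", {"moderate_transforms"}),
    ("hard",   {"legal_buffering", "legal_resizing"}),  # ECO-safe only
]

def allowed(stage, transform):
    """Check whether a transform is permitted at a given stage."""
    return any(name == stage and transform in moves
               for name, moves in PRO_STAGES)

# Disruptive moves are confined to the early, soft-wire stages; by the
# hard/ECO stage only legal, routing-safe changes remain on the menu.
print(allowed("soft", "large_nonlegal_transforms"))  # True
print(allowed("hard", "large_nonlegal_transforms"))  # False
```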

Innovus+ AI: Engine‑Level Intelligence and Accumulated Learning

Innovus AI brings learning and search directly into the implementation engines to improve PPA and predictability. GigaOpt‑integrated AI (`-ai`) evaluates transform candidates in parallel across place/clock/route/signoff and selects higher‑ROI moves with improved MT scalability versus baseline.

Accumulated Learning (JedAI) carries forward cell selection and layer assignment experience to avoid unhelpful cells and discourage layers in problematic regions in subsequent runs, thereby improving PPA and TAT across iterations.

For usability and data science workflows, Innovus Implementation System embeds a Python UI so you can call Python AI libraries directly on the live database (and connect to JedAI) to mine instances/pins, prototype heuristics, and operationalize learned policies inside the flow.

What It Means in Practice

On recent customer designs (25.12), -ai runs demonstrated reduced density and total/dynamic power; the exploration adds run-time overhead, which typically shrinks as recipes stabilize.

Design 1 density decreased by 1.8%, dynamic power improved by 2.7%, total power declined by 5.8%, and TAT improved approximately 1.7 times. Similarly, Design 2 density decreased by 0.23%, dynamic power increased by 0.9%, total power decreased by 2.1%, and TAT improved about 1.4 times.

Learn More

  • NTHU Makes a Human-Like Robot Arm with Stratus HLS and Innovus Implementation
  • Innovus+ Synthesis and Implementation System   
  • GigaPlace Solver-Based Placement Technology In the Innovus Implementation System
© 2026 Cadence Design Systems, Inc. All Rights Reserved.
