
Author

Paul McLellan

ST's Experience with Cadence Cerebrus

20 Jan 2023 • 3 minute read

At CadenceLIVE Europe back in Thanksgiving week, one of the presentations was by Olivier Uliana of STMicroelectronics, titled "Cerebrus PPA Optimization on the Next-Generation High-End Microcontroller CPU Core". In case you've forgotten what Cadence Cerebrus is, you can see my post from when we announced the product, Cadence Cerebrus - Intelligent Chip Explorer. Or here is the value proposition from the product page:

The Cadence Cerebrus Intelligent Chip Explorer is a revolutionary, machine learning-driven, automated approach to chip design flow optimization. Block engineers specify the design goals, and Cerebrus will intelligently optimize the Cadence digital full flow to meet these power, performance, and area (PPA) goals in a completely automated way. By adopting Cerebrus, it is possible for engineers to concurrently optimize the flow for multiple blocks, which is especially important for the large, complex system-on-chip (SoC) designs needed for today’s ever more powerful electronic systems. Additionally, through the Cerebrus full flow reinforcement learning technology, engineering team productivity is greatly improved.

The Design Used

So what was ST's experience? The test vehicle was:

  • Sub-20nm technology
  • 580K instances
  • 28 macros
  • Multi-Vt/PB standard cells
  • FBB (forward body bias)
  • AVS (adaptive voltage scaling)
  • LVF and POCV margins

So not a billion transistor chip but still pretty sizable. And it came with the usual set of implementation challenges:

  • RC effect is a major contributor to timing
  • Several dominant timing corners due to body bias and voltage compensation
  • Choice of dominant corners is key for multi-corner implementation
  • Setup/hold conflict is complex due to RC effect and usage of useful skew
  • Xtalk effect becomes an important contributor to timing
  • Context-based timing requires aggressive spacing cells abutment rules
  • Electromigration rules require specific prevention on clock and data paths
  • Persistent margin (synthesis/placement/CTS/route) used for signoff timing correlation

Cadence Cerebrus

ST's expectations were to:

  • Use Cadence Cerebrus machine-learning to optimize the core PPA much more than the traditional manual flow
  • Push the frequency to the target and optimize the leakage and area
  • Optimize the productivity by reducing the time to achieve the PPA target

A key aspect of Cadence Cerebrus is that the user gets to choose how to weight the various attributes of the design to be optimized (for example, is timing more important than area?). The actual metrics to be optimized are a bit deeper, namely:

  • Timing setup worst negative slack (WNS) and total negative slack (TNS)
  • Timing hold WNS/TNS
  • Power switching/internal/leakage
  • Design congestion/density

These metrics can be weighted by the user, although by default they all are 1 (so equally weighted). As Olivier put it:

"Cost-weight adjustment is key to optimize your objectives."
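To make the weighting idea concrete, here is a minimal sketch in Python. It is not Cadence Cerebrus's actual cost function, which the presentation did not describe. It assumes each metric has already been normalized into a penalty where lower is better, and that the overall cost is a plain weighted sum with every weight defaulting to 1. The metric names, the `weighted_cost` and `best_scenario` helpers, and the two scenarios with their numbers are all hypothetical.

```python
# Metric names follow the list above; the weighted-sum formula is an assumption.
METRICS = (
    "setup_wns", "setup_tns",
    "hold_wns", "hold_tns",
    "switching_power", "internal_power", "leakage_power",
    "congestion", "density",
)


def weighted_cost(penalties, weights=None):
    """Return sum(weight * penalty) over all metrics; missing weights default to 1."""
    weights = weights or {}
    unknown = set(weights) - set(METRICS)
    if unknown:
        raise ValueError(f"unknown metrics: {sorted(unknown)}")
    return sum(weights.get(name, 1.0) * penalties[name] for name in METRICS)


def best_scenario(scenarios, weights=None):
    """Pick the scenario name with the lowest weighted cost."""
    return min(scenarios, key=lambda name: weighted_cost(scenarios[name], weights))


# Made-up, normalized penalties (lower is better), for illustration only.
base = dict.fromkeys(METRICS, 0.5)
scenarios = {
    "timing_biased": {**base, "setup_wns": 0.1, "setup_tns": 0.1, "leakage_power": 0.9},
    "leakage_biased": {**base, "setup_wns": 0.6, "setup_tns": 0.6, "leakage_power": 0.2},
}

print(best_scenario(scenarios))                          # all weights equal to 1
print(best_scenario(scenarios, {"leakage_power": 3.0}))  # leakage weighted 3x
```

With equal weights the toy data favors the timing-biased scenario. Tripling the leakage weight flips the choice to the leakage-biased one, which is the kind of shift the quote is pointing at.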

Another part of setting up Cadence Cerebrus is defining the resources that will be made available (basically, what servers Cadence Cerebrus can use). ST used:

  • 16 maximum parallel jobs
  • 16 CPUs per job
  • 243 scenarios explored
  • 16-day runtime
  • disk space x20

Obviously, those numbers show that a Cadence Cerebrus exploration is resource-hungry and time-consuming. But in some ways, that is the point: the idea is to replace a flow that uses limited computing and a lot of engineering time with one that leans mostly on computers.
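A quick back-of-the-envelope calculation puts the resource figures above in perspective. This is only arithmetic on the numbers ST reported. The upper bound assumes every job slot is busy for the full 16 days, which the presentation did not state, so the real usage may be lower.

```python
import math

# Figures reported by ST.
MAX_PARALLEL_JOBS = 16
CPUS_PER_JOB = 16
SCENARIOS = 243
RUNTIME_DAYS = 16

# Peak number of CPUs in use if all job slots are occupied.
peak_cpus = MAX_PARALLEL_JOBS * CPUS_PER_JOB

# Minimum number of "waves" needed to get through every scenario.
waves = math.ceil(SCENARIOS / MAX_PARALLEL_JOBS)

# Upper bound on compute spent, assuming all slots are busy the whole time.
cpu_days_upper_bound = peak_cpus * RUNTIME_DAYS

# Average job-slot time available per scenario under the same assumption.
slot_days_per_scenario = MAX_PARALLEL_JOBS * RUNTIME_DAYS / SCENARIOS

print(f"peak CPUs: {peak_cpus}")
print(f"waves of jobs: {waves}")
print(f"CPU-days (upper bound): {cpu_days_upper_bound}")
print(f"slot-days per scenario: {slot_days_per_scenario:.2f}")
```

So the exploration can occupy up to 256 CPUs at once and consume at most about 4,096 CPU-days, which is why the computing resources need to be planned in advance.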

PPA Results

The implementation had 12 PVT corners for timing and power optimization. The base run (manual) took three days for one iteration, whereas Cadence Cerebrus ran for 16 days. But Cadence Cerebrus produced PPA improvements against the base run of +1% frequency, -22% leakage, and -3.5% area.
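To put the trade-off in numbers, the short sketch below uses only the figures quoted above. It normalizes the base run to 1.0 for each metric, since the absolute frequency, leakage, and area values were not given. The `apply_delta` helper is a hypothetical name.

```python
# Figures quoted above.
MANUAL_ITERATION_DAYS = 3
CEREBRUS_RUNTIME_DAYS = 16

# Reported improvements versus the base run, in percent.
DELTAS_PCT = {"frequency": 1.0, "leakage": -22.0, "area": -3.5}


def apply_delta(base_value, pct):
    """Scale a base value by a percentage change."""
    return base_value * (1 + pct / 100)


# Wall-clock time of the exploration, expressed in manual iterations.
equivalent_manual_iterations = CEREBRUS_RUNTIME_DAYS / MANUAL_ITERATION_DAYS

# Base run normalized to 1.0 for every metric (absolute values were not given).
relative_result = {name: apply_delta(1.0, pct) for name, pct in DELTAS_PCT.items()}

print(f"{equivalent_manual_iterations:.1f} manual iterations of wall-clock time")
for name, value in relative_result.items():
    print(f"{name}: {value:.3f} x base")
```

In other words, the exploration took about as long as five manual iterations would have, but ran without an engineer driving each one.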

By changing the weights and focusing on different specific parameters (at the expense of other parameters, of course), ST could get:

  • Up to 6% higher frequency (versus base run)
  • Up to 20% area reduction
  • Up to 22% leakage reduction
  • Up to 100% setup TNS reduction
  • Up to 90% hold TNS reduction

Another aspect of Cadence Cerebrus is model replay. Having extracted the best scenario from a Cadence Cerebrus exploration, you can use that scenario to apply the best recipes and primitives to a similar database without doing a full exploration, which saves a lot of runtime. Replay produced an improvement of -21% leakage, -7% total power, and -0.5% area.

Summary

  • Cadence Cerebrus flow can easily be plugged into the existing Innovus flow
  • Computing resource needs must be anticipated
  • Cadence Cerebrus ML provides a significant and predictable gain on CPU PPA
  • Cadence Cerebrus model replay is an efficient way to optimize the final run without launching another full exploration

Learn More

See the Cadence Cerebrus Intelligent Chip Explorer product page.

 
