Paul McLellan
22 Nov 2019

Implementing Arm Hercules with Digital Full Flow

At Arm TechCon, there was a joint presentation by Arm, Cadence, and Samsung Foundry about implementing Arm's next-generation high-performance CPU in Samsung's 5nm process.

The diagram below, from Arm's segment of the presentation, shows how the whole process was done, an approach that has become known as design technology co-optimization (DTCO). Cadence has to update its tool flow. In parallel, Arm needs to develop both the next-generation processor and the physical libraries, and Samsung has to develop PDKs. The goal is to bring everything to the finish line together, so that there is a physical design and signoff flow, a processor core, standard cell libraries and a POP for the processor, and a 1.0 PDK (not to mention the development of the process itself, which is beyond the scope of this post).

This required coordination of five design teams worldwide: Arm's CPU team, Arm's library design and optimization team, Arm's POP library team, Samsung Foundry's 5LPE process team, and the Cadence digital flow team.

Samsung's Kevin Yee kicked off with an overview of the 5LPE process. The process is a shrink of their 7LPP process, which means that they can re-use IP and migrate designs easily. However, there are additional process enhancements that can be taken advantage of for boosted performance: MDB, SDB, single-fin cells, CB or RX edge. The process is Samsung's second to use EUV (the prior generation, 7LPP, being the first). The table below summarizes the two processes.

There is a new six-track standard cell library for area and power, along with the 7.5-track library for performance. Kevin predicts that next year, 2020, the main node will be 5nm. The PDK 1.0 was released in January, and various MPW test vehicles were built during the year.

Arm

Next up was Fakhruddin Ali Bohra from Arm. He opened by pointing out that over 3 billion chips have been shipped by Samsung based on Arm Artisan physical IP. He also noted that AI is everywhere, which fits perfectly with Cadence's message of Intelligent System Design.

Hercules is the next-generation processor after Cortex-A76 and Cortex-A77 (maybe it will be Cortex-A78 when it is released, but for now it just has a codename).

He pointed out some of the "new" challenges in 5nm: via ladders, new placement rules, power grid challenges, and addressing variation. Arm has continued to enhance the Artisan logic IP with new compressor cells, new flip-flops, and new multi-bit functional cells. This results in about 5-10% area gain at the cell level (10+% for flops).

One area requiring a lot of attention is the design of the power delivery network to avoid EM/IR issues. The cells need to be designed to allow flexible cell placement while keeping the power grid regular. It used to be the case that the cells could be designed independently of the power grid, but increasingly the specific regularity of the vertical power stripes across the standard cell rows needs to be designed in from the beginning.

Cadence

Edson Gomersall of Cadence went third, with an overview of the physical design. The 5LPE process and tool co-optimization delivers better full-flow PPA out of the box by turning on:

  • Global route tuning
  • Optimization tuning
  • Layer promotion
  • M1/M2 usage
  • Cell legalization
  • DRC convergence
  • Statistical via support

One of the biggest challenges in leading-edge nodes is that the interconnect has high resistance. This requires that the whole flow is IR drop aware. Genus (synthesis), Innovus (place and route), Tempus (static timing), and Voltus (IR analysis) all operate off a single data model. IR drop awareness runs all through the flow:

  • Early rail analysis
  • IR drop-aware placement
  • Clock useful skew for peak power
  • Power routing and via optimization
  • Timing and IR drop-aware ECO
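
To see why high interconnect resistance pushes IR drop awareness into every stage, here is a minimal Python sketch. It is not taken from the presentation and does not reflect how Voltus or Innovus work internally. It assumes a single power rail fed from one end, with equal-resistance segments between cell taps; the function name `rail_ir_drop` and all the numbers are illustrative assumptions.

```python
def rail_ir_drop(tap_currents_a, segment_res_ohm):
    """Return the cumulative IR drop (volts) at each tap of a rail fed from one end.

    tap_currents_a: current drawn by the cell at each tap, ordered from the
                    supply end outward (amps).
    segment_res_ohm: resistance of each rail segment between taps (ohms),
                     assumed identical for every segment.
    """
    drops = []
    # The first segment carries the current of every cell on the rail.
    downstream = sum(tap_currents_a)
    total = 0.0
    for current in tap_currents_a:
        # Ohm's law on this segment: V = I * R, added to the drop so far.
        total += downstream * segment_res_ohm
        drops.append(total)
        # Segments past this tap no longer carry this cell's current.
        downstream -= current
    return drops


if __name__ == "__main__":
    # Deliberately round, hypothetical numbers: three cells, 1 A each, 1 ohm segments.
    print(rail_ir_drop([1.0, 1.0, 1.0], 1.0))
```

In this toy model the cell farthest from the supply always sees the largest drop, and doubling the segment resistance doubles the drop at every tap. That is the intuition behind making placement, power routing, and ECO all IR drop aware rather than checking only at signoff.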

Over the period of the project, as the three partners worked together, the performance of the processor improved by 30%, and there is still more to come. This is made easy for customers to take advantage of, since it is captured in a RAK, a Rapid Adoption Kit. This contains flow scripts, example floorplans, application notes, and more.

 
