System on chip (SoC) interconnect must meet the performance requirements of increasingly demanding, complex chips, but traditional modeling and verification techniques don't shed much light on bandwidth and latency. A new approach to analyzing and debugging performance with ARM system IP (interconnect) will be presented Tuesday, March 12, at CDNLive Silicon Valley in Santa Clara, California.
The new approach will be discussed by William Orme, strategic marketing manager for ARM, and Nick Heaton, senior solutions architect at Cadence, in session DVSY101 Tuesday at 4:45 pm. The approach uses the ARM® AMBA® Designer to generate RTL interconnect, and then uses the Cadence® Interconnect Workbench (see previous blog post here) to automatically analyze bandwidth and latency across hundreds of simulations.
An interconnect performance analysis solution is needed because adequate performance is essential for successful delivery of SoCs. Heterogeneous multi-core architectures are becoming very complex, and interconnects need to intelligently manage traffic from different processors sharing the same memory system. A typical ARM CoreLink™ CCI-400 Cache Coherent Interconnect, for example, can consume hundreds of thousands of gates. The complexity of the SoC requires new features such as quality of service (QoS), multiple power and clock domains, dynamic queuing, multi-processor support, and traffic management, with many different traffic generators to be set up and verified for performance.
What Doesn't Work
So, why not just ask system architects to run interconnect performance analysis at the transaction level, using transaction-level modeling (TLM)? According to Heaton, models at this level aren't accurate enough, and building accurate models would make them very slow. TLM models "abstract the behavior of the interconnect and the memory system completely away," he noted. "You can't measure the way those systems really behave without being cycle accurate."
Thus, what's needed is RTL. And the people who are asked to do the interconnect performance analysis, Heaton said, are most typically RTL verification engineers. "They have been pushed into this kind of analysis and they are struggling," he said.
But pure RTL simulation doesn't really work for interconnect performance analysis, either. Simulation, Heaton said, lets users see a waveform over time. With complex SoC interconnect, there could be hundreds of memory transactions ongoing at any one time, and trying to pick out problems by staring at waveforms is "almost impossible." The better approach is to look at statistical distributions.
What Does Work
"Let's not look at one simulation," Heaton said. "Let's look at 100 simulations that are variations of the same scenario, and do a latency distribution. You can very quickly spot outliers, which are occasional transactions that are taking a lot longer than other ones. You can pick them out and debug them. If it was pure RTL simulation, you'd never find it."
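The idea of pooling latencies from many runs and spotting the outliers can be sketched in a few lines. This is a toy illustration, not part of the Interconnect Workbench; the sample data and the 3×-median cutoff are assumptions. A median-based cutoff is a natural choice here because the median, unlike the mean, is barely moved by the very outliers being hunted.

```python
import statistics

def find_latency_outliers(latencies, factor=3.0):
    """Flag transactions whose latency exceeds `factor` times the median.

    `latencies` holds per-transaction latency samples (in cycles) pooled
    from many simulation runs of the same scenario. Returns (index, latency)
    pairs for the transactions that took far longer than the rest.
    """
    median = statistics.median(latencies)
    return [(i, lat) for i, lat in enumerate(latencies) if lat > factor * median]

# Hypothetical pooled samples: most transactions take ~20 cycles,
# but one occasionally stalls for 95.
samples = [19, 21, 20, 22, 18, 20, 21, 19, 20, 95]
print(find_latency_outliers(samples))  # → [(9, 95)]
```

In a waveform viewer, that one 95-cycle transaction would be buried among hundreds of concurrent transfers; in a distribution, it is the first thing you see, and its index points you at the specific transaction to debug.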
While the AMBA Designer generates RTL for the interconnect, the rest of the SoC doesn't need to be in RTL. For main memory, you could use an approximately timed (AT) model. You would typically replace processors with approximate traffic models. "The traffic analogy is useful because there are a lot of different masters and traffic generators," Orme said. "You have to manage all this competing traffic and make sure no one loses out."
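The competing-traffic point can be made concrete with a deliberately tiny cycle-based model. This is not ARM's traffic generator or any real arbitration scheme; the master names, request rates, and fixed-priority policy are all assumptions chosen to show how one master can "lose out" under contention.

```python
import random

def run_traffic(cycles=1000, seed=0):
    """Toy model: two masters compete for a single memory port.

    Each master issues a request with some probability per cycle.
    A fixed-priority arbiter always serves the "cpu" first, so "gpu"
    requests queue up behind it and see longer average latency.
    Returns the mean latency (in cycles) observed per master.
    """
    rng = random.Random(seed)
    pending = {"cpu": [], "gpu": []}      # issue cycle of each queued request
    latencies = {"cpu": [], "gpu": []}
    for cycle in range(cycles):
        for master, rate in (("cpu", 0.3), ("gpu", 0.5)):
            if rng.random() < rate:
                pending[master].append(cycle)
        # One grant per cycle, fixed priority: cpu before gpu.
        for master in ("cpu", "gpu"):
            if pending[master]:
                issued = pending[master].pop(0)
                latencies[master].append(cycle - issued)
                break
    return {m: sum(v) / len(v) for m, v in latencies.items()}

print(run_traffic())  # the gpu's mean latency exceeds the cpu's
```

Even this crude sketch shows why QoS mechanisms matter: without them, a low-priority master's latency distribution degrades as soon as a higher-priority master gets busy, which is exactly the kind of effect that only shows up when many traffic sources run against the interconnect at once.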
Today's interconnect is highly configurable, and the AMBA Designer can generate RTL for a new configuration at the push of a button. The Interconnect Workbench takes metadata from the AMBA Designer tool and automatically generates testbenches for ARM system IP. In a traditional flow, engineers would have to hand-code the testbench. "You can be running simulations in under an hour without writing a line of code," Heaton said.
The end result is that users can run simulations for different implementations, configurations, and use cases, and get a clearer picture of the impact of design decisions on the way the system will perform. They can make tradeoffs to find the optimal implementation options for the various use cases.
For further information about CDNLive, click here. If you're reading this after March 12 and would like to know more, you can see an ARM guest blog plus video here and a Cadence Industry Insights blog here.