Nayan Gaywala
Tags: Tensilica, Xtensa, SystemC, multicore, simulation, multiprocessing

Simulating Multiple Cadence DSPs as Multiple x86 Processes

31 Oct 2024 • 4 minute read

An increasing number of embedded designs are multi-core systems. At the pre-silicon stage, customers use a simulation platform for architectural exploration and software development. Architects want to quantify the impact of the number of cores, local memory size, system memory latency, and interconnect bandwidth. Software teams want a practical development platform that is not excruciatingly slow.

This blog shares a recipe for simulating Cadence DSPs in a multi-core design as separate x86 processes. The purpose is to reduce simulation time for customers with simple multi-core models where cores interact only through shared memory. It uses a Vision Q8 multi-core design to illustrate the XTSC (Xtensa SystemC) model, software application, commands, and debugging. Note that the details shared are for a simulation run on an Ubuntu Linux machine, Xtensa tools version RI-2023.11, and core configuration XRC_Vision_Q8_AODP.

Complex vs. Simple Model

A complex model (Figure 1) is one in which one core accesses another core's local memory, or there are inter-core interrupts. Simulation runs as a single x86 process.

Figure 1

A simple model (Figure 2) is one in which cores interact only through shared memory. Shared memory is a file on the Linux host.

Figure 2
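Because the shared memory of the simple model is just a file on the Linux host, each core's x86 process can see the others' stores by mapping the same file. The sketch below shows the general mechanism; the file path and size are illustrative, not the XTSC defaults.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

// Map a host file as shared memory, as in the simple model: every core's
// x86 process opens the same file and mmap()s it MAP_SHARED, so a store
// from one process becomes visible to all the others.
uint8_t* map_shared_file(const char* path, std::size_t size) {
    int fd = open(path, O_RDWR | O_CREAT, 0666);
    if (fd < 0) return nullptr;
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
        close(fd);
        return nullptr;
    }
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : static_cast<uint8_t*>(p);
}
```

Two mappings of the same file, whether in one process or in 32, reference the same bytes, which is what lets the per-core simulation processes share data.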

Multiple x86 Processes – Simple Model

As depicted in Figure 3, each core is simulated using a separate x86 process. Cores use barriers and locks placed in shared memory for synchronization and data sharing. Locks are placed in uncached memory that supports exclusive subordinate access. The XTSC memory component, xtsc_memory, supports exclusive subordinate access. Cadence software tools provide a way to define memory regions as cached or uncached. For more details, please refer to Cadence's Linker Support Packages (LSP) Reference Manual for Xtensa SDK.

Figure 3
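The lock pattern itself is simple. On the target, the Cadence locking APIs use exclusive subordinate access on xtsc_memory; in the host-side sketch below, a compare-and-swap on a std::atomic stands in for that exclusive access, so this is an illustration of the pattern rather than the Cadence implementation.

```cpp
#include <atomic>
#include <cstdint>

// Illustrative spinlock living in shared, uncached memory. The word is
// 0 when the lock is free and 1 when it is held; acquire() spins until
// it atomically flips the word from free to held.
struct shared_lock {
    std::atomic<uint32_t> word{0};

    void acquire() {
        uint32_t expected = 0;
        while (!word.compare_exchange_weak(expected, 1,
                                           std::memory_order_acquire)) {
            expected = 0;  // CAS stored the observed value; reset and retry
        }
    }

    void release() {
        word.store(0, std::memory_order_release);
    }
};
```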

Demo Application

A demo application performs a 128x128 matrix multiplication. Work is divided so that each of the 32 cores computes four rows of the 128x128 result matrix. Cores use barriers to synchronize. Cadence tools provide APIs for synchronization and locking. Please refer to Cadence's System Software Reference Manual for more details. Note that without a higher-level lock, prints from all cores would get interleaved. Therefore, in the demo application, only core#0 prints.
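The work division above can be sketched as follows. The function names are ours for illustration; the actual demo uses Cadence's synchronization APIs for the barriers between phases.

```cpp
#include <cstddef>

// Core n computes rows [n*4, n*4+4) of the 128x128 result matrix, so the
// 32 cores together cover all 128 rows with no overlap.
constexpr std::size_t kDim = 128;
constexpr std::size_t kCores = 32;
constexpr std::size_t kRowsPerCore = kDim / kCores;  // 4 rows each

struct row_range { std::size_t begin, end; };

constexpr row_range rows_for_core(std::size_t core) {
    return { core * kRowsPerCore, (core + 1) * kRowsPerCore };
}

// One core's share of C = A * B (row-major; the caller owns the full
// matrices, which would live in shared memory in the demo).
void matmul_rows(const float* A, const float* B, float* C, std::size_t core) {
    row_range r = rows_for_core(core);
    for (std::size_t i = r.begin; i < r.end; ++i)
        for (std::size_t j = 0; j < kDim; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < kDim; ++k)
                acc += A[i * kDim + k] * B[k * kDim + j];
            C[i * kDim + j] = acc;
        }
}
```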

SystemC Simulation

The following sample command runs the 32-core simulation in such a way that each core is a separate x86 process. It runs a matrix multiplication application in cycle-accurate mode with logging off.
>>for (( N=0; N<32; N=N+1 )); do xtsc-run -define=NumCores=32 -define=N=$N -define=LOGGING=0 -define=TURBO=0 -define=PROG_NAME=/..path../MatMul -i=coreNN.inc & done

"xtsc-run" is a Cadence Xtensa SDK application allowing users to run SystemC simulations without C++/SystemC programming. For more details, please refer to Cadence's Xtensa SystemC (XTSC) User's Guide. coreNN.inc is the model topology in a readable text format. Since each core runs as a separate x86 process, each x86 process synchronizes at the end of the elaboration phase before starting the simulation.

Figure 4 shows the output of the Linux "htop" command. It shows 32 separate processes.

Figure 4

It is easy to create a custom simulation using an XTSC script. Capturing wall time with xtsc-run requires logging, which slows down the simulation, and custom simulations also offer per-core debug control. A single custom simulation executable can simulate each of the 32 cores as a separate x86 process.

One approach to a single custom simulation executable is to generate sc_main.cpp for, say, core#0 from coreNN.inc using the following XTSC script.
>>xtsc-run -define=NumCores=32 -define=N=0 -define=LOGGING=0 -define=TURBO=0 --xxdebug=sync -i=coreNN.inc -sc_main=sc_main.cpp -no_sim

Modify the sc_main.cpp generated for core#0 to create a generic sc_main.cpp to build a single simulation executable for all cores. The Xtensa SDK includes Makefile targets to build custom simulations.

By default, the simulation runs in cycle-accurate mode. Fast functional (Turbo) mode provides additional improvement over cycle-accurate mode. Note that fast functional mode has an initialization phase, so its gains are visible only for longer-running applications.

Simulation Wall Time

The table captures simulation wall time improvements. Note that these are illustrative wall time numbers. Actual wall time numbers and improvements will depend on your host machine's performance and your application.

Simulation Type                             Wall Time        Comments
Single process, cycle-accurate mode         17500 seconds
Multiple x86 processes, cycle-accurate mode 1385 seconds     12X faster than single process
Multiple x86 processes, turbo mode          415 seconds      3X faster than cycle-accurate mode

Debugging

You can attach a debugger to each of the individual x86 core simulation processes. Synchronous stop/resume and core-specific breakpoints are also supported. Configure the Xplorer launch configuration and attach it to the running simulation processes as shown in Figure 5.

Figure 5

Figure 6 shows the debug contexts of all 32 cores.

Figure 6

As shown, using the Xtensa SDK, you can create a multi-core simulation that functions as a practical software development platform. Please visit the Cadence support site for information on building and simulating multi-core Xtensa systems.


