At the recent CadenceLIVE Americas, Yosinori (Yoshi) Watanabe presented what he titled *Accelerate Regression Performance with Machine Learning*. He had to give it such an anodyne title since it appeared in the agenda before the product had been announced. A more descriptive title would be the one I've given this blog post.

I covered Xcelium ML from a high level the day we made the announcement, in my post Xcelium ML: Black-Belt Verification Engineer in a Tool. This post takes a deeper dive into how Xcelium ML works.

Xcelium ML is an agent that works with the Xcelium Simulator to optimize regressions. It learns from data observed over a series of regression sessions, and captures that knowledge in statistical models that we call machine learning (ML) models. It then produces a condensed regression that achieves the same coverage faster. A condensed regression has two parts: the schedule, which says which tests from the original regression should be included; and the instructions, which say how randomization should be done for each run in the schedule. However, it is still a random regression, so it behaves like any other random regression and gives no deterministic guarantee of achieving a particular coverage.

Xcelium ML works transparently alongside the Xcelium Simulator, both during the learning phase, when it extracts the data, and during the execution phase, when it controls the randomization engine of the simulator. The diagram shows how it all fits together.

Yoshi said that what we wanted to do was create a feedback loop that captures which regression produced what behavior. In practice, today, that feedback comes from the verification engineer who understands the testbench and the RTL. However, they don't necessarily understand it that well, so they don't necessarily do that good a job of providing the feedback. Xcelium ML knows nothing about the RTL or the test strategy, but it uses machine learning to build up a statistical model that provides this knowledge in a quantitative manner.

The user flow is straightforward.

- Run regressions, with the ML Interface of Xcelium enabled
  - The user can specify locations of randomization to be observed
- Start the Xcelium ML learning process
  - The process collects data from the regressions and creates machine-learning models
  - The process can run in parallel with regressions, or on regressions that have already completed
- Once the models are created, generate regressions
  - The user invokes the Xcelium ML generation process
  - Various customizations are supported
    - Target a specific coverage space
    - Replace directed tests with random tests as much as possible

Xcelium ML generates a set of regressions along with a coverage prediction for each. It decides the size of the regression in terms of number of seeds, and predicts not just the coverage ratio of the regression, but also the names of the coverage bins that the regression is expected to hit. Of course, if you increase the size of the regression, you will normally see increased coverage. Note that Xcelium ML predicts the coverage without actually running the regressions; the prediction is driven off the statistical model.
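To make the idea of predicting coverage from a statistical model concrete, here is a minimal sketch. It is not how Xcelium ML works internally (the talk did not go into that); it assumes a deliberately simple model in which each test has an independent per-run probability of hitting each bin. The test names, bin names, probabilities, and the `predict_coverage` function are all hypothetical.

```python
from math import prod

# Hypothetical per-run hit probabilities, as might be estimated from earlier
# regressions: hit_prob[test][bin] = P(one random run of `test` hits `bin`).
hit_prob = {
    "test_a": {"bin_0": 0.50, "bin_1": 0.10, "bin_2": 0.00},
    "test_b": {"bin_0": 0.00, "bin_1": 0.20, "bin_2": 0.05},
}

def predict_coverage(hit_prob, runs_per_test, threshold=0.9):
    """Predict coverage of a regression without running it.

    Assumes runs are independent, so a bin is missed only if every run of
    every test misses it. Returns the expected coverage ratio and the names
    of the bins whose hit probability is at least `threshold`.
    """
    bins = sorted({b for probs in hit_prob.values() for b in probs})
    p_hit = {}
    for b in bins:
        p_miss = prod(
            (1.0 - hit_prob[t].get(b, 0.0)) ** n
            for t, n in runs_per_test.items()
        )
        p_hit[b] = 1.0 - p_miss
    expected_ratio = sum(p_hit.values()) / len(bins)
    predicted_bins = [b for b in bins if p_hit[b] >= threshold]
    return expected_ratio, predicted_bins

small = predict_coverage(hit_prob, {"test_a": 2, "test_b": 2})
large = predict_coverage(hit_prob, {"test_a": 20, "test_b": 20})
print(f"small regression: {small[0]:.1%} expected, bins {small[1]}")
print(f"large regression: {large[0]:.1%} expected, bins {large[1]}")
```

Even this toy model shows the two behaviors described above: the prediction names specific bins, and a bigger regression is predicted to cover more.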

By default, two regressions are produced: one is the smallest, and the other sits at the "knee" of the coverage curve. Each regression consists of a VSIF (the input file that vManager uses to launch a regression) and instruction files. The user can also manually generate additional regressions of different sizes from the two defaults.
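For readers unfamiliar with the term, the "knee" is roughly the point where adding more runs stops paying off. The sketch below uses one common heuristic, picking the point that sits furthest above the straight line joining the two ends of the curve once both axes are normalized. This is an illustration only; the talk did not say how Xcelium ML locates the knee, and the sizes and coverage values are made up.

```python
def find_knee(sizes, coverage):
    """Return the index of the knee of a rising, flattening curve.

    Both axes are normalized to [0, 1]; the knee is taken to be the point
    with the largest height above the chord from the first to the last
    point. Assumes at least two points and strictly increasing endpoints.
    """
    x0, x1 = sizes[0], sizes[-1]
    y0, y1 = coverage[0], coverage[-1]
    best_index, best_gap = 0, float("-inf")
    for i, (x, y) in enumerate(zip(sizes, coverage)):
        xn = (x - x0) / (x1 - x0)
        yn = (y - y0) / (y1 - y0)
        gap = yn - xn  # height above the chord in normalized coordinates
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index

# Hypothetical predicted coverage for regressions of increasing size.
sizes = [100, 200, 400, 800, 1600, 3200]
coverage = [0.60, 0.80, 0.93, 0.96, 0.99, 1.00]

knee = find_knee(sizes, coverage)
print(f"knee at {sizes[knee]} runs, predicted coverage {coverage[knee]:.0%}")
```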

### Example Commercial Project

Existing (before Xcelium ML) regression:

- 17,050 runs with 235 distinct tests
- 4,099 CPU hours
- 32,007 bins hit

With Xcelium ML:

- Ran the customer’s regression once, and then created machine-learning models with Xcelium ML
- Let Xcelium ML generate a regression, with a predicted coverage of 99%
  - Achieved 99.1% coverage with 1,056 CPU hours (3.8X faster)
  - The number of runs was 5,836, with all of the seeds specified as random
- Let Xcelium ML generate a second regression, 3X bigger in terms of number of runs, to investigate the remaining 0.9% of coverage missed by the first regression

The graph above shows the results. Blue is the original (pre-ML) regression, orange is the first Xcelium ML regression, and grey and yellow are the second, three-times-bigger regression at 52% and 66%, respectively, of the CPU time required by the original regression. In the set of bars on the right, the grey bar marks the point where 52% of the original CPU time has been used, at which the coverage (left-hand bars) is at 99.9%. The yellow bar shows that at 66% of the original CPU time, the coverage starts to exceed the original coverage. Although the convergence is very slow, Xcelium ML nevertheless reaches 100% of the original coverage with about two-thirds of the CPU time required by the original regression.
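The headline ratios for the first Xcelium ML regression can be checked with a few lines of arithmetic. The numbers are the ones quoted in the lists above; only the variable names are mine.

```python
# Figures quoted for the example commercial project.
original_cpu_hours = 4099
original_runs = 17050
ml_cpu_hours = 1056
ml_runs = 5836

speedup = original_cpu_hours / ml_cpu_hours
cpu_fraction = ml_cpu_hours / original_cpu_hours
run_fraction = ml_runs / original_runs

print(f"CPU speedup: {speedup:.2f}X")
print(f"CPU used:    {cpu_fraction:.1%} of original")
print(f"Runs used:   {run_fraction:.1%} of original")
```

This gives a speedup of 3.88X (quoted above as 3.8X), using about 25.8% of the original CPU hours and about 34.2% of the original number of runs.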

So, in summary:

- Xcelium ML produces condensed regressions with coverage prediction
- The condensed regressions achieve the coverage faster
- The user can customize the regressions: size, target coverage, etc.
- Xcelium ML can predict the correlation between testbench variables and coverage bins without running the regression, and provides other analytics

### Q&A

There was then a Q&A:

Q: What's the performance overhead?

Usually very small compared to the randomization overhead. Say 0.5% extra time (plus some memory).

Q: Is enough data available after running once to train the models?

Typically you use it with nightly or weekly regressions, and then there is enough information to build the models. The technology gives statistics on the quality of the data and the models, and if more samples are required, the tool will inform you. It is built to add information to the model incrementally: if you don’t have enough data after the first run, you can keep running more regressions and it will keep learning.

Q: If you have an ML model based on the initial regression, why not save the seeds so it will be deterministic?

Even if you know a particular seed is really good, it reproduces the same behavior and nothing else. A regression composed of saved seeds therefore doesn’t do anything different from one run to the next, and you may not find anything new. We started this project to overcome this type of "deterministic" regression that people sometimes use.
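The point about saved seeds is easy to see with any pseudo-random generator. In this small sketch, `run_test` is a hypothetical stand-in for one constrained-random simulation run (it just draws a few byte values), not anything from Xcelium.

```python
import random

def run_test(seed, n_values=5):
    """Stand-in for one constrained-random run: returns the stimulus drawn."""
    rng = random.Random(seed)
    return tuple(rng.randrange(256) for _ in range(n_values))

saved_seeds = [1, 2, 3]

# Replaying saved seeds night after night yields exactly the same stimulus.
night_1 = {run_test(s) for s in saved_seeds}
night_2 = {run_test(s) for s in saved_seeds}
print("replay identical:", night_1 == night_2)

# Fresh seeds exercise stimulus the saved seeds never produced.
fresh = {run_test(s) for s in [101, 102, 103]}
new_stimulus = fresh - night_1
print("new stimulus from fresh seeds:", len(new_stimulus))
```

A saved-seed regression is good for reproducing known behavior, but only runs with new seeds have a chance of reaching something the regression has not already seen.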

### Learn More

For more details on Xcelium ML, see the Xcelium Logic Simulation product page or watch Yoshi give a three-minute explanation of Xcelium ML.
