In an OCEAN XL Monte Carlo simulation, it seems the job controller aborts all unfinished jobs once it detects that one of the simulations has a fatal/internal error. Is there a way to resume the computation even if there is such an error?
I'm guessing that this is because you have a relatively small number of jobs set up in your job policy. For example, if you have 1 job set up and 100 points in your Monte Carlo, it will tell Spectre to do a 100-point Monte Carlo analysis. If you had 2 jobs and 100 points, then you'd get two jobs, each with 50 points (this is assuming a single test and a single corner). Anyway, since Spectre is doing the Monte Carlo analysis, if there's a fatal error, the simulator will exit - and the remaining Monte Carlo points in that invocation of the simulator will be aborted.
You could run more jobs - but then they'll run in parallel.
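To put rough numbers on this, here is a small Python sketch of the arithmetic. It is only an illustration: it assumes the points are split into contiguous, near-equal blocks (one block per job) and that a fatal error ends only the simulator invocation that hit it. The function names `split_points` and `points_lost` are made up for this example, and the way the tool actually distributes points may differ.

```python
def split_points(total_points, num_jobs):
    """Divide Monte Carlo points 1..total_points into contiguous blocks.

    Assumption: one block per job, sizes as even as possible.
    """
    base, extra = divmod(total_points, num_jobs)
    blocks = []
    start = 1
    for job in range(num_jobs):
        size = base + (1 if job < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def points_lost(total_points, num_jobs, failing_point):
    """Count points with no result if the simulator exits at failing_point.

    This counts the failing point itself plus every later point in the
    same block, since that simulator invocation never reaches them.
    """
    for block in split_points(total_points, num_jobs):
        if failing_point in block:
            return block[-1] - failing_point + 1
    raise ValueError("failing_point is outside the run")


if __name__ == "__main__":
    # Hypothetical case: 100 points, fatal error at point 30.
    for jobs in (1, 2, 10):
        print(jobs, "job(s):", points_lost(100, jobs, failing_point=30), "points lost")
```

Under these assumptions, a fatal error at point 30 of 100 loses 71 points with 1 job, 21 points with 2 jobs, and only the failing point itself with 10 jobs.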
However, the key thing is probably getting the fatal error addressed - this shouldn't happen. Either it's due to something in the setup, or it's a bug in the simulator. I'd guess that it's likely to fail on subsequent points too, depending on the reason for the fatal error.
In reply to Andrew Beckett:
I had indeed set 'maxjobs' to less than the number of corners (or number of test points) in a single MC run. I have now changed it to a higher value, hoping that once a simulation encounters an error, it won't affect the other simulations that are still running.
The fatal error is somewhat expected in my simulation, though, so I can live with that.
Thanks for your help.