Incremental benchmarking script and suite #5
Conversation
So there's not really a need to "review" this, but I wanted to add some explanation of how to use it somewhere. I will maybe move this into the README for this repo. It might be good to have some more explanation of what all the different things in this repo are, but there are many scripts here, and I don't know what half of them are for... :)
I'm using the following:
The other (old) scripts can probably be removed. Copying the CSV to LibreOffice to get some ratios/graphs is a bit annoying, but I'm not sure it's worth generating something automatically, since what is useful depends on the benchmark.
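For what it's worth, the ratio part of the LibreOffice step could probably be scripted. A minimal sketch, assuming a results CSV with hypothetical column names `benchmark`, `from_scratch_time`, and `incremental_time` (not necessarily what the script actually emits):

```python
# Sketch: print incremental/from-scratch run-time ratios from a results CSV.
# The column names are assumptions for illustration only.
import csv
import sys

def print_ratios(path):
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            scratch = float(row["from_scratch_time"])
            incr = float(row["incremental_time"])
            ratio = incr / scratch if scratch else float("nan")
            print(f"{row['benchmark']}: {ratio:.2f}")

if __name__ == "__main__":
    print_ratios(sys.argv[1])
```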
We can let the renaming of IDs serve as the baseline; it makes no significant difference to the run time. Something is very fishy with my comparison script or with the benchmarks, because it currently seems to imply that we compute exactly the same values even without restarting: results. At the very least, I thought this patch would result in a change in the protected mutexes for a specific global. I will have to look into this.
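One way to sanity-check this would be to diff the per-global values directly rather than trusting the comparison script. A rough sketch, assuming two hypothetical JSON dumps mapping each global to its protecting lockset (this format is an assumption, not what Goblint actually outputs):

```python
# Sketch: diff two hypothetical per-global lockset dumps, e.g.
# {"g": ["m1", "m2"], ...}, to see whether the computed values really
# are identical with and without restarting.
import json
import sys

def load(path):
    with open(path) as f:
        return json.load(f)

def diff_globals(path_a, path_b):
    a, b = load(path_a), load(path_b)
    for g in sorted(set(a) | set(b)):
        la, lb = set(a.get(g, [])), set(b.get(g, []))
        if la != lb:
            print(f"{g}: {sorted(la)} vs {sorted(lb)}")

if __name__ == "__main__":
    diff_globals(sys.argv[1], sys.argv[2])
```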
It seems that the script and the compare operations are actually working: it does report differences on the example I added. These patches are fairly small... I might need to look at some sort of simple mutex-based example to be sure it really is working. The above patch results in variables like
Also, goblint/analyzer#397 is not yet in Goblint's interactive branch, so races (or the lack thereof) have nothing to do with anything in the solution or restarting. It should still matter for protecting locksets, but if there are still other unprotected accesses, then the protecting lockset in the global invariant cannot improve either.
Okay, running with intervals enabled doesn't change the story much: results. There was an incomparable result before as well, but now there is a verification failure with
This is now a fairly good initial benchmark set. With the interval analysis turned on, there are examples where precision is lost without restarting, but right now too much gets restarted. It is very fishy, because even diffs that should not change anything globally result in restarts. Also, earlyglobs does not help.
This is a hacky first version of a script that does more or less what I want. The final version will be just as hacky, but it should also allow all these different flags to be compared and maybe integrate the precision comparison from the other script. The critical thing at this stage is to have some way to sanity check performance issues to determine the viability of this approach.
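As a rough sketch of the kind of flag comparison the final version should support: run each benchmark under a few configurations and collect timings. The configuration names, the option names, and the way Goblint is invoked below are assumptions for illustration, not the script's actual interface:

```python
# Sketch: time each benchmark under several (assumed) Goblint configurations.
import subprocess
import time

CONFIGS = {
    "base": [],
    "intervals": ["--enable", "ana.int.interval"],
    "earlyglobs": ["--enable", "exp.earlyglobs"],
}

def run_config(goblint, benchmark, extra_flags):
    # Time a single analyzer run; output is discarded here.
    start = time.monotonic()
    subprocess.run([goblint, *extra_flags, benchmark],
                   check=True, capture_output=True)
    return time.monotonic() - start

def compare(goblint, benchmarks):
    for bench in benchmarks:
        times = {name: run_config(goblint, bench, flags)
                 for name, flags in CONFIGS.items()}
        print(bench, {k: f"{v:.1f}s" for k, v in times.items()})
```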
Still a few things to do here:
Some post-finalization steps: