
Add benchmark baseline option #165

Open · wants to merge 1 commit into master
Conversation

@ntninja commented Apr 9, 2020

Add baseline option that determines (according to the included docs):

Whether this benchmark's results should be considered as possible baseline values when comparing them to other results in the same group. Use this if you want to include some results just for comparison, without them affecting the relative scores displayed for other results.
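
For context, baseline is proposed as a keyword on the existing benchmark marker (see the diff further down). A minimal usage sketch, assuming the option from this PR; the group name and test bodies are purely illustrative:

import pytest

# Hypothetical usage of the proposed baseline marker keyword: results marked
# baseline=False are reported for comparison only and are never used as the
# reference when computing relative scores for the group.

@pytest.mark.benchmark(group="sorting")
def test_builtin_sort(benchmark):
    benchmark(sorted, list(range(1000, 0, -1)))

@pytest.mark.benchmark(group="sorting", baseline=False)
def test_reference_copy(benchmark):
    # included only as a point of comparison
    benchmark(list, range(1000, 0, -1))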

@ntninja (Author) commented Apr 10, 2020

Anyway, it would be cool if somebody could tell me why the test fails, because clicking through to the details gives me a 404.

@ionelmc (Owner) left a comment

Hey, sorry for the tardy review. I added some comments and I hope you haven't lost interest :)

    baseline[prop] = max(bench[prop] for _, bench in progress_reporter(
        benchmarks, tr, "{line} ({pos}/{total})", line=line)
        if bench.get("baseline", True))
except ValueError:
@ionelmc (Owner)

Can we avoid the try/except somehow? What actually can raise that error?

@ntninja (Author)

If there is no benchmark in the group marked as baseline, this ends up calling max(()), which raises ValueError: max() arg is an empty sequence.

Should I convert this to an if? That would require evaluating the list of values up-front and then checking its len(…), as in the sketch below. Or would you prefer that I just add a comment about this?
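
For illustration, the if-based alternative would look roughly like this (names taken from the diff context above; this is a sketch, not part of the actual patch):

candidates = [bench[prop]
              for _, bench in progress_reporter(
                  benchmarks, tr, "{line} ({pos}/{total})", line=line)
              if bench.get("baseline", True)]
if candidates:
    # only compute the reference when at least one benchmark is eligible
    baseline[prop] = max(candidates)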

@@ -424,7 +424,7 @@ def pytest_runtest_setup(item):
     for name in marker.kwargs:
         if name not in (
                 "max_time", "min_rounds", "min_time", "timer", "group", "disable_gc", "warmup",
-                "warmup_iterations", "calibration_precision", "cprofile"):
+                "warmup_iterations", "calibration_precision", "cprofile", "baseline"):
@ionelmc (Owner)

Not sure how, but we should have some way to validate that there is only one baseline per group. It doesn't make sense to have two baselines, right? Let's not let users wonder why things don't work as expected (the annoying silent failure).

@ntninja (Author)

The current implementation marks all results as possible baselines by default and only excludes the ones marked as baseline=False. If there is more than one baseline=True benchmark available, it chooses the one with the lowest value/highest score. This integrates neatly with the existing behaviour and means I don't have to pin the selected baseline value once and for all (as performance may differ between systems, etc.); the included docs actually mention this. As you said, there cannot be two baselines when the output is rendered, but there can be more than one potential baseline score.
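
A small sketch of how that selection plays out (the numbers are made up, purely for illustration; higher ops = better, mirroring the max() in the excerpt above):

results = [
    {"name": "test_fast",  "ops": 125_000},                     # baseline defaults to True
    {"name": "test_other", "ops":  98_000, "baseline": True},
    {"name": "test_slow",  "ops":   1_200, "baseline": False},  # comparison only
]
reference = max(r["ops"] for r in results if r.get("baseline", True))
assert reference == 125_000  # the best eligible result sets the scale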

Unless you have strong feelings about this, I'd like to keep it this way for extra flexibility. The wording could be improved, however: maybe something along the lines of potential_baseline, but shorter?

@ntninja (Author) commented May 10, 2020

Glad to hear back from you! Did you take a look at the CI yet?

@ntninja (Author) commented Jul 11, 2020

@ionelmc: Ping

@tmct commented Jun 5, 2024

This would be a nice feature.
