Compare results with tolerance #390

AlexanderYastrebov
Contributor

As an exercise, I've re-implemented the idea from #375 to compare result numbers with a tolerance:

```
$ ./test.sh baseline 0 src/test/resources/samples/measurements-rounding-precise.txt
Validating calculate_average_baseline.sh -- src/test/resources/samples/measurements-rounding-precise.txt
Rounding=14.6/25.5/33.6 != Rounding=14.6/25.4/33.6 (avg)
```

The hassle of parsing the output highlights the importance of using a machine-readable format, as suggested in #14.
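For illustration, here is a minimal Python sketch of such a tolerance check. It is not the code from this PR. It assumes a single `{Station=min/mean/max, ...}` output line; `parse_results`, `diff_with_tolerance` and the default tolerance of 0.1 are hypothetical names and values. Numbers are parsed as `Decimal` so that a difference of exactly 0.1 is not distorted by binary floating point.

```python
import re
from decimal import Decimal

# One "name=min/mean/max" entry, followed by ", " or the end of the line.
# Assumption: station names may contain ", " but never "=" followed by numbers.
ENTRY = re.compile(r"(.+?)=(-?\d+\.\d+)/(-?\d+\.\d+)/(-?\d+\.\d+)(?:, |$)")


def parse_results(line):
    """Parse '{A=1.0/2.0/3.0, B=...}' into {name: (min, mean, max)}."""
    body = line.strip().strip("{}")
    return {
        name: tuple(Decimal(v) for v in values)
        for name, *values in ENTRY.findall(body)
    }


def diff_with_tolerance(left, right, tolerance=Decimal("0.1")):
    """Return a list of human-readable mismatches; empty if results agree."""
    mismatches = []
    for name in sorted(left.keys() | right.keys()):
        if name not in left or name not in right:
            mismatches.append(f"{name}: present in only one result")
            continue
        for label, a, b in zip(("min", "avg", "max"), left[name], right[name]):
            if abs(a - b) > tolerance:
                mismatches.append(f"{name}: {a} != {b} ({label})")
    return mismatches


left = parse_results("{Rounding=14.6/25.5/33.6}")
right = parse_results("{Rounding=14.6/25.4/33.6}")
print(diff_with_tolerance(left, right, Decimal("0")))    # ['Rounding: 25.5 != 25.4 (avg)']
print(diff_with_tolerance(left, right, Decimal("0.1")))  # []
```

Even this small sketch needs a regular expression and an assumption about station names, which is the parsing hassle mentioned above.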

But I think this approach is wrong: the rules should define the rounding mode, and the baseline should be fixed to produce the correct result.

Submissions should be fixed accordingly or qualified as non-passing.
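To show why the rounding mode has to be spelled out, here is a small Python sketch using the standard `decimal` module. The mean of 25.45 is a made-up value, not taken from the sample file, and `round_to_tenth` is a hypothetical helper. The point is only that the same exact mean rounds to different one-digit results depending on the mode.

```python
from decimal import (
    Decimal,
    ROUND_CEILING,
    ROUND_FLOOR,
    ROUND_HALF_EVEN,
    ROUND_HALF_UP,
)


def round_to_tenth(value, mode):
    """Round a Decimal to one fractional digit using the given rounding mode."""
    return value.quantize(Decimal("0.1"), rounding=mode)


mean = Decimal("25.45")  # hypothetical exact mean, for illustration only

for mode in (ROUND_HALF_UP, ROUND_HALF_EVEN, ROUND_CEILING, ROUND_FLOOR):
    print(mode, round_to_tenth(mean, mode))
# ROUND_HALF_UP 25.5
# ROUND_HALF_EVEN 25.4
# ROUND_CEILING 25.5
# ROUND_FLOOR 25.4
```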

This could have been done earlier, but better late than never.

Updates #49

@gunnarmorling
Owner

gunnarmorling commented Jan 14, 2024

Hey @AlexanderYastrebov, yeah, agreed that the "compare with tolerance" approach isn't the best one. I have opened #392, which fixes the baseline and adds a test case to ensure the correct behavior. Note that I am not going to re-evaluate existing entries; compliance at the time of the original evaluation is what decides (it's a common strategy, also used for instance in the JCP when certifying spec implementations). New or updated entries must be compliant with the fixed behavior by passing the additional test.

@gunnarmorling
Owner

So do we still need this in the light of the above, @AlexanderYastrebov?

@AlexanderYastrebov
Contributor Author

> So do we still need this in the light of the above

No, this was just an exercise.
