Skip to content

Fix a bug with the current implementation of the NumericPercentile Constraint #241

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Sep 5, 2024

Conversation

kklein
Copy link
Collaborator

@kklein kklein commented Sep 3, 2024

Previously, the percentile computation was not according to the definition we provided ourselves.

Our definition said that

The percentile is defined as the smallest value present in column for which percentage % of the values in column are less or equal.

Let's now assume we have the following data

value rank rank/n
1 1 1/19 ~ .05
2 2 2/19 ~ .11
3 3 3/19 ~ .16
4 4 4/19 ~ .21
... ... ...
19 19 19/19 = 1

If we set the percentile of interest to be 20%, we see that the value 4 is the smallest value present in the data, for which at least 20% of the data is lesser or equal to that value.

Previously, our retrieval would've yielded 3. This behavior is explicitly reflected in our own tests.

Copy link

codecov bot commented Sep 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.36%. Comparing base (1f9280c) to head (eb6eb14).
Report is 12 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #241      +/-   ##
==========================================
+ Coverage   93.35%   93.36%   +0.01%     
==========================================
  Files          18       18              
  Lines        1985     1988       +3     
==========================================
+ Hits         1853     1856       +3     
  Misses        132      132              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kklein kklein added postgres bug Something isn't working labels Sep 3, 2024
@kklein kklein changed the title Illustrate and fix a bug with the current implementation of the NumericPercentile Constraint Fix a bug with the current implementation of the NumericPercentile Constraint Sep 4, 2024
@kklein kklein added ready and removed postgres labels Sep 4, 2024
@kklein kklein marked this pull request as ready for review September 4, 2024 16:07
@kklein kklein requested a review from ivergara September 4, 2024 16:09
Copy link
Collaborator

@ivergara ivergara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you plan on releasing right away you can fix the changelog file before merging.

@kklein kklein merged commit 6a6d756 into main Sep 5, 2024
58 checks passed
@kklein kklein deleted the fix_percentiles branch September 5, 2024 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working ready
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants