Corrected rank normal distribution #817


@magneticflux- magneticflux- commented Jan 16, 2021

This does NOT fix #455 completely, but it replaces the existing normal distribution function with a library implementation that is known to be correct.

In the future, I plan to replace the normal CDF with a power law CDF, which is made easier by migrating the existing normal CDF to a math library right now.
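
For reference, here is a minimal sketch of what a power-law CDF could look like, assuming a Pareto (type I) form. The function name `paretoCdf` and the parameters `xMin` and `alpha` are hypothetical and not part of this PR; which power-law family would actually suit the rank score is still an open question.

```javascript
// Hypothetical helper, not part of this PR: CDF of a Pareto (type I) distribution.
// xMin is the smallest possible score and alpha is the tail exponent (both assumed > 0).
function paretoCdf(x, xMin, alpha) {
  if (!(xMin > 0) || !(alpha > 0)) {
    throw new RangeError("xMin and alpha must be positive");
  }
  // No probability mass lies below the minimum score.
  if (x < xMin) return 0;
  return 1 - Math.pow(xMin / x, alpha);
}

// Example: with xMin = 1 and alpha = 1, half of the mass lies at or below x = 2.
console.log(paretoCdf(2, 1, 1));
```

A larger `alpha` makes the tail thinner, so high scores become rarer and would map to higher percentiles.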

@vercel

vercel bot commented Jan 16, 2021

@magneticflux- is attempting to deploy a commit to the github readme stats Team on Vercel.

A member of the Team first needs to authorize it.

@magneticflux- magneticflux- marked this pull request as ready for review January 16, 2021 16:02
@codecov

codecov bot commented Jan 16, 2021

Codecov Report

Merging #817 (c36810a) into master (dff732a) will decrease coverage by 0.06%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #817      +/-   ##
==========================================
- Coverage   93.94%   93.88%   -0.07%     
==========================================
  Files          22       22              
  Lines         677      670       -7     
  Branches      188      187       -1     
==========================================
- Hits          636      629       -7     
  Misses         37       37              
  Partials        4        4              
Impacted Files          | Coverage Δ
src/calculateRank.js    | 89.74% <100.00%> (-1.75%) ⬇️
src/cards/stats-card.js | 100.00% <0.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update dff732a...c36810a.

@anuraghazra
Owner

Thanks for working on this @magneticflux-

Can you show or add more examples of how this change affects the ranks? That would be nice to see, and useful for further tweaking the values.

@magneticflux-
Author

This PR replaces the approximation of the error function (used to compute the normal CDF) with a more precise one, whose underlying approximation is accurate to at least 18 significant decimal digits in theory (in JavaScript, results are still limited by double precision, roughly 15 to 17 significant digits). The details of the new approximation can be found here: https://mathjs.org/docs/reference/functions/erf.html

In practical terms, this PR reduces error in calculating the normal CDF. It reduces error near the mean slightly (as seen in the changed test value), but it reduces error at the extremes more.
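
For context, the normal CDF is tied to the error function by the standard identity below, where $\mu$ is the mean and $\sigma$ the standard deviation (symbols chosen here for illustration). Any error in the `erf` approximation is halved but otherwise carries straight through to the CDF:

```latex
\Phi(x;\mu,\sigma) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]
```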

Some examples:

Input          | New                 | Old
30, 25, 5      | 0.15865525393145707 | 0.15865526383236372
30, 25, 1.4241 | 0.12651182223941376 | 0.12651187738346226
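
The gap in the first example can be reproduced with a short sketch. It assumes the old code used the classic Abramowitz and Stegun 7.1.26 polynomial for `erf`, and that the `Input` values are ordered (mean, sigma, to); both are assumptions, and the names `erfApprox` and `normalCdf` are illustrative only. The reference value is the "New" result quoted in this comment.

```javascript
// Low-order polynomial approximation of erf (Abramowitz & Stegun 7.1.26).
// Assumption: this is the kind of approximation the old code relied on.
function erfApprox(x) {
  const t = 1 / (1 + 0.3275911 * Math.abs(x));
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  const y = 1 - poly * Math.exp(-x * x);
  // erf is an odd function, so mirror the result for negative inputs.
  return x < 0 ? -y : y;
}

// Normal CDF built on a pluggable erf.
// Assumption: the argument order (mean, sigma, to) matches the Input values above.
function normalCdf(mean, sigma, to, erf) {
  const z = (to - mean) / (sigma * Math.SQRT2);
  return 0.5 * (1 + erf(z));
}

const approx = normalCdf(30, 25, 5, erfApprox);
const reference = 0.15865525393145707; // "New" value quoted in this comment
console.log(approx, Math.abs(approx - reference));
```

With Math.js installed, its `erf` function could be passed as the last argument in place of `erfApprox` to compare the two directly.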

Again, this PR only addresses problem 1 that I bring up in #455. Problem 2, using an ill-suited probability distribution, is still an issue and I intend to open another PR that migrates to a different distribution (using the Math.js library I introduce in this PR).

@stale

stale bot commented Feb 16, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Issue is marked as stale. label Feb 16, 2021
@magneticflux-
Author

To fix #455 completely, I'd like to log anonymous statistics for requests to the API so that we can create a histogram plot of their score. That would let us calculate a grade based on the average of all github-readme-stats users. Additionally, we could take a random sample of GitHub profiles to calculate and use that if we want to grade based on the average of all GitHub users. That would let us validate the assumption that the score follows a power-law distribution.
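
As a rough sketch of that idea, the snippet below bins a sample of scores into a histogram and grades a score by its empirical percentile. The helper names `histogram` and `empiricalCdf`, the bin width, and the sample values are all made up for illustration; no such logging exists in the project today.

```javascript
// Hypothetical helper: count scores per fixed-width bin, keyed by the bin's lower edge.
function histogram(scores, binWidth) {
  const counts = new Map();
  for (const s of scores) {
    const bin = Math.floor(s / binWidth) * binWidth;
    counts.set(bin, (counts.get(bin) || 0) + 1);
  }
  return counts;
}

// Hypothetical helper: fraction of sampled scores at or below the given score.
function empiricalCdf(sampleScores, score) {
  if (sampleScores.length === 0) {
    throw new RangeError("sample must not be empty");
  }
  let atOrBelow = 0;
  for (const s of sampleScores) {
    if (s <= score) atOrBelow += 1;
  }
  return atOrBelow / sampleScores.length;
}

// Made-up sample of scores, purely illustrative.
const sample = [1, 12, 15, 27];
console.log(histogram(sample, 10));
console.log(empiricalCdf(sample, 15));
```

Grading against the empirical CDF avoids assuming any distribution up front, and the histogram is what would let us check whether a power law is a reasonable fit.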

@stale stale bot removed the stale Issue is marked as stale. label Feb 16, 2021
@stale

stale bot commented Mar 18, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Issue is marked as stale. label Mar 18, 2021
@stale stale bot closed this Mar 26, 2021