Ranking System v2 #1186

Merged
merged 17 commits into from
May 26, 2023

Conversation

francois-rozet
Collaborator

@francois-rozet francois-rozet commented Jul 13, 2021

Hello @anuraghazra 👋

Like #883, #455 and #1029, I've noticed that the function calculateRank doesn't work as expected.

The first problem is the use of a normal distribution instead of an exponential distribution, as pointed out by #960. The second is that each metric has the same weight in the score, while some are clearly better indicators (e.g. stars). I've corrected both problems in this PR.

If you want to try it out before merging, I've hosted this version on Vercel.

Importantly, in #960, being extremely good in a single metric (e.g. 100k commits) while being terrible elsewhere still gets an "S+" rank. In my implementation, in order to get "S+", you need to be outstanding everywhere. However, you can still get "A+" or "S" if a single metric is poor (e.g. contributions or issues) but you have a lot of stars and followers.
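To make the idea concrete, here is a minimal sketch of a weighted score built from an exponential survival function. The names `survival`, `METRICS` and `percentileScore`, as well as every median and weight, are assumptions chosen for illustration and are not the values used in this PR. The score is read as a percentile, so lower is better.

```javascript
// Survival function of an exponential distribution with the given median:
// the estimated fraction of users whose value is above x.
function survival(x, median) {
  return 2 ** (-x / median);
}

// Hypothetical medians and weights, for illustration only.
const METRICS = {
  commits: { median: 250, weight: 2 },
  prs: { median: 50, weight: 3 },
  issues: { median: 25, weight: 1 },
  stars: { median: 50, weight: 4 },
  followers: { median: 10, weight: 1 },
};

// Weighted average of the per-metric survival values, expressed as a
// percentile in [0, 100]. Missing metrics count as zero.
function percentileScore(stats) {
  let total = 0;
  let weightSum = 0;
  for (const [name, { median, weight }] of Object.entries(METRICS)) {
    total += weight * survival(stats[name] ?? 0, median);
    weightSum += weight;
  }
  return (100 * total) / weightSum;
}

const commitsOnly = percentileScore({ commits: 100000 });
const allRound = percentileScore({
  commits: 1000,
  prs: 200,
  issues: 100,
  stars: 1000,
  followers: 100,
});
console.log(commitsOnly.toFixed(1), allRound.toFixed(1));
```

With these made-up parameters, a profile with 100000 commits and nothing else scores far worse than a profile that is strong on every metric, which is the behaviour described above.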

@vercel

vercel bot commented Jul 13, 2021

@francois-rozet is attempting to deploy a commit to the github readme stats Team on Vercel.

A member of the Team first needs to authorize it.

@francois-rozet francois-rozet changed the title Revise rank calculation Patch rank calculation Jul 13, 2021
@francois-rozet francois-rozet changed the title Patch rank calculation Patch rank calculation (not yet another theme) Jul 15, 2021
@anuraghazra
Owner

So in #960 it seems like Linus Torvalds is getting an S+ rank, which is good, since at least we have a baseline showing that some people can get S+. But in this PR your changes seem to stretch the stats maybe a little bit too far?

@anuraghazra anuraghazra mentioned this pull request Jul 18, 2021
@francois-rozet
Collaborator Author

Hi @anuraghazra 👋 I've edited the weights such that Linus Torvalds is now S+.

By the way, S+ is the top 2.5%, S the top 10%, A+ 25%, A 50%, B+ 75% and B 100%.
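Read as a lookup, these thresholds could be sketched as follows. The function name `rankFromPercentile` and the convention that the input is a percentile where lower is better are assumptions for illustration; only the threshold values come from the comment above.

```javascript
// Upper percentile bound for each rank, as quoted in the comment above.
const RANKS = [
  { level: "S+", max: 2.5 },
  { level: "S", max: 10 },
  { level: "A+", max: 25 },
  { level: "A", max: 50 },
  { level: "B+", max: 75 },
  { level: "B", max: 100 },
];

// Returns the first rank whose upper bound contains the percentile.
function rankFromPercentile(percentile) {
  if (!(percentile >= 0 && percentile <= 100)) {
    throw new RangeError("percentile must be between 0 and 100");
  }
  return RANKS.find(({ max }) => percentile <= max).level;
}

console.log(rankFromPercentile(1), rankFromPercentile(60));
```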

@RodrigoDornelles

> So in #960 it seems like Linus Torvalds is getting an S+ rank, which is good, since at least we have a baseline showing that some people can get S+. But in this PR your changes seem to stretch the stats maybe a little bit too far?

I thought it was great. In the current algorithm most users end up at A+; even accounts with fewer than 5 commits and no stars get this rank, due to the huge number of registered guests. Accounts with numerous commits, repositories and pull requests also end up with this same rank.

@anuraghazra anuraghazra changed the title Patch rank calculation (not yet another theme) Ranking System v2 Sep 7, 2021
@anuraghazra
Owner

> By the way, S+ is the top 2.5%, S the top 10%, A+ 25%, A 50%, B+ 75% and B 100%.

This ratio seems well balanced.

@francois-rozet
Collaborator Author

francois-rozet commented Sep 14, 2021

@anuraghazra Any updates on your plans ?

@akhildevelops

This is a good way of modeling the data with an exponential distribution. I frequently work with datasets and most of them don't fit a normal distribution (I think the current ranking depends on fitting the data to a normal distribution).

As we currently have no view of the whole data, i.e. metadata for all GitHub users with respect to commits, PRs, stargazers, etc., it would be difficult to obtain the right parameter values and weights for the model to explain the distribution.

For fun, I've tried to see what minimum count (MC) is required for a single type of parameter (with all others at zero) to achieve a given rank.

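| Rank | Threshold | MC - Commits | MC - PRs | MC - Followers | MC - Stars |
|------|-----------|--------------|----------|----------------|------------|
| S+   | 2.5%      | None         | None     | None           | None       |
| S    | 10%       | None         | None     | None           | None       |
| A+   | 25%       | None         | None     | None           | None       |
| A    | 50%       | None         | None     | None           | 300        |
| B+   | 75%       | None         | None     | None           | 83         |
| B    | 100%      | None         | None     | None           | None       |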

Interestingly, the weights are so strict that a user with a single non-zero metric can reach at most rank A, and only with a minimum of 300 stargazers.

I'll try to share more stats based on combinations of parameters.
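For a weighted exponential model, such minimum counts can be derived in closed form rather than by trial and error. The sketch below assumes the score is a weighted average of terms `2 ** (-x / median)` with every other metric at zero; the medians, weights and the helper name `minimumCount` are hypothetical, so its results are not expected to match the table above.

```javascript
// Hypothetical medians and weights, for illustration only.
const METRICS = {
  commits: { median: 250, weight: 2 },
  prs: { median: 50, weight: 3 },
  issues: { median: 25, weight: 1 },
  stars: { median: 50, weight: 4 },
  followers: { median: 10, weight: 1 },
};
const TOTAL_WEIGHT = Object.values(METRICS).reduce((sum, m) => sum + m.weight, 0);

// Smallest count of metric `name` alone (all other metrics at zero) whose
// score reaches `threshold` (a fraction, e.g. 0.5 for the top 50%).
// Returns null when no count is enough.
function minimumCount(name, threshold) {
  const { median, weight } = METRICS[name];
  // The other metrics each contribute their full weight, so this metric's
  // survival value must fall to `required` or below.
  const required = (threshold * TOTAL_WEIGHT - (TOTAL_WEIGHT - weight)) / weight;
  if (required <= 0) return null;
  if (required >= 1) return 0;
  return Math.ceil(-median * Math.log2(required));
}

for (const threshold of [0.025, 0.1, 0.25, 0.5, 0.75, 1]) {
  console.log(threshold, minimumCount("commits", threshold), minimumCount("stars", threshold));
}
```

A `null` result means the remaining metrics alone already push the score past the threshold, which is why a single metric cannot reach the top ranks in this kind of model.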

@francois-rozet
Collaborator Author

Hello @Enforcer007, I'm not sure I understand what you mean.

Do you think it is bad that one cannot get S+ (or even A+, in fact) by maximizing a single metric?

@akhildevelops

Hi @francois-rozet,

The weights you used are a good start, as they keep users who focus on only one type of parameter from climbing to the higher ranks.

@anuraghazra
Owner

Just testing out a few numbers with the ranking system

https://codesandbox.io/s/calculaterank-9nj6d?file=/src/index.js

One scenario which I'm seeing is that if someone is overall decent on all stats but has 0 stars gained they will always get B+ rank which isn't ideal.

See codesandbox for more info.

@francois-rozet
Collaborator Author

francois-rozet commented Sep 19, 2021

Hello @anuraghazra,

You seem to have a really warped view of what a "decent" number of stars is. Most users don't have any stars, let alone more than 100. "Just a few thousand stars" is already incredibly outstanding. For instance, here you see that only 5000 users have more than 2750 stars and there are over 40 million active users on GitHub.

Same for the number of commits, followers, PRs, etc. (2500 commits or 750 followers is not average at all). The values I have chosen were calculated (by hand) as the averages over the stargazers of one of my repos (francois-rozet/piqa). The stargazers of this repository (there are ~30k) would be much more representative. Since you are familiar with GitHub's API it shouldn't be too hard for you to gather the metrics of these users and then compute the averages.

@anuraghazra
Owner

anuraghazra commented Sep 19, 2021

> You seem to have a really warped view of what a "decent" number of stars is. Most users don't have any stars, let alone more than 100

Exactly, that's why I said that if someone is overall decent on all stats but has 0 stars gained, they will always get a B+ rank, which is really unfair.

I'm talking about the first case


```js
// Average user but with 0 stars will always get B+
console.log(
  calculateRank({
    totalCommits: 1500,
    followers: 500,
    prs: 150,
    issues: 65,
    stargazers: 0
  })
);
```

@francois-rozet
Collaborator Author

You can reduce the weight of stars if you want. STARS_WEIGHT = 0.5 seems balanced.
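The effect of lowering the star weight on the zero-star profile from the previous comment can be sketched as below. This is not the `calculateRank` from this PR: the scoring function, the medians and the equal starting weights are assumptions for illustration, and only the profile values and the 0.5 star weight come from the discussion.

```javascript
// Exponential survival value: estimated fraction of users above x.
function survival(x, median) {
  return 2 ** (-x / median);
}

// Hypothetical medians; only the star weight is varied below.
const MEDIANS = { commits: 250, prs: 50, issues: 25, stars: 50, followers: 10 };

// Weighted percentile score (lower is better).
function percentileScore(stats, weights) {
  let total = 0;
  let weightSum = 0;
  for (const name of Object.keys(MEDIANS)) {
    total += weights[name] * survival(stats[name], MEDIANS[name]);
    weightSum += weights[name];
  }
  return (100 * total) / weightSum;
}

// Profile from the comment above, renamed to this sketch's keys.
const profile = { commits: 1500, followers: 500, prs: 150, issues: 65, stars: 0 };

const equalWeights = { commits: 1, prs: 1, issues: 1, stars: 1, followers: 1 };
const lighterStars = { ...equalWeights, stars: 0.5 };

const before = percentileScore(profile, equalWeights);
const after = percentileScore(profile, lighterStars);
console.log(before.toFixed(1), after.toFixed(1));
```

Halving the star weight shrinks the contribution of the zero-star term, so the percentile improves. Whether that is enough to cross a rank boundary depends on the real reference values.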

@francois-rozet
Collaborator Author

Any updates @anuraghazra ?

@dreamyguy

Could this be published and enabled with an extra URL parameter - or with a 'v2' as part of the URL? This would not cause breaking changes for anyone, and we could carry on from there. 🚀 🌔

@francois-rozet
Collaborator Author

I increased the targets for commits and stars, such that it is now more challenging (and more motivational IMO) to reach the S rank. To summarize, a user with 1000 commits per year, 200 PRs, 100 issues, 1000 stars and 100 followers is S. Half of those numbers gives A+, and a quarter gives A.

@rickstaa
Collaborator

rickstaa commented May 19, 2023

Sounds great! Let's merge this in 7 days to give @anuraghazra, @qwerty541, @Zo-Bro-23 and others some time to look at it. You are welcome to ping me if I forget to merge it 👍🏻.

Will be merged in:


Normal

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=torvalds)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=rickstaa)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=anuraghazra)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=francois-rozet)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=FleetAdmiralJakob)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=meurissemax)

include_all_commits=true

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=torvalds&include_all_commits=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=anuraghazra&include_all_commits=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=rickstaa&include_all_commits=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=francois-rozet&include_all_commits=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=FleetAdmiralJakob&include_all_commits=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=meurissemax&include_all_commits=true)

count_private=true

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=torvalds&count_private=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=anuraghazra&count_private=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=rickstaa&count_private=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=francois-rozet&count_private=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=FleetAdmiralJakob&count_private=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=meurissemax&count_private=true)

include_all_commits=true&count_private=true

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=torvalds&include_all_commits=true&count_private=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=anuraghazra&include_all_commits=true&count_private=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=rickstaa&include_all_commits=true&count_private=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=francois-rozet&include_all_commits=true&count_private=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=FleetAdmiralJakob&include_all_commits=true&count_private=true)

![](https://github-readme-stats-git-patch-rank-rickstaa.vercel.app/api?username=meurissemax&include_all_commits=true&count_private=true)

@ZjzMisaka

The new system is good, but I'd like to express my personal opinion. I believe that a grading system with only S, A, B tiers is not very good, because people might unconsciously believe that there exist tiers like C, D, F. I've previously stated that a "grading system that satisfies everyone" could ultimately lose its reference value, and the meaning represented by the rank "B" could eventually become the new F.

I trust that people's spirits are not so fragile that they cannot face their own grades, and even lower grades can serve as motivation for personal growth. We don't have to view this system as a competitive ranking; instead, we can try to understand it as a measure of a developer's "growth from the initial state". If we really don't want the grade F to appear, we could use a format like Lv1, Lv2, Lv3... and so forth.

@rickstaa
Collaborator

rickstaa commented May 22, 2023

> The new system is good, but I'd like to express my personal opinion. I believe that a grading system with only S, A, B tiers is not very good, because people might unconsciously believe that there exist tiers like C, D, F. I've previously stated that a "grading system that satisfies everyone" could ultimately lose its reference value, and the meaning represented by the rank "B" could eventually become the new F.
>
> I trust that people's spirits are not so fragile that they cannot face their own grades, and even lower grades can serve as motivation for personal growth. We don't have to view this system as a competitive ranking; instead, we can try to understand it as a measure of a developer's "growth from the initial state". If we really don't want the grade F to appear, we could use a format like Lv1, Lv2, Lv3... and so forth.

@ZjzMisaka thank you so much for giving your two cents 🙏🏻. I agree that the rank names can be improved. As stated in #1186 (comment) however we will tackle this in a subsequent pull request. Possible improvements are found and discussed in #2265 👍🏻. Maybe you can add your suggestion there?

@everybody Feel free to add your two cents. I will merge this PR in:


Collaborator

@qwerty541 qwerty541 left a comment


It is very surprising to me that Linus Torvalds, who has a project with an abnormally large number of stars (150,000+) in his profile, receives only an A+ grade from the new ranking system. As far as I understand, the new ranking system mainly takes into account the number of open pull requests and issues. Considering that the purpose of this project is to motivate developers and newcomers to contribute more actively to open source projects, this change looks logical and correct. But I still think that an abnormally large number of stars, commits or followers should also elevate the user to at least the S rank.

@ZjzMisaka

> It is very surprising to me that Linus Torvalds, who has a project with an abnormally large number of stars (150,000+) in his profile, receives only an A+ grade from the new ranking system. As far as I understand, the new ranking system mainly takes into account the number of open pull requests and issues. Considering that the purpose of this project is to motivate developers and newcomers to contribute more actively to open source projects, this change looks logical and correct. But I still think that an abnormally large number of stars, commits or followers should also elevate the user to at least the S rank.

I agree with this perspective. It's important to recognize that some users may not be involved in a variety of open-source projects, but they could be focusing on their own open-source projects and making significant contributions there. This type of work should also be encouraged, as it has a positive effect on the open-source community as a whole.

@francois-rozet
Collaborator Author

francois-rozet commented May 25, 2023

The ranking takes all stats into account (mainly stars and PRs). To get S or S+, you need to be great everywhere. This is the whole point of this PR. If we allow being good in a single stat to be enough, then users with 0 stars, 0 issues, 0 PRs, 0 followers, but 100000 commits still get S+, which makes no sense, especially if the goal is to push devs to collaborate. Besides, it is very rare (try to find another) to find cases like Linus Torvalds, with a huge number of stars but barely any PRs/issues. Implementing an exception mechanism for a handful of users, who would never use this project anyway, is pointless.

@rickstaa
Collaborator

> The ranking takes all stats into account (and mainly stars and prs). To have S or S+, you need to be great everywhere. This is the whole reason for this PR. If you allow to be good in a single stat, then users with 0 stars, 0 issues, 0 PRs, 0 followers, but 100000 commits still get S+, which makes no sense, especially if the goal is to push devs to collaborate. Besides, it is very rare to find other cases like Linus Torvalds with huge number of stars but barely any PRs/issues.

I have to agree with @francois-rozet on this one. An excellent (S+) developer has to score high in all fields. Although the contributions of Linus Torvalds are indispensable for the OS community, and he should therefore have the S+ ranking, we unfortunately cannot deduce this from his GitHub statistics alone 🤔. Furthermore, the big number of commits made by Linus Torvalds that is displayed on the stats card is incorrect. This is caused by a bug in GRS (see #564) 😅. The actual commits made by Linus are in the order of 28k, which is not too far from that of @anuraghazra, who scores higher on the other stats.

Although I would still prefer a ranking scheme where I get an S ranking instead of an S+, I couldn't find weights that don't skew the other developers too much. I therefore still think the solution provided by @francois-rozet is the most balanced one. If you guys can find a combination of weights which gives Linus an S ranking without degrading this balance, feel free to open a subsequent pull request. I will likely merge this pull request tomorrow, since it is an enormous improvement compared to the old behaviour, but will consider any improvements in future PRs 👍🏻.

@ZjzMisaka

#1186 (comment)

Here's an idea, but this idea may not be mature enough and it may be difficult to implement......
perhaps we can assign individual rankings for commits, PRs, issues, stars, and followers, we can even create a five-dimensional graph based on these data points. This way, we can have a more intuitive visualization of a developer's overall skill level without struggling to evaluate them using just a single score.
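A rough sketch of what per-metric rankings might look like is given below. It reuses the percentile thresholds quoted earlier in this thread; the exponential model, the medians and the names `perMetricRanks` and `rankFromPercentile` are assumptions for illustration, and the sample profile is invented.

```javascript
// Hypothetical medians, for illustration only.
const MEDIANS = { commits: 250, prs: 50, issues: 25, stars: 50, followers: 10 };

// Upper percentile bound per rank, as quoted earlier in this thread.
const RANKS = [
  ["S+", 2.5],
  ["S", 10],
  ["A+", 25],
  ["A", 50],
  ["B+", 75],
  ["B", 100],
];

function rankFromPercentile(percentile) {
  return RANKS.find(([, max]) => percentile <= max)[0];
}

// Ranks every metric on its own instead of folding them into one score.
function perMetricRanks(stats) {
  const result = {};
  for (const [name, median] of Object.entries(MEDIANS)) {
    const percentile = 100 * 2 ** (-(stats[name] ?? 0) / median);
    result[name] = rankFromPercentile(percentile);
  }
  return result;
}

// Invented profile: many stars and commits, no PRs or issues.
console.log(perMetricRanks({ commits: 28000, stars: 150000, followers: 1000 }));
```

The five resulting ranks are the values one would plot on a five-axis chart; a profile that is extreme in some metrics and empty in others shows up as such instead of being averaged away.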

@rickstaa
Collaborator

rickstaa commented May 25, 2023

#1186 (comment)

> Here's an idea, but this idea may not be mature enough and it may be difficult to implement...... perhaps we can assign individual rankings for commits, PRs, issues, stars, and followers, we can even create a five-dimensional graph based on these data points. This way, we can have a more intuitive visualization of a developer's overall skill level without struggling to evaluate them using just a single score.

Although I agree that your idea results in a better display of developer skill, I think the extra code and development time required for it are not worth it 😅. People can rank developers based on how important they find the individual items displayed on the stats card. The rank was added as a motivator for developers.

@FleetAdmiralJakob

@rickstaa

@rickstaa rickstaa merged commit c96e84a into anuraghazra:master May 26, 2023
@rickstaa
Collaborator

@rickstaa

Done 🎉!

@rickstaa
Collaborator

@francois-rozet thanks again for implementing this improvement 👍🏻.

@sunpm

sunpm commented May 29, 2023

The weight of commit is very low

@rickstaa
Collaborator

> The weight of commit is very low

@sunpm feel free to create an improvement pull request 👍🏻.

@lorezyra

ref:

Instead of some letter, why not use what most apps do in gamification: show an XP or numerical level rank? Everyone starts from zero. But few can keep pushing the top.

@francois-rozet francois-rozet deleted the patch-rank branch June 1, 2023 11:21
@Dgdiniz

Dgdiniz commented Jun 2, 2023

Hi, I think the formula needs some tweaks. I tested here with some values, and for 1000 commits we have:
{ level: 'B+', score: 69.07599578073207 }

For 10000 commits we have:
{ level: 'B+', score: 67.93963214438497 }

And for 100000 commits we have:
{ level: 'B+', score: 67.93963214436843 }

So the values are converging too fast, and after a while it makes no difference to increase a particular value.

@francois-rozet
Collaborator Author

This is expected. People with 1000 commits per year are already in the top 5% (for that statistic). So more commits will not improve the total rank dramatically. If you want to improve the rank, you will have to work on the other stats.
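The flattening follows directly from the shape of an exponential term: once a count is several times its reference value, the term is practically zero and only the other metrics matter. A sketch under assumed medians and weights (all hypothetical, so the output will not reproduce the scores quoted above):

```javascript
// Hypothetical medians and weights, for illustration only.
const METRICS = {
  commits: { median: 250, weight: 2 },
  prs: { median: 50, weight: 3 },
  issues: { median: 25, weight: 1 },
  stars: { median: 50, weight: 4 },
  followers: { median: 10, weight: 1 },
};
const TOTAL_WEIGHT = Object.values(METRICS).reduce((sum, m) => sum + m.weight, 0);

// Percentile score (lower is better) of a profile with `commits` commits
// and zero for every other metric.
function commitsOnlyScore(commits) {
  const { median, weight } = METRICS.commits;
  const commitTerm = weight * 2 ** (-commits / median);
  // Every metric at zero contributes its full weight.
  const otherTerms = TOTAL_WEIGHT - weight;
  return (100 * (commitTerm + otherTerms)) / TOTAL_WEIGHT;
}

// The score can never drop below this floor, however many commits are added.
const FLOOR = (100 * (TOTAL_WEIGHT - METRICS.commits.weight)) / TOTAL_WEIGHT;

for (const commits of [0, 1000, 10000, 100000]) {
  console.log(commits, commitsOnlyScore(commits).toFixed(4), "floor:", FLOOR.toFixed(4));
}
```

In this model the floor is set by the combined weight of the untouched metrics, so the only way to move further is to raise those.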

@rickstaa
Collaborator

rickstaa commented Jun 2, 2023

> Hi, I think the formula needs some tweaks. I tested here with some values, and for 1000 commits we have: { level: 'B+', score: 69.07599578073207 }
>
> For 10000 commits we have: { level: 'B+', score: 67.93963214438497 }
>
> And for 100000 commits we have: { level: 'B+', score: 67.93963214436843 }
>
> So the values are converging too fast, and after a while it makes no difference to increase a particular value.

I agree that the current formula levels out too quickly when people are only good in one statistic. Luckily, @francois-rozet already created a new PR that, in my opinion, offers a good balance 🙏🏻. @Dgdiniz maybe you can review #2762 (review)? I hosted it on https://github-readme-stats-git-rank-rickstaa.vercel.app:

[![Anurag's GitHub stats](https://github-readme-stats-git-rank-rickstaa.vercel.app/api?username=Dgdiniz)](https://github.com/anuraghazra/github-readme-stats)


@Dgdiniz

Dgdiniz commented Jun 3, 2023

Hi @rickstaa, I tested here but the behavior is similar. After a while it makes no difference to improve a category. I was thinking of a simpler approach. Each category has a weight, so a commit is 2 points and a star is 4 points. Each rank then just needs a certain number of points, so each category could reach rank S. The number of points could be calculated using some top contributor as reference, like 200k points is S+. The points distribution could follow some curve, so from C+ to B- we need few points, but from A+ to S- we need many more points. It's very unlikely to have 50k commits, but some repos have thousands of stars, so I don't know the best way to weight the categories. But today we can have billions of points in a category and still be at rank B. But the idea of forcing points in more than one category is also cool.
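The additive scheme proposed here could be sketched as follows. Only the point values for commits and stars and the 200k threshold for S+ come from the comment; the other point values, the remaining thresholds, the reduced rank ladder and the names `totalPoints` and `levelFromPoints` are placeholders for illustration.

```javascript
// Points per unit. Commits and stars follow the comment above;
// the remaining values are placeholders.
const POINTS = { commits: 2, stars: 4, prs: 3, issues: 1, followers: 1 };

// Minimum points per rank, highest first. Only the S+ value comes from
// the comment; the rest is a hypothetical curve that grows roughly
// fourfold per step.
const LADDER = [
  { level: "S+", min: 200000 },
  { level: "S", min: 50000 },
  { level: "A+", min: 12500 },
  { level: "A", min: 3000 },
  { level: "B+", min: 750 },
  { level: "B", min: 0 },
];

// Sums points over all categories; missing categories count as zero.
function totalPoints(stats) {
  return Object.entries(POINTS).reduce(
    (sum, [name, perUnit]) => sum + perUnit * (stats[name] ?? 0),
    0,
  );
}

// Returns the highest rank whose minimum is reached.
function levelFromPoints(points) {
  return LADDER.find(({ min }) => points >= min).level;
}

const points = totalPoints({ commits: 1000, stars: 500 });
console.log(points, levelFromPoints(points));
```

Unlike the percentile model discussed in this PR, nothing here saturates: a single category can carry a profile to any rank, which is exactly the trade-off debated in the replies.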

@rickstaa
Collaborator

rickstaa commented Jun 3, 2023

> Hi @rickstaa, I tested here but the behavior is similar. After a while it makes no difference to improve a category. I was thinking of a simpler approach. Each category has a weight, so a commit is 2 points and a star is 4 points. Each rank then just needs a certain number of points, so each category could reach rank S. The number of points could be calculated using some top contributor as reference, like 200k points is S+. The points distribution could follow some curve, so from C+ to B- we need few points, but from A+ to S- we need many more points. It's very unlikely to have 50k commits, but some repos have thousands of stars, so I don't know the best way to weight the categories. But today we can have billions of points in a category and still be at rank B. But the idea of forcing points in more than one category is also cool.

@Dgdiniz thanks for your feedback. Although I like the simplicity of your system, I love that the current system promotes good code practices like working with pull requests and being a well-rounded developer 🤔. As can be seen in #2762 (comment), I think all ranks are doable when people make use of both pull requests and commits, and get stars and followers by creating OS projects that have value to other people.

@rickstaa
Collaborator

rickstaa commented Jun 3, 2023

@Dgdiniz However, as said before, this is a community-owned OS project, so all feedback is welcome. Therefore, you are welcome to open a feature request with your proposal. If enough people upvote your idea, I will review it. I use #1935 to judge which features or bugs I donate my leisure time to 😄.

#1186 and #2762 were/will be merged because there was a lot of demand for a better ranking system (see #455). @francois-rozet did a fantastic job at improving the old system, since getting a rank higher or lower than an A+ in the old system was tough. I also think that creating a rank system that will make everybody happy is not doable since we don't have the ground truth of what is an average GitHub developer 😅.

LucienZhang pushed a commit to LucienZhang/github-readme-stats that referenced this pull request Jun 5, 2023
* Revise rank calculation

* Replace contributions by commits

* Lower average stats and S+ threshold

* Fix calculateRank.test.js

Missing key in dictionary constructor

Co-authored-by: Rick Staa <[email protected]>

* refactor: run prettier

* feat: change star weight to 0.75

* Separate PRs and issues

* Tweak weights

* Add count_private back

* fix: enable 'count_private' again

* test: fix tests

* refactor: improve code formatting

* Higher targets

---------

Co-authored-by: Rick Staa <[email protected]>
devantler pushed a commit to devantler/github-readme-stats that referenced this pull request Sep 24, 2023

@aaron-muti-420 aaron-muti-420 left a comment


Can you consider leaving comments?

Labels
* doc-translation: README doc translations.
* documentation: Improvements or additions to documentation.
* enhancement: New feature or request.
* ranks: Feature, Bug fix, improvement related to ranking system.
* ⭐ top pull request: Top pull request.
* stats-card: Feature, Enhancement, Fixes related to the stats card.
Development

Successfully merging this pull request may close these issues.

Rank doesn't take into account whether include_all_commits is taken into account