Rule: large number without underscore separators (PEP 515) #18221
base: main
Conversation
|
I rebased on current main and changed the code to RUF062 since 061 is now taken by another rule. |
|
I think something went wrong with the rebase, as GitHub is now showing more than 500 commits and over 100,000 lines changed, which would make it pretty difficult to review! |
d005005 to
c30fbe2
|
I'm not sure what I did wrong, but I cleaned things up so that there is now a single commit on top of current main with all the changes. |
|
Thanks for tidying up and for your work on this! I think we'll still want to resolve the |
|
I agree that we still need to answer the question of whether we want such an opinionated rule that enforces a specific grouping of numbers. I do like clippy's rules, which are less opinionated but enforce good practice:
The first two seem very useful to me. It's less clear to me if we want to add any more opinionated rules. |
|
I think the point of having an opinionated rule is to allow auto-fixing, which should be suitable for most cases. If a codebase has very specific requirements, the rule should be configurable enough to accommodate them (as long as those requirements are consistent across the whole codebase). Otherwise, the rule can be disabled. A different rule that only warns about inconsistent, uncommon or overly large grouping (like in clippy) could also be created, and users could choose to activate one or the other. Or even both: if the "enforcer" rule (from this PR) is run before the "warning" rule, the former should fix any issue before any warning is issued by the latter. Would that suit what you had in mind? I can implement that warning rule if needed. |
I think it would be very unusual, probably unprecedented, to have two overlapping rules with different severities, so I don't think that's a viable option. We do plan to add a warning level/severity (#1256) in the future, though. |
Summary
This adds a rule (with code RUF062) that automatically formats large numbers with underscore separators to make them more readable. This is as described in PEP 515, and discussed in #12787, which I opened a while back.
E.g.:
- `123456` becomes `123_456`
- `.12345` becomes `.123_45`
- `0xDEADBEEF` becomes `0xDEAD_BEEF`
(see test snapshot for more examples)
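As a quick sanity check on the examples above, the following Python snippet shows that underscores in numeric literals do not change the value (PEP 515), so a fix of this kind is purely cosmetic. It also uses the `_` option of Python's built-in format specification, which groups decimal digits by 3 and hexadecimal digits by 4. The variable names are illustrative only, and this is not how the rule itself is implemented.

```python
# PEP 515: underscores in numeric literals are purely visual.
assert 123456 == 123_456
assert .12345 == .123_45
assert 0xDEADBEEF == 0xDEAD_BEEF

# The built-in format mini-language can produce the same grouping.
decimal_text = format(123456, "_d")   # groups of 3 for decimal
hex_text = format(0xDEADBEEF, "_X")   # groups of 4 for hexadecimal

print(decimal_text)     # 123_456
print("0x" + hex_text)  # 0xDEAD_BEEF
```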
This rule works for:
- Decimal integers: digits are grouped by 3. Short numbers (`9999` or less) are left unchanged since they are already readable enough. `123456789` gives `123_456_789`
- Decimal floats: `123456789.123456789` gives `123_456_789.123_456_789`
- Hexadecimal numbers (`0xABCD`): add underscores to form groups of 4 digits by default (configurable). `0x1234567890ABCD` gives `0x12_3456_7890_ABCD`
- Octal numbers (`0o1234`): add underscores to form groups of 4 digits by default (configurable). `0o1234567` gives `0o123_4567`
- Binary numbers (`0b1010101`): add underscores to form groups of 8 bits (octets) by default (configurable). `0b1010111100100101` gives `0b10101111_00100101`
- Scientific notation (`123e10`): the leading part is formatted with the same rules as integers. The exponent part is untouched, as it should never be more than 3 characters anyway.
- A leading `+` or `-` sign is not part of `Expr::NumberLiteral` instances once parsed by ruff, so this rule does not modify it in any way; it just stays in place.

Support for Indian-style number formatting:
According to https://randombits.dev/articles/number-localization/formatting , most of the world groups decimal digits 3 by 3, except for India, which uses groups of 2 after the first group of 3 (so thousands, hundreds of thousands, hundreds of hundreds of thousands, etc.). A configuration option enables this kind of grouping.
I am, however, not sure what the practice is for formatting the fractional part in India. I implemented a "reversed" logic, with separators on thousandths, then hundredths of thousandths, hundredths of hundredths of thousandths, etc. (not sure if anyone ever needed such float precision ^^'). This may need to be adjusted.
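The grouping logic described here can be sketched in Python as follows. This is only an illustration, not the Rust code from the PR: the helper names `group_from_right`, `group_from_left` and `format_decimal` are hypothetical, the input is assumed to be a plain decimal literal without sign, exponent or existing underscores, and the 4-digit threshold from the PR description is not modeled.

```python
def group_from_right(digits: str, first: int, rest: int) -> str:
    """Group digits starting from the right: one group of `first`
    digits, then groups of `rest` digits (hypothetical helper)."""
    if len(digits) <= first:
        return digits
    head, tail = digits[:-first], digits[-first:]
    groups = [tail]
    while head:
        groups.append(head[-rest:])
        head = head[:-rest]
    return "_".join(reversed(groups))


def group_from_left(digits: str, first: int, rest: int) -> str:
    """Mirror image of group_from_right, used for the fractional part."""
    return group_from_right(digits[::-1], first, rest)[::-1]


def format_decimal(literal: str, indian: bool = False) -> str:
    """Format a plain decimal literal such as '123456789.123456789'.

    Assumption: default mode uses groups of 3 everywhere; Indian mode
    uses a first group of 3, then groups of 2, mirrored after the dot.
    """
    rest = 2 if indian else 3
    int_part, dot, frac_part = literal.partition(".")
    return (
        group_from_right(int_part, 3, rest)
        + dot
        + group_from_left(frac_part, 3, rest)
    )


print(format_decimal("123456789.123456789"))
print(format_decimal("123456789.123456789", indian=True))
```

The fractional part reuses the integer logic on the reversed string, which matches the "reversed" behaviour described above; whether that is the right convention for Indian-style fractions remains the open question.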
So `123456789.123456789` formatted with Indian mode gives `12_34_56_789.123_45_67_89`.

Test Plan
A new test file, `RUF062.py`, is part of the PR and is executed on `cargo test`.

TODO / to discuss