Allow underscores in unicode escapes #43692

Closed
behnam opened this issue Aug 5, 2017 · 1 comment
Labels
A-unicode Area: Unicode C-feature-request Category: A feature request, i.e: not implemented / a PR. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@behnam
Contributor

behnam commented Aug 5, 2017

Underscores are supported as digit separators in all numeric literals in Rust, and a recent clippy update actually warns about long numeric literals written without underscores.
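For reference, a minimal sketch of how underscores behave in numeric literals today. The function name `separator_pairs` is made up for this illustration; the point is only that the underscore is a visual separator and does not change the value.

```rust
/// Pairs of literals that differ only in underscore placement.
/// (Hypothetical helper, named for this illustration only.)
fn separator_pairs() -> [(u32, u32); 3] {
    [
        (1_000_000, 1000000),      // decimal
        (0x10_FFFF, 0x10FFFF),     // hexadecimal
        (0b1111_0000, 0b11110000), // binary
    ]
}

fn main() {
    // Underscores are ignored by the compiler, so each pair is equal.
    for (with_underscores, without) in separator_pairs() {
        assert_eq!(with_underscores, without);
    }
}
```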

However, underscores are not supported in Unicode escapes. It is therefore not possible to write the first character of Plane 16 as \u{10_0000}, while \u{100000} is hard to read and can easily be misread as \u{10000}.

At the moment, an underscore results in an error like this:

error: invalid character in unicode escape: _
  --> unic/tests/basics_test.rs:31:23
   |
31 |         Age::of('\u{10_FFFF}'),
   |                       ^

meaning that there's no backward-compatibility issue and supporting underscores would be a compatible enhancement.
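A short sketch of what the requested syntax would look like in use. It assumes a compiler that accepts underscores inside `\u{...}` (the change referenced by the commit later in this thread); on a compiler without that change it fails with the error shown above. The function names are hypothetical.

```rust
/// First code point of Plane 16, written with the proposed separator.
fn first_of_plane_16() -> char {
    '\u{10_0000}'
}

/// Last valid Unicode scalar value, written with the proposed separator.
fn last_code_point() -> char {
    '\u{10_FFFF}'
}

fn main() {
    // The underscore is purely visual: both spellings name the same char.
    assert_eq!(first_of_plane_16(), '\u{100000}');
    assert_eq!(first_of_plane_16() as u32, 0x10_0000);
    assert_eq!(last_code_point(), char::MAX);

    // The easy-to-make mistake from the issue text: one zero fewer
    // is a different character in a different plane.
    assert_ne!(first_of_plane_16(), '\u{10000}');
}
```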

Unicode already has a clear definition of planes, numbered 0 to 16, which suggests writing literals as \u{<plane>_<4-hex-digits>} sequences.
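To make the plane layout concrete, here is a rough sketch that splits a code point into its plane (the bits above the low 16) and its offset within the plane, then prints it in the `<plane>_<4-hex-digits>` form. The helper names are invented for this example, and leaving out the plane prefix for Plane 0 is an assumption of this sketch, not something the proposal specifies.

```rust
/// Splits a char into (plane, offset within the plane).
/// Each plane holds 0x1_0000 code points, so the plane is the value
/// shifted right by 16 bits and the offset is the low 16 bits.
fn plane_and_offset(c: char) -> (u32, u32) {
    let cp = c as u32;
    (cp >> 16, cp & 0xFFFF)
}

/// Formats a char as an escape in the `<plane>_<4-hex-digits>` layout.
/// Assumption of this sketch: Plane 0 is written without a prefix.
fn plane_escape(c: char) -> String {
    let (plane, offset) = plane_and_offset(c);
    if plane == 0 {
        format!("\\u{{{:04X}}}", offset)
    } else {
        format!("\\u{{{:X}_{:04X}}}", plane, offset)
    }
}

fn main() {
    for c in ['A', '\u{1F600}', '\u{100000}', '\u{10FFFF}'] {
        println!("{}", plane_escape(c));
    }
}
```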

Optionally, we could allow an underscore only in a specific position, as in the aforementioned format. But I think that would just make it too complicated for no apparent reason. Such a check could be a clippy rule, of course.
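As an illustration of how small such a positional check would be, here is a sketch of a predicate over the text between the braces. This is not clippy code and does not use any clippy API; `follows_plane_layout` and `is_hex` are hypothetical names, and the function only checks the shape of the escape, not that an underscore-free body is a valid scalar value.

```rust
/// True if `s` is non-empty and consists only of ASCII hex digits.
fn is_hex(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_hexdigit())
}

/// Checks whether the body of a `\u{...}` escape (the text between the
/// braces) either has no underscore or follows `<plane>_<4-hex-digits>`.
fn follows_plane_layout(body: &str) -> bool {
    match body.split_once('_') {
        // No underscore: accept any run of 1 to 6 hex digits.
        None => is_hex(body) && body.len() <= 6,
        Some((plane, offset)) => {
            // The plane is 1 or 2 hex digits with a value from 0 to 16.
            let plane_ok = is_hex(plane)
                && plane.len() <= 2
                && u32::from_str_radix(plane, 16).map_or(false, |p| p <= 0x10);
            // The offset is exactly 4 hex digits; a second underscore
            // would fail the hex-digit check.
            plane_ok && offset.len() == 4 && is_hex(offset)
        }
    }
}

fn main() {
    assert!(follows_plane_layout("10_FFFF"));
    assert!(follows_plane_layout("10FFFF"));
    assert!(!follows_plane_layout("10_FF_FF"));
    assert!(!follows_plane_layout("1F_600"));
}
```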

What do you think?

@MaloJaffre
Contributor

MaloJaffre commented Aug 7, 2017

I will try to open a PR soon and then see whether an RFC is needed.

@Mark-Simulacrum Mark-Simulacrum added A-unicode Area: Unicode C-feature-request Category: A feature request, i.e: not implemented / a PR. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Aug 10, 2017
MaloJaffre added a commit to MaloJaffre/rust that referenced this issue Aug 17, 2017
bors added a commit that referenced this issue Sep 12, 2017
Accept underscores in unicode escapes

Fixes #43692.

I don't know if this needs an RFC, but at least the impl is here!