Restrict codepoints of valid identifiers #5936
Comments
Another possible model would be the Fortress language specification, which is fairly detailed (see chapter 5, although it doesn't discuss normalization) and was unburdened by backwards compatibility (unlike Python).
It's starting to look like we need more and more of the ICU library. It would be great to rely on libc and call |
The |
Excellent.
What character categories do we want to allow in identifiers? Certainly we want Sm (symbol, math) to be allowed, unlike Python. As another example, Python does not allow Po (punctuation, other) characters in identifiers. Currently, Julia does, so you can have e.g. |
I really like using prime in variable names. However, we probably want to use other mathematical operators as, well, operators. So I suspect we'll have to go through the math pages and decide on a case-by-case basis whether they should be allowed in identifiers or become operators (and how they should parse).
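For anyone who wants to check which general category a given character falls into while going through the math pages, here is a minimal Python sketch. It only uses the standard `unicodedata` module; the helper name `category_report` and the sample characters are my own choices for illustration, not anything from Julia's parser.

```python
import unicodedata

def category_report(chars):
    """Map each character to (code point, general category, Unicode name).

    Hypothetical helper for illustration only.
    """
    return {
        c: (f"U+{ord(c):04X}", unicodedata.category(c), unicodedata.name(c, "<unnamed>"))
        for c in chars
    }

if __name__ == "__main__":
    # PRIME, N-ARY SUMMATION, GREEK SMALL LETTER ALPHA, LOW LINE
    for char, info in category_report("\u2032\u2211\u03b1_").items():
        print(repr(char), *info)
```

Prime (U+2032) comes out as Po and the summation sign (U+2211) as Sm, so a rule based purely on categories would treat the two differently unless specific code points are whitelisted.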
As mentioned in #5434, separate from the question of what Unicode normalization we should use for identifiers, it would probably be a good idea to restrict the codepoints of valid identifiers. Currently, you can do crazy things like:
Python 3's valid identifiers provide one possible model.
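As a rough way to see what the Python 3 model accepts, the built-in `str.isidentifier()` method can be used to probe candidate names. This is only a sketch of Python's behaviour, not a proposal for Julia's rules; the function name `python_accepts` and the candidate strings are made up for illustration.

```python
def python_accepts(name: str) -> bool:
    """Return True if Python 3 would accept `name` as an identifier.

    Thin wrapper around str.isidentifier(); note that it also returns
    True for reserved keywords, which is ignored here.
    """
    return name.isidentifier()

if __name__ == "__main__":
    # Plain ASCII, a Greek letter, a name ending in PRIME (U+2032),
    # the summation sign (U+2211), and a name starting with a digit.
    for candidate in ["x", "\u03b1", "x\u2032", "\u2211", "2x"]:
        print(repr(candidate), python_accepts(candidate))
```

Under this model letters such as α are fine, but a trailing prime or a bare math symbol is rejected, which is stricter than what some of us would want for mathematical code.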