Report syntax error for too-long bin/hex/oct integer literals #11447
straight-shoota merged 6 commits into crystal-lang:master
Currently the OverflowError causes a compiler crash. Change the operations to bitwise to remove the actual overflow detection but explicitly deduce that an overflow must have happened, through the digit count. Add the actual number literal to the error message (note: needed to ensure that the literal is fully read before declaring the overflow).
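A minimal sketch of the idea described above, not the compiler's actual lexer code: `scan_hex_digits` is a hypothetical helper that takes the digits after `0x`, accumulates them with shift-and-or (which never raises `OverflowError`), reads the literal to the end, and only then decides from the digit count whether it fits. Skipping leading zeros in the count is an assumption of this sketch.

```crystal
# Hypothetical helper for illustration, not part of the Crystal compiler.
# Returns nil when the hex digits cannot fit in a UInt64.
def scan_hex_digits(digits : String) : UInt64?
  num = 0_u64
  num_size = 0
  digits.each_char do |char|
    next if char == '_' # underscores are separators, not digits
    value = char.to_i?(16) || raise ArgumentError.new("invalid hex digit: #{char}")
    # Shift-and-or does no overflow check; bits shifted out are simply lost.
    num = (num << 4) | value.to_u64
    # Assumption of this sketch: leading zeros do not count as digits.
    num_size += 1 unless num_size == 0 && value == 0
  end
  # 16 hex digits cover exactly 64 bits, so any longer literal must overflow.
  num_size > 16 ? nil : num
end
```

The whole literal is consumed before `nil` is returned, so a caller could still quote the full text in its error message.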
|
It feels odd that an explicitly typed literal such as |
|
@straight-shoota I agree and acknowledge it, but implementing the proper detection is very difficult, so I didn't opt for it. Right now what's being leaked is a compiler error so I figured it's enough of an improvement. Or, well, I wouldn't call it "leaking" even, because it's impossible to rely on syntax error messages. |
|
Sure, it's certainly an improvement. But why not improve it even more? We don't even need to fix printing the actual literal type in the error message. If that's too hard to do, I'd rather just omit |
|
This might actually not be that hard to fix properly, though. I think we could just pass a flag to |
|
Something to point out is that it's not just the error itself having an arbitrary restriction, literals over UInt64 could work but are arbitrarily not implemented for bin/hex/oct, and this at least gives a related piece of information. Anyway, I'll think of a full fix some time next week. |
|
OK I did that. I had to duplicate the conversion from |
src/compiler/crystal/syntax/lexer.cr
- def finish_scan_prefixed_number(num, negative, start)
+ def finish_scan_prefixed_number(num : Int?, negative : Bool, start : Int32)
+   if num.nil? # Doesn't even fit in UInt64
+     string_value = string_range_from_pool(start).gsub("_", "")
Why modify the literal string with gsub?
There's no other way to get rid of the underscores that might be there. I needed to get rid of trailing underscore, but figured that in the middle they're also not good. Do you suggest something else?
I don't understand why you want to get rid of any underscores. I would expect to see the literal exactly as it's written.
So, then notice this dichotomy:
p 0x1234_1234_1234_i32
Error: 20014853001780 doesn't fit in an Int32
(that is the case both before and after^)
This is what you're suggesting:
p 0x1234_1234_1234_1234_1234_i32
Error: 0x1234_1234_1234_1234_1234_ doesn't fit in an Int32
This is what I have for the moment:
p 0x1234_1234_1234_1234_1234_i32
Error: 0x12341234123412341234 doesn't fit in an Int32
Which do you prefer?
We could also change even the pre-existing case to be like that.
Say,
p 0x1234_1234_1234_i32
Error: 0x1234_1234_1234 doesn't fit in an Int32
But that's an even bigger change.
I don't follow. The first error message is weird and confusing. It presents the literal in a different format than it is written.
I would like to print the literal verbatim:
0x1234_1234_1234_1234_1234_i32 # => Error: 0x1234_1234_1234_1234_1234 doesn't fit in an Int32
My main point is, I intended to not touch the case where the literal happens to fit UInt64. Because that's not what I originally opted to work on.
The example I show is pre-existing, doesn't change with the current state of this PR.
So, then notice this dichotomy:
p 0x1234_1234_1234_i32
Error: 20014853001780 doesn't fit in an Int32
(that is the case both before and after^)
Do you think that should be changed too, as per
#11447 (comment) ?
Yes, I would like that to change. It doesn't need to be in this PR, though.
But we should not go in a different direction with the changes that are in this PR.
OK so fixed as per your comments, but not changing the pre-existing error message.
Yeah, that's totally fine. Thanks.
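For reference, the outcome of this thread (quote the literal as written, without the type suffix and the separator before it) could be sketched roughly as follows. `literal_for_error` is a hypothetical name used only for illustration; the PR's actual implementation in the lexer may look quite different.

```crystal
# Hypothetical helper for illustration, not the lexer's real code.
# Formats a prefixed integer literal for use in an error message.
def literal_for_error(source : String) : String
  # Drop an explicit type suffix such as i32 or u64 ('i' and 'u' are never
  # digits), then the underscore that may separate it from the digits.
  # Underscores inside the number are kept exactly as written.
  source.sub(/[iu](8|16|32|64|128)\z/, "").rchop('_')
end

puts literal_for_error("0x1234_1234_1234_1234_1234_i32")
# prints 0x1234_1234_1234_1234_1234

# The earlier gsub("_", "") variant would additionally flatten the separators:
puts literal_for_error("0x1234_1234_1234_1234_1234_i32").gsub("_", "")
# prints 0x12341234123412341234
```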
  # 0o177777_77777777_77777777 is the largest UInt64.
- num = nil if num_size > 22 # or > 21 with first digit being 2 through 7
+ num = nil if {num_size, first_digit} > {22, 0o1}
This condition looks unconventional and I find it a little harder to read. Maybe this would be a better way to write this?
- num = nil if {num_size, first_digit} > {22, 0o1}
+ num = nil if num_size > 22 || (num_size == 22 && first_digit > 1)
It's fine by me to keep it that way, though.
Well because it's the latest idea of mine, obviously I prefer to keep it 😀
Not a big deal though, others can weigh in as well.
Looks dandy, but to be honest I can't say for sure what's happening under the hood without consulting docs on Tuple#<=>, not to mention redundant Tuple allocation. IMO it would be more lightweight and understandable if it was written using the more explicit version :)
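To make the comparison concrete, here is a small sketch of the two spellings side by side. The method names are hypothetical; only the conditions come from this thread. Tuples compare element by element from left to right, so the tuple form is true when `num_size > 22`, or when `num_size == 22` and `first_digit > 1`. The explicit form therefore needs `first_digit > 1` to be equivalent.

```crystal
# Hypothetical wrappers for illustration; only the conditions are from the review.
def overflows_tuple?(num_size : Int32, first_digit : Int32) : Bool
  # Lexicographic comparison: the size decides first, the first digit breaks ties.
  {num_size, first_digit} > {22, 0o1}
end

def overflows_explicit?(num_size : Int32, first_digit : Int32) : Bool
  num_size > 22 || (num_size == 22 && first_digit > 1)
end

# Exhaustively check that both forms agree for plausible octal inputs.
(1..24).each do |size|
  (0..7).each do |digit|
    unless overflows_tuple?(size, digit) == overflows_explicit?(size, digit)
      raise "mismatch at size=#{size}, digit=#{digit}"
    end
  end
end
```

Since 0o177777_77777777_77777777 has 22 digits and starts with 1, both forms accept it and reject any 22-digit literal starting with 2 through 7.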
Currently the OverflowError (for any number that doesn't fit into UInt64 as it's being constructed) causes a compiler crash.
Change the operations to bitwise to remove the actual overflow detection but explicitly deduce that an overflow must have happened, through the digit count.
Add the actual number literal to the error message (note: needed to ensure that the literal is fully read before declaring the overflow).
Now:
There's something to be said about the fact that the error messages are still not unified: they differ depending on whether the number happens to exceed the size of UInt64 or not:
(both before and after, so this is not affected)