Fix out-of-bound access to "alex_check" array #223
5c91321 to 3e87dfc
Thanks for the PR!
Could you please add a testcase that demonstrates the bug and how it is fixed by your patch?
    = alexIndexInt16OffAddr alex_table offset
    | otherwise
    = alexIndexInt16OffAddr alex_deflt s
Maybe it is too early in the morning and my brain hasn't woken up... but I cannot see the semantic difference between the new and the old code.
I should have explained more ;)
The issue is that 'check' is unlifted, hence it's a strict binding. The way it was written before, it was evaluated before checking that offset was greater than or equal to 0. I've just moved the binding after the check.
Adding a test case for this seems difficult. We could add an assertion in the indexing function in debug mode that the offset is positive, but that seemed overkill. I've found this bug in the lexer code generated for Cabal-syntax. Any use of this code with the JS backend fails (the JS backend doesn't have an addressable heap, instead 'alex_check' array is represented as a JS array, so out-of-bound access triggers an exception).
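The ordering problem described above can be reproduced outside the Alex template. The following is only a sketch: `lookupCheck`, `oldShape` and `newShape` are hypothetical names, and the bounds-checked list lookup stands in for `alexIndexInt16OffAddr alex_check` so that a bad index throws instead of reading arbitrary memory. The functions return a `Bool` (whether the table entry would be used) rather than a new state.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import Control.Exception (ErrorCall, evaluate, try)
import GHC.Exts (Int (I#), Int#, isTrue#, (==#), (>=#))

-- Hypothetical bounds-checked stand-in for `alexIndexInt16OffAddr alex_check`.
-- It throws on a bad index, roughly as a JS array access would fail.
lookupCheck :: Int# -> Int#
lookupCheck i
  | n < 0 || n >= length table = error ("out-of-bound index: " ++ show n)
  | otherwise = case table !! n of I# x -> x
  where
    n = I# i
    table = [10, 20, 30] :: [Int]

-- Old shape: `check` has the unlifted type Int#, so the let is strict and the
-- lookup runs before `offset >= 0` is tested.
oldShape :: Int -> Int -> Bool
oldShape (I# offset) (I# c) =
  let check = lookupCheck offset
  in isTrue# (offset >=# 0#) && isTrue# (check ==# c)

-- New shape: pattern guards are tried in order, so the lookup is only reached
-- once the offset is known to be non-negative.
newShape :: Int -> Int -> Bool
newShape (I# offset) (I# c)
  | isTrue# (offset >=# 0#)
  , check <- lookupCheck offset
  , isTrue# (check ==# c)
  = True
  | otherwise
  = False

main :: IO ()
main = do
  -- The old shape performs the lookup even for a negative offset.
  old <- try (evaluate (oldShape (-1) 10)) :: IO (Either ErrorCall Bool)
  putStrLn ("old shape: " ++ either (const "out-of-bound lookup") show old)
  -- The new shape falls through to the default branch without a lookup.
  putStrLn ("new shape: " ++ show (newShape (-1) 10))
```

With a negative offset, the old shape hits the lookup's error while the new shape takes the `otherwise` branch. On a native backend the real template would read past the array instead of throwing, which matches the explanation of why the bug went unnoticed.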
Ok that is subtle indeed. I unlearned ML a long time ago... ;-)
- Could you add your explanation to the commit message?
- Could you add a CHANGELOG entry for this?
- Do you need a release of the fix?
- (Let's skip the test case as the change obviously makes the code more correct.)
Done.
A release would be nice as I'll have to fix Cabal-syntax's lexer.
3e87dfc to 6a74a47
Thanks!
@Ericson2314 : Could you please add me to the maintainers: https://hackage.haskell.org/package/alex/maintainers/
`offset` can be negative. We must check this before using it to index into the `alex_check` array.

The issue is that `check` is unlifted, hence it's a strict binding. The way it was written before, it was evaluated before checking that `offset` was greater than or equal to 0. I've just moved the binding after the check.

Caught with GHC's JS backend running Cabal-syntax's lexer. The JS backend uses arrays to represent unlifted string literals, so the bug results in out-of-bound JS array access. With the native backends, the out-of-bound array access might trigger a segfault, but it is more likely to just read some random values in the heap. That's why it went unnoticed.
58f9910 to e6c956a
@andreasabel Awesome. Thanks a lot for your responsiveness!
Regenerate Lexer.hs with Alex 3.2.7.2 to fix issue #haskell#8892 (out-of-bound access due to haskell/alex#223).
One consequence of this PR is that
In fact, I had a project break with
@hsyl20 : Can this be implemented without
    new_s = if GTE(offset,ILIT(0))
               && let check = alexIndexInt16OffAddr alex_check offset in EQ(check,ord_c)
            then alexIndexInt16OffAddr alex_table offset
            else alexIndexInt16OffAddr alex_deflt s
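The evaluation order of the expression quoted above can be checked with a small sketch. It assumes `&&` is the ordinary short-circuiting Prelude operator and that the macros expand to plain `Bool` expressions; `lookupCheck` and `candidate` are hypothetical names, and the bounds-checked lookup again stands in for `alexIndexInt16OffAddr alex_check`.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), Int#, isTrue#, (==#), (>=#))

-- Hypothetical bounds-checked stand-in for `alexIndexInt16OffAddr alex_check`.
lookupCheck :: Int# -> Int#
lookupCheck i
  | n < 0 || n >= length table = error ("out-of-bound index: " ++ show n)
  | otherwise = case table !! n of I# x -> x
  where
    n = I# i
    table = [10, 20, 30] :: [Int]

-- The unlifted let is still strict, but it sits on the right-hand side of
-- `&&`, so it is only evaluated when the offset test has already succeeded.
candidate :: Int -> Int -> Bool
candidate (I# offset) (I# c) =
  isTrue# (offset >=# 0#)
    && (let check = lookupCheck offset in isTrue# (check ==# c))

main :: IO ()
main = do
  print (candidate (-1) 10) -- negative offset: the lookup is never reached
  print (candidate 0 10)    -- valid offset with a matching entry
```

In this sketch the negative offset never reaches the lookup, so the `let` form appears to give the same ordering as the pattern-guard version. Whether it is preferable in the real template is a separate question that this example does not settle.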
Regenerate Lexer.hs with Alex 3.2.7.2 to fix issue #8892 (out-of-bound access due to haskell/alex#223). (cherry picked from commit ca7a8e2)
`offset` can be negative. We must check this before using it to index into the `alex_check` array.

Caught with GHC's JS backend (native GHC would only segfault in rare circumstances).