Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix out-of-bound access to "alex_check" array #223

Merged
merged 2 commits into from
Apr 3, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,12 @@
## Change in 3.2.7.2

* Fix bug with out-of-bound access to `alex_check` array.
(Surfaced with GHC's JS backend, fixed by Sylvain Henry in
PR [#223](https://github.com/haskell/alex/pull/223).)
* Tested with GHC 7.0 - 9.6.1.

_Andreas Abel, 2023-04-03_

## Change in 3.2.7.1

* Fix bug with repeated numeral characters *outside* of `r{n,m}`
Expand Down
2 changes: 1 addition & 1 deletion alex.cabal
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
cabal-version: >= 1.10
name: alex
version: 3.2.7.1
version: 3.2.7.2
-- don't forget updating changelog.md!
license: BSD3
license-file: LICENSE
Expand Down
12 changes: 8 additions & 4 deletions data/AlexTemplate.hs
Original file line number Diff line number Diff line change
Expand Up @@ -161,11 +161,15 @@ alex_scan_tkn user__ orig_input len input__ s last_acc =
let
base = alexIndexInt32OffAddr alex_base s
offset = PLUS(base,ord_c)
check = alexIndexInt16OffAddr alex_check offset

new_s = if GTE(offset,ILIT(0)) && EQ(check,ord_c)
then alexIndexInt16OffAddr alex_table offset
else alexIndexInt16OffAddr alex_deflt s
new_s
| GTE(offset,ILIT(0))
, check <- alexIndexInt16OffAddr alex_check offset
, EQ(check,ord_c)
= alexIndexInt16OffAddr alex_table offset

| otherwise
= alexIndexInt16OffAddr alex_deflt s
Copy link
Member

@andreasabel andreasabel Apr 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is too early in the morning and my brain hasn't woken up... but I cannot see the semantic difference between the new and the old code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have explained more ;)

The issue is that 'check' is unlifted, hence it's a strict binding. The way it was written before, it was evaluated before checking that offset was greater than or equal to 0. I've just moved the binding after the check.

Adding a test case for this seems difficult. We could add an assertion in the indexing function in debug mode that the offset is positive, but that seemed overkill. I've found this bug in the lexer code generated for Cabal-syntax. Any use of this code with the JS backend fails (the JS backend doesn't have an addressable heap, instead 'alex_check' array is represented as a JS array, so out-of-bound access triggers an exception).

Copy link
Member

@andreasabel andreasabel Apr 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok that is subtle indeed. I unlearned ML a long time ago... ;-)

  • Could you add your explanation to the commit message?
  • Could you add a CHANGELOG entry for this?
  • Do you need a release of the fix?
  • (Let's skip the test case as the change obviously makes the code more correct.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

A release would be nice as I'll have to fix Cabal-syntax's lexer.

in
case new_s of
ILIT(-1) -> (new_acc, input__)
Expand Down