Fix case insensitive option with unicode, RANGE_PROBABLY_CONTAINS_NOT_IMPLIED_CHARACTERS fixes #3435

Merged
parrt merged 4 commits into antlr:master from KvanTTT:case-insensitive-fixes on Dec 26, 2021
KvanTTT (Member) commented Dec 26, 2021

I tested the case-insensitive grammars in our grammars-v4 repository with the new option enabled and found several errors, which this PR fixes:

  • Enable RANGE_PROBABLY_CONTAINS_NOT_IMPLIED_CHARACTERS only for ASCII characters, because its behavior with Unicode ranges is probably not fully correct. I still think it's a useful warning; there are even StackOverflow questions about the [A-z] range: "Difference between regex [A-z] and [a-zA-Z]" and "[A-z0-9]+ regexp matching square brackets [duplicate]"
  • Convert a range under the caseInsensitive option only if the length of the upper-case range equals the length of the lower-case range
  • Improve the reported location of warnings
  • Remove some duplicate warnings
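
To illustrate the first two points, here is a minimal Python sketch. It is not the ANTLR implementation (the tool is written in Java, and its exact conditions may differ). The function names `non_letters_in_ascii_range` and `case_insensitive_ranges` are hypothetical, and the rules they apply (warn only when both endpoints are ASCII letters; add the other-case range only when both case mappings span the same number of code points) are assumptions drawn from the description above.

```python
from typing import List, Tuple


def non_letters_in_ascii_range(start: str, end: str) -> List[str]:
    """Characters in [start-end] that are not letters (ASCII letter endpoints only)."""
    endpoints_are_ascii_letters = all(c.isascii() and c.isalpha() for c in (start, end))
    if not endpoints_are_ascii_letters:
        return []  # assumption: no warning for Unicode or non-letter endpoints
    return [chr(c) for c in range(ord(start), ord(end) + 1) if not chr(c).isalpha()]


def case_insensitive_ranges(start: str, end: str) -> List[Tuple[str, str]]:
    """Expand a range to both cases only when the two case-mapped ranges have equal length."""
    original = (start, end)
    lower = (start.lower(), end.lower())
    upper = (start.upper(), end.upper())
    # Multi-character mappings (e.g. 'ß'.upper() == 'SS') cannot be range endpoints.
    if any(len(c) != 1 for c in lower + upper):
        return [original]
    # Caseless ranges such as [0-9], or mixed-case ranges such as [A-z], stay as written.
    if lower == upper or original not in (lower, upper):
        return [original]
    # Convert only if both case variants span the same number of code points.
    if ord(lower[1]) - ord(lower[0]) != ord(upper[1]) - ord(upper[0]):
        return [original]
    return [lower, upper]


print(non_letters_in_ascii_range("A", "z"))   # ['[', '\\', ']', '^', '_', '`']
print(case_insensitive_ranges("a", "z"))      # [('a', 'z'), ('A', 'Z')]
print(case_insensitive_ranges("a", "\u00ff")) # [('a', 'ÿ')]
```

The last call is left unconverted because the upper-case counterpart of `ÿ` (U+00FF) is `Ÿ` (U+0178), so the upper-case range would cover far more code points than the lower-case one.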

…CHARACTERS_COLLISION_IN_SET warnings

Don't report twice similar CHARACTERS_COLLISION_IN_SET warnings if caseInsensitive option enabled
KvanTTT force-pushed the case-insensitive-fixes branch from be5c68a to df62fba on December 26, 2021 19:38
parrt added this to the 4.9.4 milestone on Dec 26, 2021
parrt merged commit faefc25 into antlr:master on Dec 26, 2021
KvanTTT deleted the case-insensitive-fixes branch on December 27, 2021 00:40
KvanTTT restored the case-insensitive-fixes branch on December 27, 2021 00:59
KvanTTT deleted the case-insensitive-fixes branch on December 27, 2021 00:59