-
Notifications
You must be signed in to change notification settings - Fork 271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change handling of “blank” identifiers #5282
Conversation
Previously, they were tokenized separately from other identifiers, but then most handling checked both tokens anyway. This now always parses “blanks” as normal identifiers and checks their blankness at the few places we care about it. There were two places that treated `Blank` differently than `WordyId`, and those are preserved. There were also two places where `Blank ""` (`_`) was treated differently than `Blank n` (`_withSomeSuffix`), and those have been eliminated. Fixes unisonweb#2822.
This makes it much easier to read the output when debugging the lexer. And it should be `Read`-compatible.. There’s still room for improvement, though: ```haskell Block (Open "scratch.u") [ [ Leaf (WordyId (NameOnly (Name Relative (NameSegment {toUnescapedText = "dontMap"} :| [])))), Leaf (WordyId (NameOnly (Name Relative (NameSegment {toUnescapedText = "f"} :| [])))), Block (Open "=") [ [ Block (Open "cases") [ [ Leaf (WordyId (NameOnly (Name Relative (NameSegment {toUnescapedText = "None"} :| [])))), Block (Open "->") [ [ Leaf (Reserved "false"), ], ] (Just Close), Leaf (Semi True), ], [ Leaf (WordyId (NameOnly (Name Relative (NameSegment {toUnescapedText = "Some"} :| [])))), Leaf (WordyId (NameOnly (Name Relative (NameSegment {toUnescapedText = "_unused"} :| [])))), Block (Open "->") [ [ Leaf (WordyId (NameOnly (Name Relative (NameSegment {toUnescapedText = "f"} :| [])))), Leaf (Numeric "2"), ], ] (Just Close), ], ] (Just Close), ], ] (Just Close), ], ] (Just Close) ```
I couldn't figure out what _used refers to here: | ||
|
||
3 | Some _used -> f _used | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we could give a hint here, when an unknown identifier matches an unbound pattern var, but it doesn't have to be in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, agreed. I don’t think it should be in this PR, because it’s not a parser issue, and this behavior already existed.
And this is another case where using color isn’t enough to convey the error. It should be more like
3 | Some _used -> f _used
^^^^^
(So, another vote for integrating Diagnose.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(So, another vote for integrating Diagnose.)
Whoa that looks crazy. I'm interested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Long ago we had started implementing something like what you showed, but we weren't good at it and we scrapped it. But off-the-shelf sounds nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To give credit where it‘s due, @ChrisPenner is the one who shared it initially (https://discord.com/channels/862108724948500490/1200119435931942994/1261743434704879796).
Ooh, does this now allow |
If you mean
instead of
then yes! |
Overview
The proximate bug was that the lexer for
Blank
would “steal” the first segment of an identifier. E.g.,_a.blah
would be lexed as[Blank "a", WordyId ".blah"]
, rather than[WordyId "_a.blah"]
. However, fixing that directly was fragile, and left other cases like the one in #4681.Also, most uses of
Blank
exactly mirroredWordyId
, often rebuilding the original segment.This PR removes the
Blank
token, instead treating it as any otherWordyId
. The parser now checks identifiers for “blankness” as needed.Fixes #2822.
Implementation notes
There were two places that treated
Blank
differently thanWordyId
, and those are preserved. There were also two places where a “true”Blank
(_
) was treated differently than a suffixedBlank
(_withSomeSuffix
), and those have been eliminated.I considered storing “blankness” in an extra field of
WordyId
, but that suffers from boolean blindness and also means that it’s possible to create inconsistent terms (e.g.,WordyId IsBlank "definitely.not.blank"
).Test coverage
There is a new transcript with examples pulled from both #2822 and #4681, as well as other coverage I thought was useful.