I use lark.Lark with the transformer= argument to run a transformer during parsing. I found that this sometimes produces the following error message when an unexpected token is encountered: lark.exceptions.UnexpectedToken: <exception str() failed>. This seems to mean that an exception is raised in UnexpectedToken.__str__. When I ran the parser without the transformer, this did not happen.
Analysis: UnexpectedToken.__str__ calls InteractiveParser.accepts through the property UnexpectedToken.accepts. InteractiveParser.accepts somehow calls MyTransformer.OP with an empty token as argument. MyTransformer.OP raises a KeyError when it gets that empty ('') token.
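The mechanism in the analysis can be illustrated without lark. This is only a minimal sketch: `ToyUnexpectedToken`, `op_callback`, `OPERATORS` and `str_fails` are hypothetical stand-ins, not lark classes or the original reproduction code. It shows an exception whose `__str__` calls a user callback with an empty token, so that `str()` on the exception itself raises.

```python
# Hypothetical lookup table used by the stand-in callback.
OPERATORS = {"+": "add", "-": "sub"}


def op_callback(token):
    # Stand-in for a terminal callback such as MyTransformer.OP.
    # Raises KeyError for any token that is not a known operator, including ''.
    return OPERATORS[token]


class ToyUnexpectedToken(Exception):
    """Toy exception whose message depends on calling user code."""

    def __init__(self, token, callback):
        super().__init__(token)
        self.token = token
        self.callback = callback

    def __str__(self):
        # Probing with an empty token, as the analysis describes.
        probe = self.callback("")
        return f"Unexpected token {self.token!r} (probe gave {probe!r})"


def str_fails(exc):
    # Returns True if building the message raises KeyError.
    try:
        str(exc)
    except KeyError:
        return True
    return False
```

When an exception like this propagates uncaught, CPython cannot build its message and prints `<exception str() failed>` in its place, which matches the reported symptom.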
Fix: I think it is acceptable behaviour for a transformer rule to raise an exception when it gets a token that violates the grammar. I guess it should even be acceptable for the transformer to raise an exception in other cases. Possible ways to fix this problem could be:
- Do not attach InteractiveParsers to UnexpectedInput exceptions (_Parser.parse_from_state, lalr_parser.py:190) when a transformer is used directly together with a parser. Then, UnexpectedToken.__str__ would produce an error message using self.expected instead of self.accepts. I do not know what the drawbacks of using self.expected are.
- Catch and handle any type of exception in UnexpectedToken.accepts. On error, fall back to self.expected or return a meaningful string to prevent <exception str() failed>.
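The second option could look roughly like the sketch below. It is an assumption-laden illustration, not lark's implementation: `SketchUnexpectedToken`, `BrokenInteractiveParser` and `WorkingInteractiveParser` are hypothetical stand-ins, and only the attribute names `accepts`, `expected` and `interactive_parser` are borrowed from the discussion above.

```python
class BrokenInteractiveParser:
    """Hypothetical stand-in whose accepts() fails like the reported case."""

    def accepts(self):
        raise KeyError("")


class WorkingInteractiveParser:
    """Hypothetical stand-in whose accepts() succeeds."""

    def accepts(self):
        return {"OP"}


class SketchUnexpectedToken(Exception):
    def __init__(self, token, expected, interactive_parser=None):
        super().__init__(token)
        self.token = token
        self.expected = set(expected)
        self.interactive_parser = interactive_parser

    @property
    def accepts(self):
        # Return None instead of propagating errors raised while probing.
        if self.interactive_parser is None:
            return None
        try:
            return set(self.interactive_parser.accepts())
        except Exception:
            return None

    def __str__(self):
        # Fall back to the coarser `expected` set when probing failed.
        choices = self.accepts
        if choices is None:
            choices = self.expected
        return (
            f"Unexpected token {self.token!r}. "
            f"Expected one of: {', '.join(sorted(choices))}"
        )
```

With the broken stand-in the message lists the `expected` set; with the working one it lists the narrower `accepts` set. Catching `Exception` this broadly can hide real bugs, so this may be less attractive than avoiding the transformer call in the first place.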
Versions: lark-1.1.7, Python 3.11.2
IMO the bug is that InteractiveParser.accepts calls into the Transformer at all; that can have arbitrary side effects outside the purpose of that function.
But also, I am pretty sure passing your transformer as an internal transformer is a bad idea since the OP function doesn't return a Token. I am surprised it works at all.
Edit: The semantics of passing Token transformations is slightly different than what I remembered, this is fine.
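The first point in the comment above, that probing for acceptable tokens should not run user callbacks, can be sketched with a toy model. Everything here is hypothetical (`shift`, `probe_accepts`, `op` and the `callbacks` dict are illustrative names, not lark internals): the probe feeds empty placeholder tokens, so it runs with an empty callback table.

```python
def op(token_value):
    # Stand-in for a user terminal callback that rejects unknown operators.
    return {"+": "add", "-": "sub"}[token_value]


def shift(value_stack, token_type, token_value, callbacks):
    # Apply the terminal callback on shift, if one is registered.
    callback = callbacks.get(token_type)
    value_stack.append(callback(token_value) if callback else token_value)


def probe_accepts(choices, callbacks, use_callbacks):
    # Try each candidate terminal with an empty placeholder value.
    active = callbacks if use_callbacks else {}
    accepted = set()
    for token_type in choices:
        shift([], token_type, "", active)
        accepted.add(token_type)
    return accepted
```

In this toy model, `probe_accepts(["OP"], {"OP": op}, use_callbacks=False)` succeeds, while the same call with `use_callbacks=True` raises the KeyError from `op`.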
MegaIng added a commit to MegaIng/lark that referenced this issue on Sep 28, 2023.
Thanks for having a look at this so quickly! Both #1346 and #1347 solve the problem for me. As a user, I do not know which one makes more sense internally.
Minimal example to reproduce: case1() runs fine, the problem is in case2().