Refactor expression lexers and specialise parsers #42

Closed
jg-rp opened this issue Jan 23, 2022 · 0 comments

jg-rp commented Jan 23, 2022

Both template and expression lexing functions are currently defined in liquid.lex, and parsers for template tags and tag expressions are bundled into liquid.parse. Moreover, all tag expressions are parsed through liquid.parse.ExpressionParser.parse_expression(), which handles liquid identifiers, loops and boolean expressions.

To ease maintenance and potentially improve performance, I intend to move each of the expression lexers into its own package and refactor it, giving each package a specialised parser and its own TokenStream, independent of the top-level token stream.
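As a rough illustration of what a per-expression token stream and specialised parser might look like, here is a minimal sketch for loop expressions. Everything in it is an assumption made for illustration: the `(kind, value)` token shape, the token kind names, and the `TokenStream` and `parse_loop_expression` names are hypothetical and do not reflect the actual Python Liquid API.

```python
from typing import Dict, Iterable, Optional, Tuple, Union

# Hypothetical token shape: a plain (kind, value) tuple.
Token = Tuple[str, str]

EOF: Token = ("eof", "")


class TokenStream:
    """A small token stream scoped to one expression (illustrative only)."""

    def __init__(self, tokens: Iterable[Token]) -> None:
        self._tokens = iter(tokens)
        self.current: Token = next(self._tokens, EOF)

    def advance(self) -> Token:
        """Return the current token and step to the next one."""
        token = self.current
        self.current = next(self._tokens, EOF)
        return token

    def expect(self, kind: str) -> Token:
        """Consume the current token, failing if it is not of `kind`."""
        if self.current[0] != kind:
            raise SyntaxError(f"expected {kind!r}, found {self.current[0]!r}")
        return self.advance()


LoopExpression = Dict[str, Union[str, bool, Optional[int]]]


def parse_loop_expression(stream: TokenStream) -> LoopExpression:
    """Parse `<name> in <iterable> [limit:<int>] [offset:<int>] [reversed]`.

    There is no infix, prefix or precedence handling here, because a loop
    expression of this shape does not need any.
    """
    name = stream.expect("identifier")[1]
    stream.expect("in")
    iterable = stream.expect("identifier")[1]

    limit: Optional[int] = None
    offset: Optional[int] = None
    reverse = False

    while stream.current[0] != "eof":
        kind, _ = stream.advance()
        if kind == "reversed":
            reverse = True
        elif kind == "limit":
            stream.expect("colon")
            limit = int(stream.expect("integer")[1])
        elif kind == "offset":
            stream.expect("colon")
            offset = int(stream.expect("integer")[1])
        else:
            raise SyntaxError(f"unexpected token {kind!r}")

    return {
        "name": name,
        "iterable": iterable,
        "limit": limit,
        "offset": offset,
        "reversed": reverse,
    }
```

The parser is a straight-line sequence of `expect` calls followed by a loop over optional arguments, which is the kind of simplification a specialised parser allows compared with a general-purpose expression parser.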

Built-in tags will switch to these new parsers now, via the liquid.Environment.parse_*_expression_value functions. For the benefit of those who use them in custom tags, the existing tokenize* functions and the ExpressionParser will be maintained until at least Python Liquid version 2.0, which is quite some time away.

Some possible optimisations that can be realised include:

  • Lexers that yield tuples rather than NamedTuples. Benchmarks show the former to be faster.
  • Lexers that recognise identifiers with bracketed indexes and string literals. Doing this with regular expressions in the lexer will be much faster than stepping through a token stream in the parser.
  • Don't do unnecessary infix operator parsing when handling loop or output expressions. They don't have any infix operators.
  • Don't do unnecessary token precedence look-ups when handling expressions that don't have any precedence rules. Only boolean expressions use precedence rules.
  • Remove unnecessary prefix parsing. No built-in tag expression uses prefix operators. Negative numbers can be handled during tokenization.
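
Several of the points above can be sketched with a single regular-expression-based lexer. The example below is a minimal sketch, not the implementation in Python Liquid: the token kind names, the `RULES` table and the `tokenize` function are hypothetical. It yields plain tuples, matches an identifier together with any dotted or bracketed segments as one token, strips quotes from string literals, and treats a leading minus sign as part of a number. It assumes bracketed segments contain only integer indexes or quoted strings, so nested variables such as `a[b.c]` are not handled.

```python
import re
from typing import Iterator, Tuple

# Hypothetical patterns, for illustration only.
_STRING = r"\"[^\"]*\"|'[^']*'"
_NAME = r"[a-zA-Z_]\w*"

# Order matters: earlier rules win when more than one could match.
RULES = (
    ("string", _STRING),
    ("float", r"-?\d+\.\d+"),
    ("integer", r"-?\d+"),
    # A name followed by any number of `.name`, `[0]` or `['key']` segments.
    ("identifier", rf"{_NAME}(?:\.{_NAME}|\[(?:\d+|{_STRING})\])*"),
    ("pipe", r"\|"),
    ("colon", r":"),
    ("comma", r","),
    ("skip", r"\s+"),
    ("illegal", r"."),
)

TOKEN_RE = re.compile(
    "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in RULES)
)


def tokenize(source: str) -> Iterator[Tuple[str, str]]:
    """Yield (kind, value) tuples for each token in `source`."""
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        value = match.group()
        if kind == "skip":
            continue
        if kind == "illegal":
            raise SyntaxError(
                f"unexpected {value!r} at position {match.start()}"
            )
        if kind == "string":
            value = value[1:-1]  # Drop the surrounding quotes.
        yield (kind, value)
```

With a lexer of this shape, `list(tokenize("products[0].title | append: 'x', -1.5"))` produces seven tokens, the first being `("identifier", "products[0].title")` and the last `("float", "-1.5")`, so the parser never has to step through separate bracket, dot or minus tokens.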
@jg-rp jg-rp added the enhancement New feature or request label Jan 23, 2022
@jg-rp jg-rp self-assigned this Jan 23, 2022
@jg-rp jg-rp closed this as completed in fbf6b32 Jan 24, 2022