Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a tolerant parser for Freemarker #1

Open
angelozerr opened this issue Mar 17, 2018 · 1 comment
Open

Implement a tolerant parser for Freemarker #1

angelozerr opened this issue Mar 17, 2018 · 1 comment

Comments

@angelozerr
Copy link
Owner

angelozerr commented Mar 17, 2018

To support advanced LSP feature like completion, outline, etc we need to have a Freemarker AST with location. Today we cannot use the Freemarker FMParser because it is not tolerant.

I create this issue to know how to fix that. There are 2 solutions:

  1. Write a custom parser

Implement a (basic) tolerant parser for Freemarker to implement after LSP commands like HTML Language Server (VSCode) does:

  1. Update Freemarker FMParser

It's the ideal solution to use the existing FMParser but it's an hard task.

I have retrieved an interesting discussion about the FMParser tolerant that I had http://freemarker.624813.n4.nabble.com/FMParser-improvment-for-DLTK-Freemarker-plugin-td2265334.html

It should be very fantastic if we could use the FMParser as tolerant parser with custom structure by using a factory kind, in other words:

  • tolerant parser mode to use it in any IDE to have everytime an AST even if FM content is broken.
  • factory structure to use in my case the LSP4j structure SymbolInformation.

have started to play with FMParser (2.3) and to support tolerant parser, we need:

  • add a setter to FMParser like FMParser#setTolerant(boolean tolerant)
  • check in any code which throw error if tolerant is setted, and in this case we must return a "tolerant" structure like TolerantDollarVariable instead of DollarVariable if there us an error on variable.

The difficulty is to find the whole code which throw error like ParseException, call of jj_consume_token (like jj_consume_token(CLOSING_CURLY_BRACKET); which throw error.

@ddekany
Copy link
Contributor

ddekany commented Mar 22, 2018

If this is the problem you feel the most motivated to address, out of professional curiosity or anything else, then do it, and of course I will help where I can. However, I would like to point out that this is hardly a low hanging fruit. It's also something without which the users can survive; they only see the first error, and until they fix it, there's no outline and such extras. Ugly, but not critical. Reaching a state where this plugin can roughly do what the JBoss Tools FreeMarker IDE can, plus maybe have "web editor" features, is an awesome enough first milestone. So while I think this feature is important (and was requested by others as well), it's not the highest priority at the moment, especially as it's certainly a harder one. But again, of course do whichever part you feel like doing, after all you do it for free.

Back to the issue itself... Just in case you haven't found it, look at this: https://javacc.org/tutorials/errorrecovery

After a quick look I think that:

  • Lexer errors need to become impossible. It's weird to me, but if the JavaCC authors say so... It means that any random sequence of characters must product a stream of tokens without throwing TokenMgrError. Some of those tokens will be something like ILLEGAL_CHAR_SEQUENCE, I presume.
  • After catching a ParseException we need to consume the tokens up to a "synchronizing token". Inside interpolations that will be an unpaired }, inside an FTL tag an >... except, that we have syntactical options, so an unpaired ] is also a possibility in some cases.
  • What makes this whole thing more expensive is that in FM2 (and still in FM3, but that will change) each directive has its own JavaCC "product". So you have to redo this catch-and-synchronize thing for each of them... also, it has to be done separately for the start-tag and end-tag.
  • The tolerant option only has to affect if the "synchronize" part (which we can factor out into one or two common methods) will look for the synchronization token, or instead just re-throws the exception. However, it remains to be seen if this feature makes the lexing and parsing significantly slower even with tolerant = false.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants