Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Adds new RegExp-based "parser" #478

Closed
wants to merge 4 commits into from
Closed

Conversation

dmsnell
Copy link
Member

@dmsnell dmsnell commented Apr 21, 2017

Since we're only delineating the blocks in the "parser" there's no great
need to pull in a heavyweight parser. This uses RegExps and a
recursive-descent approach to split the blocks.

Still yet to do is parse the attributes but that shouldn't take much effort.

Granted, I think we should pull more parsing of the inside of the blocks
into this file, but if we're leaving all the dirty parsing up to the blocks then
this is sufficient.

@nylen
Copy link
Member

nylen commented Apr 21, 2017

Since we're only delineating the blocks in the "parser"

This is not where we want to end up.

I think we should pull more parsing of the inside of the blocks into this file

I agree.

Since we're only delineating the blocks in the "parser" there's no great
need to pull in a heavyweight parser. This uses RegExps and a
recursive-descent approach to split the blocks.

Still yet to do is parse the attributes but that shouldn't take long.
@aduth
Copy link
Member

aduth commented Apr 24, 2017

Related to #413 and current APIs for block transforms, there's an immediate need to be able to transform legacy content, which requires being able to parse content outside block demarcations (currently aggregated into a single freeform block).

@nylen
Copy link
Member

nylen commented Apr 27, 2017

Quoting from Automattic/wp-post-grammar#18 (comment) because it expands a bit more on my previous comment here:

I think the responsibility of the parser should be just to parse a tree of HTML and block-delimiter nodes, then we should hand this off to a separate layer to confirm that what appears inside each block is actually valid. One big reason for this is to make it easy for plugin authors; another is that these do feel like separate tasks.

So I think rather than adding another parser here, we should look to expand the existing ones. A couple of options:

  • Actually use the tree of content that TinyMCE's parser is kindly providing for us. Probably the easiest path forward.
  • Move towards full integration of wp-post-grammar and switch back to the PegJS parser by default. Probably a more robust and portable approach.

@dmsnell
Copy link
Member Author

dmsnell commented May 19, 2017

closing because this was an experiment and not needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants