Block Parser: Start building a unified block/HTML parser. #6381

dmsnell · 2024-04-10T22:38:24Z

Many blocks are starting to be modified with the HTML API. Given that the HTML API will already be parsing block content, and also that it reads HTML comments along the way, it might make sense to combine the multiple passes of block and HTML parsing into one.

Builds on ideas explored in #5705

Combining the parsers could:

- Introduce new before and after hooks where blocks can be inserted. This could enable a new block hooks mechanism that doesn't require parsing and traversing the block structure, but can "tag along" with the normal process.
- Provide new insight and context into block parsing, such as knowing where in the HTML a block is, or where in the block tree the given HTML is.
- Improve efficiency by lazily decoding the JSON attributes and by reusing the input string rather than breaking it up into many substring copies.
- Replace block bindings as blocks are parsed, avoiding yet another parsing pass.
- Process Interactivity API directives as the document is parsed, linearly and in a streaming fashion. This would avoid the need to run the Server Directive Processor over the entire document after it has already been parsed.
- Post-process block output to run global policies concerning attribute values and allowable HTML markup.

It could slow down some operations, however, namely those that otherwise wouldn't require both block and HTML parsing. For example, the block parser is fast, if memory hungry; parsing HTML along the way could slow it down.
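To make the efficiency and context ideas above concrete, here is a minimal sketch of scanning block comment delimiters in a single pass. It is written in Python purely for illustration and is not the code in this pull request. The `DELIMITER` pattern, the `Delimiter` class, and `scan_delimiters()` are hypothetical names, and the pattern is an assumed simplification of the real block grammar. Each delimiter records byte offsets into the shared input string, the depth of open blocks at that point, and decodes its JSON attributes only when asked.

```python
import json
import re
from dataclasses import dataclass, field
from typing import Any, Iterator, List, Optional, Tuple

# Assumption: a simplified delimiter pattern for illustration only. It is
# looser than the grammar WordPress uses and does not handle malformed input.
DELIMITER = re.compile(
    r"<!--\s+(?P<closer>/)?wp:"
    r"(?P<name>[a-z][a-z0-9_-]*(?:/[a-z][a-z0-9_-]*)?)\s+"
    r"(?:(?P<attrs>\{.*?\})\s+)?(?P<void>/)?-->",
    re.DOTALL,
)


@dataclass
class Delimiter:
    """One block comment delimiter, described by offsets into the input."""

    source: str = field(repr=False)  # the whole document, shared by reference
    kind: str                        # "opener", "closer", or "void"
    name: str                        # fully qualified, e.g. "core/paragraph"
    start: int                       # offset of "<!--"
    end: int                         # offset just past "-->"
    depth: int                       # number of open ancestor blocks
    attrs_span: Optional[Tuple[int, int]]  # where the JSON sits, if present
    _attrs: Any = field(default=None, repr=False)
    _decoded: bool = field(default=False, repr=False)

    def attrs(self) -> Any:
        """Decode the JSON attributes on first access, then cache them."""
        if not self._decoded:
            if self.attrs_span is None:
                self._attrs = {}
            else:
                begin, finish = self.attrs_span
                self._attrs = json.loads(self.source[begin:finish])
            self._decoded = True
        return self._attrs


def scan_delimiters(html: str) -> Iterator[Delimiter]:
    """Yield delimiters in document order while tracking open blocks."""
    open_blocks: List[str] = []
    for match in DELIMITER.finditer(html):
        name = match.group("name")
        if "/" not in name:
            name = "core/" + name  # names without a namespace are core blocks
        span = match.span("attrs") if match.group("attrs") is not None else None

        if match.group("closer"):
            # Pop only when the closer matches the innermost open block.
            if open_blocks and open_blocks[-1] == name:
                open_blocks.pop()
            kind, depth = "closer", len(open_blocks)
        elif match.group("void"):
            kind, depth = "void", len(open_blocks)
        else:
            kind, depth = "opener", len(open_blocks)
            open_blocks.append(name)

        yield Delimiter(html, kind, name, match.start(), match.end(), depth, span)
```

A caller that only needs block names or positions never pays for JSON decoding, and the HTML between two delimiters can be addressed as `html[previous.end:next.start]` without first building a tree of substring copies. A real implementation would need to handle malformed delimiters and mismatched closers, which this sketch skips.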


github-actions · 2024-04-10T22:55:14Z

Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

- The Plugin and Theme Directories cannot be accessed within Playground.
- All changes will be lost when closing a tab with a Playground instance.
- All changes will be lost when refreshing the page.
- A fresh instance is created each time the link below is clicked.
- Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance, it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

dmsnell added 3 commits April 11, 2024 16:45

Nest blocks and track stack of open blocks.

bf5e15c

Add get_depth() and example code for block comment parser

ce15db1

Add comments

b5ecc9b

dmsnell mentioned this pull request May 1, 2024

Try: Block insertion via HTML processor woocommerce/woocommerce#47089

Closed

11 tasks

dmsnell mentioned this pull request May 10, 2024

Performance best practices WordPress/developer-blog-content#258

Open

This was referenced Jun 6, 2024

Block Parser: Explore a streaming lazy interface #5705

Draft

Dennis' list of broad and interesting things. WordPress/gutenberg#62437

Open
