Experiment: Run through esc_attr() in a single optimized pass. #5337

Draft
wants to merge 10 commits into
base: trunk

Commits on Oct 2, 2023

  1. Experiment: Run through esc_attr() in a single optimized pass.

    The existing implementation of `esc_attr()` runs a jumble of regular expression
    and other search passes over its input.
    
    In this patch, if the site uses UTF-8, then an exploratory single-pass custom
    parser is used to escape the attribute values.
    dmsnell committed Oct 2, 2023
    b99162d
  2. Put it in its own file.

    dmsnell committed Oct 2, 2023
    43db5f0
  3. 1bc7b33
  4. Final fixes

    dmsnell committed Oct 2, 2023
    2450dd9
  5. Small docs changes

    dmsnell committed Oct 2, 2023
    41365a6
  6. More small changes.

    dmsnell committed Oct 2, 2023
    0a4d4c2
  7. 804969a
  8. Preserve more Core behaviors

    dmsnell committed Oct 2, 2023
    35cc667
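
The single-pass idea from the first commit message can be illustrated with a rough sketch. This is not the PR's PHP code: it is a Python approximation, and the names `escape_attr_single_pass`, `_reference_length`, `ESCAPES`, and `KNOWN_NAMES` are hypothetical. It assumes the goal is to escape the special characters in one left-to-right scan while leaving an `&` alone when it already starts a character reference.

```python
# Replacements for characters that are unsafe inside a quoted attribute value.
ESCAPES = {"<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#039;"}

# Hypothetical, tiny allow-list of reference names; a real table is far larger.
KNOWN_NAMES = {"amp", "lt", "gt", "quot", "apos"}


def _reference_length(text, start):
    """Length of a reference body (including ';') beginning at start, else 0."""
    end = text.find(";", start, start + 12)  # only look a short distance ahead
    if end == -1:
        return 0
    body = text[start:end]
    if body[:2] in ("#x", "#X"):
        digits = body[2:]
        ok = digits != "" and all(c in "0123456789abcdefABCDEF" for c in digits)
    elif body.startswith("#"):
        digits = body[1:]
        ok = digits != "" and all(c in "0123456789" for c in digits)
    else:
        ok = body in KNOWN_NAMES
    return end - start + 1 if ok else 0


def escape_attr_single_pass(text):
    """Escape text in one scan, without re-encoding existing references."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "&":
            length = _reference_length(text, i + 1)
            if length:
                # Copy the existing reference through untouched.
                out.append(text[i:i + 1 + length])
                i += 1 + length
                continue
            out.append("&amp;")
        else:
            out.append(ESCAPES.get(ch, ch))
        i += 1
    return "".join(out)
```

Each input character is visited once and the output is built as the scan advances, which is the contrast with running several regular-expression passes over the same string.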

Commits on Oct 3, 2023

  1. 49af2fd
  2. Major refactor: Introduce and use WP_Token_Set

    In order to clarify the main loop of `_esc_attr_single_pass_utf8` I've moved the
    named character reference lookup outside of the function and into a new high-performance
    token set class dubbed `WP_Token_Set`. I created this class to retain the performance
    perks brought by the optimized data format.
    
    There are two lookup sets though because WordPress traditionally has its own custom
    set based on HTML4, but I would like to see us allow everything that HTML5 allows,
    including the common `&apos;` so we don't have to keep writing `&#39;` (because
    that doesn't stand out as clearly as the name does).
    
    Performance in this change is even better than it was previously because I've removed
    the substitutions from the lookup table and that removes both iteration and working
    memory. In order to provide the reverse function, decoding these entities, it would
    probably be best to create two separate tables, or add a fixed byte length and offset
    value as a lookup into another table so that we can avoid reintroducing the double
    crawling scan that we had before.
    dmsnell committed Oct 3, 2023
    b5caa5c
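
The token-set lookup described in the last commit message might look roughly like the following. This is a Python sketch under stated assumptions, not the actual `WP_Token_Set` class or its data format: the `TokenSet` name, the `read_token` method, and the two-character bucketing are all hypothetical. It stores only the tokens themselves (no substitutions), so a lookup answers "which known token, if any, starts at this offset?".

```python
class TokenSet:
    """Hypothetical prefix-bucketed set of tokens (illustration only)."""

    def __init__(self, tokens, key_length=2):
        self.key_length = key_length
        self.buckets = {}   # prefix -> tokens sharing that prefix
        self.short = set()  # tokens shorter than the prefix length
        for token in tokens:
            if len(token) < key_length:
                self.short.add(token)
            else:
                self.buckets.setdefault(token[:key_length], []).append(token)
        # Longest first, so "notin;" wins over "not" when both would match.
        for group in self.buckets.values():
            group.sort(key=len, reverse=True)

    def read_token(self, text, offset=0):
        """Return the longest known token starting at offset, or None."""
        key = text[offset:offset + self.key_length]
        for token in self.buckets.get(key, ()):
            if text.startswith(token, offset):
                return token
        # Fall back to tokens shorter than the bucket key, longest first.
        for length in range(self.key_length - 1, 0, -1):
            candidate = text[offset:offset + length]
            if len(candidate) == length and candidate in self.short:
                return candidate
        return None


names = TokenSet(["amp;", "lt;", "gt;", "quot;", "apos;", "not", "notin;"])
```

Bucketing by a short prefix means each lookup compares against only the handful of tokens that share that prefix. As the commit message notes, a decoder would need the replacement values as well, for example in a second table.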