Collect hashes of "repetitive" content (JS functions, comments, CSS blocks) #150

pmeenan · 2024-10-23T16:16:44Z

We are exploring what a web-wide compression dictionary might look like: a dictionary populated with common strings from web content that ships with browsers and is updated on some cadence. It would be similar to the dictionary that Brotli itself uses, but much larger, and it would leverage compression-dictionary-transport for the negotiation.

This could include things like common code from jQuery and React, comment blocks, etc. It would work better than shipping a specific version of a library, because multiple versions could be compressed against the latest and still get significant benefit.
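
To illustrate the "compress older versions against the latest" idea, here is a minimal sketch. It uses zlib's preset-dictionary support as a stand-in because it is in the Python standard library; it is not Brotli and it does not implement compression-dictionary-transport. The `LATEST` and `OLDER` snippets and both helper names are made up for the example.

```python
import zlib

# Hypothetical "latest" library code that would live in the shared dictionary.
LATEST = (
    b"function extend(target, source) { for (var key in source) { "
    b"if (Object.prototype.hasOwnProperty.call(source, key)) { "
    b"target[key] = source[key]; } } return target; }"
)

# Hypothetical older version of the same function, slightly different.
OLDER = (
    b"function extend(target, source) { for (var key in source) { "
    b"if (source.hasOwnProperty(key)) { "
    b"target[key] = source[key]; } } return target; }"
)


def compress_with_dictionary(data: bytes, dictionary: bytes) -> bytes:
    """Compress data using a preset dictionary (zlib stand-in for Brotli)."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dictionary(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress data that was compressed with the same preset dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


if __name__ == "__main__":
    plain = zlib.compress(OLDER, 9)
    with_dict = compress_with_dictionary(OLDER, LATEST)
    print("original bytes:      ", len(OLDER))
    print("without dictionary:  ", len(plain))
    print("with dictionary:     ", len(with_dict))
```

The older version is not byte-identical to the dictionary content, but most of it can still be encoded as references into the dictionary, which is the effect described above.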

As part of this work, it would be helpful to gather all of the functions from all of the scripts (and HTML) and store each along with the URL it came from and the hash of the function body (and maybe do something similar for top-level comment blocks and CSS style blocks). We could then group by the hash, find the most common function bodies, and use them as a basis for building the dictionary.
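
The collect-and-group step could be sketched roughly as follows. This is only an illustration under stated assumptions: the function extraction is a naive regex plus brace matching (a real agent would presumably use a proper JavaScript parser), SHA-256 is an arbitrary choice of hash, and all function names and example URLs are hypothetical.

```python
import hashlib
import re
from collections import defaultdict

# Start of a classic `function` declaration/expression, up to its opening brace.
# Assumption: arrow functions, methods, and default params with parens are skipped.
FUNC_START = re.compile(r"\bfunction\b[^({]*\([^)]*\)\s*\{")


def extract_function_bodies(source: str) -> list[str]:
    """Return the text between the outer braces of each function found.

    Naive brace matching: braces inside strings, comments, or regex
    literals are not handled and will confuse the depth counter.
    """
    bodies = []
    for match in FUNC_START.finditer(source):
        depth = 1
        pos = match.end()
        while pos < len(source) and depth > 0:
            if source[pos] == "{":
                depth += 1
            elif source[pos] == "}":
                depth -= 1
            pos += 1
        if depth == 0:  # only keep bodies whose braces balanced
            bodies.append(source[match.end():pos - 1])
    return bodies


def body_hash(body: str) -> str:
    """Hash a function body after trimming surrounding whitespace."""
    return hashlib.sha256(body.strip().encode("utf-8")).hexdigest()


def collect_rows(scripts: dict[str, str]) -> list[tuple[str, str]]:
    """Build (url, hash) rows, one per function body found in each script."""
    return [
        (url, body_hash(body))
        for url, source in scripts.items()
        for body in extract_function_bodies(source)
    ]


def most_common_bodies(rows: list[tuple[str, str]]) -> list[tuple[str, int]]:
    """Group rows by hash and rank hashes by the number of distinct URLs."""
    urls_by_hash = defaultdict(set)
    for url, digest in rows:
        urls_by_hash[digest].add(url)
    ranked = [(digest, len(urls)) for digest, urls in urls_by_hash.items()]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    scripts = {
        "https://a.example/app.js": (
            "function add(a, b) { return a + b; }\n"
            "function one() { return 1; }"
        ),
        "https://b.example/lib.js": "var sum = function (a, b) { return a + b; };",
    }
    for digest, url_count in most_common_bodies(collect_rows(scripts)):
        print(digest[:12], url_count)
```

Counting distinct URLs per hash, rather than raw occurrences, is one possible design choice: it keeps a single page that repeats a function many times from dominating the ranking.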

This is probably best done directly on the agent and streamed to a separate table rather than as a custom metric, but I'm keeping track of it here for discussion.

pmeenan added the enhancement label Oct 23, 2024

pmeenan self-assigned this Oct 23, 2024
