Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefer inline representation over static #278

Merged
merged 2 commits into from
Jul 31, 2024

Conversation

overlookmotel
Copy link
Contributor

@overlookmotel overlookmotel commented Jul 4, 2023

As mentioned in #276, current behavior is that short strings can be stored as static strings or inline. This adds an overhead for interning short strings, as they need to be hashed and compared to possible matches in the static set.

This PR changes the behavior to always store strings of 7 bytes or less inline.

This greatly improves the performance of interning short strings - approx 250% speedup, depending on length of the string.

The benchmarks in this repo show no apparent negative effect on any other operations (though there's a lot of noise in the benchmarks).

This PR does not involve any breaking changes.

Even though interning short static atoms is very fast anyway, I imagine this change will translate to a significant real-world performance improvement for some use cases. For example, SWC uses string-cache when parsing Javascript which typically includes a lot of short variable names and tokens, especially in minified code e.g. for (let i = 0; i < len; i++).

Notes on implementation

  • The empty string is still stored in static set. This is slightly non-optimal, but avoids a breaking change (StaticAtomSet::empty_string_index would need to be removed otherwise).
  • Strings <= 7 bytes can still be passed to the codegen but it generates pack_inline() calls in place of pack_static() to generate inline atoms at compile time.
  • Benchmarks needed to be altered to use longer strings when benchmarking static atoms. Benchmark changes are in a separate commit to allow comparing the benchmarks before and after the substantive change.
  • This PR also includes the changes in Faster deref of inline atoms #277. Without this, this PR produced a performance degradation deref-ing short strings as inline was not as performant as static.

@overlookmotel
Copy link
Contributor Author

Would anyone have time to review this please? I believe this represents a significant performance boost for some common use cases, with no apparent downside.

src/atom.rs Outdated Show resolved Hide resolved
@overlookmotel overlookmotel force-pushed the inline-priority2 branch 2 times, most recently from f534a9b to 9db0317 Compare July 17, 2024 10:41
@overlookmotel
Copy link
Contributor Author

Have rebased on top of #277 which is now based on latest main.

Also made changes to improve code style (as mentioned above, I was very new to Rust when I made this PR last year so, while I believe code was correct, it wasn't very idiomatic).

@overlookmotel overlookmotel force-pushed the inline-priority2 branch 2 times, most recently from b84bfc0 to c6ac6ab Compare July 17, 2024 10:48
Copy link
Member

@jdm jdm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable! Apologies for the long delay, and thanks for the clear motivation for the change!

@jdm jdm added this pull request to the merge queue Jul 31, 2024
Merged via the queue into servo:main with commit c8fed62 Jul 31, 2024
5 checks passed
@overlookmotel overlookmotel deleted the inline-priority2 branch August 1, 2024 14:31
@overlookmotel
Copy link
Contributor Author

No worries. Really glad to see this merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants