[wasm] Optimize jiterpreter type cast opcodes & add zero page optimizations #86928
Tagging subscribers to 'arch-wasm': @lewing
Issue Details
Opening as draft because it relies on #86403.
This PR inlines a sizable portion of CASTCLASS/ISINST logic into jiterpreter traces to create fast paths for exact matches/null pointers that avoid running all the type check machinery we normally would run. Based on my instrumentation these fast paths will be used ~30% of the time, and measurements show a 1-3% improvement for the json section of browser-bench (this is pretty hard to measure, I may try to construct a decent synthetic benchmark for this).
This PR also applies zero page optimizations if available, by fusing the null check into the obj->vtable load.
It may be worth inlining the interface check entirely, but I haven't done the work to test that yet. The existence of special interfaces for arrays makes that harder to do.
9b4a2c4 to 67bb8e4
isinst doesn't succeed on nulls.
Thanks, I'll revise it
…oing castclass/isinst in traces
…d in the fast check
Address PR feedback
609ed84 to 31d5a91
I was puzzled that none of the scenarios I expected to get faster were getting faster with these changes. It turned out that the scenarios I was testing were actually relying on MINT_UNBOX. So I wrote some toy BDN microbenchmarks to use and also optimized the unbox operation. Some comparison timings for main vs this PR:
This PR inlines a sizable portion of MINT_CASTCLASS/MINT_ISINST logic into jiterpreter traces to create fast paths for exact matches/null pointers that avoid running all the type check machinery we normally would run. Based on my instrumentation these fast paths will be used ~30% of the time, and measurements show a 1-3% improvement for the json section of browser-bench (this is pretty hard to measure, I may try to construct a decent synthetic benchmark for this).
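As a rough illustration of the fast-path idea only (not the code in this PR), the sketch below checks for a null pointer and an exact class match before falling back to a slower check. The Klass, VTable and Object structs, the slow_is_instance helper and the slow_path_calls counter are hypothetical stand-ins, not Mono's real layouts or APIs, and the slow path here only walks a parent chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for runtime structures (assumptions). */
typedef struct Klass { const struct Klass *parent; } Klass;
typedef struct VTable { const Klass *klass; } VTable;
typedef struct Object { const VTable *vtable; } Object;

/* Counts how often the slow path runs, so the effect of the fast path is visible. */
static int slow_path_calls = 0;

/* Slow-path stand-in: walks the parent chain. A real runtime check also
 * has to handle interfaces, arrays and other special cases. */
static bool slow_is_instance(const Klass *actual, const Klass *target) {
    for (const Klass *k = actual; k != NULL; k = k->parent)
        if (k == target)
            return true;
    return false;
}

/* isinst-style check: returns obj when it is an instance of target, else NULL.
 * A null input produces a null result without running any type check. */
static const Object *isinst_sketch(const Object *obj, const Klass *target) {
    if (obj == NULL)
        return NULL;                      /* fast path: null pointer */
    if (obj->vtable->klass == target)
        return obj;                       /* fast path: exact match */
    slow_path_calls++;
    return slow_is_instance(obj->vtable->klass, target) ? obj : NULL;
}

/* castclass-style check: *ok is set to false where a runtime would throw. */
static const Object *castclass_sketch(const Object *obj, const Klass *target, bool *ok) {
    assert(ok != NULL);
    *ok = true;
    if (obj == NULL)
        return NULL;                      /* null passes through without throwing */
    if (obj->vtable->klass == target)
        return obj;                       /* fast path: exact match */
    slow_path_calls++;
    if (slow_is_instance(obj->vtable->klass, target))
        return obj;
    *ok = false;                          /* stand-in for an invalid cast error */
    return NULL;
}
```

In this sketch only a non-null object whose class differs from the target reaches slow_is_instance, which is the case the inlined checks are meant to make less frequent.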
This PR also applies zero page optimizations if available, by fusing the null check into the obj->vtable load. The helper code is all tweaked so that in the event a check fails due to a null ptr (null inputs are fairly uncommon in the instrumentation) it is converted into a success since MINT_CASTCLASS and MINT_ISINST both succeed for null instead of throwing.
It may be worth inlining the interface check entirely, but I haven't done the work to test that yet. The existence of special interfaces for arrays makes that harder to do.
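The fused null check can be pictured with a small model of linear memory in which address 0 is readable and the first page holds zeros. This is a sketch under assumptions, not the PR's implementation: the memory array, load_u32, VTABLE_OFFSET and both get_vtable functions are hypothetical names, and it assumes a live object never has a zero vtable pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated linear memory (assumption): statics are zero-initialized in C,
 * so reads near address 0 return 0, like a zeroed first page. */
#define MEM_WORDS 64
static uint32_t memory[MEM_WORDS];

/* Assumed layout: the vtable pointer is the first field of an object. */
#define VTABLE_OFFSET 0u

/* Reads a 32-bit word at a 4-byte-aligned address inside the simulated memory. */
static uint32_t load_u32(uint32_t addr) {
    return memory[addr / 4];
}

/* Unfused form: explicit null check, then the vtable load. */
static bool get_vtable_checked(uint32_t obj, uint32_t *vtable) {
    if (obj == 0)
        return false;
    *vtable = load_u32(obj + VTABLE_OFFSET);
    return true;
}

/* Fused form: always load. A null obj reads the zeroed page, so a zero
 * vtable doubles as the "obj was null" signal and one branch is saved. */
static bool get_vtable_fused(uint32_t obj, uint32_t *vtable) {
    *vtable = load_u32(obj + VTABLE_OFFSET);
    return *vtable != 0;
}
```

Both functions report the same outcome for null and non-null inputs in this model. The fused version simply defers the decision until after the load, and a caller can then treat the null case as a success rather than a failure, as the description above says.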
This PR also fully inlines the implementation of MINT_UNBOX.