Skip to content

perf(allocator/vec2): resolve performance regression for extend by marking reserve as #[cold] and #[inline(never)]#10675

Merged
graphite-app[bot] merged 1 commit intomainfrom
04-28-perf_allocator_vec2_resolve_performance_regression_for_extend
May 3, 2025
Merged

perf(allocator/vec2): resolve performance regression for extend by marking reserve as #[cold] and #[inline(never)]#10675
graphite-app[bot] merged 1 commit intomainfrom
04-28-perf_allocator_vec2_resolve_performance_regression_for_extend

Conversation

@Dunqing
Copy link
Member

@Dunqing Dunqing commented Apr 28, 2025

I guess the performance regression reason is that the current implementation has more instructions than before. Here to use the lower of size_hint to reserve space, which is bloating the loop body. Also, the for loop is easier to optimize by the compiler. reserve inside extend is rarely taken, so mark it as #[cold] and #[inline(never)], which can reduce the instructions in while loop. We got a 3%-4% performance improvement in the minfier, but the transformer performance did not fully get back to before #10670.

Anyway, I think we can accept the less than 1% performance regression; this change can unblock us from pushing forward the Vec improvement; we will get it back in at the end of the stack! See #9856

Copy link
Member Author

Dunqing commented Apr 28, 2025

@github-actions github-actions bot added the C-performance Category - Solution not expected to change functional behavior, only performance label Apr 28, 2025
@codspeed-hq
Copy link

codspeed-hq bot commented Apr 28, 2025

CodSpeed Instrumentation Performance Report

Merging #10675 will create unknown performance changes

Comparing 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend (b4953b4) with main (6ce3bbb)

Summary

🆕 36 new benchmarks

Benchmarks breakdown

Benchmark BASE HEAD Change
🆕 codegen[checker.ts] N/A 22.2 ms N/A
🆕 codegen_sourcemap[checker.ts] N/A 65.2 ms N/A
🆕 formatter[antd.js] N/A 715.8 ms N/A
🆕 formatter[react.development.js] N/A 8.1 ms N/A
🆕 formatter[typescript.js] N/A 1.1 s N/A
🆕 isolated-declarations[vue-id.ts] N/A 58.5 ms N/A
🆕 lexer[RadixUIAdoptionSection.jsx] N/A 21.5 µs N/A
🆕 lexer[antd.js] N/A 24.1 ms N/A
🆕 lexer[cal.com.tsx] N/A 5.8 ms N/A
🆕 lexer[checker.ts] N/A 14.4 ms N/A
🆕 lexer[pdf.mjs] N/A 3.9 ms N/A
🆕 linter[RadixUIAdoptionSection.jsx] N/A 2.7 ms N/A
🆕 linter[cal.com.tsx] N/A 1.2 s N/A
🆕 linter[checker.ts] N/A 3.1 s N/A
🆕 mangler[antd.js] N/A 16 ms N/A
🆕 mangler[react.development.js] N/A 294.5 µs N/A
🆕 mangler[typescript.js] N/A 39.6 ms N/A
🆕 minifier[antd.js] N/A 159.4 ms N/A
🆕 minifier[react.development.js] N/A 1.8 ms N/A
🆕 minifier[typescript.js] N/A 279.3 ms N/A
... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

@Dunqing Dunqing force-pushed the 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend branch 2 times, most recently from 1a7091a to f8fcce8 Compare April 29, 2025 02:01
@Dunqing Dunqing changed the title perf(allocator/vec2): resolve performance regression for extend perf(allocator/vec2): resolve performance regression for extend by marking reserve as #[cold] and #[inline(never)] Apr 29, 2025
@Dunqing Dunqing changed the base branch from 04-28-feat_allocator_vec2_introduce_extend_desugared_method_as_extend_internal_implementation to graphite-base/10675 April 29, 2025 02:38
@Dunqing Dunqing force-pushed the graphite-base/10675 branch from 1e4bc1b to ef784ca Compare April 29, 2025 02:43
@Dunqing Dunqing force-pushed the 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend branch from f8fcce8 to 4769723 Compare April 29, 2025 02:43
@Dunqing Dunqing changed the base branch from graphite-base/10675 to 04-28-feat_allocator_vec2_introduce_extend_desugared_method_as_extend_internal_implementation April 29, 2025 02:43
@Dunqing Dunqing force-pushed the 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend branch 2 times, most recently from 01c3a4d to c70f032 Compare April 29, 2025 02:58
@Dunqing Dunqing marked this pull request as ready for review April 29, 2025 09:34
@graphite-app graphite-app bot force-pushed the 04-28-feat_allocator_vec2_introduce_extend_desugared_method_as_extend_internal_implementation branch from ef784ca to e64c52c Compare April 29, 2025 16:10
@graphite-app graphite-app bot requested a review from overlookmotel as a code owner April 29, 2025 16:10
@graphite-app graphite-app bot force-pushed the 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend branch from c70f032 to 04757c5 Compare April 29, 2025 16:10
@graphite-app graphite-app bot force-pushed the 04-28-feat_allocator_vec2_introduce_extend_desugared_method_as_extend_internal_implementation branch 2 times, most recently from 11d0723 to a3d2995 Compare April 30, 2025 00:09
@graphite-app graphite-app bot force-pushed the 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend branch from 04757c5 to b56f8e1 Compare April 30, 2025 00:10
@Dunqing Dunqing force-pushed the 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend branch from b56f8e1 to 67bd57f Compare April 30, 2025 00:35
@Dunqing Dunqing changed the base branch from 04-28-feat_allocator_vec2_introduce_extend_desugared_method_as_extend_internal_implementation to graphite-base/10675 April 30, 2025 00:50
@Dunqing Dunqing force-pushed the graphite-base/10675 branch from a3d2995 to 9db9639 Compare April 30, 2025 01:28
@Dunqing Dunqing force-pushed the 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend branch from 67bd57f to 3b1f7b7 Compare April 30, 2025 01:28
@Dunqing Dunqing changed the base branch from graphite-base/10675 to 04-28-feat_allocator_vec2_introduce_extend_desugared_method_as_extend_internal_implementation April 30, 2025 01:28
@Dunqing Dunqing force-pushed the 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend branch from 3b1f7b7 to 4682891 Compare April 30, 2025 01:35
@graphite-app graphite-app bot added the 0-merge Merge with Graphite Merge Queue label May 3, 2025
@graphite-app
Copy link
Contributor

graphite-app bot commented May 3, 2025

Merge activity

…marking reserve as `#[cold]` and `#[inline(never)]` (#10675)

I guess the performance regression reason is that the current implementation has more instructions than before. Here to use the lower of `size_hint` to reserve space, which is bloating the loop body. Also, the `for` loop is easier to optimize by the compiler.  `reserve` inside `extend` is rarely taken, so mark it as `#[cold]` and `#[inline(never)]`, which can reduce the instructions in `while` loop. We got a 3%-4% performance improvement in the `minfier`, but the transformer performance did not fully get back to before #10670.

Anyway, I think we can accept the less than 1% performance regression; this change can unblock us from pushing forward the `Vec` improvement; we will get it back in at the end of the stack! See #9856
graphite-app bot pushed a commit that referenced this pull request May 3, 2025
… internal implementation (#10670)

Align `extend` implementation with the standard library, since the `ExtendTrusted` is still unstable, so I haven't introduced the `extend_trusted` yet. Some of the benchmark performance regression has been almost fixed in #10675.
@graphite-app graphite-app bot force-pushed the 04-28-feat_allocator_vec2_introduce_extend_desugared_method_as_extend_internal_implementation branch from 9db9639 to 6ce3bbb Compare May 3, 2025 13:03
@graphite-app graphite-app bot force-pushed the 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend branch from 4682891 to b4953b4 Compare May 3, 2025 13:03
Base automatically changed from 04-28-feat_allocator_vec2_introduce_extend_desugared_method_as_extend_internal_implementation to main May 3, 2025 13:09
@graphite-app graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label May 3, 2025
@graphite-app graphite-app bot merged commit b4953b4 into main May 3, 2025
28 checks passed
@graphite-app graphite-app bot deleted the 04-28-perf_allocator_vec2_resolve_performance_regression_for_extend branch May 3, 2025 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C-performance Category - Solution not expected to change functional behavior, only performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants