perf(codegen): optimize sourcemap builder to reduce allocations #13670

Merged
graphite-app[bot] merged 1 commit into main from perf/optimize-sourcemap-builder
Sep 11, 2025

Boshen (Member) commented Sep 11, 2025

Summary

This PR optimizes the sourcemap builder to reduce memory allocations and improve performance. Profiling showed that update_generated_line_and_column and add_source_mapping were performance bottlenecks.

Changes Made

Optimization 4: Optimize Line Offset Table Generation

  • Pre-allocate columns vector with capacity 256 to avoid frequent reallocations
  • Replace columns.clone().into_boxed_slice() with std::mem::take(&mut columns).into_boxed_slice() to avoid unnecessary cloning
  • Reserve capacity after taking the vector to maintain performance for subsequent lines
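The three bullets above can be sketched roughly as follows. This is a simplified illustration, not the actual `SourcemapBuilder` code: `LineEntry`, `build_line_entries`, and the per-char column layout are hypothetical names and assumptions chosen to keep the example short.

```rust
/// Hypothetical stand-in for one entry of the line offset table.
pub struct LineEntry {
    /// UTF-16 column of each char, stored only for lines containing non-ASCII text.
    pub columns: Option<Box<[u32]>>,
}

/// Simplified sketch of the `mem::take` approach (assumed shape, not the real code).
pub fn build_line_entries(source: &str) -> Vec<LineEntry> {
    let mut entries = Vec::new();
    // Pre-allocate the scratch buffer once, up front.
    let mut columns: Vec<u32> = Vec::with_capacity(256);

    for line in source.split('\n') {
        if line.is_ascii() {
            // ASCII-only lines need no column table.
            entries.push(LineEntry { columns: None });
            continue;
        }
        let mut column = 0u32;
        for ch in line.chars() {
            columns.push(column);
            column += ch.len_utf16() as u32;
        }
        // Move the buffer out instead of cloning it.
        // `columns` is left as an empty Vec with no capacity.
        let taken = std::mem::take(&mut columns);
        entries.push(LineEntry { columns: Some(taken.into_boxed_slice()) });
        // Restore capacity for the next non-ASCII line.
        columns.reserve(256);
    }
    entries
}
```

Calling `build_line_entries("abc\nhéllo")` under these assumptions yields one entry without a column table and one entry holding the columns of `héllo`.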

Performance Impact

These changes reduce memory allocations when generating sourcemaps, especially for files with Unicode characters. The clone() operation was creating unnecessary copies of potentially large vectors on every Unicode line, which is now eliminated.

Future Optimizations

Additional optimizations from the analysis that could be implemented in follow-up PRs:

  1. Use SIMD-accelerated line break detection with memchr
  2. Optimize UTF-16 column calculation to avoid iterator allocation
  3. Add fast path for sequential token processing
  4. Inline hot functions with #[inline(always)]
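As an illustration of item 2 only, here is one possible way to count UTF-16 code units directly from UTF-8 bytes without decoding each `char`. It is an assumption about what such an optimization might look like, not code from this PR or from the analysis it mentions; `utf16_len` is a hypothetical name.

```rust
/// Counts the UTF-16 code units needed to encode `text`,
/// working on raw UTF-8 bytes instead of decoding chars.
pub fn utf16_len(text: &str) -> usize {
    text.bytes()
        .map(|b| {
            // Every byte that is not a continuation byte (0b10xx_xxxx) starts a char.
            let starts_char = (b & 0xC0) != 0x80;
            // Lead bytes of 4-byte sequences (0xF0 and above) encode chars outside
            // the BMP, which take a surrogate pair (2 units) in UTF-16.
            let is_four_byte_lead = b >= 0xF0;
            usize::from(starts_char) + usize::from(is_four_byte_lead)
        })
        .sum()
}
```

The result should match `text.encode_utf16().count()` for any valid `&str`, which makes that call a convenient reference when testing a byte-level version like this one.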

Test Plan

  • All existing tests pass
  • No functional changes, only performance optimizations
  • Verified that sourcemap generation still works correctly

🤖 Generated with Claude Code

graphite-app bot (Contributor) commented Sep 11, 2025

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

github-actions bot added the labels A-codegen (Area - Code Generation) and C-performance (Category - Solution not expected to change functional behavior, only performance) on Sep 11, 2025

codspeed-hq bot commented Sep 11, 2025

CodSpeed Instrumentation Performance Report

Merging #13670 will not alter performance

Comparing perf/optimize-sourcemap-builder (b35bf30) with main (fb9d0f4) [1]

Summary

✅ 37 untouched benchmarks

Footnotes

  1. No successful run was found on main (b35bf30) during the generation of this report, so fb9d0f4 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

Boshen added the 0-merge (Merge with Graphite Merge Queue) label on Sep 11, 2025

Boshen (Member, Author) commented Sep 11, 2025

Merge activity

graphite-app bot force-pushed the perf/optimize-sourcemap-builder branch from dd78d91 to b35bf30 on September 11, 2025 05:28
graphite-app bot merged commit b35bf30 into main on Sep 11, 2025 (25 checks passed)
graphite-app bot deleted the perf/optimize-sourcemap-builder branch on September 11, 2025 05:34
graphite-app bot removed the 0-merge (Merge with Graphite Merge Queue) label on Sep 11, 2025
graphite-app bot pushed a commit that referenced this pull request Sep 11, 2025
Follow-on after #13670.

That PR made 2 changes:

1. Pre-allocating capacity in `columns` `Vec`.
2. `mem::take`-ing `columns` for each line, rather than re-using it.

In my opinion, the 1st change is good, but the 2nd is not.

Revert the usage of `mem::take`, and add a comment explaining why the `.clone()` is not as bad as it looks!

Before this PR: `columns` will likely have spare capacity, so `mem::take(&mut columns).into_boxed_slice()` will perform a reallocation to drop the excess capacity, and then `columns.reserve(256)` performs a 2nd allocation.

Approach after this PR:

1. `columns.clone().into_boxed_slice()` performs 1 allocation, and copies the data from `columns` into this new allocation. `columns.clear()` does not perform any reallocation, and is a very cheap operation.

2. `columns` `Vec` is reused over and over, and will grow adaptively depending on how heavy the file's use of unicode chars is, rather than always going back to estimated max capacity of 256.

Benchmarks show a very small positive difference.
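
The difference between the two approaches can be sketched side by side. This is a minimal illustration under stated assumptions, not the actual `SourcemapBuilder` code: `finish_line_with_take` and `finish_line_with_clone` are hypothetical helper names, and `columns` is assumed to be a `Vec<u32>`.

```rust
/// Approach from #13670: move the buffer out, then reserve again.
pub fn finish_line_with_take(columns: &mut Vec<u32>) -> Box<[u32]> {
    // `into_boxed_slice` drops excess capacity, which may reallocate
    // when the Vec has spare capacity.
    let boxed = std::mem::take(columns).into_boxed_slice();
    // `columns` is now empty with no capacity, so this allocates again.
    columns.reserve(256);
    boxed
}

/// Approach after the revert: clone into a fresh allocation, then clear.
pub fn finish_line_with_clone(columns: &mut Vec<u32>) -> Box<[u32]> {
    // The clone is allocated to fit the current length (in practice with
    // no spare capacity), so boxing it should not need to shrink anything.
    let boxed = columns.clone().into_boxed_slice();
    // `clear` keeps the existing allocation, so whatever capacity the
    // buffer has grown to is reused for the next line.
    columns.clear();
    boxed
}
```

With the clone-based helper, a buffer that has grown to hold a long non-ASCII line keeps that capacity for later lines, which is the adaptive behavior point 2 describes.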
graphite-app bot pushed a commit that referenced this pull request Sep 11, 2025
#13670 optimized `SourcemapBuilder` to pre-reserve capacity in `columns` `Vec`.

However, after #13677, `columns` will resize adaptively depending on how many Unicode chars are in the source text. So the initial capacity of 256 (1 KiB) is now probably excessive for most cases. Reduce it to 16 (64 bytes, which is one CPU cache line).

Codspeed shows little change in perf (max +0.1%, min -0.1%), and memory usage will definitely be reduced.
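
The byte figures quoted above follow from simple arithmetic, assuming `columns` holds 4-byte `u32` elements (which the 256 = 1 KiB figure implies). A small sketch, with `capacity_bytes` and `new_columns_buffer` as hypothetical helper names:

```rust
/// Bytes needed to hold `elements` values of type `T` contiguously.
pub fn capacity_bytes<T>(elements: usize) -> usize {
    elements * std::mem::size_of::<T>()
}

/// Scratch buffer with the reduced initial capacity.
/// It still grows on demand if a line needs more than 16 entries.
pub fn new_columns_buffer() -> Vec<u32> {
    Vec::with_capacity(16)
}
```

Here `capacity_bytes::<u32>(256)` is 1024 and `capacity_bytes::<u32>(16)` is 64, matching the 1 KiB and 64-byte figures in the commit message.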