Skip to content

Comments

refactor(allocator/vec2): move len field from Vec to RawVec#10883

Merged
graphite-app[bot] merged 1 commit intomainfrom
05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec
May 15, 2025
Merged

refactor(allocator/vec2): move len field from Vec to RawVec#10883
graphite-app[bot] merged 1 commit intomainfrom
05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec

Conversation

@Dunqing
Copy link
Member

@Dunqing Dunqing commented May 8, 2025

Prepare for #9706.

Move len field from Vec to RawVec to adapt upcoming change that reduces len and cap fields from usize to u32 so that the Vec size can reduce from 32 to 24 without padding.

Marked Vec as #[repr(transparent)] and RawVec as #[repr(C)] to easier to solve raw transfer test failures because the fields are no longer reordered, we can statically determine the offsets, as mentioned in that comment.

Copy link
Member Author

Dunqing commented May 8, 2025


How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@github-actions github-actions bot added the C-cleanup Category - technical debt or refactoring. Solution not expected to change behavior label May 8, 2025
@Dunqing Dunqing force-pushed the 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec branch from 882a193 to b762280 Compare May 8, 2025 08:25
@codspeed-hq
Copy link

codspeed-hq bot commented May 8, 2025

CodSpeed Instrumentation Performance Report

Merging #10883 will not alter performance

Comparing 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec (7d54577) with main (b9e51e2)

Summary

✅ 36 untouched benchmarks

@overlookmotel
Copy link
Member

overlookmotel commented May 8, 2025

Now I understand what you're doing.

However, it's rather complicated!

I expect the perf regression is coming from the &mut references in RawVec. You'd expect that wouldn't cost anything if RawVec's methods were inlined into Vec's methods, but they're probably not always (and you probably don't want them always to be).

It's tricky, because we ideally want to maintain the separation between Vec and RawVec, but we can't when len and capacity fields are u32, without ending up with padding bytes in both RawVec and Vec - so Vec remains 32 bytes.

I can see 3 alternative approaches:

  1. Merge Vec and RawVec by moving all RawVec's fields and methods into Vec.
  2. Move len field into RawVec.
  3. Use #[repr] attribute hacks to avoid having to move anything.

#[repr] approach

This is what that option looks like:

#[repr(C)]
struct Vec<'a> {
    _align: [usize; 0],
    buf: RawVec<'a>,
    len: u32,
}

#[repr(C, packed)]
struct RawVec<'a> {
    ptr: NonNull<u8>,
    allocator: &'a Allocator,
    capacity: u32,
}

#[cfg(target_pointer_width = "64")]
const _: () = {
    use std::mem::offset_of;

    assert!(size_of::<RawVec>() == 20); // Not 24 due to `#[repr(packed)]`
    assert!(align_of::<RawVec>() == 1); // Due to `#[repr(packed)]`
    assert!(offset_of!(RawVec, ptr) == 0);
    assert!(offset_of!(RawVec, allocator) == 8);
    assert!(offset_of!(RawVec, capacity) == 16);

    assert!(size_of::<Vec>() == 24);
    assert!(align_of::<Vec>() == 8); // Due to `_align: [usize; 0]`
    assert!(offset_of!(Vec, buf) == 0);
    assert!(offset_of!(Vec, len) == 20);
};
  • #[repr(packed)] makes RawVec act like a [u8; 20], aligned on 1. So it doesn't contain any padding bytes.
  • RawVec and len: u32 can therefore be combined together without any padding bytes (like if it was ([u8; 20], u32)).
  • #[repr(C)] on both Vec and RawVec "locks" the field orders to what we've specified.
  • _align: [usize; 0] in Vec is a zero-sized type which ensures Vec is pointer aligned i.e. aligned on 8 on 64-bit systems.

Because we've carefully made sure every field is laid out where we want it, and the whole of Vec is aligned on 8, that ensures all the fields are correctly aligned for their types.

00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23
<------------------------ RawVec -------------------------> <-- len -->
<-------- ptr --------> <----- allocator -----> <-- cap -->

Note: capacity field has to be last in RawVec. If field order was ptr, capacity, allocator, then allocator field would be out of alignment. Ditto len field must be last in Vec.

I think with these #[repr] attributes we don't have to move any methods or fields between Vec and RawVec.

The downside is it's hacky and dangerous! We MUST NOT perform any operations on a RawVec unless it's wrapped in a Vec, or it's instant undefined behavior, because the fields will be out of alignment.

What to do?

If our aim at this point is to get a fairly accurate idea of what perf gain is from switching to u32, maybe it's fine to take this hacky route for now.

If we do go ahead with u32 (which I hope we will), then we could later on switch to one of the other less hacky options.

What do you think?

@overlookmotel
Copy link
Member

overlookmotel commented May 8, 2025

On second thoughts, forget everything I've said about #[repr(packed)] etc!

Probably just moving len field into RawVec is the simplest option.

If len is pub(super), it can still be accessed from Vec, so very little other code would need to change, and we avoid potential UB and #[repr] hackery. We also avoid the &mut references in the implementation in this PR.

@Boshen
Copy link
Member

Boshen commented May 9, 2025

We should add a assert_eq!(std::mem::size_of::<T>(), 32); to these types.

@Dunqing Dunqing changed the base branch from 05-08-feat_allocator_vec2_introduce_cap_nonmaxu32_for_cap_field to graphite-base/10883 May 12, 2025 05:35
@Dunqing Dunqing force-pushed the graphite-base/10883 branch from b6ac6d6 to bed3f47 Compare May 12, 2025 05:35
@Dunqing Dunqing force-pushed the 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec branch from b762280 to b0b67b2 Compare May 12, 2025 05:35
@Dunqing Dunqing changed the base branch from graphite-base/10883 to main May 12, 2025 05:36
@Dunqing Dunqing force-pushed the 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec branch 3 times, most recently from d9f3c2a to c6b2c72 Compare May 14, 2025 06:56
@github-actions github-actions bot added the A-ast-tools Area - AST tools label May 14, 2025
@Dunqing Dunqing changed the title refactor(allocator/vec2): move RawVec fields to Vec refactor(allocator/vec2): move len field from Vec to RawVec May 14, 2025
@Dunqing Dunqing marked this pull request as ready for review May 14, 2025 07:21
Copilot AI review requested due to automatic review settings May 14, 2025 07:21
@Dunqing Dunqing requested a review from overlookmotel as a code owner May 14, 2025 07:21
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the memory layout of the allocator’s Vec and RawVec types in preparation for reducing field sizes. The changes include updating field offsets in the raw transfer generator, moving the len field from Vec to RawVec with corresponding constructor updates, and adjusting Vec methods to reference the new buf.len field.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

File Description
tasks/ast_tools/src/generators/raw_transfer.rs Updated the Vec field offset constant and comment to reflect the new layout.
crates/oxc_allocator/src/vec2/raw_vec.rs Added the len field to RawVec and updated its constructors.
crates/oxc_allocator/src/vec2/mod.rs Removed the len field from Vec and modified methods to use buf.len.
Comments suppressed due to low confidence (1)

tasks/ast_tools/src/generators/raw_transfer.rs:30

  • Ensure that the new VEC_LEN_FIELD_OFFSET value is valid on all targeted platforms given the revised memory layout of Vec and RawVec; adding a clarifying comment about this assumption might be helpful.
const VEC_LEN_FIELD_OFFSET: usize = 8;

@Dunqing
Copy link
Member Author

Dunqing commented May 14, 2025

We should add a assert_eq!(std::mem::size_of::<T>(), 32); to these types.

All of generated/assert_layout.rs is already included this assert actually.

@Dunqing Dunqing force-pushed the 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec branch from c6b2c72 to 103e4e9 Compare May 14, 2025 07:39
@overlookmotel
Copy link
Member

overlookmotel commented May 14, 2025

There should be no perf impact from moving len field into RawVec in itself. I assume the perf difference is that field order changes in the process.

But to test that assumption (and make sure we're not getting false results from CodSpeed on either/both of these PRs), I suggest re-ordering the fields of RawVec in this PR as:

#[repr(C)]
pub struct RawVec<'a, T> {
    ptr: NonNull<T>,
    a: &'a Bump,
    cap: usize,
    pub(super) len: usize,
}

This is the same order the fields were in before this PR (with default #[repr(Rust)], compiler moves fields with niches e.g. &T to be before fields without niches e.g. usize). I checked that's correct by adding this code (on main):

const _: () = {
    use std::mem::offset_of;
    assert!(size_of::<Vec<u8>>() == 32);
    assert!(align_of::<Vec<u8>>() == 8);
    assert!(offset_of!(Vec<u8>, buf) == 0);
    assert!(offset_of!(Vec<u8>, len) == 24);
};

const _: () = {
    use std::mem::offset_of;
    assert!(size_of::<RawVec<u8>>() == 24);
    assert!(align_of::<RawVec<u8>>() == 8);
    assert!(offset_of!(RawVec<u8>, ptr) == 0);
    assert!(offset_of!(RawVec<u8>, a) == 8);
    assert!(offset_of!(RawVec<u8>, cap) == 16);
};

Those assertions all pass on main branch. If you re-arrange RawVec's fields as suggested above, they should still pass on this PR.

After that change, this PR should have zero perf impact. If it still does, something is wrong.

@Dunqing Dunqing force-pushed the 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec branch 2 times, most recently from 85a8b52 to 4b8e48e Compare May 14, 2025 09:27
@Dunqing Dunqing force-pushed the 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec branch from 4b8e48e to 085273f Compare May 14, 2025 10:03
@overlookmotel overlookmotel force-pushed the 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec branch from 085273f to 91cad19 Compare May 14, 2025 19:20
@overlookmotel
Copy link
Member

CI got stuck (as usual!). I've rebased on main, fixed the raw transfer test failure, and force pushed (ditto for #10884).

@overlookmotel overlookmotel added the 0-merge Merge with Graphite Merge Queue label May 15, 2025
Copy link
Member

overlookmotel commented May 15, 2025

Merge activity

…0883)

Prepare for #9706.

Move `len` field from `Vec` to `RawVec` to adapt upcoming change that reduces `len` and `cap` fields from `usize` to `u32` so that the `Vec` size can reduce from `32` to `24` without padding.

Marked `Vec` as `#[repr(transparent)]` and `RawVec` as `#[repr(C)]` to easier to solve raw transfer test failures because the fields are no longer reordered, we can statically determine the offsets, as mentioned in that comment.
@graphite-app graphite-app bot force-pushed the 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec branch from 91cad19 to 7d54577 Compare May 15, 2025 21:49
@graphite-app graphite-app bot merged commit 7d54577 into main May 15, 2025
26 checks passed
@graphite-app graphite-app bot deleted the 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec branch May 15, 2025 21:57
@graphite-app graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label May 15, 2025
@Dunqing Dunqing mentioned this pull request Jan 18, 2026
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-ast-tools Area - AST tools C-cleanup Category - technical debt or refactoring. Solution not expected to change behavior

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants