Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for enum access in offset_of #120141

Open
1 of 3 tasks
GKFX opened this issue Jan 19, 2024 · 15 comments
Open
1 of 3 tasks

Tracking Issue for enum access in offset_of #120141

GKFX opened this issue Jan 19, 2024 · 15 comments
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. F-offset_of_enum `#![feature(offset_of_enum)]` I-lang-nominated Nominated for discussion during a lang team meeting. proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@GKFX
Copy link
Contributor

GKFX commented Jan 19, 2024

Feature gate: #![feature(offset_of_enum)]

This is a tracking issue for using enum variants in offset_of. Enum variants themselves do not have an offset within their type, so the macro will not give an offset for them, but their fields do. For example, the standard library uses the offset of the content of the Some variant of Option to implement Option::as_slice. The original RFC for offset_of was rust-lang/rfcs#3308.

Public API

pub macro offset_of($Container:ty, $($fields:expr)+ $(,)?) { ... }

const OFFSET: usize = offset_of!(Option<u32>, Some.0);

Steps / History

Unresolved Questions

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@GKFX GKFX added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Jan 19, 2024
@est31 est31 added the F-offset_of `#![feature(offset_of)]` label Jan 19, 2024
@fmease fmease added F-offset_of_enum `#![feature(offset_of_enum)]` and removed F-offset_of `#![feature(offset_of)]` labels Feb 17, 2024
@jamesmunns
Copy link
Member

Noting one potential limitation to offset_of with enums: without a way to set the discriminant, it cannot be used to manually initialize an enum.

Blog post explainer of what I mean: https://onevariable.com/blog/pods-from-scratch/

Zulip thread for discussion: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Prior.20discussions.20for.20manually.20building.20an.20enum.3F/near/441992612

@dingxiangfei2009
Copy link
Contributor

dingxiangfei2009 commented Jun 11, 2024

Stabilization Report - Abstract

offset_of! macro is a macro at expression location and it computes the usize offset of, possibly nested, fields in the source type. This report concerns the use of this macro to compute offsets into enums meaningfully, namely covering support tracked by the following two issues.

Feature Summary

The macro computes field offsets for both structs and enums, respecting field privacy.

This macro accepts two argument. The first is the source type that

// crate a
#![feature(offset_of_enum)]
#![feature(offset_of_nested)]

struct AStruct {
    pub field: u8,
    field2: u32,
}

enum AEnum {
    Variant1(u8),
    Variant2 {
        field: u8
    },
    Variant3(AStruct),
}

struct ANestingStruct(AStruct);

// readings as of Rust 1.78.0
// DO NOT EXPECT THESE ASSERTIONS TO WORK ACROSS VERSIONS
fn main() {
    assert_eq!(offset_of!(AStruct, field), 4);
    assert_eq!(offset_of!(AStruct, field2), 0);
    assert_eq!(offset_of!(AEnum, Variant2.field), 1);
    assert_eq!(offset_of!(AEnum, Variant3.0.field), 8);
    assert_eq!(offset_of!(ANestingStruct, 0.field), 4);
}

By respecting privacy, suppose that in another crate b

// crate b
#![feature(offset_of_enum)]
#![feature(offset_of_nested)]

fn main() {
    // these still work
    assert_eq!(offset_of!(AStruct, field), 4);
    assert_eq!(offset_of!(AEnum, Variant2.field), 1);
    assert_eq!(offset_of!(AEnum, Variant3.0.field), 8);
    // this does not compile because field2 is private here
    // assert_eq!(offset_of!(AStruct, field2), 0);
    // this does not compile because 0 is private here
    // assert_eq!(offset_of!(ANestingStruct, 0.field), 4);
}

Documentation

The rustdoc of our core library has been updated with code example enabled by these two feature gates.

Test cases and edge cases

The basic test cases for enums and nesting structures is covered by the following tests, including edge cases for parsing rule involving the field specifiction part of offset_of! macro calls.

Unresolved questions

There has been discussion around in-place construction of enum values and the possibility to know the location of discriminant with offset_of! and manipulate discriminant value. This is still under inception phase and will possibly be supported by a separate utility or interface for safety and better UX.

@dingxiangfei2009
Copy link
Contributor

I discovered that in feature(generic_const_exprs) the support for THIR + MIR OffsetOf is sort of half-baked.

Given that generic_const_exprs is incomplete, the only remaining concern is to use it meaningfully at anonymous constant context, or otherwise it should not block stabilization of it for use in const fn which is a viable substitute for a number of scenarios that I can think of. In any case, I am taking a small initiative to tackle it. 🤞

@Amanieu Amanieu added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Jun 17, 2024
@Amanieu
Copy link
Member

Amanieu commented Jun 17, 2024

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Jun 17, 2024

Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jun 17, 2024
@Amanieu
Copy link
Member

Amanieu commented Jun 17, 2024

There has been discussion around in-place construction of enum values and the possibility to know the location of discriminant with offset_of! and manipulate discriminant value. This is still under inception phase and will possibly be supported by a separate utility or interface for safety and better UX.

I don't think this is a blocker: the discriminant isn't encoded as a real field with an offset, so it can't be represented with offset_of anyways.

@joshtriplett
Copy link
Member

@rfcbot concern consistent-future-syntax

Given that this item (unlike nested fields) isn't as urgent, and that we don't have any syntax for this elsewhere in the language (unlike nested fields), I'd like to make sure that either:

  1. The syntax we choose here is consistent with any future syntax we might want to use elsewhere in the language, or
  2. We don't actually want to provide any syntax for this elsewhere in the language, so we're fine with inventing a syntax for this here.

(We currently have pattern syntax for extracting fields, but the syntax proposed for offset_of here is not pattern syntax. That doesn't necessarily mean it should be, but that means we're introducing a new syntax here and we should decide if that's the syntax we want.)

@GKFX
Copy link
Contributor Author

GKFX commented Jun 19, 2024

elsewhere in the language

The same syntax could be used for the proposed field_of! macro, RFC 3318. It could possibly be used for the proposed discriminant_of!, although this seems less necessary as supporting discriminant_of!(Enum, Variant) alone would likely be sufficient. A type_of!(Type, fields) has also been suggested.

Outside of reflection-related macros, the syntax o.Some.0 (for an Option o) implies to me accessing the Some variant of an enum without doing pattern matching. This would mostly likely be an unsafe operation (unless the user had somehow previously proved to the compiler that o is a Some). Unsafely accessing an enum variant can already be done with let Some(o) = o else { unreachable_unchecked() };, and is unusual enough that I doubt it will ever need a dedicated syntax. Alternatively, o.Some.0 could be understood to be in some way fallible, like a more concise alternative to pattern-matching, but again that seems unlikely to actually happen.

@joshtriplett joshtriplett added the I-lang-nominated Nominated for discussion during a lang team meeting. label Aug 6, 2024
@joshtriplett
Copy link
Member

Marking lang nominated to discuss how we want the future landscape to look here, and how we want to provide this functionality. (There's enough of a use case for this; the question is syntax.)

@jamesmunns
Copy link
Member

Hey @joshtriplett, is this scheduled for a meeting agenda? If possible, I'd love to observe that meeting! I'm still very interested in the ability to construct enums in-place for deserialization reasons

@joshtriplett
Copy link
Member

It'll be in an upcoming meeting, but we don't have it scheduled for a particular date. cc @traviscross who would have a better idea of timing.

@exrok
Copy link

exrok commented Sep 18, 2024

With feature(offset_of_enum) alone it's possible to construct enums in-place with atleast opt-level=1 (for dead store elimination).

At opt-level=0 there are extra copies to the stack & back but that's pretty normal for opt-level=0.

See: https://rust.godbolt.org/z/jz8qc1Pqv

Basically, we can emulate SetDiscriminant although with slightly different semantics.

In-place construction is done by initializing fields in-place using offset_of just like you would a struct then using something like the following:

/// initializes `dst` to be valid AnEnum::X containing the fields already initialized in `dst`
/// # Safety
/// - `dst` must be properly aligned pointing to an `AnEnum`
/// - `dst` must be valid for writes
/// -  The value pointed at by `dst` must have all fields of variant `X` initialized and valid.
unsafe fn set_discriminant_x(dst: NonNull<()>) {
    // All the reads that immediately write the same value to the same location
    // get eliminated with alteast `opt-level=1` leaving only the discriminant write, if any.
    let value = AnEnum::X {
        a: dst.byte_add(offset_of!(AnEnum, X.a)).cast::<u32>().read(),
        b: dst.byte_add(offset_of!(AnEnum, X.b)).cast::<u64>().read(),
        c: dst.byte_add(offset_of!(AnEnum, X.c)).cast::<Box<str>>().read(),
    }; 
    dst.cast::<AnEnum>().write(value);
}

enum AnEnum {
    X{ a: u32, b: u64, c: Box<str>  },
    Y([u8; 12]),
}

Note: Ideally a derive macro is generating the above function.

By using a function pointer you can set the discriminant in a type erased fashion as well.
Also, even with many fields (20) the optimization consistently occurs in my tests so far.
I believe this should resolve the limitation @jamesmunns faced, I think.

Of course, it would still be great to have dedicated support for manipulating discriminant.
However, I agree that the lack of such support should not block feature(offset_of_enum).

Full Zulip thread about the workaround: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/.E2.9C.94.20Workarounds.20for.20setting.20enum.20discriminants/near/470331454

@nikomatsakis
Copy link
Contributor

My take:

I don't know any other place in the language where we might want to use this kind of syntax, though I can imagine it (e.g., MIR has similar paths of this kind). If we want certainty, though, the only syntax that we can be SURE will not conflict with anything else would be to use pattern matching.

Have folks explored pattern matching and what that might look like? I was imagining something like "supply a pattern with exactly one binding" (e.g., offset_of!(path @ Some(x))).

@nikomatsakis
Copy link
Contributor

(Personally though I feel reasonably good about the proposed Variant.field syntax, and we could of course deprecate it in the future if it proves to be a mistake...)

@jamesmunns
Copy link
Member

jamesmunns commented Nov 5, 2024

Just confirming what @exrok said - I don't think "set discriminant" functionality should block this FCP, I'm working on a follow-up RFC for that, and that RFC would build on top of this one.

Edit: I've opened a PR for this RFC: rust-lang/rfcs#3727

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. F-offset_of_enum `#![feature(offset_of_enum)]` I-lang-nominated Nominated for discussion during a lang team meeting. proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants