Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miri engine: avoid having mutliple AllocId for the same static #61345

Closed
RalfJung opened this issue May 30, 2019 · 5 comments · Fixed by #116564
Closed

Miri engine: avoid having mutliple AllocId for the same static #61345

RalfJung opened this issue May 30, 2019 · 5 comments · Fixed by #116564
Labels
A-const-eval Area: Constant evaluation (MIR interpretation) A-miri Area: The miri tool C-cleanup Category: PRs that clean code up or issues documenting cleanup. C-enhancement Category: An issue proposing an enhancement or a PR with one. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

We currently actually assign two AllocId to every static/const: one pointing to the DefId (a "lazy" ID that can be created without evaluating anything), and one pointing to an actual Allocation (a "resolved ID" available only after it has been evaluated). Also see the comments added in #61278. The "resolved ID" should never be visible to other CTFE or Miri evaluations, because then we'd have two different IDs for the same allocation!

The second ID gets assigned when we intern the result of const evaluation. Or rather, it gets assigned and added to the local map of the CTFE engine when we allocate the return place for const evaluation, which is later used as the root for interning at which point it gets moved into the global tcx allocation map.

First of all, are we entirely sure that the stuff we intern will not use the resolved ID anywhere? Rust provides no "obvious" way to take the address of the return place, and I think that would be the only way the "resolved ID" could leak into the evaluated program. Still, can we make interning not intern the "root", just to be really sure that if that ID leaks somehow it will not cause problems? Or even better, can we not allocate a new ID for the return place and instead use the "original" ID? In the CTFE engine, that ID could map to an Allocation in the local memory map, even though it maps to a Static in the global tcx memory. By avoiding even assigning a second ID we'd avoid all problems!

If we cannot avoid assigning a second ID, we also have to make sure that other computations, when they request the content of a static, do not use the "resolved ID". Currently, the const_eval_raw query will return a RawConst pointing exactly to that "resolved ID". Maybe we should instead make it return a &'tcx Allocation? We don't really need the ID in there because whoever made the const_eval_raw query obviously had the DefId needed to do that and that's all it takes to lookup the "lazy" ID in the tcx! But we could still include it for convenience.

Cc @oli-obk

@Centril Centril added A-const-eval Area: Constant evaluation (MIR interpretation) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-miri Area: The miri tool C-cleanup Category: PRs that clean code up or issues documenting cleanup. C-enhancement Category: An issue proposing an enhancement or a PR with one. labels May 30, 2019
@RalfJung RalfJung changed the title Miri engine Miri engine: avoid having mutliple AllocId for the same static May 30, 2019
@RalfJung
Copy link
Member Author

A potentially related problem is that we create new allocations for constant slices every time eval_const_to_op gets executed.

@RalfJung
Copy link
Member Author

Similarly, LLVM codegen also creates new allocations for slices in from_const. That also seems strange.

@RalfJung
Copy link
Member Author

@oli-obk do #67000 and the other planned changes around statics change anything here?

@oli-obk
Copy link
Contributor

oli-obk commented Dec 23, 2019

I think we can do this, yes. Instead of having GlobalAlloc::Static we could just have this AllocId refer to the real allocation directly. Since we have a scheme to lazily fill in the allocation for an ID this should work out by temporarily also having the ID in the local alloc map. The only worry I have is that we'll be able to mutate a static's memory via its name in the initializer of said static. The mutates value will be overridden in the end though, so it's not observable except for the weird situation where you can write to it.

@RalfJung
Copy link
Member Author

With #99420, we have the same situation also for vtables -- they have a "symbolic" alloc ID and a "concrete" one that is backed by an actual &Allocation.

I think my preferred solution at this point would be to

  • make vtable_allocation return a &Allocation, so that we never have a concrete AllocId for vtables in the global alloc_map. However this will require significant changes to a lot of consumers of this function.
  • do something similar for statics: when evaluating a static, put the return place for that static at the AllocId that is symbolic in the global alloc_map -- we will then have a concrete allocation in the interpreter memory, but overlap between these two is fine (it happens all the time in Miri) and the local interpreter memory takes precedence. That avoids allocating a second AllocId for this static. It means that eval_static needs to return an &Allocation (with the concrete contents), not an AllocId -- does that make sense?

@oli-obk oli-obk added the E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. label Jul 20, 2022
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 14, 2024
…, r=RalfJung,cjgillot

Store static initializers in metadata instead of the MIR of statics.

This means that adding generic statics would be even more difficult, as we can't evaluate statics from other crates anymore, but the subtle issue I have encountered make me think that having this be an explicit problem is better.

The issue is that

```rust
static mut FOO: &mut u32 = &mut 42;
static mut BAR = unsafe { FOO };
```

gets different allocations, instead of referring to the same one. This is also true for non-static mut, but promotion makes `static FOO: &u32 = &42;` annoying to demo.

Fixes rust-lang#61345

## Why is this being done?

In order to ensure all crates see the same nested allocations (which is the last issue that needs fixing before we can stabilize [`const_mut_refs`](rust-lang#57349)), I am working on creating anonymous (from the Rust side, to LLVM it's like a regular static item) static items for the nested allocations in a static. If we evaluate the static item in a downstream crate again, we will end up duplicating its nested allocations (and in some cases, like the `match` case, even duplicate the main allocation).
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 14, 2024
…, r=RalfJung,cjgillot

Store static initializers in metadata instead of the MIR of statics.

This means that adding generic statics would be even more difficult, as we can't evaluate statics from other crates anymore, but the subtle issue I have encountered make me think that having this be an explicit problem is better.

The issue is that

```rust
static mut FOO: &mut u32 = &mut 42;
static mut BAR = unsafe { FOO };
```

gets different allocations, instead of referring to the same one. This is also true for non-static mut, but promotion makes `static FOO: &u32 = &42;` annoying to demo.

Fixes rust-lang#61345

## Why is this being done?

In order to ensure all crates see the same nested allocations (which is the last issue that needs fixing before we can stabilize [`const_mut_refs`](rust-lang#57349)), I am working on creating anonymous (from the Rust side, to LLVM it's like a regular static item) static items for the nested allocations in a static. If we evaluate the static item in a downstream crate again, we will end up duplicating its nested allocations (and in some cases, like the `match` case, even duplicate the main allocation).
@bors bors closed this as completed in 6a4222b Feb 15, 2024
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Feb 16, 2024
…ung,cjgillot

Store static initializers in metadata instead of the MIR of statics.

This means that adding generic statics would be even more difficult, as we can't evaluate statics from other crates anymore, but the subtle issue I have encountered make me think that having this be an explicit problem is better.

The issue is that

```rust
static mut FOO: &mut u32 = &mut 42;
static mut BAR = unsafe { FOO };
```

gets different allocations, instead of referring to the same one. This is also true for non-static mut, but promotion makes `static FOO: &u32 = &42;` annoying to demo.

Fixes rust-lang/rust#61345

## Why is this being done?

In order to ensure all crates see the same nested allocations (which is the last issue that needs fixing before we can stabilize [`const_mut_refs`](rust-lang/rust#57349)), I am working on creating anonymous (from the Rust side, to LLVM it's like a regular static item) static items for the nested allocations in a static. If we evaluate the static item in a downstream crate again, we will end up duplicating its nested allocations (and in some cases, like the `match` case, even duplicate the main allocation).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-const-eval Area: Constant evaluation (MIR interpretation) A-miri Area: The miri tool C-cleanup Category: PRs that clean code up or issues documenting cleanup. C-enhancement Category: An issue proposing an enhancement or a PR with one. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants