Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store static initializers in metadata instead of the MIR of statics. #116564

Merged
merged 6 commits into from
Feb 15, 2024

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Oct 9, 2023

This means that adding generic statics would be even more difficult, as we can't evaluate statics from other crates anymore, but the subtle issue I have encountered make me think that having this be an explicit problem is better.

The issue is that

static mut FOO: &mut u32 = &mut 42;
static mut BAR = unsafe { FOO };

gets different allocations, instead of referring to the same one. This is also true for non-static mut, but promotion makes static FOO: &u32 = &42; annoying to demo.

Fixes #61345

Why is this being done?

In order to ensure all crates see the same nested allocations (which is the last issue that needs fixing before we can stabilize const_mut_refs), I am working on creating anonymous (from the Rust side, to LLVM it's like a regular static item) static items for the nested allocations in a static. If we evaluate the static item in a downstream crate again, we will end up duplicating its nested allocations (and in some cases, like the match case, even duplicate the main allocation).

@rustbot
Copy link
Collaborator

rustbot commented Oct 9, 2023

r? @b-naber

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 9, 2023
@rustbot
Copy link
Collaborator

rustbot commented Oct 9, 2023

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

@oli-obk
Copy link
Contributor Author

oli-obk commented Oct 9, 2023

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 9, 2023
@bors
Copy link
Contributor

bors commented Oct 9, 2023

⌛ Trying commit 7a7456d with merge ca76eb8...

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 9, 2023
…, r=<try>

Store static initializers in metadata instead of the MIR of statics.

This means that adding generic statics would be even more difficult, as we can't evaluate statics from other crates anymore, but the subtle issues I have encountered make me think that having this be an explicit problem is better. Said issues are:

### Nested allocations of static items get duplicated

which leads to issues for statics like

```rust
static mut FOO: &mut u32 = &mut 42;
static mut BAR = unsafe { FOO };
```

getting different allocations, instead of referring to the same one. This is also true for non-static mut, but promotion makes `static FOO: &u32 = &42;` annoying to demo.

### The main allocation of a static gets duplicated across crates

```rust
// crate a

static mut FOO: Option<u32> = Some(42);

// crate b

static mut BAR: &mut u32 = unsafe {
    match &mut a::FOO {
        Some(x) => x,
        None => panic!(),
    }
};
```

## Why is this being done?

In order to ensure all crates see the same nested allocations (which is the last issue that needs fixing before we can stabilize [`const_mut_refs`](rust-lang#57349)), I am working on creating anonymous (from the Rust side, to LLVM it's like a regular static item) static items for the nested allocations in a static. If we evaluate the static item in a downstream crate again, we will end up duplicating its nested allocations (and in some cases, like the `match` case, even duplicate the main allocation).
@bors
Copy link
Contributor

bors commented Oct 9, 2023

☀️ Try build successful - checks-actions
Build commit: ca76eb8 (ca76eb8fbbeb4438f79e87e95fe9f1b627093390)

@rust-timer

This comment has been minimized.

@RalfJung
Copy link
Member

RalfJung commented Oct 9, 2023

The main allocation of a static gets duplicated across crates

Is that something that happens today or happens in your branch?
If that happens today that sounds like a bad bug we need to fix, though I don't quite understand how this could even happen.

trace!("eval_to_allocation: Need to compute {:?}", gid);
let raw_const = self.eval_to_allocation_raw(param_env.and(gid))?;
Ok(self.global_alloc(raw_const.alloc_id).unwrap_memory())
let alloc_id = self.eval_static_initializer_raw(def_id)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling eval_to_allocation to eval a static would be wrong now, right?

Should we hide eval_to_allocation better? It should probably only be called internally by the eval_const queries now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm adding asserts right now 😆 CI is failing because of "mis-use" (after this PR it would be mis-use) of the other queries for statics

@oli-obk
Copy link
Contributor Author

oli-obk commented Oct 9, 2023

If that happens today that sounds like a bad bug we need to fix, though I don't quite understand how this could even happen.

I'm currently trying to figure it out, but it needs a multi-crate test

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (ca76eb8): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
11.3% [0.4%, 81.6%] 11
Improvements ✅
(primary)
-0.4% [-1.2%, -0.2%] 10
Improvements ✅
(secondary)
-3.8% [-9.4%, -0.1%] 12
All ❌✅ (primary) -0.4% [-1.2%, -0.2%] 10

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
4.0% [4.0%, 4.0%] 1
Regressions ❌
(secondary)
10.2% [0.5%, 74.6%] 14
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.3% [-3.7%, -0.9%] 6
All ❌✅ (primary) 4.0% [4.0%, 4.0%] 1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
18.4% [2.0%, 70.5%] 8
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-6.8% [-14.3%, -3.3%] 9
All ❌✅ (primary) - - 0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.1% [0.0%, 0.5%] 70
Regressions ❌
(secondary)
64.6% [0.0%, 107.8%] 10
Improvements ✅
(primary)
-2.5% [-6.5%, -0.1%] 16
Improvements ✅
(secondary)
-76.3% [-77.3%, -75.5%] 6
All ❌✅ (primary) -0.3% [-6.5%, 0.5%] 86

Bootstrap: 626.784s -> 625.386s (-0.22%)
Artifact size: 270.86 MiB -> 270.89 MiB (0.01%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Oct 9, 2023
@oli-obk
Copy link
Contributor Author

oli-obk commented Oct 9, 2023

Ok, so I think it can't happen, but bugs in the interpreter could make it happen. Basically the reason it can't happen is that we always use the static's alloc id that gives us a GlobalAlloc::Static, and only transiently obtain the static's "memory alloc id" by evaluating the static. Most of the time, we immediately turn an AllocId obtained through such means into a ConstAllocation, so

  • eval_to_allocation (irrelevant, hides the AllocId from the user
  • eval_to_valtree (irrelevant, can't be used for statics)
  • InterpCx::eval_global (irrelevantish, only used by miri's eval_path_scalar, should probably simplify this)

are all fine.

The only potential issue came from eval_to_const_value, which is used in mir::Const::eval to evaluate unevaluated constants, but we never store static items this way in MIR, so it should never happen. I added an assertion for this and will create a PR just for that assertion.

@oli-obk
Copy link
Contributor Author

oli-obk commented Oct 9, 2023

oof that's a harsh regression in incremental-unchanged for crates with lots of static items. Seems like encoding all that extra data is expensive.

@RalfJung
Copy link
Member

RalfJung commented Oct 9, 2023

Ok, so I think it can't happen, but bugs in the interpreter could make it happen. Basically the reason it can't happen is that we always use the static's alloc id that gives us a GlobalAlloc::Static, and only transiently obtain the static's "memory alloc id" by evaluating the static.

I think you just re-did the analysis I did for #61345. :)
We could possibly add some extra sanity checks here by asserting, after evaluating the static, that the Allocid that was used for the "internal" allocation during evaluation has not been interned.

oof that's a harsh regression in incremental-unchanged for crates with lots of static items. Seems like encoding all that extra data is expensive.

Can't we stop encoding other data now, in exchange?

@oli-obk
Copy link
Contributor Author

oli-obk commented Oct 9, 2023

I think you just re-did the analysis I did for #61345. :)

Oh yea whoops. I'm trying to get to the place that you described (returning Allocation instead of AllocId). Seems doable

@oli-obk
Copy link
Contributor Author

oli-obk commented Oct 9, 2023

Can't we stop encoding other data now, in exchange?

The issue here is not metadata encoding, from which I indeed removed the MIR encoding. The issue is the incremental cache. So I'm trying to stop calling the eval to allocation query from the eval static initializer query, which should hopefully eliminate this issue entirely by actually having fewer total queries called

@RalfJung
Copy link
Member

RalfJung commented Oct 9, 2023 via email

@oli-obk
Copy link
Contributor Author

oli-obk commented Oct 9, 2023

I'm not sure if those ideas from years ago are still good ideas today

I reinvented your ideas from trying to make the implementation do what I need it to, so seems good we arrived at the same place

@RalfJung
Copy link
Member

RalfJung commented Oct 9, 2023

The issue here is not metadata encoding, from which I indeed removed the MIR encoding. The issue is the incremental cache. So I'm trying to stop calling the eval to allocation query from the eval static initializer query, which should hopefully eliminate this issue entirely by actually having fewer total queries called

We have quite a few queries for CTFE that often just dispatch to a slightly different lower-level query... I do wonder if that's not just a bunch of unnecessary overhead. Do these all have to be queries with a full cache and everything?

@rustbot
Copy link
Collaborator

rustbot commented Oct 10, 2023

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

@oli-obk
Copy link
Contributor Author

oli-obk commented Oct 10, 2023

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 14, 2024
…, r=RalfJung,cjgillot

Store static initializers in metadata instead of the MIR of statics.

This means that adding generic statics would be even more difficult, as we can't evaluate statics from other crates anymore, but the subtle issue I have encountered make me think that having this be an explicit problem is better.

The issue is that

```rust
static mut FOO: &mut u32 = &mut 42;
static mut BAR = unsafe { FOO };
```

gets different allocations, instead of referring to the same one. This is also true for non-static mut, but promotion makes `static FOO: &u32 = &42;` annoying to demo.

Fixes rust-lang#61345

## Why is this being done?

In order to ensure all crates see the same nested allocations (which is the last issue that needs fixing before we can stabilize [`const_mut_refs`](rust-lang#57349)), I am working on creating anonymous (from the Rust side, to LLVM it's like a regular static item) static items for the nested allocations in a static. If we evaluate the static item in a downstream crate again, we will end up duplicating its nested allocations (and in some cases, like the `match` case, even duplicate the main allocation).
@bors
Copy link
Contributor

bors commented Feb 14, 2024

⌛ Testing commit d500a17 with merge eb9354e...

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Feb 14, 2024

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Feb 14, 2024
@oli-obk oli-obk force-pushed the evaluated_static_in_metadata branch from d500a17 to debb4cb Compare February 15, 2024 10:24
@oli-obk oli-obk force-pushed the evaluated_static_in_metadata branch from debb4cb to 73b38c6 Compare February 15, 2024 10:25
@oli-obk
Copy link
Contributor Author

oli-obk commented Feb 15, 2024

@bors r=RalfJung,cjgillot

@bors
Copy link
Contributor

bors commented Feb 15, 2024

📌 Commit 73b38c6 has been approved by RalfJung,cjgillot

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 15, 2024
@bors
Copy link
Contributor

bors commented Feb 15, 2024

⌛ Testing commit 73b38c6 with merge 6a4222b...

@bors
Copy link
Contributor

bors commented Feb 15, 2024

☀️ Test successful - checks-actions
Approved by: RalfJung,cjgillot
Pushing 6a4222b to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 15, 2024
@bors bors merged commit 6a4222b into rust-lang:master Feb 15, 2024
12 checks passed
@rustbot rustbot added this to the 1.78.0 milestone Feb 15, 2024
@oli-obk oli-obk deleted the evaluated_static_in_metadata branch February 15, 2024 12:37
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (6a4222b): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
15.9% [0.4%, 81.2%] 8
Improvements ✅
(primary)
-0.4% [-0.5%, -0.2%] 8
Improvements ✅
(secondary)
-2.5% [-8.9%, -0.4%] 18
All ❌✅ (primary) -0.4% [-0.5%, -0.2%] 8

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
29.6% [4.0%, 75.3%] 4
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-3.5% [-5.0%, -2.4%] 5
All ❌✅ (primary) - - 0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
25.6% [4.0%, 73.8%] 6
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-6.8% [-14.1%, -3.4%] 9
All ❌✅ (primary) - - 0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.2% [0.0%, 0.6%] 73
Regressions ❌
(secondary)
71.0% [0.0%, 107.9%] 9
Improvements ✅
(primary)
-2.4% [-6.5%, -0.1%] 16
Improvements ✅
(secondary)
-74.8% [-77.9%, -73.3%] 6
All ❌✅ (primary) -0.3% [-6.5%, 0.6%] 89

Bootstrap: 635.116s -> 635.436s (0.05%)
Artifact size: 306.13 MiB -> 306.15 MiB (0.01%)

@oli-obk
Copy link
Contributor Author

oli-obk commented Feb 15, 2024

@rustbot label: +perf-regression-triaged

The stress test now writes an 8MB constant to metadata instead of a tiny bit of MIR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Miri engine: avoid having mutliple AllocId for the same static