Lazify SourceFile::lines. #97575

Merged 2 commits on Jun 2, 2022

nnethercote
Contributor

`SourceFile::lines` is a big part of metadata. It's stored in a compressed form
(a difference list) to save disk space. Decoding it is a big fraction of
compile time for very small crates/programs.
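
To make "difference list" concrete, here is a minimal sketch of the idea, assuming line starts are stored as `u32` byte offsets in ascending order. The names `encode_diffs` and `decode_diffs` are hypothetical, and the real rustc encoding may differ in detail (for example in how compactly each gap is stored).

```rust
/// Encode ascending line-start offsets as the first offset plus the gaps
/// between consecutive offsets. Returns `None` for an empty input.
/// Hypothetical helper, not rustc's actual encoder.
fn encode_diffs(line_starts: &[u32]) -> Option<(u32, Vec<u32>)> {
    let first = *line_starts.first()?;
    // Assumes the offsets are sorted, so each subtraction cannot underflow.
    let diffs = line_starts.windows(2).map(|w| w[1] - w[0]).collect();
    Some((first, diffs))
}

/// Rebuild the absolute offsets with a running sum over the gaps.
/// This is the linear-time work that is worth skipping when nobody
/// asks for line data.
fn decode_diffs(first: u32, diffs: &[u32]) -> Vec<u32> {
    let mut lines = Vec::with_capacity(diffs.len() + 1);
    let mut pos = first;
    lines.push(pos);
    for &d in diffs {
        pos += d;
        lines.push(pos);
    }
    lines
}
```

For the offsets `[0, 14, 15, 40]`, the encoded form is the first offset `0` plus the gaps `[14, 1, 25]`. Gaps are usually much smaller than absolute offsets, which is what makes this form compact on disk.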

This commit introduces a new type, `SourceFileLines`, which has a `Lines`
form and a `Diffs` form. The latter is used when the metadata is first
read, and it is only decoded into the `Lines` form when line data is
actually needed. This avoids the decoding cost for many files,
especially in `std`. It's a performance win of up to 15% for tiny
crates/programs where metadata decoding is a large part of compilation
cost.

A `RefCell` is needed because the methods that access lines data (which can
trigger decoding) take `&self` rather than `&mut self`. To allow for this,
`SourceFile::lines` now takes a `FnMut` that operates on the lines slice rather
than returning the lines slice.
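
The shape described above can be sketched roughly as follows. This is a simplified illustration, not the code from the PR: the field layout of `Diffs`, the constructor `from_diffs`, and the helper `is_decoded` are assumptions made so the example is self-contained.

```rust
use std::cell::RefCell;

/// Two forms of the same data: decoded offsets, or the compact gaps.
enum SourceFileLines {
    Lines(Vec<u32>),
    /// Assumed layout for this sketch: first offset plus successive gaps.
    Diffs { first: u32, diffs: Vec<u32> },
}

struct SourceFile {
    // Interior mutability lets `&self` methods swap `Diffs` for `Lines`.
    lines: RefCell<SourceFileLines>,
}

impl SourceFile {
    /// Hypothetical constructor: start in the undecoded form.
    fn from_diffs(first: u32, diffs: Vec<u32>) -> Self {
        SourceFile { lines: RefCell::new(SourceFileLines::Diffs { first, diffs }) }
    }

    /// Hypothetical helper, used only to observe the laziness.
    fn is_decoded(&self) -> bool {
        matches!(*self.lines.borrow(), SourceFileLines::Lines(_))
    }

    /// Run `f` on the decoded slice, decoding on first use. The slice is
    /// passed to a closure because a reference into the `RefCell` cannot
    /// outlive the borrow guard held inside this method.
    fn lines<F, R>(&self, mut f: F) -> R
    where
        F: FnMut(&[u32]) -> R,
    {
        let mut guard = self.lines.borrow_mut();
        // Decode only if we are still holding the compact form.
        let decoded = match &*guard {
            SourceFileLines::Diffs { first, diffs } => {
                let mut pos = *first;
                let mut lines = Vec::with_capacity(diffs.len() + 1);
                lines.push(pos);
                for &d in diffs {
                    pos += d;
                    lines.push(pos);
                }
                Some(lines)
            }
            SourceFileLines::Lines(_) => None,
        };
        if let Some(lines) = decoded {
            *guard = SourceFileLines::Lines(lines);
        }
        match &*guard {
            SourceFileLines::Lines(lines) => f(lines.as_slice()),
            SourceFileLines::Diffs { .. } => unreachable!("decoded above"),
        }
    }
}
```

A caller would write something like `sf.lines(|lines| lines.len())`. One consequence of this design in the sketch: calling `lines` again from inside the closure would panic, because the `RefCell` is still mutably borrowed.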

r? @Mark-Simulacrum

@rust-highfive
Collaborator

Some changes occurred in src/tools/clippy.

cc @rust-lang/clippy

@rustbot added the T-compiler label ("Relevant to the compiler team, which will review and decide on the PR/issue.") May 31, 2022
@rust-highfive added the S-waiting-on-review label ("Status: Awaiting review from the assignee but also interested parties.") May 31, 2022
@nnethercote
Contributor Author

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot added the S-waiting-on-perf label ("Status: Waiting on a perf run to be completed.") May 31, 2022
@bors
Contributor

bors commented May 31, 2022

⌛ Trying commit ffd9172b548b8b831bcee64a05127b7fbadd7c17 with merge 0ca780e66ec23e02858753c0c8f112e63e3718ef...

@bors
Contributor

bors commented May 31, 2022

☀️ Try build successful - checks-actions
Build commit: 0ca780e66ec23e02858753c0c8f112e63e3718ef (0ca780e66ec23e02858753c0c8f112e63e3718ef)

@rust-timer
Collaborator

Queued 0ca780e66ec23e02858753c0c8f112e63e3718ef with parent 47365c0, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (0ca780e66ec23e02858753c0c8f112e63e3718ef): comparison url.

Instruction count

  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: mixed results

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 0.4% | 0.4% | 4 |
| Improvements 🎉 (primary) | -1.5% | -15.7% | 66 |
| Improvements 🎉 (secondary) | -2.6% | -14.2% | 148 |
| All 😿🎉 (primary) | -1.5% | -15.7% | 66 |

Max RSS (memory usage)

Results
  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: mixed results

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | 2.0% | 2.0% | 1 |
| Regressions 😿 (secondary) | 3.4% | 4.6% | 3 |
| Improvements 🎉 (primary) | -1.3% | -2.5% | 7 |
| Improvements 🎉 (secondary) | -2.5% | -4.3% | 31 |
| All 😿🎉 (primary) | -0.9% | -2.5% | 8 |

Cycles

Results
  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: 🎉 relevant improvements found

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | -4.0% | -5.2% | 6 |
| Improvements 🎉 (secondary) | -3.2% | -5.7% | 22 |
| All 😿🎉 (primary) | -4.0% | -5.2% | 6 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@rustbot removed the S-waiting-on-perf label ("Status: Waiting on a perf run to be completed.") May 31, 2022
@nnethercote
Contributor Author

I was able to run `x.py check` and `x.py build --stage 1 library/std` and `x.py test` locally without any problems. And the CI perf run succeeded. But there are compile errors on the mingw-check build. *scratches head*

@tmiasko
Contributor

tmiasko commented May 31, 2022

> I was able to run `x.py check` and `x.py build --stage 1 library/std` and `x.py test` locally without any problems. And the CI perf run succeeded. But there are compile errors on the mingw-check build. *scratches head*

The CI also checks the `parallel-compiler = true` build configuration.

@nnethercote force-pushed the lazify-SourceFile-lines branch from ffd9172 to 54a8a87, May 31, 2022 23:00
@nnethercote
Contributor Author

I have addressed the review comments and changed the RefCell to a Lock to fix the parallel compiler build.
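
One way to see why the swap matters: `RefCell<T>` is never `Sync`, so a type containing one cannot be shared by reference across threads, which a parallel build needs. The sketch below uses `std::sync::Mutex` as a stand-in for rustc's `Lock` (an assumption; this thread does not show how `Lock` is defined). `LazyLines`, `SharedFile`, and `count_from_threads` are hypothetical names, and the first offset is assumed to be 0 to keep the example short.

```rust
use std::sync::Mutex;
use std::thread;

enum LazyLines {
    Lines(Vec<u32>),
    /// Gaps between consecutive line starts; first offset assumed to be 0.
    Diffs(Vec<u32>),
}

struct SharedFile {
    // A `Mutex` is `Sync` (for `Send` contents), unlike a `RefCell`.
    lines: Mutex<LazyLines>,
}

impl SharedFile {
    fn from_diffs(diffs: Vec<u32>) -> Self {
        SharedFile { lines: Mutex::new(LazyLines::Diffs(diffs)) }
    }

    /// Decode on first use while holding the lock, then count the lines.
    fn line_count(&self) -> usize {
        let mut guard = self.lines.lock().unwrap();
        let decoded = match &*guard {
            LazyLines::Diffs(diffs) => {
                let mut pos = 0u32;
                let mut lines = vec![pos];
                for &d in diffs {
                    pos += d;
                    lines.push(pos);
                }
                Some(lines)
            }
            LazyLines::Lines(_) => None,
        };
        if let Some(lines) = decoded {
            *guard = LazyLines::Lines(lines);
        }
        match &*guard {
            LazyLines::Lines(lines) => lines.len(),
            LazyLines::Diffs(_) => unreachable!("decoded above"),
        }
    }
}

/// Share one file across several scoped threads. This compiles only
/// because `SharedFile` is `Sync`.
fn count_from_threads(file: &SharedFile, n_threads: usize) -> Vec<usize> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..n_threads)
            .map(|_| s.spawn(move || file.line_count()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Whichever thread takes the lock first pays for the decode; the others find the `Lines` form already in place. Replacing the `Mutex` with a `RefCell` in this sketch makes `count_from_threads` fail to compile, which is in the same spirit as the errors seen only in the parallel-compiler configuration.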

@nnethercote force-pushed the lazify-SourceFile-lines branch from 54a8a87 to 5175a71, May 31, 2022 23:34
`SourceFile::lines` is a big part of metadata. It's stored in a compressed form
(a difference list) to save disk space. Decoding it is a big fraction of
compile time for very small crates/programs.

This commit introduces a new type `SourceFileLines` which has a `Lines`
form and a `Diffs` form. The latter is used when the metadata is first
read, and it is only decoded into the `Lines` form when line data is
actually needed. This avoids the decoding cost for many files,
especially in `std`. It's a performance win of up to 15% for tiny
crates/programs where metadata decoding is a large part of compilation
cost.

A `Lock` is needed because the methods that access lines data (which can
trigger decoding) take `&self` rather than `&mut self`. To allow for this,
`SourceFile::lines` now takes a `FnMut` that operates on the lines slice rather
than returning the lines slice.
@nnethercote force-pushed the lazify-SourceFile-lines branch from 5175a71 to 0b81d7c, June 1, 2022 00:36
compiler/rustc_query_system/src/ich/impls_syntax.rs (outdated, resolved)
compiler/rustc_span/src/lib.rs (outdated, resolved)
compiler/rustc_span/src/lib.rs (resolved)
compiler/rustc_span/src/lib.rs (outdated, resolved)
@Mark-Simulacrum added the S-waiting-on-author label ("Status: This is awaiting some action (such as code changes or more information) from the author.") and removed the S-waiting-on-review label ("Status: Awaiting review from the assignee but also interested parties.") Jun 1, 2022
@nnethercote
Contributor Author

I have addressed the latest review comments.

@Mark-Simulacrum
Member

@bors r+

@bors
Contributor

bors commented Jun 2, 2022

📌 Commit 72de7c4 has been approved by Mark-Simulacrum

@bors added the S-waiting-on-bors label ("Status: Waiting on bors to run and complete tests. Bors will change the label on completion.") and removed the S-waiting-on-author label ("Status: This is awaiting some action (such as code changes or more information) from the author.") Jun 2, 2022
@bors
Contributor

bors commented Jun 2, 2022

⌛ Testing commit 72de7c4 with merge e714405...

@bors
Contributor

bors commented Jun 2, 2022

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing e714405 to master...

@bors added the merged-by-bors label ("This PR was explicitly merged by bors.") Jun 2, 2022
@bors merged commit e714405 into rust-lang:master Jun 2, 2022
@rustbot added this to the 1.63.0 milestone Jun 2, 2022
@nnethercote deleted the lazify-SourceFile-lines branch June 2, 2022 21:31
@rust-timer
Collaborator

Finished benchmarking commit (e714405): comparison url.

Instruction count

  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: mixed results

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 0.4% | 0.5% | 6 |
| Improvements 🎉 (primary) | -1.8% | -15.3% | 52 |
| Improvements 🎉 (secondary) | -2.9% | -13.8% | 124 |
| All 😿🎉 (primary) | -1.8% | -15.3% | 52 |

Max RSS (memory usage)

Results
  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: 🎉 relevant improvements found

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | 2.3% | 2.3% | 1 |
| Regressions 😿 (secondary) | 2.5% | 2.9% | 2 |
| Improvements 🎉 (primary) | -1.8% | -3.4% | 12 |
| Improvements 🎉 (secondary) | -2.8% | -7.2% | 15 |
| All 😿🎉 (primary) | -1.5% | -3.4% | 13 |

Cycles

Results
  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: 🎉 relevant improvements found

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | -4.8% | -5.9% | 3 |
| Improvements 🎉 (secondary) | -3.1% | -5.2% | 23 |
| All 😿🎉 (primary) | -4.8% | -5.9% | 3 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

flip1995 pushed a commit to flip1995/rust that referenced this pull request Jun 4, 2022
…r=Mark-Simulacrum

Lazify `SourceFile::lines`.
