Speed up `Parser::expected_tokens` #133793

nnethercote · 2024-12-03T09:21:43Z

The constant pushing/clearing of `Parser::expected_tokens` during parsing is slow. This PR speeds it up greatly.

r? @estebank

nnethercote · 2024-12-03T09:21:55Z

@bors try @rust-timer queue

…, r=<try> Speed up `Parser::expected_tokens` r? `@ghost`

bors · 2024-12-03T09:23:07Z

⌛ Trying commit 0133601 with merge 4e6952e...

bors · 2024-12-03T11:07:22Z

☀️ Try build successful - checks-actions
Build commit: 4e6952e (4e6952e2fa4367d9a5ef87505fa18f0dd3fedcc4)

rust-timer · 2024-12-03T13:34:53Z

Finished benchmarking commit (4e6952e): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.9%	[-2.4%, -0.2%]	211
Improvements ✅ (secondary)	-0.8%	[-2.6%, -0.1%]	101
All ❌✅ (primary)	-0.9%	[-2.4%, -0.2%]	211

Max RSS (memory usage)

Results (primary -1.2%, secondary 1.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.5%	[0.5%, 0.5%]	1
Regressions ❌ (secondary)	2.9%	[0.9%, 5.4%]	3
Improvements ✅ (primary)	-2.1%	[-2.3%, -1.8%]	2
Improvements ✅ (secondary)	-2.2%	[-2.2%, -2.2%]	1
All ❌✅ (primary)	-1.2%	[-2.3%, 0.5%]	3

Cycles

Results (primary -1.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.5%	[-1.5%, -1.5%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.5%	[-1.5%, -1.5%]	2

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 767.333s -> 766.554s (-0.10%)
Artifact size: 332.08 MiB -> 332.14 MiB (0.02%)

rustbot · 2024-12-04T05:40:54Z

Some changes occurred in src/tools/rustfmt

cc @rust-lang/rustfmt

nnethercote · 2024-12-04T05:42:51Z

Best reviewed one commit at a time.

Let's re-run perf just to be sure: @bors try @rust-timer queue

…, r=<try> Speed up `Parser::expected_tokens` The constant pushing/clearing of `Parser::expected_tokens` during parsing is slow. This PR speeds it up greatly. r? `@estebank`

bors · 2024-12-04T06:13:47Z

⌛ Trying commit f5482df with merge 26060e6...

compiler/rustc_parse/src/parser/diagnostics.rs

estebank · 2024-12-04T06:18:06Z

compiler/rustc_parse/src/parser/token_type.rs

+/// We really want to keep the number of variants to 128 or fewer, so that
+/// `TokenTypeSet` can be implemented with a `u128`.


On the one hand, we should be able to do so. On the other, I can see this becoming a point of contention with t-lang in the medium future if we push back on a feature for this reason :)

The 17 asm symbols would be a good place to cut things down if necessary. I'm a bit annoyed that they are even in there; so many of them for such a rare use case.
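To illustrate why the 128-variant limit matters, here is a minimal sketch of a bitset keyed by a fieldless enum and stored in a single `u128`. The four-variant `TokenType` and the method names are assumptions for illustration and are not taken from the PR; only the idea of backing `TokenTypeSet` with a `u128` comes from the doc comment above.

```rust
// Hypothetical, heavily reduced token-type enum (the real one has far more variants).
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum TokenType {
    Eq,
    Lt,
    Le,
    Gt,
}

// One bit per variant; this only works while there are at most 128 variants.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct TokenTypeSet(u128);

impl TokenTypeSet {
    fn new() -> Self {
        TokenTypeSet(0)
    }

    fn is_empty(&self) -> bool {
        self.0 == 0
    }

    // Set the bit whose index is the variant's discriminant.
    fn insert(&mut self, token_type: TokenType) {
        self.0 |= 1u128 << (token_type as u32);
    }

    fn contains(&self, token_type: TokenType) -> bool {
        self.0 & (1u128 << (token_type as u32)) != 0
    }

    // Clearing is a single store, unlike truncating a heap-allocated vector.
    fn clear(&mut self) {
        self.0 = 0;
    }
}
```

With this shape, inserting and clearing are plain integer operations, which is the kind of saving a push/clear-heavy code path would benefit from.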

estebank · 2024-12-04T06:21:28Z

compiler/rustc_parse/src/parser/token_type.rs

+        // This assertion will detect if this method and the type definition get out of sync.
+        assert_eq!(token_type as u32, val);
+        token_type


Can't the function just be the `as` cast with a `<= 104` check?

No. You can convert a C-style enum to an integer with `as`, but you can't convert in the other direction, e.g. as per this StackOverflow answer. There are proc macros to do it, but that answer pointed out an alternative that is suitable here: `transmute` is fine so long as the enum is `repr(uN)` for some value of N. So I will do that with `repr(u8)`, which will cut over 100 lines of code, yay.
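The `transmute` approach mentioned in this reply could look roughly like the following. This is a sketch under assumptions: `Kind`, `NUM_KINDS`, and `kind_from_u8` are hypothetical names, and the enum is assumed to have contiguous discriminants starting at 0, which is what makes the range check sufficient.

```rust
use std::mem::transmute;

// Hypothetical fieldless enum with an explicit one-byte representation.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    A,
    B,
    C,
}

// Must be kept in sync with the number of variants by hand.
const NUM_KINDS: u8 = 3;

fn kind_from_u8(value: u8) -> Option<Kind> {
    if value < NUM_KINDS {
        // SAFETY: `Kind` is `repr(u8)` with discriminants 0..NUM_KINDS,
        // and `value` was just checked to be in that range.
        Some(unsafe { transmute::<u8, Kind>(value) })
    } else {
        None
    }
}
```

The weak point is the hand-maintained variant count: adding a variant without updating the constant silently rejects the new value, and shrinking the enum without updating it would make the `unsafe` block unsound.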

Should we have a static assertion that all of them roundtrip? I'm always concerned about a careless future reformat breaking the bidirectional mapping.

I'd be happy for extra protection, but I'm having trouble imagining what such a static assertion would look like. Can you explain more?

Alternatively, I can go back to an explicit match. The StackOverflow answer mentioned this style:

    match v {
        x if x == MyEnum::A as i32 => Ok(MyEnum::A),
        x if x == MyEnum::B as i32 => Ok(MyEnum::B),
        x if x == MyEnum::C as i32 => Ok(MyEnum::C),
        _ => Err(()),
    }

It requires a line for every variant, but avoids having to write a number on each line.

> I'd be happy for extra protection, but I'm having trouble imagining what such a static assertion would look like. Can you explain more?

I'd forgotten that the range operation doesn't work today in const, but I was picturing something like:

    const __CHECK: () = const {
        for i in 0..2 {
            assert_eq!(E::to_i32(&E::from_i32(i)), i);
        }
    };
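For reference, a compile-time round-trip check of this kind can be written today with a `while` loop and `assert!`, since `for` loops and `assert_eq!` are not usable in const contexts. The following is a sketch, not code from the PR: the enum `E`, the `COUNT` constant, and the `const fn` conversions are assumptions made for illustration.

```rust
// Hypothetical three-variant enum standing in for the real type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum E {
    A,
    B,
    C,
}

impl E {
    // Assumed to be kept in sync with the variant list.
    const COUNT: i32 = 3;

    const fn to_i32(self) -> i32 {
        self as i32
    }

    const fn from_i32(value: i32) -> Option<E> {
        if value == E::A as i32 {
            Some(E::A)
        } else if value == E::B as i32 {
            Some(E::B)
        } else if value == E::C as i32 {
            Some(E::C)
        } else {
            None
        }
    }
}

// Evaluated at compile time: the build fails if any value in 0..COUNT
// does not survive the integer -> enum -> integer round trip.
const _: () = {
    let mut i = 0;
    while i < E::COUNT {
        match E::from_i32(i) {
            Some(e) => assert!(e.to_i32() == i),
            None => panic!("integer in range has no matching variant"),
        }
        i += 1;
    }
};
```

The check only covers what `COUNT` says exists, so it catches a conversion function that falls out of sync with the variant list but not a stale `COUNT`.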

I've gone with the guard-based version, using a macro to avoid excessive boilerplate. It doesn't rely on unsafe, and also doesn't rely on matching up the right integer with the right variant.
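A guard-based conversion generated by a macro might be sketched as follows. The variant subset, the macro name, and the choice to return `Option` are assumptions for illustration; the code as merged may differ in all of these details.

```rust
// Hypothetical subset of token types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TokenType {
    Eq,
    Lt,
    Le,
    Gt,
}

// Expands to one guarded arm per listed variant. Each arm compares against
// `TokenType::$tok as u32`, so no integer literal is ever written by hand.
macro_rules! from_u32_match {
    ($val:ident; $($tok:ident,)+) => {
        match $val {
            $(
                t if t == TokenType::$tok as u32 => Some(TokenType::$tok),
            )+
            _ => None,
        }
    };
}

impl TokenType {
    fn from_u32(val: u32) -> Option<TokenType> {
        from_u32_match! { val; Eq, Lt, Le, Gt, }
    }
}
```

Reordering the enum's variants cannot break this mapping, because each arm derives its integer from the variant itself. A variant missing from the macro invocation simply yields `None` for its value, which a round-trip test would catch.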

compiler/rustc_parse/src/parser/token_type.rs

estebank · 2024-12-04T06:29:15Z

I'll finish reviewing tomorrow

bors · 2024-12-04T07:56:59Z

☀️ Try build successful - checks-actions
Build commit: 26060e6 (26060e63f06a4dcd55fc0757eb5b0bdc8136ed3b)

compiler/rustc_parse/src/parser/token_type.rs

bors · 2024-12-19T10:54:18Z

☀️ Try build successful - checks-actions
Build commit: 7ebe629 (7ebe629a8d2d74c9bf12e339350633096488d6ae)

nnethercote · 2024-12-19T11:10:04Z

@bors r=spastorino

bors · 2024-12-19T11:10:07Z

📌 Commit 0f7dccf has been approved by spastorino

It is now in the queue for this repository.

…, r=spastorino Speed up `Parser::expected_tokens` The constant pushing/clearing of `Parser::expected_tokens` during parsing is slow. This PR speeds it up greatly. r? `@estebank`

bors · 2024-12-19T14:25:19Z

⌛ Testing commit 0f7dccf with merge 9c6d84c...

jieyouxu · 2024-12-19T14:28:22Z

Some try jobs are not getting picked up correctly, trying a sync cc https://rust-lang.zulipchat.com/#narrow/channel/242791-t-infra/topic/try.20jobs.20not.20kicking.20off.
@bors treeclosed=1000

jieyouxu · 2024-12-19T14:28:57Z

@bors r- retry (sync)

nnethercote · 2024-12-19T19:54:53Z

The tree seems to be back open?

@bors r=spastorino

bors · 2024-12-19T19:54:56Z

📌 Commit 0f7dccf has been approved by spastorino

It is now in the queue for this repository.

bors · 2024-12-19T19:59:01Z

⌛ Testing commit 0f7dccf with merge 9e136a3...

jieyouxu · 2024-12-19T20:10:10Z

Oops yeah, sorry about that -- bors picked up treeclosed from another PR, and I lost track of this PR due to notifications flood...

bors · 2024-12-19T22:37:43Z

☀️ Test successful - checks-actions
Approved by: spastorino
Pushing 9e136a3 to master...

rust-timer · 2024-12-20T02:55:08Z

Finished benchmarking commit (9e136a3): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.9%	[-2.5%, -0.2%]	213
Improvements ✅ (secondary)	-0.8%	[-2.5%, -0.1%]	105
All ❌✅ (primary)	-0.9%	[-2.5%, -0.2%]	213

Max RSS (memory usage)

Results (primary -0.1%, secondary -0.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.0%	[2.0%, 2.1%]	2
Regressions ❌ (secondary)	4.8%	[4.8%, 4.8%]	1
Improvements ✅ (primary)	-1.5%	[-2.1%, -0.9%]	3
Improvements ✅ (secondary)	-2.4%	[-3.7%, -1.8%]	4
All ❌✅ (primary)	-0.1%	[-2.1%, 2.1%]	5

Cycles

Results (primary -1.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.2%	[-1.7%, -1.0%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.2%	[-1.7%, -1.0%]	4

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 769.451s -> 766.727s (-0.35%)
Artifact size: 330.39 MiB -> 330.36 MiB (-0.01%)

cuviper · 2025-01-22T00:37:42Z

The job dist-s390x-linux failed! Check out the build log: (web) (plain)
Click to see the possible cause of the failure (guessed by this bot)

   Compiling rustc_parse v0.0.0 (/checkout/compiler/rustc_parse)
error[E0308]: mismatched types
   --> /rustc/e46b2c453cbcb2f95ccc20904d6944711c1c9aa4/compiler/rustc_index/src/lib.rs:40:32
    |
38  | macro_rules! static_assert_size {
    | ------------------------------- in this expansion of `rustc_data_structures::static_assert_size!`
39  |     ($ty:ty, $size:expr) => {
40  |         const _: [(); $size] = [(); ::std::mem::size_of::<$ty>()];
    |                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected an array with a size of 288, found one with a size of 280
   ::: compiler/rustc_parse/src/parser/mod.rs:194:56
    |
    |
194 | rustc_data_structures::static_assert_size!(Parser<'_>, 288);
    | |                                                      |
    | |                                                      help: consider specifying the actual array length: `280`
    | in this macro invocation

s390x size assertion failure, fun.

I just got the same error with 280 on powerpc64le in a Fedora scratch build of 1.85.0-beta.5:
https://kojipkgs.fedoraproject.org//work/tasks/4794/128284794/build.log

I have no idea why it would be different than Rust CI though, which apparently did pass on that arch...

cuviper · 2025-01-22T01:11:36Z

I think the difference is due to the alignment of TokenTypeSet(u128), which is only 8 on ppc64 and s390x... at least until #134115 takes effect on ppc64 with LLVM 20. But again, I don't know why Rust CI didn't see that too!
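The effect of `u128` alignment on a containing struct's size can be shown with a small sketch. `Demo` is a hypothetical struct, not the real `Parser`, and `repr(C)` is used only to make the layout arithmetic predictable.

```rust
use std::mem::{align_of, size_of};

// Newtype over `u128`, so it inherits the target's `u128` alignment.
#[allow(dead_code)]
struct TokenTypeSet(u128);

// Hypothetical stand-in for a larger struct holding a `TokenTypeSet`.
#[allow(dead_code)]
#[repr(C)]
struct Demo {
    tag: u8,
    set: TokenTypeSet,
}

// With `repr(C)`, `set` starts at the first multiple of its alignment after
// `tag`, so the total is `align + 16`: 32 bytes where `u128` is 16-aligned,
// 24 bytes where it is only 8-aligned.
const _: () = assert!(size_of::<Demo>() == align_of::<u128>() + 16);

fn demo_size_report() -> String {
    format!(
        "align_of::<u128>() = {}, size_of::<Demo>() = {}",
        align_of::<u128>(),
        size_of::<Demo>()
    )
}
```

A size assertion that hard-codes one number, as in the failing `static_assert_size!(Parser<'_>, 288)` above, therefore holds only on targets that share the same `u128` alignment.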

cuviper · 2025-01-23T22:58:13Z

> I have no idea why it would be different than Rust CI though

I figured it out -- #134115 changed ppc64 alignment regardless of LLVM version, but when I'm building natively it's starting with the old alignment in stage0, while Rust CI only cross-compiles using the last stage.

rustbot added the S-waiting-on-review and T-compiler labels Dec 3, 2024

rustbot added the S-waiting-on-perf label Dec 3, 2024

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 3, 2024

Auto merge of rust-lang#133793 - nnethercote:speed-up-expected_tokens…

4e6952e

…, r=<try> Speed up `Parser::expected_tokens` r? `@ghost`

rustbot removed the S-waiting-on-perf label Dec 3, 2024

nnethercote force-pushed the speed-up-expected_tokens branch from 0133601 to f5482df December 4, 2024 05:40

nnethercote marked this pull request as ready for review December 4, 2024 05:40

rustbot assigned estebank Dec 4, 2024

rustbot added the S-waiting-on-perf label Dec 4, 2024

estebank reviewed Dec 4, 2024

compiler/rustc_parse/src/parser/diagnostics.rs

estebank reviewed Dec 4, 2024

compiler/rustc_parse/src/parser/token_type.rs

estebank reviewed Dec 4, 2024

compiler/rustc_parse/src/parser/token_type.rs

estebank approved these changes Dec 4, 2024

jieyouxu reviewed Dec 19, 2024

compiler/rustc_parse/src/parser/token_type.rs

bors added the S-waiting-on-bors label and removed the S-waiting-on-author label Dec 19, 2024

bors added the merged-by-bors label Dec 19, 2024

bors merged commit 9e136a3 into rust-lang:master Dec 19, 2024
7 checks passed

rustbot added this to the 1.85.0 milestone Dec 19, 2024

This was referenced Dec 19, 2024

Implement default_could_be_derived and default_overrides_default_fields lints #134441

Closed

setup typos check in CI #134006

Merged

nnethercote deleted the speed-up-expected_tokens branch December 20, 2024 04:19

cuviper mentioned this pull request Jan 22, 2025

Only assert the Parser size on specific arches #135855

Merged
