Simplify `rustc_span` `analyze_source_file` #136460

real-eren · 2025-02-03T03:05:51Z

Simplifies the logic to what the code actually does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases.
Also uses a simpler (and more efficient) means of iterating the set bits of the mask.

Only newlines and multibyte characters are actually relevant

rustbot · 2025-02-03T03:05:59Z

r? @fee1-dead

rustbot has assigned @fee1-dead.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

fee1-dead · 2025-02-04T13:08:24Z

r? compiler

Noratrieb

Thanks for the cleanup! I did some digging into why the code is weird like this, and I found #127528 to be the cause, previously this complexity around control characters was useful, but now it no longer is, so it makes sense to remove.
Since this is apparently performance-sensitive code, I will do a perf run to see if there are any improvements.

Noratrieb · 2025-02-12T18:37:46Z

compiler/rustc_span/src/analyze_source_file.rs

@@ -95,65 +95,32 @@ cfg_match! {
                if multibyte_mask == 0 {
                    assert!(intra_chunk_offset == 0);

-                    // Check if there are any control characters in the chunk. All


The for loop above could be a lot cleaner by using chunks_exact if you're interested for a future PR.

That's a good point, and now I realize that since the first byte of a UTF8 code point is enough to determine its length, the generic function could just iterate over bytes instead of chars. That's feasible to vectorize and the SSE2 function wouldn't need to bail on non-ASCII chunks. On the one hand, this would require baking in slightly more knowledge of how UTF8 code points work; on the other, we wouldn't need code to handle chars that straddle 2 chunks.

The chunks_exact change would be good either way.

Noratrieb · 2025-02-12T18:49:08Z

@bors try @rust-timer queue

bors · 2025-02-12T18:50:23Z

⌛ Trying commit d6ca7ad with merge c32c2c2...

…e, r=<try> Simplify `rustc_span` `analyze_source_file` Simplifies the logic to what the code *actually* does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases. Also uses a simpler (and more efficient) means of iterating the set bits of the mask.

bors · 2025-02-12T20:39:06Z

☀️ Try build successful - checks-actions
Build commit: c32c2c2 (c32c2c25815becf86de3ab2b141cb186420150f2)

rust-timer · 2025-02-12T23:35:22Z

Finished benchmarking commit (c32c2c2): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 4.0%, secondary 4.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	4.0%	[2.3%, 5.8%]	2
Regressions ❌ (secondary)	4.6%	[3.6%, 6.3%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	4.0%	[2.3%, 5.8%]	2

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 788.874s -> 790.207s (0.17%)
Artifact size: 347.76 MiB -> 347.67 MiB (-0.03%)

Noratrieb · 2025-02-13T06:49:14Z

@bors r+ rollup=maybe

bors · 2025-02-13T06:49:17Z

📌 Commit d6ca7ad has been approved by Noratrieb

It is now in the queue for this repository.

real-eren · 2025-02-13T08:55:21Z

Since this is apparently performance-sensitive code, I will do a perf run to see if there are any improvements.

I don't have a good idea yet of how relatively hot this function is, but it can be made faster if that would be useful. Between vectorizing the multibyte detection and extending the chunk to 32/64 bytes, there's definitely room for improvement, it's just a question of priorities.

Oh, and porting to neon for aarch64 targets.

Noratrieb · 2025-02-13T08:59:58Z

Honestly I'm not sure how important this function really is. One way to find out would be to make a new PR that removes the SSE optimization and then do a perf run to see if it makes the compiler slower.

…yze, r=Noratrieb Simplify `rustc_span` `analyze_source_file` Simplifies the logic to what the code *actually* does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases. Also uses a simpler (and more efficient) means of iterating the set bits of the mask.

…kingjubilee Rollup of 13 pull requests Successful merges: - rust-lang#135439 (Make `-O` mean `OptLevel::Aggressive`) - rust-lang#136460 (Simplify `rustc_span` `analyze_source_file`) - rust-lang#136642 (Put the alloc unit tests in a separate alloctests package) - rust-lang#136904 (add `IntoBounds` trait) - rust-lang#136908 ([AIX] expect `EINVAL` for `pthread_mutex_destroy`) - rust-lang#136924 (Add profiling of bootstrap commands using Chrome events) - rust-lang#136951 (Use the right binder for rebinding `PolyTraitRef`) - rust-lang#136956 (add vendor directory to .gitignore) - rust-lang#136967 (Use `slice::fill` in `io::Repeat` implementation) - rust-lang#136976 (alloc boxed: docs: use MaybeUninit::write instead of as_mut_ptr) - rust-lang#136981 (ci: switch loongarch jobs to free runners) - rust-lang#136992 (Update backtrace) - rust-lang#136993 ([cg_llvm] Remove dead error message) r? `@ghost` `@rustbot` modify labels: rollup

…kingjubilee Rollup of 9 pull requests Successful merges: - rust-lang#135439 (Make `-O` mean `OptLevel::Aggressive`) - rust-lang#136460 (Simplify `rustc_span` `analyze_source_file`) - rust-lang#136904 (add `IntoBounds` trait) - rust-lang#136908 ([AIX] expect `EINVAL` for `pthread_mutex_destroy`) - rust-lang#136924 (Add profiling of bootstrap commands using Chrome events) - rust-lang#136951 (Use the right binder for rebinding `PolyTraitRef`) - rust-lang#136981 (ci: switch loongarch jobs to free runners) - rust-lang#136992 (Update backtrace) - rust-lang#136993 ([cg_llvm] Remove dead error message) r? `@ghost` `@rustbot` modify labels: rollup

Rollup merge of rust-lang#136460 - real-eren:simplify-rustc_span-analyze, r=Noratrieb Simplify `rustc_span` `analyze_source_file` Simplifies the logic to what the code *actually* does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases. Also uses a simpler (and more efficient) means of iterating the set bits of the mask.

…h, r=<try> Remove SSE2 path from `rustc_span` `analyze_source_file` Follow-up to rust-lang#136460. When the SSE2 optimization was [introduced](https://github.com/michaelwoerister/rust/blob/a1f8a6ce80a340d51074071c0d9e30eb14f65d25/src/libsyntax_pos/analyze_filemap.rs), the generic path recorded `NonNarrowChar`s for ASCII control characters. Nowadays, `analyze_source_file` only deals with newlines and multi-byte chars. The point of this PR is to see, via a perf run, whether the SSE2 path still provides a meaningful improvement over the generic path. If it doesn't, it could be removed. The function can be simplified further after inlining; I left it as-is for the initial perf run so that it's easier to see that the behavior is unchanged. r? `@Noratrieb`

real-eren added 2 commits February 2, 2025 21:51

Simplify SSE2 implementation

3c7c38a

Remove dead control char logic

d6ca7ad

Only newlines and multibyte characters are actually relevant

rustbot assigned fee1-dead Feb 3, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 3, 2025

rustbot assigned Noratrieb and unassigned fee1-dead Feb 4, 2025

Noratrieb approved these changes Feb 12, 2025

View reviewed changes

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 12, 2025

This comment has been minimized.

Sign in to view

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 12, 2025

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 13, 2025

workingjubilee mentioned this pull request Feb 14, 2025

Rollup of 13 pull requests #136998

Closed

workingjubilee mentioned this pull request Feb 14, 2025

Rollup of 9 pull requests #137010

Merged

bors merged commit 8f4b766 into rust-lang:master Feb 14, 2025
7 checks passed

rustbot added this to the 1.86.0 milestone Feb 14, 2025

real-eren mentioned this pull request Feb 16, 2025

Remove SSE2 path from rustc_span analyze_source_file #137110

Closed

real-eren mentioned this pull request Feb 21, 2025

Use chunks_exact in analyze_source_file_sse2 #137360

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify `rustc_span` `analyze_source_file` #136460

Simplify `rustc_span` `analyze_source_file` #136460

real-eren commented Feb 3, 2025

rustbot commented Feb 3, 2025

fee1-dead commented Feb 4, 2025

Noratrieb left a comment

Noratrieb Feb 12, 2025

real-eren Feb 13, 2025

Noratrieb commented Feb 12, 2025

This comment has been minimized.

bors commented Feb 12, 2025

bors commented Feb 12, 2025

This comment has been minimized.

rust-timer commented Feb 12, 2025

Noratrieb commented Feb 13, 2025

bors commented Feb 13, 2025

real-eren commented Feb 13, 2025

Noratrieb commented Feb 13, 2025

Simplify rustc_span analyze_source_file #136460

Simplify rustc_span analyze_source_file #136460

Conversation

real-eren commented Feb 3, 2025

rustbot commented Feb 3, 2025

fee1-dead commented Feb 4, 2025

Noratrieb left a comment

Choose a reason for hiding this comment

Noratrieb Feb 12, 2025

Choose a reason for hiding this comment

real-eren Feb 13, 2025

Choose a reason for hiding this comment

Noratrieb commented Feb 12, 2025

This comment has been minimized.

bors commented Feb 12, 2025

bors commented Feb 12, 2025

This comment has been minimized.

rust-timer commented Feb 12, 2025

Overall result: no relevant changes - no action needed

Instruction count

Max RSS (memory usage)

Cycles

Binary size

Noratrieb commented Feb 13, 2025

bors commented Feb 13, 2025

real-eren commented Feb 13, 2025

Noratrieb commented Feb 13, 2025

Simplify `rustc_span` `analyze_source_file` #136460

Simplify `rustc_span` `analyze_source_file` #136460