Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify rustc_span analyze_source_file #136460

Merged
merged 2 commits into from
Feb 14, 2025

Conversation

real-eren
Copy link
Contributor

Simplifies the logic to what the code actually does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases.
Also uses a simpler (and more efficient) means of iterating the set bits of the mask.

Only newlines and multibyte characters are actually relevant
@rustbot
Copy link
Collaborator

rustbot commented Feb 3, 2025

r? @fee1-dead

rustbot has assigned @fee1-dead.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 3, 2025
@fee1-dead
Copy link
Member

r? compiler

@rustbot rustbot assigned Noratrieb and unassigned fee1-dead Feb 4, 2025
Copy link
Member

@Noratrieb Noratrieb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the cleanup! I did some digging into why the code is weird like this, and I found #127528 to be the cause, previously this complexity around control characters was useful, but now it no longer is, so it makes sense to remove.
Since this is apparently performance-sensitive code, I will do a perf run to see if there are any improvements.

@@ -95,65 +95,32 @@ cfg_match! {
if multibyte_mask == 0 {
assert!(intra_chunk_offset == 0);

// Check if there are any control characters in the chunk. All
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The for loop above could be a lot cleaner by using chunks_exact if you're interested for a future PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, and now I realize that since the first byte of a UTF8 code point is enough to determine its length, the generic function could just iterate over bytes instead of chars. That's feasible to vectorize and the SSE2 function wouldn't need to bail on non-ASCII chunks. On the one hand, this would require baking in slightly more knowledge of how UTF8 code points work; on the other, we wouldn't need code to handle chars that straddle 2 chunks.

The chunks_exact change would be good either way.

@Noratrieb
Copy link
Member

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 12, 2025
@bors
Copy link
Contributor

bors commented Feb 12, 2025

⌛ Trying commit d6ca7ad with merge c32c2c2...

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 12, 2025
…e, r=<try>

Simplify `rustc_span` `analyze_source_file`

Simplifies the logic to what the code *actually* does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases.
Also uses a simpler (and more efficient) means of iterating the set bits of the mask.
@bors
Copy link
Contributor

bors commented Feb 12, 2025

☀️ Try build successful - checks-actions
Build commit: c32c2c2 (c32c2c25815becf86de3ab2b141cb186420150f2)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (c32c2c2): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 4.0%, secondary 4.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
4.0% [2.3%, 5.8%] 2
Regressions ❌
(secondary)
4.6% [3.6%, 6.3%] 3
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 4.0% [2.3%, 5.8%] 2

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 788.874s -> 790.207s (0.17%)
Artifact size: 347.76 MiB -> 347.67 MiB (-0.03%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 12, 2025
@Noratrieb
Copy link
Member

@bors r+ rollup=maybe

@bors
Copy link
Contributor

bors commented Feb 13, 2025

📌 Commit d6ca7ad has been approved by Noratrieb

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 13, 2025
@real-eren
Copy link
Contributor Author

Since this is apparently performance-sensitive code, I will do a perf run to see if there are any improvements.

I don't have a good idea yet of how relatively hot this function is, but it can be made faster if that would be useful. Between vectorizing the multibyte detection and extending the chunk to 32/64 bytes, there's definitely room for improvement, it's just a question of priorities.

Oh, and porting to neon for aarch64 targets.

@Noratrieb
Copy link
Member

Honestly I'm not sure how important this function really is. One way to find out would be to make a new PR that removes the SSE optimization and then do a perf run to see if it makes the compiler slower.

workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Feb 14, 2025
…yze, r=Noratrieb

Simplify `rustc_span` `analyze_source_file`

Simplifies the logic to what the code *actually* does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases.
Also uses a simpler (and more efficient) means of iterating the set bits of the mask.
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 14, 2025
…kingjubilee

Rollup of 13 pull requests

Successful merges:

 - rust-lang#135439 (Make `-O` mean `OptLevel::Aggressive`)
 - rust-lang#136460 (Simplify `rustc_span` `analyze_source_file`)
 - rust-lang#136642 (Put the alloc unit tests in a separate alloctests package)
 - rust-lang#136904 (add `IntoBounds` trait)
 - rust-lang#136908 ([AIX] expect `EINVAL` for `pthread_mutex_destroy`)
 - rust-lang#136924 (Add profiling of bootstrap commands using Chrome events)
 - rust-lang#136951 (Use the right binder for rebinding `PolyTraitRef`)
 - rust-lang#136956 (add vendor directory to .gitignore)
 - rust-lang#136967 (Use `slice::fill` in `io::Repeat` implementation)
 - rust-lang#136976 (alloc boxed: docs: use MaybeUninit::write instead of as_mut_ptr)
 - rust-lang#136981 (ci: switch loongarch jobs to free runners)
 - rust-lang#136992 (Update backtrace)
 - rust-lang#136993 ([cg_llvm] Remove dead error message)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 14, 2025
…kingjubilee

Rollup of 9 pull requests

Successful merges:

 - rust-lang#135439 (Make `-O` mean `OptLevel::Aggressive`)
 - rust-lang#136460 (Simplify `rustc_span` `analyze_source_file`)
 - rust-lang#136904 (add `IntoBounds` trait)
 - rust-lang#136908 ([AIX] expect `EINVAL` for `pthread_mutex_destroy`)
 - rust-lang#136924 (Add profiling of bootstrap commands using Chrome events)
 - rust-lang#136951 (Use the right binder for rebinding `PolyTraitRef`)
 - rust-lang#136981 (ci: switch loongarch jobs to free runners)
 - rust-lang#136992 (Update backtrace)
 - rust-lang#136993 ([cg_llvm] Remove dead error message)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 8f4b766 into rust-lang:master Feb 14, 2025
7 checks passed
@rustbot rustbot added this to the 1.86.0 milestone Feb 14, 2025
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Feb 14, 2025
Rollup merge of rust-lang#136460 - real-eren:simplify-rustc_span-analyze, r=Noratrieb

Simplify `rustc_span` `analyze_source_file`

Simplifies the logic to what the code *actually* does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases.
Also uses a simpler (and more efficient) means of iterating the set bits of the mask.
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 16, 2025
…h, r=<try>

Remove SSE2 path from `rustc_span` `analyze_source_file`

Follow-up to  rust-lang#136460. When the SSE2 optimization was [introduced](https://github.com/michaelwoerister/rust/blob/a1f8a6ce80a340d51074071c0d9e30eb14f65d25/src/libsyntax_pos/analyze_filemap.rs), the generic path recorded `NonNarrowChar`s for ASCII control characters. Nowadays, `analyze_source_file` only deals with newlines and multi-byte chars.

The point of this PR is to see, via a perf run, whether the SSE2 path still provides a meaningful improvement over the generic path. If it doesn't, it could be removed.

The function can be simplified further after inlining; I left it as-is for the initial perf run so that it's easier to see that the behavior is unchanged.

r? `@Noratrieb`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants