Simplify `rustc_span` `analyze_source_file` #136460
Only newlines and multibyte characters are actually relevant
r? @fee1-dead rustbot has assigned @fee1-dead. Use |
r? compiler |
Thanks for the cleanup! I did some digging into why the code is weird like this and found #127528 to be the cause. Previously, this complexity around control characters was useful, but it no longer is, so it makes sense to remove it.
Since this is apparently performance-sensitive code, I will do a perf run to see if there are any improvements.
@@ -95,65 +95,32 @@ cfg_match! {
if multibyte_mask == 0 {
    assert!(intra_chunk_offset == 0);

    // Check if there are any control characters in the chunk. All
The for loop above could be a lot cleaner by using `chunks_exact`, if you're interested for a future PR.
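As a rough illustration of that suggestion (not the code from this PR), a chunked scan built on `slice::chunks_exact` might look like the sketch below. The helper name `newline_offsets` and the constant `CHUNK_SIZE` are made up for the example, and the inner loop is plain scalar code standing in for whatever per-chunk work the real function does.

```rust
/// Chunk width assumed for this sketch (an SSE2 register holds 16 bytes).
const CHUNK_SIZE: usize = 16;

/// Hypothetical helper, not the rustc implementation: returns the byte
/// offset of every `\n` in `src`, walking the input in fixed-size chunks.
fn newline_offsets(src: &[u8]) -> Vec<usize> {
    let mut offsets = Vec::new();

    let chunks = src.chunks_exact(CHUNK_SIZE);
    // Bytes left over after the last full chunk; handled separately below.
    let tail = chunks.remainder();

    for (chunk_index, chunk) in chunks.enumerate() {
        let base = chunk_index * CHUNK_SIZE;
        // Every `chunk` here is exactly CHUNK_SIZE bytes long.
        for (i, &b) in chunk.iter().enumerate() {
            if b == b'\n' {
                offsets.push(base + i);
            }
        }
    }

    // Scalar pass over the partial chunk at the end, if any.
    let tail_base = src.len() - tail.len();
    for (i, &b) in tail.iter().enumerate() {
        if b == b'\n' {
            offsets.push(tail_base + i);
        }
    }

    offsets
}
```

The appeal is that the iterator guarantees full-width chunks and hands back the leftover bytes through `remainder()`, so the loop needs no manual index arithmetic for chunk boundaries.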
That's a good point, and now I realize that since the first byte of a UTF8 code point is enough to determine its length, the generic function could just iterate over bytes instead of chars. That's feasible to vectorize and the SSE2 function wouldn't need to bail on non-ASCII chunks. On the one hand, this would require baking in slightly more knowledge of how UTF8 code points work; on the other, we wouldn't need code to handle chars that straddle 2 chunks.
The `chunks_exact` change would be good either way.
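To make the byte-iteration idea concrete, here is a minimal sketch, assuming the input is valid UTF-8 (which `&str` guarantees). The names `utf8_len_from_first_byte` and `scan` are hypothetical and the return shape is invented for the example; this is not how `analyze_source_file` is written.

```rust
/// Length in bytes of the UTF-8 sequence that starts with `first`.
/// Returns 0 for a continuation byte (0b10xx_xxxx), which never starts a char.
fn utf8_len_from_first_byte(first: u8) -> usize {
    match first {
        0x00..=0x7F => 1, // ASCII
        0x80..=0xBF => 0, // continuation byte
        0xC0..=0xDF => 2,
        0xE0..=0xEF => 3,
        _ => 4, // 0xF0..=0xF4 in valid UTF-8
    }
}

/// Hypothetical scan over raw bytes: returns the offsets of all `\n` bytes
/// and an `(offset, length)` pair for every multibyte character.
fn scan(src: &str) -> (Vec<usize>, Vec<(usize, usize)>) {
    let mut newlines = Vec::new();
    let mut multibyte = Vec::new();

    for (pos, &b) in src.as_bytes().iter().enumerate() {
        if b == b'\n' {
            newlines.push(pos);
        } else if b >= 0x80 {
            // Only leading bytes report a non-zero length, so continuation
            // bytes are skipped without decoding the whole character.
            let len = utf8_len_from_first_byte(b);
            if len != 0 {
                multibyte.push((pos, len));
            }
        }
    }

    (newlines, multibyte)
}
```

Because each byte is classified on its own, a character that straddles two chunks needs no special handling: its leading byte is seen in one chunk and its continuation bytes are ignored in the next.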
@bors try @rust-timer queue
…e, r=<try> Simplify `rustc_span` `analyze_source_file` Simplifies the logic to what the code *actually* does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases. Also uses a simpler (and more efficient) means of iterating the set bits of the mask.
☀️ Try build successful - checks-actions
Finished benchmarking commit (c32c2c2): comparison URL.
Overall result: no relevant changes - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count
This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage)
Results (primary 4.0%, secondary 4.6%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
This benchmark run did not return any relevant results for this metric.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 788.874s -> 790.207s (0.17%)
@bors r+ rollup=maybe
I don't have a good idea yet of how relatively hot this function is, but it can be made faster if that would be useful. Between vectorizing the multibyte detection and extending the chunk to 32/64 bytes, there's definitely room for improvement, it's just a question of priorities. Oh, and porting to |
Honestly I'm not sure how important this function really is. One way to find out would be to make a new PR that removes the SSE optimization and then do a perf run to see if it makes the compiler slower.
…yze, r=Noratrieb Simplify `rustc_span` `analyze_source_file` Simplifies the logic to what the code *actually* does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases. Also uses a simpler (and more efficient) means of iterating the set bits of the mask.
…kingjubilee Rollup of 13 pull requests
Successful merges:
- rust-lang#135439 (Make `-O` mean `OptLevel::Aggressive`)
- rust-lang#136460 (Simplify `rustc_span` `analyze_source_file`)
- rust-lang#136642 (Put the alloc unit tests in a separate alloctests package)
- rust-lang#136904 (add `IntoBounds` trait)
- rust-lang#136908 ([AIX] expect `EINVAL` for `pthread_mutex_destroy`)
- rust-lang#136924 (Add profiling of bootstrap commands using Chrome events)
- rust-lang#136951 (Use the right binder for rebinding `PolyTraitRef`)
- rust-lang#136956 (add vendor directory to .gitignore)
- rust-lang#136967 (Use `slice::fill` in `io::Repeat` implementation)
- rust-lang#136976 (alloc boxed: docs: use MaybeUninit::write instead of as_mut_ptr)
- rust-lang#136981 (ci: switch loongarch jobs to free runners)
- rust-lang#136992 (Update backtrace)
- rust-lang#136993 ([cg_llvm] Remove dead error message)
r? `@ghost` `@rustbot` modify labels: rollup
…kingjubilee Rollup of 9 pull requests
Successful merges:
- rust-lang#135439 (Make `-O` mean `OptLevel::Aggressive`)
- rust-lang#136460 (Simplify `rustc_span` `analyze_source_file`)
- rust-lang#136904 (add `IntoBounds` trait)
- rust-lang#136908 ([AIX] expect `EINVAL` for `pthread_mutex_destroy`)
- rust-lang#136924 (Add profiling of bootstrap commands using Chrome events)
- rust-lang#136951 (Use the right binder for rebinding `PolyTraitRef`)
- rust-lang#136981 (ci: switch loongarch jobs to free runners)
- rust-lang#136992 (Update backtrace)
- rust-lang#136993 ([cg_llvm] Remove dead error message)
r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#136460 - real-eren:simplify-rustc_span-analyze, r=Noratrieb Simplify `rustc_span` `analyze_source_file` Simplifies the logic to what the code *actually* does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases. Also uses a simpler (and more efficient) means of iterating the set bits of the mask.
…h, r=<try> Remove SSE2 path from `rustc_span` `analyze_source_file` Follow-up to rust-lang#136460. When the SSE2 optimization was [introduced](https://github.com/michaelwoerister/rust/blob/a1f8a6ce80a340d51074071c0d9e30eb14f65d25/src/libsyntax_pos/analyze_filemap.rs), the generic path recorded `NonNarrowChar`s for ASCII control characters. Nowadays, `analyze_source_file` only deals with newlines and multi-byte chars. The point of this PR is to see, via a perf run, whether the SSE2 path still provides a meaningful improvement over the generic path. If it doesn't, it could be removed. The function can be simplified further after inlining; I left it as-is for the initial perf run so that it's easier to see that the behavior is unchanged. r? `@Noratrieb`
Simplifies the logic to what the code actually does, which is to just record newlines and multibyte characters. Checking for other ASCII control characters is unnecessary because the generic fallback doesn't do anything for those cases.
Also uses a simpler (and more efficient) means of iterating the set bits of the mask.
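For readers unfamiliar with the idiom, one common way to iterate the set bits of a mask is to read the index of the lowest set bit with `trailing_zeros` and then clear that bit with `mask &= mask - 1`. The sketch below shows that technique on a `u16`, the width a 16-byte SSE2 compare mask would have; the function name `set_bit_indices` is hypothetical, and this may not match the PR's code line for line.

```rust
/// Hypothetical helper: returns the indices of the set bits in `mask`,
/// from least significant to most significant.
fn set_bit_indices(mut mask: u16) -> Vec<u32> {
    let mut indices = Vec::new();
    while mask != 0 {
        // Index of the lowest set bit.
        indices.push(mask.trailing_zeros());
        // Clear the lowest set bit; `mask` is non-zero here, so the
        // subtraction cannot underflow.
        mask &= mask - 1;
    }
    indices
}
```

The loop runs once per set bit rather than once per bit position, so a chunk containing a single newline costs one iteration.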