Faster sir decoding #116

Merged
merged 2 commits into from
Sep 7, 2020

Conversation

bjorn3
Collaborator

@bjorn3 bjorn3 commented Sep 5, 2020

No description provided.

.section_data_by_name(id.name())
.unwrap_or(borrow::Cow::Borrowed(&[][..])))
.section_by_name(id.name())
.map(|sec| sec.data().expect("failed to decompress section"))
Contributor

What impact does compression have on performance?

Collaborator Author

I don't know. ykrustc does not emit compressed ELF sections. They are an ELF extension that, as far as I know, is only used for debug info in some cases.
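
To make the two lookup shapes in the snippet above concrete, here is a minimal std-only sketch. The `Section` struct and both helper functions are hypothetical stand-ins, not the `object` crate's API, and the section name in the usage note is made up. The first helper mirrors the `unwrap_or(Cow::Borrowed(&[][..]))` shape, where a missing section quietly becomes an empty slice. The second keeps the `Option` so the caller decides what a missing section means.

```rust
use std::borrow::Cow;

/// Toy stand-in for an object-file section (hypothetical, not the `object` crate's type).
struct Section {
    name: &'static str,
    data: Vec<u8>,
}

/// A missing section silently becomes an empty borrowed slice.
fn section_data_or_empty<'a>(sections: &'a [Section], name: &str) -> Cow<'a, [u8]> {
    sections
        .iter()
        .find(|sec| sec.name == name)
        .map(|sec| Cow::Borrowed(sec.data.as_slice()))
        .unwrap_or(Cow::Borrowed(&[][..]))
}

/// The `Option` is kept, so the caller chooses how to handle a missing section.
fn section_data<'a>(sections: &'a [Section], name: &str) -> Option<&'a [u8]> {
    sections
        .iter()
        .find(|sec| sec.name == name)
        .map(|sec| sec.data.as_slice())
}
```

With a single section named `.yksir_a`, `section_data_or_empty(&sections, ".missing")` yields an empty slice, while `section_data(&sections, ".missing")` yields `None`.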

@@ -64,7 +65,9 @@ lazy_static! {

impl Sir {
    pub fn read_file(file: &Path) -> Result<Sir, ()> {
        let ef = elf::File::open_path(file).unwrap();
        // SAFETY: Not really, we hope that nobody changes the file underneath our feet.
        let data = unsafe { Mmap::map(&File::open(file).unwrap()).unwrap() };
Contributor

So mmap is faster?

Collaborator Author

There are two options when using object: either you load the whole file into a Vec<u8>, or you mmap the file. The former would also load the .text and .data sections, not just the .yksir_* sections. Rustc also uses mmap for metadata loading. I have not benchmarked the two approaches against each other, though.
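
The cost difference described here can be illustrated with the standard library alone. This sketch does not use mmap: it swaps in a plain seek-and-read of a byte range as a stand-in for "only touch the bytes you need", and contrasts it with copying the whole file into a `Vec<u8>`. The function names, and the idea that the caller already knows the offset and length, are assumptions made for illustration.

```rust
use std::fs::{self, File};
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// First option from the comment: copy the entire file into a `Vec<u8>`.
fn read_whole(path: &Path) -> io::Result<Vec<u8>> {
    fs::read(path)
}

/// Std-only stand-in for touching only what is needed:
/// seek to `offset` and read exactly `len` bytes.
fn read_range(path: &Path, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    file.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0u8; len];
    file.read_exact(&mut buf)?;
    Ok(buf)
}
```

With mmap the operating system pages data in lazily, so the effect is similar in spirit: sections that are never accessed need not be copied into the process.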

Contributor

vext01 commented Sep 5, 2020

Thanks for this.

How much time does all of this shave off?

Collaborator Author

bjorn3 commented Sep 5, 2020

In debug mode:

Benchmark #1: ./ykview_master_debug ./yktrace_test_master
  Time (mean ± σ):      8.852 s ±  0.094 s    [User: 8.636 s, System: 0.216 s]
  Range (min … max):    8.777 s …  9.103 s    10 runs
 
Benchmark #2: ./ykview_faster_sir_decoding_debug ./yktrace_test_faster_sir_decoding
  Time (mean ± σ):      6.552 s ±  0.010 s    [User: 6.435 s, System: 0.117 s]
  Range (min … max):    6.536 s …  6.564 s    10 runs
 
Summary
  './ykview_faster_sir_decoding_debug ./yktrace_test_faster_sir_decoding' ran
    1.35 ± 0.01 times faster than './ykview_master_debug ./yktrace_test_master'

File size:

-rwxr-xr-x 1 bjorn3 bjorn3  84M Sep  5 17:14 yktrace_test_faster_sir_decoding
-rwxr-xr-x 1 bjorn3 bjorn3 110M Sep  5 16:55 yktrace_test_master

It is about 1.35x faster in debug mode and produces smaller SIR (the test binary shrinks from 110M to 84M).

Bonus benchmark

In release mode:

Benchmark #1: ./ykview_master ./yktrace_test_master
  Time (mean ± σ):      1.043 s ±  0.013 s    [User: 847.8 ms, System: 194.7 ms]
  Range (min … max):    1.035 s …  1.093 s    20 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark #2: ./ykview_faster_sir_decoding ./yktrace_test_faster_sir_decoding
  Time (mean ± σ):     666.1 ms ±   1.7 ms    [User: 558.0 ms, System: 108.0 ms]
  Range (min … max):   663.7 ms … 669.6 ms    20 runs
 
Summary
  './ykview_faster_sir_decoding ./yktrace_test_faster_sir_decoding' ran
    1.57 ± 0.02 times faster than './ykview_master ./yktrace_test_master'
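
As a quick check on the summary lines, the reported ratios follow directly from the mean times. This is a small sketch with hypothetical helper names; it reproduces only the point estimates, not the ± error terms.

```rust
/// How many times faster the candidate is than the baseline (both in seconds).
fn speedup(baseline_secs: f64, candidate_secs: f64) -> f64 {
    baseline_secs / candidate_secs
}

/// Fraction by which `after` is smaller than `before`.
fn relative_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before
}
```

Here `speedup(8.852, 6.552)` is roughly 1.35 (debug) and `speedup(1.043, 0.6661)` is roughly 1.57 (release). `relative_reduction(110.0, 84.0)` is roughly 0.24, so the test binary is about 24% smaller, bearing in mind that the `ls` sizes are rounded to whole megabytes.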

};
use ykpack::{self, bodyflags, Body, CguHash, Decoder, Local, Pack, Ty};
use memmap::Mmap;
use object::{ObjectSection, Object}; // FIXME kill.
Contributor

What's the FIXME about?

Collaborator Author

Huh, I can't remember writing that.

Collaborator Author

What is even weirder is that that line was created using the rust-analyzer import assist as far as I know.

Contributor

vext01 commented Sep 5, 2020

I wish we could build just the SIR loading stuff with optimisations. We never need to trace that so we can optimise to our heart's content!

Contributor

vext01 commented Sep 5, 2020

Related:
rust-lang/rust#54882

Collaborator Author

bjorn3 commented Sep 5, 2020

Would it be possible to build everything except for ykrt without tracing and then make ykrt call into the rest of the crates using #[do_not_trace] wrappers when necessary?

Contributor

vext01 commented Sep 6, 2020

(I'm debugging a buildbot problem, please ignore these bors trys)

bors try

bors bot added a commit that referenced this pull request Sep 6, 2020
Contributor

bors bot commented Sep 6, 2020

try

Build failed:

@vext01 vext01 self-assigned this Sep 7, 2020
Contributor

vext01 commented Sep 7, 2020

> Would it be possible to build everything except for ykrt without tracing and then make ykrt call into the rest of the crates using #[do_not_trace] wrappers when necessary?

In theory yes, I guess we could force the optimisation flags in those crates (in Cargo.toml)?
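
One way this could look with stock Cargo is a per-package profile override in the workspace root manifest. This is a sketch under assumptions: it takes `ykpack` to be the crate doing the SIR decoding, and it says nothing about how the override would interact with ykrustc's tracing flags.

```toml
# Hypothetical fragment for the workspace root Cargo.toml.
# Assumption: ykpack is the crate whose decoding code should be optimised.
[profile.dev.package.ykpack]
opt-level = 3
```

Per-package overrides like this apply only to the named crate, so the rest of the workspace would still build with the default debug settings.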

Contributor

vext01 commented Sep 7, 2020

Please squash.

@bjorn3 bjorn3 force-pushed the faster_sir_decoding branch from d7695b9 to 529fdb9 Compare September 7, 2020 09:53
Collaborator Author

bjorn3 commented Sep 7, 2020

Splat

Contributor

vext01 commented Sep 7, 2020

bors r+

bors bot added a commit that referenced this pull request Sep 7, 2020
116: Faster sir decoding r=vext01 a=bjorn3



Co-authored-by: bjorn3 <[email protected]>
Contributor

bors bot commented Sep 7, 2020

Build failed:

@bjorn3 bjorn3 force-pushed the faster_sir_decoding branch from 529fdb9 to 94224da Compare September 7, 2020 09:59
Contributor

vext01 commented Sep 7, 2020

bors r+

bors bot added a commit that referenced this pull request Sep 7, 2020
116: Faster sir decoding r=vext01 a=bjorn3



Co-authored-by: bjorn3 <[email protected]>
Collaborator Author

bjorn3 commented Sep 7, 2020

Fixed rustfmt. This will need a ykrustc PR too because of the switch to bincode.

Contributor

bors bot commented Sep 7, 2020

Build failed:

Contributor

vext01 commented Sep 7, 2020

> Fixed rustfmt. This will need a ykrustc PR too because of the switch to bincode.

Do you just need a rebuild of ykrustc, or are there code changes?

Collaborator Author

bjorn3 commented Sep 7, 2020

Just a rebuild against the new ykpack. Opened as softdevteam/ykrustc#128. Can you check that I made all changes necessary for cycle breaking?

Collaborator Author

bjorn3 commented Sep 7, 2020

Wait, I need to fix a warning in this PR.

@bjorn3 bjorn3 force-pushed the faster_sir_decoding branch from 94224da to 99637e6 Compare September 7, 2020 10:09
Contributor

vext01 commented Sep 7, 2020

Can we wait for the upstream sync to go in before merging this?

I'm just stuck with:
https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/path.20trimming.20problem

Collaborator Author

bjorn3 commented Sep 7, 2020

The ykrustc PR already got a bors try, so the cached rustc already includes this PR.

Contributor

vext01 commented Sep 7, 2020

OK. I'll probably end up having to redo the sync. Rebasing pretty much never works well for them.

bors r+

Contributor

bors bot commented Sep 7, 2020

Build succeeded:

@bors bors bot merged commit 2c909e2 into ykjit:master Sep 7, 2020
@bjorn3 bjorn3 deleted the faster_sir_decoding branch September 7, 2020 11:16