Speed up #441
Codecov Report
@@ Coverage Diff @@
## master #441 +/- ##
==========================================
- Coverage 88.48% 88.29% -0.20%
==========================================
Files 36 36
Lines 3535 3289 -246
==========================================
- Hits 3128 2904 -224
+ Misses 407 385 -22
Thanks a lot for the PR. I was not able to reproduce quite as big a difference, but it is a huge improvement nonetheless.
Hey man! You're right, some changes weren't necessary. I reverted some of them and tidied expressions in a few places. This commit now speeds things up in 2 ways:
String allocation
- Use a `Cow` borrowing from the `Path` instead of an owned `String` in the `Name` struct, so allocating a new `String` only happens if and when needed.
- Use `to_ascii_lowercase` instead of `to_lowercase`, since it is faster and the keys we have in `Icons` only contain, and probably will always only contain, ASCII letters.
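The two string changes above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `DisplayName`, `extension_key` and `is_borrowed` are hypothetical names, and the real `Name` struct holds more than this.

```rust
use std::borrow::Cow;
use std::path::Path;

/// Hypothetical stand-in for a name wrapper; not lsd's actual `Name` struct.
struct DisplayName<'a> {
    text: Cow<'a, str>,
}

impl<'a> DisplayName<'a> {
    fn new(path: &'a Path) -> Self {
        // `to_string_lossy` borrows when the name is valid UTF-8 and only
        // allocates a `String` when it has to replace invalid sequences.
        let text = match path.file_name() {
            Some(name) => name.to_string_lossy(),
            None => path.to_string_lossy(),
        };
        DisplayName { text }
    }

    /// Lowercased extension, suitable for a lookup in an ASCII-keyed table.
    fn extension_key(&self) -> Option<String> {
        let (stem, ext) = self.text.rsplit_once('.')?;
        if stem.is_empty() {
            // Dotfiles such as `.bashrc` have no extension.
            return None;
        }
        // ASCII-only lowercasing avoids the Unicode case-mapping work.
        Some(ext.to_ascii_lowercase())
    }

    fn is_borrowed(&self) -> bool {
        matches!(self.text, Cow::Borrowed(_))
    }
}

fn main() {
    let path = Path::new("/tmp/Photo.JPG");
    let name = DisplayName::new(path);
    println!(
        "{} borrowed={} key={:?}",
        name.text,
        name.is_borrowed(),
        name.extension_key()
    );
}
```

The allocation for the lowercased key still happens on lookup; the saving in this sketch is that constructing the name itself does not copy the string.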
Syscalls
On Linux, fetching the file info for all the files was quite slow, especially when the information isn't needed (as with `-R` and `--tree`). The syscalls have been deferred until they are needed when rendering the `Meta` entry.
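Deferring the syscall could look something like the sketch below. It assumes a hypothetical `LazyEntry` type and uses `std::cell::OnceCell` (Rust 1.70+) for the cache; the PR itself may cache the information differently.

```rust
use std::cell::OnceCell;
use std::fs::{self, Metadata};
use std::path::PathBuf;

/// Hypothetical entry type; lsd's real `Meta` struct is laid out differently.
struct LazyEntry {
    path: PathBuf,
    // Filled on first access, so entries whose details are never rendered
    // never pay for the stat call.
    metadata: OnceCell<Option<Metadata>>,
}

impl LazyEntry {
    fn new(path: PathBuf) -> Self {
        LazyEntry {
            path,
            metadata: OnceCell::new(),
        }
    }

    /// Runs the syscall on the first call and reuses the result afterwards.
    fn metadata(&self) -> Option<&Metadata> {
        self.metadata
            .get_or_init(|| fs::symlink_metadata(&self.path).ok())
            .as_ref()
    }

    fn is_loaded(&self) -> bool {
        self.metadata.get().is_some()
    }
}

fn main() {
    let entry = LazyEntry::new(std::env::temp_dir());
    println!("loaded before: {}", entry.is_loaded());
    if let Some(meta) = entry.metadata() {
        println!("is_dir: {}", meta.is_dir());
    }
    println!("loaded after: {}", entry.is_loaded());
}
```

A failed lookup is cached as `None` here, so a missing file is not queried again either.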
What do you think about jwalk?
I tried to re-implement it with If |
This commit makes some changes based on analysing the stack trace with `cargo flamegraph`. All changes are within the creation of the `Meta` struct since this takes up most of the processing time. Some heavy operations, namely memory allocations and system calls, have been either reduced or deferred until the information is needed, significantly speeding up the `-R` and `--tree` options. This commit also simplifies some code and makes the repo ignore `Cargo.lock` since it is not needed for binary projects.
After looking into it, I will see whether it helps specifically for these recursive descents. I'll try it.
@0jdxt Do you plan to finish this PR and resolve the merge conflicts?
Hi! I've been busy, so feel free. It's just such a mess that I feel I should be responsible for cleaning it up. That said, this PR could be merged soon; I'm just a bit concerned about how I've changed the flow of the program, and the structure is getting quite complex. Either way, the repo needs a tidy-up, at least with some comments describing the flow of the program so it's easier to follow, especially for those who want to contribute and are reading the code for the first time.
Hey @0jdxt, you will have to rebase on master instead of merging from master.
summary: Some code is merely tidied. Replaced |
Thanks a lot, this has a great deal of code cleanup as well.
Just took a quick look at the changes.
Hey @0jdxt, please don't squash and force-push every change. It makes it harder to review just the changes.
You can click on "force pushed" to see an interdiff.
Hi, sorry, I'm still not used to this PR stuff.
This PR reduces syscalls, especially repeated ones. We should only get the metadata once per file. It also reduces the allocation of extra data, especially when that data is not used.
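To illustrate the "metadata once per file" point: in the standard library, helpers such as `Path::is_dir` each query the filesystem on their own, so reading several properties through them repeats the syscall. A rough sketch of the alternative, with a hypothetical `Summary` struct and `summarize` function that are not part of lsd:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical summary of the fields a listing might show.
struct Summary {
    is_dir: bool,
    len: u64,
    readonly: bool,
}

fn summarize(path: &Path) -> io::Result<Summary> {
    // One stat-like call; every field below is read from the same result.
    // Calling `path.is_dir()` and `path.is_file()` separately would each
    // query the filesystem again.
    let meta = fs::symlink_metadata(path)?;
    Ok(Summary {
        is_dir: meta.is_dir(),
        len: meta.len(),
        readonly: meta.permissions().readonly(),
    })
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("metadata_once_example.txt");
    fs::write(&path, b"hello")?;
    let s = summarize(&path)?;
    println!("is_dir={} len={} readonly={}", s.is_dir, s.len, s.readonly);
    fs::remove_file(&path)
}
```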
// Check through libc if stdout is a tty. Unix specific so not on windows.
// Determine color output availability (and initialize color output (for Windows 10))
#[cfg(not(target_os = "windows"))]
let tty_available = unsafe { libc::isatty(io::stdout().as_raw_fd()) == 1 };
src/display.rs
Outdated
)
}

generate_counter!(DIR_COUNT, u32);
Do we have to use this here? I think it would be better (simpler) to just have `usize` values created in `fn tree`.
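One way to read this suggestion is sketched below: the counters are plain `usize` values owned by the top-level call and threaded through the recursion by mutable reference, rather than living in macro-generated global state. `Node`, `Counts`, `walk`, `dir` and `file` are hypothetical names for illustration only; the actual `fn tree` in `src/display.rs` has a different signature.

```rust
/// Hypothetical tree node; not lsd's `Meta` type.
struct Node {
    name: String,
    is_dir: bool,
    children: Vec<Node>, // empty for files
}

#[derive(Debug, Default, PartialEq)]
struct Counts {
    dirs: usize,
    files: usize,
}

fn dir(name: &str, children: Vec<Node>) -> Node {
    Node { name: name.to_string(), is_dir: true, children }
}

fn file(name: &str) -> Node {
    Node { name: name.to_string(), is_dir: false, children: Vec::new() }
}

fn walk(node: &Node, depth: usize, counts: &mut Counts, out: &mut String) {
    out.push_str(&"  ".repeat(depth));
    out.push_str(&node.name);
    out.push('\n');
    if node.is_dir {
        counts.dirs += 1;
    } else {
        counts.files += 1;
    }
    for child in &node.children {
        walk(child, depth + 1, counts, out);
    }
}

fn tree(root: &Node) -> (String, Counts) {
    // The counters are local to this call instead of global state.
    let mut counts = Counts::default();
    let mut out = String::new();
    walk(root, 0, &mut counts, &mut out);
    (out, counts)
}

fn main() {
    let root = dir("src", vec![file("main.rs"), dir("meta", vec![file("mod.rs")])]);
    let (out, counts) = tree(&root);
    print!("{}", out);
    println!("{} directories, {} files", counts.dirs, counts.files);
}
```

Keeping the counts local also means two calls to `tree` cannot interfere with each other, which global counters would need extra care to guarantee.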
src/display.rs
Outdated
let index = match flags.blocks.0.iter().position(|&b| b == Block::Name) {
    Some(i) => i,
    None => 0,
};
- let index = match flags.blocks.0.iter().position(|&b| b == Block::Name) {
-     Some(i) => i,
-     None => 0,
- };
+ let index = flags.blocks.0.iter().position(|&b| b == Block::Name).unwrap_or(0);
Codecov Report
@@ Coverage Diff @@
## master #441 +/- ##
==========================================
+ Coverage 80.71% 81.16% +0.44%
==========================================
Files 35 35
Lines 3449 3413 -36
==========================================
- Hits 2784 2770 -14
+ Misses 665 643 -22
Hi @0jdxt, thanks so much for working on this massive PR! I wanted to try to move this PR forward just now, but I found that it mixes several functionality changes together, and some of them break the original behaviour. For example:
How about breaking the PR down into several smaller ones, each containing a single functionality change? That way we can make sure each one works as expected, and reviews and merges will be quicker.
I'm closing this PR for now as it has deviated quite a bit from master, but let me know if you are still interested in this and we can open it back up.
Whilst working on my other PR I found `--tree` to be quite slow compared to `tree`, so I analysed the `cargo flamegraph` results. This led to reducing unnecessary allocations and system calls, producing a significant speed gain for the `--tree` and `-R` options.

Original flamegraph: (--tree)
After optimisations: (--tree)
After optimisations: (-R)

Originally, the main bottleneck was the system calls made to retrieve the `uid` and `gid` properties, and the subsequent processing into strings, even though the `-R` and `--tree` options do not need this information. So now, on Unix, the information is lazy loaded.

Now, in the optimised `--tree` graph, we see that the main bottleneck is creating the rest of the `Meta` struct, and that the sorting and display isn't too shabby. In the optimised `-R` graph, we see that displaying and sorting information into a grid is the biggest bottleneck for this option. This suggests that, for future optimisations, some information and processing may need to be lazy loaded for the `Meta` struct, or perhaps the algorithms simply need optimising, on top of improving the grid/tree display performance.

Nevertheless, overall this PR has managed to make the following speed gains, tested only on my Linux x86_64 machine with hyperfine, compared with the native equivalents `ls -R` and `tree`:

~ 5 000 directories, 56 000 files, 27G
- `lsd --tree`
- `lsd --tree` (opt.)
- `tree`
- `lsd -R`
- `lsd -R` (opt.)
- `ls -R`