install.sh: Performance: Use more shell builtins #106
base: master
Replace echo/grep/cut/dirname/basename by variable substitutions and case pattern matching to reduce the amount of subprocesses called for every copied file.
eval $v=$val
fi
case "${arg}" in
"--${op}="* )
FYI, this is technically a behavioral change because the original code grepped for "--$op" instead of "^--$op" -- testing for the prefix is probably intended, though. Same for the --$option case below.
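To make the behavioral difference concrete, here is a minimal sketch. The two helper functions are hypothetical names made up for illustration, not functions from install.sh; the grep variant only imitates the unanchored match described above.

```shell
# Hypothetical helper: succeeds only if $1 STARTS with "--$2=".
has_option_prefix() {
    case "$1" in
        "--$2="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: unanchored match, in the style of the old grep call.
# It spawns a subshell and an external grep on every invocation.
has_option_grep() {
    printf '%s\n' "$1" | grep -e "--$2=" > /dev/null
}

arg='--destdir=/tmp/x'
tricky='foo--destdir=/tmp/x'

has_option_prefix "$arg" destdir && echo "case: match for $arg"
has_option_prefix "$tricky" destdir || echo "case: no match for $tricky"
has_option_grep "$tricky" destdir && echo "grep: match for $tricky"
```

Only the unanchored grep accepts the second argument, which is the (probably unintended) behavior the case pattern drops.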
Since the manifests have sorted file lists, consecutive invocations of abs_path (make_dir_recursive) likely do redundant work. Avoiding these redundancies reduces the amount of subprocessing per file further.
local file_path_dirname="${file_path%/*}"
local file_path_basename="${file_path##*/}"
NB: These dirname/basename substitutions assume more well-formed paths, e.g., no trailing slashes etc.
Let me know if you aren't comfortable with adding these constraints and if I should revert to dirname/basename.
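A small sketch of where the expansions and the external tools diverge. The helper names dir_of and base_of and the sample paths are made up for illustration; only the two parameter expansions come from the diff.

```shell
# Hypothetical helpers wrapping the two substitutions from the diff.
dir_of()  { printf '%s\n' "${1%/*}"; }
base_of() { printf '%s\n' "${1##*/}"; }

# Well-formed path, trailing slash, and a path without any slash.
for p in 'lib/rustlib/manifest' 'lib/rustlib/' 'manifest'; do
    printf 'path=%s\n' "$p"
    printf '  expansion: dir=%s base=%s\n' "$(dir_of "$p")" "$(base_of "$p")"
    printf '  external:  dir=%s base=%s\n' "$(dirname "$p")" "$(basename "$p")"
done
```

For the well-formed path both variants agree. With a trailing slash the expansion yields an empty basename, and for a path without a slash `${p%/*}` returns the path unchanged where dirname would print `.`.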
Thanks for the PR! It might take a bit for me to review the PR, but I'll eventually get it done :)
Does this address the apparent hang after
@fweimer, these changes replace all those
install-template.sh
# Skip if the last invocation of make_dir_recursive had the same argument
if ! [ "$_dir" = "${_make_dir_recursive_cached_key:-}" ]
then
_make_dir_recursive_cached_key="$_dir"
This seems like kind of an odd optimization; it only works if the very last call was for the same directory. Have you compared the times with and without it?
Yes, see the commit message of 53d557b: input lists are sorted by path and directories usually contain more than one file, so this optimization is hit frequently.
I did compare times and the difference was significant enough to warrant the added complexity. However, I did not record any numbers.
This also significantly reduced the log file size.
Went AFK for a bit to let this run:
# pwd
/tmp/rust-1.51.0-x86_64-unknown-linux-gnu
# stat -fc%T /tmp
tmpfs
# dst=/tmp/destdir
# for c in 53d557b 3d7ed69 5254dbf ; do curl -sLO "https://raw.githubusercontent.com/mbargull/rust-installer/${c}/install-template.sh" ; rm -rf "${dst}" ; printf %s\\n "${c}" ; time bash install-template.sh --destdir="${dst}" >/dev/null ; find "${dst}" -name \*.log -exec stat -c'%n %s' {} \; ; done
53d557b
install: WARNING: failed to run ldconfig. this may happen when not installing as root. run with --verbose to see the error
real 44.19
user 35.54
sys 11.26
/tmp/destdir/usr/local/lib/%%TEMPLATE_REL_MANIFEST_DIR%%/install.log 11361126
3d7ed69
install: WARNING: failed to run ldconfig. this may happen when not installing as root. run with --verbose to see the error
real 105.68
user 82.46
sys 30.43
/tmp/destdir/usr/local/lib/%%TEMPLATE_REL_MANIFEST_DIR%%/install.log 12935429
5254dbf
install: WARNING: failed to run ldconfig. this may happen when not installing as root. run with --verbose to see the error
real 688.47
user 570.21
sys 207.40
/tmp/destdir/usr/local/lib/%%TEMPLATE_REL_MANIFEST_DIR%%/install.log 12935429
So for me that commit cuts about 60 % of the time on a tmpfs.
(The install.log size difference is not that big in this case due to the short prefix/destdir. At https://github.com/conda-forge/rust-feedstock we have longer (~ 255 characters) install prefixes, resulting in a much larger log file.)
Hmm, I wonder if you could get almost as much of a speedup by removing the logging instead. It sounds like it takes a while and isn't super useful.
No, the logging just uses the echo builtin, which should be able to offer high throughput.
One thing to try is whether we can just use a test -d DIR for make_dir_recursive instead of the memoized logic. (For the abs path function I don't see a similarly performing alternative.)
Yep, test -d gives the same run time. I've added two cleanup commits.
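For readers following along, the test -d idea can be sketched roughly as below. make_dir_if_missing and the mkdir_calls counter are hypothetical stand-ins for illustration, not the actual make_dir_recursive from install-template.sh, and the file list is made up.

```shell
# Hypothetical stand-in for make_dir_recursive; counts external mkdir calls.
mkdir_calls=0
make_dir_if_missing() {
    # test is a shell builtin, so an existing directory costs no subprocess.
    if ! test -d "$1"; then
        mkdir -p "$1" || return 1
        mkdir_calls=$((mkdir_calls + 1))
    fi
}

tmp_root=$(mktemp -d)
# Sorted input, as in the manifests: consecutive files share a directory.
for f in share/doc/a.html share/doc/b.html share/doc/c.html share/man/x.1; do
    make_dir_if_missing "$tmp_root/${f%/*}"
done
echo "mkdir invoked $mkdir_calls times for 4 files"
rm -rf "$tmp_root"
```

Unlike the cached-key variant, this check does not depend on the input order, at the cost of one stat per file.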
(Never mind the 2nd commit which I force-pushed away. I just forgot why (abs path from file vs dir) I split those functions.)
7294f01 to 2ccfecc
@@ -625,7 +626,7 @@ install_components() {
maybe_backup_path "$_file_install_path"
- if echo "$_file" | grep "^bin/" > /dev/null || test -x "$_src_dir/$_component/$_file"
+ if test -z "${_file##bin/*}" || test -x "$_src_dir/$_component/$_file"
test -z "${_file##bin/*}" unintentionally accepts the empty string; [ "${_file#bin/}" != "$_file" ] would be a slightly more faithful translation. (Presumably this doesn’t really matter.)
Or there’s [[ "$_file" = bin/* ]] since this is #!/bin/bash, although the script seems to otherwise avoid bashisms so I guess that avoidance should be preserved.
${_file} is never empty here because of https://github.com/rust-lang/rust-installer/pull/106/files#diff-ec68db39ae4ea5bfe559a47a8a880d5d383dfa81ba067652d2adfcc3c0cd2a17R571 .
So,
if test -z "${_file##bin/*}" || ...
if [ "${_file#bin/*}" != "${_file}" ] || ...
if case "${_file}" in bin/*) ;; *) false ; esac || ... (POSIX-y [[ "${_file}" = bin/* ]])
should all be equivalent here.
If someone has a strong preference for any of those options (I don't), then I'm happy to change it accordingly!
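A quick way to check the claim, including the empty-string edge case raised above. The three wrapper functions are hypothetical names for this sketch; the conditions inside them follow the variants discussed in this thread.

```shell
# Hypothetical wrappers around the three POSIX variants.
is_bin_test_z() { test -z "${1##bin/*}"; }
is_bin_strip()  { [ "${1#bin/}" != "$1" ]; }
is_bin_case()   { case "$1" in bin/*) return 0 ;; *) return 1 ;; esac; }

# Sample inputs are made up; the last one is the empty string.
for f in 'bin/rustc' 'lib/librustc.so' 'sbin/x' ''; do
    for check in is_bin_test_z is_bin_strip is_bin_case; do
        if "$check" "$f"; then r=yes; else r=no; fi
        printf '%-14s %-18s %s\n' "$check" "'$f'" "$r"
    done
done
```

All three agree on non-empty input; only the test -z form answers yes for the empty string, which does not matter if ${_file} is never empty at that point.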
The current echo | grep / echo | cut / dirname / basename invocations slow down the installation process, especially when components with many files (rust-docs) are processed. This pull request replaces them with shell builtin substitutions to avoid the overhead of spawning processes.
Locally for me, these small changes reduced the install time (for rust-docs only) from around 20 minutes to around 3 minutes.
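As an illustration of the kind of replacement meant here, the sketch below extracts an option value once with echo | cut and once with a parameter expansion. The variable names and the sample argument are assumptions for this example, not lines taken from install.sh.

```shell
arg='--prefix=/opt/rust'

# Subprocess-based: a command substitution plus an external cut per call.
val_cut=$(printf '%s\n' "$arg" | cut -d '=' -f 2)

# Builtin-only: strip the shortest leading match of '*=' in the shell itself.
val_builtin=${arg#*=}

echo "cut:     $val_cut"
echo "builtin: $val_builtin"
```

The two are not identical in every case: if the value itself contains an `=`, `cut -f 2` returns only the second field while `${arg#*=}` keeps everything after the first `=`.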