
Should download and install concurrently #731

Open
nagisa opened this issue Sep 23, 2016 · 7 comments

@nagisa
Member

nagisa commented Sep 23, 2016

Currently rustup is scarily sequential. It could easily be downloading the next component while installing the previously downloaded one.

@brson
Contributor

brson commented May 12, 2017

I don't know if it would be profitable to do multiple network requests in parallel, or multiple file ops in parallel, but it would definitely be profitable to be doing networking while doing file I/O.

Today rustup does installation in two phases: first it acquires all the resources off the network; then it installs them. It does this to eliminate the uncertainty of the network failing during installation.

The file I/O in rustup does go through a transactional system that is supposed to be able to roll back; it definitely does in the test suite, and I've seen it perform rollbacks live. I am not super-confident that it is bulletproof, though we could try interleaving downloading and installation and see how it goes.

Adding parallelism here would make status messages nondeterministic and more confusing.

There are definitely opportunities for improvement here, though I'm not sure I'm ready to pull the trigger yet without thinking about the constraints more. If somebody wanted to give it a shot I'd be happy to review.
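
As a rough editorial illustration of the interleaving described above (not rustup's actual code): a downloader thread feeds an installer loop through a channel, so networking and file I/O overlap instead of running back to back. The component list is made up and sleeps stand in for real network and disk work.

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical component list; the real one comes from the channel manifest.
const COMPONENTS: &[&str] = &["rustc", "rust-std", "cargo", "rust-docs"];

fn main() {
    let (tx, rx) = mpsc::channel();

    // Downloader thread: fetches the next archive while the main thread
    // is still unpacking the previous one.
    let downloader = thread::spawn(move || {
        for name in COMPONENTS {
            // Stand-in for the real HTTP download and hash check.
            thread::sleep(Duration::from_millis(300));
            tx.send(name.to_string()).expect("installer hung up");
        }
        // Dropping tx closes the channel and ends the install loop below.
    });

    // Installer: unpacks archives in the order they finish downloading.
    for archive in rx {
        // Stand-in for extracting the tarball inside a transaction.
        thread::sleep(Duration::from_millis(300));
        println!("installed {archive}");
    }

    downloader.join().unwrap();
}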

@nagisa
Member Author

nagisa commented May 12, 2017

> I don't know if it would be profitable to do multiple network requests in parallel

I think the reason I proposed this in the first place is that rustc’s CDN is terribly slow for content that isn’t cached yet. Rereading the original report, I guess that wasn’t the reason, but it is still worth doing IMO.

For reference, on my 100Mbps connection, downloading the tarball for x86_64-unknown-linux-gnu or x86_64-pc-windows-msvc is pretty fast (~8-11MB/s), but downloading artefacts for something like powerpc64-unknown-linux-gnu will run at some abysmal sub-1MB/s (yes, it is THAT terrible). Having 3 parallel network streams (rustc/libstd/cargo), each at ~1MB/s for ~3MB/s combined, would cut the download time roughly by a factor of 3. Using HTTP range requests could improve the download times even further.

As for output status/messages, maybe something like indicatif could work?
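
For a sense of what parallel streams plus indicatif could look like, here is a minimal sketch assuming indicatif's 0.17-style API; the component sizes are made up and the downloads are simulated with sleeps rather than real HTTP.

use std::thread;
use std::time::Duration;

use indicatif::{MultiProgress, ProgressBar, ProgressStyle};

fn main() {
    // Hypothetical sizes (in MB), roughly the right order of magnitude.
    let components = [("rustc", 80u64), ("rust-std", 40), ("cargo", 10)];

    let multi = MultiProgress::new();
    let style = ProgressStyle::with_template("{msg:10} [{bar:40}] {pos}/{len} MB")
        .unwrap();

    thread::scope(|s| {
        for (name, megabytes) in components {
            let bar = multi.add(ProgressBar::new(megabytes));
            bar.set_style(style.clone());
            bar.set_message(name);
            s.spawn(move || {
                // Stand-in for streaming one component off the CDN.
                for _ in 0..megabytes {
                    thread::sleep(Duration::from_millis(20));
                    bar.inc(1);
                }
                bar.finish();
            });
        }
    });
}

Each component gets its own bar in the MultiProgress set, which keeps concurrent status output readable instead of interleaving log lines.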

@rbtcollins
Contributor

One thing I think is worth noting is that the failure modes leading to a partial install go up with concurrent download-and-install: the transactional system is only an approximation of one. It has no write-ahead journal, nor any ability to recover after interruptions.

My recommendation would be to eliminate the transactional system in favour of an interrupt-safe, eventually-correct system: use the manifests and installed metadata to clean up (coarsely, not per-individual-file!) after interrupted executions. This would permit streaming installation, where the archive doesn't have to get written to disk at all.
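
To make the streaming idea concrete, a sketch of unpacking a component straight from the HTTP stream, assuming the reqwest, xz2, and tar crates, a hypothetical URL and destination, and leaving out the validation and cleanup that the eventually-correct scheme above would still require:

use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    // Hypothetical dist URL; the real one comes from the channel manifest.
    let url = "https://static.rust-lang.org/dist/cargo-nightly-x86_64-unknown-linux-gnu.tar.xz";
    let dest = "/tmp/streaming-install-demo";

    // The response body implements Read, so it can be decompressed and
    // untarred as the bytes arrive, without writing the archive to disk.
    let response = reqwest::blocking::get(url)?;
    let decoder = xz2::read::XzDecoder::new(response);
    let mut archive = tar::Archive::new(decoder);
    archive.unpack(dest)?;

    Ok(())
}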

@kinnison
Contributor

kinnison commented Apr 9, 2019

@rbtcollins That would certainly be a better approach. Unfortunately wg-rustup doesn't have a lot of time for large architectural changes like this right now. :(

@rbtcollins
Contributor

See also #2417

@dtolnay
Member

dtolnay commented Jan 5, 2024

I did an experiment to see how much performance is being left on the table in the current architecture. See https://github.com/dtolnay/fast-rustup.

rustup:

$ rustup toolchain remove nightly-2024-01-01
$ time rustup toolchain install nightly-2024-01-01
17.9 seconds

fast-rustup:

$ rustup toolchain remove nightly-2024-01-01
$ time target/release/fast-rustup nightly-2024-01-01
5.4 seconds

This is tested on my laptop where I get 90+ MiB/s from static.rust-lang.org.

Right now it just supports the "default" profile. It installs the same contents as rustup aside from what looks like some bookkeeping differences. diff -r:

Only in rustup/lib/rustlib: components
Only in rustup/lib/rustlib: manifest-cargo-x86_64-unknown-linux-gnu
Only in rustup/lib/rustlib: manifest-clippy-preview-x86_64-unknown-linux-gnu
Only in rustup/lib/rustlib: manifest-rustc-x86_64-unknown-linux-gnu
Only in rustup/lib/rustlib: manifest-rust-docs-x86_64-unknown-linux-gnu
Only in rustup/lib/rustlib: manifest-rustfmt-preview-x86_64-unknown-linux-gnu
Only in rustup/lib/rustlib: manifest-rust-std-x86_64-unknown-linux-gnu
Only in rustup/lib/rustlib: multirust-channel-manifest.toml
Only in rustup/lib/rustlib: multirust-config.toml
Only in rustup/lib/rustlib: rust-installer-version
Only in fast-rustup/lib/rustlib/x86_64-unknown-linux-gnu/lib: self-contained

If someone wants to help see if we can make it even faster, I would welcome PRs. Especially the filesystem I/O: currently a single component's contents are still being written to the filesystem serially. One file must finish being written before the next file from the same component begins being written. I am not an expert in filesystem performance characteristics and I have not tried looking into whether there is a way to do this better, but one idea might be buffering files in memory and using an SPMC threadpool to perform the filesystem I/O.
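
A rough sketch of that buffering idea, using only std: one receiver is shared behind Arc<Mutex<...>> to fan the queue out to several writer threads (SPMC on top of std's MPSC channel). The single file sent at the end is just a placeholder for the extractor feeding real (path, contents) pairs.

use std::fs;
use std::path::PathBuf;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

fn main() {
    // Single producer: the extractor would push (path, contents) pairs here
    // instead of writing each file itself.
    let (tx, rx) = mpsc::channel::<(PathBuf, Vec<u8>)>();
    let rx = Arc::new(Mutex::new(rx));

    let workers: Vec<_> = (0..4)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // Lock the shared receiver and wait for the next job;
                // one worker at a time pulls from the queue.
                let job = rx.lock().unwrap().recv();
                match job {
                    Ok((path, bytes)) => {
                        if let Some(parent) = path.parent() {
                            fs::create_dir_all(parent).unwrap();
                        }
                        fs::write(&path, &bytes).unwrap();
                    }
                    Err(_) => break, // channel closed: no more files to write
                }
            })
        })
        .collect();

    // Placeholder for the extractor loop feeding buffered file contents.
    tx.send((PathBuf::from("/tmp/spmc-demo/hello.txt"), b"hello".to_vec()))
        .unwrap();
    drop(tx); // close the channel so the workers exit

    for worker in workers {
        worker.join().unwrap();
    }
}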

@rami3l added this to the On Deck milestone Apr 15, 2024
@rbtcollins
Contributor

Two considerations:

  • We must not unpack unvalidated content, so downloading and unpacking concurrently might be an issue. OTOH the recent change to drop GPG signatures means we now consider a plain HTTPS download sufficient for integrity, but there is still the possibility of a bad archive on a mirror server, so adding blake2 streaming validation would possibly be a good idea, though it is not strictly needed (see the sketch after this list).
  • We need to keep working on low-memory devices. The archives we currently ship are optimised for compression ratio, which involves quite large compression windows. Currently we don't account for that buffer size, but if we were to unpack concurrently we'd likely need to.
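
A sketch of what streaming validation could look like, assuming the blake2 crate: a reader wrapper hashes every byte on the way to the unpacker. Note that the digest can only be checked after the whole stream has been consumed, so a mismatch means cleaning up whatever was already unpacked, which fits the eventually-correct cleanup approach above.

use std::io::{self, Read};

use blake2::{Blake2b512, Digest};

// Wraps any reader and hashes every byte that passes through it, so the
// archive can be streamed into the unpacker and still be checked against
// an expected digest once the stream ends.
struct HashingReader<R> {
    inner: R,
    hasher: Blake2b512,
}

impl<R: Read> HashingReader<R> {
    fn new(inner: R) -> Self {
        Self { inner, hasher: Blake2b512::new() }
    }

    // Consumes the wrapper and returns the digest of everything read so far.
    fn finalize(self) -> Vec<u8> {
        self.hasher.finalize().to_vec()
    }
}

impl<R: Read> Read for HashingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.hasher.update(&buf[..n]);
        Ok(n)
    }
}

fn main() -> io::Result<()> {
    // Hash stdin as a stand-in for an archive stream being unpacked.
    let mut reader = HashingReader::new(io::stdin());
    io::copy(&mut reader, &mut io::sink())?;
    for byte in reader.finalize() {
        print!("{byte:02x}");
    }
    println!();
    Ok(())
}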
