idea for further improvement of build scheduling #5125

matthiaskrgr · 2018-03-05T21:34:23Z

I'm trying to understand #5100
Lets assume we have a dependency tree like this

      ROOT CRATE
    /   |   |   \
   A    B   C    D
 / |    |
E  F    G
|       | \
H       I  J
|          |
K          L
           |
           M

depth annotation would be something like this?

     1ROOT CRATE
    /   |   |   \
  2A   2B   2C  2D
 / |    |
3E 3F  3G
|       | \
4H     4I  4J
|          |
5K        5L
           |
          6M

as far as I understand, build order will be like this:

     14ROOT CRATE
    /   |   |   \
  10A  11B 12C  13D
 / |    |
7E 8F  9G
|       | \
4H     5I  6J
|          |
2K        3L
           |
          1M

It might be useful to not only take depth but also the number of child nodes (in braces) into account, like this:

         1(14)ROOT CRATE--,
        /    |     \     `\
   ,-2(5)A  2(6)B  2(1)C  2(1)D
  /      |    |
3(3)E 3(1)F  3(5)G-,
  |          |      \
4(2)H      4(1)I  4(3)J
  |                 |
5(1)K             5(2)L
                    |
                  6(1)M

If we have several crates of equal depth, we could look at the number of child nodes and prioritize the crate with the most (total) child nodes (assuming that crates with a lot of dependencies are bigger that those with only few dependencies).
New build order:

     14ROOT CRATE
    /   |   |   \
  11A  10B 12C  13D
 / |    |
8E 9F  7G
|       | \
5H     6I  4J
|          |
2K        3L
           |
          1M

Does this sound resonable?

edit: aid for debugging RUST_LOG=cargo::core::compiler::job_queue cargo check

The text was updated successfully, but these errors were encountered:

alexcrichton · 2018-03-06T00:05:01Z

Certainly! Our goal is to always keep all the cpus busy, so anything that optimizes work to get opened up I think looks great as an ordering.

fabricedesre · 2018-05-01T01:17:47Z

I am totally unfamiliar with the dependency scheduling in cargo, but is a "static" order the best way? Shouldn't we re-compute which crate to build every time we are done with one and a cpu is free?
Naively, we would start with leaf nodes and dynamically move on as the build progresses.

alexcrichton · 2018-05-01T03:04:54Z

@fabricedesre yeah that's basically what happens today. The question here is how to prioritize work when there's multiple candidates to be scheduled, the code for the current iteration is at https://github.com/rust-lang/cargo/blob/master/src/cargo/util/dependency_queue.rs

Eh2406 · 2019-09-20T15:53:43Z

So #7390 changed the build order, but not in a way that will affect the OP. #7390 fixes the case

  d
 / \ \ \
c   e f g
|   \ | /
b     I
|
a

Where finishing I freeze up opportunities for parallelism. In practise this proitatizes winapi and syn and so speeds things up. But that is not necessarily true, some other project could be bottleneck on the a->c pilpline instead.

est31 · 2019-09-21T11:20:42Z

To extend my comment from #7396:

The problem that cargo has to solve is known as the multiprocessor DAG scheduling problem, which is NP complete so we can only apply approximations. The good news is that the problem is well-known in the literature and you can find a wealth of papers about the topic. E.g. this one is fairly recent, or this one.

I suppose it would be a good idea to somehow account for the projected build time of a certain crate. Those are dependent from the feature flags as well as the platform, winapi is quite trivial to build on linux while it takes longer on windows. Issue #7396 is exploring this idea a bit.

A project to find a better scheduling algorithm is a great opportunity for experimentation. One could e.g. parse crater logs to find out how long each crate takes to build and then use that information to build a "virtual" crater where the results of different scheduling algorithms are compared using different CPU numbers etc. Then we'd find the scheduling algorithm that fits the cargo use case best. The final implementation of the scheduling algorithm could be even put into its own crate, for use in the ecosystem, similar to polonius.

epage · 2023-05-24T19:35:00Z

#11032 made us take priority into account in more cases.

As part of the experiments for that, some other priority tweaks were looked into. See the zulip thread

epage · 2023-10-18T01:06:55Z

Reading more on #11032, it sounds like this is fully implemented

From:

There's already a notion of cost per unit of work in cargo: it's synthethic (always 100) but is used to compute a priority (as the cost of a unit of work and all the units that depend on it). This cost/priority is taken into account when a job is now ready to be built and dequeued from the dependency queue: the highest priority job is dequeued, and transferred to the pending queue.

So the pending queue will contain jobs with different costs, but doesn't use it to choose the next job to build. This experiment uses the cost/priority to sort the pending queue, so that the highest cost/priority jobs are picked first. The hope is that this makes bigger dependency subgraphs processed sooner, and brings more parallelism opportunities are earlier in the pipeline.

If there is something I'm missing, let us know!

matthiaskrgr changed the title ~~ìdea for further improvement of build scheduling~~ idea for further improvement of build scheduling Mar 5, 2018

est31 mentioned this issue Sep 21, 2019

optimize dep queue by recording and using crate build times #7396

Open

ehuss added the Performance Gotta go fast! label Sep 21, 2019

SimonSapin mentioned this issue Sep 26, 2019

Provide scheduling hints? #7437

Open

epage closed this as completed Oct 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

idea for further improvement of build scheduling #5125

idea for further improvement of build scheduling #5125

matthiaskrgr commented Mar 5, 2018 •

edited

Loading

alexcrichton commented Mar 6, 2018

fabricedesre commented May 1, 2018

alexcrichton commented May 1, 2018

Eh2406 commented Sep 20, 2019 •

edited

Loading

est31 commented Sep 21, 2019

epage commented May 24, 2023

epage commented Oct 18, 2023

idea for further improvement of build scheduling #5125

idea for further improvement of build scheduling #5125

Comments

matthiaskrgr commented Mar 5, 2018 • edited Loading

alexcrichton commented Mar 6, 2018

fabricedesre commented May 1, 2018

alexcrichton commented May 1, 2018

Eh2406 commented Sep 20, 2019 • edited Loading

est31 commented Sep 21, 2019

epage commented May 24, 2023

epage commented Oct 18, 2023

matthiaskrgr commented Mar 5, 2018 •

edited

Loading

Eh2406 commented Sep 20, 2019 •

edited

Loading