
Parallelise cabal build over modules #976

Open
23Skidoo opened this issue Jul 13, 2012 · 87 comments

@23Skidoo
Member

23Skidoo commented Jul 13, 2012

Updated summary by @ezyang. Previously, this ticket talked about all sorts of parallelism at many levels. Component-level parallelism was already addressed in #2623 (fixed by per-component builds), so all that remains is per-module parallelism. This is substantially more difficult, because right now we build by invoking ghc --make; achieving module parallelism would require teaching Cabal how to build using ghc -c. But this too has a hazard: if you don't have enough cores/have a serial dependency graph, ghc -c will be slower, because GHC spends more time reloading interface files. In #976 (comment) @dcoutts describes how to overcome this problem.

There are several phases to the problem:

  1. First, build the GHC build server and parallelism infrastructure. This can be done completely independently of Cabal: imagine a program which has a command line identical to GHC's, but is internally implemented by spinning up multiple GHC processes and farming out the compilation work. You can tell whether this was worthwhile by checking that it scales better than GHC's built-in -j and a traditional -c setup.

  2. Next, we need to teach Cabal/cabal-install how to take advantage of this functionality. If you implemented your driver program with exactly the same command line flags as GHC, then this is as simple as passing -w $your_parallel_ghc_impl. However, there is a problem with doing it this way: cabal-install will attempt to spin up N parallel package/component builds, each of which will in turn try to spin up M GHC build servers; this is bad: you want the total number of GHC build servers to equal the number of cores. So you will need to set up some sort of signalling mechanism to prevent too many build servers from running at once, OR have cabal new-build orchestrate the entire build down to the module level so it can plan the parallelism (but you would probably have to rearchitect along the lines of Rewrite Cabal in Shake #4174 before you can do this).


Now that the package-level parallel install has been implemented (see #440), the next logical step is to extend cabal build with support for building multiple modules, components and/or build variants (static/shared/profiling) in parallel. This functionality should also be integrated with cabal install in such a way that we don't over- or underutilise the available cores.

A prototype implementation of a parallel cabal build is already available as a standalone tool. It works by first extracting a module dependency graph with 'ghc -M' and then running multiple 'ghc -c' processes in parallel.
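A rough sketch of that idea (not the tool's actual code; the helper names, the deps.mk path and the use of the async package are assumptions, and parsing the Makefile output into levels of mutually independent modules is elided):

import Control.Concurrent.Async (mapConcurrently_)  -- from the 'async' package
import System.Process (callProcess, readProcess)

-- Ask GHC to dump Makefile-style dependencies for the given sources.
dumpDeps :: [FilePath] -> IO String
dumpDeps srcs = do
  _ <- readProcess "ghc" (["-M", "-dep-makefile", "deps.mk"] ++ srcs) ""
  readFile "deps.mk"

-- Compile a single module in single-shot mode.
compileOne :: FilePath -> IO ()
compileOne src = callProcess "ghc" ["-c", src]

-- Build the project level by level: each level is a list of modules whose
-- dependencies were all compiled in earlier levels, so the modules within a
-- level are independent of each other and can be compiled concurrently.
buildLevels :: [[FilePath]] -> IO ()
buildLevels = mapM_ (mapConcurrently_ compileOne)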

Since the parallel install code uses the external setup method exclusively, integrating parallel cabal build with parallel install will require using IPC. A single coordinating cabal install -j N process will spawn a number of setup.exe build --semaphore=/path/to/semaphore children, and each child will be building at most N modules simultaneously. An added benefit of this approach is that nothing special will have to be done to support custom setup scripts.
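A minimal sketch of the child side of this design, assuming a POSIX platform and the unix package's System.Posix.Semaphore bindings (the --semaphore flag and the helper names are part of the proposal/illustration, not existing Cabal API):

import Control.Exception (bracket_)
import System.Posix.Semaphore
  (OpenSemFlags (..), Semaphore, semOpen, semPost, semWait)

-- Open the named semaphore created by the coordinating `cabal install -j N`
-- process; the name would be handed to the child via --semaphore=... .
openBuildSemaphore :: String -> IO Semaphore
openBuildSemaphore name =
  semOpen name (OpenSemFlags { semCreate = False, semExclusive = False }) 0o600 0

-- Run one compilation while holding a slot, so that all children together
-- never run more than the N jobs allowed by the coordinator.
withBuildSlot :: Semaphore -> IO a -> IO a
withBuildSlot sem = bracket_ (semWait sem) (semPost sem)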

An important issue is that compiling with ghc -c is slow compared to ghc --make because the interface files are not cached. One way to fix this is to implement a "build server" mode for GHC. Instead of repeatedly running ghc -c, each build process will spawn at most N persistent ghcs and distribute the modules between them. Evan Laforge has done some work in this direction.

Other issues:

  • Building internal components in parallel requires knowing their dependency graph (this is being implemented as part of integrating cabal repl patches).
  • Generating documentation in parallel may be only safe for build-type: Simple.
@bos
Contributor

bos commented Jul 16, 2012

This will be a huge win if it can make effective use of all cores. I've had quite a few multi-minute builds of individual packages, where the newly added per-package parallelism only helps with dependencies during the very first build, but not at all during ongoing development.

@23Skidoo
Member Author

@bos The main obstacle here is reloading of interface files, which slows down the parallel compilation considerably compared to ghc --make. See e.g. Neil Mitchell's Shake paper, where he found that "building the same project with ghc --make takes 7.69 seconds, compared to Shake with 11.83 seconds on one processor and 7.41 seconds on four processors." So far, the most promising approach seems to be implementing a "compile server" mode for GHC.

@23Skidoo
Member Author

An e-mail from @dcoutts that describes the "compile server" idea in more detail:

So here's an idea I've been mulling over recently...

For IDEs and build tools, we want a ghc api interface where we have very
explicit control over the environment in which new modules are compiled.
We want to be in full control, not using --make, and not using any
search paths etc. We know exactly where each .hi and .o file for all
dependent modules are. We should be able to build up an environment of
module name to (interface, object code) by starting from empty, adding
packages and individual module (.hi, .o) files.

Now that'd give us an api a lot like the current command line interface
of ghc -c single shot mode, except that we would be able to specify .hi
files on the command line rather than having ghc find them by searching.

But once we have that api, it'll be useful for IDEs, and useful for a
ghc server. This should give us the performance advantages of ghc --make
but still give us the control and flexibility of single shot mode. I'll
come to parallel builds in a moment.

The way it'd work is you start the server with some initial environment
(e.g. the packages) and you tell it to compile a module, then you can
tell it to extend its environment e.g. with the module you just compiled
and use the extended environment to compile more modules. So clearly you
could do the same thing as ghc --make does but with the dependency
manager being external to ghc.

Now for parallelism. Suppose we have two cores. We launch two ghc server
processes with the same initial package environment. We start compiling
two independent modules. Now we load the .hi files into *both* ghc
server processes to compile more modules. (In practice we don't load
them into each server when they become available, rather we do it on
demand when we see the module we need to compile needs the module
imports in question based on our module dep graph).

So, a short analysis of the number of times that .hi files are loaded:

In the current ghc --make mode, each .hi file is loaded once. So let's
say M modules. In the current ghc -c mode, for M modules we're loading
at most M * M/2 modules (right?) because in a chain of M modules we have
to load all previous .hi files for each ghc -c invocation.

In the hypothetical ghc server mode, with N servers, the worst case is
something like M * N module loads. Also, the N is parallelised. So the
single threaded performance is the same as --make. If you use 8 cores,
the overhead is 8 times higher in total, but distributed across 8 cores
so the wall clock time is no worse.

Actually, it's probably more sensible to look not at the cost of loading
the .hi files for M modules, but for P packages which is likely the
dominant cost. Again, it's P cost for the --make mode, and M * P for the
ghc -c mode, but N * P for the server mode. So this means it might not
be necessary to do the whole-package .hi file optimisation since the
cost is dramatically reduced.
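For concreteness, here are the same counts with illustrative numbers (M = 400 modules in a worst-case chain, N = 8 servers; the figures below just instantiate the formulas above):

m, n :: Int
m = 400  -- modules
n = 8    -- build servers / cores

hiLoadsMake, hiLoadsSingleShot, hiLoadsServer :: Int
hiLoadsMake       = m              -- ghc --make: each .hi loaded once      =   400
hiLoadsSingleShot = m * m `div` 2  -- ghc -c: reload all previous .hi files = 80000
hiLoadsServer     = m * n          -- N servers, worst case                 =  3200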

So overall then, there's two parts to the work in ghc: extend the ghc
api to give IDEs and build managers this precise control over the
environment, then extend the main ghc command line interface to use the
new ghc api feature by providing a --server mode. It'd accept inputs on
stdin or something. It only needs very minimal commands: extend the
environment with a .hi .o pair and compile a .hs file. You can assume
that packages and other initial environment things are specified on the
--server command line.

Finally if there's time, add support for this mode into cabal, but that
might be too much (since that needs a dependency based build manager).

I'll also admit an ulterior motive for this feature, in addition to use
in cabal, which is that I'm working on Visual Studio integration and so
I've been thinking about what IDEs need in terms of the ghc api and I
think very explicit control of the environment is the way to go.
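To picture the minimal stdin protocol described in the email above, the command set might look like this (illustrative types only; no such GHC interface exists):

-- Hypothetical commands a `ghc --server` process could read from stdin.
data ServerCommand
  = ExtendEnv FilePath FilePath  -- add a compiled (.hi, .o) pair to the environment
  | CompileModule FilePath       -- compile a .hs file against the current environment
  deriving (Show)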

@tibbe
Member

tibbe commented Jul 17, 2012

Even though using ghc -c leads to a slowdown on one core, having it as an option (for people with more cores) in the meantime seems worthwhile to me.

@bos
Contributor

bos commented Jul 18, 2012

@tibbe, I thought the point was that ghc -c doesn't break even until 4 cores. Mind you, Neil was surely testing on Windows, where the OS and filesystem could be reasonably expected to hurt performance quite severely.

@tibbe
Member

tibbe commented Jul 18, 2012

@bos I've heard the number 2 tossed around as well, but we should test and see. Doing parallelism at the module level should also expose many more opportunities for parallelism. The current parallel build system suffers quite a bit from lack of that (since there are lots of linear chains of package dependencies.)

@nh2
Member

nh2 commented Jul 31, 2012

What about profiling builds? Due to the structure of the compilation (exactly the same things are built as in a normal compilation), I'd guess they might easily be run in parallel, and we might get almost a ~2x time saving.

@23Skidoo
Member Author

@nh2 Parallel cabal build will make this possible.

@ghost ghost assigned 23Skidoo Nov 24, 2012
@nh2
Member

nh2 commented May 17, 2013

I am currently working on this. I got good results with ghc-parmake for compiling large libraries and am now making executables build in parallel.

@23Skidoo
Member Author

@nh2 Cool! BTW, I proposed this as a GSoC project for this summer. Maybe we can work together if my project gets accepted?

@23Skidoo
Member Author

@nh2

I got good results with ghc-parmake for compiling large libraries

I'm interested in the details. How large was the speedup? On how many cores? In my testing, the difference was negligible.

@nh2
Member

nh2 commented May 18, 2013

How large was the speedup? On how many cores?

The project I'm working on has a library with ~400 modules and 40 executables. I'm using an i7-2600K with 4 real (8 virtual) cores. For building the library only, I get:

* cabal build:                                              4:50 mins
* cabal build --with-ghc=ghc-parmake --ghc-options="-j 2":  4:20 mins 
* cabal build --with-ghc=ghc-parmake --ghc-options="-j 4":  3:00 mins 
* cabal build --with-ghc=ghc-parmake --ghc-options="-j 8":  2:45 mins

I had to make minimal changes to ghc-parmake to get this to work, and thus got a 2x speedup almost for free :)

As you can see, the speed-up is not as big as we can probably expect from ghc --make itself being parallel or from your --server: due to the caching, those should be a good bit faster, and I hope your project gets accepted. I'd be glad to help a bit if I can, but while I'm OK with hacking around on cabal, I've never touched GHC.

Building the executables in parallel is independent from all this and will also probably be a small change.

@23Skidoo
Member Author

* cabal build:                                              4:50 mins
* cabal build --with-ghc=ghc-parmake --ghc-options="-j 2":  4:20 mins 
* cabal build --with-ghc=ghc-parmake --ghc-options="-j 4":  3:00 mins 
* cabal build --with-ghc=ghc-parmake --ghc-options="-j 8":  2:45 mins

Nice to hear that it can give a noticeable speedup on large projects. I should try testing it some more.

Building the executables in parallel is independent from all this and will also probably be a small change.

Maybe, as long as you don't integrate build -j with install -j; then you won't need to implement the IPC design sketched above.

@nh2
Member

nh2 commented May 20, 2013

@23Skidoo I made a prototype at https://github.com/nh2/cabal/compare/build-executables-in-parallel. It would be nice if you could take a look.

  • I haven't rebased on the latest master yet. Once the other points are sorted out, I'll do that and send a proper pull request (I will probably rewrite my history on that branch as we go towards that).
  • The copying of Semaphore and JobControl from cabal-install is not so nice. Is that the way to go nevertheless or should they be moved to some Internal package in Cabal? Update: We are discussing that here.
  • I still have to make sure that pressing Ctrl-C kills everything nicely, and to get the failure exit codes right.
  • It looks like I can't use macros (I need MIN_VERSION_base) in the Cabal package - is that correct? The way I work around it is very ugly (just using the deprecated old functions in Exception, which creates warnings).
  • We probably want to make parallel jobs a config setting as well, or use the same number as the existing --jobs.

Feedback appreciated.

@nh2
Member

nh2 commented May 23, 2013

I have updated my branch to fix some minor bugs in my code. I can now build my project with cabal build --with-ghc=ghc-parmake --ghc-options="-j 8" -j8 to get both parallel library compilation and parallel executable building.

The questions above still remain.

@23Skidoo
Member Author

@nh2 Thanks, I'll take a look.

@23Skidoo
Member Author

@nh2

The copying of Semaphore and JobControl from cabal-install is not so nice. Is that the way to go nevertheless or should they be moved to some Internal package in Cabal?

Can't you just export them from Cabal and remove the copies in cabal-install?

It looks like I can't use macros (need MIN_VERSION_base) in the Cabal package - is that correct?

Yes, this doesn't work because of bootstrapping. You can do this, however:

#if !defined(VERSION_base)
-- we're bootstrapping, do something that works everywhere
#else

#if MIN_VERSION_base(...)
...
#else
...
#endif

#endif

Or maybe we should add a configure script.

@nh2
Member

nh2 commented May 26, 2013

Yes, this doesn't work because of bootstrapping. You can do this, however

Good idea, but when we fall back to the code that works everywhere, we will still get the warnings, this time only in one of the two phases.

Or maybe we should add a configure script.

If that would be enough to find out the version of base, that sounds like the better solution. I don't know of a reliable way to find that out, though.

@23Skidoo
Member Author

I have another idea: since Cabal only supports building with GHC nowadays, you can use

#if __GLASGOW_HASKELL__ < 700
-- Code that uses block
#else 
-- Code that uses mask
#endif

@23Skidoo
Member Author

@nh2

We probably want to make parallel jobs a config setting as well, or use the same number as the existing --jobs.

We can make cabal build read the jobs config file setting, but it shouldn't be used when the package is built during the execution of an install plan (since there's no way to limit the number of parallel build jobs from cabal install ATM).

@nh2
Member

nh2 commented May 27, 2013

__GLASGOW_HASKELL__

Nice, pushed that.

@nh2
Member

nh2 commented May 27, 2013

I haven't rebased on the latest master yet

Just rebased that.

@23Skidoo
Member Author

My GSoC 2013 project proposal has been accepted.

@nh2
Member

nh2 commented May 28, 2013

Awesome! Let's give this build system another integer factor speedup! :)

@nh2
Member

nh2 commented May 28, 2013

We can make cabal build read the jobs config file setting, but it shouldn't be used when the package is built during the execution of an install plan (since there's no way to limit the number of parallel build jobs from cabal install ATM).

Do you mean that when we use install -j and build -j, we get more than n (e.g. n*n) jobs because the two are not coordinated?

@23Skidoo
Member Author

Do you mean that when we use install -j and build -j, we get more than n (e.g. n*n) jobs because the two are not coordinated?

Yes. The plan is to use an OS-level semaphore for this, as outlined above.

@treeowl
Contributor

treeowl commented Dec 19, 2017

Is the speed at which GHC can read interface files a bottleneck? How hard might it be to fix that?

@nh2
Member

nh2 commented Dec 22, 2017

Is the speed at which GHC can read interface files a bottleneck?

@treeowl I'm not sure if that is known.

I've suggested elsewhere that GHC, for the various parts of its build pipeline, should record CPU and wall time and be able to produce a report (e.g. "N seconds CPU/wall were spent on reading and decoding interface files"). That way we could more easily pinpoint where the bottlenecks are. Right now GHC does this kind of time counting and reporting only for optimiser phases, not for any of the "more basic tech" bits.

@duog
Contributor

duog commented Mar 5, 2021

Hi All,

I've written a prototype of a GHC feature to limit its parallelism with a semaphore, GNU make jobserver style. The idea is that cabal-install would pass -j and -jsem to ghc --make on each invocation. There is a data point showing a nice speedup on building lens from scratch.

https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5176

I'd appreciate any comment on

  • why this is a bad idea
  • what the UI should look like
  • any important semantics

to make this easy and useful to integrate into cabal.
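For illustration, the cabal-install side of this could look roughly like the sketch below (the flag spelling, the semaphore name, and the use of the async and unix packages are assumptions based on the merge request's description, not a final interface):

import Control.Concurrent.Async (mapConcurrently_)          -- 'async' package
import System.Posix.Semaphore (OpenSemFlags (..), semOpen, semUnlink)
import System.Process (callProcess)

-- Create a named semaphore with njobs slots, run every `ghc --make`
-- invocation with a flag pointing at it, and clean up afterwards; each GHC
-- process would then acquire one slot per module it compiles in parallel.
buildWithSharedJobs :: Int -> [[String]] -> IO ()
buildWithSharedJobs njobs invocations = do
  let semName = "/cabal-build-jobs"
  _ <- semOpen semName (OpenSemFlags { semCreate = True, semExclusive = False })
               0o600 njobs
  mapConcurrently_
    (\args -> callProcess "ghc" ("--make" : "-jsem" : semName : args))
    invocations
  semUnlink semName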

treeowl added a commit to treeowl/pqueue that referenced this issue Dec 9, 2021
Move the `keysQueue` implementation out of `Data.PQueue.Internals`.
This allows that module to build in parallel with
`Data.PQueue.Prio.Internals`. It looks like [Cabal can't yet make use of
this](haskell/cabal#976), but work to make
it do so is under way.
konsumlamm pushed a commit to lspitzner/pqueue that referenced this issue Dec 9, 2021
Move the `keysQueue` implementation out of `Data.PQueue.Internals`.
This allows that module to build in parallel with
`Data.PQueue.Prio.Internals`. It looks like [Cabal can't yet make use of
this](haskell/cabal#976), but work to make
it do so is under way.
@Profpatsch

fwiw, I was able to fix that for local development in our project by adding

package <projectname>
  -- This causes cabal to build modules on all cores, instead of just one.
  ghc-options: -j

to my cabal.project.

This should be the default for interactive development, and I was kinda shocked that it isn’t!

@Mikolaj
Member

Mikolaj commented Feb 18, 2022

@Profpatsch: Great! What are your precise results?

@Martinsos
Collaborator

Martinsos commented Mar 16, 2022

@Profpatsch thanks for the tip, it also sped up things for me!

@Mikolaj, so I first added

jobs: $ncpus

thinking that will speed up local development / building of local code.

However, it had no effect, so I started searching through cabal issues on GitHub and found this issue. I applied the suggestion by @Profpatsch:

package waspc
  -- This causes cabal to build modules on all cores, instead of just one,
  -- therefore reducing our build times.
  ghc-options: -j

Our project consists of one package, which has one executable (24 modules), one library (124 modules), two test suites (one 8, another 29 modules).

When building just library and exe (cabal clean && time cabal build), this reduced our building time from 43s to 26s.
When building the whole project (library + exe + tests) (cabal clean && time cabal build --enable-tests), this reduced our building time from 56s to 42s.

Most of the speed up seems to be coming from compiling the library.

I also wonder why this is not the default; I guess it is a relatively new thing?
It might also be valuable to explain in the docs that jobs: $ncpus parallelizes only at the package level, not the module level, and, while this is not yet the default, to mention this trick with ghc-options: -j.

Here is the project in case it is useful. I am pointing to a PR because we are just switching from Stack to Cabal, so this is the PR that does it and contains the code I tested this on: wasp-lang/wasp#471

@Mikolaj
Member

Mikolaj commented Mar 16, 2022

Thank you for the data. Good points. I wonder how reporting of warnings and errors works with that option? E.g., may two warning texts be interleaved? I think that was the blocker for cabal-level -j in the past, but it's been somehow taken care of (I don't know the details).

@Mikolaj
Member

Mikolaj commented Mar 16, 2022

BTW, there is active work by a GHC hacker on how to allocate cores between cabal-level and GHC-level -j. I see in your case there's a speedup with the default allocation, but in general you can get a slowdown due to cores being stolen from cabal by GHC or the other way around.

[Edit: to see what I mean, you'd need to wipe out the whole store directory and then measure the effect.]

@Mikolaj
Member

Mikolaj commented Mar 16, 2022

So, I guess, default GHC -j could potentially slow down initial builds of packages, but speed up subsequent rebuilds. Hard to balance automatically...

@Martinsos
Collaborator

@Mikolaj thanks for explaining!

I have to admit I don't know how errors are reported, haven't tried that out.

So help me understand if I got this right: if cabal is building only one package, then GHC -j can only be beneficial. But if cabal is building multiple packages and trying to parallelize that, then GHC -j can mess that up, because it is also trying to parallelize on its own, and the two end up fighting for resources (threads)?

But I guess with the way I set it up now, that shouldn't be a problem, because GHC -j is enabled only for my local package and not for the external packages. So when installing external dependencies, there is only one kind of parallelization happening, at the package level (by cabal), while when building my local project (after the external dependencies have been installed), only GHC will be doing parallelization?

@fgaz
Member

fgaz commented Mar 16, 2022

RAM also fills up quickly with -j

@Martinsos
Collaborator

I understand then that having GHC -j enabled for libraries would not be great, at least not when they are built as external dependencies.

But it still sounds useful for local development of a package, be it library or executable.

So maybe it makes sense for most people to have GHC -j specified in cabal.project.local? Is that something that could be recommended by default? Or is that also situational?

@Mikolaj
Member

Mikolaj commented Mar 16, 2022

Yes, I think GHC -j is a good default for cabal.project.local. Not always the best, e.g., when a project has many packages or a package has many components that the user wants to always build (then they are built in parallel), but these are less common situations.

@Profpatsch

when a project has many packages or a package has many components that the user wants to always build (then they are built in parallel)

Cabal packages are a horror story anyway, e.g. they don’t work together with cabal repl very well (:r doesn’t work over package boundaries).

For local development I always want to use all available cores, for building libraries we use nixpkgs/nix, which knows how to forward the right amount of cores to its builds.

this reduced our building time from 43s to 26s.
When building the whole project (library + exe + tests) (cabal clean && time cabal build --enable-tests), this reduced our building time from 56s to 42s.

You can probably get more speedups if your project’s modules are split up in a reasonable way, i.e. there is no bottleneck module that all of compilation has to wait on. Types.hs comes to mind as the biggest antipattern.

@Profpatsch

In order to see the module dependency graph, I use a script like this in our production code:

  # display a graph of all modules and how they depend on each other
  mainserv-module-deps-with-filetype = self.writers.writeBash "mainserv-module-deps-with-filetype" ''
    shopt -s globstar
    filetype="$1"
    ${self.haskellPackages.graphmod}/bin/graphmod \
      ${/*silence warnings for missing external dependencies*/""} \
      --quiet \
      ${/*applies some kind of import simplification*/""} \
      --prune-edges \
      ${self.mainserv-root-directory}/src/**/*.hs \
      | ${self.graphviz}/bin/dot \
          ${/*otherwise it’s a bit cramped*/""} \
          -Gsize="20,20!" \
          -T"$filetype"
  '';

Which uses the very good https://hackage.haskell.org/package/graphmod command.

Then it’s just a matter of looking at the graph and noticing bottlenecks.

@Profpatsch

The output for graphmod --quiet --prune-edges waspc/src/**/*.hs | dot -Gsize=20,20! -Tpng for example:

[attached: module dependency graph image generated by graphmod]

@Mikolaj
Member

Mikolaj commented Mar 17, 2022

@Profpatsch: yay, a great tool. And what are the speedups you are getting building a local project with GHC -j? And how many packages of that project and components are built at once?

@fgaz
Member

fgaz commented Mar 22, 2023

ghc-proposals/ghc-proposals#540 is in! 🚀
