
fix(build): The number of build jobs should depend on RAM size #1939

Merged
lslezak merged 2 commits into master from limit_build_jobs
Jan 23, 2025
Conversation


@lslezak lslezak commented Jan 23, 2025

Problem

Details

Sometimes the build might fail in OBS because Rust compilation requires a huge amount of RAM. The problem happens when running many parallel jobs on a machine without enough RAM. The build on S390 failed when running 8 jobs with 8GB of RAM.

Originally I wanted to increase the RAM requirement in the _constraints file from 8GB to 16GB. But that would significantly decrease the number of available workers, especially on exotic archs like s390. And in most cases the workers run 4 or 8 jobs, so requiring 16GB would be overkill.

Solution

So, to decrease the amount of RAM needed, decrease the maximum number of parallel jobs.

My experimental run `/usr/bin/time -v cargo build -j1` showed a maximum memory usage of 1.1GB on x86_64. The compilation on S390 failed with 1GB per job. To be on the safe side, let's require 1.3GB per job: different architectures might require more, and we also need to leave something for the system and other services.

That means that with 8GB of RAM it should run at most 6 parallel jobs. That should hopefully avoid triggering the OOM killer.
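The calculation above can be sketched as a small shell function (an illustrative sketch, not the actual spec-file code; the function name and the 1300MB constant mirror the values discussed here):

```shell
# limit_jobs RAM_MB CPUS -> number of parallel build jobs
# Allow one job per 1300MB of RAM, never more jobs than CPUs,
# and always at least one job.
limit_jobs() {
  local ram_mb=$1 cpus=$2
  local by_ram=$(( ram_mb / 1300 ))
  local jobs=$(( by_ram < cpus ? by_ram : cpus ))
  [ "$jobs" -lt 1 ] && jobs=1
  echo "$jobs"
}

# The failing S390 case: 8GB RAM, 8 CPUs -> 6 jobs instead of 8.
limit_jobs 8192 8
```

With 16GB of RAM the CPU count becomes the limit again, so well-equipped workers are unaffected.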

Notes

  • We can adjust the RAM-per-job constant later if needed; this initial value is experimental, let's see how it works.
  • If some architecture needs a quite different amount of RAM, we can make the setting arch-dependent using the `%ifarch` macro.
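The arch-dependent override mentioned in the last note could look roughly like this in the spec file (a hypothetical sketch; the macro name `_ram_per_job_mb` and the s390x value are illustrative, not from this PR):

```
# Default RAM requirement per parallel job, in MB.
%define _ram_per_job_mb 1300

%ifarch s390x
# Hypothetical override if s390x turns out to need more RAM per job.
%define _ram_per_job_mb 2000
%endif
```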

Sometimes the build might fail in OBS because Rust compilation
requires a huge amount of RAM. The problem happens when running
many parallel jobs on a machine without enough RAM.

The build on S390 failed when running 8 jobs with 8GB of RAM, so the
compilation definitely needs more than 1GB per job.

My experimental run `/usr/bin/time -v cargo build -j1` showed
a maximum memory usage of 1.1GB on x86_64. To be on the safe side,
let's require 1.3GB per job. (Different architectures might require
more, and we need to leave something for the system.)

If some architecture needs a quite different amount of RAM,
we can make the setting arch-dependent using the `%ifarch` macro.

@imobachgs imobachgs left a comment


It looks like black magic to me :-) No, seriously, it looks good. I tried the logic on my own machine.

Thanks!

@lslezak lslezak merged commit d7e5108 into master Jan 23, 2025
4 checks passed
@lslezak lslezak deleted the limit_build_jobs branch January 23, 2025 16:44
lslezak added a commit that referenced this pull request Jan 24, 2025
## Problem
- Related to #1939
- Unfortunately SLE16 uses the old rpm version 4.18, which does not support
`%{getncpus:proc}` (added in 4.19).
- We need to compute the value manually. :-/

## Details

- I wanted to use some `%if` and use the new option in TW. However, the
macro exists in both, but in SLE16 it complains about an extra argument
which is not accepted. And rpm evaluates the macros in both `%if`
branches... 😟
- So I basically reverted to my original implementation
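Computing the CPU count manually (without `%{getncpus:proc}`) can be done by reading `/proc/cpuinfo`; a minimal shell sketch of the approach, not the exact code from the spec file:

```shell
# Count the online processors by counting the "processor" entries
# in /proc/cpuinfo; on most Linux systems this matches what
# nproc reports.
count_cpus() {
  grep -c '^processor' /proc/cpuinfo
}

count_cpus
```

A fragment like this can be embedded in a spec file via shell expansion (`%(...)`), which works on both the old and the new rpm versions.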
imobachgs added a commit that referenced this pull request Jan 24, 2025
…ation (#1945)

- Another iteration of #1939 and #1943 
- Now use an external macro instead of our implementation
- Tested in TW and SLE16 builds, works fine in both
@imobachgs imobachgs mentioned this pull request Feb 26, 2025
imobachgs added a commit that referenced this pull request Feb 26, 2025
