fork: retry: Resource temporarily unavailable #23

Closed
cgwalters opened this issue Jan 21, 2021 · 3 comments

@cgwalters

We're seeing e.g. `./libtool: fork: retry: Resource temporarily unavailable` in the rpm-ostree CI jobs.

One thing I notice is that RPM's `%{make_build}` macro is detecting 40 CPUs: `/usr/bin/make -O -j40`. We might even be hitting PID limits, or perhaps per-user process limits?
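
To check the PID-limit theory, something like the following could be run inside the CI pod. It's only a rough sketch: `pids_usage` is a made-up helper name, and the two directories are the usual mount points for cgroup v2 and the cgroup v1 `pids` controller, which may differ on a given cluster.

```sh
# pids_usage DIR: print "<current>/<max>" for a cgroup directory that
# contains pids.current and pids.max, or "unknown" if they are missing.
# pids.max holds either a number or the literal string "max" (no limit).
pids_usage() {
    dir="$1"
    if [ -r "$dir/pids.current" ] && [ -r "$dir/pids.max" ]; then
        printf '%s/%s\n' "$(cat "$dir/pids.current")" "$(cat "$dir/pids.max")"
    else
        echo "unknown"
    fi
}

# Assumed locations: cgroup v2 root, then the cgroup v1 pids controller.
for d in /sys/fs/cgroup /sys/fs/cgroup/pids; do
    echo "$d: $(pids_usage "$d")"
done

# In bash, `ulimit -u` additionally shows the per-user process limit.
```

If `current` sits close to `max` while a `-j40` build is running, the `fork` failures are most likely the cgroup PID limit rather than a per-user limit.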

Googling around a bit I found openSUSE/obs-build#425, and looking at the RPM macros, it does seem likely that we could inject `-D '_smp_ncpus_max 8'` or so?
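
A minimal sketch of what injecting that cap could look like. `smp_cap_define` is a hypothetical helper and `mypackage.spec` is a placeholder. The only RPM-specific parts are the `_smp_ncpus_max` macro named above and the `--define 'NAME VALUE'` option.

```sh
# smp_cap_define MAX: print an RPM macro definition that caps the CPU
# count used by RPM's SMP build macros. RPM expects 'NAME VALUE'
# separated by a space, not NAME=VALUE.
smp_cap_define() {
    max="$1"
    case "$max" in
        ''|0|*[!0-9]*)
            echo "smp_cap_define: expected a positive integer, got '$max'" >&2
            return 1
            ;;
    esac
    printf '%s\n' "_smp_ncpus_max $max"
}

# Usage (placeholder spec file):
#   rpmbuild --define "$(smp_cap_define 8)" -ba mypackage.spec
```

Validating the value up front means a typo fails the CI step immediately rather than silently leaving the build uncapped.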

@cgwalters

But I still think we have the larger "CPUs versus Kubernetes" issue that came up in the code in coreos/coreos-assembler#1287

@cgwalters

Thinking about this actually, in rpm-ostree in particular our tests do this:
`JOBS=${JOBS:-$(ncpus)}`, and then each rpm-ostree compose process will e.g. parallelize RPM imports with threads (up to 40...). If we have several of these jobs running at once, that can add up.

Ah, but in rpm-ostree `ncpus` does do the "detect Kubernetes" dance. Hmm... let's export that in cosa and have it in the buildroot by default?
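
For illustration, a cgroup-aware `ncpus` could look roughly like this. This is a sketch under assumptions, not the actual rpm-ostree or cosa implementation: it only looks at the CFS quota (which Kubernetes sets from CPU *limits*, not requests), and the optional root-directory argument exists purely to make it testable.

```sh
# cgroup_cpu_limit [ROOT]: print the CPU quota under ROOT rounded up to
# whole CPUs, or print nothing when no quota is set or detectable.
cgroup_cpu_limit() {
    root="${1:-/sys/fs/cgroup}"
    quota="" period=""
    if [ -r "$root/cpu.max" ]; then
        # cgroup v2: "<quota> <period>", or "max <period>" when unlimited
        read -r quota period < "$root/cpu.max" || true
    elif [ -r "$root/cpu/cpu.cfs_quota_us" ] && [ -r "$root/cpu/cpu.cfs_period_us" ]; then
        # cgroup v1: quota is -1 when unlimited
        quota=$(cat "$root/cpu/cpu.cfs_quota_us")
        period=$(cat "$root/cpu/cpu.cfs_period_us")
    fi
    case "$quota" in ''|*[!0-9]*) return 0 ;; esac
    case "$period" in ''|0|*[!0-9]*) return 0 ;; esac
    echo $(( (quota + period - 1) / period ))
}

# ncpus [ROOT]: the smaller of the cgroup limit and the online CPU count.
ncpus() {
    limit=$(cgroup_cpu_limit "$@")
    online=$(getconf _NPROCESSORS_ONLN)
    if [ -n "$limit" ] && [ "$limit" -gt 0 ] && [ "$limit" -lt "$online" ]; then
        echo "$limit"
    else
        echo "$online"
    fi
}

JOBS=${JOBS:-$(ncpus)}
```

Rounding the quota up keeps a pod with, say, 1.5 CPUs at two jobs. Rounding down would be the more conservative choice.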

Actually related to this, one really nice thing cargo does is interact with a make jobserver - perhaps our cosa jobs should export a jobserver by default.

@jlebon

jlebon commented Jan 21, 2021

Yeah, sadly right now we're responsible for bridging build tools' view of available resources and the Kubernetes world. I think most of our workloads do have this bridging now, but it looks like RPM building slipped through. It would be nice if in the future e.g. `make` just knew how many jobs to run in parallel based on its cgroups allocation. (Or maybe there should be a namespace feature which allows modifying how many CPUs processes can see?)

The `_smp_ncpus_max` macro bit sounds promising. I can play around with that.

jlebon added a commit to jlebon/rpm-ostree that referenced this issue Jan 21, 2021
Otherwise, it defaults to `_SC_NPROCESSORS_ONLN` (via `%make_build` ->
`%_smp_mflags` -> `%_smp_build_ncpus` -> `%{getncpus}` ->
https://github.com/rpm-software-management/rpm/blob/48c0f28834eb377a54f27ee0b6950af7e6d537b8/rpmio/macro.c#L583).
And that's going to be wrong in Kubernetes because we're constrained via
cgroups.

The `%_smp_build_ncpus` macro allows overriding this logic via
`RPM_BUILD_NCPUS`.

See: coreos/coreos-ci#23
See: coreos/coreos-assembler#1287
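
The precedence described in this commit message can be sketched in shell as follows. `effective_build_ncpus` is a hypothetical illustration of "the environment variable wins, otherwise the online CPU count", not RPM's actual macro text, and `CI_CPU_LIMIT` is a made-up variable standing in for whatever the CI pod was allotted.

```sh
# effective_build_ncpus: mimic the override order described above.
# RPM_BUILD_NCPUS, when set and non-empty, replaces the detected count.
effective_build_ncpus() {
    if [ -n "${RPM_BUILD_NCPUS:-}" ]; then
        echo "$RPM_BUILD_NCPUS"
    else
        getconf _NPROCESSORS_ONLN
    fi
}

# Usage in a CI step (CI_CPU_LIMIT and mypackage.spec are placeholders):
#   export RPM_BUILD_NCPUS="${CI_CPU_LIMIT:-4}"
#   rpmbuild -ba mypackage.spec
# With rpm installed, `rpm --eval '%{_smp_build_ncpus}'` should show the
# value RPM itself resolves.
```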
jlebon added a commit to jlebon/rpm-ostree that referenced this issue Jan 21, 2021
Otherwise, it defaults to `_SC_NPROCESSORS_ONLN` (via `%make_build` ->
`%_smp_mflags` -> `%_smp_build_ncpus` -> `%{getncpus}` ->
https://github.com/rpm-software-management/rpm/blob/48c0f28834eb377a54f27ee0b6950af7e6d537b8/rpmio/macro.c#L583).
And that's going to be wrong in Kubernetes because we're constrained via
cgroups.

The `%_smp_build_ncpus` macro allows overriding this logic via
`RPM_BUILD_NCPUS`.

See: coreos/coreos-ci#23
See: coreos/coreos-assembler#632
See: coreos/coreos-assembler#1287
openshift-merge-robot pushed a commit to coreos/rpm-ostree that referenced this issue Jan 21, 2021
@jlebon jlebon closed this as completed Jan 21, 2021
cgwalters added a commit to cgwalters/coreos-ci-lib that referenced this issue Feb 3, 2021
See coreos/coreos-ci#23
We're doing this manually in rpm-ostree CI; let's standardize
on this.
cgwalters added a commit to cgwalters/coreos-ci-lib that referenced this issue Feb 3, 2021
jlebon pushed a commit to coreos/coreos-ci-lib that referenced this issue Feb 3, 2021
cgwalters added a commit to cgwalters/release that referenced this issue Feb 4, 2021
Let's do the build variants and unit testing in Prow, saving
the bare metal capacity of CentOS CI for our VM testing.

CC coreos/coreos-ci#23
jlebon added a commit to jlebon/coreos-ci-lib that referenced this issue Mar 21, 2022
In cosa CI, we're hitting:

> runtime: failed to create new OS thread

I think this is another instance of non-Kubernetes-aware multiprocessing
like in coreos/coreos-ci#23.

Let's expose the `resources` knob for building images like we already do
for `pod`. This will allow us in cosa to request a specific amount, and
then ask golang to respect it.
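
One way "asking golang to respect it" might look is to derive `GOMAXPROCS` (an environment variable the Go runtime reads at startup) from the requested CPU amount. This is a sketch only: `cpu_quantity_to_procs` is a hypothetical helper, and it handles just whole numbers and millicpu values such as `1500m`, not decimals like `1.5`.

```sh
# cpu_quantity_to_procs QTY: convert a Kubernetes-style CPU quantity
# ("2", "1500m") to a whole number of procs, rounded down, minimum 1.
cpu_quantity_to_procs() {
    qty="$1"
    case "$qty" in
        '') milli="" ;;
        *m) milli="${qty%m}" ;;
        *)  milli="${qty}000" ;;
    esac
    case "$milli" in
        ''|*[!0-9]*)
            echo "cpu_quantity_to_procs: unsupported quantity '$qty'" >&2
            return 1
            ;;
    esac
    procs=$(( milli / 1000 ))
    [ "$procs" -ge 1 ] || procs=1
    echo "$procs"
}

# Usage: keep the Go scheduler within the pod's CPU allotment.
#   export GOMAXPROCS="$(cpu_quantity_to_procs 1500m)"
```

Rounding down is a deliberate choice here so the Go scheduler never runs more threads in parallel than the quota covers. Note that `GOMAXPROCS` bounds threads executing Go code, not every OS thread the runtime may create.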
jlebon added a commit to coreos/coreos-ci-lib that referenced this issue Mar 21, 2022