Maintenance effort vs. package downloads #171

Closed
h-vetinari opened this issue Jun 28, 2021 · 10 comments

@h-vetinari
Member

In the context of trying to babysit the aarch/ppc CI after #169, I had a look at the available builds and began to wonder how the package downloads stacked up. At the time of writing, 1.6.3 was pretty much exactly 2 months old and had the following download numbers (note that the py37 build on aarch had failed for 3a76a91).

|        | linux x64 | linux aarch64 | linux ppc64le | osx x64 | osx arm64 | win     | sum     | %      |
|--------|-----------|---------------|---------------|---------|-----------|---------|---------|--------|
| py37   | 164'043   | -             | 177           | 22'059  | -         | 74'602  | 260'881 | 33.58% |
| py38   | 172'293   | 473           | 262           | 34'672  | 5431      | 82'829  | 295'960 | 38.10% |
| py39   | 144'913   | 1710          | 158           | 28'669  | 5932      | 36'050  | 217'432 | 27.99% |
| pypy37 | 1985      | 110           | 66            | 368     | -         | -       | 2529    | 0.33%  |
| sum    | 483'234   | 2293          | 663           | 85'768  | 11'363    | 193'481 | 776'802 |        |
| %      | 62.21%    | 0.30%         | 0.09%         | 11.04%  | 1.46%     | 24.91%  |         |        |

PPC - which is by far the biggest problem child right now - represents less than 0.1% of downloads, which raises the question of whether that justifies a maintenance effort that is wildly disproportionate compared to the other arches.

This would be easily solvable with better CI (or even just a higher timeout), but unless someone with a big interest in PPC (IBM?) sponsors a separate CI queue for it, I'm doubtful that the CI woes are fixable, especially since the scipy build is becoming heavier & heavier.
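For the "higher timeout" part, a conda-forge.yml tweak along these lines might already help (a minimal sketch; I'm assuming `idle_timeout_minutes` is the knob conda-smithy exposes for the no-output timeout, so the key name should be double-checked):

```yaml
# conda-forge.yml (sketch): raise the "no output received" timeout on
# providers that honour it; key name assumed, please verify against the docs.
idle_timeout_minutes: 30
```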

Note that conda-forge does support building PPC packages through QEMU on Azure, but for scipy specifically, this produces ~2000 test failures.
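For reference, routing PPC to those emulated builds is just a provider switch in the feedstock config (a sketch; I'm assuming the `azure` value is what triggers the QEMU-based builds):

```yaml
# conda-forge.yml (sketch): build linux-ppc64le on Azure, where it runs
# under QEMU emulation rather than on native Travis hardware.
provider:
  linux_ppc64le: azure
```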

CC @conda-forge/core @jayfurmanek

@rgommers
Contributor

> especially since the scipy build is becoming heavier & heavier.

We're switching build systems, I intend to get that done before 1.8.0 (December). Then the builds should be faster than they've been in years.

Still, point taken - there's only so much templated C++ and Cython we can add before (free for open source) CI systems can't deal with it anymore.

@rgommers
Contributor

As a reference point, I checked the last successful ppc64le build for the main scipy repo (5 months ago, when TravisCI credits ran out): https://www.travis-ci.com/github/scipy/scipy/jobs/473536688.

The whole thing, including the test suite run, took 20 minutes; the build itself less than 5 minutes:

[screenshot of the Travis CI job log showing the build and test timings]

The successful ppc64le builds on this repo take about 43 minutes. And the last failed one says:

Successfully installed scipy-1.7.0
Removed build tracker: '/tmp/pip-req-tracker-ch1rplur'

Resource usage statistics from building scipy:
   Process count: 35
   CPU time: Sys=0:00:52.8, User=1:09:59.3
   Memory: 3.7G
   Disk usage: 243.6K
   Time elapsed: 0:22:27.4

So the hardware is fine. The actual problem:

INFO:conda_build.build:Packaging scipy
INFO conda_build.build:build(2274): Packaging scipy

Packaging scipy-1.7.0-py37h7638e60_0
INFO:conda_build.build:Packaging scipy-1.7.0-py37h7638e60_0
INFO conda_build.build:bundle_conda(1514): Packaging scipy-1.7.0-py37h7638e60_0
compiling .pyc files...
number of files: 1821

Warning: rpath /home/conda/feedstock_root/build_artifacts/scipy_1624870161116/_build_env/lib is outside prefix /home/conda/feedstock_root/build_artifacts/scipy_1624870161116/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place (removing it)
...
...
    INFO (scipy,lib/python3.7/site-packages/scipy/linalg/_flinalg.pypy37-pp73-linux-gnu.so): Needed DSO powerpc64le-conda-linux-gnu/sysroot/lib64/libc.so.6 found in CDT/compiler package conda-forge::sysroot_linux-ppc64le-2.17-h8b29623_10
...
...

Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /home/conda/feedstock_root/build_artifacts/scipy_1624870161116/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_

The following NEW packages will be INSTALLED:

    _libgcc_mutex:      0.1-conda_forge            conda-forge
    _openmp_mutex:      4.5-1_gnu                  conda-forge
    attrs:              21.2.0-pyhd8ed1ab_0        conda-forge
    bzip2:              1.0.8-h4e0d66e_4           conda-forge
    ca-certificates:    2021.5.30-h1084571_0       conda-forge
    execnet:            1.9.0-pyhd8ed1ab_0         conda-forge
    expat:              2.4.1-h3b9df90_0           conda-forge
    gdbm:       

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.

Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received

The build has been terminated

So it's conda and/or conda-build being slow as a dog (I can't tell which one).

@h-vetinari
Member Author

> We're switching build systems, I intend to get that done before 1.8.0 (December). Then the builds should be faster than they've been in years.

Cool! Xref (for anyone interested): scipy/scipy#13615

Regarding your second point, yes, the conda solver (& general CI setup) eats a substantial amount of time. Not sure what options there are to change this, it's not like people aren't aware of it.

The failing log you picked out is for PPC+PyPy, which I noted happens consistently, leading me to believe it's some bug in the interaction between PyPy and PPC that reproducibly leads to hangs - which is why I stopped retriggering that job.

Note that Travis also seems to have varying underlying hardware in its fleet; there were a lot of timeouts (5-6) before the py37/py38 builds finally ran through. Since I kept restarting the failed CPython builds (and once a job runs through, the previous logs are no longer available), there are unfortunately no other failing logs (though they would be easy to produce with a new PR...).

@rgommers
Contributor

> Regarding your second point, yes, the conda solver (& general CI setup) eats a substantial amount of time. Not sure what options there are to change this, it's not like people aren't aware of it.

Is there a conda-forge issue about switching to Mamba somewhere? Mamba is reliable enough by now, I'd think, and this would address the actual root cause of these problems.

@wolfv
Member

wolfv commented Oct 13, 2021

Hey Ralf, yes, we discussed the usage of mambabuild briefly at the last core meeting, and Marius just opened a PR to make it the default. Basically, we have the "go" to allow uploads of packages built with mambabuild (currently it can only be used for debugging because uploading is prohibited). That's a quick fix in the conda-smithy build script generation.

Would be cool if you weigh in on the open PR!

@h-vetinari
Member Author

> Would be cool if you weigh in on the open PR!

Link for convenience: conda-forge/conda-smithy#1507

@rgommers
Contributor

That's great news, thanks @wolfv! From reading through the two PRs it's not 100% clear to me, but I think we can enable it today in this feedstock by adding `"build_with_mambabuild": True` to scipy-feedstock/conda-forge.yml?
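Concretely, I'm thinking of something like this (a sketch based on the linked PRs; exact key name and accepted values still to be confirmed):

```yaml
# scipy-feedstock/conda-forge.yml (sketch): have conda-smithy generate build
# scripts that call mambabuild instead of plain conda-build.
build_with_mambabuild: true
```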

@wolfv
Member

wolfv commented Oct 14, 2021

Actually, I am not sure if we need a new release of conda-smithy – but yes, that will be possible!

@rgommers
Contributor

@h-vetinari, worth trying to see if we can avoid turning off the aarch64 builds that way?

@h-vetinari
Member Author

This situation has improved quite a lot in recent times, so I'm closing this issue...
