
troubleshoot() misses .so errors that happen at the auditwheel step (was: when running cibuildwheel locally, a {project}/build directory may interfere) #793

Closed
jbarlow83 opened this issue Aug 13, 2021 · 15 comments · Fixed by #807

Comments

@jbarlow83
Contributor

To reproduce:

  • using a Linux host machine that is significantly newer than the manylinux target (in my case, Ubuntu 21.04 and manylinux2010)
  • use the same version of Python on host and target
  • git clone any package with C extensions
  • pip install -e .
  • cibuildwheel --plat linux

You will probably get error messages that complain about use of newer GLIBC symbols. These will coincide with the host machine's GLIBC.

The reason is that locally installing the package (pip install [-e] . or python setup.py bdist_wheel) creates built libraries in the build/ directory. cibuildwheel then maps the local directory into the container, so those artifacts appear at {project}/build; when the Docker container runs pip, pip thinks the C extension is already built and up to date, so it skips the build step and uses the wrong binaries.

I suggest creating an additional volume mapping from e.g. host:/tmp/cibuildwheel-build to container:/project/build, so that the container is isolated from the local build directory (and so that these temporary objects can be accessed if the build fails).
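The suggested isolation could look something like the following sketch: a helper that assembles the container's run arguments with a throwaway directory mounted over {project}/build. Everything here (function name, mount paths, image tag) is illustrative, not cibuildwheel's actual code.

```python
# Hypothetical sketch of the proposed extra volume mapping. The names
# build_docker_args and build_scratch are illustrative, not real cibuildwheel API.

def build_docker_args(project_dir: str,
                      build_scratch: str = "/tmp/cibuildwheel-build") -> list:
    """Return docker-run args that mount the project but shadow its build/ dir."""
    return [
        "docker", "run", "--rm",
        # the project source, as cibuildwheel already mounts it
        "-v", f"{project_dir}:/project",
        # an empty scratch dir hides any stale host-side build/ artifacts,
        # and survives the build so its contents can be inspected on failure
        "-v", f"{build_scratch}:/project/build",
        "quay.io/pypa/manylinux2010_x86_64",
    ]
```

Because the scratch mount is layered over /project/build inside the container, the host's stale .o/.so files under build/ never become visible to pip during the containerized build.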

@Czaki
Contributor

Czaki commented Aug 13, 2021

If you run a build locally, the .so files will be inside your project directory, and the existence of the build directory is irrelevant: setuptools will find that these .so files exist and will not perform the build step.

@jbarlow83
Contributor Author

How do you mean it's irrelevant? The fact that setuptools/pip prevents the build step is precisely the issue being raised here.

@pradyunsg
Member

pradyunsg commented Aug 13, 2021

The existence of a build directory, when building with setuptools, is not irrelevant. Setuptools doesn't clean the directory before performing a build, so past build artifacts will get included in newer artifacts. See the pip 20.0 release fiasco for an example.

@Czaki
Contributor

Czaki commented Aug 13, 2021

How do you mean it's irrelevant? The fact that setuptools/pip prevents the build step is precisely the issue being raised here.

Because if you have .so files inside the source tree (which happens when you call pip install -e, for example), removing the build directory changes nothing.

You can also change this behavior with CIBW_BEFORE_ALL="rm -rf {project}/build"

For me, build is too common a name to block its usage.

@Czaki
Contributor

Czaki commented Aug 13, 2021

The existence of a build directory, when building with setuptools is not irrelevant. Setuptools doesn't clean the directory before performing a build, so past build artifacts will get included in newer artifacts. See the pip 20.0 release fiasco for example.

I know this, but build is too common a name to mask by default.

cibuildwheel gives you a tool to set up a clean step (it can be done in the pyproject.toml file).

I know the pain of cleaning artifacts on Linux machines, but the requested change could create more problems than it solves.

@pradyunsg
Member

Yea, I didn't mean to imply that you don't know that.

What I was hinting at is that it is relevant to how the user's project would be built, which is what they're using cibuildwheel for. I do agree that the approach suggested by OP is the solution here and that this is not something that cibuildwheel can solve.

And given that you've already provided the context that the user might have needed, I'm gonna bow out now.

@jbarlow83
Contributor Author

I spent many hours investigating this issue. It's a nontrivial dirty state interaction, and the error message (through no fault of cibuildwheel) points in the wrong direction.

because if you have .so files inside code (which happens when you call pip install -e for example) then removing the build directory will change nothing.

This is not actually true. When building a wheel, only whitelisted files from the source tree are copied into the wheel. By default this is only .py files; everything else needs to be listed in the package's package_data definition, even innocuous files like py.typed. That means that if your source tree contains .so files, they will be ignored. Compiled C extensions get added to the wheel in a separate step, after adding the Python source tree, and a dirty build/ directory contaminates that separate step. I don't want to get into the weeds on this, but you can build any wheel with pip wheel -v . to confirm how this actually works.

I understand that one could mitigate this by explicitly deleting build/. But the whole point of CI is to have isolated, repeatable build environments. cibuildwheel should identify the problem of a dirty build/ on its own.
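A detection along those lines might look like this sketch: scan the host-side source tree before copying it into the container and report artifacts that could contaminate the build. The function name and the exact heuristics are hypothetical, not cibuildwheel code.

```python
# Hypothetical pre-build check: report a stale build/ directory and any
# stray .so files in the source tree that would be copied into the container.
from pathlib import Path


def find_local_build_artifacts(project_dir: Path) -> list:
    """Return paths that may contaminate an isolated wheel build."""
    suspects = []
    build_dir = project_dir / "build"
    if build_dir.is_dir():
        # a leftover setuptools build cache from a local install
        suspects.append(build_dir)
    # stray compiled extensions outside build/ (e.g. from pip install -e)
    suspects.extend(
        p for p in project_dir.rglob("*.so") if build_dir not in p.parents
    )
    return suspects
```

cibuildwheel could print a warning listing these paths before starting the container, which would point users at the real cause instead of a confusing glibc-symbol error at the auditwheel step.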

@Czaki
Contributor

Czaki commented Aug 14, 2021

I understand that one could mitigate this by explicitly deleting build/. But the whole point of CI is to have isolated, repeatable build environments. cibuildwheel should identify the problem of a dirty build/ on its own.

If you build wheels on CI, how could you have a dirty build/? A dirty build is something you get on a local machine; I do not see a scenario on CI where you would end up with a dirty build dir. CI should always provide the same starting state, and the repository should not contain a build directory with artifacts from a wheel build.

This is not actually true. When building a wheel only whitelisted files from the source tree are copied into the wheel. By default this is only .py files; everything else needs to be listed in the package's package_data definition

I'm not sure; I need to check this. The last time I made a similar mistake, the .so files were copied. Maybe they fixed it later. (I will check it today.)

@joerick
Contributor

joerick commented Aug 16, 2021

Ok, so I've done some experiments so we're all on the same page...

$ pwd
/Users/joerick/Desktop/test-project
$ tree
.
├── setup.cfg
├── setup.py
└── src
    └── spam.c

1 directory, 3 files

Installing it normally-

$ pip install .
Processing /Users/joerick/Desktop/test-project
  DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
   pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.
Using legacy 'setup.py install' for spam, since package 'wheel' is not installed.
Installing collected packages: spam
  Attempting uninstall: spam
    Found existing installation: spam 0.1.0
    Uninstalling spam-0.1.0:
      Successfully uninstalled spam-0.1.0
    Running setup.py install for spam ... done
Successfully installed spam-0.1.0
$ tree
.
├── setup.cfg
├── setup.py
└── src
    └── spam.c

1 directory, 3 files

So far, so good. No build artifacts added. Let's try installing that with -e.

$ git clean -df
$ tree
.
├── setup.cfg
├── setup.py
└── src
    └── spam.c

1 directory, 3 files
$ pip install -e .
Obtaining file:///Users/joerick/Desktop/test-project
Installing collected packages: spam
  Running setup.py develop for spam
Successfully installed spam-0.1.0
$ tree
.
├── build
│   ├── lib.macosx-10.9-x86_64-3.8
│   │   └── spam.cpython-38-darwin.so
│   └── temp.macosx-10.9-x86_64-3.8
│       └── src
│           └── spam.o
├── setup.cfg
├── setup.py
├── spam.cpython-38-darwin.so
├── spam.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   └── top_level.txt
└── src
    └── spam.c

6 directories, 10 files

So, a few things to note here. The build added build and spam.egg-info dirs. It also added spam.cpython-38-darwin.so, outside of the build tree. The binary .o and .so files are the issue here; they shouldn't be moved across to the Docker container.

So we could write rules to exclude the build dir, but I believe the isolated spam.cpython-38-darwin.so is the issue that @Czaki is mentioning. These files get spread throughout the repository, in my experience, wherever the .c file is located. I don't think we can automatically exclude .so files when copying, because some users will want to bring .so files into the compile process (e.g. a binary dependency). We discussed this issue in the past at #139. In response to that, we added a message on build failure. Did you see this message @jbarlow83?


As an aside, I was curious what would happen when --use-feature=in-tree-build becomes the default. Here's what I found:

$ git clean -df
Removing build/
Removing spam.cpython-38-darwin.so
Removing spam.egg-info/
$ tree
.
├── setup.cfg
├── setup.py
└── src
    └── spam.c

1 directory, 3 files
$ pip install --use-feature=in-tree-build .
Processing /Users/joerick/Desktop/test-project
Using legacy 'setup.py install' for spam, since package 'wheel' is not installed.
Installing collected packages: spam
    Running setup.py install for spam ... done
Successfully installed spam-0.1.0
$ tree
.
├── build
│   ├── lib.macosx-10.9-x86_64-3.8
│   │   └── spam.cpython-38-darwin.so
│   └── temp.macosx-10.9-x86_64-3.8
│       └── src
│           └── spam.o
├── setup.cfg
├── setup.py
├── spam.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   └── top_level.txt
└── src
    └── spam.c

This is slightly different from the -e . case, in that it doesn't leave .so files scattered around the source tree. But it does create the build directory.

@jbarlow83
Contributor Author

Did you see this message @jbarlow83?

No. This is the failure path:

Repairing wheel...

    + sh -c 'auditwheel repair -w /tmp/cibuildwheel/repaired_wheel /tmp/cibuildwheel/built_wheel/pikepdf-3.0.0b2.dev9+g0bd6795b-cp39-cp39-linux_x86_64.whl'
INFO:auditwheel.main_repair:Repairing pikepdf-3.0.0b2.dev9+g0bd6795b-cp39-cp39-linux_x86_64.whl
usage: auditwheel [-h] [-V] [-v] command ...
auditwheel: error: cannot repair "/tmp/cibuildwheel/built_wheel/pikepdf-3.0.0b2.dev9+g0bd6795b-cp39-cp39-linux_x86_64.whl" to "manylinux2010_x86_64" ABI because of the presence of too-recent versioned symbols. You'll need to compile the wheel on an older toolchain.

                                                              ✕ 1.27s
Error: Command ['sh', '-c', 'auditwheel repair -w /tmp/cibuildwheel/repaired_wheel /tmp/cibuildwheel/built_wheel/pikepdf-3.0.0b2.dev9+g0bd6795b-cp39-cp39-linux_x86_64.whl'] failed with code 2

Looking around the code, the troubleshoot message does not print when auditwheel fails. Perhaps with earlier versions of pip wheel, behavior/failure modes were a bit different.

def troubleshoot(package_dir: Path, error: Exception) -> None:
    if isinstance(error, subprocess.CalledProcessError) and (
        error.cmd[0:4] == ["python", "-m", "pip", "wheel"]
        or error.cmd[0:3] == ["python", "-m", "build"]
    ):

@joerick
Contributor

joerick commented Aug 18, 2021

Thanks for getting back @jbarlow83!

Perhaps with earlier versions of pip wheel, behavior/failure modes were a bit different.

Yes, that sounds possible. I think we should adapt this clause to catch auditwheel failures too, and print the same message. I don't think we can do anything further; even if we deleted the build directory when copying, we'd still have the .so files scattered around, so it wouldn't solve the problem.

If anyone has time to send a PR updating that troubleshoot clause to include auditwheel failures, I'd be grateful.
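For anyone picking this up, the adapted clause might look roughly like the following sketch. The ["sh", "-c", ...] command shape is taken from the failure log above, but treat the exact prefixes as an assumption rather than the merged code.

```python
# Sketch of extending the troubleshoot() predicate to also fire on
# auditwheel failures. The helper name and the exact command-prefix
# checks are illustrative assumptions, not cibuildwheel's merged fix.
import subprocess


def error_looks_like_build_failure(error: Exception) -> bool:
    if not isinstance(error, subprocess.CalledProcessError):
        return False
    cmd = error.cmd
    return (
        cmd[0:4] == ["python", "-m", "pip", "wheel"]
        or cmd[0:3] == ["python", "-m", "build"]
        # new: the repair step can also fail on too-recent glibc symbols
        or (len(cmd) >= 3
            and cmd[0:2] == ["sh", "-c"]
            and "auditwheel repair" in cmd[2])
    )
```

With this predicate, the existing "stale .so / build directory" hint would also print when the build succeeds but auditwheel rejects the wheel, which is the failure path shown in the log above.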

@joerick joerick changed the title When running cibuildwheel locally, a {project}/build directory may interfere troubleshoot() misses .so errors that happen at the auditwheel step (was: when running cibuildwheel locally, a {project}/build directory may interfere) Aug 23, 2021
@jbarlow83
Contributor Author

setuptools uses build/temp* and build/lib* as a build cache, and cibuildwheel inadvertently shares the build cache from the host and container by copying the whole source tree. I apologize but I can't get around the fact that unintentionally sharing the build cache from host to container is counterproductive when isolating builds is the goal. I believe that Cython has its own build cache that it drops elsewhere in build/.

By the time we get to troubleshoot(), the build folder is going to contain mostly valid .o and .so files. Searching for these files isn't going to help; we expect a bunch of them. It would work to warn before the build that these folders contain files that may interfere with auditwheel. I can throw together a PR along those lines.

The core devs have noticed this is a problem in other areas, and there's a draft PEP to make the cache directory configurable. That would be the best option when it's available: use a /tmp directory and avoid these issues entirely, with no problem for users who happen to be doing strange things in their build folder.
https://github.com/python/peps/pull/1976/files

@joerick
Contributor

joerick commented Aug 26, 2021

setuptools uses build/temp* and build/lib* as a build cache, and cibuildwheel inadvertently shares the build cache from the host and container by copying the whole source tree. I apologize but I can't get around the fact that unintentionally sharing the build cache from host to container is counterproductive when isolating builds is the goal.

This is true, but the caches are not just in ./build. See my example above:

$ git clean -df
$ tree
.
├── setup.cfg
├── setup.py
└── src
    └── spam.c

1 directory, 3 files
$ pip install -e .
Obtaining file:///Users/joerick/Desktop/test-project
Installing collected packages: spam
  Running setup.py develop for spam
Successfully installed spam-0.1.0
$ tree
.
├── build
│   ├── lib.macosx-10.9-x86_64-3.8
│   │   └── spam.cpython-38-darwin.so
│   └── temp.macosx-10.9-x86_64-3.8
│       └── src
│           └── spam.o
├── setup.cfg
├── setup.py
├── spam.cpython-38-darwin.so  <-------------------------------- this file
├── spam.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   └── top_level.txt
└── src
    └── spam.c

6 directories, 10 files

There are also cached outputs stored alongside the source files, as well as in build. I don't know why setuptools does this; maybe somebody does? But the point is that removing the build dir wouldn't solve the problem, as these files would still be copied over.

By the time we get to troubleshoot(), the build folder is going to contain mostly valid .o and .so files. Searching for these files isn't going to help; we expect a bunch of them.

Not true, I think. The troubleshoot() function scans the host's version of the source tree, so it's not affected by the build that happened in the Docker container.

@jbarlow83
Contributor Author

spam.cpython-38-darwin.so is not a cache file; it's an output that gets dumped in the source tree. While it will be copied to the container, it only gets included in the wheel if the package manifest includes that specific file (e.g. by adding it to package_data in setup.cfg). pip wheel only copies files in the manifest to the wheel.

You can easily confirm that the compiled .so files that get copied into the wheel always come from build/, and that copies outside of build/ get ignored. If you run pip wheel -v ., you'll see something like:

copying build/lib.linux-x86_64-3.9/spam.cpython-39-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/wheel/spam

Hack ./spam*.so to be different from build/lib*/spam*.so and use wheel unpack to confirm that the latter is what gets copied to the wheel.
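Since a wheel is just a zip archive, the same check can be done without wheel unpack. This is an illustrative helper, not part of any project's tooling; the filenames in the comment are examples.

```python
# List the shared libraries inside a wheel, e.g. to confirm which
# spam*.so actually shipped. Wheels are plain zip files, so the
# standard-library zipfile module is enough.
import zipfile


def shared_libs_in_wheel(wheel_path: str) -> list:
    """Return the names of all .so members inside a wheel."""
    with zipfile.ZipFile(wheel_path) as wf:
        return [name for name in wf.namelist() if name.endswith(".so")]
```

Comparing the size or hash of the listed member against build/lib*/spam*.so versus ./spam*.so shows which copy made it into the wheel.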

Not true, I think. The troubleshoot() function scans the host's version of the source tree, so it's not affected by the build that happened in the Docker container.

I stand corrected. Container changes won't show up in the host, so troubleshoot() may be viable after all.

@Czaki
Contributor

Czaki commented Aug 26, 2021

You can easily confirm that the compiled .so files that get copied to the wheel always come from build/ and copies outside of build get ignored. If you check pip wheel -v . you'll see something like:

Did you try this with a cleaned build directory? I cannot check now, but in the past, if build was deleted and proper *.so files were present in the project structure, they were used instead of creating new ones.

I still do not understand what the point is. cibuildwheel is a tool for building wheels on CI services, not on a local machine. I do not see a scenario where on CI you would have a polluted file structure (other than keeping binaries in a repository).

As far as I understand, you are trying to push a solution for local cibuildwheel usage which may produce multiple problems for other people using it on CI services.

jbarlow83 added a commit to jbarlow83/cibuildwheel that referenced this issue Aug 26, 2021
jbarlow83 added a commit to jbarlow83/cibuildwheel that referenced this issue Aug 26, 2021