Fix venv creation in Python environments #297628

Merged
domenkozar merged 2 commits into NixOS:staging from cwp:python-env-venv on Mar 22, 2024

cwp commented Mar 21, 2024

The problem

The way we build python environments is subtly broken. A python environment should be semantically
identical to a vanilla Python installation in, say, /usr/local. The current implementation,
however, differs in two important ways.

The first is that it's impossible to use python packages from the environment in python virtual
environments. Here's a demonstration:

# build using nixpkgs master branch
> nix-build \
  -E '{pkgs ? import <nixpkgs> {}}: pkgs.python3.withPackages (ps: [ps.requests])' \
  -o classic
/nix/store/g8r3c4fwrl5gzz2qybj3i7s5119q8qs2-python3-3.10.12-env

# we can import a package installed in the environment
> classic/bin/python -c 'import requests; print(requests)'
<module 'requests' from '/nix/store/g8r3c4fwrl5gzz2qybj3i7s5119q8qs2-python3-3.10.12-env/lib/python3.10/site-packages/requests/__init__.py'>

# but can't import that package from a venv
> classic/bin/python -m venv --system-site-packages classic-venv
> classic-venv/bin/python -c 'import requests; print(requests)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'requests'

Another problem is that the nix installation of python appears to python code to be a virtual
environment. The canonical way to detect a Python venv is to compare sys.prefix to
sys.base_prefix. In the base python installation, they will have the same value, but a venv will
update sys.prefix to point to the venv.

# the nix environment appears to be a venv
> classic/bin/python -c 'import sys; print(sys.base_prefix); print(sys.prefix)'
/nix/store/2rrfpkq6cr8ppip9szl0z1qfdlskdinq-python3-3.10.12
/nix/store/g8r3c4fwrl5gzz2qybj3i7s5119q8qs2-python3-3.10.12-env

# the venv has sys.base_prefix set to the bare python interpreter package
> classic-venv/bin/python -c 'import sys; print(sys.base_prefix); print(sys.prefix)'
/nix/store/2rrfpkq6cr8ppip9szl0z1qfdlskdinq-python3-3.10.12
/Users/cwp/dev/nixpkgs/classic-venv

The nix-generated environment appears to be a venv, but it's not. Most notably, it lacks a
pyvenv.cfg file in the root directory, but the structure of the file tree and the symlinks it
contains are also a bit different. Sophisticated python code that manipulates virtualenvs fails when
run in a nix environment.
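
The two checks described above can be combined in a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes that comparing the prefixes and looking for pyvenv.cfg in the prefix root is enough to tell the three situations apart.

```python
import sys
from pathlib import Path


def looks_like_venv(prefix: str, base_prefix: str) -> bool:
    """Canonical check: a venv is assumed when the two prefixes differ."""
    return prefix != base_prefix


def has_venv_marker(prefix: str) -> bool:
    """A real venv carries a pyvenv.cfg file in its root directory."""
    return (Path(prefix) / "pyvenv.cfg").is_file()


def classify(prefix: str, base_prefix: str) -> str:
    """Distinguish a base install, a real venv, and a prefix mismatch without a marker."""
    if not looks_like_venv(prefix, base_prefix):
        return "base installation"
    if has_venv_marker(prefix):
        return "virtual environment"
    return "looks like a venv, but has no pyvenv.cfg"


if __name__ == "__main__":
    # Report how the running interpreter would be classified.
    print(classify(sys.prefix, sys.base_prefix))
```

Run with classic/bin/python, a script like this would be expected to hit the third case, since the prefixes differ but the environment has no pyvenv.cfg.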

The cause

The build machinery for Python packages does some clever manipulation of the Python runtime to make
the python interpreter and python packages that are individually defined as nix derivations
available to python code without copying them all into a single directory the way a vanilla Python
installation does.

The python interpreter packages use the sitecustomize
hook to read environment variables and manipulate the python runtime:

  • it sets sys.executable based on NIX_PYTHONEXECUTABLE
  • it sets sys.prefix based on NIX_PYTHONPREFIX
  • it adds directories to sys.path based on NIX_PYTHONPATH
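
For readers unfamiliar with the hook, here is a rough sketch of what such a sitecustomize module does. It is not the file shipped in nixpkgs: the function name is hypothetical, the real hook may register paths differently, and the runtime is passed in as an argument (normally it would be the sys module) so the effect can be inspected.

```python
import os
from types import SimpleNamespace


def apply_nix_overrides(environ, runtime):
    """Patch `runtime` (normally the sys module) from the NIX_* variables."""
    executable = environ.get("NIX_PYTHONEXECUTABLE")
    if executable:
        runtime.executable = executable
    prefix = environ.get("NIX_PYTHONPREFIX")
    if prefix:
        runtime.prefix = prefix
    extra = environ.get("NIX_PYTHONPATH")
    if extra:
        # Append each listed directory to the module search path once.
        for entry in extra.split(os.pathsep):
            if entry and entry not in runtime.path:
                runtime.path.append(entry)
    return runtime


if __name__ == "__main__":
    # Stand-in for the sys module, initialized for the bare interpreter.
    fake_sys = SimpleNamespace(
        executable="/base/bin/python",
        prefix="/base",
        base_prefix="/base",
        path=["/base/lib/python3.10"],
    )
    env = {
        "NIX_PYTHONEXECUTABLE": "/env/bin/python",
        "NIX_PYTHONPREFIX": "/env",
        "NIX_PYTHONPATH": "/env/lib/python3.10/site-packages",
    }
    apply_nix_overrides(env, fake_sys)
    print(fake_sys.prefix, fake_sys.base_prefix)
```

Note that nothing in the list above touches sys.base_prefix, which is consistent with the prefix mismatch shown in the demonstration.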

When a python environment with additional packages is built, a wrapper is created
that invokes the python interpreter with the correct environment to supply the sitecustomize
module with the information it needs. This wrapper is what causes the issues mentioned above. The
python interpreter relies very heavily on the path with which it is invoked to initialize its
runtime. The wrapper is quite transparent; it invokes the python interpreter with the "real" path
of the interpreter rather than that of the wrapper. This leads the python runtime to get built in
the context of the bare interpreter, rather than the full environment. The sitecustomize module
then makes some tweaks, but it can't completely compensate for the "incorrect" initialization of
the runtime.
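
To make the dependence on the invocation path concrete, here is a heavily simplified sketch of how CPython derives its prefix: it searches upward from the directory of the executable it was started as for a landmark file, lib/pythonX.Y/os.py. The real logic also handles pyvenv.cfg, PYTHONHOME, and other cases; the function name and the default version below are assumptions for illustration only.

```python
from pathlib import Path
from typing import Optional


def find_prefix(executable: str, version: str = "3.10") -> Optional[str]:
    """Walk up from the executable's directory until the stdlib landmark is found."""
    landmark = Path("lib") / f"python{version}" / "os.py"
    start = Path(executable).parent  # deliberately no symlink resolution
    for candidate in [start, *start.parents]:
        if (candidate / landmark).is_file():
            return str(candidate)
    return None
```

Under this model, starting the interpreter as <env>/bin/python yields the environment as the prefix (provided the environment exposes the standard library under its own lib directory), while starting it under the bare interpreter's store path yields the bare interpreter package, which is the mismatch described above.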

The solution

The fix is pretty simple. Rather than invoking python with the "correct" path to the interpreter,
we invoke it with the path to the wrapper. This causes python to initialize its runtime correctly,
and makes all the sitecustomize machinery unnecessary. Rather than manipulating the
python runtime, we rely on symlinks to make the various nix packages available in the environment.

# build using a patched nixpkgs
> nix-build \
  -I nixpkgs="$HOME/dev/nixpkgs" \
  -E '{pkgs ? import <nixpkgs> {}}: pkgs.python3.withPackages (ps: [ps.requests])' \
  -o fixed
/nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env

# we can import packages from the nix environment and the venv
> /nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env/bin/python -m venv --system-site-packages fixed-venv
> fixed-venv/bin/pip install six
Collecting six
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: six
Successfully installed six-1.16.0
> fixed-venv/bin/python -c 'import six; import requests; print(six); print(requests)'
<module 'six' from '/Users/cwp/dev/eg-nix-venv/fixed-venv/lib/python3.10/site-packages/six.py'>
<module 'requests' from '/nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env/lib/python3.10/site-packages/requests/__init__.py'>

# sys.prefix matches sys.base_prefix, so it doesn't look like a venv
> /nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env/bin/python -c 'import sys; print(sys.base_prefix); print(sys.prefix)'
/nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env
/nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env

# this also works when python is invoked through a symlink
> fixed/bin/python -c 'import sys; print(sys.base_prefix); print(sys.prefix)'
/Users/cwp/dev/eg-nix-venv/fixed
/Users/cwp/dev/eg-nix-venv/fixed

# the venv has sys.base_prefix set to the environment
> fixed-venv/bin/python -c 'import sys; print(sys.base_prefix); print(sys.prefix)'
/nix/store/bm1s987flajl504nk3k7bkidx4zf1pj0-python3-3.10.12-env
/Users/cwp/dev/eg-nix-venv/fixed-venv

This approach has several benefits:

  • virtualenvs can use packages from the python environment
  • the nix environment no longer looks like a virtualenv to Python code
  • the Python interpreter packages become simpler
  • nix packages can use sitecustomize for their own purposes
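
For tools that create venvs programmatically rather than through the command line, the equivalent of `python -m venv --system-site-packages` can be sketched with the standard library's venv module. The helper name is made up for illustration; pip is skipped to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path


def create_venv(target: str) -> dict:
    """Create a venv that can see the parent's site-packages; return its pyvenv.cfg."""
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(target)
    config = {}
    # pyvenv.cfg is a flat list of "key = value" lines.
    for line in (Path(target) / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            config[key.strip()] = value.strip()
    return config


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_venv(str(Path(tmp) / "demo-venv"))
        print(cfg.get("home"))
        print(cfg.get("include-system-site-packages"))
```

The `home` key records the directory of the interpreter that created the venv, so it is a quick way to see which installation a venv considers its base.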

Testing

This change includes tweaks to the existing python environment tests, and several new tests.

cwp requested review from Ericson2314 and FRidh as code owners March 21, 2024 05:53
github-actions bot added the 6.topic: python label Mar 21, 2024
cwp (Author) commented Mar 21, 2024

@domenkozar @3541 New PR

cwp force-pushed the python-env-venv branch from ba8f001 to f463bcf on March 21, 2024 06:00
ofborg bot added the 10.rebuild-darwin-stdenv and 10.rebuild-linux-stdenv labels Mar 21, 2024
ofborg bot requested a review from andersk March 21, 2024 07:26
ofborg bot added the 10.rebuild-darwin: 501+, 10.rebuild-darwin: 5001+, 10.rebuild-linux: 501+, and 10.rebuild-linux: 5001+ labels Mar 21, 2024
domenkozar (Member) commented

I suggest we merge this to staging and give it a go.

FRidh (Member) commented Mar 21, 2024

Correct me if I am wrong, but does this assume that all executables in linked Python packages are Python scripts? This is an assumption that cannot be made; it's not uncommon to have shell scripts in bin/, and binaries can occur as well. Replacing the shebang can be done, but it needs to be checked that it is a Python interpreter, and preferably also exactly the same interpreter. The latter should actually always be the case; if not, we have another problem (e.g. with overriding somewhere).

In time we should aim to not wrap packages at build time.

cwp (Author) commented

No, it only assumes that if files named bin/foo and bin/.foo-wrapped exist, then bin/foo is a wrapper and bin/.foo-wrapped is a python script.
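
As an illustration of that naming convention, the detection can be sketched as follows. The function name is hypothetical, and the actual change performs this check in the environment's shell builder, not in Python.

```python
from pathlib import Path


def find_wrapped_scripts(bin_dir):
    """Return (wrapper, wrapped) name pairs following the bin/foo + bin/.foo-wrapped convention."""
    bin_path = Path(bin_dir)
    pairs = []
    for wrapper in sorted(bin_path.iterdir()):
        if wrapper.name.startswith("."):
            continue  # hidden files are candidates for the wrapped side only
        wrapped = bin_path / f".{wrapper.name}-wrapped"
        if wrapper.is_file() and wrapped.is_file():
            pairs.append((wrapper.name, wrapped.name))
    return pairs
```

Like the heuristic it mirrors, this sketch does not inspect the shebang of the wrapped file, so it cannot confirm that the file is a Python script.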

cwp (Author) commented

> In time we should aim to not wrap packages at build time.

Yes! I have an idea for how to do that.

FRidh (Member) left a comment

Thank you for extending the tests, this gives a lot more trust in the functioning of this approach.

Please do move the make-wrapper changes into a separate commit, as they are a separate unit.

Colin Putney added 2 commits March 21, 2024 19:26
The way we build python environments is subtly broken. A python
environment should be semantically identical to a vanilla Python
installation in, say, /usr/local. The current implementation, however,
differs in two important ways. The first is that it's impossible to use
python packages from the environment in python virtual environments. The
second is that the nix-generated environment appears to be a venv, but
it's not.

This commit changes the way python environments are built:

  * When generating wrappers for python executables, we inherit argv[0]
    from the wrapper. This causes python to initialize its configuration
    in the environment with all the correct paths.
  * We remove the sitecustomize.py file from the base python package.
    This file was used to tweak the python configuration after it was
    incorrectly initialized. That's no longer necessary.

The end result is that python environments no longer appear to be venvs,
and behave more like a vanilla python installation. In addition it's
possible to create a venv using an environment and use packages from
both the environment and the venv.

When building a python environment's bin directory, we now detect
wrapped python scripts from installed packages, and generate unwrapped
copies with the environment's python executable as the interpreter.
cwp force-pushed the python-env-venv branch from f463bcf to 9611885 on March 22, 2024 01:27
cwp (Author) commented Mar 22, 2024

@FRidh Done!

ofborg bot requested a review from FRidh March 22, 2024 03:30
3541 commented Mar 22, 2024

Did one last manual check, and everything still works as expected.

nixos-discourse commented

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixpkgs-news-a-weekly-recap-for-the-nix-community/42137/5

SuperSandro2000 (Member) commented

For some wrappers, this causes Python lines to be deleted; see #302315 or #301449.

SuperSandro2000 (Member) reviewed

makeWrapper "$path/bin/$prg" "$out/bin/$prg" --set NIX_PYTHONPREFIX "$out" --set NIX_PYTHONEXECUTABLE ${pythonExecutable} --set NIX_PYTHONPATH ${pythonPath} ${lib.optionalString (!permitUserSite) ''--set PYTHONNOUSERSITE "true"''} ${lib.concatStringsSep " " makeWrapperArgs}
if [ -f ".$prg-wrapped" ]; then
echo "#!${pythonExecutable}" > "$out/bin/$prg"
sed -e '1d' -e '3d' ".$prg-wrapped" >> "$out/bin/$prg"
Member

Assuming line 3 here is wrong. In normal wrappers this is line 2 and when injecting the wrapper this could be any line, since it is skipping comments IIRC.
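
The fragility can be reproduced in a small sketch. The first function mimics `sed -e '1d' -e '3d'`; the second removes the shebang and any injected line by content instead of by position. Both function names and the marker string are hypothetical stand-ins, and the actual text injected by the wrapping machinery is not reproduced here.

```python
def delete_fixed_lines(lines, numbers=(1, 3)):
    """Mimic `sed -e '1d' -e '3d'`: drop lines by their 1-based input position."""
    return [line for i, line in enumerate(lines, start=1) if i not in numbers]


def delete_shebang_and_marked(lines, is_injected):
    """Drop a leading shebang, then every line the predicate recognizes as injected."""
    body = lines[1:] if lines and lines[0].startswith("#!") else list(lines)
    return [line for line in body if not is_injected(line)]


if __name__ == "__main__":
    MARKER = "# hypothetical-injected-line"  # placeholder for the injected line
    script = ["#!/usr/bin/env python", MARKER, "import sys", "print(sys.argv)"]
    # Position-based deletion removes `import sys` and keeps the injected line.
    print(delete_fixed_lines(script))
    # Content-based deletion keeps the real code intact.
    print(delete_shebang_and_marked(script, lambda line: line == MARKER))
```

When the injected line sits on line 2, as in this example, the position-based variant deletes a genuine line of the script, which matches the symptom reported in the linked issues.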

Member

Reverted in #302385.

gabyx (Contributor) commented Oct 2, 2025

@cwp: This fix is really important for getting proper python environments, and it would be really cool if you could make another PR that fixes this, as it got reverted. Thanks a lot for your work!

ruro (Contributor) commented Apr 11, 2024

@cwp this PR got reverted because of the broken sed unwrapping. Can you clarify whether you are planning to try fixing this again? No pressure/expectations, just asking so that this doesn't get lost in limbo, and we don't step on each other's toes.

imincik (Contributor) commented Apr 11, 2024

cc @domenkozar

nixos-discourse commented

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/cuda-tensorflow-my-setup-is-really-hacky-would-appreciate-help-unhackying-it/43912/9

nixos-discourse commented

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-build-python-virtualenv-with-packages-provided-by-python3-withpackages/24766/16

nixos-discourse commented

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-build-python-virtualenv-with-packages-provided-by-python3-withpackages/24766/20

nixos-discourse commented

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-build-python-virtualenv-with-packages-provided-by-python3-withpackages/24766/22

Snarpix commented Jun 6, 2025

If I understand correctly, this should fix #66366, also known as the
sys.prefix != sys.base_prefix issue.
But it seems that in current staging this PR was overwritten, and the current python312 package is still considered a venv.
Is there a way I can fix this?

qbisi added a commit to qbisi/nixpkgs that referenced this pull request Sep 16, 2025
Partially reverts NixOS#297628.
The --resolve-argv0 option in makeWrapper does nothing, and no
packages use it.
qbisi added a commit to qbisi/nixpkgs that referenced this pull request Sep 16, 2025
The --resolve-argv0 option was originally introduced in
NixOS#297628 for wrapping Python
executables. No other packages currently use it.
This commit adjusts the behavior of --resolve-argv0: it now also
resolves the dirname of argv0 to its realpath. This ensures that
python venvs always resolve sys.base_prefix to the Nix store path
(e.g., /nix/store/xxx-python3-env), regardless of how the Nix
python environment's python is invoked.
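
The path manipulation described in that commit message can be sketched in Python. This is not the makeWrapper implementation, which is a shell function; the helper name is made up, and only the directory part of argv0 is resolved, as the commit message describes.

```python
import os


def resolve_argv0_dir(argv0: str) -> str:
    """Resolve the directory part of argv0 to its real path, keeping the file name."""
    directory, name = os.path.split(argv0)
    if not directory:
        return argv0  # a bare command name has no directory to resolve
    return os.path.join(os.path.realpath(directory), name)
```

With a result symlink such as fixed pointing at the store path, fixed/bin/python would resolve to the store path's bin/python, so the interpreter sees the same location however it was invoked.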
nixos-discourse commented

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/python-interpreter-from-python-withpackages-has/70287/1

Labels

  • 6.topic: python: Python is a high-level, general-purpose programming language.
  • 10.rebuild-darwin: 501+: This PR causes many rebuilds on Darwin and should normally target the staging branches.
  • 10.rebuild-darwin: 5001+: This PR causes many rebuilds on Darwin and must target the staging branches.
  • 10.rebuild-darwin-stdenv: This PR causes stdenv to rebuild on Darwin and must target a staging branch.
  • 10.rebuild-linux: 501+: This PR causes many rebuilds on Linux and should normally target the staging branches.
  • 10.rebuild-linux: 5001+: This PR causes many rebuilds on Linux and must target the staging branches.
  • 10.rebuild-linux-stdenv: This PR causes stdenv to rebuild on Linux and must target a staging branch.
