--python vs --python-shebang #1095

Closed
danielburrell opened this issue Nov 3, 2020 · 2 comments

Comments

@danielburrell

danielburrell commented Nov 3, 2020

Hi,

I've looked at the docs and I'm still not clear about the difference between --python and --python-shebang when invoking pex. I'm sure it's obvious to folks intimately familiar with python, but it would be great if the docs explained this and gave examples of why you might want to set these.

For this example

pex -vv -f /tmp/wheelhouse --no-index \
  --python-shebang='???' \
  "--python" "python36" "--python" "python37" "--python" "python38"

As I understand it, the python shebang tells the system which interpreter should be used. The value given should be either a full path to an interpreter binary, e.g. /usr/bin/python3, or a shorthand like python3 if that's already on the path. Is there a standard? What if the user doesn't have it on the path, and how do you know which version the user might have installed? And if that interpreter is the one that will be used, why do you need the --python values at all?

  • I think the intent was to allow this pex to work with multiple versions of python, what should the python-shebang value be?
  • Is the python-shebang mandatory? If so why?
  • What happens if you set the python-shebang to be python3 as opposed to /usr/local/bin/python3?
  • Is there a recommended value for python-shebang?
  • If I'm putting requirements together to run this pex for an end user, does the user have to ensure the python-shebang matches their installation location?
  • If they only have python39 installed, will it still work? If I set the shebang to python3, does that mean it might still work, since it launches any python3 environment and so, provided we're backward compatible, it'll work?
@jsirois
Member

jsirois commented Nov 6, 2020

1st, correcting your example: --python python36 should fail on most systems since Python tends to install its binaries as python, python[MAJ] and python[MAJ].[MIN] where [MAJ] is the major version (like 3 for Python 3.8.6) and [MIN] is the minor version (like 8 for Python 3.8.6). Note that your example is missing the dot between major and minor version.
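
As a rough illustration of that naming convention, here is a minimal Python sketch that checks which of the conventional names resolve on the current $PATH. The helper names conventional_names and find_on_path are made up for this example and are not part of Pex.

```python
import shutil


def conventional_names(major, minor):
    """Return the binary names a Python install conventionally provides."""
    return ["python", f"python{major}", f"python{major}.{minor}"]


def find_on_path(names):
    """Map each name to the path found on $PATH, or None when it is absent."""
    return {name: shutil.which(name) for name in names}


# "python38" (no dot) is not one of the conventional names, so it will
# usually be missing, while "python3.8" resolves if Python 3.8 is installed.
names = conventional_names(3, 8) + ["python38"]
for name, path in find_on_path(names).items():
    print(f"{name:>10} -> {path or 'not found on PATH'}")
```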

For convenient reference, the help for the --python option:

    --python=PYTHON     The Python interpreter to use to build the PEX
                        (default: current interpreter). This cannot be used
                        with `--interpreter-constraint`, which will instead
                        cause PEX to search for valid interpreters. Either
                        specify an absolute path to an interpreter, or specify
                        a binary accessible on $PATH like `python3.7`. This
                        option can be passed multiple times to create a multi-
                        interpreter compatible PEX.

On to shebang questions, with the help again for convenient reference:

    --python-shebang=PYTHON_SHEBANG
                        The exact shebang (#!...) line to add at the top of
                        the PEX file minus the #!. This overrides the default
                        behavior, which picks an environment Python
                        interpreter compatible with the one used to build the
                        PEX file.

The fundamental missing context appears to be how binaries get executed on Unix; so I'll lead with that:

Traditionally, on Unix, a file is executable if the appropriate mode bits are set and the contents are either a known executable file format (ELF, COFF, ... depends on the Unix) or else it's a script. A script is denoted by #! as the leading bytes, with the path to the interpreter to use to execute the script following that and ending in a newline. That line as a whole is commonly referred to as a "shebang" line (https://en.wikipedia.org/wiki/Shebang_(Unix)). So to make a script executable you need to set its mode to executable and insert a 1st line that is an appropriate shebang. Examples:

  1. #!/bin/bash
  2. #!/usr/bin/ruby
  3. #!/usr/bin/python
  4. #!/usr/bin/env bash
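
The two requirements (a shebang as the first line, plus the executable mode bits) can be sketched in Python as follows. This is only an illustration for a Unix-like system; write_script and read_shebang are hypothetical helper names.

```python
import os
import stat


def write_script(path, interpreter, body):
    """Write `body` to `path` behind a shebang line and mark it executable."""
    with open(path, "w") as fp:
        fp.write(f"#!{interpreter}\n")
        fp.write(body)
    mode = os.stat(path).st_mode
    # Add execute permission for user, group and other (like `chmod +x`).
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def read_shebang(path):
    """Return the interpreter part of the shebang line, or None if absent."""
    with open(path, "rb") as fp:
        first_line = fp.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].strip().decode()
```

For example, write_script("hello", "/usr/bin/env bash", "echo hello\n") produces a file that can be started as ./hello on a machine that has both env and bash at the expected places.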

The only special thing here with respect to Python is that the Python interpreter knows how to "run" a zip file, and zip files allow arbitrary leading content to be inserted at the head of the file. So Python scripts can be packaged as a zip file, and a PEX is a zip file that leverages all this.
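
To see the same mechanism without Pex, the standard library's zipapp module can build a small executable zip with a shebang in front of the zip data. This is a sketch of the general technique only, not of how Pex builds its files; build_zipapp is a helper invented for the example.

```python
import os
import subprocess
import sys
import tempfile
import zipapp
import zipfile


def build_zipapp(directory, shebang_interpreter):
    """Build a tiny zipapp with a shebang in `directory` and return its path."""
    src = os.path.join(directory, "src")
    os.makedirs(src)
    with open(os.path.join(src, "__main__.py"), "w") as fp:
        fp.write('print("hello from inside a zip")\n')
    target = os.path.join(directory, "hello.pyz")
    # zipapp writes "#!<interpreter>\n" before the zip data and sets the
    # executable bit on the result.
    zipapp.create_archive(src, target, interpreter=shebang_interpreter)
    return target


with tempfile.TemporaryDirectory() as tmp:
    app = build_zipapp(tmp, "/usr/bin/env python3")
    with open(app, "rb") as fp:
        print(fp.readline())  # the shebang line at the head of the file
    print(zipfile.is_zipfile(app))  # the file is still readable as a zip
    # Passing the file to an interpreter directly bypasses the shebang.
    result = subprocess.run([sys.executable, app], capture_output=True, text=True)
    print(result.stdout.strip())
```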

Now, you may have noticed that the shebang is host-dependent in that you must know the exact absolute path to the script interpreter to run. Example 4 relies on the path of the env program instead, so that the exact location of bash need not be known (env looks it up in the PATH at runtime). Still, this is fairly non-portable and depends on uniform paths with uniform meanings across machines. To stress the latter point, a given path /usr/bin/python3 could have two different meanings on different machines: it could refer to a Python 3.6 interpreter on one machine and a Python 3.9 interpreter on another.
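
One way to find out what a given path actually means on a particular machine is to ask the interpreter behind it. A minimal sketch, where interpreter_version is a hypothetical helper:

```python
import shutil
import subprocess
import sys


def interpreter_version(path):
    """Ask the interpreter at `path` for its (major, minor) version."""
    out = subprocess.run(
        [path, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    major, minor = out.strip().split(".")
    return int(major), int(minor)


# The interpreter running this snippet, and its version.
print(sys.executable, interpreter_version(sys.executable))
# What `env python3` would find on this machine (None if nothing is on PATH).
print(shutil.which("python3"))
```

Running this on two machines can print different versions for the same path, which is the non-portability described above.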

You edged up on many of these points in your questions above, but I think the following should now be answered:

  • I think the intent was to allow this pex to work with multiple versions of python, what should the python-shebang value be?
    It needs to be the common denominator. If you build the PEX to run with various Python 3's, say 3.6 through 3.9, then you'd want a shebang of #!/usr/bin/python3 or #!/usr/bin/env python3.
  • Is the python-shebang mandatory? If so why?
    Yes, since a PEX is just a zipapp and not a valid ELF, COFF, etc executable file.
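  • What happens if you set the python-shebang to be python3 as opposed to /usr/local/bin/python3?
    It will generally not work. The OS does not search the PATH for a shebang interpreter, so a bare python3 is not found the way it would be in a shell; on Linux it is treated as a path relative to the current directory. Use an absolute path, or /usr/bin/env python3 to get a PATH lookup.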
  • Is there a recommended value for python-shebang?
    Not when customizing - no. Since a shebang relies on the setup of a target machine, you need to take that into account when picking a custom shebang. When you let Pex pick the shebang, of course its pick should suffice.
  • If I'm putting requirements together to run this pex for an end user, does the user have to ensure the python-shebang matches their installation location?
    Yes, sort of. The shebang must match a path on the target machine.
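
The answers above can be turned into a quick check to run on a target machine. The sketch below only understands the two shebang forms discussed here (an absolute interpreter path, and /usr/bin/env NAME); resolve_shebang is a hypothetical helper, not something Pex provides.

```python
import os
import shutil


def resolve_shebang(shebang):
    """Return the interpreter path a shebang would use here, or None."""
    if shebang.startswith("#!"):
        shebang = shebang[2:]
    parts = shebang.split()
    if not parts or not os.path.isabs(parts[0]):
        return None  # bare names such as "python3" are not searched on PATH
    if not os.access(parts[0], os.X_OK):
        return None  # nothing executable at that path on this machine
    if os.path.basename(parts[0]) == "env" and len(parts) == 2:
        return shutil.which(parts[1])  # env performs the PATH lookup
    return parts[0]


for candidate in ("/usr/bin/python3", "/usr/bin/env python3", "python3"):
    print(f"{candidate!r:>24} -> {resolve_shebang(candidate)}")
```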

The final question though has little to do with the shebang:

  • If they only have python39 installed, will it still work? If I set the shebang to python3 does that means it might still work as it launches any python3 environment so provided we're backward compatible it'll work?
    It depends. If the PEX includes platform-specific wheels, say numpy, then it will not work. The numpy-1.19.4-cp37-cp37m-manylinux2010_x86_64.whl and numpy-1.19.4-cp38-cp38-manylinux2010_x86_64.whl wheels installed in the PEX file will not be usable for Python 3.9 regardless of shebang.
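
Why those two wheels do not fit Python 3.9 can be read off their file names. The sketch below is a deliberately simplified check that only compares the Python tag of a CPython-specific wheel with an interpreter version; real resolution considers the full set of compatibility tags (ABI, platform, abi3 and pure-Python wheels), which this ignores. Both function names are made up for the example.

```python
def wheel_python_tag(filename):
    """Extract the Python tag (e.g. "cp37") from a wheel file name."""
    stem = filename[: -len(".whl")]
    # Layout: name-version[-build]-python_tag-abi_tag-platform_tag
    return stem.split("-")[-3]


def cpython_wheel_matches(filename, version):
    """Rough check: does a CPython-specific wheel target `version` exactly?"""
    major, minor = version
    return f"cp{major}{minor}" in wheel_python_tag(filename).split(".")


wheels = [
    "numpy-1.19.4-cp37-cp37m-manylinux2010_x86_64.whl",
    "numpy-1.19.4-cp38-cp38-manylinux2010_x86_64.whl",
]
for wheel in wheels:
    print(wheel_python_tag(wheel), cpython_wheel_matches(wheel, (3, 9)))
```

Neither wheel matches (3, 9), so a PEX containing only these two has nothing usable for Python 3.9, whatever the shebang says.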

Resetting and explaining at a high level in sequence of execution:

  1. The shebang is required by the operating system and it is used to run the PEX zipfile.
  2. The PEX main is a bit of Pex code that 1st finds an appropriate interpreter to use if the current one selected by the shebang is not appropriate. If a new interpreter is selected, the PEX file is re-executed with that interpreter.
  3. The PEX main extracts all wheels installed inside the PEX file to the filesystem (under ~/.pex) and adds these locations to sys.path before handing off execution to user code.

Currently step 2 only happens if you use --interpreter-constraint to select interpreters when building the PEX. It does not happen when you use --python to select interpreters. Fixing that will come in #1020.
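
The decision made in step 2 can be pictured with a toy model. This is not Pex's actual code: choose_interpreter, its arguments and the paths in the demo are all invented for illustration, and the real re-execution step is only hinted at in a comment.

```python
def choose_interpreter(current, acceptable, available):
    """Toy version of step 2.

    current:    (major, minor) of the interpreter the shebang launched
    acceptable: set of (major, minor) versions the PEX was built for
    available:  dict mapping interpreter path -> (major, minor) on this host
    Returns None to keep the current interpreter, or the path to switch to.
    """
    if current in acceptable:
        return None
    for path, version in sorted(available.items()):
        if version in acceptable:
            # A real implementation would now re-execute the file with this
            # interpreter, for example via os.execv(path, [path] + sys.argv).
            return path
    raise RuntimeError("no compatible interpreter found on this machine")


built_for = {(3, 7), (3, 8)}
host = {"/usr/bin/python3.8": (3, 8), "/opt/py/bin/python3.6": (3, 6)}
print(choose_interpreter((3, 8), built_for, host))  # None: shebang pick is fine
print(choose_interpreter((3, 9), built_for, host))  # /usr/bin/python3.8
```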

@danielburrell
Author

Sorry for the delay in responding. This is an incredibly detailed response; thank you so much for providing such an excellent guide. I've had a crack at applying the details above and it seems to work. I'll post a summary of what I picked at some point for completeness, in case others are interested.
