Add support for importing from PEXes. #1845

Merged
jsirois merged 6 commits into pex-tool:main from the __pex_boot__/magic-vendor-ep branch
Jul 13, 2022

jsirois
Member

@jsirois jsirois commented Jul 13, 2022

You can now use a PEX as a mostly normal sys.path entry. You do need
to activate it though and that can be done in two ways:

  1. Perform an import __pex__ before importing anything else you
    might need that lives in a PEX sys.path entry.
  2. Import what you need in one step via from __pex__ import colors in
    the case where the PEX contains ansicolors, for example.
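
The two activation styles above can be sketched roughly as follows. This is a minimal illustration, not code from the PR: the helper names and the PEX path are hypothetical, and the imports only succeed when a real PEX containing ansicolors sits at that path.

```python
import sys


def prepend_to_sys_path(pex_path):
    """Put the PEX at the front of sys.path; return True if it was added."""
    if pex_path in sys.path:
        return False
    sys.path.insert(0, pex_path)
    return True


def colors_via_activation(pex_path):
    """Method 1: activate first, then import normally."""
    prepend_to_sys_path(pex_path)
    import __pex__  # noqa: F401  (activation only; must precede other PEX imports)
    import colors  # assumes the PEX bundles ansicolors

    return colors


def red_via_prefix(pex_path, text):
    """Method 2: activate and import in one step through the __pex__ prefix."""
    prepend_to_sys_path(pex_path)
    from __pex__.colors import red  # assumes the PEX bundles ansicolors

    return red(text)
```

Only `prepend_to_sys_path` can be exercised without a PEX on disk; the other two functions are there to show where each import form would go.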

The second method is useful in cases where you can't influence which
Python interpreter to run (if you could, you could just specify the PEX
file itself), but you can influence the sys.path. The motivating
example of this is AWS Lambda functions for Python. One mode of
deploying these is in a zip. Here you can't pick the Python interpreter,
just its version, but you can specify the zip and an entry point. As
such you can specify a PEX as the zip and a __pex__ pseudo-package
prefixed entry-point. This should obviate the need for projects like
Lambdex or at least simplify their integration with Pex, which currently
relies on using the unsupported pex.pex_bootstrapper APIs that I'd
like to drop in Pex 3.

Fixes #1839

Previously you had to supply a top-level list of the modules and packages
in the directory to enable import redirection for; now `""` or `"."`
redirects everything in the directory.

You can now use a PEX as a mostly normal `sys.path` entry. You do need
to activate it though and that can be done in two ways:

1. Perform an `import __pex__` before importing anything else you
   might need that lives in a PEX `sys.path` entry.
2. Import what you need in one step via `from __pex__ import colors` or
   `from __pex__.colors import red` in the case where the PEX contains
   ansicolors, for example.

The second method is useful in cases where you can't influence which
Python interpreter to run (if you could, you could just specify the PEX
file itself), but you can influence the `sys.path`. The motivating
example of this is AWS Lambda functions for Python. One mode of
deploying these is in a zip. Here you can't pick the Python interpreter,
just its version, but you can specify the zip and an entry point. As
such you can specify a PEX as the zip and a `__pex__` pseudo-package
prefixed entry-point. This should obviate the need for projects like
Lambdex or at least simplify their integration with Pex, which currently
relies on using the unsupported `pex.pex_bootstrapper` APIs.
@jsirois jsirois requested review from wickman, benjyw and kwlzn July 13, 2022 01:33
@jsirois
Member Author

jsirois commented Jul 13, 2022

The manual AWS Lambda test:

Source material in src/examine_disks.py:

import json

import psutil


def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'body': json.dumps(
            {"partitions": [partition._asdict() for partition in psutil.disk_partitions()]},
            sort_keys=True,
            indent=2,
        )
    }
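
For a quick local check of the response shape without psutil or AWS, something like the sketch below could be used. It is an illustration only: `Partition` and `fake_disk_partitions` are hypothetical stand-ins for psutil's partition tuples, and the `disk_partitions` parameter is added here purely so the stand-in can be injected. The sample values are copied from the last partition in the response shown later in this comment.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for the named tuples psutil.disk_partitions() returns,
# using the field names visible in the response in this thread.
Partition = namedtuple("Partition", "device mountpoint fstype opts maxfile maxpath")


def fake_disk_partitions():
    return [
        Partition("/dev/vdc", "/var/task", "squashfs", "ro,nosuid,nodev,relatime", 256, 4096)
    ]


def lambda_handler(event, context, disk_partitions=fake_disk_partitions):
    # Same body as the handler above, with the partition source injectable.
    return {
        "statusCode": 200,
        "body": json.dumps(
            {"partitions": [p._asdict() for p in disk_partitions()]},
            sort_keys=True,
            indent=2,
        ),
    }


response = lambda_handler(event={}, context=None)
body = json.loads(response["body"])
print(body["partitions"][0]["mountpoint"])  # prints /var/task
```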

Build the plain old PEX:

$ python -mpex --python python3.9 -D src/ psutil -o plain_old.pex.zip

Upload to AWS & configure magic __pex__. entry point:
[image: screenshot not preserved]
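
The screenshot is not reproduced in this text, so the exact handler value is not recoverable from it. AWS Lambda handlers for Python take the form `module.function`, so a `__pex__.`-prefixed entry point for the source file above would presumably be built as in this sketch. The `pex_handler` helper is hypothetical and exists only to show the string's shape.

```python
PEX_PREFIX = "__pex__"


def pex_handler(module, function):
    """Build a Lambda handler string routed through the __pex__ pseudo-package."""
    return ".".join((PEX_PREFIX, module, function))


# Assumed value for src/examine_disks.py and its lambda_handler function.
print(pex_handler("examine_disks", "lambda_handler"))
# prints __pex__.examine_disks.lambda_handler
```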

Use it:

$ curl https://ui6vqc7rocj7ojmdpp7ltf55yq0msyld.lambda-url.us-west-2.on.aws/
{
  "partitions": [
    {
      "device": "/dev/vdb",
      "fstype": "ext4",
      "maxfile": 255,
      "maxpath": 4096,
      "mountpoint": "/dev",
      "opts": "rw,nosuid,noexec,noatime,data=writeback"
    },
    {
      "device": "/dev/vdd",
      "fstype": "ext4",
      "maxfile": 255,
      "maxpath": 4096,
      "mountpoint": "/tmp",
      "opts": "rw,relatime,data=writeback"
    },
    {
      "device": "/dev/vdb",
      "fstype": "ext4",
      "maxfile": 255,
      "maxpath": 4096,
      "mountpoint": "/proc/sys/kernel/random/boot_id",
      "opts": "ro,nosuid,nodev,noatime,data=writeback"
    },
    {
      "device": "/dev/root",
      "fstype": "ext4",
      "maxfile": 255,
      "maxpath": 4096,
      "mountpoint": "/etc/passwd",
      "opts": "ro,nosuid,nodev,relatime,data=ordered"
    },
    {
      "device": "/dev/root",
      "fstype": "ext4",
      "maxfile": 255,
      "maxpath": 4096,
      "mountpoint": "/var/rapid",
      "opts": "ro,nosuid,nodev,relatime,data=ordered"
    },
    {
      "device": "/dev/vdb",
      "fstype": "ext4",
      "maxfile": 255,
      "maxpath": 4096,
      "mountpoint": "/etc/resolv.conf",
      "opts": "ro,nosuid,nodev,noatime,data=writeback"
    },
    {
      "device": "/dev/vdc",
      "fstype": "squashfs",
      "maxfile": 256,
      "maxpath": 4096,
      "mountpoint": "/var/task",
      "opts": "ro,nosuid,nodev,relatime"
    }
  ]
}

@jsirois
Member Author

jsirois commented Jul 13, 2022

@wickman and @kwlzn I think this means things like Lambdex and pyuwsgi_pex can retire and go climbing! Please let me know if you agree.

@jsirois
Member Author

jsirois commented Jul 13, 2022

@benjyw I think this means we can delete the Pants awslambda and google_cloud_function backends after a deprecation period.

@jsirois
Member Author

jsirois commented Jul 13, 2022

@rtkjliviero here's the promised change that hopefully satisfies #747.

@jsirois
Member Author

jsirois commented Jul 13, 2022

The 1 failing integration test on the pypy3.9 shard is reproducible, but apparently not related to this change; I can repro on main. CI on main is green though, so I'm stumped for now.

@jsirois
Member Author

jsirois commented Jul 13, 2022

OK, the 1 failing test was in fact unrelated. Fixed in #1846, which is merged in here now for a green CI.

@jsirois jsirois requested a review from stuhood July 13, 2022 15:31
jsirois added a commit that referenced this pull request Jul 13, 2022
This was broken from the get-go in #1787. It's unclear to me how the
`test_create_universal_platform_check` test added in #1824 didn't fail
from the get-go, but it does now and this fix fixes that test for the
`--platform macosx_10.9_x86_64-cp-310-cp310 psutil==5.9.1` abbreviated
platform case.

Discovered in CI for #1845 as a failure unrelated to that change.
@benjyw
Collaborator

benjyw commented Jul 13, 2022

That's very neat! Before we drop the backends though, there's one thing I've been meaning to look into for a while, which is whether there are performance optimizations that AWS applies to "standard" lambda zip files that we defeat by using pex (via lambdex or this new way).

@jsirois
Member Author

jsirois commented Jul 13, 2022

> That's very neat! Before we drop the backends though, there's one thing I've been meaning to look into for a while, which is whether there are performance optimizations that AWS applies to "standard" lambda zip files that we defeat by using pex (via lambdex or this new way).

If Lambdexed PEXes are slower for some strange reason, then PEXes used this way will be exactly as slow, so perf should have 0 bearing on dropping the Pants backends. If PEX-based lambdas are slower and we want to improve that, we'd need new backends, right? Or are you suggesting exactly that: you'd like to keep the backends / targets named like they are but completely re-work their internals if PEXes are slow?

Either way, I need this API use of the bootstrap code gone for Pex 3.

Resolved review thread on pex/hashing.py
@jsirois
Member Author

jsirois commented Jul 13, 2022

Thanks @benjyw for taking a look.

@jsirois jsirois merged commit bb68814 into pex-tool:main Jul 13, 2022
@jsirois jsirois deleted the __pex_boot__/magic-vendor-ep branch July 13, 2022 20:04
@benjyw
Collaborator

benjyw commented Jul 13, 2022

Yeah, I was thinking that depending on the outcome of a performance investigation we might have to reimplement those backends to create "proper" lambda zipfiles, with the layout AWS recommends.

@benjyw
Collaborator

benjyw commented Jul 13, 2022

But that would be outside of PEX entirely, so yeah, I don't think it blocks getting rid of that API.

Successfully merging this pull request may close these issues.

PEXes should be useable as sys.path entries.