Can not load package data resources when running in a zipapp #2965

Open
sinoroc opened this issue Sep 25, 2020 · 8 comments
Labels: area/core (Related to the poetry-core library), good first issue, kind/enhancement (Not a bug or feature, but improves usability or performance)


sinoroc commented Sep 25, 2020

  • OS version and name: Ubuntu Linux 18.04
  • Poetry version: 1.0.10 (in a zipapp)
  • Python: CPython 3.6.9

Issue

$ poetry config -vvv --list

[ValueError]
Schema poetry-schema does not exist.

Traceback (most recent call last):
  File "/home/sinoroc/.local/bin/poetry/clikit/console_application.py", line 131, in run
    status_code = command.handle(parsed_args, io)
  File "/home/sinoroc/.local/bin/poetry/clikit/api/command/command.py", line 120, in handle
    status_code = self._do_handle(args, io)
  File "/home/sinoroc/.local/bin/poetry/clikit/api/command/command.py", line 171, in _do_handle
    return getattr(handler, handler_method)(args, io, self)
  File "/home/sinoroc/.local/bin/poetry/cleo/commands/command.py", line 92, in wrap_handle
    return self.handle()
  File "/home/sinoroc/.local/bin/poetry/poetry/console/commands/config.py", line 75, in handle
    local_config_file = TomlFile(self.poetry.file.parent / 'poetry.toml')
  File "/home/sinoroc/.local/bin/poetry/poetry/console/commands/command.py", line 10, in poetry
    return self.application.poetry
  File "/home/sinoroc/.local/bin/poetry/poetry/console/application.py", line 49, in poetry
    self._poetry = Factory().create_poetry(Path.cwd())
  File "/home/sinoroc/.local/bin/poetry/poetry/factory.py", line 48, in create_poetry
    check_result = self.validate(local_config)
  File "/home/sinoroc/.local/bin/poetry/poetry/factory.py", line 272, in validate
    validation_errors = validate_object(config, 'poetry-schema')
  File "/home/sinoroc/.local/bin/poetry/poetry/json/__init__.py", line 22, in validate_object
    raise ValueError('Schema {} does not exist.'.format(schema_name))

The culprit is probably this block of code:

# [...]
SCHEMA_DIR = os.path.join(os.path.dirname(__file__), "schemas")

class ValidationError(ValueError):
    pass

def validate_object(obj, schema_name):  # type: (dict, str) -> List[str]
    schema = os.path.join(SCHEMA_DIR, "{}.json".format(schema_name))
    # [...]

Loading resources from package data should be done with pkgutil.get_data() (or importlib-resources or pkg_resources), especially if the code is running directly from a zipped file. These libraries are able to handle zipped resources.
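
As a minimal sketch of the suggestion above (not the actual Poetry fix), the following loads a JSON schema through `pkgutil.get_data()` so that it also works when the package lives inside a Zip archive. The package name `demo_pkg`, the file `schemas/example.json`, and the helpers `build_demo_archive`, `load_schema`, and `demo` are hypothetical and exist only for illustration.

```python
import json
import os
import pkgutil
import sys
import tempfile
import zipfile


def build_demo_archive(archive_path):
    """Create a tiny zip holding a hypothetical package with one schema."""
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("demo_pkg/__init__.py", "")
        archive.writestr("demo_pkg/schemas/example.json", '{"type": "object"}')


def load_schema(package, schema_name):
    """Load schemas/<schema_name>.json through the package's loader."""
    resource = "schemas/{}.json".format(schema_name)
    try:
        # Returns bytes, or None if the loader cannot provide data.
        data = pkgutil.get_data(package, resource)
    except OSError:
        # Raised when the resource is missing (on disk or in the archive).
        data = None
    if data is None:
        raise ValueError("Schema {} does not exist.".format(schema_name))
    return json.loads(data.decode("utf-8"))


def demo(schema_name="example"):
    """Import the hypothetical package straight from a zip and load a schema."""
    with tempfile.TemporaryDirectory() as tmp:
        archive_path = os.path.join(tmp, "demo.zip")
        build_demo_archive(archive_path)
        sys.path.insert(0, archive_path)
        try:
            return load_schema("demo_pkg", schema_name)
        finally:
            # Undo the import side effects so the demo can be repeated.
            sys.path.remove(archive_path)
            sys.modules.pop("demo_pkg", None)
```

Unlike joining paths onto `__file__`, this never assumes that the resource is a regular file on the file system.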


sinoroc added the kind/bug (Something isn't working as expected) and status/triage (This issue needs to be triaged) labels Sep 25, 2020

sinoroc commented Sep 25, 2020

At least I guess that importlib-resources (or pkg_resources) could help. I will give it a try, and suggest a fix (pull request) if I get something meaningful.

abn removed the status/triage (This issue needs to be triaged) label Sep 25, 2020
sinoroc added a commit to sinoroc/poetry that referenced this issue Sep 25, 2020
When running from a Zip file (`zipapp`, `*.pyz`), loading resources
from package data can not be done by directly reading files relative
to `__file__`, since there is no valid path on the file system for
zipped files.

Using `pkgutil.get_data` ensures that loading files from a Zip archive
is handled accordingly.

Resolves: python-poetry#2965

sinoroc commented Sep 26, 2020

  • poetry-core-1.0.0rc2

It is not possible to file issues against poetry-core, but there is a similar problem in poetry-core as well, more precisely in one of its vendored dependencies, jsonschema:

$ zapp-poetry config -vvv --list
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/sinoroc/.local/bin/zapp-poetry/__main__.py", line 2, in <module>
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "<frozen zipimport>", line 259, in load_module
  File "/home/sinoroc/.local/bin/zapp-poetry/poetry/console/__init__.py", line 1, in <module>
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "<frozen zipimport>", line 259, in load_module
  File "/home/sinoroc/.local/bin/zapp-poetry/poetry/console/application.py", line 7, in <module>
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "<frozen zipimport>", line 259, in load_module
  File "/home/sinoroc/.local/bin/zapp-poetry/poetry/console/commands/__init__.py", line 4, in <module>
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "<frozen zipimport>", line 259, in load_module
  File "/home/sinoroc/.local/bin/zapp-poetry/poetry/console/commands/check.py", line 1, in <module>
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "<frozen zipimport>", line 259, in load_module
  File "/home/sinoroc/.local/bin/zapp-poetry/poetry/factory.py", line 9, in <module>
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "<frozen zipimport>", line 259, in load_module
  File "/home/sinoroc/.local/bin/zapp-poetry/poetry/core/factory.py", line 13, in <module>
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "<frozen zipimport>", line 259, in load_module
  File "/home/sinoroc/.local/bin/zapp-poetry/poetry/core/json/__init__.py", line 7, in <module>
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "<frozen zipimport>", line 259, in load_module
  File "/home/sinoroc/.local/bin/zapp-poetry/poetry/core/_vendor/jsonschema/__init__.py", line 22, in <module>
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "<frozen zipimport>", line 259, in load_module
  File "/home/sinoroc/.local/bin/zapp-poetry/poetry/core/_vendor/jsonschema/validators.py", line 446, in <module>
  File "/home/sinoroc/.local/bin/zapp-poetry/poetry/core/_vendor/jsonschema/_utils.py", line 53, in load_schema
NotADirectoryError: [Errno 20] Not a directory: '/home/sinoroc/.local/bin/zapp-poetry/poetry/core/_vendor/jsonschema/schemas/draft3.json'

The culprit lines seem to be the following in poetry-core/vendors/patches/jsonschema.patch:

+    with open(
+        os.path.join(os.path.dirname(__file__), "schemas", "{0}.json".format(name))
+    ) as f:
+        data = f.read()


-    data = pkgutil.get_data("jsonschema", "schemas/{0}.json".format(name))
-    return json.loads(data.decode("utf-8"))
+    return json.loads(data)

I am not sure why the patch replaces the call to pkgutil.get_data with a plain open() on a path relative to __file__. It seems a bit strange.

sinoroc added a commit to sinoroc/poetry-core that referenced this issue Sep 26, 2020
When running from a Zip file (`zipapp`, `*.pyz`), loading resources
from package data can not be done by directly reading files relative
to `__file__`, since there is no valid path on the file system for
zipped files.

Using `pkgutil.get_data` ensures that loading files from a Zip archive
is handled accordingly.

Resolves: python-poetry/poetry#2965

sinoroc commented Sep 26, 2020

The next occurrence is in lark-parser:

https://github.com/lark-parser/lark/blob/0.9.0/lark/lark.py#L347-L362

    def open(cls, grammar_filename, rel_to=None, **options):
        """Create an instance of Lark with the grammar given by its filename
        If rel_to is provided, the function will find the grammar filename in relation to it.
        Example:
            >>> Lark.open("grammar_file.lark", rel_to=__file__, parser="lalr")
            Lark(...)
        """
        if rel_to:
            basepath = os.path.dirname(rel_to)
            grammar_filename = os.path.join(basepath, grammar_filename)
        with open(grammar_filename, encoding='utf8') as f:
            return cls(f, **options)

This would need to be patched (and sent upstream).


sinoroc commented Sep 26, 2020

On the lark front: after some investigation and further changes to poetry-core, the next blocking issue is loading lark's standard grammars, as in this example:

https://github.com/python-poetry/poetry-core/blob/72291f379f87e41e22714f0eff0dbbdf57d9c060/poetry/core/version/grammars/markers.lark#L35-L37

sinoroc added a commit to sinoroc/poetry-core that referenced this issue Sep 26, 2020
When running from a Zip file (`zipapp`, `*.pyz`), loading resources
from package data can not be done by directly reading files relative
to `__file__`, since there is no valid path on the file system for
zipped files.

Using `pkgutil.get_data` ensures that loading files from a Zip archive
is handled accordingly.

Resolves: python-poetry/poetry#2965

sinoroc commented Sep 26, 2020

I started with pkgutil.get_data() (from the standard library), but I don't know if it is the right decision. If I understood correctly, there is a wish to deprecate it, although it might be a very long time until that happens.

On the other hand, there is importlib.resources, which has been part of the standard library since Python 3.7. It is also available as a third-party package, importlib-resources, for "Python 2.7, and 3.4 through 3.8", because, as far as I understand, not all of its features are available in 3.7 and 3.8. It seems that the backport is no longer necessary starting with 3.9.

Anyway, pkgutil.get_data is quite limited, since it only returns the resource's bytes, and it looks like it won't be enough. In some cases (with lark, for example) we will need an actual path to a file on the file system.
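
One possible way to get a real path, sketched here under the assumption of Python 3.9+ (or the importlib-resources backport), is `importlib.resources.as_file()`, which extracts a zipped resource to a temporary file for the duration of a `with` block. The package `demo_grammar_pkg`, the grammar file, and the function `read_with_path_only_api` (a stand-in for a path-only API such as the `Lark.open` method quoted earlier) are hypothetical.

```python
import importlib.resources
import os
import sys
import tempfile
import zipfile


def read_with_path_only_api(filename):
    """Hypothetical stand-in for an API that only accepts a file path."""
    with open(filename, encoding="utf8") as f:
        return f.read()


def load_via_real_path(package, resource):
    """Hand a package resource to a path-only API, even from a zip."""
    traversable = importlib.resources.files(package).joinpath(resource)
    # as_file() yields the file itself when it already exists on disk and
    # extracts zipped resources to a temporary file for the `with` block.
    with importlib.resources.as_file(traversable) as path:
        return read_with_path_only_api(path)


def demo():
    """Build a zip with a hypothetical package and read its grammar file."""
    with tempfile.TemporaryDirectory() as tmp:
        archive_path = os.path.join(tmp, "demo.zip")
        with zipfile.ZipFile(archive_path, "w") as archive:
            archive.writestr("demo_grammar_pkg/__init__.py", "")
            archive.writestr(
                "demo_grammar_pkg/grammars/demo.lark", 'start: "a"\n'
            )
        sys.path.insert(0, archive_path)
        try:
            return load_via_real_path("demo_grammar_pkg", "grammars/demo.lark")
        finally:
            # Undo the import side effects so the demo can be repeated.
            sys.path.remove(archive_path)
            sys.modules.pop("demo_grammar_pkg", None)
```

The temporary path is only valid inside the `with` block, so anything that needs the file must finish reading it before the block exits.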

I would rule out setuptools' pkg_resources for now. It works fine, but it is not part of the standard library, which is a great disadvantage compared to the other solutions.

neersighted added the area/core (Related to the poetry-core library), kind/enhancement (Not a bug or feature, but improves usability or performance), and good first issue labels and removed the kind/bug (Something isn't working as expected) label Oct 5, 2022

gpshead commented Nov 28, 2023

We just patched our own vendored copy of poetry for this with:

core/spdx/helpers.py:

-    licenses_file = Path(__file__).parent / "data" / "licenses.json"
+    licenses_file = importlib.resources.files("poetry.core.spdx").joinpath("data/licenses.json")

core/json/__init__.py:

 def validate_object(obj: dict[str, Any], schema_name: str) -> list[str]:
-    schema_file = SCHEMA_DIR / f"{schema_name}.json"
-
-    if not schema_file.exists():
-        raise ValueError(f"Schema {schema_name} does not exist.")
+    schema_file = importlib.resources.files("poetry.core.json").joinpath(f"schemas/{schema_name}.json")

While we're running on 3.10 & 3.11, importlib.resources is a 3.7 addition. If you still need to support Python versions older than that, you can use the already mentioned importlib-resources package from PyPI.


gpshead commented Nov 28, 2023

Those patches appear to belong in the poetry-core repo rather than this one. Also, it appears that the files() API was introduced in 3.9 (as also noted in sinoroc's comment), so an importlib-resources>=1.3 dependency when installing for python<3.9 makes sense. I'll see if I can make reasonable PR(s) in the correct repos for all of this.
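
A minimal sketch of how such a version-dependent import could look, assuming the importlib-resources backport is installed on Python older than 3.9. The helper `load_package_json`, the package `demo_spdx_pkg`, and its data file are hypothetical and are not taken from the actual PR.

```python
import json
import os
import sys
import tempfile

if sys.version_info >= (3, 9):
    from importlib import resources
else:
    # Assumption: the importlib-resources backport (>=1.3) is installed.
    import importlib_resources as resources


def load_package_json(package, resource):
    """Read and parse a JSON resource without relying on __file__."""
    traversable = resources.files(package).joinpath(resource)
    return json.loads(traversable.read_text(encoding="utf-8"))


def demo():
    """Create a hypothetical package on disk and load its JSON data file."""
    with tempfile.TemporaryDirectory() as tmp:
        package_dir = os.path.join(tmp, "demo_spdx_pkg")
        data_dir = os.path.join(package_dir, "data")
        os.makedirs(data_dir)
        open(os.path.join(package_dir, "__init__.py"), "w").close()
        licenses_path = os.path.join(data_dir, "licenses.json")
        with open(licenses_path, "w", encoding="utf-8") as f:
            json.dump({"MIT": "MIT License"}, f)
        sys.path.insert(0, tmp)
        try:
            return load_package_json("demo_spdx_pkg", "data/licenses.json")
        finally:
            # Undo the import side effects so the demo can be repeated.
            sys.path.remove(tmp)
            sys.modules.pop("demo_spdx_pkg", None)
```

The same `load_package_json` call should work whether the package is installed as regular files, as here, or imported from a Zip archive.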

gpshead added a commit to gpshead/poetry-core that referenced this issue Nov 28, 2023
This is part of the fix for
python-poetry/poetry#2965.

Constructing a regression integration test that uses a zipapp like
environment would be useful but is not part of this PR.

gpshead commented Nov 28, 2023

Someone who understands the needs of the project should take over my PR.

5 participants