Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exception on poetry build, incorrect gitignore handling logic. #1946

Closed
3 tasks done
kawing-chiu opened this issue Jan 25, 2020 · 9 comments · Fixed by #1947
Closed
3 tasks done

Exception on poetry build, incorrect gitignore handling logic. #1946

kawing-chiu opened this issue Jan 25, 2020 · 9 comments · Fixed by #1947
Assignees
Labels
kind/bug Something isn't working as expected

Comments

@kawing-chiu
Copy link

kawing-chiu commented Jan 25, 2020

  • I am on the latest Poetry version.

  • I have searched the issues of this repo and believe that this is not a duplicate.

  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).

  • OS version and name: Arch Linux

  • Poetry version: 1.0.2

  • Link of a Gist with the contents of your pyproject.toml file: pyproject.toml

Issue

I got an exception when running poetry build -vvv, on the wheel building step:
image

The issue here is that, poetry seems to be trying to use the /.git directory under my root directory to get ignored files, which does not make sense. I dig into the source and find that the get_vcs() logic here

def get_vcs(directory):  # type: (Path) -> Git
    directory = directory.resolve()
    for p in [directory] + list(directory.parents):
        ...

is incorrect. Here is the content of [directory] + list(directory.parents), the list of directories that get_vcs() searches for .git:

[PosixPath('/tmp/tmp51qstjuk/mypackage-0.3.0.dev3'), PosixPath('/tmp/tmp51qstjuk'), PosixPath('/tmp'), PosixPath('/')]

When running poetry build -vvv, the CompleteBuilder will unpack the sdist file to a tmpdir:

with self.unpacked_tarball(sdist_file) as tmpdir:
    WheelBuilder.make_in(
        Factory().create_poetry(tmpdir),
        ...

Here the value of tmpdir is '/tmp/tmp51qstjuk/mypackage-0.3.0.dev3', there is no .git directory there, so the get_vcs() function is searching for the .git directory in the wrong place. In my specific case, I happen to have a .git directory in my /, so it results in the above exception. Usually, the incorrect search in get_vcs() will return an empty set, so the files ignored by git will not be honored. I suspect this might cause some other issues, like #1660.

@finswimmer
Copy link
Member

Hello @kawing-chiu ,

thanks a lot for reporting and finding the reason for this problem. 👍

I hope this is fixed by #1947 . Can you test it?

finswimmer

@kawing-chiu
Copy link
Author

kawing-chiu commented Jan 26, 2020

Hi @finswimmer , I've tested it. It does fix my issue. Thanks very much.

@kawing-chiu
Copy link
Author

kawing-chiu commented Jan 30, 2020

@finswimmer Hi, after some more test, I think this issue is not completely fixed. There is still a flaw in get_vcs(): I have a .git directory under / and ~ repectively, to manage the root and user configurations, e.g. systemd units. So if the poetry project that I'm developing don't use git, then this get_vcs() behavior will search up and use my ~/.git 's ignore files, which is not relevant to the poetry project, and will probably cause bugs.

So generally, I think poetry should only work on the directory containing pyproject.toml and downwards by default. It should not go upwards unless the user allows it (maybe through an option).

@finswimmer
Copy link
Member

Hey @kawing-chiu,

if you have a valid .git folder somewhere above the project folder all subfolders belong to this repository.

Nevertheless I've changed the way how poetry finds the git root folder. It now does this by running git rev-parse --show-toplevel within the project's folder instead of searching for a folder called .git. This should be more reliable.

fin swimmer

@kawing-chiu
Copy link
Author

kawing-chiu commented Jan 30, 2020

I'm not sure this method is reliable enough, because the user might have all kinds of settings in that .git repository. For example if I have this directory structure:

directory_A/    <---- has a .git/ directory, but ignores lib/ in .gitignore
  lib/
    some_python_project/   <---- uses hg or some other version control tool instead of git
      pyproject.toml
      pandas/__init__.py

Then running git ls-files --others -i --exclude-standard will return all files under some_python_project.

@finswimmer
Copy link
Member

Hello again,

that's an expected behavior. If you want to include files, that are excluded by any git rule, you can do it by using include in the tool.poetry section. In your example like this:

[tool.poetry]
name = "pandas"
version = "0.1.0"
description = ""
authors = ["Fin Swimmer <[email protected]>"]
include = ["pandas/**/*"]

@kawing-chiu
Copy link
Author

kawing-chiu commented Jan 31, 2020

I modified my example a bit:

my_project/    <---- uses git as vcs
  .git/
  .gitignore       <---- ignores the whole third_party/ directory
  third_party/
    some_external_python_project/     <---- uses hg as vcs
      .hg/
      .hgignore          <---- poetry should in fact use the ignore rules here
      pyproject.toml     <---- I should not have to touch this, it's thirt_party
      some_external_python_project/__init__.py

The problem is that, now poetry's behavior really depends on where the source directory some_external_python_project/ resides. I don't think the user should be forced to change a third_party project's pyproject.toml to make it work.

In this example, poetry should in fact use the ignore rules provided by hg, just like it normally uses ignore rules from git. However, here hg is just an example. Even if poerty supports hg later, the user might use some other proprietary version control tools, which poetry can never support. Because of this possibility, poetry can't even reliably detect whether a vcs is in use in the current project directory, not to mention in some higher level directories.

So I think a much better and robust solution is to provide an option for the user to override how to find ignore files. For example:

[tool.poetry]
vcs_ignore_files_command = ['hg', 'status', '-i', '-u', '-n']

@finswimmer finswimmer self-assigned this Feb 2, 2020
sdispater pushed a commit that referenced this issue Feb 28, 2020
…1946) (#1947)

* fix (builder): take `self._original_path` if available to find `.git` folder

* change (vcs): use `git rev-parse --show-toplevel` to find git root folder

* fix (vcs): change back to original working dir after finding vcs

* change (builder): introduce self._original_path to keep original path
if(vcs): resolve directory for `get_vcs`
sdispater added a commit that referenced this issue Mar 20, 2020
* Fix Github actions cache issues (#1908)

* Fix case of `-f` flag

* Make it clearer what options to pass to `--format`

* fix (masonry.api): `get_requires_for_build_wheel` must return additional list of requirements for building a package, not listed in `pyproject.toml` and not dependencies for the package itself (#1875)

fix (tests): adopted tests

* Lazy Keyring intialization for PasswordManager (#1892)

* Fix Github Actions cache issues (#1928)

* Avoid nested quantifiers with overlapping character space on git url parsing (#1902 (#1913)

* fix (git): match for `\w` instead of `.` for getting user

* change (vcs.git): hold pattern of the regex parts in a dictionary to be consistent over all regexs

* new (vcs.git): test for `parse_url` and some fixes for the regex pattern

* new (vcs.git): test for `parse_url` with string that should fail

* fix (test.vcs.git): make flake8 happy

* fix: correct parsing of wheel version with regex. (#1932)

The previous regexp was only taking the first integer of the version number,
this presented problems when the major version number reached double digits.

Poetry would determine that the version of the dependency is '1', rather than,
ie: '14'. This caused failures to solve versions.

* Fix errors when using the --help option (#1910)

* Fix how repository credentials are retrieved from env vars (#1909)

# Conflicts:
#	poetry/utils/password_manager.py

* Fix downloading packages from Simplepypi (#1851)

* fix downloading packages from simplepypi

* unused code removed

* remove unused imports

* Upgrade dependencies for the 1.0.3 release (#1965)

* Bump version to 1.0.3 (#1966)

* Fix non-compliant Git URL matching

RFC 3986 § 2.3 permits more characters in a URL than were matched. This
corrects that, though there may be other deficiencies. This was a
regression from v1.0.2, where at least “.” was matched without error.

* Update README.md "Updating Poetry"

Currently the note in "Updating Poetry" is different from the one below in "Enable tab completion for Bash, Fish, or Zsh". This MR is to make them more consistent.

* init: change dev dependency prompt

* Fix CI issues (#2069)

* fix (setup_reader): check if `func.value` has attr `id` (#2041)

* fix(git): get commit sha of git commit from annotated tags (#1948)

* fix(git): have annotated tags resolve to the commit sha

* fix(git): fix quote

* fix(git): change to rev-parse

* fix: use correct badge on README (#2065)

* Fix #1791: Load repository URL from config (#2061)

* Fix #1791: Load repository URL from config

* Ran black to fix linting errors

* Add test for repo URL env variable

* Changed schema to support url in multi dependencies (#2035)

* Fix handling of forward slashes and url encoding in credentials (#1911)

* Add support for forward slashes and url encoding in credentials

* Remove extra newline

* Remove unquote

* Bump actions/checkout from v1 to v2 (#2075)

* Update release.yml

* Update main.yml

* Fix vendor package as installed package (#1883) (#1981)

* Fix vendor package as installed package (#1883)

* import from

Co-Authored-By: Sébastien Eustace <[email protected]>

* test vendor package as installed

* refactor

* remove blank line

Co-authored-by: Sébastien Eustace <[email protected]>

* fix(utils.env): import cli_run from virtualenv (#2096)

* fix(utils.env): import cli_run from virtualenv if create_environment import failes

* fix (utils.env): added accidentally removed code

* list .venv when it exists (#1762)

* list .venv when it exists

* only list when in-project is true

* missing config

* move logic to manager.list

* Add .venv when it exists

* fix: exclude subpackage from `setup.py` if `__init__.py` is excluded (#1009) (#1626)

* fix: exclude subpackage from `setup.py` if `__init__.py` is excluded

Fixes: #1009

* fix: added missing test data

* fix: lint test data

* change (sdist.git): exclude folders with no python file

* fix (sdist.git): make black happy

* get_vcs starts searching git folder from tmp dir instead of project (#1946) (#1947)

* fix (builder): take `self._original_path` if available to find `.git` folder

* change (vcs): use `git rev-parse --show-toplevel` to find git root folder

* fix (vcs): change back to original working dir after finding vcs

* change (builder): introduce self._original_path to keep original path
if(vcs): resolve directory for `get_vcs`

* Normalize author name unicode before matching (#2006)

* Fix accented characters not being matched in author name

Fixes #2004

* Normalized the strings instead of modifying the pattern

* Applied isort & black

* Fix the url used for installation when fallbacking on PyPI (#2099)

* Upgrade dependencies before the 1.0.4 release (#2100)

* Upgrade dependencies before the 1.0.4 release (#2103)

* Release 1.0.4 (#2101)

* Update release script

* Bump version to 1.0.4

* Fix release script (#2104)

* Fix VCS when git is not in PATH

* Upgrade dependencies before the 1.0.5 release (#2111)

* Bump version to 1.0.5 (#2112)

* Fix GitHub URL for black

Black is now officially supported by the Python Software Foundation

* Update Contributing.md* Fix markdown formatting* Update link to official website FAQ

* Update managing-environments.md

Co-authored-by: brandonaut <[email protected]>
Co-authored-by: finswimmer <[email protected]>
Co-authored-by: Yannick PÉROUX <[email protected]>
Co-authored-by: Edward George <[email protected]>
Co-authored-by: Jan Škoda <[email protected]>
Co-authored-by: Andrew Marshall <[email protected]>
Co-authored-by: Andrew Selzer <[email protected]>
Co-authored-by: Andriy Maletsky <[email protected]>
Co-authored-by: Julien Lhermitte <[email protected]>
Co-authored-by: Michael Aquilina <[email protected]>
Co-authored-by: Joshua Cannon <[email protected]>
Co-authored-by: László Velinszky <[email protected]>
Co-authored-by: Lu Zhu <[email protected]>
Co-authored-by: BSKY <[email protected]>
Co-authored-by: Trim21 <[email protected]>
Co-authored-by: Frost Ming <[email protected]>
Co-authored-by: Raphael Yancey <[email protected]>
Co-authored-by: adisbladis <[email protected]>
Co-authored-by: Dimitri Merejkowsky <[email protected]>
Co-authored-by: Jules Chéron <[email protected]>
Co-authored-by: Alex Povel <[email protected]>
@ptrcnull
Copy link

I believe this isn't fully fixed - the case from the last comment still applies. Poetry shouldn't read random gitignore files above the actual project directory.

Currently we get around this by setting the GIT_DIR variable, but this is just a workaround, not a solution.

Copy link

github-actions bot commented Mar 2, 2024

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 2, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/bug Something isn't working as expected
Projects
None yet
3 participants