Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exclude nested items (#784) #1464

Merged

Conversation

finswimmer
Copy link
Member

This PR Implements the feature request #784.

When a folder is explicit defined in pyproject.toml as excluded, all nested data, including sub folder, are excluded. It is no longer necessary to use the glob folder/**/*.

Pull Request Check List

This is just a reminder about the most common mistakes. Please make sure that you tick all appropriate boxes. But please read our contribution guide at least once, it will save you unnecessary review cycles!

  • Added tests for changed code.
  • Updated documentation for changed code.

Note: If your Pull Request introduces a new feature or changes the current behavior, it should be based
on the develop branch. If it's a bug fix or only a documentation update, it should be based on the master branch.

If you have any questions to any of the points above, just submit and ask! This checklist is here to help you, not to deter you from contributing!

Tobias Klare added 5 commits October 14, 2019 09:37
When a folder is explicit defined in `pyproject.toml` as excluded, all nested data, including subfolder, are excluded. It is no longer neccessary to use the glob `folder/**/*`
…folder, because pathlib's `is_dir()` raises in exception under windows of name contains globing characters
@finswimmer
Copy link
Member Author

Meh, windows again ... :( But now I got it.

Any comments on it?

@finswimmer
Copy link
Member Author

I see space for improvements.

The current implementation only covers the case where a folder is given explicit to be excluded. What should happen if a wildcard expand to a folder? I think than all the nested data should be excluded as well.

Do you agree?

@kasteph kasteph added kind/feature Feature requests/implementations Hacktoberfest labels Oct 14, 2019
Copy link
Member

@kasteph kasteph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@finswimmer If a wildcard in the exclude is also a folder, then it makes sense to also exclude that wildcard's nested dependencies.

p.s. thank you for you contribution! 🍡 💫 It's really appreciated.

@@ -86,6 +86,8 @@ def find_excluded_files(self): # type: () -> Set[str]

explicitely_excluded = set()
for excluded_glob in self._package.exclude:
if os.path.isdir(os.path.join(self._path.as_posix(), str(excluded_glob))):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you extract the argument from os.path.isdir here and put it in a variable inside this for-loop since we are also using the same argument for glob in line 92?

Steps to do this are:
1. expand any wildcard used
2. if expanded path is a folder append  **/* and expand again
@finswimmer
Copy link
Member Author

@stephsamson I saw your comment to late, sorry.

But with the integration of also removing nested data when wildcards are used, your suggestion should be obsolet, should it?

fin swimmer

@finswimmer
Copy link
Member Author

Yeah, two commits in a row without failing tests on window. It's getting better ;)

I think I've finished here.

Ready for merge or any comments.

Copy link
Member

@sdispater sdispater left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!

I think we can simplify the code a bit to make it more readable.

Comment on lines 88 to 105
for excluded_glob in self._package.exclude:
for excluded in glob(
os.path.join(self._path.as_posix(), str(excluded_glob)), recursive=True
):
for excluded_glob in itertools.chain.from_iterable(
map(
lambda excluded_path: glob(
str(Path(self._path, excluded_path)), recursive=True
),
self._package.exclude,
)
):
if Path(excluded_glob).is_dir():
for excluded in glob(str(Path(excluded_glob, "**/*")), recursive=True):
explicitely_excluded.add(
Path(excluded).relative_to(self._path).as_posix()
)
else:
explicitely_excluded.add(
Path(excluded).relative_to(self._path).as_posix()
Path(excluded_glob).relative_to(self._path).as_posix()
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can simplify this to:

for excluded_glob in self._package.exclude:
    excluded_path = Path(self._path, excluded_glob)
    if excluded_path.is_dir():
        excluded_glob = os.path.join(excluded_glob, "**/*")

    for excluded in self._path.glob(excluded_glob):
        explicitely_excluded.add(excluded.relative_to(self._path).as_posix())

Path(excluded).relative_to(self._path).as_posix()
)
excluded_path = Path(self._path, excluded_glob)
if excluded_path.is_dir():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if excluded_path.is_dir():
try:
is_dir = excluded_path.is_dir()
except OSError:
# On Windows, testing if a path with a glob is a directory will raise an OSError
is_dir = False
if is_dir:

@finswimmer
Copy link
Member Author

Windows, always windows again ...

But now we've got it (I think)-

Copy link
Member

@sdispater sdispater left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 👍

@sdispater sdispater merged commit 3a39e5a into python-poetry:develop Oct 19, 2019
@finswimmer finswimmer deleted the issue-000784-exclude-nested-items branch December 13, 2019 18:00
sdispater added a commit that referenced this pull request Mar 20, 2020
* export: fix exporting extras sub-dependencies (#1294)

* Support POETRY_HOME for install (#794)

Allow the `POETRY_HOME` environment variable to be passed during installation to change the default installation directory of `~/.poetry`:

```
POETRY_HOME=/etc/poetry python get-poetry.py
```

* * check if relative filename is in excluded file list (#1459)

* * check if relative filename is in excluded file list
* removed find_excluded_files() method from wheel.py

* added test for excluding files in wheels

* creating an own test data folder, for testing excluding files by pyproject.toml

* use as_posix() to respect windows file path delimiters

* Exclude nested items (#784) (#1464)

* This PR impliments the feature request #784.

When a folder is explicit defined in `pyproject.toml` as excluded, all nested data, including subfolder, are excluded. It is no longer neccessary to use the glob `folder/**/*`

* use `Path` instead of `os.path.join` to create string for globbing

* try to fix linting error

* create glob pattern string by concatenating and not using Path

* using `os.path.isdir()`` for checking of explicit excluded name is a folder, because pathlib's `is_dir()` raises in exception under windows of name contains globing characters

* Remove nested data when wildcards where used.

Steps to do this are:
1. expand any wildcard used
2. if expanded path is a folder append  **/* and expand again

* fix linting

* only glob a second time if path is dir

* implement @sdispater 's suggestion for better readability

* fix glob for windows?

* On Windows, testing if a path with a glob is a directory will raise an OSError

* pathlibs  glob function doesn't return the correct case (https://bugs.python.org/issue26655). So switching back to  glob.glob()

* removing obsolete imports

* Update dependencies

* Deprecate allows-prereleases in favor of allow-prereleases for consistency

* Fix tests for Python 2.7

* Fix linting

* Fix linting

* Fix linting

* Fix typing import

* Correct a couple typos in get-poetry.py (#573)

* Docs: `self:update` changed to `self update` (#1588)

* Fix GitHub actions cache issues on develop (#1918)

* Fix Github actions cache issues

* Fix Github Actions cache issues (#1928)

* Add --source option to "poetry add" (#1912)

* Add --source option to 'poetry add'

* Add tests for 'poetry add --source'

* Merge master into develop (#2070)

* Fix Github actions cache issues (#1908)

* Fix case of `-f` flag

* Make it clearer what options to pass to `--format`

* fix (masonry.api): `get_requires_for_build_wheel` must return additional list of requirements for building a package, not listed in `pyproject.toml` and not dependencies for the package itself (#1875)

fix (tests): adopted tests

* Lazy Keyring intialization for PasswordManager (#1892)

* Fix Github Actions cache issues (#1928)

* Avoid nested quantifiers with overlapping character space on git url parsing (#1902 (#1913)

* fix (git): match for `\w` instead of `.` for getting user

* change (vcs.git): hold pattern of the regex parts in a dictionary to be consistent over all regexs

* new (vcs.git): test for `parse_url` and some fixes for the regex pattern

* new (vcs.git): test for `parse_url` with string that should fail

* fix (test.vcs.git): make flake8 happy

* fix: correct parsing of wheel version with regex. (#1932)

The previous regexp was only taking the first integer of the version number,
this presented problems when the major version number reached double digits.

Poetry would determine that the version of the dependency is '1', rather than,
ie: '14'. This caused failures to solve versions.

* Fix errors when using the --help option (#1910)

* Fix how repository credentials are retrieved from env vars (#1909)

# Conflicts:
#	poetry/utils/password_manager.py

* Fix downloading packages from Simplepypi (#1851)

* fix downloading packages from simplepypi

* unused code removed

* remove unused imports

* Upgrade dependencies for the 1.0.3 release (#1965)

* Bump version to 1.0.3 (#1966)

* Fix non-compliant Git URL matching

RFC 3986 § 2.3 permits more characters in a URL than were matched. This
corrects that, though there may be other deficiencies. This was a
regression from v1.0.2, where at least “.” was matched without error.

* Update README.md "Updating Poetry"

Currently the note in "Updating Poetry" is different from the one below in "Enable tab completion for Bash, Fish, or Zsh". This MR is to make them more consistent.

* init: change dev dependency prompt

* Fix CI issues (#2069)

Co-authored-by: brandonaut <[email protected]>
Co-authored-by: finswimmer <[email protected]>
Co-authored-by: Yannick PÉROUX <[email protected]>
Co-authored-by: Edward George <[email protected]>
Co-authored-by: Jan Škoda <[email protected]>
Co-authored-by: Andrew Marshall <[email protected]>
Co-authored-by: Andrew Selzer <[email protected]>
Co-authored-by: Andrii Maletskyi <[email protected]>

* pre-commit: replace isort mirror with isort upstream (#2118)

The isort pre-commit mirror has been deprecated. This change updates
configuration to use the upstream package repository instead of the
mirror.

* Add cache list command (#1187)

* Add poetry.locations.REPOSITORY_CACHE_DIR

The repository cache directory is used in multiple places in the
codebase. This change ensures that the value is reused.

* Add cache list command

This introduces a new cache sub-command that lists all available
caches.

Relates-to: #1162

Co-authored-by: Tom Milligan <[email protected]>
Co-authored-by: David Cramer <[email protected]>
Co-authored-by: finswimmer <[email protected]>
Co-authored-by: Kyle Altendorf <[email protected]>
Co-authored-by: Justin Mayer <[email protected]>
Co-authored-by: Yannick PÉROUX <[email protected]>
Co-authored-by: brandonaut <[email protected]>
Co-authored-by: Edward George <[email protected]>
Co-authored-by: Jan Škoda <[email protected]>
Co-authored-by: Andrew Marshall <[email protected]>
Co-authored-by: Andrew Selzer <[email protected]>
Co-authored-by: Andrii Maletskyi <[email protected]>
Co-authored-by: Arun Babu Neelicattu <[email protected]>
Copy link

github-actions bot commented Mar 1, 2024

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/feature Feature requests/implementations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants