Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Globs ignore exclude_path #1405

Closed
Wumpf opened this issue Apr 16, 2024 · 5 comments · Fixed by #1556
Closed

Globs ignore exclude_path #1405

Wumpf opened this issue Apr 16, 2024 · 5 comments · Fixed by #1556
Labels
bug Something isn't working good first issue Good for newcomers hacktoberfest help wanted Extra attention is needed

Comments

@Wumpf
Copy link

Wumpf commented Apr 16, 2024

I wanted lychee to check rust files, so I run lychee **/*.rs, confused why it takes so long to scan for links, I ran lychee **/*.rs --dump-inputs which showed that paths excluded via exclude_path options (e.g. target) did not influence the glob.

This is quite inconvenient since what I'm actually after is adding extensions to be checked. At the very least this should be documented better.

@mre
Copy link
Member

mre commented Apr 16, 2024

I think it should actually work, though.
At least that's the idea of this part:

if self.is_excluded_path(&path) {
continue;

At this point, I don't know why it doesn't work.
One note, though: you're using your operating system's glob handling. Glob expansion happens in your shell.
To use the Rust-based one, you need to quote the glob.

lychee '**/*.rs' --dump-inputs

so

lychee --exclude-path './target/' '**/*.rs' --dump-inputs

However, this gives me the same incorrect results as you described.
If you find the correct incantation, we should certainly document that.
Otherwise, I consider this a bug and would love to have someone investigate that.
Not sure if you have the time, but I'd certainly appreciate your help.

As a side note, glob filtering is not ideal right now. Would love to move to globset, which would allow us to provide multiple globs and so on. That shouldn't be a blocker for better exclude handling for globs with the current implementation, though.

@mre mre added bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed labels Apr 16, 2024
Wumpf added a commit to rerun-io/rerun that referenced this issue Apr 16, 2024
### What

Code files are now included in link checking.

* Fixes #5925

Some issues encountered:
* globs don't respect exclude_files which makes working with this
locally harder lycheeverse/lychee#1405
* can't specify extensions / correct files aren't picked up
automatically lycheeverse/lychee#410

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/5986?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/5986?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!

- [PR Build Summary](https://build.rerun.io/pr/5986)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.
@Aariq
Copy link

Aariq commented Jul 26, 2024

Also having issues with this (but an R project where I'd like to exclude the renv/ directory and use a glob to check .md files). I can't seem to create a simple reproducible example though. I have discovered that --dump-inputs doesn't produce accurate output though (I think). E.g.

$ ls -R
nope yep

./nope:
a.txt b.md

./yep:
a.md  b.txt

--dump-inputs says nope/ isn't excluded

$ lychee --exclude-path nope './**/*.txt' --dump-inputs
nope/a.txt
yep/b.txt

but actually it is excluded

$ lychee --exclude-path nope './**/*.txt'
  2/2 ━━━━━━━━━━━━━━━━━━━━ Finished extracting links                                                                                                Issues found in 1 input. Find details below.

[yep/b.txt]:
✗ [ERR] https://qwkjlkrwjelrjwerlwekj.jp/ | Failed: Network error: error sending request for url (https://qwkjlkrwjelrjwerlwekj.jp/)

🔍 2 Total (in 0s) ✅ 1 OK 🚫 1 Error

@mre
Copy link
Member

mre commented Jul 30, 2024

I think dump-inputs currently is pretty simple. As you discovered, we don't seem to respect excludes yet. We should add that.

@mre
Copy link
Member

mre commented Nov 7, 2024

@Aariq, I tested dumping the links in combination with globs again locally, and they seem to work now with the latest lychee version. Can you confirm?

Never mind, I just realized we were talking about --dump-inputs here, which is still broken.

@mre
Copy link
Member

mre commented Nov 7, 2024

Fixed in #1556.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers hacktoberfest help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants