-
-
Notifications
You must be signed in to change notification settings - Fork 192
refactor: check glob validity when parsing input source arguments #1869
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
now, when providing an input with glob characters but an invalid glob
pattern, this will cause a "cannot parse inputs" error rather than
erroring out later during link checking. this makes use of the
existing ErrorKind::InvalidGlobPattern error case.
for example, a glob with unbalanced brackets is invalid and now looks
like this:
```console
$ cargo run -- 'noasdjfi/['
Error: Cannot parse inputs from arguments
Caused by:
0: UNIX glob pattern is invalid
1: Pattern syntax error near position 9: invalid range pattern
```
i've also added some TODO notes which would make the logic neater.
i didn't do those refactors in this PR because i am mindful of another
currently-open (and much bigger) PR which touches these lines
(https://www.github.com/lycheeverse/lychee/pull/1837)
i wonder if the reason FsGlob stores strings rather than a Pattern is
for flexibility to switch to different glob packages.
this PR was prompted by findings in
lycheeverse#1864 (comment)
Good question, I wouldn't think so but then I'm not the author. @mre might know the answer to that. Otherwise I have no objections to this PR. |
fcdf77c to
e0912ab
Compare
If you're asking me, I haven't tried. I think the change would be pretty easy. But it's the kind of thing which is so obvious that if it hasn't been done already, then there's probably a logical reason why it hasn't. That's my reasoning at least. If you're asking @ mre, then we have to wait for him :) |
|
Sorry for the late reply.
That's a sensible approach in general. In this specific case however, I have to disappoint you. It might just have been an oversight on my side. 😆 Well, perhaps one issue might be serializing/deserializing those values when writing/reading the config. The |
Don't worry about this. My PR is stale anyway, so I'd have to rebase it. So feel free to make any changes you like. |
unfortunately, the `glob_with` function (which walks the dirs and performs the globbing) does _not_ support taking a Pattern as input, so we have to convert it back to a string at that point. this is a bit weird, but i think it's still a bit nicer.
ths parsing logic is also changed to use early return so there is less nesting
Conflicts: lychee-bin/tests/cli.rs lychee-lib/src/collector.rs lychee-lib/src/types/input/input.rs
|
Thanks for your comments! I've done the two extra TODOs within this PR. The InputSource parsing is now in InputSource::new and the FsGlob field is a Pattern. As @mre foreshadowed, we have to do some custom serialisation/deserialisation, but it's not too much trouble. |
thomas-zahner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@katrinafyi Thanks, I really like it. This makes the code quite a bit cleaner and usability is also improved. I have just a few small remarks.
lychee-lib/src/types/input/source.rs
Outdated
| fn test_pattern_serialization_is_original_pattern() { | ||
| let pat = "asd[f]*"; | ||
| assert_eq!( | ||
| to_json(&InputSource::FsGlob { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and then unwrap both results here when we expect two Oks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's done ~
Co-authored-by: Thomas Zahner <thomas.zahner@protonmail.ch>
now, when providing an input with glob characters but an invalid glob pattern, this will cause a "cannot parse inputs" error rather than erroring out later during link checking. this makes use of the existing ErrorKind::InvalidGlobPattern error case.
for example, a glob with unbalanced brackets is invalid and now looks like this:
i've also added some TODO notes which would make the logic neater. i didn't do those refactors in this PR because i am mindful of another currently-open (and much bigger) PR which touches these lines (https://www.github.com/lycheeverse/lychee/pull/1837)
i wonder if the reason FsGlob stores strings rather than a Pattern is for flexibility to switch to different glob packages.
this PR was prompted by findings in #1864 (comment)