-
-
Notifications
You must be signed in to change notification settings - Fork 192
refactor: use InputResolver to implement Input::get_sources #1880
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
previously, there was a lot of duplicated code between InputResolver::resolve_input and Input::get_sources, where resolve_input is used for main link checking and get_sources is used for dumping inputs. after this PR, Input::get_sources is implemented in terms of InputResolver::resolve. however, in doing so, we have to loosen the lifetime constraints on the `&Input` argument of resolve, so there's a lot of changes in resolve_input to effectively _transpose_ the logic. that is, previously it matched on the input source within a single try_stream!, and now it matches on the input source first then chooses between multiple try_streams. however, because try_stream constructs a new anonymous type every use, we then need to change the type to pin box dyn send. i don't know why this is needed but it seems to work. all in all, it's a surprisingly difficult refactor to get rid of some unpleasant code duplication. idrc if this gets merged or not, but the transposing of resolve_input does open the possibility of returning a `Result<Box<dyn Stream ...>>` to represent input source failures, as desired in https://www.github.com/lycheeverse/lychee/pull/1864#issuecomment-3389278177
|
Thank you very much for this amazing PR 🚀 It was something that bothered me as well but I didn't end up digging so deep, so really appreciate it. I would really like to merge. What does your TODO refer to? |
Conflicts: lychee-lib/src/types/input/input.rs lychee-lib/src/types/input/resolver.rs
|
Thanks for the kind words! The todo comment was about checking the comments and moving them to where the code is implemented. I've done that now, so it should be ready to review :) Edit: I can't see the test failure locally - flaky? |
Conflicts: lychee-lib/src/types/input/input.rs lychee-lib/src/types/input/resolver.rs
|
Nice work!
Before, The change from However, by cloning the data ( Pin is necessary because As a side note, if we wanted to avoid the dynamic dispatch, the typical alternative is to write out all variants as an enum: enum ResolveStream<'a> {
RemoteUrl(...),
FsGlob(...),
FsPath(...),
// ...
}But honestly, |
mre
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just two minor comments. Apart from that looks good to go!
| /// Note: Individual glob match failures are logged to stderr but don't terminate the stream. | ||
| /// However, directory traversal errors will stop processing and return the error immediately. | ||
| /// Returns an error if [`InputResolver::resolve`] returns an error. | ||
| pub fn get_sources( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You added #[must_use] to resolve() but not to get_sources(). Since get_sources() also returns a Stream, you might want consistency:
| pub fn get_sources( | |
| #[must_use] | |
| pub fn get_sources( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tried that but clippy didn't like it aha. I guess must_use is already implicitly propagated? Honestly, I only add must_use as clippy directs and am not an expert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're probably right. What was the clippy message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was this message here https://github.com/lycheeverse/lychee/actions/runs/19263234742/job/55072805923?pr=1880. I could've kept it and added a comment message, but I didn't think of one.
error: this function has a `#[must_use]` attribute with no message, but returns a type already marked as `#[must_use]`
--> lychee-lib/src/types/input/input.rs:188:5
|
188 | / pub fn get_sources(
189 | | self,
190 | | file_extensions: FileExtensions,
191 | | skip_hidden: bool,
192 | | skip_ignored: bool,
193 | | excluded_paths: &PathExcludes,
194 | | ) -> impl Stream<Item = Result<String>> {
| |___________________________________________^
|
= help: either add some descriptive message or remove the attribute
= help: for further information visit https://rust-lang.github.io/rust-clippy/rust-1.91.0/index.html#double_must_use
= note: `-D clippy::double-must-use` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(clippy::double_must_use)]`| #[must_use] | ||
| pub fn resolve<'a>( | ||
| input: &'a Input, | ||
| input: &'_ Input, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The &'_ is fine but I think &Input should work, too, and would be more conventional since you're explicitly not tying it to 'a.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, idk why but i had the idea that that wouldn't work. Thanks!
feedback by @ mre!
|
Thanks for making lychee's code a little cleaner. |
previously, there was a lot of duplicated code between
InputResolver::resolve_input and Input::get_sources, where
resolve_input is used for main link checking and get_sources is used for
dumping inputs.
after this PR, Input::get_sources is implemented in terms of
InputResolver::resolve. however, in doing so, we have to loosen the
lifetime constraints on the
&Inputargument of resolve, so there's alot of changes in resolve_input to effectively transpose the logic.
that is, previously it matched on the input source within a single
try_stream!, and now it matches on the input source first then chooses
between multiple try_streams. however, because try_stream constructs a
new anonymous type every use, we then need to change the type to pin box
dyn send.
i don't know why this is needed but it seems to work. all in all, it's a
surprisingly difficult refactor to get rid of some unpleasant code
duplication.
idrc if this gets merged or not, but the transposing of resolve_input
does open the possibility of returning a
Result<Box<dyn Stream ...>>to represent input source failures, as desired in
https://www.github.com/lycheeverse/lychee/pull/1864#issuecomment-3389278177
TODO