feat: propagate input loading/collecting errors to top level #1864
feat: propagate input loading/collecting errors to top level #1864thomas-zahner merged 66 commits intolycheeverse:masterfrom
Conversation
|
I see that this PR contradicts an existing test case which says the current behaviour is intentional. lychee/lychee-bin/tests/cli.rs Lines 1458 to 1462 in 54e425c Maybe there should be an option about whether to treat absolute links as an error when root-dir is not specified? At the moment, it would not be possible to get the old "ignoring" behaviour after this PR. Or, we just change the test |
we still ned to feed it back to Status and that requires an old ErrorKind. maybe we add a new case to Status.
this slightly winds back the ResolvedInputSource usage and changes Response back to an ordinary InputSource
this shouldn't be needed anymore since we no longer use a recursive case in ErrorKind. instead, we use the new RequestError type.
@katrinafyi Honestly, I'm okay with this. Conceptually, it's not wrong to treat preprocessing as part of link extraction and therefore the link check process. This PR then also has the benefit that if preprocessing only fails on one/some file lychee doesn't abort, which is an improvement. Before PR We could think about a simple prevalidation step, as you mentioned which checks if the value of Just two things I noticed:
|
|
@thomas-zahner Thanks for your reply. Yeah, I can see the logic in that approach. I'm happy to leave preprocess errors as link check errors. Re the error://, it appears for every error arising before a HTTP request is made. I also don't like it very much, but it's passed through the stats and response formatter logic, so avoiding it would need a lot of changes there. But maybe this is something I should dig deeper into. The error:// url could also be confusing. |
This reverts commit 759b691.
|
I don't want to pursue making the error URL optional in this PR. I think it will be a simple but extensive change. You just have to allow In the meantime, it could be changed to |
|
Many thanks for this PR! |
Exclude /$version links that are not yet populated during this building stage Add root-dir to handle relative links Newline added to the file The change in lychee introduced in lycheeverse/lychee#1864 (comment)
Exclude /$version links that are not yet populated during this building stage Add root-dir to handle relative links Newline added to the file The change in lychee introduced in lycheeverse/lychee#1864 (comment)
previously, errors which happened while collecting links (e.g., due to invalid base join) would print a terse message to the console and then they would be forgotten. this looked like:
this is unhelpful because it only shows the internal enum name and it's not reported in the final error count, so it's very easy to miss.
now, these "early" errors get propagated into the final error list and displayed alongside the HTTP failures.
this makes them more obvious and lets us make use of the existing functions for displaying error details and suggested fixes.
this is implemented by changing the request-constructing functions to return a
Result<_, RequestError>rather thanResult<_, ErrorKind>. RequestError is a new error type which contains information about the URL or input source which caused the error. it also contains the underlying ErrorKind.lychee/lychee-lib/src/types/request.rs
Lines 7 to 19 in 976904a
i think it's nice to have it as a new error type because it makes it clear to the user that they only have to handle these two errors, and it avoids making the ErrorKind type recursive.
this makes it easier to handle the returned error. previously, if an Err did occur in the collected request stream, this would lead to an unexpected early exit and a tokio error. so, i think that changing this to
Result<_, RequestError>is no great burden because lychee-bin did not have very good handling of the old ErrorKind.some more examples of the new error messages:
compared to before (below). also note that the old behaviour is nondeterministic. any one of the errors could be the one which is printed, and it aborts the program with no other output.
this implements step (2) of the plan here #1624 (comment)
outdated commentary. i changed it to a new RequestError type now
however, i don't like the way this is implemented. it has to "smuggle" the early errors through a new error case,
and all of the commands have to deliberately handle this case by bypassing the check logic and directly constructing a failed Response. it would be very easy for a user to forget to do this. originally, i wanted to make this new error separate from the usual ErrorKind, but the
lychee_lib::Result<T>type with ErrorKind is so pervasive and it would've needed extensive changes.maybe, instead, i could embed this into the Request type by making its Uri field into a
Result<Uri, CreatRequestError>but that seems not great too.another downside of the current approach is it uses the fake
error://URL to display these messages. this is because the ResponseBody needs a "valid" Uri.TODO:
also, along the way, this changes the
ErrorKind::detailscases for invalid path and invalid base join. these details previously just repeated the error message and would look like "Cannot convert path '/../../' to a URI: Cannot convert path to URI: '/../../'. Check path format".related to #1265