tests: refactor test_fragments to clarify expected successes/fails#1776
tests: refactor test_fragments to clarify expected successes/fails#1776mre merged 8 commits intolycheeverse:masterfrom
test_fragments to clarify expected successes/fails#1776Conversation
this PR improves `test_fragments` by separating the URLs into separate lists for expected failures and expected successes. each of these are searched in stdout and/or stderr to check that the URL has the expected outcome. this makes the expected test behaviour much more obvious, where it was previously only implied by the "x total" and "x OK" strings. this PR makes no changes to actual behaviour. hopefully, it will make it easier to test future changes to the fragment behaviour. previously, the test only checked that stderr contains a list of URLs. because `--verbose` is specified, stderr would contain both successful and failing URLs, so this didn't really test much. it only tested that the URLs were detected by lychee. also, search terms are now suffixed with whitespace to prevent `file.html` from matching `file.html#something`. the test should be consider these as different links, but the previous code had the potential to match `file.html#something` when looking for `file.html`. in making this PR, we find that a fragment link to an empty file is always treated as a valid link because it is detected as a plaintext file. maybe there should be a warning when a URL points to a plaintext file and a fragment check is requested ([relevant code](https://github.com/lycheeverse/lychee/blob/ea415c8db4597c383f88b5fc9894fd86a7245494/lychee-lib/src/utils/fragment_checker.rs#L133)).
also update outdated text in file1.md
|
Yeah, we've certainly outgrown our old fragment testing setup, so I'm thankful for the refactor.
Makes sense, yes. Would you like to add the warning as part of this PR or a separate one? I'm fine with either option. |
for files detected as Plaintext type, fragment checking is not supported and will always return true. at the moment, this is done silently and this might be surprising if the user has requested fragment checking. note that the Plaintext case includes empty files and unknown file types, since Plaintext is used as the fallback file type. this change adds a warning to fragment_checker when this case is reached. the other reasonable place to put this warning would be in `lychee-lib/src/checker/file.rs` next to the other "Skipped fragment check" warnings. i have decided to put it in `fragment_checker.rs` so that it is next to the `return Ok(true)` statement. this will make it easier in case the Plaintext case is ever changed.
622c897 to
86edd1c
Compare
|
I've made the change but I'm looking at other discussions now and the warning could be hit often, for instance if a user has lots of local binary files. This seems more noisy than helpful. That said, people shouldn't be attaching fragments to binary file links anyway so maybe it's okay. |
|
Maybe instead of |
|
It is done :) |
|
|
||
| let expected_successes = vec![ | ||
| "fixtures/fragments/empty_dir", | ||
| "fixtures/fragments/empty_file#fragment", // XXX: is this a bug? a fragment in an empty file is being treated as valid |
There was a problem hiding this comment.
You're right to question this. I think we should treat fragment links to empty files as invalid.
Our job is to identify links that won't provide a meaningful user experience. A fragment reference in an empty file is functionally broken: it promises to take the user to specific content that doesn't exist. While the file itself may be reachable, the fragment reference is semantically meaningless, making this a legitimate case to flag as invalid.
I'd be willing to accept a pull request, which changes this, and I would welcome a PR for that.
There was a problem hiding this comment.
I do agree in principle, but I couldn't think of a nice way to implement it. Fragments in empty file are treated as existing because empty files are detected as plaintext. But I think it would be too heavy-handed to reject all fragments on plaintext files, especially since plaintext is the fallback file type for unknown files.
I think the info message is a okay for now, and maybe someone more experienced can look at it :) Maybe the first step would be to differentiate plaintext files and unknown file types, so you could handle them differently.
There was a problem hiding this comment.
Oh, that's a really good idea. I guess in the long run, we should consider deeper file inspection anyway. This gives us more flexibility and takes away some of the guesswork.
|
Thanks for the nice change, @katrinafyi. |
|
Thanks @katrinafyi this test also used to bother me a bit as it was so imprecise |
this PR improves
test_fragmentsby separating the URLs into separatelists for expected failures and expected successes. each of these are
searched in stdout and/or stderr to check that the URL has the expected
outcome. this makes the expected test behaviour much more obvious, where
it was previously only implied by the "x total" and "x OK" strings.
this PR makes no changes to actual behaviour. hopefully, it will make it
easier to test future changes to the fragment behaviour.
previously, the test only checked that stderr contains a list of URLs.
because
--verboseis specified, stderr would contain both successful andfailing URLs, so this didn't really test much. it only tested that the
URLs were detected by lychee.
also, search terms are now suffixed with whitespace to prevent
file.htmlfrom matching
file.html#something. the test should consider these asdifferent links, but the previous code had the potential to match
file.html#somethingwhen looking forfile.html.in making this PR, we find that a fragment link to an empty file is
always treated as a valid link because it is detected as a plaintext
file. maybe there should be a warning when a URL points to a plaintext
file and a fragment check is requested
(relevant code).