Skip to content

tests: refactor test_fragments to clarify expected successes/fails#1776

Merged
mre merged 8 commits intolycheeverse:masterfrom
rina-forks:improve-fragment-tests
Jul 30, 2025
Merged

tests: refactor test_fragments to clarify expected successes/fails#1776
mre merged 8 commits intolycheeverse:masterfrom
rina-forks:improve-fragment-tests

Conversation

@katrinafyi
Copy link
Member

this PR improves test_fragments by separating the URLs into separate
lists for expected failures and expected successes. each of these are
searched in stdout and/or stderr to check that the URL has the expected
outcome. this makes the expected test behaviour much more obvious, where
it was previously only implied by the "x total" and "x OK" strings.

this PR makes no changes to actual behaviour. hopefully, it will make it
easier to test future changes to the fragment behaviour.

previously, the test only checked that stderr contains a list of URLs.
because --verbose is specified, stderr would contain both successful and
failing URLs, so this didn't really test much. it only tested that the
URLs were detected by lychee.

also, search terms are now suffixed with whitespace to prevent file.html
from matching file.html#something. the test should consider these as
different links, but the previous code had the potential to match
file.html#something when looking for file.html.

in making this PR, we find that a fragment link to an empty file is
always treated as a valid link because it is detected as a plaintext
file. maybe there should be a warning when a URL points to a plaintext
file and a fragment check is requested
(relevant code).

katrinafyi and others added 6 commits July 27, 2025 15:51
this PR improves `test_fragments` by separating the URLs into separate
lists for expected failures and expected successes. each of these are
searched in stdout and/or stderr to check that the URL has the expected
outcome. this makes the expected test behaviour much more obvious, where
it was previously only implied by the "x total" and "x OK" strings.

this PR makes no changes to actual behaviour. hopefully, it will make it
easier to test future changes to the fragment behaviour.

previously, the test only checked that stderr contains a list of URLs.
because `--verbose` is specified, stderr would contain both successful and
failing URLs, so this didn't really test much. it only tested that the
URLs were detected by lychee.

also, search terms are now suffixed with whitespace to prevent `file.html`
from matching `file.html#something`. the test should be consider these as
different links, but the previous code had the potential to match
`file.html#something` when looking for `file.html`.

in making this PR, we find that a fragment link to an empty file is
always treated as a valid link because it is detected as a plaintext
file. maybe there should be a warning when a URL points to a plaintext
file and a fragment check is requested
([relevant code](https://github.com/lycheeverse/lychee/blob/ea415c8db4597c383f88b5fc9894fd86a7245494/lychee-lib/src/utils/fragment_checker.rs#L133)).
also update outdated text in file1.md
@mre
Copy link
Member

mre commented Jul 27, 2025

Yeah, we've certainly outgrown our old fragment testing setup, so I'm thankful for the refactor.

in making this PR, we find that a fragment link to an empty file is
always treated as a valid link because it is detected as a plaintext
file. maybe there should be a warning when a URL points to a plaintext
file and a fragment check is requested

Makes sense, yes. Would you like to add the warning as part of this PR or a separate one? I'm fine with either option.

for files detected as Plaintext type, fragment checking is not supported
and will always return true. at the moment, this is done silently and
this might be surprising if the user has requested fragment checking.
note that the Plaintext case includes empty files and unknown file
types, since Plaintext is used as the fallback file type.

this change adds a warning to fragment_checker when this case is
reached.

the other reasonable place to put this warning would be in
`lychee-lib/src/checker/file.rs` next to the other "Skipped fragment
check" warnings. i have decided to put it in `fragment_checker.rs` so
that it is next to the `return Ok(true)` statement. this will make it
easier in case the Plaintext case is ever changed.
@katrinafyi katrinafyi force-pushed the improve-fragment-tests branch from 622c897 to 86edd1c Compare July 27, 2025 13:19
@katrinafyi
Copy link
Member Author

I've made the change but I'm looking at other discussions now and the warning could be hit often, for instance if a user has lots of local binary files. This seems more noisy than helpful. That said, people shouldn't be attaching fragments to binary file links anyway so maybe it's okay.

@mre
Copy link
Member

mre commented Jul 28, 2025

Maybe instead of warning we could set it to info. I think it's fine to only print it in verbose mode if it's expected to occur often.

@katrinafyi
Copy link
Member Author

It is done :)


let expected_successes = vec![
"fixtures/fragments/empty_dir",
"fixtures/fragments/empty_file#fragment", // XXX: is this a bug? a fragment in an empty file is being treated as valid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right to question this. I think we should treat fragment links to empty files as invalid.

Our job is to identify links that won't provide a meaningful user experience. A fragment reference in an empty file is functionally broken: it promises to take the user to specific content that doesn't exist. While the file itself may be reachable, the fragment reference is semantically meaningless, making this a legitimate case to flag as invalid.

I'd be willing to accept a pull request, which changes this, and I would welcome a PR for that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree in principle, but I couldn't think of a nice way to implement it. Fragments in empty file are treated as existing because empty files are detected as plaintext. But I think it would be too heavy-handed to reject all fragments on plaintext files, especially since plaintext is the fallback file type for unknown files.

I think the info message is a okay for now, and maybe someone more experienced can look at it :) Maybe the first step would be to differentiate plaintext files and unknown file types, so you could handle them differently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's a really good idea. I guess in the long run, we should consider deeper file inspection anyway. This gives us more flexibility and takes away some of the guesswork.

@mre mre merged commit caddc9f into lycheeverse:master Jul 30, 2025
6 checks passed
@mre
Copy link
Member

mre commented Jul 30, 2025

Thanks for the nice change, @katrinafyi.

@mre mre mentioned this pull request Jul 30, 2025
@thomas-zahner
Copy link
Member

Thanks @katrinafyi this test also used to bother me a bit as it was so imprecise

@mre mre mentioned this pull request Aug 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants