tests: refactor `test_fragments` to clarify expected successes/fails by katrinafyi · Pull Request #1776 · lycheeverse/lychee

katrinafyi · 2025-07-27T05:57:11Z

this PR improves `test_fragments` by splitting the URLs into two lists: expected failures and expected successes. each list is searched for in stdout and/or stderr to check that the URL has the expected outcome. this makes the expected test behaviour much more obvious, where it was previously only implied by the "x total" and "x OK" strings.

this PR makes no changes to actual behaviour. hopefully, it will make it easier to test future changes to the fragment behaviour.

previously, the test only checked that stderr contains a list of URLs. because `--verbose` is specified, stderr would contain both successful and failing URLs, so this didn't really test much. it only tested that the URLs were detected by lychee.

also, search terms are now suffixed with whitespace to prevent `file.html` from matching `file.html#something`. the test should consider these as different links, but the previous code had the potential to match `file.html#something` when looking for `file.html`.

in making this PR, we find that a fragment link to an empty file is always treated as a valid link because it is detected as a plaintext file. maybe there should be a warning when a URL points to a plaintext file and a fragment check is requested ([relevant code](https://github.com/lycheeverse/lychee/blob/ea415c8db4597c383f88b5fc9894fd86a7245494/lychee-lib/src/utils/fragment_checker.rs#L133)).

also update outdated text in file1.md
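
The whitespace-suffix idea can be sketched in a few lines of Rust. This is only an illustration, not the code in `lychee-bin/tests/cli.rs`: the `reports` helper, the `[OK]`/`[ERROR]` labels, the summary wording, and the sample output are all made up for the example.

```rust
/// Hypothetical helper (not part of lychee): true if `output` contains
/// `status`, a space, `url`, and then a trailing space.
/// The trailing space stops `file.html` from matching `file.html#something`.
fn reports(output: &str, status: &str, url: &str) -> bool {
    output.contains(&format!("{status} {url} "))
}

fn main() {
    // Made-up output for illustration; lychee's real format may differ.
    let output = "[OK] file.html#something | ok\n[ERROR] file.html#missing | no such fragment\n";

    let expected_successes = vec!["file.html#something"];
    let expected_failures = vec!["file.html#missing"];

    for url in &expected_successes {
        assert!(reports(output, "[OK]", url), "expected success: {url}");
    }
    for url in &expected_failures {
        assert!(reports(output, "[ERROR]", url), "expected failure: {url}");
    }

    // A bare substring search finds `file.html` even though only
    // fragment links were reported...
    assert!(output.contains("file.html"));
    // ...while the whitespace-suffixed search does not.
    assert!(!reports(output, "[OK]", "file.html"));

    // Derive the expected counts from the lists instead of hard-coding them.
    let total = expected_successes.len() + expected_failures.len();
    println!("{total} total, {} OK", expected_successes.len());
}
```

Keeping the two lists explicit means a URL that silently flips from failing to passing breaks a named assertion rather than only shifting a count.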

mre · 2025-07-27T12:23:37Z

Yeah, we've certainly outgrown our old fragment testing setup, so I'm thankful for the refactor.

> in making this PR, we find that a fragment link to an empty file is
> always treated as a valid link because it is detected as a plaintext
> file. maybe there should be a warning when a URL points to a plaintext
> file and a fragment check is requested

Makes sense, yes. Would you like to add the warning as part of this PR or a separate one? I'm fine with either option.

for files detected as the Plaintext type, fragment checking is not supported and will always return true. at the moment, this is done silently, which might be surprising if the user has requested fragment checking. note that the Plaintext case includes empty files and unknown file types, since Plaintext is used as the fallback file type.

this change adds a warning to fragment_checker when this case is reached. the other reasonable place to put this warning would be in `lychee-lib/src/checker/file.rs`, next to the other "Skipped fragment check" warnings. i have decided to put it in `fragment_checker.rs` so that it is next to the `return Ok(true)` statement. this will make it easier to update if the Plaintext case is ever changed.
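
The fallback described above can be sketched roughly as follows. This is a simplified stand-in, not the code in `fragment_checker.rs`: the `FileType` variants, `detect`, and `check_fragment` are hypothetical names, detection by file extension is an assumption, the HTML check is a naive substring search, and the message is returned rather than logged so the example needs no dependencies.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum FileType {
    Html,
    Plaintext,
}

/// Hypothetical detection by extension. Anything unrecognised, including
/// extension-less and empty files, falls back to `Plaintext`.
fn detect(path: &str) -> FileType {
    match path.rsplit_once('.') {
        Some((_, "html" | "htm")) => FileType::Html,
        _ => FileType::Plaintext,
    }
}

/// Returns whether the fragment counts as valid, plus an optional message
/// that a caller could log.
fn check_fragment(file_type: FileType, content: &str, fragment: &str) -> (bool, Option<String>) {
    match file_type {
        // Naive check for illustration: look for a literal id attribute.
        FileType::Html => (content.contains(&format!("id=\"{fragment}\"")), None),
        // Plaintext has no notion of anchors, so the check always passes.
        FileType::Plaintext => (
            true,
            Some(format!("fragment check for #{fragment} is not supported on plaintext files")),
        ),
    }
}

fn main() {
    // An empty, extension-less file ends up in the Plaintext branch.
    let path = "fixtures/fragments/empty_file";
    let (valid, note) = check_fragment(detect(path), "", "fragment");
    assert!(valid);
    if let Some(note) = note {
        println!("{note}");
    }
}
```

Emitting the message in the same branch that returns `true` keeps the two together if the Plaintext handling is changed later.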

katrinafyi · 2025-07-27T15:34:30Z

I've made the change but I'm looking at other discussions now and the warning could be hit often, for instance if a user has lots of local binary files. This seems more noisy than helpful. That said, people shouldn't be attaching fragments to binary file links anyway so maybe it's okay.

mre · 2025-07-28T11:25:19Z

Maybe instead of warning we could set it to info. I think it's fine to only print it in verbose mode if it's expected to occur often.

katrinafyi · 2025-07-28T12:54:17Z

It is done :)

mre · 2025-07-30T17:16:37Z

lychee-bin/tests/cli.rs

+
+        let expected_successes = vec![
+            "fixtures/fragments/empty_dir",
+            "fixtures/fragments/empty_file#fragment", // XXX: is this a bug? a fragment in an empty file is being treated as valid


You're right to question this. I think we should treat fragment links to empty files as invalid.

Our job is to identify links that won't provide a meaningful user experience. A fragment reference in an empty file is functionally broken: it promises to take the user to specific content that doesn't exist. While the file itself may be reachable, the fragment reference is semantically meaningless, making this a legitimate case to flag as invalid.

I'd welcome a pull request that changes this.

katrinafyi · 2025-07-30

I do agree in principle, but I couldn't think of a nice way to implement it. Fragments in empty files are treated as existing because empty files are detected as plaintext. But I think it would be too heavy-handed to reject all fragments on plaintext files, especially since plaintext is the fallback file type for unknown files.

I think the info message is okay for now, and maybe someone more experienced can look at it :) Maybe the first step would be to differentiate plaintext files and unknown file types, so you could handle them differently.

mre · 2025-07-31

Oh, that's a really good idea. I guess in the long run, we should consider deeper file inspection anyway. This gives us more flexibility and takes away some of the guesswork.
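
One possible shape for that split, sketched under heavy assumptions: none of these names exist in lychee, the extension table is made up, and the policy choices (empty files and true plaintext reject fragments, unknown types skip the check) are just one option compatible with the discussion above.

```rust
#[derive(Debug, PartialEq)]
enum FileKind {
    Html,
    Plaintext,
    Unknown,
}

#[derive(Debug, PartialEq)]
enum FragmentStatus {
    Found,
    Missing,
    Skipped,
}

/// Hypothetical classification: unrecognised extensions become `Unknown`
/// instead of being folded into `Plaintext`.
fn classify(extension: Option<&str>) -> FileKind {
    match extension {
        Some("html" | "htm") => FileKind::Html,
        Some("txt") => FileKind::Plaintext,
        _ => FileKind::Unknown,
    }
}

fn fragment_status(kind: &FileKind, content: &str, fragment: &str) -> FragmentStatus {
    // An empty file cannot contain the fragment target, whatever its type.
    if content.is_empty() {
        return FragmentStatus::Missing;
    }
    match kind {
        // Naive check for illustration: look for a literal id attribute.
        FileKind::Html => {
            if content.contains(&format!("id=\"{fragment}\"")) {
                FragmentStatus::Found
            } else {
                FragmentStatus::Missing
            }
        }
        // Assumed policy: real plaintext has no anchors to point at.
        FileKind::Plaintext => FragmentStatus::Missing,
        // Assumed policy: nothing is known about the format, so skip.
        FileKind::Unknown => FragmentStatus::Skipped,
    }
}

fn main() {
    // The empty-file case from the test fixture would now be flagged.
    assert_eq!(fragment_status(&classify(None), "", "fragment"), FragmentStatus::Missing);
    // A non-empty file of unknown type is skipped rather than accepted.
    assert_eq!(
        fragment_status(&classify(Some("bin")), "\u{0}\u{1}", "fragment"),
        FragmentStatus::Skipped
    );
}
```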

mre · 2025-07-30T17:17:23Z

Thanks for the nice change, @katrinafyi.

thomas-zahner · 2025-07-31T10:06:25Z

Thanks @katrinafyi, this test also used to bother me a bit as it was so imprecise.

katrinafyi and others added 6 commits July 27, 2025 15:51

borrow in for loop

add42e7

add remaining URLs to classify all detected URLs

349597c

also update outdated text in file1.md

use .len() to compute expected counts

017fd92

move total check last so the more informative ok/err checks are first

56de370

fix text which wasn't updated after revert

5d02a6f

katrinafyi force-pushed the improve-fragment-tests branch from 622c897 to 86edd1c on July 27, 2025 13:19

use info level for plaintext fragment messge

39b5aea

mre reviewed Jul 30, 2025

mre approved these changes Jul 30, 2025

mre merged commit caddc9f into lycheeverse:master Jul 30, 2025
6 checks passed

mre mentioned this pull request Jul 30, 2025

chore: release v0.20.0 #1760

Closed

mre mentioned this pull request Aug 19, 2025

chore: release v0.20.0 #1808

Merged

mre merged 8 commits into lycheeverse:master from rina-forks:improve-fragment-tests
