Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update bagit.py #174

Closed
wants to merge 2 commits into from
Closed

Conversation

finoradin
Copy link

Fixing validation bug - oxum validation was outside the if fast condition, so validation was always defaulting to fast

Fixing validation bug - oxum validation was outside the if fast condition, so validation was *always* defaulting to fast
@acdha
Copy link
Member

acdha commented Aug 6, 2024

Can you give a little more detail about the scenario you encountered? We always want to validate the payload oxum value even if we're not doing a fast validation so the code appears to be correct and a brief review of the existing tests shows at least one test which relies on slow validation.

@finoradin
Copy link
Author

If I intentionally modified a file in a bag, and validated, it would only complain about the payload oxum, but would not report anything regarding checksum validation failure - i.e. which file failed, etc. I thought it was just stopping at the oxum check, thus not validating the checksums. If I was wrong and it is actually still validating the checksums, I would expect it to report back accordingly of course.

@acdha
Copy link
Member

acdha commented Sep 18, 2024

The proposed new logic doesn't change the fast processing and it appears to introduce a bug where the non-fast mode would no longer validate the oxum check. We always need to do that, the only thing fast mode does is bail out immediately if that fails rather than checksumming a bunch of files first.

I'm getting the impression that there might be a different workflow that you have in mind, and that makes me wonder whether the answer would be a different CLI flag.

@finoradin
Copy link
Author

Yes, the proposed changes were not intended to impact the fast processing. the issue, at least, as far as I could tell was that the fast processing was happening even when it wasn’t asked to happen.

Regular validation was only reporting failed oxum and, not showing any output about individual file validation, nor indicating, which of the files had failed.

@acdha
Copy link
Member

acdha commented Sep 18, 2024

Yes, the proposed changes were not intended to impact the fast processing. the issue, at least, as far as I could tell was that the fast processing was happening even when it wasn’t asked to happen.

That's expected - the fast processing isn't separate, it's only doing the first half of the validation process (see section 3 of RC 8493) but both steps are required.

It would definitely make sense to extend the output to be more precise if there are cases where we're not telling you which files are inconsistent with the metadata.

@finoradin
Copy link
Author

finoradin commented Sep 18, 2024

Ok well in that case I'll close this PR since it sounds like it should be doing "fast" validation (checking the oxum) and validating the payload checksums (whereas I think my change makes it only does the latter when not specifying "fast"). We can continue discussing here: #177 because it def shouldn't stop and only report the failed oxum — how is someone supposed to fix their bag if they don't know why it failed after all.

@finoradin finoradin closed this Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants