feat: move submission files async with retry #8018
Draft
+105
−27
Validating and rendering the various formats of a submitted I-D will, in general, happen in a different container (i.e., a celery worker container) than the one where its files are moved upon posting as a draft (i.e., the main django container). These processes may not agree on the contents of the filesystem. That's bad news for the `move_files_to_repository` method, which assumes it only needs to move the files it can see on the filesystem before its job is done.
This is causing #8016, where NFS sync between containers sometimes takes from a few seconds to a few tens of seconds. We might be able to improve that, but in a future where files become blobs in an external store, the problem will get worse, so patching around it is not appealing.
This PR does a couple of things. First, it adds modeling to track the files associated with a submission. For now, it records each file's name, its creation time, and whether it was generated. A file that was not generated is assumed to have been uploaded. This lets us rely on the database, rather than on what a given container happens to see on disk, to tell us which files need moving, which better guarantees data consistency.
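To illustrate the idea of letting tracked records, not the filesystem, decide what needs moving, here is a minimal sketch in plain Python. It is not the model added by this PR: `SubmissionFileRecord`, `files_to_move`, and the `draft-example-00.*` filenames are hypothetical stand-ins, and a dataclass takes the place of a database row.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class SubmissionFileRecord:
    """Hypothetical stand-in for the database row that tracks one file."""

    filename: str
    generated: bool  # False means the file is assumed to have been uploaded
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def files_to_move(records, already_moved):
    """Return the recorded filenames that have not been moved yet, sorted."""
    return sorted({r.filename for r in records} - set(already_moved))


# Illustrative data only: one uploaded file and two generated renderings.
records = [
    SubmissionFileRecord("draft-example-00.xml", generated=False),
    SubmissionFileRecord("draft-example-00.txt", generated=True),
    SubmissionFileRecord("draft-example-00.html", generated=True),
]
pending = files_to_move(records, already_moved={"draft-example-00.xml"})
```

Because the pending list comes from the records, a file that has not yet synced to the local filesystem still counts as outstanding instead of being silently skipped.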
It also moves the chore of relocating files from the staging path to the draft repository into an asynchronous celery task. The task is tolerant of missing files and retries, moving what it can see, until all the expected files have been moved. With the parameters chosen, it should usually finish in 5-15 seconds but will try for up to 2-3 minutes before giving up. (The ranges are approximate because retry jitter is enabled.)
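The retry behavior described above can be sketched roughly as follows. This is a plain-Python approximation, not the celery task from this PR: the function names, the `max_attempts` and `base_delay` values, and the jitter formula (a random delay up to the exponential backoff value) are all assumptions chosen for illustration.

```python
import random
import shutil
import tempfile
import time
from pathlib import Path


def move_visible_files(expected, staging, repository):
    """Move every expected file visible in staging; return the names still missing."""
    missing = []
    for name in expected:
        if (repository / name).exists():
            continue  # already moved on an earlier attempt
        source = staging / name
        if source.exists():
            shutil.move(str(source), str(repository / name))
        else:
            missing.append(name)
    return missing


def move_with_retry(expected, staging, repository,
                    max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry with exponential backoff and jitter until nothing is missing."""
    for attempt in range(max_attempts):
        if not move_visible_files(expected, staging, repository):
            return True
        if attempt < max_attempts - 1:
            # Jitter: wait a random time up to the exponential backoff value.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    return False


# Demo: the second file only "arrives" in staging between attempts.
with tempfile.TemporaryDirectory() as tmp:
    staging, repository = Path(tmp, "staging"), Path(tmp, "repository")
    staging.mkdir()
    repository.mkdir()
    (staging / "draft-example-00.xml").write_text("<rfc/>")

    def late_sync(_delay):
        # Stand-in for a real sleep that simulates a slow filesystem sync.
        (staging / "draft-example-00.txt").write_text("text rendering")

    ok = move_with_retry(
        ["draft-example-00.xml", "draft-example-00.txt"],
        staging, repository, sleep=late_sync,
    )
    moved = sorted(p.name for p in repository.iterdir())
```

Each attempt moves whatever is visible and skips files already in the repository, so repeated attempts are safe and the loop ends as soon as every expected file has been moved.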
Fixes #8016