Sidecar doesn't properly handle best effort upload and races with entrypoint. #21167
Comments
@cjwagner: Please ensure the request meets the requirements listed here. If this request no longer meets these requirements, the label can be removed.
I would like to work on this issue.
I think that's how it behaved when I originally wrote it @cjwagner lol. Perhaps that's been broken since. Unfortunately, this component is hard to e2e test.
@viveksyngh sure thing - shoot me an /assign and I'll review when you are ready
That would be one way to mitigate this problem and it would probably be sufficient most of the time, but I don't think it is as robust as doing a best effort upload then continuing to wait and try the real upload. The double upload seems strictly better since it ensures that we upload something even if the main upload takes too long to complete.
/assign
I have opened a draft PR, #21644, to address this. It adds a field to DecorationConfig that allows configuring IgnoreInterrupts for sidecar. Please let me know if I am moving in the right direction here.
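For illustration only, here is a rough Go sketch of how such a field could be threaded through to sidecar. It is not taken from the PR: `decorationConfig`, `sidecarOptions`, and `sidecarOptionsFor` are hypothetical, trimmed-down stand-ins, and the field name simply mirrors the IgnoreInterrupts option mentioned above. A pointer is assumed so that "unset" can be told apart from an explicit `false`.

```go
package main

import "fmt"

// decorationConfig is a hypothetical stand-in for Prow's DecorationConfig.
// Only the field under discussion is shown.
type decorationConfig struct {
	// IgnoreInterrupts is an assumed field; nil means "not configured".
	IgnoreInterrupts *bool
}

// sidecarOptions is a hypothetical stand-in for sidecar's options struct.
type sidecarOptions struct {
	IgnoreInterrupts bool
}

// sidecarOptionsFor builds sidecar options from the decoration config.
// When the field is unset, the option stays false, which matches the
// hard-coded behavior described in the issue.
func sidecarOptionsFor(dc *decorationConfig) sidecarOptions {
	opts := sidecarOptions{}
	if dc != nil && dc.IgnoreInterrupts != nil {
		opts.IgnoreInterrupts = *dc.IgnoreInterrupts
	}
	return opts
}

func main() {
	enabled := true
	fmt.Println(sidecarOptionsFor(nil).IgnoreInterrupts)
	fmt.Println(sidecarOptionsFor(&decorationConfig{IgnoreInterrupts: &enabled}).IgnoreInterrupts)
}
```

Keeping the default at false would leave existing jobs unchanged unless they opt in.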
/milestone v1.22
We've recently had multiple presubmits report failure in spyglass and finished.json even though they were actually aborted due to more recent commits being pushed. The ProwJob resource disagrees with finished.json and properly indicates the aborted state.
I think we are encountering this race:
test-infra/prow/sidecar/run.go
Lines 90 to 95 in 2a5710e
The IgnoreInterrupts option is designed to help mitigate this race when used in conjunction with an appropriate grace period timeout:
test-infra/prow/sidecar/options.go
Lines 50 to 77 in 2a5710e
However, the option is always set to false, and we don't have a way to configure it:
test-infra/prow/pod-utils/decorate/podspec.go
Line 723 in 276f55e
I think the comment describing the race is misleading and the logic could be improved here. In particular, we don't actually try to perform the upload twice if an interrupt is received; we just immediately begin the upload and then terminate. I would expect sidecar to begin the upload immediately, then check/wait for the marker files to be written, and re-upload once the markers are present.
Based on the comment this sounds like what was actually intended.
/help