Modify Agent to reduce frequency of Pods getting UnexpectedAdmissionError #556

kate-goldenring
Contributor

What this PR does / why we need it:

Previously, we cleared a device usage slot when a node tried to reallocate it. We did this for fairness reasons, but it meant ignoring kubelet as the source of truth and causing irritating UnexpectedAdmissionErrors. Furthermore, fairness of reallocating slots should already be ensured by SlotReconciliation, which should be clearing unused slots more promptly, reducing the frequency of this scenario. Also, for local unshared devices, there is no reason to free the device up for other nodes.
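To illustrate the intended behavior, here is a minimal sketch, not the Agent's actual code. It assumes device usage is a map from slot name to the holding node's name, with an empty string meaning the slot is free. `try_reserve_slot` and `SlotOutcome` are hypothetical names used only for this example.

```rust
use std::collections::HashMap;

/// Hypothetical outcome of a kubelet request for a device usage slot.
#[derive(Debug, PartialEq)]
pub enum SlotOutcome {
    /// The slot was free and is now reserved for the requesting node.
    Reserved,
    /// The slot was already held by the requesting node. The request is
    /// honored as-is instead of clearing the slot, treating kubelet as the
    /// source of truth.
    AlreadyHeldByRequester,
    /// The slot is held by a different node, so the request cannot be served.
    HeldByOtherNode(String),
    /// The slot name is not part of this instance's device usage map.
    UnknownSlot,
}

/// Hypothetical helper. Assumes `device_usage` maps slot name -> node name,
/// where an empty string marks a free slot.
pub fn try_reserve_slot(
    device_usage: &mut HashMap<String, String>,
    slot: &str,
    node: &str,
) -> SlotOutcome {
    // Copy the current holder out so the map can be mutated afterwards.
    let holder = match device_usage.get(slot) {
        Some(h) => h.clone(),
        None => return SlotOutcome::UnknownSlot,
    };

    if holder.is_empty() {
        // Free slot: reserve it for the requesting node.
        device_usage.insert(slot.to_string(), node.to_string());
        SlotOutcome::Reserved
    } else if holder == node {
        // Reallocation by the same node: keep the slot, do not clear it.
        SlotOutcome::AlreadyHeldByRequester
    } else {
        // Another node owns the slot: leave the map untouched.
        SlotOutcome::HeldByOtherNode(holder)
    }
}
```

In this sketch, a repeated request from the same node leaves the map unchanged and succeeds, which is the case that previously led to a cleared slot and an UnexpectedAdmissionError.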

fixes #450

@bfjelds and @romoh would be great to get quick eyes on this small but significant change.

Special notes for your reviewer:

If applicable:

  • this PR has an associated PR with documentation in akri-docs
  • this PR contains unit tests
  • added code adheres to standard Rust formatting (cargo fmt)
  • code builds properly (cargo build)
  • code is free of common mistakes (cargo clippy)
  • all Akri tests succeed (cargo test)
  • inline documentation builds (cargo doc)
  • all commits pass the DCO bot check by being signed off -- see the failing DCO check for instructions on how to retroactively sign commits

@kate-goldenring changed the title from "Allow reallocating instance device_usage slots to a node" to "Modify Agent to reduce frequency of Pods getting UnexpectedAdmissionError" on Feb 17, 2023
@kate-goldenring
Contributor Author

working on failed tests in #559

@adithyaj
Collaborator

adithyaj commented Feb 23, 2023

> working on failed tests in #559

merged #559, so if we pull in the changes from the main branch it should be good to go - all the errors are related to the crictl like you mentioned

@kate-goldenring force-pushed the reduce-unexpected-admissions-errors branch from 56ea7b4 to f1f5be9 on February 23, 2023 16:25
@kate-goldenring
Contributor Author

/version patch

@github-actions bot added the version/patch label (Patch version change is needed) on Feb 23, 2023
@kate-goldenring force-pushed the reduce-unexpected-admissions-errors branch from 5ec29fb to bf6d538 on February 23, 2023 22:27
@yujinkim-msft merged commit bfd0fe6 into project-akri:main on Apr 4, 2023