
Handle recovery from resize failure #73036

Open · gnufied opened this issue Jan 17, 2019 · 10 comments

Labels: kind/bug · lifecycle/frozen · sig/storage

Comments

@gnufied (Member) commented Jan 17, 2019

Currently, when resizing a volume fails, the resize keeps retrying indefinitely. This leads to two problems:

  1. Unnecessary API usage for an action that is not going to succeed (for example, when you are out of quota on the cloud provider or out of bricks on GlusterFS).

  2. Sometimes a user may want to retry volume expansion with a lower value. For example: my current PVC is 12GB and I tried to expand it to 40GB, but that failed. Now I want to retry the expansion with 20GB, except I can't (a concrete sketch of this attempt follows below).

/sig storage

cc @bswartz @saad-ali
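
For concreteness, a minimal sketch (not part of the original report) of what "retry with a smaller value" looks like against the API using a recent client-go; the namespace `default`, the PVC name `my-pvc`, and the sizes are placeholders. With the current apiserver validation, any decrease of `spec.resources.requests.storage` is rejected, so this patch fails, which is exactly the limitation described in point 2:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the default kubeconfig (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// The PVC was 12Gi, an expansion to 40Gi failed, and we now try to lower
	// the request to 20Gi. "default" and "my-pvc" are hypothetical names.
	patch := []byte(`{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}`)
	_, err = client.CoreV1().PersistentVolumeClaims("default").Patch(
		context.TODO(), "my-pvc", types.StrategicMergePatchType, patch, metav1.PatchOptions{})
	if err != nil {
		// Expected with current validation: the apiserver forbids lowering
		// the requested storage below its previous value.
		fmt.Println("patch rejected:", err)
	}
}
```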

@gnufied added the kind/bug label Jan 17, 2019
@k8s-ci-robot added the sig/storage label Jan 17, 2019
@bswartz (Contributor) commented Jan 17, 2019

I honestly don't see (1) as a problem. This is how everything in Kubernetes works.

(2) is the case that interests me. As long as the actual size doesn't change, it should be legal to "cancel" the resize request by changing the spec size back to the old value, or, as you say, to retry the resize with a smaller increment.

I think we also want to think about future compatibility with a volume-shrink feature. It's inevitable that users will ask for it, and it's inevitable that at least a subset of implementers will want to implement it. Assuming we allow that feature to happen, we'll want the interface to work straightforwardly.
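
A minimal sketch of the relaxed rule suggested above (my reading of the comment, not project code; `decreaseAllowed` and its parameters are hypothetical names): lowering the requested size would be accepted as long as it stays at or above the size the volume actually has, so it only ever cancels a pending expansion and never implies a shrink.

```go
package validation

import "k8s.io/apimachinery/pkg/api/resource"

// decreaseAllowed reports whether changing a PVC's requested storage from
// oldSpec to newSpec is acceptable, given the size the volume actually has
// (actualCapacity, i.e. status.capacity). Hypothetical helper, not the real
// apiserver validation.
func decreaseAllowed(oldSpec, newSpec, actualCapacity resource.Quantity) bool {
	if newSpec.Cmp(oldSpec) >= 0 {
		// Growing (or keeping) the request is always allowed.
		return true
	}
	// Shrinking the request only cancels a pending expansion as long as it
	// does not drop below what the volume already is.
	return newSpec.Cmp(actualCapacity) >= 0
}
```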

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Apr 17, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label May 17, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot (Contributor)

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gnufied (Member, Author) commented Jul 18, 2019

/reopen
/remove-lifecycle-rotten
/lifecycle frozen

@k8s-ci-robot (Contributor)

@gnufied: Reopened this issue.

In response to this:

/reopen
/remove-lifecycle-rotten
/lifecycle frozen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot reopened this Jul 18, 2019
@k8s-ci-robot added the lifecycle/frozen label and removed the lifecycle/rotten label Jul 18, 2019
@andyzhangx (Member)

@gnufied do you know how to cancel a volume resize? E.g. the current size is 4TB and a resize to 6TB failed; how can I change it back to 4TB?

@FengJunLiu

@gnufied https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1790-recover-resize-failure#recovery-from-volume-expansion-failure
This KEP describes the drawback of not being able to restore a PVC to its original capacity when expansion fails. Is there a plan to address this drawback in the future? Thanks.

@gnufied moved this to In progress in Volume expansion GA Jun 3, 2024