Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix code to handle delayed binding volumes #73863

Merged
merged 2 commits into from
Apr 1, 2019

Conversation

gnufied
Copy link
Member

@gnufied gnufied commented Feb 8, 2019

This fixes counting of volumes that are not immediate binded.

fixes #73724

/sig storage

Count PVCs that are unbound towards attach limit

@k8s-ci-robot k8s-ci-robot added needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. sig/storage Categorizes an issue or PR as relevant to SIG Storage. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 8, 2019
@msau42
Copy link
Member

msau42 commented Feb 8, 2019

/assign

@gnufied
Copy link
Member Author

gnufied commented Feb 11, 2019

/retest

@bsalamat
Copy link
Member

Please follow instructions here to post scheduler benchmarks before and after this PR.

@gnufied
Copy link
Member Author

gnufied commented Feb 11, 2019

/kind bug

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Feb 11, 2019
@gnufied
Copy link
Member Author

gnufied commented Feb 11, 2019

Something is wrong with my setup of scheduler integration tests:

I0211 15:18:29.083115   22229 retry.go:149] clientv3/auth-retry: error "context canceled" on pinned endpoint "127.0.0.1:2379"                                                                                       
E0211 15:18:29.083160   22229 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}                                                                     
E0211 15:18:29.083204   22229 writers.go:172] apiserver was unable to write a JSON response: http: Handler timeout                                                                                                  
I0211 15:18:29.083304   22229 wrap.go:47] GET /apis/rbac.authorization.k8s.io/v1/clusterroles: (1m0.000470325s) 504                                                                                                 
goroutine 1560647 [running]:                      
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).recordStatus(0xc01b901c70, 0x1f8)                                                                                                        
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog/httplog.go:204 +0xd2                                                       
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).WriteHeader(0xc01b901c70, 0x1f8)                                                                                                         
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog/httplog.go:183 +0x35                                                       
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*baseTimeoutWriter).timeout(0xc03fab6aa0, 0xc020a2ef30)                                                                                               
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:218 +0xb5                                                       
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP(0xc04d746040, 0x5db9ce0, 0xc01b901c70, 0xc01482fa00)                                                                       
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:118 +0x2aa                                                      
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithWaitGroup.func1(0x5db9ce0, 0xc01b901c70, 0xc01482f900)                                                                                             
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/waitgroup.go:47 +0xd6                                                      
net/http.HandlerFunc.ServeHTTP(0xc04d774b10, 0x5db9ce0, 0xc01b901c70, 0xc01482f900)                
        /data/go/src/net/http/server.go:1964 +0x44                                                                                  
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithRequestInfo.func1(0x5db9ce0, 0xc01b901c70, 0xc01482f800)                                                                                        
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/requestinfo.go:39 +0x217                                                
net/http.HandlerFunc.ServeHTTP(0xc04d774b40, 0x5db9ce0, 0xc01b901c70, 0xc01482f800)  
        /data/go/src/net/http/server.go:1964 +0x44
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.withPanicRecovery.func1(0x5db9ce0, 0xc01b901c70, 0xc01482f800)                                                                                         
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/wrap.go:46 +0x127                                                          
net/http.HandlerFunc.ServeHTTP(0xc04d746060, 0x5dbada0, 0xc033987420, 0xc01482f800)
        /data/go/src/net/http/server.go:1964 +0x44 
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server.(*APIServerHandler).ServeHTTP(0xc04d774b70, 0x5dbada0, 0xc033987420, 0xc01482f800)                                                                             
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/handler.go:189 +0x51                                                               
k8s.io/kubernetes/test/integration/util.StartApiserver.func1(0x5dbada0, 0xc033987420, 0xc01482f800)
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/util/util.go:44 +0x72                                                                                
net/http.HandlerFunc.ServeHTTP(0xc01b95c690, 0x5dbada0, 0xc033987420, 0xc01482f800)                                                            
        /data/go/src/net/http/server.go:1964 +0x44                                                                
net/http.serverHandler.ServeHTTP(0xc01d1a8000, 0x5dbada0, 0xc033987420, 0xc01482f800)                              
        /data/go/src/net/http/server.go:2741 +0xab
net/http.(*conn).serve(0xc013255220, 0x5dc4ea0, 0xc04f058980)                                               
        /data/go/src/net/http/server.go:1847 +0x646                                                                                                          
created by net/http.(*Server).Serve                                                                        
        /data/go/src/net/http/server.go:2851 +0x2f5                                                                                                          
                                                                                                                     
logging error output: "{\"metadata\":{},\"status\":\"Failure\",\"message\":\"Timeout: request did not complete within 1m0s\",\"reason\":\"Timeout\",\"details\":{},\"code\":504}\n"                                 
 [scheduler_perf.test/v0.0.0 (linux/amd64) kubernetes/$Format 127.0.0.1:55076]                                                               
E0211 15:18:29.083507   22229 master_utils.go:234] last health content: ""                                                                                    
F0211 15:18:29.083530   22229 master_utils.go:235] timed out waiting for the condition                                 
E0211 15:18:29.440434   22229 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}                                                                
W0211 15:18:29.083562   22229 storage_scheduling.go:95] unable to get PriorityClass system-node-critical: the server was unable to return a response in the time allotted, but may still be processing the request (get priorityclasses.scheduling.k8s.io system-node-critical). Retrying...
exit status 255                                                                                                             
FAIL    k8s.io/kubernetes/test/integration/scheduler_perf       203.972s                         

@bsalamat fwiw - this PR should not introduce any regression in scheduler performance because all it does is - it ensures we are counting unbound but scheduled PVCs towards volume limit.

@gnufied
Copy link
Member Author

gnufied commented Feb 11, 2019

@bsalamat so I ran../test-performance.sh which gave me..

Before:

Scheduled 3000 Pods in 3 seconds (1000 per second on average). min QPS was 207

After:

Scheduled 3000 Pods in 3 seconds (1000 per second on average). min QPS was 206

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Feb 14, 2019
@gnufied
Copy link
Member Author

gnufied commented Feb 15, 2019

@msau42 addressed review comments PTAL.

@gnufied gnufied force-pushed the fix-attach-limit-delayed branch from ce3e1d0 to 2385eaf Compare February 15, 2019 20:45
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 15, 2019
@gnufied gnufied force-pushed the fix-attach-limit-delayed branch from 2385eaf to d1fada4 Compare February 22, 2019 03:24
@childsb childsb added this to the v1.14 milestone Mar 6, 2019
@nikopen
Copy link
Contributor

nikopen commented Mar 12, 2019

@childsb @gnufied @msau42
we're in code freeze, release is in 2 weeks from now. Is this bug urgent and release-blocking?

@spiffxp
Copy link
Member

spiffxp commented Mar 13, 2019

/milestone clear
v1.14 release lead here, given the lack of activity on this PR, I'm removing this from the milestone. Please discuss in #sig-release in slack if it's urgent this land as part of v1.14

@k8s-ci-robot k8s-ci-robot removed this from the v1.14 milestone Mar 13, 2019
@gnufied gnufied force-pushed the fix-attach-limit-delayed branch from d1fada4 to a3a93c5 Compare March 14, 2019 21:07
@gnufied
Copy link
Member Author

gnufied commented Mar 14, 2019

I guess we can merge this after code freeze.

@msau42 addressed review comments. PTAL.

@msau42
Copy link
Member

msau42 commented Mar 26, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 26, 2019
@spiffxp
Copy link
Member

spiffxp commented Mar 26, 2019

/priority important-soon

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 26, 2019
Copy link
Member

@bsalamat bsalamat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, gnufied

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 1, 2019
@k8s-ci-robot k8s-ci-robot merged commit 76b1d83 into kubernetes:master Apr 1, 2019
@matthias50
Copy link
Contributor

@gnufied , It looks like earliest release where this is present is 1.15. Can this be back-ported to 1.14 and 1.13?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Volume attach limit doesn't work for WaitForFirstConsumer Volume binding mode
8 participants