bulk project deletion causes long controller response curve #7166

timothysc · 2016-02-09T21:25:00Z

The response for bulk operations with 1 < n < 100 is almost immediate, but once 100 < n < 1000 the response becomes very slow, and cluster cleanup takes half an hour on average. I'm not exactly certain which controller on the OpenShift side is responsible for project deletion on state change...

oc delete projects -l purpose=test

Eventual consistency is reached, but it's very slow.

/cc @abhgupta @jeremyeder @danmcp @derekwaynecarr

liggitt · 2016-02-09T21:48:27Z

how many projects is that operating on? what's the split between client time (how long does the oc command take to run) and server processing before projects transition from "Terminated" to actually being gone?

derekwaynecarr · 2016-02-09T22:54:04Z

Relevant upstream PR:
kubernetes/kubernetes#20076

It allows us to run multiple workers in Kubernetes namespace clean-up, and
it moves all the resource deletion (except for pods and services) to
deleteCollection calls.

derekwaynecarr · 2016-02-09T22:55:16Z

We would need to make corresponding changes in the origin namespace controller to use deleteCollection calls and adopt the worker pattern.

derekwaynecarr · 2016-02-09T22:59:18Z

What was in the projects?

timothysc · 2016-02-10T02:32:50Z

> how many projects is that operating on?

1000 takes > 1/2 hour.

> what's the split between client time (how long does the oc command take to run) and server processing before projects transition from "Terminated" to actually being gone?

Total time for the oc command is less than a minute, and the transition to "Terminating" is pretty fast. Draining itself is very slow.

> What was in the projects?

Several deploymentConfig objects, each with a bunch of RCs etc., as part of a vertical scaling test.

derekwaynecarr · 2016-02-10T20:20:44Z

So the upstream PR that merged today may improve the time, but we would also need to move the downstream controller to follow a similar pattern. I can look to get a PR together to adopt a queue for openshift, but I am not sure if I should move to use delete_collection calls in OpenShift now or not...

derekwaynecarr · 2016-02-10T20:34:55Z

On further thought, I would prefer we measure after the rebase that includes the earlier mentioned PR to understand if we need further improvements since ultimately, I would like to move to a single shared controller in both Origin and Kube that uses the Discovery API.

pweil- · 2016-02-18T16:09:49Z

@timothysc is this something you can remeasure for us now?

timothysc · 2016-02-18T17:55:06Z

Yes, we'll re-eval.

liggitt · 2016-03-16T19:57:22Z

rebase just pulled in deletecollection changes for upstream types

timothysc · 2016-03-18T13:43:55Z

We'll be deploying early next week and will report back.

timothysc · 2016-05-12T14:22:40Z

So I'm going to close down this issue now. There is still a long decay on termination for bulk deletions, but I believe there are other things now in the pipe to handle this for a -force case.

danmcp added the priority/P1 label Feb 9, 2016

danmcp assigned pweil- Feb 9, 2016

pweil- mentioned this issue Feb 10, 2016

Finalize Kube items 3.2 #6766

Closed

85 tasks

pweil- assigned timothysc and unassigned pweil- Feb 18, 2016

smarterclayton modified the milestone: 1.2.0 Feb 20, 2016

danmcp added priority/P2 and removed priority/P1 labels Feb 26, 2016

timothysc closed this as completed May 12, 2016
