@DirectXMan12
Contributor

The check for no endpoints got inverted at some point during the PR,
causing health checks to be enabled only when a service was idled,
instead of the other way around. This fixes that.

Fixes bug 1366180
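
For readers skimming the thread, the inverted condition described above can be sketched in Go roughly as follows. This is only an illustration: `healthChecksEnabled` and its `endpointCount` parameter are hypothetical names, not taken from the actual change, and the real check may live in a different form entirely.

```go
package main

import "fmt"

// healthChecksEnabled is a hypothetical helper (not from the PR itself).
// It reports whether backends of a service should be health-checked:
// only services that currently have endpoints, i.e. are not idled.
func healthChecksEnabled(endpointCount int) bool {
	// The regression amounted to testing "endpointCount == 0" here,
	// which turned health checks on only for idled services.
	return endpointCount > 0
}

func main() {
	fmt.Println("idled service (0 endpoints):", healthChecksEnabled(0))
	fmt.Println("active service (2 endpoints):", healthChecksEnabled(2))
}
```

With the condition the right way around, an idled service gets no health checks, so nothing should touch it until real traffic arrives.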

@smarterclayton
Contributor

Since this regressed the product, please add an integration or e2e test that validates this is correct so it does not break in the future.

@smarterclayton smarterclayton added this to the 1.3.0 milestone Aug 23, 2016
@DirectXMan12
Contributor Author

cc @knobunc

@DirectXMan12
Contributor Author

[test]

@DirectXMan12 DirectXMan12 force-pushed the bug/reencrypt-unidling-fixed branch from 7d3dd11 to e8454b6 Compare August 23, 2016 17:27
@DirectXMan12
Contributor Author

DirectXMan12 commented Aug 23, 2016

Alright, I had an extended test which could potentially have been a bit flaky, but I've changed it so that it should not be (the default health check interval is 5 seconds, and I've upped the "consistently" time period to 20 seconds). However, I do not think that extended test was running -- I just looked through a Jenkins result, and only saw networking and conformance extended tests being run.

@smarterclayton do we not run most of the extended tests? If that's the case, should I add a new section to the e2e test, or what?

@smarterclayton
Contributor

smarterclayton commented Aug 23, 2016 via email

@knobunc
Contributor

knobunc commented Aug 23, 2016

@smarterclayton Define fast? This is testing a negative. So the test has to sleep to make sure something doesn't happen within a reasonable amount of time.

@DirectXMan12
Contributor Author

@smarterclayton these tests are somewhat necessarily slow (about 43s per test for the "idling only" ones on my machine) because we have to wait for endpoints to come up (which may take a few seconds), and then wait at least 10s (currently 20s to be safe) to make sure we've gone through what would be at least one health check interval (5s) if the health checks were running, checking that no pods come up in that time. I'm assuming anything reported as slow by the test runner is too slow to be marked as conformance, right?

@knobunc
Contributor

knobunc commented Aug 23, 2016

The change LGTM.

@smarterclayton
Contributor

43 seconds is not too slow. The test runner doesn't know what slow is really. Slow would be 2 minutes probably.

@DirectXMan12
Contributor Author

@smarterclayton ack, I'll stick conformance on the appropriate ones.

@DirectXMan12 DirectXMan12 force-pushed the bug/reencrypt-unidling-fixed branch from e8454b6 to be3ba8c Compare August 23, 2016 18:21
@DirectXMan12
Contributor Author

I've marked the entire idling test suite (~5-6m total) as "[Conformance]". If that's too much, I can just mark a couple of tests (which would be ~1m30s) that cover this case and basic unidling instead.

@DirectXMan12
Contributor Author

[test]

@smarterclayton
Contributor

your budget is 2 minutes

@DirectXMan12
Contributor Author

ack, I'll just make that two then

@DirectXMan12 DirectXMan12 force-pushed the bug/reencrypt-unidling-fixed branch from be3ba8c to b08af4a Compare August 23, 2016 18:44
@DirectXMan12
Contributor Author

alright, the basic idling with DC and basic unidling with TCP are now marked as conformance

@DirectXMan12
Contributor Author

[test]

@openshift-bot
Contributor

Evaluated for origin test up to b08af4a

@smarterclayton
Contributor

LGTM [merge]

@openshift-bot
Contributor

openshift-bot commented Aug 23, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/8383/) (Image: devenv-rhel7_4914)

@openshift-bot
Contributor

Evaluated for origin merge up to b08af4a

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/8373/)

@DirectXMan12
Contributor Author

Looks like a flake on the DC deployment conformance tests:

should run a deployment to completion and then scale to zero
Started deployment #4\nError from server: The get operation against ReplicationController could not be completed at this time, please try again.

@smarterclayton
Contributor

Please link the appropriate flake issue.


@openshift-bot openshift-bot merged commit 6fd783f into openshift:master Aug 24, 2016
@DirectXMan12 DirectXMan12 deleted the bug/reencrypt-unidling-fixed branch August 24, 2016 14:12