vendor/k8s.io/kubernetes/test/e2e/upgrades/apps/job: List Pods in failure message #23161
|
@wking: This pull request references a valid Bugzilla bug.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: wking. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files: Approvers can indicate their approval by writing |
|
/hold
Only to debug why this might be failing; we would merge to master first, before this (expect to get data from test runs). |
…lure message
Currently, this test can fail with the not-very-helpful [1,2]:
fail [k8s.io/kubernetes/test/e2e/upgrades/apps/job.go:58]: Expected
<bool>: false
to be true
Since this test is the only CheckForAllJobPodsRunning consumer, and
has been since CheckForAllJobPodsRunning landed in 116eda0
(Implements an upgrade test for Job, 2017-02-22, #41271), this commit
refactors the function to EnsureJobPodsRunning, dropping the opaque
boolean, and constructing a useful error summarizing the divergence
from the expected parallelism and the status of listed Pods.
Thanks to Maciej Szulik for the fixups [3] :).
Backports kubernetes/kubernetes@96b04bfeac
(test/e2e/upgrades/apps/job: List Pods in failure message, 2019-05-09,
kubernetes/kubernetes#77716).
[1]: https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1434/build-log.txt
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1708454#c0
[3]: wking/kubernetes#1
---
d44ff45 to 795c7cb
|
Fixed the compilation error with d44ff45 -> 795c7cb. |
|
@wking: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
Hey, first run :). So that's "no Pods", not "Pods waiting to be scheduled and just not running yet". Dunno what to do about that :/ |
|
This failure was caused by problems with node: Let's keep re-testing it. |
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/23161/pull-ci-openshift-origin-release-4.1-e2e-aws-upgrade/51/artifacts/e2e-aws-upgrade/nodes.json | jq -r '.items[] | .conditions = ([.status.conditions[] | {key: .type, value: .}] | from_entries) | .conditions.Ready.lastTransitionTime + " " + .conditions.Ready.status + " " + .metadata.name' | sort
2019-06-13T17:35:31Z True ip-10-0-160-149.ec2.internal
2019-06-13T17:40:33Z True ip-10-0-169-80.ec2.internal
2019-06-13T17:41:09Z True ip-10-0-130-244.ec2.internal
2019-06-13T18:13:38Z True ip-10-0-138-20.ec2.internal
2019-06-13T18:15:01Z True ip-10-0-150-197.ec2.internal
2019-06-13T18:15:20Z Unknown ip-10-0-148-26.ec2.internal
Looks like that node was just gone, with no sign of return by that cluster-teardown log collection. |
|
I can't find evidence of the pods being created. There are no pods from a |
|
Follow-up in openshift/machine-config-operator#855 |
|
In this case, kubelet ip-10-0-169-80 had two scheduled job pods at 18:12 and there are more kubelet logs going back for a considerable time showing they were available. grep the worker's journals for The namespace wasn't removed until 18:11 or so. |
|
@wking: No Bugzilla bug is referenced in the title of this pull request.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
we have a better patch in 4.2 and more jobs. So I'm going to close this. |
Currently, this test can fail with the not-very-helpful:
This pull request backports kubernetes/kubernetes@96b04bfeac (kubernetes/kubernetes#77716) to get a more useful error message.
The backport didn't apply cleanly. I think I made the appropriate adjustments, but give it a careful eyeball to make sure ;).
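For reviewers who want the shape of the change without opening the upstream diff, here is a minimal sketch of the idea: return an error that summarizes running-versus-expected Pods and lists each Pod's status, instead of returning an opaque boolean. This is not the backported code. `PodInfo`, `ensureJobPodsRunning`, and the message wording are hypothetical stand-ins, and the real function lists Pods through the Kubernetes client rather than taking a slice.

```go
package main

import (
	"fmt"
	"strings"
)

// PodInfo is a hypothetical stand-in for the few Pod fields this check needs.
// The real test works with Pods listed from the cluster via the Kubernetes client.
type PodInfo struct {
	Name  string
	Phase string // for example "Pending", "Running", or "Failed"
}

// ensureJobPodsRunning returns nil when exactly parallelism Pods are Running.
// Otherwise it returns an error that names every listed Pod and its phase,
// so a failure distinguishes "no Pods at all" from "Pods exist but are not running yet".
func ensureJobPodsRunning(pods []PodInfo, parallelism int) error {
	running := 0
	summaries := make([]string, 0, len(pods))
	for _, p := range pods {
		if p.Phase == "Running" {
			running++
		}
		summaries = append(summaries, fmt.Sprintf("%s (%s)", p.Name, p.Phase))
	}
	if running == parallelism {
		return nil
	}
	if len(summaries) == 0 {
		return fmt.Errorf("job has 0 of %d expected running pods: no pods listed", parallelism)
	}
	return fmt.Errorf("job has %d of %d expected running pods: %s",
		running, parallelism, strings.Join(summaries, ", "))
}
```

A caller would then assert on the error (for example, fail the test with `err` as the message) rather than on a bare `true`/`false`, which is what produced the unhelpful "Expected false to be true" output.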