UPSTREAM: 23894: OOM errors when processes exit rapidly #8412

smarterclayton · 2016-04-08T01:17:06Z

This is on the bubble for 1.2 but I wanted to see if it helps clear up our failures

smarterclayton · 2016-04-08T15:34:25Z

[test]

ncdc · 2016-04-08T23:51:00Z

@smarterclayton upstream PR was updated and hopefully will fix the type issues in all the right places (by removing the type assertions).

smarterclayton · 2016-04-11T15:29:47Z

Updated against upstream

ncdc · 2016-04-11T17:03:35Z

Conformance tests failed, again in the update-demo test that scales an RC, but the failure text doesn't appear to be the same. Still tracking it down. I did notice that the docker.log captured in the Jenkins artifacts isn't complete. For example, the failing test created its first problematic container at 12:04, but docker.log starts at 12:08 😢

smarterclayton · 2016-04-11T17:10:52Z

#8441 is the other failure.

smarterclayton · 2016-04-11T17:11:11Z

You can extend the docker log time. Going up to 30m is probably fine.

ncdc · 2016-04-11T17:12:13Z

How do we do that?

I'll spin up a rhel7 vm in ec2 to try to repro manually.

smarterclayton · 2016-04-11T17:13:55Z

It's in the test failure trap where we shut down the server - we grab the docker logs from the journal in hack/test-end-to-end-docker.sh

smarterclayton · 2016-04-11T18:24:46Z

[test]

smarterclayton · 2016-04-11T18:57:51Z

Updated

ncdc · 2016-04-11T19:22:10Z

Because we run tests in parallel, each test's namespace needs to be added to the various SCCs to ensure upstream e2es can pass against OpenShift's security model. It looks like that code was resulting in each namespace stomping on the other namespaces such that only a single e2e namespace at a time was ever a member of the various SCCs.

#8465 should fix this issue.
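One way this kind of stomping can happen is sketched below in Go. This is an illustration only, not the code from #8465: the `scc` type, both helper functions, the namespace names, and the use of the `system:serviceaccounts:<namespace>` group are assumptions made for the example.

```go
package main

import "fmt"

// scc is a simplified, hypothetical stand-in for a security context
// constraint; only the Groups field matters for this illustration.
type scc struct {
	Name   string
	Groups []string
}

// stompGroup shows the suspected failure mode: it overwrites the group
// list, so only the most recently added namespace stays a member.
func stompGroup(s *scc, group string) {
	s.Groups = []string{group}
}

// addGroup appends the group only if it is missing, which preserves the
// members added by other tests running in parallel.
func addGroup(s *scc, group string) {
	for _, g := range s.Groups {
		if g == group {
			return
		}
	}
	s.Groups = append(s.Groups, group)
}

func main() {
	// Hypothetical e2e namespaces created by parallel tests.
	namespaces := []string{"e2e-tests-a", "e2e-tests-b", "e2e-tests-c"}

	buggy := &scc{Name: "privileged"}
	fixed := &scc{Name: "privileged"}
	for _, ns := range namespaces {
		group := "system:serviceaccounts:" + ns
		stompGroup(buggy, group)
		addGroup(fixed, group)
	}

	// Number of member groups left in each variant.
	fmt.Println(len(buggy.Groups), len(fixed.Groups))
}
```

With three namespaces, the overwriting variant ends with a single member group while the appending variant keeps all three. A real fix also has to cope with concurrent updates against the API server, which this in-memory sketch does not model.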

smarterclayton · 2016-04-11T20:10:02Z

@jwforres @spadgett Bindata failure

spadgett · 2016-04-11T20:13:34Z

@smarterclayton looking at it

spadgett · 2016-04-11T20:16:15Z

@jwforres new font-awesome update today is breaking us

smarterclayton · 2016-04-11T22:20:06Z

[test]

smarterclayton · 2016-04-12T00:05:31Z

Flaked #8399 [test]

On Mon, Apr 11, 2016 at 7:50 PM, OpenShift Bot wrote:

> continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/2907/)

openshift-bot · 2016-04-12T00:10:17Z

Evaluated for origin test up to 9ec799d

openshift-bot · 2016-04-12T01:35:17Z

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/2913/)

smarterclayton · 2016-04-12T02:15:36Z

Flaked on

Apr 11 20:39:02.217: INFO: Error running &{/data/src/github.com/openshift/origin/_output/local/bin/linux/amd64/oc [oc create --namespace=extended-test-scoped-router-27a6z-0lea8 --config=/tmp/openshift-extended-tests/extended-test-scoped-router-27a6z-0lea8-user.kubeconfig -f /data/src/github.com/openshift/origin/test/extended/fixtures/scoped-router.yaml] []   Error from server: User "extended-test-scoped-router-27a6z-0lea8-user" cannot create pods in project "extended-test-scoped-router-27a6z-0lea8"

smarterclayton · 2016-04-12T02:15:48Z

Have not seen OOMs reoccur - [merge]

smarterclayton · 2016-04-12T02:16:25Z

I think that flake is an extended flake w.r.t. the policy cache falling behind. Not sure though - @deads2k?

openshift-bot · 2016-04-12T02:20:14Z

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_requests_origin/5568/) (Image: devenv-rhel7_3953)

openshift-bot · 2016-04-12T02:20:14Z

Evaluated for origin merge up to 9ec799d

deads2k · 2016-04-12T11:39:13Z

> I think that flake is an extended flake w.r.t. the policy cache falling behind. Not sure though - @deads2k?

It's likely. We have a method WaitForPolicyUpdate to avoid that problem during our integration tests.

ncdc · 2016-04-12T13:05:31Z

@pecameron this already merged. No need to re-test.

deads2k · 2016-04-12T17:16:14Z

> @pecameron this already merged. No need to re-test.

Man, he meant it too. :)

smarterclayton force-pushed the 23894 branch from 612f4e6 to 26fc091 Compare April 8, 2016 02:40

smarterclayton added the priority/P0 label Apr 8, 2016

smarterclayton mentioned this pull request Apr 11, 2016

Flake in extended test suite (should scale a replication contoller) #8441

Closed

smarterclayton force-pushed the 23894 branch from 26fc091 to 28c366a Compare April 11, 2016 15:29

smarterclayton mentioned this pull request Apr 11, 2016

Add flag to disable dynamic provisioning #8426

Merged

UPSTREAM: 23894: Should not fail containers on OOM score adjust

9ec799d

smarterclayton force-pushed the 23894 branch from 28c366a to 9ec799d Compare April 11, 2016 18:57

spadgett mentioned this pull request Apr 11, 2016

Fix bindata diff for new font-awesome release #8467

Merged

ncdc mentioned this pull request Apr 11, 2016

Flake in kubectl e2e test #8397

Closed

openshift-bot merged commit 5aa33d1 into openshift:master Apr 12, 2016

ncdc mentioned this pull request Apr 12, 2016

FailedSync: Error syncing pod, skipping failed to apply oom-score-adj to container #8466

Closed

liggitt mentioned this pull request Apr 16, 2016

Finalize Kube items 3.2 #6766

Closed

85 tasks
