pkg/controller: start informers before populating store in tests #457

runcom · 2019-02-19T12:25:09Z

Signed-off-by: Antonio Murdaca <[email protected]>

- What I did

The indexer in the informer uses a store that manages objects under a read-write lock. The flakes occur because, if we don't sync the store before adding objects, listing objects can race against an add. I verified this by adding a bunch of debug logs during the investigation: lister.List can return an empty result even though we added objects to the indexer before starting the informers themselves.

Fixes #417
Fixes #444
Fixes #451
Fixes #449

I can no longer reproduce the flakes with this patch (the tests have been running in a loop for 3 hours now). Without the patch, I can reproduce them consistently by running go test -race.

- How to verify it

true && while [ $? -eq 0 ]; do GOCACHE=off go test -race -v ./cmd/... ./pkg/... ./lib/...; done

Without this patch, the command above fails as soon as a flake makes a test error out. With the patch, the loop keeps running indefinitely.

- Description for the changelog

Signed-off-by: Antonio Murdaca <[email protected]>

runcom · 2019-02-19T12:36:19Z

CI glitch...

could not wait for build: the build machine-config-server failed after 6m6s with reason PullBuilderImageFailed: Failed pulling builder image.

Pulling image docker-registry.default.svc:5000/ci-op-v6y05...8fe8db228e015b2bd5ec62da99e0505f8fd6dbc7ce842e2153680dc ...

Pulling image registry.svc.ci.openshift.org/openshift/release:golang-1.10 ...
error: build error: failed to pull image: Get https://regi...t canceled (Client.Timeout exceeded while awaiting headers)

/retest

runcom · 2019-02-19T13:00:14Z

still glitch #457 (comment)

runcom · 2019-02-19T13:01:58Z

/retest

runcom · 2019-02-19T13:05:41Z

CI errors aside (they are not unit test failures), this fixes the flakes: my tests have not failed once in more than 5 hours.

runcom · 2019-02-19T13:10:59Z

the glitch is a CI issue

runcom · 2019-02-19T14:07:14Z

/retest

runcom · 2019-02-19T14:48:51Z

aws limit

/retest

cgwalters · 2019-02-19T15:31:48Z

LGTM, handing off for another final review/merge:

/assign kikisdeliveryservice

rphillips · 2019-02-19T15:46:24Z

lgtm (deferring to @kikisdeliveryservice for review). Nice find

LorbusChris · 2019-02-19T15:46:31Z

/lgtm

openshift-ci-robot · 2019-02-19T15:47:03Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LorbusChris, runcom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [runcom]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

LorbusChris · 2019-02-19T16:11:02Z

/test e2e-aws

openshift-bot · 2019-02-19T17:39:59Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2019-02-19T21:45:10Z

/retest

Please review the full test history for this PR and help us cut down flakes.

kikisdeliveryservice · 2019-02-19T23:06:42Z

HAProxy and other known flakes.

/test e2e-aws

abhinavdahiya · 2019-02-20T00:15:51Z

known flake.

fail [github.com/openshift/origin/test/extended/bootstrap_user/bootstrap_user_login.go:52]: Expected error:
    <*util.ExitError | 0xc422561740>: {
        Cmd: "oc login --config=/tmp/configfile916071508 --namespace=e2e-test-bootstrap-login-cpctm -u kubeadmin -p 5LpcuRoMeuNntNkAusTRa269NuAbQoU9ptn4aloHbp4",
        StdErr: "Error from server (InternalError): Internal error occurred: unexpected response: 400",
        ExitError: {
            ProcessState: {
                pid: 11753,
                status: 256,
                rusage: {
                    Utime: {Sec: 0, Usec: 268025},
                    Stime: {Sec: 0, Usec: 66749},
                    Maxrss: 94440,
                    Ixrss: 0,
                    Idrss: 0,
                    Isrss: 0,
                    Minflt: 15028,
                    Majflt: 0,
                    Nswap: 0,
                    Inblock: 0,
                    Oublock: 0,
                    Msgsnd: 0,
                    Msgrcv: 0,
                    Nsignals: 0,
                    Nvcsw: 1198,
                    Nivcsw: 27,
                },
            },
            Stderr: nil,
        },
    }
    exit status 1
not to have occurred

openshift/origin#22088 is moving it to flake

kikisdeliveryservice · 2019-02-20T00:37:53Z

Thanks for the update on that test, @abhinavdahiya !

openshift-bot · 2019-02-20T01:47:37Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2019-02-20T04:00:06Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2019-02-20T05:49:08Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2019-02-20T07:49:56Z

/retest

Please review the full test history for this PR and help us cut down flakes.

runcom · 2019-02-20T08:27:29Z

/retest

runcom · 2019-02-20T08:33:57Z

/retest

openshift-bot · 2019-02-20T09:50:59Z

/retest

Please review the full test history for this PR and help us cut down flakes.

runcom · 2019-02-20T10:40:47Z

CI errors tracked in slack

/retest

openshift-bot · 2019-02-20T11:52:01Z

/retest

Please review the full test history for this PR and help us cut down flakes.

runcom · 2019-02-20T14:40:36Z

/retest

This patch does several related things:

- start the informers for controllers before running them, following what https://github.com/kubernetes/sample-controller/blob/master/main.go#L64-L71 does, to avoid races already spotted in unit tests (openshift#457)
- move the clients builder code from cmd/common to a new package under the new internal/ folder so that nobody but us can use it
- move common controllers code under pkg/controller/common so it can be reused
- create an e2e-only clientset so we don't have to type the client version every time (e.g. client.CoreV1()...). This is needed because, if the API changes in the future, we won't have to spend a week grepping to replace old APIs.

Signed-off-by: Antonio Murdaca <[email protected]>
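As a rough illustration of the clientset idea in that commit message, the sketch below shows one way a wrapper could hide the versioned accessors by embedding them. This is an assumption about a possible shape, not the code from the commit: `CoreV1Interface`, `MachineConfigV1Interface`, the fake implementations, and `ClientSet` are all hypothetical stand-ins written with the standard library only.

```go
package main

import "fmt"

// CoreV1Interface is a hypothetical stand-in for a versioned core client.
type CoreV1Interface interface {
	Pods(namespace string) string
}

// MachineConfigV1Interface is a hypothetical stand-in for a versioned
// machine-config client.
type MachineConfigV1Interface interface {
	MachineConfigs() string
}

// fakeCoreV1 and fakeMCV1 are toy implementations used only for this sketch.
type fakeCoreV1 struct{}

func (fakeCoreV1) Pods(namespace string) string { return "pods in " + namespace }

type fakeMCV1 struct{}

func (fakeMCV1) MachineConfigs() string { return "machineconfigs" }

// ClientSet embeds the versioned interfaces, so their methods are promoted
// and callers write cs.Pods(...) instead of cs.CoreV1().Pods(...).
// If an API version changes, only this type needs to be updated.
type ClientSet struct {
	CoreV1Interface
	MachineConfigV1Interface
}

// NewClientSet wires the toy implementations into the wrapper.
func NewClientSet() *ClientSet {
	return &ClientSet{
		CoreV1Interface:          fakeCoreV1{},
		MachineConfigV1Interface: fakeMCV1{},
	}
}

func main() {
	cs := NewClientSet()
	fmt.Println(cs.Pods("default"))
	fmt.Println(cs.MachineConfigs())
}
```

The design choice here is that the version appears in exactly one place (the embedded field types), so a future API bump would not require touching every call site.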

pkg/controller: start informers before populating store in tests

2953dd9

Signed-off-by: Antonio Murdaca <[email protected]>

openshift-ci-robot requested review from abhinavdahiya and ashcrow February 19, 2019 12:25

openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 19, 2019

openshift-ci-robot assigned kikisdeliveryservice Feb 19, 2019

openshift-ci-robot assigned LorbusChris Feb 19, 2019

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 19, 2019

runcom mentioned this pull request Feb 19, 2019

pkg/operator: use fake clients in unit #452

Merged

umohnani8 mentioned this pull request Feb 20, 2019

Add extra filter for checking if registries have changed #461

Merged

openshift-merge-robot merged commit 53923bc into openshift:master Feb 20, 2019

runcom deleted the flakes-fixes branch February 20, 2019 16:14

runcom mentioned this pull request Feb 23, 2019

controllers: refactor code and start informers first, fix template/render race #482

Merged


9 participants
