Skip to content

Conversation

@enj
Copy link
Contributor

@enj enj commented Apr 2, 2019

This change updates the operator deployment to include the operand
version as an environment variable. Since we encode the operator
deployment RV into the operand deployment, we correctly gate on
changing the version until the new version has rolled out.

The VersionForOperand function uses a config map with a version to
image mapping to determine the operand version. It does this by
reversing the mapping (image to version). This has a serious flaw
in that it assumes that all versions point to distinct images
(which is not the case since an image may not change during a
release). Since map iteration order is undefined, this can lead to
the operand version flapping back and forth as the reversed map
changes based on which image value was written last.

Remove version-mapping config map as it is no longer used.

All environment variable names and values are now stored as consts
and vars, respectively (the values are effectively runtime consts as
they cannot change once the operator is running).

Signed-off-by: Monis Khan [email protected]

@openshift/sig-auth
/assign @sallyom

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 2, 2019
@enj
Copy link
Contributor Author

enj commented Apr 2, 2019

/hold

Just so I can carefully review CI output.

@openshift-ci-robot openshift-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Apr 2, 2019
@enj enj changed the title Correctly determine operand version Bug 1695244: Correctly determine operand version Apr 2, 2019
@enj
Copy link
Contributor Author

enj commented Apr 2, 2019

Looks promising https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-authentication-operator/103/pull-ci-openshift-cluster-authentication-operator-master-e2e-aws-operator/394:

{
    "apiVersion": "config.openshift.io/v1",
    "kind": "ClusterOperator",
    "metadata": {
        "creationTimestamp": "2019-04-02T20:31:10Z",
        "generation": 1,
        "name": "authentication",
        "resourceVersion": "13942",
        "selfLink": "/apis/config.openshift.io/v1/clusteroperators/authentication",
        "uid": "40860c64-5586-11e9-b56e-0ec73cf8fb1c"
    },
    "spec": {},
    "status": {
        "conditions": [ ... ],
        "versions": [
            {
                "name": "integrated-oauth-server",
                "version": "0.0.1-2019-04-02-200340_openshift"
            },
            {
                "name": "operator",
                "version": "0.0.1-2019-04-02-200340"
            }
        ]
    }
}

@enj
Copy link
Contributor Author

enj commented Apr 2, 2019

{
    "apiVersion": "extensions/v1beta1",
    "kind": "Deployment",
    "metadata": {
        "annotations": {
            "deployment.kubernetes.io/revision": "1"
        },
        "creationTimestamp": "2019-04-02T20:30:36Z",
        "generation": 1,
        "labels": {
            "app": "origin-cluster-authentication-operator"
        },
        "name": "openshift-authentication-operator",
        "namespace": "openshift-authentication-operator",
        "resourceVersion": "9211",
        "selfLink": "/apis/extensions/v1beta1/namespaces/openshift-authentication-operator/deployments/openshift-authentication-operator",
        "uid": "2c124a6e-5586-11e9-be1d-1282a1ebf0a8"
    },
    "spec": {
        ...
            },
            "spec": {
                "containers": [
                    {
                        "args": [
                            "--config=/var/run/configmaps/config/operator-config.yaml",
                            "-v=2"
                        ],
                        "command": [
                            "authentication-operator",
                            "operator"
                        ],
                        "env": [
                            {
                                "name": "IMAGE",
                                "value": "registry.svc.ci.openshift.org/ci-op-w32ivklf/stable@sha256:cafaf6b1c6ab2c454f10dedb862bff72d75af81baba475ca384b0fc117c9e10d"
                            },
                            {
                                "name": "OPERATOR_IMAGE_VERSION",
                                "value": "0.0.1-2019-04-02-200340"
                            },
                            {
                                "name": "OPERAND_IMAGE_VERSION",
                                "value": "0.0.1-2019-04-02-200340_openshift"
                            },
                            {
                                "name": "POD_NAME",
                                "valueFrom": {
                                    "fieldRef": {
                                        "apiVersion": "v1",
                                        "fieldPath": "metadata.name"
                                    }
                                }
                            }
                        ],
                        "image": "registry.svc.ci.openshift.org/ci-op-w32ivklf/stable@sha256:762c2470d9ec8d1da0cf42d6f0ecf74eec31eed66a9c5a8c10d0bfd8139a44f6",
                        "imagePullPolicy": "IfNotPresent",
                        "name": "operator",
                        "resources": {
                            "requests": {
                                "cpu": "10m",
                                "memory": "50Mi"
                            }
                        },
                        "terminationMessagePath": "/dev/termination-log",
                        "terminationMessagePolicy": "FallbackToLogsOnError",
                        "volumeMounts": [
                            {
                                "mountPath": "/var/run/configmaps/config",
                                "name": "config"
                            }
                        ]
                    }
                ],
                ...
            }
        }
    }
}

@enj
Copy link
Contributor Author

enj commented Apr 2, 2019

/hold cancel

OPERAND_IMAGE_VERSION is new to this PR so I am pretty sure this is working.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 2, 2019
@sallyom
Copy link
Contributor

sallyom commented Apr 2, 2019

/test e2e-aws

@sallyom
Copy link
Contributor

sallyom commented Apr 2, 2019

i tested an upgrade w/ this PR, it completed successfully and:

$ oc get clusteroperator authentication -o yaml | grep -A 4 versions
  versions:
  - name: integrated-oauth-server
    version: 0.0.1-2019-04-02-211727_openshift
  - name: operator
    version: 0.0.1-2019-04-02-211727

looks good

@sallyom
Copy link
Contributor

sallyom commented Apr 2, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 2, 2019
@enj
Copy link
Contributor Author

enj commented Apr 2, 2019

/retest

5 similar comments
@enj
Copy link
Contributor Author

enj commented Apr 3, 2019

/retest

@enj
Copy link
Contributor Author

enj commented Apr 3, 2019

/retest

@enj
Copy link
Contributor Author

enj commented Apr 3, 2019

/retest

@enj
Copy link
Contributor Author

enj commented Apr 3, 2019

/retest

@stlaz
Copy link
Contributor

stlaz commented Apr 3, 2019

/retest

@wking
Copy link
Member

wking commented Apr 3, 2019

Wait, what? e2e-aws:

Failing tests:

[k8s.io] Kubelet when scheduling a busybox Pod with hostAliases should write entries to /etc/hosts [NodeConformance] [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]
[sig-storage] In-tree Volumes [Driver: aws] [Testpattern: Dynamic PV (block volmode)] volumeMode should create sc, pod, pv, and pvc, read/write to the pv, and delete all created resources [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] In-tree Volumes [Driver: nfs] [Testpattern: Pre-provisioned PV (block volmode)] volumeMode should fail to create pod by failing to mount volume [Suite:openshift/conformance/parallel] [Suite:k8s]

I thought openshift/origin#22454 and openshift/origin/pull/22455 were supposed to get those ignored?

@wking
Copy link
Member

wking commented Apr 3, 2019

Maybe try bumping your commit timestamp or something to get a new commit hash to get a new namespace in the CI cluster? You'd need a new /lgtm, but it might unstick whatever's hung up about the e2e-aws suite here.

@stlaz stlaz force-pushed the enj/i/operand_version branch from 57706f0 to b0f9c00 Compare April 3, 2019 07:35
@openshift-ci-robot openshift-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 3, 2019
This change updates the operator deployment to include the operand
version as an environment variable.  Since we encode the operator
deployment RV into the operand deployment, we correctly gate on
changing the version until the new version has rolled out.

The VersionForOperand function uses a config map with a version to
image mapping to determine the operand version.  It does this by
reversing the mapping (image to version).  This has a serious flaw
in that it assumes that all versions point to distinct images
(which is not the case since an image may not change during a
release).  Since map iteration order is undefined, this can lead to
the operand version flapping back and forth as the reversed map
changes based on which image value was written last.

Remove version-mapping config map as it is no longer used.

All environment variable names and values are now stored as consts
and vars, respectively (the values are effectively runtime consts as
they cannot change once the operator is running).

Signed-off-by: Monis Khan <[email protected]>
@stlaz stlaz force-pushed the enj/i/operand_version branch from b0f9c00 to 914a446 Compare April 3, 2019 07:36
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 3, 2019
@stlaz
Copy link
Contributor

stlaz commented Apr 3, 2019

/lgtm
I did a rebase of the PR, lets see about the tests now.

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 3, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enj, sallyom, stlaz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit e458a2d into openshift:master Apr 3, 2019

operatorVersion = os.Getenv(operatorVersionEnvName)

apiserverURL = os.Getenv(apiHostEnvName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be handled in the cmd package along with other configuration.

osinOperandName = "integrated-oauth-server"

operatorSelfName = "operator"
osinOperandName = "integrated-oauth-server"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename osin references while we're at it? It's probably not clear what it is for ppl coming to the codebase post 4.0. oauthServerDeploymentName or whatever would be more descriptive anyways.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants