Bug 1961925: UPSTREAM: <carry>: Does not prevent pod creation because of no nodes reason when it runs under the regular cluster #756
Conversation
@cynepco3hahue: This pull request references Bugzilla bug 1961925, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug.
Requesting review from QA contact.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@cynepco3hahue: the contents of this pull request could not be automatically validated. The following commits could not be validated and must be approved by a top-level approver:
dhellmann left a comment:
The logic looks right. I'm not familiar with the informer pattern, so I'll leave that for the API team to comment on. I have one suggestion about the wording for the warning annotation, but that's not a blocker to approving this.
Thanks!
Suggested change:
-	pod.Annotations[workloadAdmissionWarning] = "only single node clusters are supported"
+	pod.Annotations[workloadAdmissionWarning] = "only single-node clusters support workload partitioning"
LGTM
I'd like a measure of how long it takes until the first pod is created, before and after this change. Our install is likely to be sensitive to this. It should be fast, but I'd like to know how quickly this resource is available.
Will it be enough to check how long the installation takes with and without the PR?
@deads2k I checked the install time on GCP for two jobs that include the PR: the first was ~29m and the second ~31m. I also checked the install time on GCP for jobs without the PR (ten jobs), and it was always within the 28-32 minute interval.
/approve
Structure looks ok. I'd like this measurement to be sure we aren't trading one problem for another (#756 (comment)) before the hold is released. If it is more than one minute, let's discuss before merge. If it's less than one minute, simply recording here how much longer it is when you release the hold is sufficient.
/retest
Force-pushed ae1f6d7 to 59a40f5
/retest

1 similar comment

/retest
/hold cancel
Don't mix vendor changes with code changes. It makes rebases harder. Make it two commits.
Force-pushed 59a40f5 to 093d51d
What does nil mean? When can it be nil?
This races. You can't write unprotected state.
Why do you store this value at all? It comes from a lister, i.e. it has O(1) lookup since there is only one object, so there is no need to cache the value. Get rid of it, including the race.
…reason when it runs under the regular cluster

Check the `cluster` infrastructure resource status to be sure that we run on top of an SNO cluster; if the pod runs on top of a regular cluster, exit before the node existence check.

Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
Force-pushed 093d51d to 727f445
Code under review:

	}
	nodes, err := a.nodeLister.List(labels.Everything())
	clusterInfra, err := a.infraConfigLister.Get(infraClusterName)
Can one delete the infra resource? Would that brick the cluster?
@dhellmann @browsell Is it expected that an admin can delete the infrastructure object? What additional components in the cluster rely on it?
@sttts if it's deletable, it's a bug. Admission should be coded to prevent deletion of config.openshift.io objects.
Lots of things rely on that resource. It's entirely possible that someone could delete it. There is also a period of time during bootstrapping where it won't exist yet. So, yes, we need to cope with it not existing and assume a default. Unfortunately, the default won't work for single node because it won't enable partitioning.
That race condition makes me think we need something other than an API resource to turn the feature on, since we need all partitioning annotations processed the same way from the beginning of the life of the cluster. I'm not sure what options we have. Elsewhere I would say use an environment variable or a config file. Are those options in the API server, @sttts & @deads2k?
Nevermind, I misunderstood some things about the ordering during bootstrapping. It should be safe to assume the infrastructure resource exists when it's safe to create regular pods through the API.
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
Force-pushed 727f445 to 374f6f0
/retest
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull request has been approved by: cynepco3hahue, deads2k, sttts. The full list of commands accepted by this bot can be found here. The pull request process is described here.
/retest
Please review the full test history for this PR and help us cut down flakes.

1 similar comment

/retest
Please review the full test history for this PR and help us cut down flakes.
/retest
@cynepco3hahue: The following test failed, say
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@cynepco3hahue: All pull requests linked via external trackers have merged: Bugzilla bug 1961925 has been moved to the MODIFIED state.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Check the `cluster` infrastructure resource status to be sure that we run on top of an SNO cluster; if the pod runs on top of a regular cluster, exit before the node existence check.

Signed-off-by: Artyom Lukianov <alukiano@redhat.com>