@rphillips
Contributor

@rphillips rphillips commented Nov 30, 2020

- What I did
Takes Ben's suggestion and refactors the Kubelet log level into a systemd environment file.

Obsoletes: #2261

This also changes the default log level to 2, because at level 3 stale logs from dead pods are printed into the kubelet log: BZ1903290.

Allows the logLevel to be configured on the KubeletConfig CRD:

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-kubelet-loglevel
spec:
  logLevel: 2
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ""
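
As a rough illustration of the refactor described above, the following Go sketch renders a systemd environment file line from an optional logLevel value, falling back to the new default of 2. The helper name renderKubeletLogLevelEnv and the exact file format are assumptions made for illustration; only KUBELET_LOG_LEVEL and the logLevel field come from this PR, and the actual implementation in the repository may differ.

```go
package main

import (
	"fmt"
)

// defaultKubeletLogLevel mirrors the new default proposed in this PR.
const defaultKubeletLogLevel = 2

// renderKubeletLogLevelEnv is a hypothetical helper (not taken from the MCO
// code base). It returns the contents of a systemd environment file that
// sets KUBELET_LOG_LEVEL. A nil logLevel means the field was not set on the
// KubeletConfig, so the default is used.
func renderKubeletLogLevelEnv(logLevel *int32) string {
	level := int32(defaultKubeletLogLevel)
	if logLevel != nil {
		level = *logLevel
	}
	return fmt.Sprintf("KUBELET_LOG_LEVEL=%d\n", level)
}

func main() {
	// No logLevel in the spec: falls back to the default.
	fmt.Print(renderKubeletLogLevelEnv(nil)) // KUBELET_LOG_LEVEL=2

	// logLevel: 4 in the spec.
	four := int32(4)
	fmt.Print(renderKubeletLogLevelEnv(&four)) // KUBELET_LOG_LEVEL=4
}
```

Using a pointer for the level keeps "not set" distinguishable from an explicit 0, which is one common way to model optional CRD fields in Go.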

- How to verify it

- Description for the changelog

@sjenning
Contributor

/cc @darkmuggle

@darkmuggle darkmuggle left a comment

Other than indentation, this looks good.

@kikisdeliveryservice
Contributor

VPC limit exceeded error; reported in 4-dev.

@kikisdeliveryservice
Contributor

/retest

@darkmuggle

/approve
But I'll leave the LGTM to a member of the MCO team.

@rphillips
Contributor Author

/retest

@kikisdeliveryservice
Contributor

AWS/GCP are having issues across the board today; I would like to see passing runs before approving.

@rphillips changed the title from "kubelet: refactor KUBELET_LOG_LEVEL into systemd environment file" to "Bug 1903290: kubelet: refactor KUBELET_LOG_LEVEL into systemd environment file" on Dec 2, 2020
@openshift-ci-robot
Contributor

@rphillips: This pull request references Bugzilla bug 1903290, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
Details

In response to this:

Bug 1903290: kubelet: refactor KUBELET_LOG_LEVEL into systemd environment file

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot added the bugzilla/severity-medium and bugzilla/valid-bug labels Dec 2, 2020
Contributor

I thought level 2 was not sufficient for debugging?

Contributor Author

The problem is that level 3 logs dead pod logs into the kubelet log. Instead of chasing all level 3 logs, I'm proposing just setting it to level 2. This new approach with a kubelog.conf would perhaps allow us to manage the kubelet log level within the KubeletConfig.

Log Line
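
To illustrate why lowering the level drops these lines: klog-style verbosity emits a message only when the message's level is less than or equal to the configured level. The sketch below is a toy model of that rule, not the klog API; the type and function names and the sample messages are hypothetical.

```go
package main

import "fmt"

// verbosity is a toy stand-in for a klog-style verbosity threshold.
type verbosity int

// enabled reports whether a message tagged with msgLevel would be emitted
// at this verbosity: the message level must not exceed the threshold.
func (v verbosity) enabled(msgLevel int) bool {
	return msgLevel <= int(v)
}

// message pairs a log line with the verbosity level it is logged at.
type message struct {
	level int
	text  string
}

// emitted returns the texts of all messages that pass the threshold.
func emitted(v verbosity, msgs []message) []string {
	var out []string
	for _, m := range msgs {
		if v.enabled(m.level) {
			out = append(out, m.text)
		}
	}
	return out
}

func main() {
	msgs := []message{
		{2, "pod started"},
		{3, "stale log line from a dead pod"},
		{4, "detailed sync trace"},
	}
	fmt.Println(len(emitted(2, msgs))) // 1: the level 3 line is suppressed
	fmt.Println(len(emitted(3, msgs))) // 2: the level 3 line is printed
}
```

The trade-off raised in the review follows directly: at level 2 every level 3 message is dropped, including any that would have been useful for debugging.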

Contributor Author

I'm open to changing the log line to be level 4, but it seems like whack-a-mole.

Contributor

If there's consensus between the experts (i.e. you and @sjenning) that level 2 works, I'm OK with it; I just want to make sure this isn't going to hurt us later.

@sjenning wdyt?

Contributor

@kikisdeliveryservice I'm on board with this

@openshift-ci-robot
Contributor

@rphillips: This pull request references Bugzilla bug 1903290, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
Details

In response to this:

Bug 1903290: kubelet: refactor KUBELET_LOG_LEVEL into systemd environment file

@kikisdeliveryservice
Contributor

Since CI is still underwater, could you update https://github.com/openshift/machine-config-operator/blob/master/docs/KubeletConfigDesign.md in the meantime?

@sjenning changed the title from "Bug 1903290: kubelet: refactor KUBELET_LOG_LEVEL into systemd environment file" to "Bug 1903290: kubelet: refactor KUBELET_LOG_LEVEL into KubeletConfig field" on Dec 2, 2020
@kikisdeliveryservice
Contributor

The unit and verify failures don't look like flakes.

@rphillips
Contributor Author

Updated to fix the unit tests and to support setting only the logLevel.

@openshift-ci-robot
Contributor

@rphillips: This pull request references Bugzilla bug 1903290, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
Details

In response to this:

Bug 1903290: kubelet: refactor KUBELET_LOG_LEVEL into KubeletConfig field

@rphillips
Contributor Author

VPC limit.

/retest

@kikisdeliveryservice
Contributor

@rphillips brought this up again in 4-dev; no AWS resolution yet ☹️

@openshift-ci-robot added the lgtm label Dec 3, 2020
@kikisdeliveryservice
Contributor

/skip

Contributor

@kikisdeliveryservice kikisdeliveryservice left a comment

Tested, and it seems to work OK: I applied a KubeletConfig, then changed it, and the kubelet logging seemed to change appropriately.

@openshift-ci-robot added the approved label Dec 3, 2020
@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@sjenning
Contributor

sjenning commented Dec 4, 2020

Still hitting VPC limit issues 😞

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@kikisdeliveryservice
Contributor

Since Seth found an issue, I think we should fix it first and then merge.

/hold

@openshift-ci-robot added the do-not-merge/hold label Dec 4, 2020
@openshift-ci-robot removed the lgtm label Dec 4, 2020
@rphillips
Contributor Author

/hold cancel

Fixed the issue.

@openshift-ci-robot removed the do-not-merge/hold label Dec 4, 2020
@sjenning
Contributor

sjenning commented Dec 4, 2020

Trying it out now. Will lgtm when verified.

@sjenning
Contributor

sjenning commented Dec 4, 2020

/lgtm

@openshift-ci-robot added the lgtm label Dec 4, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: darkmuggle, kikisdeliveryservice, rphillips, sjenning

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [kikisdeliveryservice]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@kikisdeliveryservice
Contributor

/skip

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot merged commit 19b904d into openshift:master Dec 5, 2020
@openshift-ci-robot
Contributor

@rphillips: All pull requests linked via external trackers have merged:

Bugzilla bug 1903290 has been moved to the MODIFIED state.

Details

In response to this:

Bug 1903290: kubelet: refactor KUBELET_LOG_LEVEL into KubeletConfig field

@rphillips
Contributor Author

/cherry-pick release-4.6

@openshift-cherrypick-robot

@rphillips: #2262 failed to apply on top of branch "release-4.6":

Applying: kubelet: refactor KUBELET_LOG_LEVEL into systemd environment file
Using index info to reconstruct a base tree...
M	install/0000_80_machine-config-operator_01_kubeletconfig.crd.yaml
M	pkg/apis/machineconfiguration.openshift.io/v1/types.go
M	pkg/apis/machineconfiguration.openshift.io/v1/zz_generated.deepcopy.go
M	pkg/controller/kubelet-config/helpers.go
M	pkg/controller/kubelet-config/kubelet_config_controller.go
M	pkg/controller/kubelet-config/kubelet_config_controller_test.go
A	templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
A	templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
Falling back to patching base and 3-way merge...
Auto-merging pkg/controller/kubelet-config/kubelet_config_controller_test.go
Auto-merging pkg/controller/kubelet-config/kubelet_config_controller.go
CONFLICT (content): Merge conflict in pkg/controller/kubelet-config/kubelet_config_controller.go
Auto-merging pkg/controller/kubelet-config/helpers.go
Auto-merging pkg/apis/machineconfiguration.openshift.io/v1/zz_generated.deepcopy.go
Auto-merging pkg/apis/machineconfiguration.openshift.io/v1/types.go
Auto-merging install/0000_80_machine-config-operator_01_kubeletconfig.crd.yaml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 kubelet: refactor KUBELET_LOG_LEVEL into systemd environment file
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Details

In response to this:

/cherry-pick release-4.6

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-medium: Referenced Bugzilla bug's severity is medium for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.

8 participants