
Conversation

@kikisdeliveryservice
Contributor

KUBELET_LOG_LEVEL was bumped to level 4 to aid the node team in
debugging an issue (see #1672).
However, customers upgrading to 4.5 in production are seeing this bump
cause extra GBs of log data to be saved.

Closes: BZ 1895385
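
For context, here is a minimal sketch of how the verbosity is wired on a node and how an admin could spot-check it after an upgrade. The unit path and the exact lines shown are illustrative assumptions, not quoted from the actual MCO template; oc debug and grep are standard tooling.

    # Hypothetical spot-check on a 4.5 node; the unit path and the line
    # contents below are assumptions for illustration only.
    oc debug node/<node-name> -- chroot /host \
      grep KUBELET_LOG_LEVEL /etc/systemd/system/kubelet.service
    # Expected shape of the matching lines before this change:
    #   Environment="KUBELET_LOG_LEVEL=4"
    #   ... --v=${KUBELET_LOG_LEVEL} ...

This PR simply lowers that default from 4 back to 3.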

openshift-ci-robot added the bugzilla/severity-urgent (Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting) and bugzilla/valid-bug (Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting) labels on Nov 9, 2020
@openshift-ci-robot
Contributor

@kikisdeliveryservice: This pull request references Bugzilla bug 1895385, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST)

In response to this:

Bug 1895385: Drop kubelet logging back down to level 3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files) on Nov 9, 2020
@kikisdeliveryservice
Contributor Author

Also, there is some question of whether 3 is still too high, or whether 3 is the level needed for proper debugging of the kubelet.

@mrobson

mrobson commented Nov 9, 2020

From a large production cluster point of view, 3 is still too high for everyday running in production. OCP 3.x ran at log level 2.

Scott Worthington did some sampling of the issue on 4.5.17 in a cluster with 3 masters and 9 workers:
  • loglevel=2 for masters and workers: 14,000 documents per 5 minutes in ES
  • loglevel=3 for masters and workers: 42,000 documents per 5 minutes in ES
  • loglevel=4 for masters and workers: 126,000 documents per 5 minutes in ES

There is also a big difference in the level of detail and verbosity of each log line at level 4 versus level 2, which has additional storage implications.
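
To put those sampled rates in perspective, a rough back-of-the-envelope extrapolation to a full day, assuming the 5-minute rates stay constant (an assumption; real workloads fluctuate, but the ratios are the point):

    # 288 five-minute windows per day (12 per hour * 24 hours)
    echo $(( 14000  * 288 ))   # level 2: ~4.0M  documents/day
    echo $(( 42000  * 288 ))   # level 3: ~12.1M documents/day
    echo $(( 126000 * 288 ))   # level 4: ~36.3M documents/day

So each step up in log level roughly triples the ES document volume on this cluster.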

@kikisdeliveryservice
Contributor Author

kikisdeliveryservice commented Nov 9, 2020

> From a large production cluster point of view, 3 is still too high for everyday running in production. OCP 3.x ran at log level 2.
>
> Scott Worthington did some sampling of the issue on 4.5.17 in a cluster with 3 masters and 9 workers:
>   • loglevel=2 for masters and workers: 14,000 documents per 5 minutes in ES
>   • loglevel=3 for masters and workers: 42,000 documents per 5 minutes in ES
>   • loglevel=4 for masters and workers: 126,000 documents per 5 minutes in ES
>
> There is also a big difference in the level of detail and verbosity of each log line at level 4 versus level 2, which has additional storage implications.

I'm comfortable dropping down to 3, as that was the previous default in OCP 4.x. We can see whether 2 makes sense to the node team; we do need sufficient logging for troubleshooting clusters, and dropping down to 2 may make that harder from a support perspective.

Overall, I'd rather just drop down to 3 now to fix the BZ and let the node team decide later, after gathering info, whether they want to drop down to 2.

@rphillips
Contributor

/lgtm

openshift-ci-robot added the lgtm label (Indicates that a PR is ready to be merged) on Nov 9, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kikisdeliveryservice, rphillips

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

3 similar comments

@kikisdeliveryservice
Contributor Author

/skip

@kikisdeliveryservice
Contributor Author

/cherry-pick release-4.6

@openshift-cherrypick-robot

@kikisdeliveryservice: once the present PR merges, I will cherry-pick it on top of release-4.6 in a new PR and assign it to you.

In response to this:

/cherry-pick release-4.6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-merge-robot
Contributor

openshift-merge-robot commented Nov 10, 2020

@kikisdeliveryservice: The following test failed, say /retest to rerun all failed tests:

Test name: ci/prow/e2e-ovn-step-registry
Commit: 78417c6
Rerun command: /test e2e-ovn-step-registry

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-merge-robot merged commit da75bdf into openshift:master on Nov 10, 2020
@openshift-ci-robot
Contributor

@kikisdeliveryservice: All pull requests linked via external trackers have merged:

Bugzilla bug 1895385 has been moved to the MODIFIED state.

In response to this:

Bug 1895385: Drop kubelet logging back down to level 3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@kikisdeliveryservice: new pull request created: #2213

In response to this:

/cherry-pick release-4.6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
