Skip to content

Review registration log traces for Provisioner#757

Merged
metal3-io-bot merged 1 commit intometal3-io:masterfrom
andfasano:registration-log-and-fix
Mar 24, 2021
Merged

Review registration log traces for Provisioner#757
metal3-io-bot merged 1 commit intometal3-io:masterfrom
andfasano:registration-log-and-fix

Conversation

@andfasano
Copy link
Copy Markdown
Member

@andfasano andfasano commented Jan 5, 2021

This PR introduce a debug logger for the provisioner, to avoid printing too many traces in the log during the steady cases.

@metal3-io-bot metal3-io-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 5, 2021
@dhellmann
Copy link
Copy Markdown
Member

How easy/hard is it to change the verbosity level of the log output to debug a problem? I've usually found that it's better to log a little more, and throw info away in the log aggregation tool, but there's not a hard rule about where that line is.

Copy link
Copy Markdown
Member

@zaneb zaneb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does logging at verbosity level 5 mean in practice?
I see we have a -dev flag to "enable developer logging", but no way to set the verbosity level explicitly AFAICT.


info.host.ClearError()
// Avoid to clean up other kind of errors
if info.host.Status.ErrorType == metal3v1alpha1.RegistrationError {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably also always clear the error if registeredNewCreds is true. WDYT?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an interesting point. I assumed that the error was set only here during the registration. In case of credentials change the ValidateManagementAccess can fail in the update, but in such case the actions gets simply re-enqueued (409) or an actionError is returned, so it seems that such scenario the ClearError() shouldn't be required for the host.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Credentials can change at any time. If someone changes the password on the server prior to changing it in the BMH, then I think it's possible to see another kind of error.
It seems reasonable that once we know the creds have changed (and the change worked), we discard any assumptions about what we thought might have been broken before.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so if IIUC changing the creds could be considered a very "strong" kind of event that somehow resets the current situation (forcing anyhow implicitly a re-evaluation).

@andfasano
Copy link
Copy Markdown
Member Author

How easy/hard is it to change the verbosity level of the log output to debug a problem? I've usually found that it's better to log a little more, and throw info away in the log aggregation tool, but there's not a hard rule about where that line is.

Zap offers a couple of configuration options (see also https://pkg.go.dev/go.uber.org/zap#ex-package--BasicConfiguration) but as @zaneb correctly pointed out the verbosity level wasn't directly available. My original intention was also to have a smaller footprint assuming an optimistic case

@andfasano
Copy link
Copy Markdown
Member Author

What does logging at verbosity level 5 mean in practice?
I see we have a -dev flag to "enable developer logging", but no way to set the verbosity level explicitly AFAICT.

Following the logr docs indication the idea was to have a more technical debug level. I can see if we can use that flag instead

@andfasano
Copy link
Copy Markdown
Member Author

What does logging at verbosity level 5 mean in practice?
I see we have a -dev flag to "enable developer logging", but no way to set the verbosity level explicitly AFAICT.

Following the logr docs indication the idea was to have a more technical debug level. I can see if we can use that flag instead

I checked the current configuration and By default zap is configured to trace level 0 only (Info). -dev flag also enables tracing at level 1 (Debug), so it could be possible to switch to that level to reuse an existing BMO configuration without the need of adding a new one

@andfasano andfasano force-pushed the registration-log-and-fix branch from b74251b to 6ce9300 Compare January 7, 2021 09:40
@andfasano
Copy link
Copy Markdown
Member Author

Changed the implementation to use the -dev flag to print additional log traces

@andfasano andfasano requested a review from zaneb January 7, 2021 09:45
@andfasano
Copy link
Copy Markdown
Member Author

/test-integration

Comment thread pkg/provisioner/ironic/ironic.go Outdated
switch err.(type) {
case nil:
p.log.Info("found existing node by ID")
p.log.V(1).Info("found existing node by ID")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's set this up as a debugLog member in the ironicProvisioner struct.

@andfasano
Copy link
Copy Markdown
Member Author

/test-integration

return hb
}

func (hb *hostBuilder) SetStatusError(opStatus metal3v1alpha1.OperationalStatus, errType metal3v1alpha1.ErrorType, errMsg string, errCount int) *hostBuilder {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, does this need to be exported?

@andfasano andfasano force-pushed the registration-log-and-fix branch from 084037f to ef745ac Compare January 15, 2021 09:50
@metal3-io-bot metal3-io-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 15, 2021
@zaneb
Copy link
Copy Markdown
Member

zaneb commented Jan 15, 2021

I ended up with a patch to fix up the logs in the controller in #761 (some other stuff was dependent on moving it all around, so this will reduce conflicts). Stephen has included the change to limit which errors get reset in #762. So maybe let's reduce the scope of this PR to just the provisioner and get it merged?

@andfasano andfasano force-pushed the registration-log-and-fix branch from ef745ac to 15d1330 Compare January 15, 2021 16:35
@metal3-io-bot metal3-io-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 15, 2021
@andfasano andfasano changed the title Review registration log traces and clean error fix Review registration log traces for Provisioner Jan 15, 2021
Comment thread pkg/provisioner/ironic/ironic.go Outdated
Comment thread pkg/provisioner/ironic/ironic.go Outdated
Comment thread pkg/provisioner/ironic/ironic.go Outdated
@andfasano andfasano force-pushed the registration-log-and-fix branch from 15d1330 to 06584aa Compare March 12, 2021 10:50
@andfasano andfasano requested a review from zaneb March 12, 2021 10:50
@andfasano
Copy link
Copy Markdown
Member Author

/test-integration

@zaneb
Copy link
Copy Markdown
Member

zaneb commented Mar 23, 2021

/approve

@metal3-io-bot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andfasano, zaneb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@metal3-io-bot metal3-io-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 23, 2021
@hardys
Copy link
Copy Markdown
Member

hardys commented Mar 24, 2021

/lgtm

@metal3-io-bot metal3-io-bot added the lgtm Indicates that a PR is ready to be merged. label Mar 24, 2021
@metal3-io-bot metal3-io-bot merged commit 0e6772b into metal3-io:master Mar 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants