Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a metrics report to IO gatherers #275

Merged
merged 2 commits into from
Dec 4, 2020

Conversation

rluders
Copy link
Contributor

@rluders rluders commented Dec 3, 2020

It adds some metrics to log and to generated archive that collects some information about the gathers executions:

  • Elapsed time;
  • Number of records generated;
  • List of errors;

The report is stored at the archive, in the file insights-operator\gathers.json

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 3, 2020
@rluders
Copy link
Contributor Author

rluders commented Dec 3, 2020

/retest

@0sewa0
Copy link
Contributor

0sewa0 commented Dec 4, 2020

/lgtm

tested it locally and it produced the following metrics: (one of the entries has "errors": [] while every other has "errors": null which is a bit weird, but I don't think its because of this PR)

[
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherPodDisruptionBudgets.func1",
        "elapsed": 138000000,
        "report": 2,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherMostRecentMetrics.func1",
        "elapsed": 41000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterOperators.func1",
        "elapsed": 3333000000,
        "report": 48,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherContainerImages.func1",
        "elapsed": 427000000,
        "report": 12,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherNodes.func1",
        "elapsed": 585000000,
        "report": 6,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherConfigMaps.func1",
        "elapsed": 131000000,
        "report": 10,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterVersion.func1",
        "elapsed": 125000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterID.func1",
        "elapsed": 0,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterInfrastructure.func1",
        "elapsed": 131000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterNetwork.func1",
        "elapsed": 124000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterAuthentication.func1",
        "elapsed": 122000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterImageRegistry.func1",
        "elapsed": 127000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterImagePruner.func1",
        "elapsed": 127000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterFeatureGates.func1",
        "elapsed": 123000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterOAuth.func1",
        "elapsed": 181000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterIngress.func1",
        "elapsed": 123000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterProxy.func1",
        "elapsed": 121000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherCertificateSigningRequests.func1",
        "elapsed": 178000000,
        "report": 0,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherCRD.func1",
        "elapsed": 273000000,
        "report": 2,
        "errors": []
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherHostSubnet.func1",
        "elapsed": 126000000,
        "report": 6,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherMachineSet.func1",
        "elapsed": 169000000,
        "report": 3,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherInstallPlans.func1",
        "elapsed": 9693000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherServiceAccounts.func1",
        "elapsed": 10326000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherMachineConfigPool.func1",
        "elapsed": 217000000,
        "report": 2,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherContainerRuntimeConfig.func1",
        "elapsed": 122000000,
        "report": 0,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherStatefulSets.func1",
        "elapsed": 10274000000,
        "report": 2,
        "errors": null
    }
]

@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 0sewa0, rluders

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 4, 2020
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@tremes
Copy link
Contributor

tremes commented Dec 4, 2020

Is the duration above in ms ? It should be obvious from the file IMO. Do we need to have the name with the github.com/openshift/insights-operator prefix?

@rluders
Copy link
Contributor Author

rluders commented Dec 4, 2020

/lgtm

tested it locally and it produced the following metrics: (one of the entries has "errors": [] while every other has "errors": null which is a bit weird, but I don't think its because of this PR)

[
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherPodDisruptionBudgets.func1",
        "elapsed": 138000000,
        "report": 2,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherMostRecentMetrics.func1",
        "elapsed": 41000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterOperators.func1",
        "elapsed": 3333000000,
        "report": 48,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherContainerImages.func1",
        "elapsed": 427000000,
        "report": 12,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherNodes.func1",
        "elapsed": 585000000,
        "report": 6,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherConfigMaps.func1",
        "elapsed": 131000000,
        "report": 10,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterVersion.func1",
        "elapsed": 125000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterID.func1",
        "elapsed": 0,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterInfrastructure.func1",
        "elapsed": 131000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterNetwork.func1",
        "elapsed": 124000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterAuthentication.func1",
        "elapsed": 122000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterImageRegistry.func1",
        "elapsed": 127000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterImagePruner.func1",
        "elapsed": 127000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterFeatureGates.func1",
        "elapsed": 123000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterOAuth.func1",
        "elapsed": 181000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterIngress.func1",
        "elapsed": 123000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherClusterProxy.func1",
        "elapsed": 121000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherCertificateSigningRequests.func1",
        "elapsed": 178000000,
        "report": 0,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherCRD.func1",
        "elapsed": 273000000,
        "report": 2,
        "errors": []
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherHostSubnet.func1",
        "elapsed": 126000000,
        "report": 6,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherMachineSet.func1",
        "elapsed": 169000000,
        "report": 3,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherInstallPlans.func1",
        "elapsed": 9693000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherServiceAccounts.func1",
        "elapsed": 10326000000,
        "report": 1,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherMachineConfigPool.func1",
        "elapsed": 217000000,
        "report": 2,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherContainerRuntimeConfig.func1",
        "elapsed": 122000000,
        "report": 0,
        "errors": null
    },
    {
        "name": "github.com/openshift/insights-operator/pkg/gather/clusterconfig.GatherStatefulSets.func1",
        "elapsed": 10274000000,
        "report": 2,
        "errors": null
    }
]

Yeah, it is weird. I can omit the errors if it is null, but, handle it as a [] is also possible, just messy. I know that there is a proposal to implement something to handle this cases at the encode/json package, but... so far, I guess that not progress on this one.

Do we think that let it as null could cause any issue?

@rluders
Copy link
Contributor Author

rluders commented Dec 4, 2020

Is the duration above in ms ? It should be obvious from the file IMO. Do we need to have the name with the github.com/openshift/insights-operator prefix?

Nope, time.Duration will always use nanoseconds.
Yes, we already use the full name and I guess that it is a good idea to keep it like this, 'cause the package is important to know where the gather lives.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@tremes
Copy link
Contributor

tremes commented Dec 4, 2020

@rluders Well I would divide it and use elapsed_in_sec or elapsed_in_ms.

@tremes
Copy link
Contributor

tremes commented Dec 4, 2020

For the errors the omitempty would be better IMO.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@rluders
Copy link
Contributor Author

rluders commented Dec 4, 2020

@rluders Well I would divide it and use elapsed_in_sec or elapsed_in_ms.

I don't know. Why both if you can have just one and do the maths? Not that complicated.

@openshift-merge-robot openshift-merge-robot merged commit 9aed455 into openshift:master Dec 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants