-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add scrape config for Service Catalog controller #18694
Conversation
675b180
to
c98a943
Compare
/retest |
1 similar comment
/retest |
/test extended_clusterup |
@mfojtik are you the appropriate person to review this? This change enables Prometheus to pull metrics from Service Catalog. |
@ironcladlou, can you review this? |
out of curiosity is there a place to see sample output of a single scrape? |
Nothing stands out at me as wrong here, but @jeremyeder or @zgalor might be better positioned to review this config in the context of our current Prometheus deployment re: the labelling. |
@jeremyeder A limited set from Catalog:
and another 40 or so from Go and Process info:
And that is why I'm thinking about dropping the Go and Process stats (ie our email). And yes, its cluster wide, only a single Catalog Controller Manager is ever actively answering scrapes. |
Personally haven't used, and haven't heard of anyone else using those go stats (so, up to you). Process stats we already get get from cadvisor as long as it runs in a systemd unit or as a pod. |
Yes, its run within a pod. I'll make the changes upstream to remove both. |
One thing that comes to mind is if it will ever return non-2xx codes? non-2xx is something we could not only alert on, but also trend over time / over versions of SB. I just didn't see 4xx or 5xx in your example, is why I'm asking. |
My sample metrics are too simple. The servicecatalog_osb_request_count metric is dynamicly grouping counts into 2xx, 3xx, 4xx, 5xx buckets: https://github.com/kubernetes-incubator/service-catalog/blob/master/pkg/metrics/osbclientproxy/osbproxy.go#L159-L170 We should be able to alert on these, that is certainly one my upcoming tasks. |
examples/prometheus/prometheus.yaml
Outdated
- role: pod | ||
|
||
relabel_configs: | ||
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're only looking for pods in the kube-service-catalog namespace, it would be better to tell it upfront in the k8s sd configuration rather than use relabeling:
- job_name: 'openshift-service-catalog'
scheme: http
kubernetes_sd_configs:
- role: pod
namespaces:
names:
- kube-service-catalog
relabel_configs:
- source_labels: [__meta_kubernetes_pod_name]
action: keep
regex: controller-manager-(.+)
I've got #17683 pending for fixing the existing jobs but it needs a rebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, that makes sense, thanks for the details. I've updated the config accordingly.
c98a943
to
1ef4fb1
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: jboyd01 Assign the PR to them by writing The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1ef4fb1
to
40bc1bd
Compare
/test extended_conformance_install |
/test extended_image_ecosystem |
@simonpasquier or @jeremyeder all review comments have been addressed. Can I get a final review and if all is good a merge? Thanks! |
It was pointed out that I'm exposing metrics over HTTP and it should instead be secured. |
@jboyd01: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
closed in favor of #19286 |
new scrape configuration for pulling metrics from Service Catalog