Prometheus can't collect metrics from hubble-metrics using cilium hubble enable command #1303

Open
Shunpoco opened this issue Dec 19, 2022 · 3 comments
Labels
kind/bug Something isn't working sig/hubble Impacts hubble

Comments

Shunpoco commented Dec 19, 2022

Hi, I ran into unexpected behavior when running cilium hubble enable to enable Hubble and collect its metrics with Prometheus.

Bug report

General Information

  • Cilium CLI version: I checked both v0.12.11 and the master branch
  • Orchestration system version in use: v1.25.4
  • Platform / infrastructure information: Building on VMs using kubeadm (kubernetes v1.23.9)

How to reproduce the issue

  1. Run cilium install with the options:
cilium install --helm-set prometheus.enabled=true --helm-set operator.prometheus.enabled=true
  2. Then run cilium hubble enable with the options:
cilium hubble enable --ui --helm-set hubble.metrics.enabled="{dns,drop,tcp,flow,icmp,http}"

Hubble resources are deployed, and hubble-metrics service is created.

Expected behavior
Prometheus can reach hubble-metrics (port 9965 by default) and collect metrics from it.

Actual behavior
Prometheus didn't collect any metrics from the endpoint.

The cause of the problem

  1. The hubble-metrics service selects pods labeled k8s-app=cilium (that is, the cilium pods from the cilium daemonset), and its target port is 9965 by default.
  2. However, the daemonset doesn't expose port 9965:
$ kubectl get daemonsets.apps -n kube-system cilium -o yaml | grep -A20 ports
        ports:
        - containerPort: 4244
          hostPort: 4244
          name: peer-service
          protocol: TCP
        - containerPort: 9962
          hostPort: 9962
          name: prometheus
          protocol: TCP
        - containerPort: 9964
          hostPort: 9964
          name: envoy-metrics
          protocol: TCP
        readinessProbe:
...
  3. The cilium hubble enable command with --helm-set hubble.metrics.enabled={...} updates the cilium-config configmap, restarts the cilium-xxx pods, and creates both the hubble-peer and hubble-metrics services. However, it does not update the cilium daemonset to add the port.
    This behavior can be seen in this part of the code: https://github.com/cilium/cilium-cli/blob/master/hubble/hubble.go#L627-L665

  4. As a result, because the cilium pods don't expose port 9965, Prometheus can't collect metrics through hubble-metrics.
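
The mismatch described in the steps above can be illustrated with a small check. This is only a sketch and not cilium-cli code: the list mirrors the ports section of the daemonset output shown above, 9965 is the default hubble-metrics target port stated in this report, and port_is_exposed is a hypothetical helper introduced for illustration.

```python
# Container ports copied from the `kubectl get daemonsets` output above.
daemonset_ports = [
    {"containerPort": 4244, "hostPort": 4244, "name": "peer-service", "protocol": "TCP"},
    {"containerPort": 9962, "hostPort": 9962, "name": "prometheus", "protocol": "TCP"},
    {"containerPort": 9964, "hostPort": 9964, "name": "envoy-metrics", "protocol": "TCP"},
]

# Default target port of the hubble-metrics service, as stated in the report.
HUBBLE_METRICS_PORT = 9965


def port_is_exposed(ports, target_port):
    """Return True if any container port entry declares target_port."""
    return any(p.get("containerPort") == target_port for p in ports)


if __name__ == "__main__":
    # Compare a port that is declared (9962) with the hubble-metrics port.
    for port in (9962, HUBBLE_METRICS_PORT):
        state = "exposed" if port_is_exposed(daemonset_ports, port) else "missing"
        print(port, state)
```

With the ports listed in the report, the check finds 9962 but not 9965, which matches the observation that the service points at a port the daemonset never declares.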

Proposal

To make Prometheus scraping of Hubble work with cilium-cli as well as with helm, cilium hubble enable --helm-set hubble.metrics.enabled={...} should also update the cilium daemonset to add the hubble-metrics port.
I assume the code for this would be similar to updateConfigMap.
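
As a rough illustration of the proposed update, the sketch below adds a port entry to a daemonset-shaped dict only if it is not already declared, so repeating the operation is harmless. It is not the cilium-cli implementation (which is written in Go). The helper add_container_port, the container name cilium-agent, and the port name hubble-metrics are assumptions made for this example, and the hostPort field simply mirrors the existing entries shown in the report.

```python
import copy


def add_container_port(daemonset, container_name, port, port_name):
    """Return a copy of daemonset with port declared on the named container.

    The input is left unmodified. If the port is already declared, the copy
    is returned unchanged, so the operation is safe to repeat.
    """
    updated = copy.deepcopy(daemonset)
    containers = updated["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container["name"] != container_name:
            continue
        ports = container.setdefault("ports", [])
        if not any(p.get("containerPort") == port for p in ports):
            ports.append({
                "containerPort": port,
                "hostPort": port,  # mirrors the existing entries in the report
                "name": port_name,
                "protocol": "TCP",
            })
    return updated


if __name__ == "__main__":
    # Trimmed-down stand-in for the cilium daemonset; names are assumptions.
    ds = {"spec": {"template": {"spec": {"containers": [
        {"name": "cilium-agent", "ports": [
            {"containerPort": 9962, "hostPort": 9962,
             "name": "prometheus", "protocol": "TCP"},
        ]},
    ]}}}}
    patched = add_container_port(ds, "cilium-agent", 9965, "hubble-metrics")
    ports = patched["spec"]["template"]["spec"]["containers"][0]["ports"]
    print([p["containerPort"] for p in ports])
```

Returning a modified copy rather than mutating in place keeps the original object available for comparison, which makes it easy to skip the cluster update when nothing changed.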

update: I found a similar issue: #412 .

@Shunpoco Shunpoco added the kind/bug Something isn't working label Dec 19, 2022
@Shunpoco
Author

If the proposal is reasonable for you, I'd like to make a PR to fix the problem!

@rolinh
Member

rolinh commented Jan 30, 2023

Thanks for the report!

> If the proposal is reasonable for you, I'd like to make a PR to fix the problem!

It does seem reasonable to me and we'd sure have a look at a PR that fixes this issue!

@rolinh rolinh added the sig/hubble Impacts hubble label Jan 30, 2023
@Shunpoco
Author

Shunpoco commented Feb 2, 2023

Thanks! If you have time, please assign this issue to me!
