
500 response when scraping prometheus metrics in kubernetes with istio service mesh #4576

Closed
robinsillem opened this issue Aug 1, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@robinsillem

Describe the bug

When running fluentd v1.17.0 within an istio service mesh, attempts to scrape metrics fail with a 500 error and the message "No async task available!"

N.B. istio aggregates the fluentd metrics with its own before presenting the results to prometheus
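For context, a rough illustration of what istio's metrics merging does: the istio agent scrapes the application's metrics endpoint and serves it concatenated with the proxy's own metrics, so Prometheus sees one merged payload. The sketch below assumes simple concatenation of two Prometheus text-format payloads; the real agent also handles the case where the application scrape fails, which is where the 500 above surfaces.

```python
def merge_metrics(app_payload: str, proxy_payload: str) -> str:
    """Illustrative only: combine application metrics (e.g. fluentd's)
    with the proxy's own metrics into one Prometheus text-format payload,
    as istio's metrics merging conceptually does."""
    # Ensure exactly one newline between the two payloads.
    return app_payload.rstrip("\n") + "\n" + proxy_payload.rstrip("\n") + "\n"

app = "fluentd_output_status_retry_count 0.0\n"
proxy = "envoy_cluster_upstream_rq_total 12\n"
print(merge_metrics(app, proxy))
```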

$ k exec -it fluentd-0 -c istio-proxy -n fluentd  -- bash -c "curl -v http://localhost:24231/aggregated_metrics"
*   Trying 127.0.0.1:24231...
* Connected to localhost (127.0.0.1) port 24231 (#0)
> GET /aggregated_metrics HTTP/1.1
> Host: localhost:24231
> User-Agent: curl/7.81.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 500 Internal Server Error
< Content-Type: text/plain
< Server: WEBrick/1.8.1 (Ruby/3.2.4/2024-04-23)
< Date: Wed, 31 Jul 2024 15:08:11 GMT
< Content-Length: 24
< Connection: Keep-Alive
< 
* Connection #0 to host localhost left intact
No async task available!

This behaviour does not occur with fluentd v1.16.5.

To Reproduce

Deploy fluentd in an EKS cluster containing istio, so that the fluentd pods have istio sidecars

Exec into the proxy container and curl the aggregated_metrics endpoint

k exec -it fluentd-0 -c istio-proxy -n fluentd -- bash -c "curl -v http://localhost:24231/aggregated_metrics"

Expected behavior

The response contains the fluentd_* metrics, as it does when using the /metrics endpoint.
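One way to check the expected behaviour is to verify that the scraped payload actually contains fluentd_* metric families. A minimal sketch, run here against an illustrative sample payload rather than a live endpoint:

```python
def fluentd_metric_names(payload: str) -> set:
    """Return the fluentd_* metric names found in a Prometheus
    text-format payload (comments and blank lines are skipped)."""
    names = set()
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Metric name is everything before the label block or first space.
        name = line.split("{", 1)[0].split()[0]
        if name.startswith("fluentd_"):
            names.add(name)
    return names

sample = """\
# HELP fluentd_output_status_buffer_queue_length Current buffer queue length.
# TYPE fluentd_output_status_buffer_queue_length gauge
fluentd_output_status_buffer_queue_length{plugin_id="out_os"} 0.0
istio_requests_total{reporter="destination"} 42
"""
print(sorted(fluentd_metric_names(sample)))
# → ['fluentd_output_status_buffer_queue_length']
```

An empty result from the aggregated endpoint (or a non-200 status) would indicate the bug described above.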

Your Environment

- Fluentd version: v1.17.0
- Package version:
- Operating system: Alpine Linux v3.19
- Kernel version: 4.18.0-477.58.1.el8_8.x86_64

Your Configuration

<system>
  root_dir /var/log/fluent/
  workers 8
  log_level ${log_level}
</system>

<source>
  @type http
  port 9880
  bind 0.0.0.0
  body_size_limit 75m # To match the chunk_limit_size in the OS output config
  keepalive_timeout 10s
  add_http_headers true
  <parse>
    @type json
    time_format %iso8601
  </parse>
</source>

<source>
  @type prometheus
  @id in_prometheus
  bind "0.0.0.0"
  port 24231
  metrics_path "/metrics"
</source>

<source>
  @type prometheus_monitor
  @id in_prometheus_monitor
</source>

<source>
  @type prometheus_output_monitor
  @id in_prometheus_output_monitor
</source>

etc.

Your Error Log

(Same curl output as in the bug description above.)


Additional context

_No response_
@robinsillem robinsillem changed the title 503 response when scraping prometheus metrics in kubernetes with istio service mesh 500 response when scraping prometheus metrics in kubernetes with istio service mesh Aug 1, 2024
@daipom daipom added bug Something isn't working and removed waiting-for-triage labels Aug 14, 2024
@kenhys
Contributor

kenhys commented Aug 14, 2024

It seems that this behavior was fixed by #4487, which has not been released yet.

@daipom
Contributor

daipom commented Aug 19, 2024

It will be fixed in v1.17.1 and v1.16.6.
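Given the fix versions above, a small helper can tell whether a given fluentd version includes the fix (assuming the 1.16 branch got the backport in v1.16.6 and every later series has it; illustrative only):

```python
def has_fix(version: str) -> bool:
    """True if this fluentd version includes the fix, per the maintainer
    comment: shipped in v1.16.6 (1.16 branch) and v1.17.1 onward."""
    parts = tuple(int(p) for p in version.lstrip("v").split("."))
    if parts[:2] == (1, 16):
        return parts >= (1, 16, 6)
    return parts >= (1, 17, 1)

print(has_fix("v1.17.0"))  # → False (the version reported in this issue)
```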
