
When using multiple workers with uvicorn, use MultiProcessCollector for Prometheus#11067

Closed
Penagwin wants to merge 3 commits into BerriAI:main from Penagwin:penagwin/multiprocess-prometheus

Conversation

@Penagwin

When using multiple workers with uvicorn, use MultiProcessCollector for prometheus

Relevant issues

Fixes #10595

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix

Changes

When using multiple workers with uvicorn, we need to use the MultiProcessCollector with prometheus. Otherwise each worker keeps its own metrics, and because uvicorn round-robins requests, the reported values change as scrapes bounce between different workers.

How it works:

According to the docs (https://prometheus.github.io/client_python/multiprocess/), we must set PROMETHEUS_MULTIPROC_DIR before starting our application. Luckily this is possible in proxy_cli.py.
I wasn't sure how to check whether prometheus will be used, so I just check for "prometheus" in _config.get("litellm_settings", {}).get("callbacks", [])
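For context, here is a minimal self-contained sketch of what prometheus_client's multiprocess mode looks like. The metric name and temp path are illustrative only, not LiteLLM's actual wiring; the key points are that the env var must be set before prometheus_client is imported, and that the registry served at /metrics aggregates across all workers' files:

```python
import os
import tempfile

# PROMETHEUS_MULTIPROC_DIR must exist and be set BEFORE prometheus_client
# is imported in any worker (path here is illustrative).
os.environ["PROMETHEUS_MULTIPROC_DIR"] = tempfile.mkdtemp(prefix="litellm_prometheus_")

from prometheus_client import CollectorRegistry, Counter, generate_latest, multiprocess

# In multiprocess mode each worker writes samples to mmap-backed files
# under PROMETHEUS_MULTIPROC_DIR instead of keeping them in memory.
tokens = Counter("litellm_total_tokens", "Total tokens processed", ["model"])
tokens.labels(model="gpt-4o").inc(42)

# The /metrics handler should aggregate every worker's files rather than
# serving only the current worker's in-memory registry:
registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)
metrics_text = generate_latest(registry).decode()
print(metrics_text)
```

With a single process this prints the same exposition text as the default registry would; the difference only becomes visible once several workers write into the shared directory.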

If we're using prometheus with multiple workers (uvicorn or gunicorn) and the user does not specify PROMETHEUS_MULTIPROC_DIR, a random directory is created with tempfile.mkdtemp(prefix="litellm_prometheus_")
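That fallback can be sketched as follows (ensure_prometheus_multiproc_dir is a hypothetical helper name; the PR wires the equivalent logic into proxy_cli.py):

```python
import os
import tempfile

def ensure_prometheus_multiproc_dir() -> str:
    """Hypothetical helper mirroring the PR's fallback: respect a
    user-provided PROMETHEUS_MULTIPROC_DIR, otherwise create a fresh
    temporary directory for the worker processes to share."""
    if not os.environ.get("PROMETHEUS_MULTIPROC_DIR"):
        os.environ["PROMETHEUS_MULTIPROC_DIR"] = tempfile.mkdtemp(
            prefix="litellm_prometheus_"
        )
    return os.environ["PROMETHEUS_MULTIPROC_DIR"]

multiproc_dir = ensure_prometheus_multiproc_dir()
```

Because the parent process sets the variable before spawning workers, every worker inherits the same directory.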

Uvicorn

I only tested with uvicorn. I saw references in the code to gunicorn but no references in the docs.

I did see hypercorn but haven't had a chance to test it.
Based on the changes, hypercorn likely still has this same bug; this PR only fixes uvicorn. I can fix hypercorn when I have time.

Testing

I wasn't sure how you'd like tests written for this. I could not find many examples of tests involving either run_server or prometheus.

https://github.com/search?q=repo%3ABerriAI%2Flitellm+path%3A%2F%5Etests%5C%2Flitellm%5C%2F%2F++run_server&type=code

https://github.com/search?q=repo%3ABerriAI%2Flitellm+path%3A%2F%5Etests%5C%2Flitellm%5C%2F%2F++prometheus&type=code

All tests pass, although tests/litellm/proxy/hooks/test_parallel_request_limiter_v2.py::test_normal_router_call_rpm is flaky. I don't think anything in my PR would impact this test; it usually passes but sometimes fails.

Screenshot of them all passing as requested:
[screenshot: all tests passing]

Verifying

I've discovered this issue is made more complex by connection keep-alives.
To make it more obvious that the metrics are not being shared across workers, I used nginx to make sure connections weren't reused. The problem still exists without this, but it's less obvious on the graphs.

See below for a basic docker setup.

Before

  1. Start litellm with --num_workers 8
  2. Run nginx and prometheus (see below)
  3. Send several basic requests to gpt4o - the contents don't matter; we're just watching litellm_total_tokens_total
  4. Observe that litellm_total_tokens_total bounces all over the place in prometheus
[Screenshot 2025-05-22 at 6:21:33 PM]

After

  1. Start litellm with --num_workers 8
  2. Run nginx and prometheus (see below)
  3. Send several basic requests to gpt4o
  4. Observe that litellm_total_tokens_total only goes up in prometheus
[Screenshot 2025-05-22 at 6:25:39 PM]

Docker setup

Here is a basic setup for testing prometheus/nginx if you'd like to graph the numbers.
You can also spam refresh in the browser, but without nginx the browser will reuse the connection, so the values won't change as frequently and it can be difficult to see them move.

# Stop containers
docker stop nginx-proxy prometheus 2>/dev/null
docker rm nginx-proxy prometheus 2>/dev/null

# Create a custom network
docker network create prom-network

# Create nginx config
echo 'events { worker_connections 1024; }
http {
  upstream backend {
    server host.docker.internal:4000;
  }
  server {
    listen 8081;
    location /metrics {
      proxy_pass http://backend/metrics;
      proxy_http_version 1.0;
      proxy_set_header Connection "close";
      proxy_set_header Host $host;
      proxy_buffering off;
      proxy_connect_timeout 5s;
      proxy_send_timeout 5s;
      proxy_read_timeout 5s;
    }
  }
}' > nginx.conf

# Run nginx on custom network
docker run -d --name nginx-proxy \
  --network prom-network \
  --add-host=host.docker.internal:host-gateway \
  -p 8081:8081 \
  -v $(pwd)/nginx.conf:/etc/nginx/nginx.conf \
  nginx

# Create Prometheus config using container name (not host.docker.internal)
echo 'global:
  scrape_interval: 5s
scrape_configs:
  - job_name: "target-4000"
    scrape_interval: 5s
    scrape_timeout: 4s
    params:
      connection: ["close"]
    metric_relabel_configs:
      - source_labels: [__name__]
        target_label: scrape_time
        replacement: "{{ .Timestamp }}"
    static_configs:
      - targets: ["nginx-proxy:8081"]' > prometheus.yml

# Run Prometheus on same custom network
docker run -d \
  --name prometheus \
  --network prom-network \
  -p 9090:9090 \
  -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus
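Once the stack above is running, you can confirm the counter only ever increases by scraping http://localhost:8081/metrics in a loop. A small helper for summing a counter out of the Prometheus text exposition format (a sketch; the polling loop itself is left out) might look like:

```python
def parse_counter(metrics_text: str, name: str) -> float:
    """Sum every labeled sample of a counter in a Prometheus text
    exposition body (HELP/TYPE comment lines are ignored).
    Assumes plain "name{labels} value" lines without timestamps."""
    total = 0.0
    for line in metrics_text.splitlines():
        # Match "name{labels} value" or "name value", but not "name_created".
        if line.startswith(name + "{") or line.startswith(name + " "):
            total += float(line.rsplit(" ", 1)[1])
    return total

# Canned example body, matching the metric this PR is watching:
sample = (
    '# HELP litellm_total_tokens_total Total tokens\n'
    '# TYPE litellm_total_tokens_total counter\n'
    'litellm_total_tokens_total{model="gpt-4o"} 42.0\n'
    'litellm_total_tokens_total{model="gpt-3.5"} 8.0\n'
)
result = parse_counter(sample, "litellm_total_tokens_total")
print(result)  # 50.0
```

Before the fix, successive scrapes through nginx hit different workers and this sum jumps up and down; after the fix it is monotonically non-decreasing.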

Penagwin added 2 commits May 22, 2025 18:43
@vercel

vercel bot commented May 22, 2025

The latest updates on your projects:

litellm: ✅ Ready (Preview, updated May 22, 2025 11:45pm UTC)

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Penagwin seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@Penagwin
Author

I'm not sure why mypy is upset; it seems fine on my machine.

@khai-lash

Any Updates on this PR? @Penagwin

@github-actions
Contributor

github-actions bot commented Oct 3, 2025

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@github-actions github-actions bot added the stale label Oct 3, 2025
@github-actions github-actions bot closed this Oct 10, 2025
@rhoentier

We also have this problem and need a fix.

@github-actions github-actions bot removed the stale label Nov 26, 2025
@matthias-huber

We successfully use this PR's content to patch our LiteLLM setup with multiple workers. We have been patching from version 1.80.0-stable.1 through v1.80.11-stable.

We would really appreciate this PR being integrated into an upcoming release.

jquinter added a commit to jquinter/litellm that referenced this pull request Feb 11, 2026
…setups

When running LiteLLM proxy with multiple uvicorn workers and Prometheus
callbacks enabled, automatically create and set PROMETHEUS_MULTIPROC_DIR
so metrics are correctly aggregated across all worker processes.

Fixes BerriAI#10595
Supersedes BerriAI#11067 — reimplemented against current codebase, crediting
original author @Penagwin for the approach.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@jquinter
Contributor

Closing this PR as the feature has already been implemented in main.

Evidence that the feature exists:

  • litellm/integrations/prometheus.py lines 3061-3068 already implement MultiProcessCollector for multi-worker setups
  • PROMETHEUS_MULTIPROC_DIR environment variable support is already present
  • Documentation at docs/my-website/docs/proxy/prometheus.md describes the multi-worker configuration

The functionality this PR was adding is now part of the codebase, so this PR is no longer needed.

Thank you for the contribution and for identifying this need! The feature has been valuable to the community.

@jquinter jquinter closed this Feb 13, 2026

Development

Successfully merging this pull request may close these issues.

[Bug]: Prometheus metrics aren't shared across Uvicorn workers
