
When using multiple workers with uvicorn, use MultiProcessCollector for Prometheus#11067

Closed
Penagwin wants to merge 3 commits into BerriAI:main from Penagwin:penagwin/multiprocess-prometheus

Conversation

@Penagwin

When using multiple workers with uvicorn, use MultiProcessCollector for prometheus

Relevant issues

Fixes #10595

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix

Changes

When using multiple workers with uvicorn, we need to use the MultiProcessCollector with prometheus. Otherwise each worker keeps its own metrics, and because uvicorn round-robins requests, the reported values change as scrapes bounce between different workers.

How it works:

According to the docs (https://prometheus.github.io/client_python/multiprocess/), we must set PROMETHEUS_MULTIPROC_DIR before starting our application. Luckily this is possible in proxy_cli.py.
I wasn't sure how to check whether prometheus will be used, so I just check for "prometheus" in _config.get("litellm_settings", {}).get("callbacks", [])
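For context, here is a minimal self-contained sketch of what prometheus_client's multiprocess mode looks like. The metric name and temp path are illustrative only, not LiteLLM's actual wiring; the key points are that the env var must be set before prometheus_client is imported, and that the registry served at /metrics aggregates across all workers' files:

```python
import os
import tempfile

# PROMETHEUS_MULTIPROC_DIR must exist and be set BEFORE prometheus_client
# is imported in any worker (path here is illustrative).
os.environ["PROMETHEUS_MULTIPROC_DIR"] = tempfile.mkdtemp(prefix="litellm_prometheus_")

from prometheus_client import CollectorRegistry, Counter, generate_latest, multiprocess

# In multiprocess mode each worker writes samples to mmap-backed files
# under PROMETHEUS_MULTIPROC_DIR instead of keeping them in memory.
tokens = Counter("litellm_total_tokens", "Total tokens processed", ["model"])
tokens.labels(model="gpt-4o").inc(42)

# The /metrics handler should aggregate every worker's files rather than
# serving only the current worker's in-memory registry:
registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)
metrics_text = generate_latest(registry).decode()
print(metrics_text)
```

With a single process this prints the same exposition text as the default registry would; the difference only becomes visible once several workers write into the shared directory.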

If we're using prometheus with multiple workers (uvicorn or gunicorn) and the user does not specify PROMETHEUS_MULTIPROC_DIR, a random directory is created with tempfile.mkdtemp(prefix="litellm_prometheus_")
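That fallback can be sketched as follows (ensure_prometheus_multiproc_dir is a hypothetical helper name; the PR wires the equivalent logic into proxy_cli.py):

```python
import os
import tempfile

def ensure_prometheus_multiproc_dir() -> str:
    """Hypothetical helper mirroring the PR's fallback: respect a
    user-provided PROMETHEUS_MULTIPROC_DIR, otherwise create a fresh
    temporary directory for the worker processes to share."""
    if not os.environ.get("PROMETHEUS_MULTIPROC_DIR"):
        os.environ["PROMETHEUS_MULTIPROC_DIR"] = tempfile.mkdtemp(
            prefix="litellm_prometheus_"
        )
    return os.environ["PROMETHEUS_MULTIPROC_DIR"]

multiproc_dir = ensure_prometheus_multiproc_dir()
```

Because the parent process sets the variable before spawning workers, every worker inherits the same directory.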

Uvicorn

I only tested with uvicorn. I saw references in the code to gunicorn but no references in the docs.

I did see hypercorn but haven't had a chance to test it.
Based on the changes, hypercorn likely still has this same bug; this PR only fixes uvicorn. I can fix hypercorn when I have time.

Testing

I wasn't sure how you'd like tests written for this. I could not find many examples of tests involving either run_server or prometheus.

https://github.com/search?q=repo%3ABerriAI%2Flitellm+path%3A%2F%5Etests%5C%2Flitellm%5C%2F%2F++run_server&type=code

https://github.com/search?q=repo%3ABerriAI%2Flitellm+path%3A%2F%5Etests%5C%2Flitellm%5C%2F%2F++prometheus&type=code

All tests pass, although tests/litellm/proxy/hooks/test_parallel_request_limiter_v2.py::test_normal_router_call_rpm is flaky. I don't think anything in my PR would impact this test; it usually passes but sometimes fails.

Screenshot of them all passing as requested:
[screenshot: all tests passing]

Verifying

I've discovered this issue is made more complex by connection keep-alives.
To make it more obvious that the metrics are not being shared across workers, I used nginx to make sure connections weren't reused. The problem still exists without this, but it's less obvious on the graphs.

See below for a basic docker setup.

Before

  1. Start litellm with --num_workers 8
  2. Run nginx and prometheus (see below)
  3. Send several basic requests to gpt4o - the contents don't matter; we're just watching litellm_total_tokens_total
  4. Observe that litellm_total_tokens_total bounces all over the place in prometheus
[Screenshot 2025-05-22 at 6:21:33 PM]

After

  1. Start litellm with --num_workers 8
  2. Run nginx and prometheus (see below)
  3. Send several basic requests to gpt4o
  4. Observe that litellm_total_tokens_total only goes up in prometheus
[Screenshot 2025-05-22 at 6:25:39 PM]

Docker setup

Here is a basic setup for testing prometheus/nginx if you'd like to graph the numbers.
You can also spam refresh in the browser, but without nginx the browser will reuse the connection, so the values won't change as frequently and it can be difficult to see them move.

# Stop containers
docker stop nginx-proxy prometheus 2>/dev/null
docker rm nginx-proxy prometheus 2>/dev/null

# Create a custom network
docker network create prom-network

# Create nginx config
echo 'events { worker_connections 1024; }
http {
  upstream backend {
    server host.docker.internal:4000;
  }
  server {
    listen 8081;
    location /metrics {
      proxy_pass http://backend/metrics;
      proxy_http_version 1.0;
      proxy_set_header Connection "close";
      proxy_set_header Host $host;
      proxy_buffering off;
      proxy_connect_timeout 5s;
      proxy_send_timeout 5s;
      proxy_read_timeout 5s;
    }
  }
}' > nginx.conf

# Run nginx on custom network
docker run -d --name nginx-proxy \
  --network prom-network \
  --add-host=host.docker.internal:host-gateway \
  -p 8081:8081 \
  -v $(pwd)/nginx.conf:/etc/nginx/nginx.conf \
  nginx

# Create Prometheus config using container name (not host.docker.internal)
echo 'global:
  scrape_interval: 5s
scrape_configs:
  - job_name: "target-4000"
    scrape_interval: 5s
    scrape_timeout: 4s
    params:
      connection: ["close"]
    metric_relabel_configs:
      - source_labels: [__name__]
        target_label: scrape_time
        replacement: "{{ .Timestamp }}"
    static_configs:
      - targets: ["nginx-proxy:8081"]' > prometheus.yml

# Run Prometheus on same custom network
docker run -d \
  --name prometheus \
  --network prom-network \
  -p 9090:9090 \
  -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus
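Once the stack above is running, you can confirm the counter only ever increases by scraping http://localhost:8081/metrics in a loop. A small helper for summing a counter out of the Prometheus text exposition format (a sketch; the polling loop itself is left out) might look like:

```python
def parse_counter(metrics_text: str, name: str) -> float:
    """Sum every labeled sample of a counter in a Prometheus text
    exposition body (HELP/TYPE comment lines are ignored).
    Assumes plain "name{labels} value" lines without timestamps."""
    total = 0.0
    for line in metrics_text.splitlines():
        # Match "name{labels} value" or "name value", but not "name_created".
        if line.startswith(name + "{") or line.startswith(name + " "):
            total += float(line.rsplit(" ", 1)[1])
    return total

# Canned example body, matching the metric this PR is watching:
sample = (
    '# HELP litellm_total_tokens_total Total tokens\n'
    '# TYPE litellm_total_tokens_total counter\n'
    'litellm_total_tokens_total{model="gpt-4o"} 42.0\n'
    'litellm_total_tokens_total{model="gpt-3.5"} 8.0\n'
)
result = parse_counter(sample, "litellm_total_tokens_total")
print(result)  # 50.0
```

Before the fix, successive scrapes through nginx hit different workers and this sum jumps up and down; after the fix it is monotonically non-decreasing.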

Penagwin added 2 commits May 22, 2025 18:43
@vercel

vercel bot commented May 22, 2025

The latest updates on your projects:

litellm: ✅ Ready (Preview, updated May 22, 2025 11:45pm UTC)

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Penagwin seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@Penagwin
Author

I'm not sure why mypy is upset; it seems fine on my machine.

@khai-lash

Any Updates on this PR? @Penagwin

@github-actions
Contributor

github-actions bot commented Oct 3, 2025

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@github-actions github-actions bot added the stale label Oct 3, 2025
@github-actions github-actions bot closed this Oct 10, 2025
@rhoentier

We also have this problem and need a fix.

@github-actions github-actions bot removed the stale label Nov 26, 2025
@matthias-huber

We successfully use this PR's content to patch our LiteLLM setup with multiple workers. We have been patching from version 1.80.0-stable.1 through v1.80.11-stable.

We would really appreciate this PR being integrated into an upcoming release.

jquinter added a commit to jquinter/litellm that referenced this pull request Feb 11, 2026
…setups

When running LiteLLM proxy with multiple uvicorn workers and Prometheus
callbacks enabled, automatically create and set PROMETHEUS_MULTIPROC_DIR
so metrics are correctly aggregated across all worker processes.

Fixes BerriAI#10595
Supersedes BerriAI#11067 — reimplemented against current codebase, crediting
original author @Penagwin for the approach.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@jquinter
Contributor

Closing this PR as the feature has already been implemented in main.

Evidence that the feature exists:

  • litellm/integrations/prometheus.py lines 3061-3068 already implement MultiProcessCollector for multi-worker setups
  • PROMETHEUS_MULTIPROC_DIR environment variable support is already present
  • Documentation at docs/my-website/docs/proxy/prometheus.md describes the multi-worker configuration

The functionality this PR was adding is now part of the codebase, so this PR is no longer needed.

Thank you for the contribution and for identifying this need! The feature has been valuable to the community.

@jquinter jquinter closed this Feb 13, 2026

Development

Successfully merging this pull request may close these issues.

[Bug]: Prometheus metrics aren't shared across Uvicorn workers
