When using multiple workers with uvicorn, use MultiProcessCollector for Prometheus#11067
Penagwin wants to merge 3 commits into BerriAI:main

Conversation
When using multiple workers with uvicorn, we need to use the `MultiProcessCollector` with Prometheus. If we don't, each worker will have its own metrics, and uvicorn will round-robin requests across workers, so the reported metrics change as you bounce between them.
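The round-robin effect can be illustrated with plain `prometheus_client`; the two registries below stand in for two uvicorn workers (illustrative only, not LiteLLM code):

```python
import os

# Ensure plain single-process mode for this illustration.
os.environ.pop("PROMETHEUS_MULTIPROC_DIR", None)

from prometheus_client import CollectorRegistry, Counter

# Two separate registries stand in for two uvicorn workers, each holding
# its own private copy of the "same" counter.
workers = [CollectorRegistry() for _ in range(2)]
counters = [
    Counter("litellm_total_tokens", "Total tokens", registry=r) for r in workers
]

counters[0].inc(100)  # requests that happened to land on worker 0
counters[1].inc(5)    # requests that landed on worker 1

# A scrape that round-robins between workers sees 100 on one request and 5
# on the next, instead of a single aggregated 105.
per_worker = [r.get_sample_value("litellm_total_tokens_total") for r in workers]
print(per_worker)  # [100.0, 5.0]
```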
Penagwin seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have already signed the CLA but the status is still pending? Let us recheck it.
I'm not sure why mypy is upset; it seems okay on my machine.
Any updates on this PR? @Penagwin
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
We also have this problem and need a fix.
We successfully used this PR's contents to patch our LiteLLM setup with multiple workers. We started patching at version 1.80.0-stable.1 and have continued through v1.80.11-stable. We would really appreciate integration of this PR in the upcoming releases.
…setups

When running LiteLLM proxy with multiple uvicorn workers and Prometheus callbacks enabled, automatically create and set PROMETHEUS_MULTIPROC_DIR so metrics are correctly aggregated across all worker processes.

Fixes BerriAI#10595
Supersedes BerriAI#11067: reimplemented against the current codebase, crediting original author @Penagwin for the approach.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Closing this PR as the feature has already been implemented in main.

The functionality this PR was adding is now part of the codebase, so this PR is no longer needed. Thank you for the contribution and for identifying this need! The feature has been valuable to the community.
When using multiple workers with uvicorn, use `MultiProcessCollector` for Prometheus

Relevant issues
Fixes #10595
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Add at least 1 test in the `tests/litellm/` directory (a hard requirement - see details)
- Run `make test-unit`

Type
🐛 Bug Fix
Changes
When using multiple workers with uvicorn, we need to use the `MultiProcessCollector` with Prometheus. If we don't, each worker will have its own metrics, and uvicorn will round-robin requests, so metrics will change as you bounce between workers.

How it works:
According to the docs: https://prometheus.github.io/client_python/multiprocess/
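As a rough sketch of the pattern those docs describe (the function and config names here are assumptions for illustration, not LiteLLM's actual code): set the env var before any metric is created, then aggregate with `MultiProcessCollector`:

```python
import os
import tempfile

def maybe_enable_multiproc(config: dict, num_workers: int) -> None:
    """Set PROMETHEUS_MULTIPROC_DIR before any metrics are created."""
    callbacks = config.get("litellm_settings", {}).get("callbacks", [])
    if "prometheus" in callbacks and num_workers > 1:
        os.environ.setdefault(
            "PROMETHEUS_MULTIPROC_DIR",
            tempfile.mkdtemp(prefix="litellm_prometheus_"),
        )

maybe_enable_multiproc(
    {"litellm_settings": {"callbacks": ["prometheus"]}}, num_workers=8
)

# With the env var set, a /metrics handler aggregates every worker's data
# instead of exposing only the current worker's private registry.
from prometheus_client import CollectorRegistry, generate_latest, multiprocess

registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)  # reads the shared .db files
output = generate_latest(registry)
```

The import of `prometheus_client` happens after the env var is set on purpose: the client picks its value-storage strategy based on `PROMETHEUS_MULTIPROC_DIR` at metric-creation time.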
We must set `PROMETHEUS_MULTIPROC_DIR` before starting our application. Luckily this is possible in `proxy_cli.py`. I wasn't sure how to check if we're going to be using prometheus, so I just used:

`"prometheus" in _config.get("litellm_settings", {}).get("callbacks", [])`

If we're using prometheus and multiple workers (and gunicorn), and the user does not specify `PROMETHEUS_MULTIPROC_DIR`, then a random directory is selected with `tempfile.mkdtemp(prefix="litellm_prometheus_")`.

Uvicorn
I only tested with `uvicorn`. I saw references in the code to `gunicorn` but no references in the docs. I did see `hypercorn` but haven't had a chance to test it. Based on the changes, `hypercorn` will likely still have this same bug; this PR only fixes `uvicorn`. I can fix `hypercorn` when I have time.

Testing
I wasn't sure how you'd like tests written for this. I could not find many examples of tests involving either `run_server` or `prometheus`:
https://github.com/search?q=repo%3ABerriAI%2Flitellm+path%3A%2F%5Etests%5C%2Flitellm%5C%2F%2F++run_server&type=code
https://github.com/search?q=repo%3ABerriAI%2Flitellm+path%3A%2F%5Etests%5C%2Flitellm%5C%2F%2F++prometheus&type=code
All tests pass, although `tests/litellm/proxy/hooks/test_parallel_request_limiter_v2.py::test_normal_router_call_rpm` is flaky. I don't think anything in my PR would impact this test; it sometimes passes and sometimes fails.

Screenshot of them all passing as requested:

Verifying
I've discovered this issue is made more complex by connection keep-alives.
To make it more obvious that the metrics are not being shared across workers, I used nginx to make sure connections weren't reused. The problem still exists without this, but it's less obvious on the graphs.
See below for a basic docker setup.
Before
With `--num_workers 8`, `litellm_total_tokens_total` bounces all over the place in Prometheus.

After
With `--num_workers 8`, `litellm_total_tokens_total` only goes up in Prometheus.

Docker setup
Here is a basic setup for testing prometheus/nginx if you'd like to graph the numbers.
You can also spam refresh in the browser, but without nginx the browser will reuse the connection, so the value won't change as frequently and it can be difficult to see the numbers move.
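Instead of spamming refresh in the browser, the scrape can be scripted; here is a small sketch (the host and port are assumptions, matching a locally running proxy) that forces a fresh connection per request so successive scrapes can land on different workers:

```python
import http.client

def scrape_once(host: str = "localhost", port: int = 4000) -> bytes:
    """Fetch /metrics over a fresh, non-reused connection."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    # "Connection: close" prevents keep-alive reuse, mimicking the nginx setup.
    conn.request("GET", "/metrics", headers={"Connection": "close"})
    body = conn.getresponse().read()
    conn.close()
    return body
```

Calling `scrape_once()` in a loop before the fix should show `litellm_total_tokens_total` jumping between per-worker values; after the fix it should only increase.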