Lesson 239: Use multiprocess.MultiProcessCollector from prometheus_client #412

Open
daxartio opened this issue Jan 20, 2025 · 3 comments

@daxartio

The app runs with four Gunicorn workers ("gunicorn", "-w", "4"), which means four separate processes, each with its own in-memory metrics. I recommend using multiprocess.MultiProcessCollector from prometheus_client. Otherwise, each scrape of /metrics returns only the metrics of whichever worker happens to handle that request.

app.mount("/metrics", metrics_app)

https://prometheus.github.io/client_python/exporting/http/fastapi-gunicorn/

from fastapi import FastAPI
from prometheus_client import CollectorRegistry, make_asgi_app, multiprocess

app = FastAPI(debug=False)


def make_metrics_app():
    # Build a registry that aggregates samples from all worker processes.
    # Multiprocess mode requires the PROMETHEUS_MULTIPROC_DIR environment
    # variable to point at a writable directory.
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry)
    return make_asgi_app(registry=registry)


metrics_app = make_metrics_app()
app.mount("/metrics", metrics_app)
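
The prometheus_client documentation linked above also describes setup that lives outside the app: PROMETHEUS_MULTIPROC_DIR must point at a writable directory (emptied between runs) before the workers start, and exited workers should be marked as dead. A minimal sketch of the Gunicorn side, assuming a config file named gunicorn.conf.py (the file name and how it is passed to Gunicorn are assumptions about the lesson's setup, not taken from it):

```python
# gunicorn.conf.py (sketch; the file name is an assumption about the setup)
from prometheus_client import multiprocess


def child_exit(server, worker):
    # Gunicorn server hook, called in the master process after a worker exits.
    # Tells prometheus_client that this pid is gone so its live gauge
    # values are no longer reported.
    multiprocess.mark_process_dead(worker.pid)
```

With this in place, the registry built in make_metrics_app() reads the per-process files in PROMETHEUS_MULTIPROC_DIR and aggregates them on every scrape, regardless of which worker answers the request.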
@daxartio

It’s better to use async handlers, because sync handlers always run in a thread pool (limited to 40 threads by default in AnyIO, if I’m not mistaken). Since Python has a GIL and only one thread can execute Python code at a time, the hand-off to a worker thread adds overhead without adding any parallelism.

@app.get("/healthz", response_class=PlainTextResponse)
async def health():
    return "OK"


@app.get("/api/devices", response_class=ORJSONResponse)
async def get_devices():
    ...
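
The thread-pool hand-off described in this comment can be observed with a small standard-library sketch. This is not FastAPI code: asyncio.to_thread stands in for the framework's thread pool dispatch (an assumption made to keep the example self-contained), and the function names are hypothetical.

```python
import asyncio
import threading


def sync_health() -> str:
    # Plain function: an async framework has to push it to a worker thread.
    return threading.current_thread().name


async def async_health() -> str:
    # Coroutine: runs directly on the event loop thread, no hand-off needed.
    return threading.current_thread().name


async def main() -> dict:
    loop_thread = threading.current_thread().name
    # Stand-in for how a sync handler is dispatched to a thread pool.
    sync_thread = await asyncio.to_thread(sync_health)
    # The async handler is simply awaited on the loop thread.
    async_thread = await async_health()
    return {"loop": loop_thread, "sync": sync_thread, "async": async_thread}


result = asyncio.run(main())
print(result)
```

The sync function reports a different thread name than the event loop, while the coroutine reports the loop's own thread. Every request to a sync handler pays for that extra thread switch.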

@daxartio

It’s not good practice to hold a pooled connection through a yield dependency like this:

async def get_db() -> AsyncGenerator[asyncpg.Connection, None]:
    async with db.get_connection() as conn:
        yield conn


@app.post("/api/devices", status_code=201, response_class=ORJSONResponse)
async def create_device(
    device: DeviceRequest, conn: PostgresDep, cache_client: MemcachedDep
):
    ...
    await conn...
    await cache_client...
    ...

The problem is that await cache_client... executes after await conn..., while the dependency still holds the connection. The connection stays checked out during the cache operation even though it is no longer needed, so it is not returned to the pool in a timely manner.

A better approach is:

@app.post("/api/devices", status_code=201, response_class=ORJSONResponse)
async def create_device(
    device: DeviceRequest, db: PostgresDep, cache_client: MemcachedDep
):
    ...
    async with db.get_connection() as conn:
        await conn...
    await cache_client...
    ...

This way, the connection is returned to the pool before the cache operation runs, so other requests can use it while this one waits on the cache.
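
The difference between the two handlers can be made visible with a small standard-library sketch. FakePool, cache_set, and both handler names are hypothetical stand-ins (they are not asyncpg or the lesson's code); the pool only counts how many connections are checked out at the moment the cache call runs.

```python
import asyncio
from contextlib import asynccontextmanager


class FakePool:
    """Hypothetical stand-in for a connection pool; counts checked-out connections."""

    def __init__(self) -> None:
        self.in_use = 0

    @asynccontextmanager
    async def get_connection(self):
        self.in_use += 1
        try:
            yield "conn"
        finally:
            self.in_use -= 1


async def cache_set(pool: FakePool) -> int:
    # Stands in for `await cache_client...`; reports how many connections
    # are checked out while the cache operation is in flight.
    await asyncio.sleep(0)
    return pool.in_use


async def handler_dependency_style(pool: FakePool) -> int:
    # Mirrors the yield dependency: the connection spans the whole handler.
    async with pool.get_connection():
        await asyncio.sleep(0)  # stands in for `await conn...`
        return await cache_set(pool)


async def handler_scoped_style(pool: FakePool) -> int:
    # Mirrors the suggested version: release the connection first.
    async with pool.get_connection():
        await asyncio.sleep(0)  # stands in for `await conn...`
    return await cache_set(pool)


async def main() -> tuple:
    pool = FakePool()
    held = await handler_dependency_style(pool)
    released = await handler_scoped_style(pool)
    return held, released


held_during_cache, held_after_release = asyncio.run(main())
print(held_during_cache, held_after_release)
```

In the dependency-style handler one connection is still checked out during the cache call; in the scoped version none is. With a small pool and a slow cache, that gap decides how many requests can reach the database concurrently.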

@daxartio

I think Python is not faster than Go or Node.js, and that’s okay. But these changes will slightly improve performance and help manage expectations so people won’t be too disappointed.
