
[openapi server] log exception in exception handler (2/N) #36201

Merged
vllm-bot merged 1 commit into vllm-project:main from andyxning:log_http_exception_in_handler
Mar 11, 2026

Conversation

@andyxning
Contributor

@andyxning andyxning commented Mar 6, 2026

Purpose

This is a follow-up PR to #31164.

  1. Refactor the models-related exception handling to match the other OpenAPI endpoints.
  2. Move create_error_response to vllm.entrypoints.utils.create_error_response.
  3. Change the OpenAPI response status code for the handler is None case from BadRequestError to NotImplementedError.
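The combined effect of these items can be sketched as follows. This is a minimal illustration with hypothetical names; the real helper lives at vllm.entrypoints.utils.create_error_response, and the actual field layout is taken from the JSON bodies shown later in this conversation:

```python
import logging
from http import HTTPStatus

logger = logging.getLogger("vllm.entrypoints")

# Hypothetical stand-in mirroring the JSON error bodies shown in this PR;
# the real helper is vllm.entrypoints.utils.create_error_response.
def create_error_response(message: str,
                          err_type: str = "BadRequestError",
                          status_code: HTTPStatus = HTTPStatus.BAD_REQUEST) -> dict:
    return {
        "error": {
            "message": message,
            "type": err_type,
            "param": None,
            "code": status_code.value,
        }
    }

def handle_exception(exc: Exception) -> dict:
    # Log first (the point of this PR series), then build the response body.
    logger.exception("Error in API handler: %s", exc)
    if isinstance(exc, NotImplementedError):
        # handler-is-None cases now surface as 501 rather than 400.
        return create_error_response(str(exc), "NotImplementedError",
                                     HTTPStatus.NOT_IMPLEMENTED)
    return create_error_response(str(exc), "InternalServerError",
                                 HTTPStatus.INTERNAL_SERVER_ERROR)
```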

Test Plan

NA

Test Result

NA


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request refactors exception handling by centralizing it in vllm/entrypoints/utils.py, which is a positive step towards better maintainability. My feedback focuses on improving the robustness of this new centralized logic. The current implementation relies on string matching on exception messages, which is fragile. I've suggested using custom exception types instead, which would make the error handling more explicit and resilient. I also pointed out an instance where overly broad exception catching could lead to mis-classifying server-side errors as client errors, and recommended more granular handling.

Comment on lines +638 to +640
 except Exception as e:
     operation = "translation" if is_translation else "transcription"
-    return ErrorResponse(
-        error=ErrorInfo(
-            message=f"Failed to process {operation}: {str(e)}",
-            type="BadRequestError",
-            code=HTTPStatus.BAD_REQUEST.value,
-        )
-    )
+    raise Exception(f"Failed to process {operation}: {str(e)}") from e
Contributor

high

Catching a broad Exception and re-raising it to be classified as a BadRequestError can be problematic. This try block covers operations like downloading from a URL, which can fail for various reasons (e.g., network issues, server-side problems with the URL host) that are not client errors. Classifying all such failures as a 400 Bad Request can be misleading and hide the true cause of the error. It would be better to have more granular exception handling to differentiate between client errors (like an invalid URL) and server-side or transient errors, and map them to appropriate HTTP status codes (e.g., 4xx vs 5xx).
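One way to realize the suggested granularity is a small classifier keyed on exception type. This is a hedged sketch only; the mapping below is illustrative, and the actual exception types raised by the audio download/processing path would need to be confirmed against the vLLM code:

```python
from http import HTTPStatus

# Illustrative mapping, not vLLM's actual behavior: distinguish client
# errors from upstream/transient failures instead of returning 400 for all.
def classify_audio_failure(exc: Exception) -> HTTPStatus:
    if isinstance(exc, ValueError):
        # e.g. a malformed URL or unsupported audio format: a client error.
        return HTTPStatus.BAD_REQUEST
    if isinstance(exc, (ConnectionError, TimeoutError)):
        # Network/upstream failure while fetching the URL: not the client's fault.
        return HTTPStatus.BAD_GATEWAY
    # Anything else is an unexpected server-side error.
    return HTTPStatus.INTERNAL_SERVER_ERROR
```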

Comment on lines +330 to +337
elif "No adapter found" in str(exc):
    err_type = "NotFoundError"
    status_code = HTTPStatus.NOT_FOUND
    param = None
elif "translation" in str(exc) or "transcription" in str(exc):
    err_type = "BadRequestError"
    status_code = HTTPStatus.BAD_REQUEST
    param = None
Contributor

high

Using string matching on exception messages to determine error types is brittle and can lead to incorrect error classification. If the underlying error messages change, this logic will break. A more robust approach is to use custom exception types.

For example:

  1. For the 'No adapter found' case, a specific LoRAAdapterNotFoundError could be raised from the LoRA loading logic.
  2. For transcription/translation failures, a custom AudioProcessingError could be raised from run_batch.py.

Then, you could check for these specific exception types here:

# from vllm.lora.exception import LoRAAdapterNotFoundError
# from vllm.entrypoints.openai.run_batch import AudioProcessingError

# ...

elif isinstance(exc, LoRAAdapterNotFoundError):
    err_type = "NotFoundError"
    status_code = HTTPStatus.NOT_FOUND
    param = None
elif isinstance(exc, AudioProcessingError):
    err_type = "BadRequestError"
    status_code = HTTPStatus.BAD_REQUEST
    param = None

This would make the error handling more explicit, maintainable, and resilient to changes.

Contributor Author

Will do.

@noooop
Collaborator

noooop commented Mar 6, 2026

Can create_error_response here be replaced with NotImplementedError? If yes, please do it.

if handler is None:
    error_response = create_error_response(
        message="The model does not support Classification API"
    )
    return JSONResponse(
        content=error_response.model_dump(),
        status_code=error_response.error.code,
    )

@andyxning andyxning force-pushed the log_http_exception_in_handler branch from d4d8bf8 to 9c1bfd8 Compare March 6, 2026 03:59
@andyxning andyxning requested a review from jeejeelee as a code owner March 6, 2026 03:59
@andyxning andyxning force-pushed the log_http_exception_in_handler branch from 9c1bfd8 to 9c412c1 Compare March 6, 2026 04:12
@andyxning andyxning requested a review from NickLucche as a code owner March 6, 2026 04:12
try:
    await self.engine_client.add_lora(lora_request)
except Exception as e:
    error_type = "BadRequestError"
Contributor Author

A plain Exception is now reported as InternalServerError instead of BadRequestError, i.e., the HTTP response status code changes from 400 to 500.

Contributor Author

For example, if the --enable-lora arg is not set, the exception raised while loading a LoRA adapter will be:

{"error":{"message":"Call to add_lora method failed: LoRA is not enabled. Use --enable-lora to enable LoRA.","type":"InternalServerError","param":null,"code":500}}

@andyxning andyxning force-pushed the log_http_exception_in_handler branch 2 times, most recently from 8da82a3 to e943ed1 Compare March 9, 2026 02:57
Comment on lines +469 to +472
class AudioProcessingError(Exception):
    """Exception raised when audio processing encounters an error.

    This exception is used to handle various error conditions that may occur
Collaborator

I now think it's better to put exceptions in vllm/exceptions.py rather than keeping them separate.

Contributor Author

This code has been deleted.

@andyxning andyxning force-pushed the log_http_exception_in_handler branch from e943ed1 to b17623b Compare March 9, 2026 03:26
@andyxning andyxning requested a review from mgoin as a code owner March 9, 2026 03:26
@andyxning
Contributor Author

> Can create_error_response here be replaced with NotImplementedError? If yes, please do it.

@noooop I have refactored all the occurrences of handler is None in the API from returning an error message to raising a NotImplementedError exception. Thus, the response changes from

{
    "error": {
        "message": "The model does not support Chat Completions API",
        "type": "BadRequestError",
        "param": null,
        "code": 400
    }
}

to

{
    "error": {
        "message": "The model does not support Chat Completions API",
        "type": "NotImplementedError",
        "param": null,
        "code": 501
    }
}
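The shape of this refactor can be sketched as follows. Names and signatures here are illustrative, not vLLM's actual API: the route raises NotImplementedError instead of building a 400 body inline, and a shared handler maps the exception type to the 501 body:

```python
from http import HTTPStatus

# Illustrative route: raise instead of constructing an error response inline.
def classify_endpoint(handler, request):
    if handler is None:
        raise NotImplementedError("The model does not support Classification API")
    return handler(request)

# A shared exception handler then maps the exception type to the JSON body.
def to_error_body(exc: Exception) -> dict:
    status = (HTTPStatus.NOT_IMPLEMENTED
              if isinstance(exc, NotImplementedError)
              else HTTPStatus.INTERNAL_SERVER_ERROR)
    return {
        "error": {
            "message": str(exc),
            "type": type(exc).__name__,
            "param": None,
            "code": status.value,
        }
    }
```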

@noooop
Collaborator

noooop commented Mar 9, 2026

> I have refactored all the occurrences of handler is None in the API from returning an error message to raising a NotImplementedError exception. […]

I think at least for pooling entrypoints, this change is acceptable. Since the pooling routers are now only mounted when the model supports the corresponding pooling tasks, this exception should hardly ever be triggered.

def register_pooling_api_routers(
    app: FastAPI, supported_tasks: tuple["SupportedTask", ...]
):
    from vllm.entrypoints.pooling.pooling.api_router import router as pooling_router

    app.include_router(pooling_router)

    if "classify" in supported_tasks:
        from vllm.entrypoints.pooling.classify.api_router import (
            router as classify_router,
        )

        app.include_router(classify_router)

    if "embed" in supported_tasks:
        from vllm.entrypoints.pooling.embed.api_router import router as embed_router

        app.include_router(embed_router)

    # Score/rerank endpoints are available for:
    # - "score" task (cross-encoder models)
    # - "embed" task (bi-encoder models)
    # - "token_embed" task (late interaction models like ColBERT)
    if any(t in supported_tasks for t in ("score", "embed", "token_embed")):
        from vllm.entrypoints.pooling.score.api_router import router as score_router

        app.include_router(score_router)

@mergify

mergify bot commented Mar 9, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @andyxning.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 9, 2026
@andyxning andyxning force-pushed the log_http_exception_in_handler branch from b17623b to f1c17b0 Compare March 9, 2026 03:35
@mergify mergify bot removed the needs-rebase label Mar 9, 2026
@andyxning andyxning force-pushed the log_http_exception_in_handler branch from f1c17b0 to f085cfe Compare March 9, 2026 03:47
@andyxning
Contributor Author

/cc @DarkLight1337 @noooop

@andyxning andyxning force-pushed the log_http_exception_in_handler branch 2 times, most recently from 4e44d3c to 6bbf10d Compare March 10, 2026 03:25
@andyxning andyxning force-pushed the log_http_exception_in_handler branch 2 times, most recently from 99c8d63 to 5a44de0 Compare March 10, 2026 03:55
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) March 10, 2026 03:58
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 10, 2026
auto-merge was automatically disabled March 10, 2026 04:02

Head branch was pushed to by a user without write access

@andyxning andyxning force-pushed the log_http_exception_in_handler branch from 5a44de0 to 203ed93 Compare March 10, 2026 04:02
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) March 10, 2026 04:08
auto-merge was automatically disabled March 10, 2026 09:34

Head branch was pushed to by a user without write access

@andyxning andyxning force-pushed the log_http_exception_in_handler branch 2 times, most recently from d76e966 to a298288 Compare March 10, 2026 09:50
@noooop noooop enabled auto-merge (squash) March 10, 2026 10:45
Signed-off-by: Andy Xie <andy.xning@gmail.com>
auto-merge was automatically disabled March 10, 2026 11:46

Head branch was pushed to by a user without write access

@andyxning andyxning force-pushed the log_http_exception_in_handler branch from a298288 to 25bdf2e Compare March 10, 2026 11:46
@andyxning
Contributor Author

@DarkLight1337 The CI failures seem unrelated to this PR. Can you help force-merge it?

@vllm-bot vllm-bot merged commit fe714dd into vllm-project:main Mar 11, 2026
59 of 61 checks passed
@andyxning andyxning deleted the log_http_exception_in_handler branch March 11, 2026 03:27
wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request Mar 18, 2026
fxdawnn pushed a commit to fxdawnn/vllm that referenced this pull request Mar 19, 2026

Labels

frontend ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants