[openapi server] log exception in exception handler(2/N)#36201
[openapi server] log exception in exception handler(2/N)#36201vllm-bot merged 1 commit intovllm-project:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors exception handling by centralizing it in vllm/entrypoints/utils.py, which is a positive step towards better maintainability. My feedback focuses on improving the robustness of this new centralized logic. The current implementation relies on string matching on exception messages, which is fragile. I've suggested using custom exception types instead, which would make the error handling more explicit and resilient. I also pointed out an instance where overly broad exception catching could lead to mis-classifying server-side errors as client errors, and recommended more granular handling.
vllm/entrypoints/openai/run_batch.py
Outdated
| except Exception as e: | ||
| operation = "translation" if is_translation else "transcription" | ||
| return ErrorResponse( | ||
| error=ErrorInfo( | ||
| message=f"Failed to process {operation}: {str(e)}", | ||
| type="BadRequestError", | ||
| code=HTTPStatus.BAD_REQUEST.value, | ||
| ) | ||
| ) | ||
| raise Exception(f"Failed to process {operation}: {str(e)}") from e |
There was a problem hiding this comment.
Catching a broad Exception and re-raising it to be classified as a BadRequestError can be problematic. This try block covers operations like downloading from a URL, which can fail for various reasons (e.g., network issues, server-side problems with the URL host) that are not client errors. Classifying all such failures as a 400 Bad Request can be misleading and hide the true cause of the error. It would be better to have more granular exception handling to differentiate between client errors (like an invalid URL) and server-side or transient errors, and map them to appropriate HTTP status codes (e.g., 4xx vs 5xx).
vllm/entrypoints/utils.py
Outdated
| elif "No adapter found" in str(exc): | ||
| err_type = "NotFoundError" | ||
| status_code = HTTPStatus.NOT_FOUND | ||
| param = None | ||
| elif "translation" in str(exc) or "transcription" in str(exc): | ||
| err_type = "BadRequestError" | ||
| status_code = HTTPStatus.BAD_REQUEST | ||
| param = None |
There was a problem hiding this comment.
Using string matching on exception messages to determine error types is brittle and can lead to incorrect error classification. If the underlying error messages change, this logic will break. A more robust approach is to use custom exception types.
For example:
- For the 'No adapter found' case, a specific
LoRAAdapterNotFoundErrorcould be raised from the LoRA loading logic. - For transcription/translation failures, a custom
AudioProcessingErrorcould be raised fromrun_batch.py.
Then, you could check for these specific exception types here:
# from vllm.lora.exception import LoRAAdapterNotFoundError
# from vllm.entrypoints.openai.run_batch import AudioProcessingError
# ...
elif isinstance(exc, LoRAAdapterNotFoundError):
err_type = "NotFoundError"
status_code = HTTPStatus.NOT_FOUND
param = None
elif isinstance(exc, AudioProcessingError):
err_type = "BadRequestError"
status_code = HTTPStatus.BAD_REQUEST
param = NoneThis would make the error handling more explicit, maintainable, and resilient to changes.
|
Can create_error_response here be replaced with NotImplementedError? If yes, please do it. vllm/vllm/entrypoints/pooling/classify/api_router.py Lines 32 to 39 in c012a8c |
d4d8bf8 to
9c1bfd8
Compare
9c1bfd8 to
9c412c1
Compare
| try: | ||
| await self.engine_client.add_lora(lora_request) | ||
| except Exception as e: | ||
| error_type = "BadRequestError" |
There was a problem hiding this comment.
Normal Exception has changed from BadRequestError to InternalServerError, i.e., http response status code from 404 to 500
There was a problem hiding this comment.
For example, if --enable-lora arg is not set, exception occurred during load lora adapter will be:
{"error":{"message":"Call to add_lora method failed: LoRA is not enabled. Use --enable-lora to enable LoRA.","type":"InternalServerError","param":null,"code":500}}
8da82a3 to
e943ed1
Compare
| class AudioProcessingError(Exception): | ||
| """Exception raised when audio processing encounters an error. | ||
|
|
||
| This exception is used to handle various error conditions that may occur |
There was a problem hiding this comment.
I now think it's better to put exceptions in vllm/exceptions.py rather than keeping them separate.
There was a problem hiding this comment.
This code has been deleted.
e943ed1 to
b17623b
Compare
@noooop I have refactored all the occurances about to |
I think at least for pooling entrypoints, this change is acceptable. Because now the corresponding pooling entrypoints will only be mounted on entrypoints that support pooling tasks, I think this exception will hardly ever be triggered. vllm/vllm/entrypoints/pooling/__init__.py Lines 21 to 47 in bd2659a |
|
This pull request has merge conflicts that must be resolved before it can be |
b17623b to
f1c17b0
Compare
f1c17b0 to
f085cfe
Compare
4e44d3c to
6bbf10d
Compare
99c8d63 to
5a44de0
Compare
Head branch was pushed to by a user without write access
5a44de0 to
203ed93
Compare
Head branch was pushed to by a user without write access
d76e966 to
a298288
Compare
Signed-off-by: Andy Xie <andy.xning@gmail.com>
Head branch was pushed to by a user without write access
a298288 to
25bdf2e
Compare
|
@DarkLight1337 Seems the ci has nothing to do with this PR. can you help force merge this PR. |
…t#36201) Signed-off-by: Andy Xie <andy.xning@gmail.com>
…t#36201) Signed-off-by: Andy Xie <andy.xning@gmail.com>
Purpose
This is a follow-up PR for #31164
create_error_responseto thevllm.entrypoints.utils.create_error_response.BadRequestErrortoNotImplementedError.Test Plan
NA
Test Result
NA
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.