[openai api] log exception in exception handler (1/N)#31164
[openai api] log exception in exception handler (1/N)#31164DarkLight1337 merged 1 commit intovllm-project:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request modifies exception handling and logging across vllm/entrypoints/launcher.py and vllm/entrypoints/openai/api_server.py. In launcher.py, a logger.exception call was added to the runtime_exception_handler for RuntimeError, EngineDeadError, and EngineGenerateError, and the exception parameter was explicitly named exc. In api_server.py, a logger.exception call was removed from an except block handling create_messages errors, while new logger.exception calls were added to the http_exception_handler and validation_exception_handler for HTTPException and RequestValidationError respectively.
|
This pull request has merge conflicts that must be resolved before it can be |
1c67435 to
81c268f
Compare
81c268f to
7dc44f9
Compare
|
I dont think this can be on by default --- there could be a lot of logs from users creating malformed requests |
|
How about adding a new env |
|
This is quite useful when used in debug. |
a52e66b to
2a099ba
Compare
|
@robertgshaw2-redhat PTAL. |
|
Friendly ping. @robertgshaw2-redhat |
DarkLight1337
left a comment
There was a problem hiding this comment.
Isn't this controlled by FrontendArgs.log_error_stack already?
|
Thanks for the tips. Some api does not correctly handle handler exception thus the exception will not be logged. The code blocks shows that the exception is recorded with Will dig this later. |
3f0c514 to
fc56b85
Compare
|
Only |
|
Guard |
|
I am ok with this change but let's have a second opinion from @njhill @robertgshaw2-redhat |
fc56b85 to
92187e8
Compare
|
This pull request has merge conflicts that must be resolved before it can be |
92187e8 to
e2188f6
Compare
5b1cb62 to
6ac7c29
Compare
|
|
||
| try: | ||
| generator = await handler.create_classify(request, raw_request) | ||
| except Exception as e: | ||
| generator = handler.create_error_response(e) | ||
| generator = await handler.create_classify(request, raw_request) | ||
|
|
There was a problem hiding this comment.
awesome work!
I am currently refactoring the pooling entrypoints #35604, which may conflict with this PR. Below are three options.
- This PR does not modify the pooling entrypoints, waiting for [Frontend][1/n] Improve pooling entrypoints | classify. #35604 to land, then another PR will make the modifications for pooling entrypoints.
2. [Frontend][1/n] Improve pooling entrypoints | classify. #35604 wait for this PR to land.
3. First, merge [Frontend][1/n] Improve pooling entrypoints | classify. #35604, then refactor the exception handler of the pooling entrypoints, followed by refactoring the other parts of this pr.
I prefer option 3 because this PR has a large scope and is difficult to land. If we only modify the pooling entrypoints, it might be simpler. (Also, the exception handler in #35604 is handled in one place, and I believe it is the direction for future API server optimization.
There was a problem hiding this comment.
Sorry, I am currently refactoring the pooling entrypoints #35604, which may conflict with this PR. Please do not modify the vllm/entrypoints/pooling/ section for the time being to avoid conflicts.
I have separated a create_error_response in vllm/entrypoints/utils.py, which you can use.
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/utils.py#L305
|
/cc @njhill |
|
This pull request has merge conflicts that must be resolved before it can be |
njhill
left a comment
There was a problem hiding this comment.
Thanks @andyxning, looks like a good simplification.
| serving._raise_if_error(None, "test-request-id") # should not raise | ||
|
|
||
|
|
||
| @pytest.mark.asyncio |
1d6ceef to
7cf20e9
Compare
|
|
||
| from ...utils import create_error_response | ||
| from .io_processor import PoolingIOProcessor | ||
|
|
|
Hi @andyxning, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
7cf20e9 to
3a14559
Compare
|
All except ut comments has been addressed. PTAL. @DarkLight1337 @njhill @noooop Others changes will be made in the follow-up prs. |
There was a problem hiding this comment.
| HTTPStatus.BAD_REQUEST.value: {"model": ErrorResponse}, | ||
| HTTPStatus.NOT_FOUND.value: {"model": ErrorResponse}, | ||
| HTTPStatus.INTERNAL_SERVER_ERROR.value: {"model": ErrorResponse}, | ||
| HTTPStatus.NOT_IMPLEMENTED.value: {"model": ErrorResponse}, |
There was a problem hiding this comment.
| HTTPStatus.BAD_REQUEST.value: {"model": ErrorResponse}, | ||
| HTTPStatus.NOT_FOUND.value: {"model": ErrorResponse}, | ||
| HTTPStatus.INTERNAL_SERVER_ERROR.value: {"model": ErrorResponse}, | ||
| HTTPStatus.NOT_IMPLEMENTED.value: {"model": ErrorResponse}, |
There was a problem hiding this comment.
| return strategy.filter(no_invalid_types) | ||
|
|
||
|
|
||
| def customized_not_a_server_error( |
There was a problem hiding this comment.
New changes. By default all server error will cause an exception or failure in schemathesis. We need to customize this by allowing /v1/chat/completions/render and /v1/chat/completions openapi return 501 status code.
Signed-off-by: Andy Xie <andy.xning@gmail.com>
| reasoning_parser, | ||
| ) | ||
| except GenerationError as e: | ||
| return self._convert_generation_error_to_response(e) |
There was a problem hiding this comment.
This should not be removed. This is a custom error defined by vLLM and should not produce an error log.
@DarkLight1337 @andyxning @noooop @njhill
see #37148
There was a problem hiding this comment.
Will invest later.


Since this PR has been complicated enough. It is XXXL. I will make another PR to refactor
_convert_generation_error_to_streaming_response.Purpose
try...exceptaroundcreate_error_responseto simplify the code and enhance code readibility.example:
Test Plan
NA
Test Result
NA
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.Note
Improves debuggability and simplifies logging configuration.
args.log_error_stackis true) in FastAPI handlers forHTTPExceptionandRequestValidationError, and in launcher runtime exception handlerlog_error_stackparameter/usage across serving classes (OpenAIServing, chat/completion/responses/embedding/pooling/classify/score/tokenize/transcription/translation); stop manual traceback printing in serving layerlog_error_stack; keep structured error responses unchanged and avoid duplicate logs inmessageserror pathWritten by Cursor Bugbot for commit c19e018427f080d3ed27e387e2aeefde1e3de64a. This will update automatically on new commits. Configure here.
Note
Cursor Bugbot is generating a summary for commit cbb7ff7349e0e8b974e7893b9b0c211b48982b59. Configure here.
Note
Improves debuggability while simplifying logging configuration.
args.log_error_stack) in FastAPI handlers forHTTPExceptionandRequestValidationError, and in launcher runtime exception handlerlog_error_stackparameter/usage across serving classes (OpenAIServingand all OpenAI/pooling/speech/tokenize derivatives) and delete manual traceback printing inserving_enginelog_error_stack; error response payloads remain the same; drop extra exception log increate_messagesto avoid duplicatesOpenAIServingChatAPI change: replacelog_error_stackwithexclude_log_deltas; wireargs.exclude_log_deltasduring constructionWritten by Cursor Bugbot for commit cbb7ff7349e0e8b974e7893b9b0c211b48982b59. This will update automatically on new commits. Configure here.
Note
Improves debuggability while simplifying logging configuration.
args.log_error_stack) inHTTPExceptionandRequestValidationErrorhandlers and in the launcher runtime exception handlerlog_error_stackplumbing from serving classes (OpenAIServingand all chat/completion/responses/pooling/embed/classify/score/tokenize/transcription/translation variants) and stop manual traceback printing inserving_enginelog_error_stack; keep error payloads unchanged and drop duplicate logging increate_messageserror pathWritten by Cursor Bugbot for commit 8905b7e1df3ead3c1cd1b83155522dc4b2985561. This will update automatically on new commits. Configure here.
Note
Cursor Bugbot is generating a summary for commit 80d180879b44e789d25b1472709dd67e0fca3917. Configure here.
Note
Improves debuggability while simplifying logging.
args.log_error_stack) inHTTPExceptionandRequestValidationErrorhandlers and in the launcher runtime exception handlerlog_error_stackparameter/usage from serving classes (OpenAIServingand all chat/completion/responses/pooling/embed/classify/score/tokenize/transcription/translation) and delete manual traceback printing inserving_enginelog_error_stack; keep error response payloads unchanged and drop duplicate logging increate_messageserror pathWritten by Cursor Bugbot for commit 4e77554346d2d5deef3e4053303b0ad7172112c1. This will update automatically on new commits. Configure here.