[UX][Metrics] improving vllm-omni metrics#3042
Conversation
Signed-off-by: allgather <all2allops@gmail.com>
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
hsliuustc0106
left a comment
There was a problem hiding this comment.
BLOCKING:
-
Correctness — In
async_omni_engine.py, bothinput_preprocess_time_msandbuild_add_request_message_time_msare set to the same variable (build_add_request_message_time_ms). RFC #3039 specifies these should be separate:input_preprocess_time_msforInputProcessor.process_inputs()andbuild_add_request_message_time_msfor the full function. -
Test Coverage — PR description states "I need someone to test bc I can't find available compute rn." For metrics changes that affect timing reporting, add a unit test verifying the timing format and field values are correct.
Signed-off-by: allgather <all2allops@gmail.com>
|
Thanks for your contribution! Would you mind showing log output before and after? |
Signed-off-by: allgather <all2allops@gmail.com>
Signed-off-by: allgather <all2allops@gmail.com>
|
Tested GLM-Image, Qwen3 Omni, Wan-2.2 all on 8xh100\ GLM-Image log outputServer command: CUDA_VISIBLE_DEVICES=0,1 HF_HOME=$HF_HOME HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN HF_TOKEN=$HF_TOKEN vllm serve zai-org/GLM-Image --omni --port 8091 --log-statsPrompt / request: OmniTiming: StageTiming: Overall Summary: Timing Composition: RequestE2EStats excerpt: Qwen3-Omni log outputModel: CUDA_VISIBLE_DEVICES=0,1 HF_HOME=$HF_HOME HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN HF_TOKEN=$HF_TOKEN vllm serve Qwen/Qwen3-Omni-30B-A3B-Instruct --omni --port 8091 --log-statsPrompt / request: OmniTiming: Overall Summary: Timing Composition: RequestE2EStats excerpt: WAN2.2 T2V log outputModel: CUDA_VISIBLE_DEVICES=0 HF_HOME=$HF_HOME HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN HF_TOKEN=$HF_TOKEN vllm serve Wan-AI/Wan2.2-T2V-A14B-Diffusers --omni --port 8091 --log-statsPrompt / request: StageTiming: Overall Summary: Timing Composition: RequestE2EStats excerpt: |
|
can we follow the style of #3149 for metrics? |
|
@hsliuustc0106 this topic has gotten very competitive. I think 3069 is on its way to get merged so ill close this one, its fine. Thanks for the review effort. I appreciate your suggestions as well. |
Motivation
#3039 proposed changes for omni metrics based on pain points discovered in #2834
Changes
Changes are based off of the proposal in #3039
Testing
ut passed