Conversation

@shenoyvvarun
Contributor

@shenoyvvarun shenoyvvarun commented Jul 29, 2025

Purpose

This PR only adds tests; the fix itself is addressed in #21798.

This PR addresses issue #21344, where large image requests hang in vLLM until the request times out.
Bug: MultiModalProfiler counts only patch tokens, but the input also contains other bookkeeping tokens such as the tile_separator, image_start, and image_end tokens. This makes the encoder_budget slightly lower than the number of tokens actually needed. Whenever an image that uses all tiles is sent, vLLM accepts the request, but the scheduler can never schedule it because there is not enough encoder budget. The failure is silent: no error is produced.
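
The mismatch described above can be sketched with a toy model. This is not vLLM's code: `ImageTokenLayout`, `can_schedule`, and all token counts below are hypothetical and chosen only to illustrate how a budget derived from patch tokens alone can be smaller than what a worst-case image actually needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageTokenLayout:
    """Hypothetical token layout for one tiled image (illustrative numbers only)."""

    num_tiles: int
    patch_tokens_per_tile: int
    separator_tokens: int  # tile separators placed between tiles
    wrapper_tokens: int = 2  # image_start + image_end

    @property
    def patch_tokens(self) -> int:
        return self.num_tiles * self.patch_tokens_per_tile

    @property
    def total_tokens(self) -> int:
        return self.patch_tokens + self.separator_tokens + self.wrapper_tokens


def can_schedule(required_tokens: int, encoder_budget: int) -> bool:
    """Simplified rule: a request fits only if it does not exceed the budget."""
    return required_tokens <= encoder_budget


# Worst-case image: every tile is used (numbers are assumptions).
worst_case = ImageTokenLayout(num_tiles=16, patch_tokens_per_tile=144, separator_tokens=15)

budget_from_patches_only = worst_case.patch_tokens  # the behaviour described in the bug
budget_from_full_span = worst_case.total_tokens  # also counts bookkeeping tokens

print(can_schedule(worst_case.total_tokens, budget_from_patches_only))  # False
print(can_schedule(worst_case.total_tokens, budget_from_full_span))  # True
```

In this toy model the first check fails on every scheduling attempt, which mirrors the reported symptom: the request is accepted but never runs.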

Test Plan

  1. Large image (4k x 4k) with Llama Guard 4
  2. Full context length text + large image with Llama Guard 4
  3. Sanity test with a 7k x 4k image

Test Result

  1. Large image (4k x 4k)
{
    "id": "chatcmpl-464e45d8a7834c41aa96b93c388f5be6",
    "object": "chat.completion",
    "created": 1753798054,
    "model": "vllm-model",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "\n\nsafe",
                "refusal": null,
                "annotations": null,
                "audio": null,
                "function_call": null,
                "tool_calls": [],
                "reasoning_content": null
            },
            "logprobs": null,
            "finish_reason": "stop",
            "stop_reason": null
        }
    ],
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "prompt_tokens": 2669,
        "total_tokens": 2672,
        "completion_tokens": 3,
        "prompt_tokens_details": null
    },
    "prompt_logprobs": null,
    "kv_transfer_params": null
}
  2. Full context length text + large image (2467 is the encoder budget)
{
    "id": "chatcmpl-9406bb1facd54ea194a270fed2577ca8",
    "object": "chat.completion",
    "created": 1753798730,
    "model": "vllm-model",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "0",
                "refusal": null,
                "annotations": null,
                "audio": null,
                "function_call": null,
                "tool_calls": [],
                "reasoning_content": null
            },
            "logprobs": null,
            "finish_reason": "stop",
            "stop_reason": 200001
        }
    ],
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "prompt_tokens": 131069,
        "total_tokens": 131071,
        "completion_tokens": 2,
        "prompt_tokens_details": null
    },
    "prompt_logprobs": null,
    "kv_transfer_params": null
}
  3. Sanity test
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/8/85/Portas_da_Cidade%2C_Ponta_Delgada%2C_isla_de_San_Miguel%2C_Azores%2C_Portugal%2C_2020-07-29%2C_DD_123-125_HDR.jpg"
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this image in detail and also specify where this can be found."
                }
            ]
        }
    ],
    "model": "vllm"
}'
{
    "id": "chatcmpl-86908cb63b644decbcfcbc63d26fbb36",
    "object": "chat.completion",
    "created": 1753799599,
    "model": "vllm",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The image depicts a large, ornate archway in the center of a city square at dusk. The archway is made of stone and features two large arches with a decorative top section. It is illuminated by purple lights on either side.\n\nHere are the main points describing the image:\n\n* **Archway**\n\t+ Made of stone\n\t+ Features two large arches\n\t+ Decorative top section with a crown-like design\n\t+ Illuminated by purple lights on either side\n* **City Square**\n\t+ Located in front of the archway\n\t+ Features a statue in the center\n\t+ Surrounded by buildings on all sides\n\t+ Streetlights and cars visible\n* **Buildings**\n\t+ White with brown trim\n\t+ Feature arched windows and doors\n\t+ Have balconies on the second floor\n\t+ Appear to be old and historic\n* **Statue**\n\t+ Located in the center of the square\n\t+ Depicts a person standing on a pedestal\n\t+ Not clearly visible due to distance\n* **Streetlights and Cars**\n\t+ Streetlights line the square and surrounding streets\n\t+ Cars are parked along the streets and driving through the area\n\t+ Traffic lights visible in the distance\n* **Sky**\n\t+ Dark blue and cloudy\n\t+ Indicates that it is dusk or evening\n\nIn summary, the image shows a beautiful and historic archway in a city square, surrounded by old buildings and a statue. The archway is illuminated by purple lights, and the square is bustling with streetlights and cars. The dark blue sky suggests that it is dusk or evening. \n\nThis archway can be found in Horta, Faial, Azores, Portugal. The archway is known as the Portão da Cidade (City Gate) and is a iconic landmark in Horta.",
                "refusal": null,
                "annotations": null,
                "audio": null,
                "function_call": null,
                "tool_calls": [],
                "reasoning_content": null
            },
            "logprobs": null,
            "finish_reason": "stop",
            "stop_reason": null
        }
    ],
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "prompt_tokens": 2346,
        "total_tokens": 2732,
        "completion_tokens": 386,
        "prompt_tokens_details": null
    },
    "prompt_logprobs": null,
    "kv_transfer_params": null
}
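
As a quick arithmetic check, the usage blocks in the three responses above can be verified for internal consistency. The sketch below only copies the reported numbers and confirms that prompt and completion tokens add up to the total; the helper name `usage_is_consistent` is made up for this illustration.

```python
# (label, prompt_tokens, completion_tokens, total_tokens), copied from the responses above
reported_usage = [
    ("large image", 2669, 3, 2672),
    ("full context + large image", 131069, 2, 131071),
    ("sanity test", 2346, 386, 2732),
]


def usage_is_consistent(prompt_tokens: int, completion_tokens: int, total_tokens: int) -> bool:
    """Return True when prompt + completion tokens equal the reported total."""
    return prompt_tokens + completion_tokens == total_tokens


results = {
    label: usage_is_consistent(prompt, completion, total)
    for label, prompt, completion, total in reported_usage
}

for label, ok in results.items():
    print(f"{label}: {'consistent' if ok else 'MISMATCH'}")
```

All three entries come out consistent, which confirms that each request completed and returned a well-formed usage summary rather than hanging.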

(Optional) Documentation Update

@mergify mergify bot added the llama (Related to Llama models) and multi-modality (Related to multi-modality (#4194)) labels Jul 29, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This PR fixes a critical bug in the MultiModalProfiler that caused hangs with large image requests. The fix involves using the length parameter on PlaceHolderRange for accurate token budget calculation. A new test suite has been added to validate the fix. I've identified a high-severity issue in the test file related to incorrect token calculation, which needs to be addressed.
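
To illustrate the difference between counting a placeholder's full length and counting only its embedding positions, here is a minimal sketch. `PlaceholderSpan` and `span_for` are hypothetical stand-ins written for this example, not vLLM's classes, and the token names are taken loosely from the PR description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlaceholderSpan:
    """Hypothetical stand-in for an image placeholder range in a prompt."""

    offset: int  # index of the first placeholder token in the prompt
    length: int  # every token in the span, bookkeeping tokens included
    is_embed: List[bool]  # True where a position receives an image embedding

    def num_embeds(self) -> int:
        """Count only the positions that carry image embeddings."""
        return sum(self.is_embed)


def span_for(tokens: List[str], offset: int = 0) -> PlaceholderSpan:
    """Build a span in which only 'patch' tokens count as embedding positions."""
    return PlaceholderSpan(
        offset=offset,
        length=len(tokens),
        is_embed=[token == "patch" for token in tokens],
    )


image_tokens = [
    "image_start", "patch", "patch", "tile_separator", "patch", "patch", "image_end",
]
span = span_for(image_tokens, offset=5)

print(span.num_embeds())  # 4: patch tokens only
print(span.length)  # 7: patches plus bookkeeping tokens
```

A budget sized from `num_embeds()` would be three tokens short for this span, whereas one sized from `length` covers the whole placeholder.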

@DarkLight1337
Member

See my comments in #21798

@DarkLight1337
Member

Maybe you can add the tests while the other author focuses on the fix itself

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@shenoyvvarun
Contributor Author

> Maybe you can add the tests while the other author focuses on the fix itself

Makes sense 👍

@shenoyvvarun shenoyvvarun force-pushed the vasheno/profiler_bug_fix branch from 05da435 to ff1c085 Compare July 29, 2025 15:18
@shenoyvvarun shenoyvvarun force-pushed the vasheno/profiler_bug_fix branch from ff1c085 to 2a80323 Compare July 30, 2025 01:28
@shenoyvvarun shenoyvvarun changed the title from "[Bugfix] Fixing bug inside MultiModalProfiler." to "[Tests] Fixing bug inside MultiModalProfiler." Jul 30, 2025
@DarkLight1337
Member

Can you fix pre-commit?

@shenoyvvarun shenoyvvarun force-pushed the vasheno/profiler_bug_fix branch from 2a80323 to caf4475 Compare July 30, 2025 03:16
Member

@DarkLight1337 DarkLight1337 left a comment

Thanks, LGTM given tests pass

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) July 30, 2025 03:19
@github-actions github-actions bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Jul 30, 2025
@shenoyvvarun
Contributor Author

> Can you fix pre-commit?

Fixed. Please note that the current failure is not caused by changes in this PR but by the file below. I will rebase and check whether that fixes the issue.

docs/design/fused_moe_modular_kernel.md

@DarkLight1337
Member

No need to rebase, I can force-merge

@vllm-bot vllm-bot merged commit 5477952 into vllm-project:main Jul 30, 2025
44 of 50 checks passed
liuyumoye pushed a commit to liuyumoye/vllm that referenced this pull request Jul 31, 2025
vadiklyutiy pushed a commit to CentML/vllm that referenced this pull request Aug 5, 2025
x22x22 pushed a commit to x22x22/vllm that referenced this pull request Aug 5, 2025
npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025
jinzhen-lin pushed a commit to jinzhen-lin/vllm that referenced this pull request Aug 9, 2025
noamgat pushed a commit to noamgat/vllm that referenced this pull request Aug 9, 2025
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025