fix: Properly handle multiple text components from request by jh-nv · Pull Request #5196 · ai-dynamo/dynamo

jh-nv · 2026-01-05T23:47:25Z

Overview:

Properly concatenate all text components from a multi-modal request to build the user prompt.

Details:

This is a follow up to #5143. If there are multiple text components present in a single OpenAI chat request, we concatenate them to build the text prompt. If the request is a multi-turn conversation, the code add a newline between each turn.

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes GitHub issue: #xxx

Summary by CodeRabbit

Bug Fixes
- Improved multi-modal message processing to correctly handle multi-turn conversations with multiple text segments, ensuring all user input is properly captured and concatenated rather than processing only individual text items.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

…rompt_DIS-1224

coderabbitai · 2026-01-05T23:51:06Z

Walkthrough

Modified user prompt extraction logic in processor.py to collect text segments from all user-role messages, concatenate text within each message, and join user turns with newlines. Updated error condition to trigger when no user messages contain text.

Changes

Cohort / File(s)
Multi-turn prompt extraction refactor `examples/multimodal/components/processor.py`	Reworked user prompt extraction to handle multiple user-role messages by collecting and concatenating all text segments, joining user turns with newlines, and updating error handling for cases where no user messages contain text.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Through multi-turn conversations we hop,
Gathering text from each message stop,
Stitching together with newlines so neat,
User voices joined in harmonious beat! 🌟

Pre-merge checks

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: handling multiple text components from requests, which aligns with the code rework to concatenate all text segments from multi-modal messages.
Description check	✅ Passed	The description includes Overview, Details, and Related Issues sections as per the template. However, the 'Where should the reviewer start?' section is empty, which is a minor gap but not critical.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Fix all issues with AI Agents 🤖

In @examples/multimodal/components/processor.py:
- Around line 207-230: The current extraction appends empty strings from
message.content into user_texts so user_text may become empty or only newlines;
update the collection in the loop over raw_request.messages/message.content so
you only append item.text if item.type == "text" and item.text.strip() is
non-empty (i.e., skip whitespace-only strings), ensure you only append joined
text_parts to user_texts when text_parts contains at least one non-empty part,
and after building user_text (from "\n".join(user_texts)) validate that
user_text.strip() is non-empty and raise the existing ValueError if not;
reference symbols: raw_request.messages, message.content, item.type, item.text,
user_texts, user_text.

🧹 Nitpick comments (1)

examples/multimodal/components/processor.py (1)
227-227: Optional: Consider extracting error message to a constant.

The static analysis tool suggests avoiding long messages directly in exception raises. While this is a minor style issue, you could optionally extract it to improve maintainability:
🔎 Proposed refactor

Near the top of the class or module:
_ERR_NO_USER_TEXT = "No text content found in user messages"
Then at line 227:
-        raise ValueError("No text content found in user messages")
+        raise ValueError(_ERR_NO_USER_TEXT)

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e791871 and 64f8182.

📒 Files selected for processing (1)

examples/multimodal/components/processor.py

🧰 Additional context used

🧠 Learnings (4)

📓 Common learnings

Learnt from: KrishnanPrash
Repo: ai-dynamo/dynamo PR: 3067
File: lib/llm/src/preprocessor/prompt/template/oai.rs:87-134
Timestamp: 2025-09-16T19:47:30.312Z
Learning: In Dynamo, multimodal requests (containing image_url or other non-text content) are processed through a completely different workflow than text-only requests, so the may_be_fix_msg_content function in lib/llm/src/preprocessor/prompt/template/oai.rs will only encounter text-only content arrays.

📚 Learning: 2026-01-03T01:29:46.547Z

Learnt from: tanmayv25
Repo: ai-dynamo/dynamo PR: 5143
File: examples/multimodal/components/processor.py:207-214
Timestamp: 2026-01-03T01:29:46.547Z
Learning: In examples/multimodal/components/processor.py, the processor currently supports only a single text field per request. Do not add support for multiple text fields; this is out of scope for the current implementation. When reviewing or modifying this file, ensure changes do not introduce multi-text-field handling and, if relevant, update tests and documentation to reflect the single-text-field limitation.

Applied to files:

examples/multimodal/components/processor.py

📚 Learning: 2026-01-03T01:27:08.763Z

Learnt from: tanmayv25
Repo: ai-dynamo/dynamo PR: 5143
File: examples/multimodal/components/processor.py:207-215
Timestamp: 2026-01-03T01:27:08.763Z
Learning: In examples/multimodal/components/processor.py, ensure the Dynamo multimodal processor supports flexible content ordering in multimodal requests (audio, image, or video content before text), reflecting the OpenAI multimodal API specification where content items can appear in any order. Do not enforce a text-first ordering as mandatory. Implement and verify logic that checks for required content types regardless of position, and update tests to cover all permutations of content ordering.

Applied to files:

examples/multimodal/components/processor.py

📚 Learning: 2025-09-16T19:47:30.312Z

Learnt from: KrishnanPrash
Repo: ai-dynamo/dynamo PR: 3067
File: lib/llm/src/preprocessor/prompt/template/oai.rs:87-134
Timestamp: 2025-09-16T19:47:30.312Z
Learning: In Dynamo, multimodal requests (containing image_url or other non-text content) are processed through a completely different workflow than text-only requests, so the may_be_fix_msg_content function in lib/llm/src/preprocessor/prompt/template/oai.rs will only encounter text-only content arrays.

Applied to files:

examples/multimodal/components/processor.py

🧬 Code graph analysis (1)

examples/multimodal/components/processor.py (2)

lib/llm/src/preprocessor/prompt.rs (1)

messages (54-54)

lib/llm/src/tokenizers.rs (1)

text (348-353)

🪛 Ruff (0.14.10)

examples/multimodal/components/processor.py

227-227: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)

GitHub Check: trtllm (amd64)
GitHub Check: vllm (amd64)
GitHub Check: sglang (arm64)
GitHub Check: trtllm (arm64)
GitHub Check: sglang (amd64)
GitHub Check: vllm (arm64)
GitHub Check: Build and Test - dynamo

examples/multimodal/components/processor.py

…rompt_DIS-1224

tanmayv25 · 2026-01-06T22:45:40Z

Can you include some test cases to cover this aspect?

…ectly

…rompt_DIS-1224

jh-nv · 2026-01-10T04:22:08Z

Can you include some test cases to cover this aspect?

Added test in the dynamo part.

…rompt_DIS-1224

…#5196) Signed-off-by: Shang Wang <shangw@nvidia.com>

jh-nv added 2 commits January 5, 2026 18:41

fix: Properly handle multiple text parts in multi-modal request

b669e56

Merge branch 'main' of github.com:ai-dynamo/dynamo into jihao/audio_p…

64f8182

…rompt_DIS-1224

jh-nv requested review from a team as code owners January 5, 2026 23:47

pull-request-size bot added the size/M label Jan 5, 2026

github-actions bot added the fix label Jan 5, 2026

coderabbitai bot reviewed Jan 5, 2026

View reviewed changes

examples/multimodal/components/processor.py Outdated Show resolved Hide resolved

jh-nv added 2 commits January 6, 2026 13:09

check for empty and None string

ba7a8f7

Merge branch 'main' of github.com:ai-dynamo/dynamo into jihao/audio_p…

573fc7d

…rompt_DIS-1224

copy-pr-bot bot had a problem deploying to GITLAB January 6, 2026 18:10 Failure

rmccorm4 requested review from KrishnanPrash and tanmayv25 January 7, 2026 17:22

jh-nv added 3 commits January 9, 2026 18:33

Update the dynamo vllm impelmentation to handle user text prompt corr…

3a7b619

…ectly

Merge branch 'main' of github.com:ai-dynamo/dynamo into jihao/audio_p…

f1e889e

…rompt_DIS-1224

fix

b7a38a1

pull-request-size bot added size/L and removed size/M labels Jan 9, 2026

copy-pr-bot bot temporarily deployed to GITLAB January 9, 2026 23:50 Inactive

github-actions bot added backend::vllm Relates to the vllm backend multimodal labels Jan 9, 2026

copy-pr-bot bot temporarily deployed to GITLAB January 9, 2026 23:55 Inactive

jh-nv requested a review from ayushag-nv January 10, 2026 04:24

Merge branch 'main' of github.com:ai-dynamo/dynamo into jihao/audio_p…

45882ce

…rompt_DIS-1224

copy-pr-bot bot temporarily deployed to GITLAB January 13, 2026 17:38 Inactive

copy-pr-bot bot temporarily deployed to GITLAB January 13, 2026 17:45 Inactive

tanmayv25 approved these changes Jan 14, 2026

View reviewed changes

jh-nv merged commit e22bb03 into main Jan 14, 2026
30 of 32 checks passed

jh-nv deleted the jihao/audio_prompt_DIS-1224 branch January 14, 2026 16:13

wangshangsam pushed a commit to CentML/dynamo that referenced this pull request Jan 15, 2026

fix: Properly handle multiple text components from request (ai-dynamo…

1c450ab

…#5196) Signed-off-by: Shang Wang <shangw@nvidia.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Properly handle multiple text components from request#5196

fix: Properly handle multiple text components from request#5196
jh-nv merged 8 commits intomainfrom
jihao/audio_prompt_DIS-1224

jh-nv commented Jan 5, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 5, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

tanmayv25 commented Jan 6, 2026 •

edited

Loading

Uh oh!

jh-nv commented Jan 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jh-nv commented Jan 5, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 5, 2026

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tanmayv25 commented Jan 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jh-nv commented Jan 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jh-nv commented Jan 5, 2026 •

edited by coderabbitai bot

Loading

tanmayv25 commented Jan 6, 2026 •

edited

Loading