fix: handle UTF-8 boundary issues in streaming responses for vLLM models #8788
Conversation
- Add UTF8StreamDecoder utility to properly handle multi-byte UTF-8 characters split across chunk boundaries
- Integrate decoder into OpenAI and BaseOpenAiCompatibleProvider to fix garbled output with large files
- Add comprehensive tests for UTF-8 boundary handling
- Fixes issue #8787 where vLLM outputs showed garbled characters with large file outputs
Review Summary

I've reviewed this PR, which fixes UTF-8 boundary issues in streaming responses for vLLM models. The implementation is well-structured and correctly handles the core use case of buffering incomplete UTF-8 sequences from byte chunks.

Issues Found

One issue was flagged; see the inline comment below.

Overall Assessment

The core UTF-8 decoding logic is sound and the comprehensive test suite validates the main use cases. The flagged issue is a minor robustness concern that should be addressed.
The inline comment below refers to this excerpt:

```ts
const decoded = this.decoder.decode(validBytes, { stream: true })
// …
// Buffer the incomplete portion for the next chunk
this.buffer = bytes.slice(lastValid)
```
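For context, `lastValid` marks the index just past the last complete UTF-8 sequence in the chunk. A boundary scan along these lines would produce it; this is a minimal sketch, assuming the function name and details rather than the PR's exact code:

```ts
// Sketch: return the index just past the last complete UTF-8 sequence in
// `bytes`. Names and details are assumptions, not the PR's exact code.
function findLastValidBoundary(bytes: Uint8Array): number {
	let i = bytes.length - 1
	// Walk back over trailing continuation bytes (10xxxxxx), at most 3 of
	// them: a UTF-8 sequence is at most 4 bytes long.
	let back = 0
	while (i >= 0 && back < 3 && (bytes[i] & 0xc0) === 0x80) {
		i--
		back++
	}
	if (i < 0) return 0 // chunk is nothing but continuation bytes: buffer it all
	const b = bytes[i]
	// Ends in ASCII: decode everything now (any stray continuation bytes after
	// it are malformed and will surface as U+FFFD).
	if (b < 0x80) return bytes.length
	// Lead byte: its top bits declare the sequence length.
	const len = b >= 0xf0 ? 4 : b >= 0xe0 ? 3 : b >= 0xc0 ? 2 : 1
	// Complete if the declared sequence fits inside the chunk; otherwise the
	// incomplete tail starts at the lead byte.
	return i + len <= bytes.length ? bytes.length : i
}
```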
The buffer is overwritten here without combining with existing content from previous chunks. If a `Uint8Array` chunk leaves partial bytes in the buffer and the next chunk is a string, those buffered bytes will be lost. While mixing chunk types is unlikely in practice (the OpenAI SDK typically returns consistent types), this should use `this.buffer = this.combineBuffers(this.buffer, bytes.slice(lastValid))` for correctness.
Suggested change:

```diff
- this.buffer = bytes.slice(lastValid)
+ // Buffer the incomplete portion for the next chunk
+ this.buffer = this.combineBuffers(this.buffer, bytes.slice(lastValid))
```
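For reference, the `combineBuffers` helper the suggestion relies on would amount to a simple concatenation. A minimal sketch (hypothetical; the helper's actual shape is not shown in this diff):

```ts
// Hypothetical helper: concatenate previously buffered bytes with the new tail.
function combineBuffers(a: Uint8Array, b: Uint8Array): Uint8Array {
	if (a.length === 0) return b
	const combined = new Uint8Array(a.length + b.length)
	combined.set(a, 0)
	combined.set(b, a.length)
	return combined
}
```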
Review complete. One minor robustness issue has been flagged regarding buffer handling in `handleStringChunk`.
Description
This PR fixes issue #8787 where garbled characters appear when analyzing large files (1000+ lines) with vLLM models through the OpenAI-compatible API.
Problem
When streaming large outputs from vLLM models, UTF-8 multi-byte characters can be split across chunk boundaries, resulting in garbled output. This particularly affects users working with non-ASCII content or large codebases.
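To see the failure mode, note that decoding each chunk independently turns a character that straddles a boundary into replacement characters, while a streaming decode does not. A minimal TypeScript reproduction:

```ts
// "中文" encodes to six bytes: E4 B8 AD (中) followed by E6 96 87 (文).
const bytes = new TextEncoder().encode("中文")
const chunk1 = bytes.slice(0, 4) // ends one byte into 文
const chunk2 = bytes.slice(4)

// Naive per-chunk decoding: the split character becomes U+FFFD garbage.
const naive = new TextDecoder().decode(chunk1) + new TextDecoder().decode(chunk2)
console.log(naive) // "中���"

// Streaming decode: the decoder buffers the incomplete tail between calls.
const decoder = new TextDecoder()
const streamed = decoder.decode(chunk1, { stream: true }) + decoder.decode(chunk2)
console.log(streamed) // "中文"
```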
Solution
Implemented a UTF8StreamDecoder utility that:

- Buffers incomplete multi-byte UTF-8 sequences at chunk boundaries instead of emitting them immediately
- Prepends the buffered bytes to the next chunk and decodes only complete characters
- Flushes any remaining bytes at the end of the stream

A sketch of this approach follows the list.
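Putting the pieces together, a minimal sketch of such a decoder, reusing the `findLastValidBoundary` and `combineBuffers` sketches above (the PR's actual class in `utf8-stream-decoder.ts` may differ in naming and details):

```ts
// Minimal sketch of a streaming UTF-8 decoder; not the PR's exact code.
class UTF8StreamDecoder {
	private decoder = new TextDecoder("utf-8")
	private buffer = new Uint8Array(0)

	decode(chunk: Uint8Array): string {
		// Prepend any bytes buffered from the previous chunk.
		const bytes = combineBuffers(this.buffer, chunk)
		const lastValid = findLastValidBoundary(bytes)
		// Decode only the complete portion; keep the incomplete tail for later.
		this.buffer = bytes.slice(lastValid)
		return this.decoder.decode(bytes.slice(0, lastValid), { stream: true })
	}

	flush(): string {
		// Drain whatever remains at end of stream (malformed tails become U+FFFD).
		const remaining = this.decoder.decode(this.buffer)
		this.buffer = new Uint8Array(0)
		return remaining
	}
}
```

Callers would invoke `flush()` once the stream ends so a truncated final character still surfaces rather than being silently dropped.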
Changes
- Added `UTF8StreamDecoder` utility class for proper UTF-8 boundary handling
- Integrated the decoder into `OpenAiHandler` and `BaseOpenAiCompatibleProvider`

Testing

- Added `utf8-stream-decoder.test.ts` with 16 test cases covering UTF-8 edge cases such as multi-byte character splitting and emoji handling
Impact
This fix ensures reliable output when using vLLM or other OpenAI-compatible models with large file analysis, improving the experience for users working with:

- Non-ASCII content, such as CJK text and emoji
- Large files and codebases (1000+ lines)
Fixes #8787
Important
Fixes UTF-8 boundary issues in vLLM model streaming by introducing `UTF8StreamDecoder` for proper character decoding.

- Adds `UTF8StreamDecoder` to handle UTF-8 boundary issues in streaming responses.
- Integrates `UTF8StreamDecoder` into `BaseOpenAiCompatibleProvider` and `OpenAiHandler` to decode UTF-8 content properly.
- Adds `utf8-stream-decoder.test.ts` with 16 test cases for UTF-8 edge cases, including multi-byte character splitting and emoji handling.
- `base-openai-compatible-provider.ts`: Integrates `UTF8StreamDecoder` for streaming responses.
- `openai.ts`: Integrates `UTF8StreamDecoder` for handling large outputs.
- `utf8-stream-decoder.ts`: Implements the `UTF8StreamDecoder` class.