Conversation

@roomote roomote bot commented Oct 23, 2025

Description

This PR fixes issue #8787 where garbled characters appear when analyzing large files (1000+ lines) with vLLM models through the OpenAI-compatible API.

Problem

When streaming large outputs from vLLM models, multi-byte UTF-8 characters can be split across chunk boundaries. If each chunk is decoded independently, the partial byte sequence at the boundary decodes to replacement characters, producing garbled output. This particularly affects users working with non-ASCII content or large codebases.

Solution

Implemented a UTF8StreamDecoder utility (sketched after the list below) that:

  • Properly buffers incomplete UTF-8 sequences at chunk boundaries
  • Only decodes complete characters, holding partial sequences for the next chunk
  • Handles 1-4 byte UTF-8 sequences (including emojis)
  • Integrates seamlessly with existing OpenAI and OpenAI-compatible providers
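
Only fragments of the implementation are visible in the review thread later on (decoder.decode(validBytes, { stream: true }), a this.buffer field, and a combineBuffers helper). A minimal sketch of the buffering approach, assuming those pieces and using findCompleteLength as an illustrative helper name, could look like:

class UTF8StreamDecoder {
  private decoder = new TextDecoder("utf-8")
  private buffer = new Uint8Array(0)

  decode(chunk: Uint8Array): string {
    const bytes = this.combineBuffers(this.buffer, chunk)
    const lastValid = this.findCompleteLength(bytes)
    // Decode only the complete characters; stream mode keeps the decoder's
    // internal state consistent across calls.
    const decoded = this.decoder.decode(bytes.subarray(0, lastValid), { stream: true })
    // Buffer the incomplete portion for the next chunk
    this.buffer = bytes.slice(lastValid)
    return decoded
  }

  // UTF-8 sequences are 1-4 bytes, so at most the last 3 bytes can belong
  // to an unfinished sequence. Walk back to the nearest lead byte and check
  // whether its sequence is complete.
  private findCompleteLength(bytes: Uint8Array): number {
    const end = bytes.length
    for (let back = 1; back <= 3 && end - back >= 0; back++) {
      const b = bytes[end - back]
      if ((b & 0xc0) !== 0x80) {
        // Not a continuation byte: a lead byte (or plain ASCII).
        let seqLen = 1
        if ((b & 0xe0) === 0xc0) seqLen = 2
        else if ((b & 0xf0) === 0xe0) seqLen = 3
        else if ((b & 0xf8) === 0xf0) seqLen = 4
        return back < seqLen ? end - back : end
      }
    }
    return end
  }

  private combineBuffers(a: Uint8Array, b: Uint8Array): Uint8Array {
    const combined = new Uint8Array(a.length + b.length)
    combined.set(a, 0)
    combined.set(b, a.length)
    return combined
  }
}

For ASCII-only content, the final byte is never a continuation byte, so the scan exits on its first iteration and nothing is buffered, which matches the no-overhead claim above.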

Changes

  • Created UTF8StreamDecoder utility class for proper UTF-8 boundary handling
  • Integrated decoder into OpenAiHandler and BaseOpenAiCompatibleProvider
  • Added comprehensive test suite covering various UTF-8 edge cases
  • Ensures no performance impact for ASCII-only content
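
How the decoder is attached to OpenAiHandler and BaseOpenAiCompatibleProvider is not visible in this thread. Purely as a hypothetical illustration, a provider could pipe a raw byte stream through it like so (decodeStream and the AsyncIterable<Uint8Array> shape are assumptions, not the PR's actual API):

async function* decodeStream(
  chunks: AsyncIterable<Uint8Array>,
  decoder: UTF8StreamDecoder = new UTF8StreamDecoder(),
): AsyncGenerator<string> {
  for await (const chunk of chunks) {
    const text = decoder.decode(chunk)
    // An empty string means the entire chunk was buffered (possible only
    // for very small chunks that end mid-character).
    if (text.length > 0) {
      yield text
    }
  }
}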

Testing

  • ✅ All existing tests pass
  • ✅ New test suite with 16 test cases covering:
    • Multi-byte character splitting
    • 4-byte emoji handling
    • Large file simulation
    • Edge cases and error handling
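
The 16 cases are not reproduced in this thread; a representative split-character test could look like the following (vitest-style APIs and the import path are assumptions about the project's test setup):

import { describe, expect, it } from "vitest"
// Import path is illustrative; the PR names the file utf8-stream-decoder.ts
import { UTF8StreamDecoder } from "../utf8-stream-decoder"

describe("UTF8StreamDecoder", () => {
  it("reassembles a 3-byte character split across chunks", () => {
    const decoder = new UTF8StreamDecoder()
    const bytes = new TextEncoder().encode("中") // 0xE4 0xB8 0xAD
    const first = decoder.decode(bytes.subarray(0, 1)) // lead byte only: buffered
    const second = decoder.decode(bytes.subarray(1)) // completes the sequence
    expect(first).toBe("")
    expect(second).toBe("中")
  })
})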

Impact

This fix ensures reliable output when using vLLM or other OpenAI-compatible models with large file analysis, improving the experience for users working with:

  • Large codebases
  • Non-ASCII content (Chinese, Japanese, emojis, etc.)
  • High-volume streaming responses

Fixes #8787


Important

Fixes UTF-8 boundary issues in vLLM model streaming by introducing UTF8StreamDecoder for proper character decoding.

  • Behavior:
    • Introduces UTF8StreamDecoder to handle UTF-8 boundary issues in streaming responses.
    • Integrates UTF8StreamDecoder into BaseOpenAiCompatibleProvider and OpenAiHandler to decode UTF-8 content properly.
    • Ensures no performance impact for ASCII-only content.
  • Testing:
    • Adds utf8-stream-decoder.test.ts with 16 test cases for UTF-8 edge cases, including multi-byte character splitting and emoji handling.
  • Files:
    • base-openai-compatible-provider.ts: Integrates UTF8StreamDecoder for streaming responses.
    • openai.ts: Integrates UTF8StreamDecoder for handling large outputs.
    • utf8-stream-decoder.ts: Implements the UTF8StreamDecoder class.

This description was created by Ellipsis for 92dd39b.

- Add UTF8StreamDecoder utility to properly handle multi-byte UTF-8 characters split across chunk boundaries
- Integrate decoder into OpenAI and BaseOpenAiCompatibleProvider to fix garbled output with large files
- Add comprehensive tests for UTF-8 boundary handling
- Fixes issue #8787 where vLLM outputs showed garbled characters with large file outputs
@roomote roomote bot requested review from cte, jr and mrubens as code owners October 23, 2025 11:54
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working labels Oct 23, 2025
@roomote roomote bot commented Oct 23, 2025

Review Summary

I've reviewed this PR that fixes UTF-8 boundary issues in streaming responses for vLLM models. The implementation is well-structured and correctly handles the core use case of buffering incomplete UTF-8 sequences from byte chunks.

Issues Found

  • Buffer handling in handleStringChunk: The method overwrites the existing buffer instead of combining it with previously buffered content when processing string chunks. While mixing chunk types is unlikely in practice, this should use combineBuffers for correctness.

Overall Assessment

The core UTF-8 decoding logic is sound and the comprehensive test suite validates the main use cases. The flagged issue is a minor robustness concern that should be addressed.


An inline comment was left on the following excerpt from the decoder:

const decoded = this.decoder.decode(validBytes, { stream: true })

// Buffer the incomplete portion for the next chunk
this.buffer = bytes.slice(lastValid)

The buffer is overwritten here without combining with existing content from previous chunks. If a Uint8Array chunk leaves partial bytes in the buffer, and the next chunk is a string, those buffered bytes will be lost. While mixing chunk types is unlikely in practice (OpenAI SDK typically returns consistent types), this should use this.buffer = this.combineBuffers(this.buffer, bytes.slice(lastValid)) for correctness.

Suggested change:

    // Buffer the incomplete portion for the next chunk
  - this.buffer = bytes.slice(lastValid)
  + this.buffer = this.combineBuffers(this.buffer, bytes.slice(lastValid))
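
To make the flagged scenario concrete, a hedged illustration follows (handleStringChunk's body is not shown in this thread, so its exact behavior is inferred from the comment above):

const decoder = new UTF8StreamDecoder()
const bytes = new TextEncoder().encode("中") // 0xE4 0xB8 0xAD
decoder.decode(bytes.subarray(0, 2)) // 0xE4 0xB8 end up in this.buffer
// If the next chunk arrives as a string and handleStringChunk overwrites
// this.buffer rather than combining, those two bytes are dropped and the
// character never reaches the output.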

@roomote roomote bot left a comment

Review complete. One minor robustness issue has been flagged regarding buffer handling in handleStringChunk.

@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Oct 23, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Oct 23, 2025
Labels

  • bug: Something isn't working
  • Issue/PR - Triage: New issue. Needs quick review to confirm validity and assign labels.
  • size:L: This PR changes 100-499 lines, ignoring generated files.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues:

  • [BUG] When the model outputs slightly more content or the input file is large (1000 lines of code), garbled output may occur