Fixing error 400 on Kimi K2.5 in thinking mode#5544

Closed
Neonsy wants to merge 24 commits into Kilo-Org:main from Neonsy:fix/kimi-k2-5_thinking-fix

Conversation


@Neonsy Neonsy commented Jan 30, 2026

Context

Fixes the "400 thinking is enabled but reasoning_content is missing in assistant tool call message" error when using Kimi K2.5 models (via Moonshot provider) with thinking enabled during multi-step tool calling.

Error Encountered:

Date/time: 2026-01-30T07:23:12.688Z
Extension version: 5.2.2
Provider: moonshot
Model: kimi-k2.5

400
OpenAI completion error: 400 thinking is enabled but reasoning_content is missing in assistant tool call message at index 2

According to the Moonshot documentation, when using Kimi K2.5 with thinking enabled:

  • The API returns reasoning_content as a top-level field in assistant messages
  • During multi-step tool calling, this reasoning_content must be preserved in the context
  • If reasoning_content is missing from assistant messages with tool_calls, the API returns a 400 error

This fix ensures that reasoning blocks stored in the internal message format are properly extracted and added as reasoning_content at the message level when converting back to OpenAI format for API requests.

Closes #3996

Implementation

The Problem

The message conversion flow was losing reasoning_content:

  1. openai.ts receives reasoning_content in the delta and yields it as a reasoning chunk
  2. Task.ts stores reasoning as {type: "reasoning", text: "..."} blocks in the message content array
  3. convertToOpenAiMessages() converts messages back to OpenAI format but did NOT extract reasoning blocks and map them to reasoning_content
  4. When converted messages were sent back to the API, reasoning_content was missing → 400 error

The Fix

Modified src/api/transform/openai-format.ts to:

  1. Extract reasoning blocks from content array: Filter for blocks with type: "reasoning" or type: "thinking" from the message content
  2. Concatenate and add as reasoning_content: Join multiple reasoning blocks with newlines and add as a top-level message field (not inside content array)
  3. Handle message-level reasoning fields: Also check for reasoning_content or reasoning already present at the message level (for re-conversion scenarios)
  4. Preserve existing functionality: reasoning_details handling for Gemini 3 / xAI remains unchanged and works alongside the new reasoning_content extraction

Key Code Changes

// Extract reasoning blocks from content for Kimi K2.5 / DeepSeek compatibility
// Kimi K2.5 requires reasoning_content at the message level, not in content array
const reasoningBlocks = (messageWithDetails.content || []).filter(
    (block: any) => block.type === "reasoning" || block.type === "thinking"
)
const reasoningContent = reasoningBlocks
    .map((block: any) => block.text || block.thinking || "")
    .filter(Boolean)
    .join("\n")

// Add reasoning_content for Kimi K2.5 / DeepSeek compatibility
// This must be at the message level, not inside content array
if (reasoningContent) {
    baseMessage.reasoning_content = reasoningContent
}

Tradeoffs

  • Property ordering: Added reasoning_content before reasoning_details and tool_calls to preserve the order expected by providers. Property order matters when sending messages back to some APIs.
  • Backward compatibility: The change is additive: existing reasoning_details handling is preserved, and reasoning_content is only added when reasoning blocks exist.

Screenshots

Before: 400 error: thinking is enabled but reasoning_content is missing in assistant tool call message at index 2
After: Multi-step tool calling works correctly with Kimi K2.5

How to Test

  1. Configure Kimi K2.5 model via Moonshot provider:

    • Set API Provider to "Moonshot"
    • Select "kimi-k2.5" model
    • Enable thinking mode
  2. Start a task that requires multi-step tool calling:

    • Ask: "Read the package.json file and then list all dependencies"
    • The assistant should make a read_file tool call
  3. Verify the conversation continues after the tool result:

    • Without this fix: The second API request would fail with 400 error
    • With this fix: The assistant receives the tool result and continues the conversation
  4. Check that reasoning is preserved:

    • The assistant's reasoning from the first response should be visible in the conversation history

Test Coverage

Added comprehensive tests in openai-format.spec.ts covering:

  • Extraction of reasoning blocks as reasoning_content
  • Extraction of thinking blocks (alternative format)
  • Concatenation of multiple reasoning blocks
  • Assistant messages with only reasoning (no tool_calls)
  • Assistant messages with no reasoning
  • Message-level reasoning_content preservation
  • Coexistence of reasoning_content and reasoning_details
  • Filtering of empty reasoning text

Affected Providers and Models

Directly Affected (require reasoning_content at message level)

| Provider | Models | Impact |
| --- | --- | --- |
| Moonshot | kimi-k2.5, kimi-k2.5-thinking | Critical: these models require reasoning_content on all assistant messages with thinking enabled |
| DeepSeek | deepseek-reasoner, deepseek-r1 | High: similar requirement for reasoning_content during multi-step tool calling |

Indirectly Benefited (reasoning preserved correctly)

| Provider | Models | Impact |
| --- | --- | --- |
| OpenAI | o1, o3, o4 series | Medium: reasoning is now properly extracted and preserved in conversation history |
| OpenRouter | any reasoning models | Medium: consistent reasoning handling across providers |
| xAI | grok-3 | Low: uses reasoning_details format, but benefits from consistent handling |
| Gemini | gemini-2.5-pro | Low: uses reasoning_details format, unchanged but compatible |

Not Affected

| Provider | Models | Reason |
| --- | --- | --- |
| Anthropic | all Claude models | Uses native thinking blocks within the content array |
| Bedrock | all models | Uses Anthropic-style message format |


changeset-bot bot commented Jan 30, 2026

🦋 Changeset detected

Latest commit: 4bf6ca0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

| Name | Type |
| --- | --- |
| kilo-code | Patch |


@Neonsy Neonsy mentioned this pull request Jan 30, 2026
Fixing an error related to Kimi K2.5's behavior.
@Neonsy Neonsy changed the title from "Fixing error on Kimi K2.5 thinking" to "Fixing error 400 on Kimi K2.5 in thinking mode" Jan 30, 2026

Neonsy commented Jan 30, 2026

I'm on it to improve type-safety


Neonsy commented Jan 30, 2026

I'm running tests and verifying the type-safety changes


Neonsy commented Jan 30, 2026

I'm done on my end with 52ad122

Besides one chore nitpick, which will hopefully be the last thing 😁 because I overlooked something


Neonsy commented Jan 30, 2026

This is the final commit from my end now (I hope) 😄


Neonsy commented Jan 30, 2026

@kevinvandijk Do you know when the person you requested as a reviewer will be available, or if there is someone else?

I don't wanna rush things, but this PR is blocking everyone from using the Moonshot provider with thinking enabled (at least K2.5, if not others).

@CaiDingxian

> @kevinvandijk Do you know when the person you have requested as a reviewer will be available, or if there is someone else.
>
> I don't wanna rush things, but this PR is blocking everyone from using the Moonshot provider with thinking enabled (At least K2.5, if not others)

The official merge cycle seems to average a month...
Thank you so much for your PR. I'll merge it directly into my own code and let my team and friends use it.


Neonsy commented Jan 30, 2026

@CaiDingxian Maybe for releases (not even that, since they release regularly), but merging can be fast, especially if you bring attention to it on the Discord server.

That's how your PR with the endpoint got merged: I discovered it and asked about it.


mikij commented Jan 31, 2026

My first usage of the latest main with this fix is positive. It seems the LLM got smarter with thinking mode enabled properly. I will continue to use it and update here only if I find any issues.


Neonsy commented Jan 31, 2026

Tried the following command:

pnpm install:vsix

efe9cef fixed the JetBrains part, which was sadly not Windows compatible 😭


Neonsy commented Feb 2, 2026

Looks like main broke my stuff

042d403 should fix that (hopefully)


Neonsy commented Feb 2, 2026

Why did those runs cancel ❓


Neonsy commented Feb 4, 2026

What happened on 0fd94d9:

pnpm install ran postinstall for @vscode/ripgrep

That script tries to download the ripgrep binary from GitHub Releases:
https://api.github.com/repos/microsoft/ripgrep-prebuilt/releases/tags/v15.0.0

It got HTTP 403 on every attempt, so the postinstall failed and the whole install exited.
This is not a dependency resolution issue. It’s a network/auth/rate‑limit issue to GitHub’s API during the ripgrep download step.


Neonsy commented Feb 5, 2026

> The official merger cycle seems to average a month... Thank you so much for your PR, I'll merge it directly into my own code and let my team and friends use it

Maybe you are right. I lost all hope.

@CaiDingxian

@Neonsy Neonsy requested review from mikij and xiaoxiangmoe February 5, 2026 14:23
@kevinvandijk
Collaborator

Hi @Neonsy,

We were not convinced this would not break other providers, which is why we were checking for other solutions, but we should have commented that here; I'm not sure why that didn't happen. Apologies for that! Because we also rely on upstream, and the Moonshot provider there was converted to use the Vercel AI SDK, we pulled in #5662. This should actually also fix the reasoning on Kimi 2.5. I have not seen any 400s anymore since this; can you let me know if that fixed it for you too? If not, I'll reconsider this PR with priority. Feel free to DM me about it on Discord in that case as well!


smetanokr commented Feb 5, 2026


Hello @kevinvandijk,

I have been heavily using Neonsy's VSIX with his fix for the past couple of days with the Kimi Coding Plan as a solution.

Since you released 5.3.3 four hours ago, I have switched to it now, and I will spend the next 3-4 hours using it.

I will let you know at the end of my working day (2-3 more hours) if I have stumbled upon any issues.


Neonsy commented Feb 5, 2026

@kevinvandijk If it works, then I understand that it is safer to use what comes from upstream.

I'm currently no longer using the Kilo extension, but @smetanokr has already replied.

I accept whichever decision is considered best for the codebase.

smetanokr commented Feb 5, 2026

Unfortunately, not working.


@smetanokr

Same with 5.4.0: not working.

Date/time: 2026-02-05T21:06:59.435Z
Extension version: 5.4.0
Provider: moonshot
Model: kimi-k2.5

MODEL_NO_TOOLS_USED



Neonsy commented Feb 5, 2026

Let's see if updating this branch will break my fix due to upstream stuff


Neonsy commented Feb 5, 2026

> We were not convinced this would not break other providers

@kevinvandijk Could you please elaborate on why this conclusion was drawn? Did any testing reveal that? Does the code look like it would break other providers?

I would like to understand if there is evidence.

smetanokr commented Feb 5, 2026

@kevinvandijk I can share a Kimi Code API key with you if it will help with the investigation, to find out why cherry-pick-kimi-sdk is not a success and Kimi K2.5 is still unable to do tool calls.


Neonsy commented Feb 5, 2026

@smetanokr Well looks like main broke my fix

I can no longer see a thinking checkbox and Kimi is looping hard. Therefore I will no longer tend to this PR.


Neonsy commented Feb 5, 2026

Maybe this helps (hope you guys find a way, as I don't wanna spend more time trying to fix what main breaks... It's not fun):

AI SDK Warning (moonshot.chat / kimi-k2.5): The feature "specificationVersion" is used in a compatibility mode. Using v2 specification compatibility mode. Some features may not be available.
extensionHostProcess.js:219
[DebugTelemetry] LLM Completion {appName: 'kilo-code', appVersion: '5.4.0', vscodeVersion: '1.109.0', platform: 'win32', editorName: 'Visual Studio Code', …}
extensionHostProcess.js:219
[DebugTelemetry] Conversation Message {appName: 'kilo-code', appVersion: '5.4.0', vscodeVersion: '1.109.0', platform: 'win32', editorName: 'Visual Studio Code', …}
extensionHostProcess.js:219
[DebugTelemetry] Conversation Message {appName: 'kilo-code', appVersion: '5.4.0', vscodeVersion: '1.109.0', platform: 'win32', editorName: 'Visual Studio Code', …}
extensionHostProcess.js:219
Session synced: d0fd4864-1127-4241-9983-0eab4d7b5071
extensionHostProcess.js:219
Session title generated: d0fd4864-1127-4241-9983-0eab4d7b5071 - I want 3 subtasks 1, overview of github workflows 2. overview of project main process stuff 3. overview of renderer related stuff in project
extensionHostProcess.js:219
AI SDK Warning (moonshot.chat / kimi-k2.5): The feature "specificationVersion" is used in a compatibility mode. Using v2 specification compatibility mode. Some features may not be available.
extensionHostProcess.js:219
Session synced: d0fd4864-1127-4241-9983-0eab4d7b5071

@Neonsy Neonsy closed this Feb 5, 2026
@CaiDingxian

@Neonsy
I have merged your PR and it has been working properly for several days. As for CI, it means "continuous" integration. It's about making small steps forward. Therefore, I suggest completing the smallest work loop first rather than a perfect PR.


Neonsy commented Feb 6, 2026

> @Neonsy
> I have merged your PR and it has been working properly for several days. As for CI, it means "continuous" integration. It's about making small steps forward. Therefore, I suggest completing the smallest work loop first rather than a perfect PR.

As said, the upstream bit wrecked my fix, and I don't wanna have to spend time on something that comes from somewhere else, as it has been a decision from the team. Therefore the chance of my fix being applied is slim.

I don't wanna work on something that is most likely not going to be merged.

I've force-pushed backwards to remove what broke the fix, so my branch should still work.

@smetanokr

Created a new issue for it: #5719, as this one seems to be closed.


Neonsy commented Feb 6, 2026

#5722



Development

Successfully merging this pull request may close these issues.

Compatibility Issue: Lack of support for reasoning_content in historical messages hinders advanced models (e.g., K2-Thinking)

7 participants