
Conversation

@qandrew qandrew commented Nov 25, 2025

Purpose

  • Parser manages tool parsing (non-streaming).
  • Can run an MCP loop (a minimal sketch of such a loop follows below).
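
A minimal sketch of the tool-call loop this enables, for illustration only: run_mcp_loop, ModelTurn, ToolCall, generate, and call_tool are hypothetical stand-ins, not the actual vLLM parser/context/tool-server APIs touched by this PR.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ToolCall:
    name: str        # e.g. "code_interpreter"
    arguments: str   # JSON-encoded arguments extracted by the (non-streaming) parser

@dataclass
class ModelTurn:
    tool_calls: list[ToolCall]   # empty once the model produces a final answer
    final_text: Optional[str]

def run_mcp_loop(
    prompt: str,
    generate: Callable[[list[dict]], ModelTurn],  # one non-streaming model call + parse
    call_tool: Callable[[ToolCall], str],         # executes an MCP tool, returns its output
    max_turns: int = 8,
) -> str:
    """Alternate model turns and MCP tool executions until a final answer appears."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):
        turn = generate(messages)
        if not turn.tool_calls:
            return turn.final_text or ""
        for call in turn.tool_calls:
            result = call_tool(call)
            # Feed the tool output back so the next turn can reason over it.
            messages.append({"role": "tool", "name": call.name, "content": result})
    raise RuntimeError("MCP loop did not terminate within max_turns")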

Test Plan

VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS=web_search_preview,container,code_interpreter VLLM_USE_EXPERIMENTAL_PARSER_CONTEXT=1 vllm serve MiniMaxAI/MiniMax-M2 \
  --tensor-parallel-size 4 \
  --tool-call-parser minimax_m2 \
  --reasoning-parser minimax_m2 \
  --enable-auto-tool-choice --trust-remote-code \
  --tool-server=localhost:8081/container,localhost:8081/browser,localhost:8081/python

Function tool

curl http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "MiniMaxAI/MiniMax-M2",
    "input": [
      {
        "role": "user",
        "content": "What is the weather in Paris in Celsius today?"
      }
    ],
    "tools": [{
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location", "unit"]
        }
    }],
    "temperature": 0.7,
    "max_output_tokens": 2000
  }'
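
The same function-tool request can also be issued with the OpenAI Python client pointed at the vLLM server. This is a sketch under the assumption that the server is OpenAI-compatible, as the curl call above implies; the base URL and dummy API key are placeholders.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy-api-key")

response = client.responses.create(
    model="MiniMaxAI/MiniMax-M2",
    input=[{"role": "user", "content": "What is the weather in Paris in Celsius today?"}],
    tools=[{
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location", "unit"],
        },
    }],
    temperature=0.7,
    max_output_tokens=2000,
)

# Print any function_call items the parser extracted from the model output.
for item in response.output:
    if item.type == "function_call":
        print(item.name, item.arguments)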

MCP tool

curl -X POST "http://localhost:8000/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dummy-api-key" \
  -d '{
        "model": "MiniMaxAI/MiniMax-M2",
        "input": "Multiply 64548*15151 using the python tool.",
        "tools": [
          {
            "type": "mcp",
            "server_label": "code_interpreter",
            "headers": {"test": "test"},
            "server_url": "IGNORED"
          }
        ]
      }'
Response:

{
    "id": "resp_175d2aab96cc4d519a732db73370a42f",
    "created_at": 1764055537,
    "incomplete_details": null,
    "instructions": null,
    "metadata": null,
    "model": "MiniMaxAI/MiniMax-M2",
    "object": "response",
    "output": [
        {
            "id": "temp",
            "summary": [],
            "type": "reasoning",
            "content": [
                {
                    "text": "Okay, the user wants me to multiply two numbers, 64548 and 15151, using a Python tool. Looking at my available tools, I see I have a `code_interpreter` which allows me to execute Python code. This is perfect for performing the multiplication. I'll use the `code_interpreter` tool with the multiplication expression.\n",
                    "type": "reasoning_text"
                }
            ],
            "encrypted_content": null,
            "status": null
        },
        {
            "id": "lol",
            "arguments": "{\"code\": \"64548 * 15151\"}",
            "name": "code_interpreter",
            "server_label": "code_interpreter",
            "type": "mcp_call",
            "approval_request_id": null,
            "error": null,
            "output": "977966748\n",
            "status": "completed"
        },
        {
            "id": "temp",
            "summary": [],
            "type": "reasoning",
            "content": [
                {
                    "text": "Okay, I see that I've just used the code_interpreter tool to calculate the product of 64548 and 15151. Let me look at the result that was returned.\n\nThe tool executed the multiplication and returned the value 977966748. This is the mathematical result of 64548 × 15151, which is what the user requested.\n\nLet me double-check this calculation in my head... Actually, that's a pretty large multiplication to do mentally, so I'll trust the tool's calculation. Code interpreters are generally reliable for basic arithmetic operations like this.\n\nThe result looks like a 9-digit number, which makes sense since we're multiplying two 5-digit numbers. The product should be somewhere in the range of 10^9 to 10^10, and 977966748 fits within that range, so that seems reasonable.\n\nI should present this result clearly to the user. Since they specifically asked for the multiplication using a Python tool, I should mention that the calculation was done using the code_interpreter tool, and then provide the exact numerical result.\n\nI don't think I need to show the full calculation process since the result is straightforward. The user just wanted the product, not a step-by-step explanation. If they had asked for how to do the multiplication in Python or if there was any complexity to the calculation, I might have shown the code as well.\n\nSo my response will be simple: just state the result of 64548 × 15151 = 977966748.\n",
                    "type": "reasoning_text"
                }
            ],
            "encrypted_content": null,
            "status": null
        },
        {
            "id": "lol",
            "content": [
                {
                    "annotations": [],
                    "text": "\n\nThe result of multiplying 64548 by 15151 is 977966748.",
                    "type": "output_text",
                    "logprobs": null
                }
            ],
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": true,
    "temperature": 1.0,
    "tool_choice": "auto",
    "tools": [
        {
            "server_label": "code_interpreter",
            "type": "mcp",
            "allowed_tools": null,
            "authorization": null,
            "connector_id": null,
            "headers": {
                "test": "test"
            },
            "require_approval": null,
            "server_description": null,
            "server_url": "IGNORED"
        }
    ],
    "top_p": 0.95,
    "background": false,
    "max_output_tokens": 196281,
    "max_tool_calls": null,
    "previous_response_id": null,
    "prompt": null,
    "reasoning": null,
    "service_tier": "auto",
    "status": "completed",
    "text": null,
    "top_logprobs": null,
    "truncation": "disabled",
    "usage": {
        "input_tokens": 327,
        "input_tokens_details": {
            "cached_tokens": 256,
            "input_tokens_per_turn": [],
            "cached_tokens_per_turn": []
        },
        "output_tokens": 422,
        "output_tokens_details": {
            "reasoning_tokens": 0,
            "tool_output_tokens": 0,
            "output_tokens_per_turn": [],
            "tool_output_tokens_per_turn": []
        },
        "total_tokens": 749
    },
    "user": null,
    "input_messages": null,
    "output_messages": null
}
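
For completeness, the same MCP request via the OpenAI Python client, plus a small walk over the output items shown above (reasoning, mcp_call, message). A sketch only; it assumes the openai package and reuses the dummy API key from the curl above.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy-api-key")

response = client.responses.create(
    model="MiniMaxAI/MiniMax-M2",
    input="Multiply 64548*15151 using the python tool.",
    tools=[{
        "type": "mcp",
        "server_label": "code_interpreter",
        "headers": {"test": "test"},
        "server_url": "IGNORED",
    }],
)

# The output mixes reasoning items, the executed mcp_call, and the final message.
for item in response.output:
    if item.type == "mcp_call":
        print("tool call:", item.name, item.arguments, "->", item.output)
    elif item.type == "message":
        print("answer:", item.content[0].text)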

@qandrew qandrew force-pushed the mcp-1 branch 3 times, most recently from c7bb24b to 5b284ed on November 25, 2025 07:27
@qandrew qandrew changed the base branch from main to refactor-responses-utils-1 on November 25, 2025 07:30
@qandrew qandrew force-pushed the mcp-1 branch 2 times, most recently from 7d18427 to 5988075 on November 25, 2025 07:52
@qandrew qandrew changed the base branch from refactor-responses-utils-1 to refactor-responses-utils-2 on November 25, 2025 14:14
@qandrew qandrew force-pushed the mcp-1 branch 3 times, most recently from 1bf7843 to 3493686 on November 25, 2025 14:43

Commits (each signed off by Andrew Xia <[email protected]>):

  • round trip works
  • add env var
  • generic eos
  • WIP track sentences in parser
  • dont use sentence types
  • parser can get the function tool call initial commit
  • implement call tool for MCP
  • working mcp
  • kimi k2 fixes
  • merge fixes
  • lol