
Conversation

@qandrew qandrew commented Nov 25, 2025

Purpose

  • Parser manages tool parsing (non-streaming).
  • Can run an MCP loop (a minimal sketch of such a loop follows below).
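
A minimal sketch of the tool-call loop this enables, for illustration only: run_mcp_loop, ModelTurn, ToolCall, generate, and call_tool are hypothetical stand-ins, not the actual vLLM parser/context/tool-server APIs touched by this PR.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ToolCall:
    name: str        # e.g. "code_interpreter"
    arguments: str   # JSON-encoded arguments extracted by the (non-streaming) parser

@dataclass
class ModelTurn:
    tool_calls: list[ToolCall]   # empty once the model produces a final answer
    final_text: Optional[str]

def run_mcp_loop(
    prompt: str,
    generate: Callable[[list[dict]], ModelTurn],  # one non-streaming model call + parse
    call_tool: Callable[[ToolCall], str],         # executes an MCP tool, returns its output
    max_turns: int = 8,
) -> str:
    """Alternate model turns and MCP tool executions until a final answer appears."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):
        turn = generate(messages)
        if not turn.tool_calls:
            return turn.final_text or ""
        for call in turn.tool_calls:
            result = call_tool(call)
            # Feed the tool output back so the next turn can reason over it.
            messages.append({"role": "tool", "name": call.name, "content": result})
    raise RuntimeError("MCP loop did not terminate within max_turns")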

Test Plan

VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS=web_search_preview,container,code_interpreter VLLM_USE_EXPERIMENTAL_PARSER_CONTEXT=1 vllm serve MiniMaxAI/MiniMax-M2 \
  --tensor-parallel-size 4 \
  --tool-call-parser minimax_m2 \
  --reasoning-parser minimax_m2 \
  --enable-auto-tool-choice --trust-remote-code \
  --tool-server=localhost:8081/container,localhost:8081/browser,localhost:8081/python

Function tool

curl http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "MiniMaxAI/MiniMax-M2",
    "input": [
      {
        "role": "user",
        "content": "What is the weather in Paris in Celsius today?"
      }
    ],
    "tools": [{
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location", "unit"]
        }
    }],
    "temperature": 0.7,
    "max_output_tokens": 2000
  }'
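
The same function-tool request can also be issued with the OpenAI Python client pointed at the vLLM server. This is a sketch under the assumption that the server is OpenAI-compatible, as the curl call above implies; the base URL and dummy API key are placeholders.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy-api-key")

response = client.responses.create(
    model="MiniMaxAI/MiniMax-M2",
    input=[{"role": "user", "content": "What is the weather in Paris in Celsius today?"}],
    tools=[{
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location", "unit"],
        },
    }],
    temperature=0.7,
    max_output_tokens=2000,
)

# Print any function_call items the parser extracted from the model output.
for item in response.output:
    if item.type == "function_call":
        print(item.name, item.arguments)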

MCP tool

curl -X POST "http://localhost:8000/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dummy-api-key" \
  -d '{
        "model": "MiniMaxAI/MiniMax-M2",
        "input": "Multiply 64548*15151 using the python tool.",
        "tools": [
          {
            "type": "mcp",
            "server_label": "code_interpreter",
            "headers": {"test": "test"},
            "server_url": "IGNORED"
          }
        ]
      }'
Response:

{
    "id": "resp_175d2aab96cc4d519a732db73370a42f",
    "created_at": 1764055537,
    "incomplete_details": null,
    "instructions": null,
    "metadata": null,
    "model": "MiniMaxAI/MiniMax-M2",
    "object": "response",
    "output": [
        {
            "id": "temp",
            "summary": [],
            "type": "reasoning",
            "content": [
                {
                    "text": "Okay, the user wants me to multiply two numbers, 64548 and 15151, using a Python tool. Looking at my available tools, I see I have a `code_interpreter` which allows me to execute Python code. This is perfect for performing the multiplication. I'll use the `code_interpreter` tool with the multiplication expression.\n",
                    "type": "reasoning_text"
                }
            ],
            "encrypted_content": null,
            "status": null
        },
        {
            "id": "lol",
            "arguments": "{\"code\": \"64548 * 15151\"}",
            "name": "code_interpreter",
            "server_label": "code_interpreter",
            "type": "mcp_call",
            "approval_request_id": null,
            "error": null,
            "output": "977966748\n",
            "status": "completed"
        },
        {
            "id": "temp",
            "summary": [],
            "type": "reasoning",
            "content": [
                {
                    "text": "Okay, I see that I've just used the code_interpreter tool to calculate the product of 64548 and 15151. Let me look at the result that was returned.\n\nThe tool executed the multiplication and returned the value 977966748. This is the mathematical result of 64548 × 15151, which is what the user requested.\n\nLet me double-check this calculation in my head... Actually, that's a pretty large multiplication to do mentally, so I'll trust the tool's calculation. Code interpreters are generally reliable for basic arithmetic operations like this.\n\nThe result looks like a 9-digit number, which makes sense since we're multiplying two 5-digit numbers. The product should be somewhere in the range of 10^9 to 10^10, and 977966748 fits within that range, so that seems reasonable.\n\nI should present this result clearly to the user. Since they specifically asked for the multiplication using a Python tool, I should mention that the calculation was done using the code_interpreter tool, and then provide the exact numerical result.\n\nI don't think I need to show the full calculation process since the result is straightforward. The user just wanted the product, not a step-by-step explanation. If they had asked for how to do the multiplication in Python or if there was any complexity to the calculation, I might have shown the code as well.\n\nSo my response will be simple: just state the result of 64548 × 15151 = 977966748.\n",
                    "type": "reasoning_text"
                }
            ],
            "encrypted_content": null,
            "status": null
        },
        {
            "id": "lol",
            "content": [
                {
                    "annotations": [],
                    "text": "\n\nThe result of multiplying 64548 by 15151 is 977966748.",
                    "type": "output_text",
                    "logprobs": null
                }
            ],
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": true,
    "temperature": 1.0,
    "tool_choice": "auto",
    "tools": [
        {
            "server_label": "code_interpreter",
            "type": "mcp",
            "allowed_tools": null,
            "authorization": null,
            "connector_id": null,
            "headers": {
                "test": "test"
            },
            "require_approval": null,
            "server_description": null,
            "server_url": "IGNORED"
        }
    ],
    "top_p": 0.95,
    "background": false,
    "max_output_tokens": 196281,
    "max_tool_calls": null,
    "previous_response_id": null,
    "prompt": null,
    "reasoning": null,
    "service_tier": "auto",
    "status": "completed",
    "text": null,
    "top_logprobs": null,
    "truncation": "disabled",
    "usage": {
        "input_tokens": 327,
        "input_tokens_details": {
            "cached_tokens": 256,
            "input_tokens_per_turn": [],
            "cached_tokens_per_turn": []
        },
        "output_tokens": 422,
        "output_tokens_details": {
            "reasoning_tokens": 0,
            "tool_output_tokens": 0,
            "output_tokens_per_turn": [],
            "tool_output_tokens_per_turn": []
        },
        "total_tokens": 749
    },
    "user": null,
    "input_messages": null,
    "output_messages": null
}
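
For completeness, the same MCP request via the OpenAI Python client, plus a small walk over the output items shown above (reasoning, mcp_call, message). A sketch only; it assumes the openai package and reuses the dummy API key from the curl above.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy-api-key")

response = client.responses.create(
    model="MiniMaxAI/MiniMax-M2",
    input="Multiply 64548*15151 using the python tool.",
    tools=[{
        "type": "mcp",
        "server_label": "code_interpreter",
        "headers": {"test": "test"},
        "server_url": "IGNORED",
    }],
)

# The output mixes reasoning items, the executed mcp_call, and the final message.
for item in response.output:
    if item.type == "mcp_call":
        print("tool call:", item.name, item.arguments, "->", item.output)
    elif item.type == "message":
        print("answer:", item.content[0].text)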

@qandrew qandrew force-pushed the mcp-1 branch 3 times, most recently from c7bb24b to 5b284ed on November 25, 2025 07:27
@qandrew qandrew changed the base branch from main to refactor-responses-utils-1 on November 25, 2025 07:30
@qandrew qandrew force-pushed the mcp-1 branch 2 times, most recently from 7d18427 to 5988075 on November 25, 2025 07:52
@qandrew qandrew changed the base branch from refactor-responses-utils-1 to refactor-responses-utils-2 on November 25, 2025 14:14
@qandrew qandrew force-pushed the mcp-1 branch 3 times, most recently from 1bf7843 to 3493686 on November 25, 2025 14:43

Commits (each signed off by Andrew Xia <[email protected]>):

  • round trip works
  • add env var
  • generic eos
  • WIP track sentences in parser
  • dont use sentence types
  • parser can get the function tool call initial commit
  • implement call tool for MCP
  • working mcp
  • kimi k2 fixes
  • merge fixes
  • lol