Skip to content

Conversation

@qandrew
Copy link
Owner

@qandrew qandrew commented Nov 7, 2025

Purpose

  • parser manages tool parsing (non streaming)
  • can do a MCP loop

Test Plan

VLLM_USE_EXPERIMENTAL_PARSER_CONTEXT=1 vllm serve MiniMaxAI/MiniMax-M2   --tensor-parallel-size 4   --tool-call-parser minimax_m2   --reasoning-parser minimax_m2    --enable-auto-tool-choice --trust-remote-code --tool-server=localhost:8081/container,localhost:8081/browser,localhost:8081/python

function tool

curl http://localhost:8000/v1/responses   -H "Content-Type: application/json"   -N   -d '{
    "model": "MiniMaxAI/MiniMax-M2",
    "input": [
      {
        "role": "user",
        "content": "What is the weather in Paris in Celsius today?"
      }
    ],
    "tools": [{
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location", "unit"]
        }
    }],
    "temperature": 0.7,
    "max_output_tokens": 2000
  }'

mcp tool

curl -X POST "http://localhost:8000/v1/responses"   -H "Content-Type: application/json"   -H "Authorization: Bearer dummy-api-key"   -d '{
        "model": "MiniMaxAI/MiniMax-M2",
        "input": "Multiply 64548*15151 using the python tool.",
        "tools": [
          {
            "type": "mcp",
            "server_label": "code_interpreter",
            "headers": {"test": "test"},
            "server_url": "IGNORED"
          }
        ]
      }'

Test Result

MCP

{
    "id": "resp_e09f23e5a25d41f19917db4d17cd25bb",
    "created_at": 1762559699,
    "incomplete_details": null,
    "instructions": null,
    "metadata": null,
    "model": "MiniMaxAI/MiniMax-M2",
    "object": "response",
    "output": [
        {
            "id": "rs_88851126b65044f29dbd915193344075",
            "summary": [],
            "type": "reasoning",
            "content": [
                {
                    "text": "Okay, the user wants me to multiply two numbers, 64548 and 15151, using a Python tool. Looking at my available tools, I see I have a `code_interpreter` tool that can run Python code. This is perfect for this task.\n\nLet me think about how to approach this. The user has given me two integers to multiply, which is a straightforward arithmetic operation. I could certainly do this calculation mentally, but since the user specifically asked me to use the Python tool, I should use that.\n\nTo use the code_interpreter tool, I need to:\n1. Write a Python script that performs the multiplication\n2. Pass this script to the code_interpreter tool\n3. The tool will execute the script and return the result\n\nFor the Python script, I'll write a simple line that multiplies the two numbers:\n```python\nresult = 64548 * 15151\n```\n\nAnd then I should print the result so it's visible:\n```python\nprint(result)\n```\n\nI could combine these into a single line:\n```python\nprint(64548 * 15151)\n```\n\nBut keeping them as two separate lines might make it clearer what's happening in the code.\n\nLet me structure the tool call correctly. The code_interpreter tool requires a JSON object with the code to execute. So I'll format my request as:\n- tool name: \"code_interpreter\"\n- arguments: {\"code\": \"result = 64548 * 15151\\nprint(result)\"}\n\nThis should execute the Python code and return the product of 64548 and 15151. The multiplication of these two numbers will give us the final answer, which I can then provide to the user.\n",
                    "type": "reasoning_text"
                }
            ],
            "encrypted_content": null,
            "status": "completed"
        },
        {
            "id": "msg_91cb29d6ec5b4d65835156e1f452aa36",
            "content": [
                {
                    "annotations": [],
                    "text": "\n\n\n",
                    "type": "output_text",
                    "logprobs": null
                }
            ],
            "role": "assistant",
            "status": "completed",
            "type": "message"
        },
        {
            "arguments": "{\"code\": \"result = 64548 * 15151\\nprint(result)\"}",
            "call_id": "call_c4c9b8c5e6f84ded8bdf059c76607e2b",
            "name": "code_interpreter",
            "type": "function_call",
            "id": "fc_c28f846b7361458895f7c2503a7a6b65",
            "status": "completed"
        },
        {
            "id": "rs_eae56d29a97e4698a4b33cf55f2ac63b",
            "summary": [],
            "type": "reasoning",
            "content": [
                {
                    "text": "I need to analyze what happened in this conversation. The user asked me to multiply 64548 by 15151 using the Python tool. This is a straightforward arithmetic operation, but since they specifically requested using Python, I need to use the code_interpreter tool that I have available.\n\nLooking at my available tools, I can see I have access to a \"code_interpreter\" which can execute Python code. This is perfect for this multiplication task.\n\nI called the code_interpreter tool with a simple Python script:\n```python\nresult = 64548 * 15151\nprint(result)\n```\n\nThis script does two things:\n1. It calculates the product of 64548 and 15151 and stores it in a variable called \"result\"\n2. It prints the value of that variable to output the answer\n\nThe tool executed this code and returned \"977966748\" as the result. This is the product of multiplying 64548 by 15151, which is what the user requested.\n\nNow I need to provide this answer to the user in a clear and straightforward way. I don't need to include the Python code itself in my response since the user just wanted the result of the calculation. I'll simply state the answer directly: 977,966,748.\n\nI should format this number with commas for better readability, as large numbers are typically written with thousands separators. So I'll write it as \"977,966,748\" rather than \"977966748\".\n\nThe calculation is complete, and I have the correct answer to provide to the user.\n",
                    "type": "reasoning_text"
                }
            ],
            "encrypted_content": null,
            "status": "completed"
        },
        {
            "id": "msg_ae7f39c27bd34559a372adef918787b4",
            "content": [
                {
                    "annotations": [],
                    "text": "\n\nThe result of multiplying 64548 by 15151 is 977,966,748.",
                    "type": "output_text",
                    "logprobs": null
                }
            ],
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": true,
    "temperature": 1.0,
    "tool_choice": "auto",
    "tools": [
        {
            "server_label": "code_interpreter",
            "type": "mcp",
            "allowed_tools": null,
            "authorization": null,
            "connector_id": null,
            "headers": {
                "test": "test"
            },
            "require_approval": null,
            "server_description": null,
            "server_url": "IGNORED"
        }
    ],
    "top_p": 0.95,
    "background": false,
    "max_output_tokens": 196607,
    "max_tool_calls": null,
    "previous_response_id": null,
    "prompt": null,
    "reasoning": null,
    "service_tier": "auto",
    "status": "completed",
    "text": null,
    "top_logprobs": null,
    "truncation": "disabled",
    "usage": {
        "input_tokens": 614,
        "input_tokens_details": {
            "cached_tokens": 528,
            "input_tokens_per_turn": [],
            "cached_tokens_per_turn": []
        },
        "output_tokens": 715,
        "output_tokens_details": {
            "reasoning_tokens": 0,
            "tool_output_tokens": 0,
            "output_tokens_per_turn": [],
            "tool_output_tokens_per_turn": []
        },
        "total_tokens": 1329
    },
    "user": null,
    "input_messages": null,
    "output_messages": null
}

@qandrew qandrew force-pushed the parser-sentence-mcp branch from 9861f0f to 69022c4 Compare November 8, 2025 00:28
@qandrew qandrew force-pushed the parser-sentence-vllm branch 4 times, most recently from 4272381 to c7e56ac Compare November 16, 2025 00:36
Andrew Xia added 5 commits November 15, 2025 16:36
Signed-off-by: Andrew Xia <[email protected]>
Signed-off-by: Andrew Xia <[email protected]>
Signed-off-by: Andrew Xia <[email protected]>
Signed-off-by: Andrew Xia <[email protected]>
@qandrew qandrew force-pushed the parser-sentence-mcp branch from a02b1be to c225a90 Compare November 16, 2025 00:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants