
[Model] Support IQuestCoder model #31575

Merged
DarkLight1337 merged 7 commits into vllm-project:main from yxing-bj:dev/loop_model
Jan 8, 2026

Conversation

@yxing-bj
Contributor

@yxing-bj yxing-bj commented Dec 31, 2025

Purpose

IQuest-Coder-V1 is a new family of code large language models (LLMs) designed to advance autonomous software engineering and code intelligence. We have built a repository for IQuestCoder.

We have uploaded these models to Hugging Face, including IQuestCoder and IQuestLoopCoder. To make them easier for everyone to use, this PR adds support for these models to the vLLM platform.

Test Plan

First, we launch a vLLM server:

  • For instruct models:
vllm serve IQuestLab/IQuest-Coder-V1-40B-Instruct --tensor-parallel-size 4 --trust-remote-code
  • For Loop instruct models:
vllm serve IQuestLab/IQuest-Coder-V1-40B-Loop-Instruct --tensor-parallel-size 4 --trust-remote-code
  • For Thinking models with reasoning support:
vllm serve IQuestLab/IQuest-Coder-V1-40B-Thinking --reasoning-parser qwen3 --tensor-parallel-size 4 --trust-remote-code
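When the thinking model is served with --reasoning-parser qwen3, vLLM separates the chain-of-thought into a reasoning_content field alongside the final content in the response message. A minimal sketch of splitting the two; the helper and the sample values are illustrative, only the field names follow vLLM's reasoning-outputs convention:

```python
def split_reasoning(message_dict):
    # vLLM's reasoning parser puts the chain-of-thought into
    # "reasoning_content" and the final answer into "content".
    return message_dict.get("reasoning_content"), message_dict.get("content")

# Hypothetical response message for illustration.
reasoning, answer = split_reasoning(
    {"reasoning_content": "Let me check the edge cases...",
     "content": "Here is the function."}
)
```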

Then, we use vLLM through its OpenAI-compatible API endpoint.
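The endpoint can be exercised with a short client sketch. To stay self-contained it only builds the request; the path follows the OpenAI chat-completions convention, and the port and model name are assumptions matching the serve commands above:

```python
import json
import urllib.request

def build_chat_request(base_url, model, prompt):
    # Assemble the JSON body expected by the /v1/chat/completions endpoint.
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
    }
    return urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request(
    "http://localhost:8000/v1",
    "IQuestLab/IQuest-Coder-V1-40B-Instruct",
    "Write a function that reverses a string.",
)
# urllib.request.urlopen(req) would send it once the server is running.
```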

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify mergify bot added the new-model Requests to new models label Dec 31, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for two new models, IQuestCoder and IQuestLoopCoder, by adding their respective implementations. The code is well-structured and largely follows the existing patterns for model integration in vLLM. My review focuses on ensuring the correctness and clarity of the new model definitions. I've identified a few areas for improvement, mainly related to cleaning up unused code for pipeline parallelism that seems to have been carried over from template files, and correcting some type hint mismatches. Addressing these points will improve the maintainability and correctness of the new model implementations.

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which executes a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run the full CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@yxing-bj yxing-bj changed the title support iquest model [Model] Support iquest model Dec 31, 2025
@yxing-bj yxing-bj changed the title [Model] Support iquest model [Model] Support IQuestCoder model Dec 31, 2025
@youkaichao
Member

@yxing-bj you can send an email to collaboration@vllm.ai for official collaboration!

@yxing-bj
Contributor Author

yxing-bj commented Jan 1, 2026

@yxing-bj you can send an email to collaboration@vllm.ai for official collaboration!

Thanks. How should I send an email to establish a collaboration?

@youkaichao
Member

@yxing-bj you can send an email to collaboration@vllm.ai for official collaboration!

Thanks. How should I send an email to establish a collaboration?

Just send an email from your company email address, and then we can better coordinate model releases before it goes public.

@linygood

linygood commented Jan 6, 2026

When can this PR be merged into the main branch?

@yxing-bj
Contributor Author

yxing-bj commented Jan 6, 2026

When can this PR be merged into the main branch?

I replaced LoopCoderNorm with RMSNorm and ran the evaluation again. If the results are OK, I will refactor the code.
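For reference, standard RMSNorm, which the comment above swaps in for LoopCoderNorm, can be sketched in NumPy. This is an illustrative version of the well-known formula, not vLLM's fused kernel:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: divide by the root-mean-square of the features, then apply
    # a learned per-feature weight. Unlike LayerNorm there is no mean
    # subtraction and no bias term.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight
```

Note that RMSNorm is scale-invariant: multiplying the input by a constant leaves the output (up to eps) unchanged.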

@mergify

mergify bot commented Jan 6, 2026

Documentation preview: https://vllm--31575.org.readthedocs.build/en/31575/

@mergify mergify bot added the documentation Improvements or additions to documentation label Jan 6, 2026
Signed-off-by: yxing <yxing@iquestlab.com>
@linygood

linygood commented Jan 6, 2026

Does IQuestLab/IQuest-Coder-V1-40B-Instruct support tool calls? Should I use the qwen3_coder template?

vllm serve IQuestLab/IQuest-Coder-V1-40B-Instruct --tensor-parallel-size 4 --trust-remote-code --tool-call-parser qwen3_coder --enable-auto-tool-choice

@yxing-bj
Contributor Author

yxing-bj commented Jan 7, 2026

Does IQuestLab/IQuest-Coder-V1-40B-Instruct support tool calls? Should I use the qwen3_coder template?

vllm serve IQuestLab/IQuest-Coder-V1-40B-Instruct --tensor-parallel-size 4 --trust-remote-code --tool-call-parser qwen3_coder --enable-auto-tool-choice

We support the qwen3_coder template; you can try it. However, the tool-call content generated by the current model is still not quite satisfactory. In the future, we will update the model to better support tools.

@linygood

linygood commented Jan 7, 2026

I have tried --tool-call-parser qwen3_coder and --tool-call-parser hermes; the tool-call accuracy in the terminal is close to 0. Maybe something is wrong. Could you give me the correct config?

@tbraun96

tbraun96 commented Jan 7, 2026

We need the tokenizer.json for the loop models

@Alan-D-Chen

When can this PR be merged into the main branch?

Same question!

@yxing-bj
Contributor Author

yxing-bj commented Jan 8, 2026


In order to support tool calls, we launch the vLLM service with the following command:

vllm serve IQuestLab/IQuestCoder-Instruct --port 8000 --enable-log-requests -tp 4 -dp 1 --trust-remote-code --tool-call-parser hermes --enable-auto-tool-choice

Then we send a request and get a response. The following is an example:

from openai import OpenAI

# Point the client at the local vLLM server started above.
client = OpenAI(
    api_key="1234",
    base_url="http://localhost:8000/v1/",
)

def send_messages(messages, tools_list):
    # Send the chat history plus the tool schema and return the reply message.
    response = client.chat.completions.create(
        model="IQuestLab/IQuestCoder-Instruct",
        messages=messages,
        tools=tools_list,
    )
    return response.choices[0].message

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather of a location, the user should supply a location first.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    }
                },
                "required": ["location"]
            },
        }
    },
]

messages = [{"role": "user", "content": "How's the weather in Beijing?"}]
message = send_messages(messages, tools)

print(f"User>\t {messages[0]['content']}")

tool = message.tool_calls[0]
print("tool: ", tool)
messages.append(message)

messages.append({"role": "tool", "tool_call_id": tool.id, "content": "the temperature is 24℃ and it would be sunny"})
message = send_messages(messages, tools)

print(f"Model>\t {message.content}")

Then we would get the result:

User>	 How's the weather in Beijing?
tool:  ChatCompletionMessageFunctionToolCall(id='chatcmpl-tool-9ac812fa4039c085', function=Function(arguments='{"location": "Beijing"}', name='get_weather'), type='function')
Model>	 The weather in Beijing is sunny with a temperature of 24°C.

We can also try other BFCL examples; sample user prompts are in BFCL_v4_web_search.json and the tools are in web_search.json.
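One practical caveat with the example above: when the model answers in plain text instead of calling a tool, message.tool_calls is None, so indexing tool_calls[0] raises. A small hypothetical guard (not part of this PR) makes the client robust:

```python
def extract_tool_call(message):
    # Return the first tool call if the model produced one, else None,
    # so the caller can fall back to handling message.content as plain text.
    calls = getattr(message, "tool_calls", None)
    if not calls:
        return None
    return calls[0]
```

Calling code can then branch on None instead of assuming every reply contains a tool call.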

Signed-off-by: yxing <yxing@iquestlab.com>
Member

@DarkLight1337 DarkLight1337 left a comment


Ok, LGTM then

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) January 8, 2026 09:34
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 8, 2026
@mergify

mergify bot commented Jan 8, 2026

Hi @yxing-bj, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@DarkLight1337
Member

Please fix pre-commit

Signed-off-by: yxing <yxing@iquestlab.com>
auto-merge was automatically disabled January 8, 2026 11:10

Head branch was pushed to by a user without write access

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) January 8, 2026 13:25
@DarkLight1337 DarkLight1337 merged commit fe86be6 into vllm-project:main Jan 8, 2026
53 of 54 checks passed
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
Signed-off-by: yxing <yxing@iquestlab.com>
@linygood

linygood commented Jan 13, 2026


I have tried this config, but I find the model returns toolCalls.function.name = ***, and this name is not defined in the request tools list. I want to know whether the tool-call accuracy of the current model has been verified.

{
  "toolCalls": [
    {
      "index": 0,
      "id": "chatcmpl-tool-***",
      "type": "function",
      "function": {
        "arguments": "***",
        "name": "****"
      }
    }
  ]
}

"temperature":0.6
"top_p":0.8
"top_k":20

@yxing-bj
Contributor Author

yxing-bj commented Jan 14, 2026


We have noticed this issue and will fix it in a new version.

akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: yxing <yxing@iquestlab.com>
@Mte90

Mte90 commented Jan 19, 2026

We have noticed this issue and will fix it in a new version. @yxing-bj

A new version of vLLM or of the model? Because with opencode I have the issue that the model hallucinates, calling tools that don't exist (like str_replace), and when I specifically say to use edit instead, it passes that tool's JSON parameters.
Maybe vLLM needs to be called in a specific way for everything to work?

dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: yxing <yxing@iquestlab.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: yxing <yxing@iquestlab.com>

Labels

documentation Improvements or additions to documentation new-model Requests to new models ready ONLY add when PR is ready to merge/full CI is needed


8 participants