
[Model] Support IQuestCoder model #31575

Merged
DarkLight1337 merged 7 commits into vllm-project:main from yxing-bj:dev/loop_model
Jan 8, 2026

Conversation

@yxing-bj
Contributor

@yxing-bj yxing-bj commented Dec 31, 2025

Purpose

IQuest-Coder-V1 is a new family of code large language models (LLMs) designed to advance autonomous software engineering and code intelligence. We have built a repository for IQuestCoder.

We have uploaded these models to Hugging Face, including IQuestCoder and IQuestLoopCoder. To make them easier for everyone to use, this PR adds support for these models to the vLLM platform.

Test Plan

First, we launch a vLLM server:

  • For instruct models:
vllm serve IQuestLab/IQuest-Coder-V1-40B-Instruct --tensor-parallel-size 4 --trust-remote-code
  • For Loop instruct models:
vllm serve IQuestLab/IQuest-Coder-V1-40B-Loop-Instruct --tensor-parallel-size 4 --trust-remote-code
  • For Thinking models with reasoning support:
vllm serve IQuestLab/IQuest-Coder-V1-40B-Thinking --reasoning-parser qwen3 --tensor-parallel-size 4 --trust-remote-code
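When the thinking model is served with --reasoning-parser qwen3, vLLM separates the chain-of-thought into a reasoning_content field alongside the final content in the response message. A minimal sketch of splitting the two; the helper and the sample values are illustrative, only the field names follow vLLM's reasoning-outputs convention:

```python
def split_reasoning(message_dict):
    # vLLM's reasoning parser puts the chain-of-thought into
    # "reasoning_content" and the final answer into "content".
    return message_dict.get("reasoning_content"), message_dict.get("content")

# Hypothetical response message for illustration.
reasoning, answer = split_reasoning(
    {"reasoning_content": "Let me check the edge cases...",
     "content": "Here is the function."}
)
```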

Then, we use vLLM through its OpenAI-compatible API endpoint.
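The endpoint can be exercised with a short client sketch. To stay self-contained it only builds the request; the path follows the OpenAI chat-completions convention, and the port and model name are assumptions matching the serve commands above:

```python
import json
import urllib.request

def build_chat_request(base_url, model, prompt):
    # Assemble the JSON body expected by the /v1/chat/completions endpoint.
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
    }
    return urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request(
    "http://localhost:8000/v1",
    "IQuestLab/IQuest-Coder-V1-40B-Instruct",
    "Write a function that reverses a string.",
)
# urllib.request.urlopen(req) would send it once the server is running.
```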

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify mergify bot added the new-model Requests to new models label Dec 31, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for two new models, IQuestCoder and IQuestLoopCoder, by adding their respective implementations. The code is well-structured and largely follows the existing patterns for model integration in vLLM. My review focuses on ensuring the correctness and clarity of the new model definitions. I've identified a few areas for improvement, mainly related to cleaning up unused code for pipeline parallelism that seems to have been carried over from template files, and correcting some type hint mismatches. Addressing these points will improve the maintainability and correctness of the new model implementations.

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which executes a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run the full CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@yxing-bj yxing-bj changed the title support iquest model [Model] Support iquest model Dec 31, 2025
@yxing-bj yxing-bj changed the title [Model] Support iquest model [Model] Support IQuestCoder model Dec 31, 2025
@youkaichao
Member

@yxing-bj you can send an email to collaboration@vllm.ai for official collaboration!

@yxing-bj
Contributor Author

yxing-bj commented Jan 1, 2026

@yxing-bj you can send an email to collaboration@vllm.ai for official collaboration!

Thanks. How should I send an email to establish a collaboration?

@youkaichao
Member

@yxing-bj you can send an email to collaboration@vllm.ai for official collaboration!

Thanks. How should I send an email to establish a collaboration?

Just send an email from your company email address, and then we can better coordinate model releases before it goes public.

@linygood

linygood commented Jan 6, 2026

When can this PR be merged into the main branch?

@yxing-bj
Contributor Author

yxing-bj commented Jan 6, 2026

When can this PR be merged into the main branch?

I replaced LoopCoderNorm with RMSNorm and ran the evaluation again. If the results are OK, I will refactor the code.
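For reference, standard RMSNorm, which the comment above swaps in for LoopCoderNorm, can be sketched in NumPy. This is an illustrative version of the well-known formula, not vLLM's fused kernel:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: divide by the root-mean-square of the features, then apply
    # a learned per-feature weight. Unlike LayerNorm there is no mean
    # subtraction and no bias term.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight
```

Note that RMSNorm is scale-invariant: multiplying the input by a constant leaves the output (up to eps) unchanged.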

@mergify

mergify bot commented Jan 6, 2026

Documentation preview: https://vllm--31575.org.readthedocs.build/en/31575/

@mergify mergify bot added the documentation Improvements or additions to documentation label Jan 6, 2026
Signed-off-by: yxing <yxing@iquestlab.com>
@linygood

linygood commented Jan 6, 2026

Does IQuestLab/IQuest-Coder-V1-40B-Instruct support tool calls? Should I use the qwen3_coder template?

vllm serve IQuestLab/IQuest-Coder-V1-40B-Instruct --tensor-parallel-size 4 --trust-remote-code --tool-call-parser qwen3_coder --enable-auto-tool-choice

@yxing-bj
Contributor Author

yxing-bj commented Jan 7, 2026

Does IQuestLab/IQuest-Coder-V1-40B-Instruct support tool calls? Should I use the qwen3_coder template?

vllm serve IQuestLab/IQuest-Coder-V1-40B-Instruct --tensor-parallel-size 4 --trust-remote-code --tool-call-parser qwen3_coder --enable-auto-tool-choice

We support the qwen3_coder template; you can try it. However, the tool-call content generated by the current model is still not quite satisfactory. In the future, we will update the model to better support tools.

@linygood

linygood commented Jan 7, 2026

I have tried --tool-call-parser qwen3_coder and --tool-call-parser hermes; the tool-call accuracy in the terminal is close to 0. Maybe something is wrong. Could you give me the correct config?

@tbraun96

tbraun96 commented Jan 7, 2026

We need the tokenizer.json for the loop models

@Alan-D-Chen

When can this PR be merged into the main branch?

Same question!

@yxing-bj
Contributor Author

yxing-bj commented Jan 8, 2026


In order to support tool calls, we launch the vLLM service with the following command:

vllm serve IQuestLab/IQuestCoder-Instruct --port 8000 --enable-log-requests -tp 4 -dp 1 --trust-remote-code --tool-call-parser hermes --enable-auto-tool-choice

Then we send a request and get a response. The following is an example:

from openai import OpenAI

# Point the client at the local vLLM server started above.
client = OpenAI(
    api_key="1234",
    base_url="http://localhost:8000/v1/",
)

def send_messages(messages, tools_list):
    # Send the chat history plus the tool schema and return the reply message.
    response = client.chat.completions.create(
        model="IQuestLab/IQuestCoder-Instruct",
        messages=messages,
        tools=tools_list,
    )
    return response.choices[0].message

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather of a location, the user should supply a location first.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    }
                },
                "required": ["location"]
            },
        }
    },
]

messages = [{"role": "user", "content": "How's the weather in Beijing?"}]
message = send_messages(messages, tools)

print(f"User>\t {messages[0]['content']}")

tool = message.tool_calls[0]
print("tool: ", tool)
messages.append(message)

messages.append({"role": "tool", "tool_call_id": tool.id, "content": "the temperature is 24℃ and it would be sunny"})
message = send_messages(messages, tools)

print(f"Model>\t {message.content}")

Then we would get the result:

User>	 How's the weather in Beijing?
tool:  ChatCompletionMessageFunctionToolCall(id='chatcmpl-tool-9ac812fa4039c085', function=Function(arguments='{"location": "Beijing"}', name='get_weather'), type='function')
Model>	 The weather in Beijing is sunny with a temperature of 24°C.

We can also try other BFCL examples; sample user prompts are in BFCL_v4_web_search.json and the tools are in web_search.json.
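One practical caveat with the example above: when the model answers in plain text instead of calling a tool, message.tool_calls is None, so indexing tool_calls[0] raises. A small hypothetical guard (not part of this PR) makes the client robust:

```python
def extract_tool_call(message):
    # Return the first tool call if the model produced one, else None,
    # so the caller can fall back to handling message.content as plain text.
    calls = getattr(message, "tool_calls", None)
    if not calls:
        return None
    return calls[0]
```

Calling code can then branch on None instead of assuming every reply contains a tool call.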

Signed-off-by: yxing <yxing@iquestlab.com>
Member

@DarkLight1337 DarkLight1337 left a comment


Ok, LGTM then

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) January 8, 2026 09:34
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 8, 2026
@mergify

mergify bot commented Jan 8, 2026

Hi @yxing-bj, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@DarkLight1337
Member

Please fix pre-commit

Signed-off-by: yxing <yxing@iquestlab.com>
auto-merge was automatically disabled January 8, 2026 11:10

Head branch was pushed to by a user without write access

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) January 8, 2026 13:25
@DarkLight1337 DarkLight1337 merged commit fe86be6 into vllm-project:main Jan 8, 2026
53 of 54 checks passed
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
Signed-off-by: yxing <yxing@iquestlab.com>
@linygood

linygood commented Jan 13, 2026


I have tried this config, but I find the model returns toolCalls.function.name = ***, and this name is not defined in the request tools list. I want to know whether the tool-call accuracy of the current model has been verified.

{
  "toolCalls": [
    {
      "index": 0,
      "id": "chatcmpl-tool-***",
      "type": "function",
      "function": {
        "arguments": "***",
        "name": "****"
      }
    }
  ]
}

"temperature":0.6
"top_p":0.8
"top_k":20

@yxing-bj
Contributor Author

yxing-bj commented Jan 14, 2026


We have noticed this issue and will fix it in a new version.

akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: yxing <yxing@iquestlab.com>
@Mte90

Mte90 commented Jan 19, 2026

We have noticed this issue and will fix it in a new version. @yxing-bj

A new version of vLLM or of the model? Because with opencode I have the issue that the model hallucinates, calling tools that don't exist (like str_replace), and when I specifically say to use edit instead, it passes that tool's JSON parameters.
Maybe vLLM needs to be called in a specific way for everything to work?

dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: yxing <yxing@iquestlab.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: yxing <yxing@iquestlab.com>

Labels

documentation Improvements or additions to documentation new-model Requests to new models ready ONLY add when PR is ready to merge/full CI is needed


8 participants