[Model] Add Grok-2 by dangoldbj · Pull Request #31847 · vllm-project/vllm

dangoldbj · 2026-01-07T00:27:50Z

Purpose

Add first-class vLLM support for Grok-2 models by extending tokenizer support, model registry resolution, execution wiring, tests, and documentation.

This change is not a config-only enablement. It introduces a tiktoken-based Grok-2 tokenizer with .tok.json loading and Grok-style chat templates, wires Grok-family model configurations into the vLLM loading and execution paths, and ensures compatibility with existing inference behavior.

Targeted tests and documentation updates are included to validate correctness and make Grok-2 usable in vLLM without downstream patches.
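As a rough illustration of what ".tok.json loading" might involve, the sketch below parses a JSON vocabulary into the `bytes -> rank` mapping that tiktoken's `Encoding` constructor accepts as `mergeable_ranks`. The schema (`regular_tokens` / `special_tokens` entries with `bytes` and `token` fields), the helper name `load_tok_json`, and the sample `<|eos|>` token are assumptions for illustration only; they are not taken from this PR's diff.

```python
import json
import tempfile
from pathlib import Path


def load_tok_json(path):
    """Parse a .tok.json-style vocabulary file.

    ASSUMED schema (hypothetical, not confirmed by this PR):
      {"regular_tokens": [{"bytes": [104, 105], "token": 0}, ...],
       "special_tokens": [{"bytes": [...], "token": 3}, ...]}
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

    # Ordinary tokens: raw byte sequence -> integer rank.
    mergeable_ranks = {
        bytes(entry["bytes"]): entry["token"]
        for entry in data["regular_tokens"]
    }
    # Special tokens are keyed by their decoded string form.
    special_tokens = {
        bytes(entry["bytes"]).decode("utf-8"): entry["token"]
        for entry in data.get("special_tokens", [])
    }
    return mergeable_ranks, special_tokens


# Tiny made-up vocabulary to exercise the loader.
sample = {
    "regular_tokens": [
        {"bytes": [104], "token": 0},       # b"h"
        {"bytes": [105], "token": 1},       # b"i"
        {"bytes": [104, 105], "token": 2},  # b"hi"
    ],
    "special_tokens": [
        {"bytes": list(b"<|eos|>"), "token": 3},
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    vocab_path = Path(tmp) / "tokenizer.tok.json"
    vocab_path.write_text(json.dumps(sample), encoding="utf-8")
    ranks, specials = load_tok_json(vocab_path)
```

The two dictionaries could then be passed to `tiktoken.Encoding(name=..., pat_str=..., mergeable_ranks=ranks, special_tokens=specials)`; the split pattern `pat_str` would have to come from the model's own tokenizer definition.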

Test Plan

pytest tests/models/language/generation/test_grok.py -k grok
pytest tests/tokenizers_/test_registry.py -k grok2

Test Result

Pass

github-actions · 2026-01-07T00:27:59Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

mergify · 2026-01-07T00:28:29Z

Documentation preview: https://vllm--31847.org.readthedocs.build/en/31847/

gemini-code-assist

Code Review

This pull request adds support for the Grok-2 model. The changes are comprehensive, including the model implementation, a new tiktoken-based tokenizer, and updates to documentation and tests. The model logic is cleanly adapted from the existing Grok-1 implementation to support both models. The new tokenizer for Grok-2 is also well-implemented. I've identified a couple of high-severity issues related to silent error handling in the new tokenizer, which could lead to incorrect behavior without any warning to the user. Addressing these will improve the robustness and debuggability of the new tokenizer.

vllm/tokenizers/grok2.py
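The review summary above flags silent error handling in the tokenizer but does not quote the affected code. The following is a generic sketch of the pattern it describes, not the PR's actual code: one parser swallows every exception and returns an empty result, the other catches only the expected failure and re-raises it with context. The function names and the `regular_tokens` key are hypothetical.

```python
import json


def parse_vocab_silent(text):
    """Anti-pattern: any failure yields an empty vocabulary with no warning."""
    try:
        return json.loads(text)
    except Exception:
        return {}


def parse_vocab_strict(text):
    """Parse vocabulary JSON, raising a descriptive error on bad input."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # Narrow except: only malformed JSON is handled, and it is re-raised
        # with context rather than swallowed.
        raise ValueError(f"vocabulary is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or "regular_tokens" not in data:
        raise ValueError("vocabulary is missing the 'regular_tokens' key")
    return data


broken = '{"regular_tokens": ['  # simulated truncated file contents

# The silent variant hides the problem and hands back an empty dict.
silent_result = parse_vocab_silent(broken)

# The strict variant surfaces the problem at load time.
try:
    parse_vocab_strict(broken)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```

With the strict variant, a corrupt vocabulary file fails at startup instead of producing wrong token IDs later during inference.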

dangoldbj · 2026-01-07T00:30:41Z

cc @mgoin

mergify · 2026-01-07T00:32:32Z

Hi @dangoldbj, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

dangoldbj · 2026-01-07T01:24:50Z

@DarkLight1337 @ywang96 @mgoin Added Grok-2 support (a tiktoken-based tokenizer loaded from tokenizer.tok.json, plus Grok1/Grok2 model plumbing and weight-mapping tweaks) and included registry/docs updates. I would appreciate a review.

DarkLight1337

Thanks, any chance you can add some e2e tests for ensuring correctness?

mergify · 2026-01-07T10:32:24Z

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @dangoldbj.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

vllm/model_executor/models/grok1.py

dangoldbj · 2026-01-08T10:38:11Z

Thanks, LGTM now

@DarkLight1337 Thanks for the review! Appreciate it!

dangoldbj · 2026-01-08T12:49:10Z

@DarkLight1337 Looks like the current Buildkite failures are likely infra/env, not from this PR’s diff.

buildkite/ci/pr fails in 2 Node Tests (4 GPUs) with ROCM_HOME: unbound variable in .buildkite/scripts/run-multi-node-test.sh.
buildkite/amd-ci fails during collection of tests/entrypoints/test_grpc_server.py with ImportError: cannot import name 'vllm_engine_pb2' from 'vllm.grpc'.

This PR only touches Grok model/tokenizer + docs/tests. OK to merge/ignore?

DarkLight1337 · 2026-01-08T12:59:40Z

They are unrelated so we can merge.

dangoldbj requested review from DarkLight1337 and ywang96 as code owners January 7, 2026 00:27

mergify bot added the documentation label Jan 7, 2026

gemini-code-assist bot reviewed Jan 7, 2026

vllm/tokenizers/grok2.py (outdated, resolved)

vllm/tokenizers/grok2.py (outdated, resolved)

dangoldbj force-pushed the grok-2 branch 2 times, most recently from 83af8bf to f3138e5 Compare January 7, 2026 01:02

dangoldbj force-pushed the grok-2 branch from 407c59d to 47d69a0 Compare January 7, 2026 01:19

DarkLight1337 reviewed Jan 7, 2026

mergify bot added the new-model label Jan 7, 2026

mergify bot added the needs-rebase label Jan 7, 2026

dangoldbj force-pushed the grok-2 branch from d2323b8 to b2d7053 Compare January 7, 2026 10:39

mergify bot removed the needs-rebase label Jan 7, 2026

DarkLight1337 reviewed Jan 7, 2026

vllm/model_executor/models/grok1.py (outdated, resolved)

DarkLight1337 reviewed Jan 7, 2026

vllm/model_executor/models/grok1.py (resolved)

dangoldbj force-pushed the grok-2 branch from 56c2f49 to 7bbc402 Compare January 7, 2026 11:37

dangoldbj mentioned this pull request Jan 7, 2026

[New Model]: Grok 2 #23557

Closed

1 task

dangoldbj force-pushed the grok-2 branch from 7bbc402 to f2890a6 Compare January 7, 2026 13:13

dangoldbj requested a review from DarkLight1337 January 7, 2026 13:32

dangoldbj requested review from WoosukKwon, mgoin, tjtanaa, tlrmchlsmth and yewentao256 as code owners January 7, 2026 16:41

dangoldbj added 15 commits January 8, 2026 10:51

Add Grok-2 to supported models documentation

adc5f29

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

Add tiktoken-based tokenizer for Grok-2

3eb3d80

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

Register Grok-2 tokenizer with auto-detection

e81aa23

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

Linting

03d03fc

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

Remove catching a broad Exception

ed8bd66

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

Linting

ed2350c

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

Linting

4761c1b

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

improve type import for ChatCompletionMessageParam

aafea6e

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

Add e2e tests for checking correctness

4be0b3e

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

default router_logit_soft_cap to 0.0

3b39463

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

Fix vocab_size in grok2 e2e tests

f0fb253

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

Refactor Grok1/Grok2 into base + dispatcher

1f4820f

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

docs move Grok2 note out of table

1f8dab3

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

remove unused supports_grok2_tokenizer function

306c025

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

improve tests

a9177c6

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>
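One of the commits above defaults `router_logit_soft_cap` to 0.0. The PR page does not show how the value is used, so the following is only a sketch of a common tanh-based soft-capping form, under the assumption that a cap of 0.0 means "disabled". The function name `soft_cap` and the sample logits are hypothetical.

```python
import math


def soft_cap(logits, cap=0.0):
    """Apply tanh soft-capping to a list of logits.

    ASSUMPTION: a cap <= 0.0 means "disabled" and returns the input unchanged.
    """
    if cap <= 0.0:
        return list(logits)
    # cap * tanh(x / cap) is close to x for small |x| and saturates at +/-cap.
    return [cap * math.tanh(x / cap) for x in logits]


router_logits = [-100.0, 0.0, 2.0, 100.0]
uncapped = soft_cap(router_logits)          # default 0.0: values pass through
capped = soft_cap(router_logits, cap=30.0)  # values bounded to (-30, 30)
```

Under this assumption, a default of 0.0 leaves router logits untouched for configurations that do not set a cap, while a positive cap squashes extreme values smoothly.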

auto-merge was automatically disabled January 8, 2026 09:51
Head branch was pushed to by a user without write access

dangoldbj force-pushed the grok-2 branch from 1fb2733 to a9177c6 Compare January 8, 2026 09:51

github-actions bot added the ready label Jan 8, 2026

vllm-bot merged commit 59d260f into vllm-project:main Jan 8, 2026
50 of 53 checks passed

github-project-automation bot moved this from Ready to Done in NVIDIA Jan 8, 2026

yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026

[Model] Add Grok-2 (vllm-project#31847)

57603cf

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026

[Model] Add Grok-2 (vllm-project#31847)

e1b7f59

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026

[Model] Add Grok-2 (vllm-project#31847)

7902b75

Signed-off-by: dangoldbj <dangoldbj23@gmail.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>

ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

[Model] Add Grok-2 (vllm-project#31847)

fc34b92

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>

hmellor mentioned this pull request Mar 6, 2026

[Model] Add Support for Grok2 #24286

Closed

5 tasks
