[Model] Add Grok-2 #31847

Merged
vllm-bot merged 17 commits into vllm-project:main from dangoldbj:grok-2
Jan 8, 2026

@dangoldbj
Contributor

@dangoldbj dangoldbj commented Jan 7, 2026

Purpose

Add first-class vLLM support for Grok-2 models by extending tokenizer support, model registry resolution, execution wiring, tests, and documentation.

This change is not a config-only enablement. It introduces a tiktoken-based Grok-2 tokenizer with .tok.json loading and Grok-style chat templates, wires Grok-family model configurations into the vLLM loading and execution paths, and ensures compatibility with existing inference behavior.
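
To make the .tok.json loading step concrete, here is a minimal sketch of how such a file could be parsed into the lookup tables a tiktoken-style tokenizer needs. The schema used here (lists named "regular_tokens" and "special_tokens" with "bytes" and "token" fields), the helper name load_tok_json, and the "<|pad|>" token are assumptions made for illustration; the PR description does not spell out the file layout, and this is not the code from the PR.

```python
import json
import tempfile
from pathlib import Path


def load_tok_json(path):
    """Parse a .tok.json file into tiktoken-style vocab tables.

    Assumed (hypothetical) schema: two lists, "regular_tokens" and
    "special_tokens", whose entries look like {"bytes": [72, 105], "token": 2}.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Regular tokens map raw byte sequences to their integer rank.
    mergeable_ranks = {
        bytes(item["bytes"]): item["token"] for item in data["regular_tokens"]
    }
    # Special tokens are keyed by their decoded string form.
    special_tokens = {
        bytes(item["bytes"]).decode("utf-8"): item["token"]
        for item in data["special_tokens"]
    }
    return mergeable_ranks, special_tokens


# Tiny made-up vocabulary, written to a temp file to exercise the loader.
sample = {
    "regular_tokens": [
        {"bytes": [72], "token": 0},       # b"H"
        {"bytes": [105], "token": 1},      # b"i"
        {"bytes": [72, 105], "token": 2},  # b"Hi"
    ],
    "special_tokens": [
        {"bytes": list(b"<|pad|>"), "token": 3},
    ],
}
with tempfile.TemporaryDirectory() as tmp:
    tok_path = Path(tmp) / "tokenizer.tok.json"
    tok_path.write_text(json.dumps(sample), encoding="utf-8")
    ranks, specials = load_tok_json(tok_path)

print(ranks[b"Hi"], specials["<|pad|>"])
```

The two dictionaries have the shape that tiktoken.Encoding accepts as its mergeable_ranks and special_tokens arguments; the split pattern (pat_str) would have to come from the model's own tokenizer definition and is not guessed at here.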

Targeted tests and documentation updates are included to validate correctness and make Grok-2 usable in vLLM without downstream patches.

Test Plan

  • pytest tests/models/language/generation/test_grok.py -k grok
  • pytest tests/tokenizers_/test_registry.py -k grok2

Test Result

  • Pass

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@github-actions

github-actions bot commented Jan 7, 2026

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@mergify

mergify bot commented Jan 7, 2026

Documentation preview: https://vllm--31847.org.readthedocs.build/en/31847/

@mergify mergify bot added the documentation Improvements or additions to documentation label Jan 7, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for the Grok-2 model. The changes are comprehensive, including the model implementation, a new tiktoken-based tokenizer, and updates to documentation and tests. The model logic is cleanly adapted from the existing Grok-1 implementation to support both models. The new tokenizer for Grok-2 is also well-implemented. I've identified a couple of high-severity issues related to silent error handling in the new tokenizer, which could lead to incorrect behavior without any warning to the user. Addressing these will improve the robustness and debuggability of the new tokenizer.
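
The review does not quote the specific lines it flagged, so the following is only a generic illustration of what "silent error handling" in a tokenizer loader can look like and how a stricter variant surfaces the problem. Both function names and the file name in the message are hypothetical; none of this is taken from the PR's diff.

```python
import json


def parse_vocab_silent(raw):
    """Anti-pattern: swallows bad input and returns an empty vocab."""
    try:
        return json.loads(raw)
    except Exception:
        # The caller cannot tell a corrupt file from an empty one.
        return {}


def parse_vocab_strict(raw, source="<memory>"):
    """Fails loudly and keeps the original error as the cause."""
    try:
        vocab = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Could not parse tokenizer file {source}: {exc}") from exc
    if not isinstance(vocab, dict):
        raise ValueError(f"Tokenizer file {source} must contain a JSON object")
    return vocab


broken = '{"regular_tokens": ['  # truncated JSON
print(parse_vocab_silent(broken))
try:
    parse_vocab_strict(broken, source="tokenizer.tok.json")
except ValueError as err:
    print("strict loader raised:", type(err.__cause__).__name__)
```

With the silent variant, a truncated file produces a tokenizer with an empty vocabulary and the failure only shows up later as wrong output; the strict variant stops at load time and names the file.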

@dangoldbj
Contributor Author

cc @mgoin

@mergify

mergify bot commented Jan 7, 2026

Hi @dangoldbj, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@dangoldbj dangoldbj force-pushed the grok-2 branch 2 times, most recently from 83af8bf to f3138e5 Compare January 7, 2026 01:02

@dangoldbj
Contributor Author

dangoldbj commented Jan 7, 2026

@DarkLight1337 @ywang96 @mgoin I added Grok-2 support: a tiktoken-based tokenizer that loads tokenizer.tok.json, Grok-1/Grok-2 model plumbing with weight-mapping tweaks, and registry and docs updates. I would appreciate a review.

Member

@DarkLight1337 DarkLight1337 left a comment

Thanks, any chance you can add some e2e tests to ensure correctness?

@mergify mergify bot added the new-model Requests to new models label Jan 7, 2026
@mergify

mergify bot commented Jan 7, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @dangoldbj.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Signed-off-by: dangoldbj <dangoldbj23@gmail.com>
auto-merge was automatically disabled January 8, 2026 09:51

Head branch was pushed to by a user without write access

@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 8, 2026
@dangoldbj
Contributor Author

Thanks, LGTM now

@DarkLight1337 Thanks for the review! Appreciate it!

@dangoldbj
Contributor Author

dangoldbj commented Jan 8, 2026

@DarkLight1337 The current Buildkite failures look like infra/env issues rather than anything caused by this PR's diff.

  • buildkite/ci/pr fails in 2 Node Tests (4 GPUs) with ROCM_HOME: unbound variable in .buildkite/scripts/run-multi-node-test.sh.
  • buildkite/amd-ci fails during collection of tests/entrypoints/test_grpc_server.py with ImportError: cannot import name 'vllm_engine_pb2' from 'vllm.grpc'.

This PR only touches Grok model/tokenizer + docs/tests. OK to merge/ignore?

@DarkLight1337
Member

They are unrelated so we can merge.

@vllm-bot vllm-bot merged commit 59d260f into vllm-project:main Jan 8, 2026
50 of 53 checks passed
@github-project-automation github-project-automation bot moved this from Ready to Done in NVIDIA Jan 8, 2026
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
Signed-off-by: dangoldbj <dangoldbj23@gmail.com>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: dangoldbj <dangoldbj23@gmail.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: dangoldbj <dangoldbj23@gmail.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: dangoldbj <dangoldbj23@gmail.com>
@hmellor hmellor mentioned this pull request Mar 6, 2026
5 tasks

Labels

  • documentation: Improvements or additions to documentation
  • frontend
  • new-model: Requests to new models
  • nvidia
  • ready: ONLY add when PR is ready to merge/full CI is needed
  • rocm: Related to AMD ROCm
  • v1

Projects

Status: Done


3 participants