[Test] Verify grok.py rope_theta fix on NVIDIA (draft — do not merge)#21725

Closed
michaelzhang-ai wants to merge 4 commits into sgl-project:main from michaelzhang-ai:test/nv-grok1-verify
@michaelzhang-ai
Collaborator

Purpose

Draft PR to run NVIDIA CI and verify that PR #21518's grok.py change (safe rope_theta fallback for Grok-1 INT4) does not break NVIDIA.

Do not merge — this only re-enables the disabled dummy Grok-1 test for CI verification.

Changes

  1. grok.py (from "[AMD] Fix Handle missing rope_theta in get_rope_config for Grok-1", #21518): replaces the get_rope_config(config) call with a safe getattr plus fallback, because Grok1Config lacks rope_theta
  2. test_dummy_grok_models.py: removes disabled="Temporarily disabled" so that the dummy Grok-1 test runs on stage-b-test-2-gpu-large

Expected result

On stage-b-test-2-gpu-large, test_dummy_grok_1 passes (the dummy model uses model_type: "mixtral", so it loads as a MixtralConfig, which has a rope_theta default).

The Grok-1 HuggingFace config does not define `rope_theta` (it relies
on the standard default of 10000). After sgl-project#21135 migrated grok.py from
`getattr(config, "rope_theta", 10000)` to `get_rope_config(config)`,
loading Grok-1 crashes with:

  AttributeError: 'Grok1Config' object has no attribute 'rope_theta'

Fix by replacing the `get_rope_config()` call in grok.py with local
rope_theta extraction that safely falls back to 10000, matching the
original behavior before sgl-project#21135.
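
A minimal sketch of the fallback pattern described above, for illustration only. The actual grok.py diff is not shown in this thread, so the helper name `extract_rope_theta` and the stand-in config objects are hypothetical; only `getattr(config, "rope_theta", 10000)` and the default of 10000 come from the commit message.

```python
from types import SimpleNamespace

# Fallback value named in the commit message.
DEFAULT_ROPE_THETA = 10000


def extract_rope_theta(config, default=DEFAULT_ROPE_THETA):
    """Hypothetical helper: read rope_theta, falling back when it is absent."""
    return getattr(config, "rope_theta", default)


# Stand-ins for real config objects, not the actual Grok1Config or
# MixtralConfig classes. The attribute values here are arbitrary.
grok_like = SimpleNamespace(vocab_size=32)  # no rope_theta attribute
mixtral_like = SimpleNamespace(vocab_size=32, rope_theta=1000000.0)

# Direct attribute access reproduces the kind of crash reported above.
try:
    grok_like.rope_theta
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True

theta_missing = extract_rope_theta(grok_like)      # falls back to the default
theta_present = extract_rope_theta(mixtral_like)   # uses the config's own value
```

With this pattern, a config that defines `rope_theta` keeps its own value, and only a config that lacks the attribute receives the fallback. That is presumably why a dummy model loaded with a config that already carries a `rope_theta` default would be unaffected by the change.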

Fixes AMD nightly failures: nightly-8-gpu-grok1-int4 and
nightly-8-gpu-mi35x-grok1-int4 (both exit code 255).
…a fix

Temporarily re-enable the disabled dummy Grok-1 test on NVIDIA CI
(stage-b-test-2-gpu-large) to verify sgl-project#21518's grok.py change doesn't
break NVIDIA. This PR is draft-only for CI verification.
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@michaelzhang-ai michaelzhang-ai marked this pull request as ready for review March 31, 2026 06:30

@michaelzhang-ai
Collaborator Author

Closing this draft verification PR. Since it comes from a fork branch, we can't reliably run/validate the intended internal NVIDIA stage here. We'll keep verification on the main fix PR path instead.

@michaelzhang-ai michaelzhang-ai deleted the test/nv-grok1-verify branch March 31, 2026 18:09