feat: support internlm2 by zhyncs · Pull Request #636 · sgl-project/sglang

zhyncs · 2024-07-17T03:29:31Z

cc @Ying1123 @merrymercy @hnyls2002

Tested with internlm2-chat-7b:

python3 playground/reference_hf.py --model internlm2-chat-7b
python3 -m sglang.bench_latency --model internlm2-chat-7b --correct --output-len 16 --trust-remote-code

transformers

prefill logits tensor([291.2500, 293.7500, 364.5000,  ..., 307.5000, 332.2500, 257.0000],
       device='cuda:0')
prefill logits tensor([314.7500, 320.5000, 398.0000,  ..., 333.0000, 359.5000, 280.0000],
       device='cuda:0')
prefill logits tensor([332.2500, 334.2500, 404.2500,  ..., 351.5000, 374.5000, 293.0000],
       device='cuda:0')

sglang

prefill logits (final) tensor([[288.7500, 291.2500, 361.5000,  ..., 305.0000, 329.5000, 254.8750],
        [313.7500, 319.7500, 396.2500,  ..., 331.7500, 358.7500, 279.0000],
        [331.5000, 333.5000, 403.2500,  ..., 350.7500, 373.7500, 292.5000]],
       device='cuda:0', dtype=torch.float16)
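To put a rough number on how close the two outputs are, one could compute the largest relative difference over the entries that are visible above. This is only a sketch: the helper `max_relative_diff` is a hypothetical name, and the elided middle entries (`...`) are not available, so it covers just the six printed values for the first prompt.

```python
# Visible entries of the first prompt's prefill logits, copied from the output above.
# The elided middle entries ("...") are unknown, so only six values are compared.
hf_logits = [291.25, 293.75, 364.5, 307.5, 332.25, 257.0]
sglang_logits = [288.75, 291.25, 361.5, 305.0, 329.5, 254.875]


def max_relative_diff(reference, candidate):
    """Return the largest |ref - cand| / |ref| over paired entries."""
    if len(reference) != len(candidate):
        raise ValueError("inputs must have the same length")
    return max(abs(r - c) / abs(r) for r, c in zip(reference, candidate))


rel = max_relative_diff(hf_logits, sglang_logits)
print(f"max relative difference: {rel:.4f}")
```

For these six entries the difference stays under 1%. That says nothing about the entries hidden behind the ellipsis.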

Ying1123

On my machine (H100):
SRT:

prefill logits (final) tensor([[262., 262., 342.,  ..., 272., 290., 240.],
        [272., 276., 348.,  ..., 284., 302., 247.],
        [360., 362., 450.,  ..., 378., 392., 326.]], device='cuda:0',
       dtype=torch.bfloat16)

HF:

prefill logits tensor([265.5000, 264.5000, 345.0000,  ..., 274.7500, 293.0000, 242.7500],
       device='cuda:0')
prefill logits tensor([279.2500, 281.7500, 355.2500,  ..., 290.0000, 310.0000, 252.6250],
       device='cuda:0')
prefill logits tensor([359.7500, 361.2500, 448.5000,  ..., 377.0000, 390.5000, 324.5000],
       device='cuda:0')
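The SRT values here print as whole numbers because bfloat16 keeps only 7 explicit mantissa bits, while float16 keeps 10. A minimal sketch of the resulting gap between adjacent representable values at these magnitudes, assuming normal (non-subnormal) numbers; `spacing` is a hypothetical helper, not part of either library:

```python
import math

BF16_MANTISSA_BITS = 7   # explicit mantissa bits in bfloat16
FP16_MANTISSA_BITS = 10  # explicit mantissa bits in float16


def spacing(value, explicit_mantissa_bits):
    """Gap between adjacent representable numbers near a nonzero `value`."""
    # frexp gives abs(value) = m * 2**exp with 0.5 <= m < 1,
    # so the binary exponent of the leading bit is exp - 1.
    _, exp = math.frexp(abs(value))
    return 2.0 ** (exp - 1 - explicit_mantissa_bits)


for v in (240.0, 342.0, 450.0):
    print(v, spacing(v, BF16_MANTISSA_BITS), spacing(v, FP16_MANTISSA_BITS))
```

For logits between 256 and 512 the bfloat16 step is 2, against 0.25 for float16. Rounding alone may therefore explain part of the gap between the SRT and HF numbers, though not necessarily all of it.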

…ccuracy threshold PR sgl-project#17784 (transformers 5.3.0 upgrade) changed grok.py to access config.rope_parameters["rope_theta"] directly, but GitConfig (grok-2) does not have this attribute, crashing the server on startup with AttributeError: 'GitConfig' object has no attribute 'rope_parameters'. Restore safe access via getattr with fallback, matching the pattern used elsewhere in the codebase. Also lower the MI325 Grok-2 GSM8K accuracy threshold from 0.915 to 0.90 to match the MI35x test, since nightly sgl-project#636 showed 0.910 which is within normal run-to-run variance.
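The "safe access via getattr with fallback" pattern described above can be sketched as follows. This is an illustration, not the actual `grok.py` code: the helper name `get_rope_theta`, the legacy `rope_theta` attribute lookup, and the default of 10000.0 are all assumptions made for the example.

```python
from types import SimpleNamespace

DEFAULT_ROPE_THETA = 10000.0  # assumed fallback value, for illustration only


def get_rope_theta(config, default=DEFAULT_ROPE_THETA):
    """Read rope_theta without assuming `config.rope_parameters` exists."""
    # Returns None instead of raising AttributeError when the attribute is missing.
    rope_parameters = getattr(config, "rope_parameters", None)
    if isinstance(rope_parameters, dict) and "rope_theta" in rope_parameters:
        return rope_parameters["rope_theta"]
    # Fall back to a top-level attribute, then to the default.
    return getattr(config, "rope_theta", default)


# Stand-in config objects; real model configs would come from transformers.
print(get_rope_theta(SimpleNamespace(rope_parameters={"rope_theta": 7.0})))
print(get_rope_theta(SimpleNamespace(rope_theta=5.0)))
print(get_rope_theta(SimpleNamespace()))
```

The point of the pattern is that a config class lacking the attribute falls through to a default instead of crashing at startup.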

feat: support internlm2

10cca2a

Ying1123 approved these changes Jul 17, 2024

Ying1123 merged commit a8552cb into sgl-project:main Jul 17, 2024

zhyncs deleted the internlm2 branch July 17, 2024 05:42

timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025

feat: support internlm2 (sgl-project#636)

e790eec

michaelzhang-ai mentioned this pull request Mar 20, 2026

[AMD] Fix Grok-2 nightly: safe rope_parameters access + relax MI325 accuracy threshold #20985

Closed

3 tasks
