Fix gpt-oss yarn with `truncate` argument #14270

Merged
hnyls2002 merged 3 commits into main from lsyin/fix-gpt-oss-truncate-rope
Dec 18, 2025

@hnyls2002 (Collaborator) commented Dec 2, 2025

@harrisonlimh (Collaborator) commented Dec 3, 2025

Tested with gpt-oss-120b; the changed calculation does not appear to produce a meaningful difference. I will try gpt-oss-20b as well.

Impacted variables and tensors (value with truncation vs. value without truncation):

  • correction_range: low: 8, high: 18 vs. low: 8.092779115512402, high: 17.39802450158856
  • ramp_func.mean(): 0.5781250596046448 vs. 0.585820198059082
  • inv_freq.mean(): 0.0993923768401146 vs. 0.09938869625329971
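
The `correction_range` and `ramp_func.mean()` figures above can be reproduced with a minimal sketch of the usual YaRN correction-range computation. This is an illustration, not the code changed in this PR: the function names are hypothetical, and the parameter values (head dimension 64, RoPE base 150000, original context length 4096, `beta_fast` 32, `beta_slow` 1) are assumptions about the gpt-oss configuration that do not appear in this thread.

```python
import math


def find_correction_dim(num_rotations, dim, base, max_pos):
    """Fractional dimension index whose rotary pair completes
    `num_rotations` full turns over `max_pos` positions."""
    return dim * math.log(max_pos / (num_rotations * 2 * math.pi)) / (2 * math.log(base))


def find_correction_range(beta_fast, beta_slow, dim, base, max_pos, truncate=True):
    """Bounds of the ramp between extrapolated and interpolated frequencies.

    With truncate=True the bounds are rounded outward to integers;
    with truncate=False they stay fractional.
    """
    low = find_correction_dim(beta_fast, dim, base, max_pos)
    high = find_correction_dim(beta_slow, dim, base, max_pos)
    if truncate:
        low, high = math.floor(low), math.ceil(high)
    return max(low, 0), min(high, dim - 1)


def linear_ramp(low, high, n):
    """Per-frequency ramp: 0 at or below `low`, 1 at or above `high`, linear in between."""
    if low == high:
        high += 0.001  # avoid division by zero
    return [min(max((i - low) / (high - low), 0.0), 1.0) for i in range(n)]


# ASSUMED gpt-oss-style values (not stated in this thread).
DIM, BASE, MAX_POS = 64, 150000.0, 4096
BETA_FAST, BETA_SLOW = 32, 1

for truncate in (True, False):
    low, high = find_correction_range(BETA_FAST, BETA_SLOW, DIM, BASE, MAX_POS, truncate)
    ramp = linear_ramp(low, high, DIM // 2)
    print(f"truncate={truncate}: low={low}, high={high}, ramp mean={sum(ramp) / len(ramp):.6f}")
```

Under these assumptions, `truncate=True` gives the integer range (8, 18) and `truncate=False` gives roughly (8.0928, 17.3980), so the only effect of the flag is a slightly narrower ramp.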

GPQA eval scores

with truncate

  • high:

    • {'chars': 128.6344696969697, 'chars:std': 336.4544423341523, 'score': 0.7973484848484849, 'score:std': 0.40197497255216075}
  • medium:

    • {'chars': 267.67108585858585, 'chars:std': 424.53770135412367, 'score': 0.726010101010101, 'score:std': 0.44600385002979953}
    • {'chars': 233.18371212121212, 'chars:std': 402.93011677260085, 'score': 0.7253787878787878, 'score:std': 0.44632320349079807}
  • low:

    • {'chars': 150.23800505050505, 'chars:std': 324.1160777618058, 'score': 0.6496212121212122, 'score:std': 0.4770885587429018}
    • {'chars': 145.9671717171717, 'chars:std': 311.72771065301606, 'score': 0.6357323232323232, 'score:std': 0.48122420598922094}

without truncate

  • high:

    • {'chars': 138.97095959595958, 'chars:std': 367.965799993336, 'score': 0.7847222222222222, 'score:std': 0.4110149099154914}
  • medium:

    • {'chars': 251.08838383838383, 'chars:std': 416.924230715839, 'score': 0.7417929292929293, 'score:std': 0.4376484654879353}
    • {'chars': 230.09911616161617, 'chars:std': 405.0049643033156, 'score': 0.7228535353535354, 'score:std': 0.4475894343932065}
  • low:

    • {'chars': 146.62815656565655, 'chars:std': 314.83139425447695, 'score': 0.6590909090909091, 'score:std': 0.4740148548775957}
    • {'chars': 136.94444444444446, 'chars:std': 302.62053603939734, 'score': 0.6470959595959596, 'score:std': 0.477873182623323}
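
One way to check whether these score gaps exceed sampling noise is a two-sample z statistic built from the reported `score` and `score:std` values. The sketch below is illustrative only. The thread does not state the number of scored samples per run, so `N = 1584` is an assumption (the reported scores are all consistent with whole multiples of 1/1584), and the samples are treated as independent, which repeated runs over the same questions may not be.

```python
import math


def z_score(score_a, std_a, score_b, std_b, n):
    """z statistic for the difference of two mean scores,
    each averaged over n samples assumed independent."""
    se_a = std_a / math.sqrt(n)
    se_b = std_b / math.sqrt(n)
    return (score_a - score_b) / math.sqrt(se_a**2 + se_b**2)


N = 1584  # ASSUMPTION: sample count per run, not stated in the thread

# "high" reasoning effort: with truncate vs. without truncate
z_high = z_score(
    0.7973484848484849, 0.40197497255216075,
    0.7847222222222222, 0.4110149099154914,
    N,
)
print(f"z (high) = {z_high:.2f}")
```

Under these assumptions the statistic for the `high` setting is below 1, which fits the observation that the change makes no meaningful difference on GPQA.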

@hnyls2002 (Collaborator, Author) commented

/tag-and-rerun-ci

@github-actions github-actions bot added the run-ci label Dec 8, 2025
@hlu1 (Collaborator) commented Dec 18, 2025

This fix is consistent with the reference implementation from gpt-oss: https://github.com/openai/gpt-oss/blob/main/gpt_oss/torch/model.py#L98-L107

@hnyls2002 (Collaborator, Author) commented

/tag-and-rerun-ci

@hnyls2002 hnyls2002 merged commit 374ad4c into main Dec 18, 2025
53 of 69 checks passed
@hnyls2002 hnyls2002 deleted the lsyin/fix-gpt-oss-truncate-rope branch December 18, 2025 08:31