Probe LoRA Qwen3-8B CUDA fail on plain main (negative control, NOT a fix) #25744
fzyzcjy wants to merge 1 commit into
/rerun-test test/registered/lora/test_lora_qwen3_8b_logprob_diff.py
Result: FAIL ❌ (expected) — bug reproduces on plain
| PR | Branch | /rerun-test verdict |
|---|---|---|
| #25743 (revert of #25690) | tom/revert-25690-cutedsl | PASS ✅ |
| #25744 (this PR; no revert, plain main + 1-line touch) | tom/probe-lora-bug-25690 | FAIL ❌ |
Bidirectional confirmation that b79e4b1e68 (#25690) is the root cause: the bug reproduces on plain upstream/main and is not specific to Tom's #25703–#25728 refactor chain.
Closing this PR: diagnostic only, no merge value. Full bisect + repro: #25647 (comment).
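The paired-experiment reasoning above (revert branch passes, untouched main fails) can be sketched as a small decision function. This is only an illustration: `Verdict` and `classify` are hypothetical names introduced here, not part of sglang or its CI tooling.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"


def classify(revert_branch: Verdict, plain_main: Verdict) -> str:
    """Interpret a paired experiment: one branch reverts the suspect
    commit, the other leaves main untouched (hypothetical helper)."""
    if revert_branch is Verdict.PASS and plain_main is Verdict.FAIL:
        # Removing the commit fixes the test; keeping it breaks the test.
        return "suspect commit implicated"
    if revert_branch is Verdict.FAIL and plain_main is Verdict.FAIL:
        # The revert did not help, so the cause lies elsewhere (or also elsewhere).
        return "revert does not fix the failure"
    if revert_branch is Verdict.PASS and plain_main is Verdict.PASS:
        # Nothing to explain: the failure did not show up on either branch.
        return "failure not reproduced"
    # Revert fails while main passes: the revert itself broke something.
    return "inconsistent result"


# Verdicts reported in the table above: #25743 PASS, #25744 FAIL.
print(classify(Verdict.PASS, Verdict.FAIL))
```

Only the first branch supports the root-cause claim; the other three outcomes would each have pointed somewhere else, which is why both PRs were needed.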
Second run: FAIL ❌ — reproducibly broken on plain
| Run | Result |
|---|---|
| #1 | FAIL ❌ (same fingerprint) |
| #2 | FAIL ❌ (same fingerprint) |
Two of two runs failed on the no-revert branch on plain upstream/main, both with the same fingerprint, so the failure does not look like a flake.
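As a rough sanity check on the flake argument, the sketch below computes how likely it would be to see every run fail if the failure were a random flake. It assumes runs are independent and uses made-up flake rates purely for illustration; no flake rate was measured in this thread, and `prob_all_fail` is a hypothetical helper.

```python
def prob_all_fail(flake_rate: float, runs: int) -> float:
    """Probability that `runs` independent runs all fail, if each run
    fails at random with probability `flake_rate` (assumed, not measured)."""
    if not 0.0 <= flake_rate <= 1.0:
        raise ValueError("flake_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return flake_rate ** runs


# Hypothetical flake rates, chosen only to illustrate the trend.
for rate in (0.05, 0.2, 0.5):
    print(f"flake rate {rate:.2f}: P(2 of 2 fail) = {prob_all_fail(rate, 2):.4f}")
```

Two identical failures, together with the passing run on the revert branch, make a random flake an unlikely explanation, although two runs alone cannot rule it out entirely.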
This PR does not revert anything. It only adds a one-line sentinel comment to `python/sglang/version.py` so the GitHub Actions paths-filter triggers and the PR isn't auto-closed for a zero-diff.
Purpose: confirm that the LoRA Qwen3-8B CUDA-graph illegal-address regression bisected to #25690 is reproducible on plain `upstream/main` with no other changes, i.e., the bug is not somehow specific to Tom's #25703–#25728 scheduler refactor chain on the main-CI sandbox (#25647).
Expected result: `/rerun-test test/registered/lora/test_lora_qwen3_8b_logprob_diff.py` here should FAIL with the same `CUDBG_EXCEPTION_WARP_ILLEGAL_ADDRESS (14)` fingerprint, while the same `/rerun-test` on the sibling revert PR #25743 should PASS. Together that's bidirectional evidence for `b79e4b1e68` as the root cause.
Bisect evidence: #25647 (comment).
cc @Fridge003 @hnyls2002
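Comparing failures by fingerprint, as the expected result describes, comes down to checking whether the same error string appears in each run's log. A minimal sketch follows, assuming the log is available as plain text. `find_fingerprint_lines` is a hypothetical helper, and the sample text is a placeholder, not real CI output.

```python
# Error name taken from the PR description; the numeric code is left out of the match.
FINGERPRINT = "CUDBG_EXCEPTION_WARP_ILLEGAL_ADDRESS"


def find_fingerprint_lines(log_text: str, fingerprint: str = FINGERPRINT) -> list[str]:
    """Return every line of `log_text` that contains `fingerprint`."""
    return [line for line in log_text.splitlines() if fingerprint in line]


# Placeholder text for illustration only; this is NOT a real log excerpt.
sample = "\n".join(
    [
        "placeholder: test started",
        "placeholder: CUDBG_EXCEPTION_WARP_ILLEGAL_ADDRESS (14)",
        "placeholder: test finished",
    ]
)

matches = find_fingerprint_lines(sample)
print(f"{len(matches)} matching line(s)")
```

An empty result for a failing run would suggest a different failure mode, in which case the run should not be counted as a reproduction of the same bug.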
CI States
Latest PR Test (Base): ❌ Run #26077811447
Latest PR Test (Extra): ❌ Blocked -- `run-ci` is required first.