@yizhang2077 (Collaborator) commented Sep 12, 2025

Motivation

Ref #10344. The bug occurs because FusedMoE updates hidden_states in place, which interferes with the shared-expert computation.
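To illustrate the aliasing problem described above, here is a minimal toy sketch in plain Python. It is not the sglang code and not the change made in this PR: `routed_experts_inplace`, `shared_expert`, `forward_buggy`, and `forward_fixed` are hypothetical names, lists stand in for tensors, and copying the input is only one possible way to avoid the interference.

```python
def routed_experts_inplace(hidden_states):
    # Hypothetical stand-in for a fused MoE kernel that overwrites its input buffer.
    for i in range(len(hidden_states)):
        hidden_states[i] = hidden_states[i] * 2.0
    return hidden_states


def shared_expert(hidden_states):
    # Hypothetical stand-in for the shared expert; it only reads its input.
    return [x + 1.0 for x in hidden_states]


def forward_buggy(hidden_states):
    # The routed path mutates hidden_states before the shared expert reads it,
    # so the shared expert sees the routed output instead of the original input.
    routed = routed_experts_inplace(hidden_states)
    shared = shared_expert(hidden_states)
    return [r + s for r, s in zip(routed, shared)]


def forward_fixed(hidden_states):
    # Assumption for illustration: give the in-place path its own copy
    # so the shared expert still reads the untouched input.
    routed = routed_experts_inplace(list(hidden_states))
    shared = shared_expert(hidden_states)
    return [r + s for r, s in zip(routed, shared)]


if __name__ == "__main__":
    print("buggy:", forward_buggy([1.0, 2.0]))
    print("fixed:", forward_fixed([1.0, 2.0]))
```

In this toy the two forward functions disagree on the same input, which is the kind of silent divergence that shows up as an accuracy drop rather than a crash.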

python3 -m sglang.launch_server --model Qwen/Qwen2-57B-A14B-Instruct --tp 2

Accuracy: 0.869
Invalid: 0.002
Latency: 77.286 s
Output throughput: 1813.165 token/s


@ispobock (Collaborator) commented

cc: @Alcanderian, #7327 may have a related issue.

@zhyncs zhyncs merged commit 2777801 into main Sep 12, 2025
9 of 63 checks passed
@zhyncs zhyncs deleted the fix_dual_stream_bug branch September 12, 2025 03:53
4 participants