fix load_weights for glm4v_moe with shared_experts fusion by zminglei · Pull Request #14610 · sgl-project/sglang

zminglei · 2025-12-08T03:34:14Z

Motivation

fix load_weights for glm4v_moe with shared_experts fusion

Launch server:
python -m sglang.launch_server --model-path /shared/public/elr-models/zai-org/GLM-4.5V-FP8/ --tp-size 4
Before:
Accuracy is 0, and send_one produces garbage text output.
After:

python benchmark/gsm8k/bench_sglang.py --data-path /shared/public/data/gsm8k/test.jsonl
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:16<00:00, 12.42it/s]
Accuracy: 0.930
Invalid: 0.000
Latency: 16.176 s
Output throughput: 1446.905 token/s
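
For readers unfamiliar with the term, the following is a minimal sketch of the kind of name remapping that shared_experts fusion usually implies at weight-loading time. It is not the change made in this PR. It assumes the checkpoint stores shared-expert tensors under an "mlp.shared_experts." prefix and that the fused MoE layer expects them in the slot right after the routed experts. The function name, the prefix constant, and the expert count of 4 are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: hypothetical names, not the SGLang implementation.
from typing import Dict, Iterable, Tuple

# Assumed checkpoint naming for shared-expert tensors.
SHARED_KEY = "mlp.shared_experts."


def remap_shared_expert_names(
    weights: Iterable[Tuple[str, object]],
    n_routed_experts: int,
) -> Dict[str, object]:
    """Rename shared-expert tensors to the expert slot after the routed experts."""
    remapped: Dict[str, object] = {}
    for name, tensor in weights:
        if SHARED_KEY in name:
            # Routed experts use indices 0..n_routed_experts-1, so the fused
            # shared expert takes index n_routed_experts.
            name = name.replace(SHARED_KEY, f"mlp.experts.{n_routed_experts}.")
        remapped[name] = tensor
    return remapped


if __name__ == "__main__":
    # Toy checkpoint entries; strings stand in for tensors.
    ckpt = [
        ("model.layers.1.mlp.experts.0.up_proj.weight", "w0"),
        ("model.layers.1.mlp.shared_experts.up_proj.weight", "ws"),
    ]
    for name in remap_shared_expert_names(ckpt, n_routed_experts=4):
        print(name)
```

If a loader skips or misapplies a remapping of this kind, the fused expert slot can end up with missing or wrong weights, which would be consistent with the zero accuracy and garbage output reported above.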

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.
Work with maintainers to merge your PR. See the PR Merge Process

gemini-code-assist · 2025-12-08T03:34:17Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

JustinTong0323 · 2025-12-08T04:01:18Z

Works for glm4.5v

Accuracy: 0.960
Invalid: 0.000
Latency: 11.552 s
Output throughput: 1922.723 token/s
metrics={'accuracy': np.float64(0.96), 'invalid': np.float64(0.0), 'latency': 11.551843108143657, 'output_throughput': 1922.7234816184439}

fix load_weights for glm4v_moe with shared_experts fusion

38ac7c4

JustinTong0323 merged commit 0224b17 into sgl-project:glm46v Dec 8, 2025
1 check passed

JustinTong0323 merged 1 commit into sgl-project:glm46v from zminglei:glm46v-fix