support asymmetric pd-mtp via mock spec hidden #23958

Merged
hnyls2002 merged 3 commits into dsv4-rebase from lsyin/pd-asymmetric-spec-mock
Apr 28, 2026

@hnyls2002 (Collaborator) commented Apr 28, 2026

Always size the hidden_states buffer in the PD MetadataBuffers to model_config.spec_hidden_size on both prefill and decode, instead of making the size conditional on the local --speculative-algorithm setting. This unblocks asymmetric P/D configs where prefill runs no spec module but decode runs EAGLE/MTP: prefill ships a zero-initialized buffer, which decode consumes as mock conditioning for the first draft step. Verified that spec decoding keeps the output tokens correct; only the accept length of the first decode iteration is affected (a throughput hit of under roughly 1% on long generations).
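A minimal sketch of the sizing change described above, not the actual SGLang code. The `ModelConfig` stand-in, both helper functions, the `fallback_width` parameter, and all numeric values are hypothetical and only illustrate why a size that depends on the local flag can disagree between prefill and decode, while a size tied to `spec_hidden_size` cannot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfig:
    # Hypothetical stand-in for the real model config; only the field named
    # in the PR description (spec_hidden_size) is modeled here.
    spec_hidden_size: int


def row_bytes_conditional(cfg: ModelConfig, spec_algo: Optional[str],
                          fallback_width: int, itemsize: int = 2) -> int:
    """Per-request row size when it depends on the local spec flag.

    This is an assumed model of the pre-PR behavior; fallback_width is a
    placeholder for whatever width was used when no spec algorithm was set.
    """
    width = cfg.spec_hidden_size if spec_algo else fallback_width
    return width * itemsize


def row_bytes_always(cfg: ModelConfig, itemsize: int = 2) -> int:
    """Per-request row size after this PR: independent of the local flag."""
    return cfg.spec_hidden_size * itemsize


cfg = ModelConfig(spec_hidden_size=4096)  # illustrative value only

# Asymmetric config: prefill has no spec module, decode runs EAGLE.
old_prefill = row_bytes_conditional(cfg, None, fallback_width=1)
old_decode = row_bytes_conditional(cfg, "EAGLE", fallback_width=1)
new_prefill = row_bytes_always(cfg)
new_decode = row_bytes_always(cfg)

# Prefill without a spec module ships an all-zero row as mock conditioning.
mock_row = bytes(new_prefill)

print("conditional sizing matches:", old_prefill == old_decode)
print("unconditional sizing matches:", new_prefill == new_decode)
```

With unconditional sizing, both sides agree on the row size regardless of which side runs a spec module, so the transfer layout no longer depends on flags the peer cannot see.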

Cost: when neither side runs spec, the buffer is allocated but unused (a few MB). The alternative is a wire-protocol size mismatch on asymmetric configs.
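As a rough way to reason about that cost, the unused allocation can be estimated as slots × hidden width × element size. This is a back-of-the-envelope sketch: the function name, the assumed 2-D layout, and every number below are illustrative assumptions, not values taken from this PR.

```python
def hidden_buffer_bytes(num_slots: int, spec_hidden_size: int, itemsize: int) -> int:
    """Bytes for a (num_slots, spec_hidden_size) buffer of fixed-size elements."""
    return num_slots * spec_hidden_size * itemsize


# Hypothetical numbers: 256 transfer slots, width 4096, 2-byte elements.
size = hidden_buffer_bytes(num_slots=256, spec_hidden_size=4096, itemsize=2)
size_mib = size / (1024 * 1024)
print(f"{size} bytes = {size_mib:.1f} MiB")
```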

Test: test_dsv4_pd_disagg_nixl.py uses an asymmetric config (prefill TP-only, decode TP+DP+EAGLE); no new server arg is required. Follow-up to #23918.

@hnyls2002 changed the title from "support asymmetric pd-mtp via decode-spec-algo flag" to "support asymmetric pd-mtp via mock spec hidden" on Apr 28, 2026
@hnyls2002 merged commit ff1f3a0 into dsv4-rebase on Apr 28, 2026
6 checks passed
@hnyls2002 deleted the lsyin/pd-asymmetric-spec-mock branch on April 28, 2026, 19:33