-
Notifications
You must be signed in to change notification settings - Fork 204
[AMD] improve dsr1 fp4 disagg perf on mi355x - rapid follow up PR incoming to quant correction on fp8 combine #1236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
+165
−64
Merged
Changes from 47 commits
Commits
Show all changes
75 commits
Select commit
Hold shift + click to select a range
0383696
[AMD] add dsr1 mxfp4 v2 sweep points
billishyahao 18e05b1
fix
billishyahao 32b5d3d
Fix tokenizer mismatch between benchmark client and sglang server on …
ZhaiFeiyue 0bd347f
change mtp model to fp8
billishyahao 754e53c
change fp8 image
billishyahao f29f2d0
bump image to 0327
billishyahao a44c7eb
remove specv2
billishyahao 2514136
consolidate dsr1 fp4 configs
billishyahao 3b4d4ab
Merge remote-tracking branch 'inf/main' into amd/mi355x-dsfp4-march15
billishyahao 682a4ab
bump fp8 image to 0327
billishyahao 64bf100
fix crash
billishyahao c44e175
fix env
billishyahao 0a41f89
cleanup
billishyahao 7282748
add perf change log
billishyahao e6d4b32
add deprecate comments
billishyahao b7dd65f
add spec v2 env
billishyahao 12a4ba0
bump the docker image
billishyahao 597a458
add stream control to eliminate cpu overhead
billishyahao f715e47
tune the config
billishyahao 2ea82d5
bump image
billishyahao 16384e7
tune config
billishyahao 4d733e7
add new exp config
billishyahao 83af743
enable log level info
billishyahao 0c3083e
fix mori env
billishyahao 1c61622
bump image
billishyahao e2d2ac9
fix log
billishyahao d2a7988
bump the image
billishyahao b09ae6c
fix
billishyahao 2c3ee04
fix
billishyahao 69102f7
fix
billishyahao 668068c
fix
billishyahao 776fd42
bump image to 0416
billishyahao 2471379
fix
billishyahao c80997f
set si to 100
billishyahao 616c57d
bump the image
billishyahao 3d62e2c
revert old image
billishyahao 2c4c09d
revert old image
billishyahao 1c9b8d2
increase DISPATCH_TOKENS_PREFILL to 5120
billishyahao 8e6104e
bump image to 0417
billishyahao 7cc5d81
add exp config
billishyahao a1c05da
add exp config
billishyahao a915729
add exp config
billishyahao 44d10a1
add exp config
billishyahao f09820e
add exp configs
billishyahao 5144ca1
add exp configs
billishyahao d9e2eef
bump image
billishyahao ee33925
sync arguments
billishyahao 2b1ff6b
fix
billishyahao 0548773
fix config
billishyahao 724bd61
add exp configs
billishyahao f8f0a3a
enable sdma
billishyahao feb6c7d
fix
billishyahao f501a3e
fix
billishyahao 217d892
cleanup
billishyahao a5a822a
bump image
billishyahao 91e1396
Merge remote-tracking branch 'inf/main' into amd/mi355x-dsfp4-april14
billishyahao 2d2dada
fix yaml
billishyahao 0d84ca0
Merge branch 'main' into amd/mi355x-dsfp4-april14
billishyahao 26e6979
fix eval
billishyahao 3974848
fix eval
billishyahao ace0e0e
fix eval
billishyahao d3a7d1e
add eval only to perflog
billishyahao d90995f
Merge remote-tracking branch 'inf/main' into amd/mi355x-dsfp4-april14
billishyahao 1a007bd
fix eval
billishyahao 3235860
fix
billishyahao 8fbd2ab
Merge remote-tracking branch 'inf/main' into amd/mi355x-dsfp4-april14
billishyahao 3ff0812
fix args diff for eval
billishyahao 2a32c13
add eval only
billishyahao 219cf7a
Merge remote-tracking branch 'inf/main' into amd/mi355x-dsfp4-april14
billishyahao a73f622
fix
billishyahao 8039b5f
Merge remote-tracking branch 'inf/main' into amd/mi355x-dsfp4-april14
billishyahao 25fb9d1
Merge branch 'main' into amd/mi355x-dsfp4-april14
billishyahao 2c48183
Merge remote-tracking branch 'inf/main' into amd/mi355x-dsfp4-april14
billishyahao 9be76be
Update server.sh
Oseltamivir b8ed4a3
Merge branch 'main' into amd/mi355x-dsfp4-april14
Oseltamivir File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 The new perf-changelog entry at lines 1217-1225 has two factual inaccuracies: (1)
pr-linkpoints to #983 but the PR description states this is a 'replacement of pull/983' — i.e. #983 was superseded and this is PR #1236; the convention throughout this file is forpr-linkto reference the actual merging PR. (2) The bullet 'Bump SGL mori image to March 27' is wrong for the keyed FP4 configs — those images are actually bumped torocm/sgl-dev:sglang-0.5.10-rocm720-mi35x-mori-0428(April 28); only the FP8 configs (not listed underconfig-keyshere) use the 0327 (March 27) tag. Suggest updating thepr-linkto #1236 and changing 'March 27' to 'April 28' (or 'mori-0428').Extended reasoning...
Two metadata/text inaccuracies in the new perf-changelog entry (lines 1217-1225)
(1) Wrong pr-link. The PR description explicitly opens with: "replacement of #983". That phrasing identifies #983 as the prior, superseded PR — the one being replaced. This PR is #1236, and the convention throughout
perf-changelog.yamlis thatpr-linkreferences the merging PR (e.g. the immediately preceding entry at line 1215 uses pr-link 973 for that PR's own entry; entries that don't yet know their PR number use the placeholderpull/XXX, used 6 times in the file). Settingpr-link: .../pull/983here breaks that convention and points readers at an unrelated, superseded PR rather than the one that actually merged the changes. The author appears to have copied the link from the superseded PR text rather than substituting the new PR number.(2) Wrong image-bump date. The bullet
"Bump SGL mori image to March 27"is factually wrong for the configs this entry is keyed on. Walking through the diff in.github/configs/amd-master.yaml:dsr1-fp4-mi355x-sglang-disagg(line ~798): image goes fromsglang-0.5.9-rocm720-mi35x-mori-0227-3→sglang-0.5.10-rocm720-mi35x-mori-**0428**(April 28).dsr1-fp4-mi355x-sglang-disagg-mtp(line ~1006): image goes fromsglang-0.5.9-rocm720-mi35x-mori-0227-3→sglang-0.5.10-rocm720-mi35x-mori-**0428**(April 28).dsr1-fp8-mi355x-sglang-disagg(line 488): image goes frommori-0227-2→mori-**0327**(March 27).dsr1-fp8-mi355x-sglang-disagg-mtp(line 644): image goes frommori-0227-2→mori-**0327**(March 27).The
config-keyslisted in this entry are only the two FP4 configs. So the image bump that this entry actually documents is the April 28 (0428) bump, not the March 27 (0327) bump that applies to the FP8 configs (which are not in this entry'sconfig-keys). The text appears to have been copied/adapted from an earlier entry (PR #823 at line 1080, which legitimately documented the Feb 27 bump for FP8/FP4 together) without updating the date to match the actual change in this PR.Impact. Both issues are documentation/metadata only — no runtime behavior is affected. However,
perf-changelog.yamlis the authoritative human-readable record of what changed in each performance release, so an entry that points at the wrong PR and misstates the image version makes future bisection/regression-investigation harder than it needs to be.Fix. Update the entry at lines 1217-1225 to:
(The
pull/XXXplaceholder convention used elsewhere in this file would also be acceptable forpr-linkif the merging PR number is not yet known at write time, but #1236 is known.)Severity: nit — both issues are confined to changelog text/metadata and have no functional impact, but the entry as written is straightforwardly wrong on its face and worth fixing before merge.