amd/deepseek_v4 integration 10/N optimize mhc performance #24355
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -589,6 +589,8 @@ class Envs: | |||||||||
| SGLANG_TOPK_TRANSFORM_512_TORCH = EnvBool(False) | ||||||||||
| SGLANG_OPT_BF16_FP32_GEMM_ALGO = EnvBool(False) | ||||||||||
| SGLANG_FORCE_TRITON_MOE_FP8 = EnvBool(False) | ||||||||||
| SGLANG_OPT_USE_AITER_MHC_PRE= EnvBool(True) | ||||||||||
| SGLANG_OPT_USE_AITER_MHC_POST= EnvBool(True) | ||||||||||
| # fmt: on | ||||||||||
|
Comment on lines +593 to 594
Contributor
There are missing spaces around the assignment operator.
Suggested change
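| SGLANG_OPT_USE_AITER_MHC_PRE = EnvBool(True) | ||||||||||
| SGLANG_OPT_USE_AITER_MHC_POST = EnvBool(True) | ||||||||||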
|
|
||||||||||
| # EPD | ||||||||||
|
|
||||||||||
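The flags added in this hunk are declared with a default value and read through `.get()` at their call sites. As a rough illustration of that pattern, here is a minimal sketch of a boolean environment flag. `BoolFlag` and its parsing rules are hypothetical stand-ins for illustration; they are not SGLang's actual `EnvBool` implementation, whose accepted values may differ.

```python
import os


class BoolFlag:
    """Hypothetical stand-in for a boolean environment flag (not SGLang's EnvBool)."""

    # Assumed spellings for true/false; the real parser may accept others.
    _TRUE = {"1", "true", "yes", "on"}
    _FALSE = {"0", "false", "no", "off"}

    def __init__(self, name: str, default: bool):
        self.name = name
        self.default = default

    def get(self) -> bool:
        # Read the environment at call time so changes are picked up.
        raw = os.environ.get(self.name)
        if raw is None:
            return self.default
        value = raw.strip().lower()
        if value in self._TRUE:
            return True
        if value in self._FALSE:
            return False
        raise ValueError(f"{self.name}={raw!r} is not a recognized boolean")


# Mirrors the default used in the diff: enabled unless explicitly turned off.
USE_AITER_MHC_PRE = BoolFlag("SGLANG_OPT_USE_AITER_MHC_PRE", default=True)
```

With a default of `True`, the flag only acts as an opt-out, so any platform restriction has to be enforced separately at the call site.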
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -1889,6 +1889,23 @@ def hc_pre_torch_impl(x, hc_fn): | |||||
| # returned post should be [n, hc_mult] | ||||||
|
Contributor
The
Suggested change
Collaborator
gemini-code-assist proposed the wrong diff; the line number is incorrect.
Contributor
You are absolutely correct. Since the original comment was anchored to line 1889, I cannot provide a direct code suggestion for line 1892 in the UI. However, the correct modification would be to change line 1892 from `if envs.SGLANG_OPT_USE_AITER_MHC_PRE.get():` to `if _is_hip and envs.SGLANG_OPT_USE_AITER_MHC_PRE.get():`. This ensures that the |
||||||
| return y, post.squeeze(-1), comb | ||||||
|
|
||||||
| if _is_hip and envs.SGLANG_OPT_USE_AITER_MHC_PRE.get(): | ||||||
| from aiter.ops.mhc import mhc_pre | ||||||
|
|
||||||
| post, comb, y = mhc_pre( | ||||||
| residual=x, | ||||||
| fn=hc_fn, | ||||||
| hc_scale=hc_scale, | ||||||
| hc_base=hc_base, | ||||||
| rms_eps=self.rms_norm_eps, | ||||||
| hc_pre_eps=self.hc_eps, | ||||||
| hc_sinkhorn_eps=self.hc_eps, | ||||||
| hc_post_mult_value=2.0, | ||||||
| sinkhorn_repeat=self.hc_sinkhorn_iters, | ||||||
| ) | ||||||
| # returned post should be [n, hc_mult] | ||||||
| return y, post.squeeze(-1), comb | ||||||
|
|
||||||
| if envs.SGLANG_OPT_DEEPGEMM_HC_PRENORM.get(): | ||||||
| # DeepGEMM implementation | ||||||
| import deep_gemm | ||||||
|
|
@@ -1945,6 +1962,14 @@ def hc_post( | |||||
| result = mhc_post(x, residual, post, comb) | ||||||
|
HaiShaw marked this conversation as resolved.
|
||||||
| return result | ||||||
|
|
||||||
| elif _is_hip and envs.SGLANG_OPT_USE_AITER_MHC_POST.get(): | ||||||
| from aiter.ops.mhc import mhc_post | ||||||
|
|
||||||
| result = torch.empty_like(residual) | ||||||
| mhc_post(result, x, residual, post, comb) | ||||||
|
|
||||||
| return result | ||||||
|
|
||||||
| assert residual.shape == (x.shape[0], self.hc_mult, x.shape[-1]) | ||||||
| assert post.shape == (x.shape[0], self.hc_mult) | ||||||
| assert comb.shape == (x.shape[0], self.hc_mult, self.hc_mult) | ||||||
|
|
||||||
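The guards in the hunks above combine a platform check (`_is_hip`) with a flag that defaults to `True`. The sketch below shows why both conditions are needed, using plain booleans in place of the real flags. The function name, the returned labels, and the existence of a final fallback branch are assumptions made for illustration; only the ordering of the first two checks is taken from the diff.

```python
def select_mhc_pre_backend(is_hip: bool, use_aiter: bool, use_deepgemm: bool) -> str:
    """Return a label for the branch that would run (hypothetical helper)."""
    # The aiter kernel is only considered on HIP builds, even though
    # its flag defaults to True everywhere.
    if is_hip and use_aiter:
        return "aiter"
    # Checked next, as in the diff.
    if use_deepgemm:
        return "deep_gemm"
    # Assumed: some other implementation handles the remaining cases.
    return "fallback"


# Without the is_hip check, a non-HIP build with default flags would
# try to import aiter; with it, the default flag is harmless there.
non_hip_default = select_mhc_pre_backend(is_hip=False, use_aiter=True, use_deepgemm=False)
hip_default = select_mhc_pre_backend(is_hip=True, use_aiter=True, use_deepgemm=False)
```

The same reasoning applies to the `elif _is_hip and envs.SGLANG_OPT_USE_AITER_MHC_POST.get():` branch in `hc_post`.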
The `MODEL` variable is assigned twice consecutively, making the first assignment redundant. Additionally, these absolute paths are specific to a particular environment. It is recommended to use a single assignment and consider using a more generic path or an environment variable for better flexibility.
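The file this comment refers to is not shown here, so the following is only a sketch of the recommendation, assuming a Python context: a single assignment that takes its value from an environment variable and falls back to a placeholder. The variable name `MODEL` comes from the comment; `DEFAULT_MODEL`, its placeholder value, and `resolve_model_path` are hypothetical.

```python
import os

# Hypothetical placeholder; the real paths from the reviewed file are not shown.
DEFAULT_MODEL = "/path/to/model"


def resolve_model_path(env=None) -> str:
    """Return the model path from the MODEL variable, or the placeholder default."""
    # Accept an explicit mapping so the lookup can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    return env.get("MODEL", DEFAULT_MODEL)


# One assignment, overridable per environment.
MODEL = resolve_model_path()
```

A caller on a different machine can then set `MODEL` in the environment instead of editing the file.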