
Add MOE variants and backend guide#3

Merged
sunway513 merged 1 commit into main from docs/moe-variants-guide on Feb 7, 2026
Conversation

@sunway513 (Owner)

Summary

  • Add comprehensive user-facing documentation for all MOE variants in AITER
  • Cover Standard FusedMOE, Expert Parallel (EP), DP Shared Expert, Block-Scale FP8, MXFP4, End-to-End (E2E), and all quantization combinations
  • Include routing options (TopK Softmax, TopK Sigmoid), activation support, and model-specific recommendations

Highlights

  • Quick reference table helping users pick the right MOE config for their use case
  • Quantization decision tree (BF16 → FP8 per-token → FP8 block-scale → MXFP4 → INT4)
  • Full data type support matrix across ASM, CK, and Triton backends
  • Pre-tuned model configs for Mixtral, DeepSeek-V2/V3, Qwen3-235B
  • Practical API examples for each MOE variant (standard, EP, DP, block-scale, MXFP4)
  • Performance tuning guide with environment variables and key parameters
  • GPU architecture recommendations (MI300X vs MI350)
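
The quantization decision tree above can be sketched as a simple precedence walk from the highest-precision format down to the most aggressive one. This is an illustrative sketch only: the function name, the memory-based selection criterion, and the compression ratios are assumptions for demonstration, not AITER's actual API or tuning logic.

```python
# Hypothetical sketch of the BF16 -> FP8 per-token -> FP8 block-scale
# -> MXFP4 -> INT4 decision tree. Names and ratios are illustrative.

def pick_moe_quantization(memory_budget_gb: float, weights_gb: float,
                          supports_fp8: bool, supports_mxfp4: bool) -> str:
    """Walk the tree from highest precision to lowest until the expert
    weights fit in the available memory budget."""
    candidates = [
        ("bf16", 1.0),             # full-precision baseline
        ("fp8_per_token", 0.5),    # halves the weight footprint
        ("fp8_block_scale", 0.5),  # same footprint, finer-grained scales
        ("mxfp4", 0.25),           # microscaling 4-bit floats
        ("int4", 0.25),            # most aggressive option
    ]
    for name, ratio in candidates:
        if name.startswith("fp8") and not supports_fp8:
            continue  # hardware lacks FP8 support
        if name == "mxfp4" and not supports_mxfp4:
            continue  # hardware lacks MXFP4 support
        if weights_gb * ratio <= memory_budget_gb:
            return name
    return "int4"  # fall back to the smallest format
```

For example, a 90 GB model with a 50 GB budget on FP8-capable hardware would land on FP8 per-token here, while a much tighter budget pushes the walk down to MXFP4 or INT4.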

Test plan

  • Review the document's accuracy against the current source code
  • Verify that all referenced API functions, source files, and config paths exist

🤖 Generated with Claude Code

Document all MOE variants (standard, EP, DP shared expert, block-scale,
MXFP4, E2E, etc.) with quantization support matrices, routing options,
model-specific recommendations, and tuning guidance.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sunway513 sunway513 merged commit 1866187 into main Feb 7, 2026
@sunway513 sunway513 deleted the docs/moe-variants-guide branch February 22, 2026 03:52
@sunway513 sunway513 restored the docs/moe-variants-guide branch February 22, 2026 03:54
sunway513 pushed a commit that referenced this pull request Apr 13, 2026
* Fix precision issue for 32x256, 64x128, 64x256 kernels silu and gelu variants
---------

Co-authored-by: Sergey Solo <ssolovye@amd.com>
