Closed
Labels
AutoDeploy (<NV> AutoDeploy Backend), bug (Something isn't working)
Description
System Info
On an H100 from cw, the CUTLASS MoE BF16 kernel causes an accuracy drop on GSM8K.
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
pytest tests/integration/defs/accuracy/test_llm_api_autodeploy.py::TestNemotronMOE -s -vv
Expected behavior
Fix the accuracy issue, or, if the CUTLASS kernel cannot support BF16, fall back to the Triton kernel.
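The fallback proposed above could look roughly like the sketch below. Everything in it is hypothetical and for illustration only: the function name `select_moe_backend`, the `TRUSTED_DTYPES` table, and the dtype strings are assumptions, not AutoDeploy's actual API or configuration, and the entry marking CUTLASS as trusted for float16 is a placeholder rather than a verified result.

```python
# Hypothetical sketch of a dtype-based MoE backend fallback.
# None of these names come from the AutoDeploy code base.

# Assumed table: which dtypes each backend is currently trusted for.
# BF16 is left out of "cutlass" to reflect the accuracy drop reported here.
TRUSTED_DTYPES = {
    "cutlass": {"float16"},              # placeholder assumption
    "triton": {"float16", "bfloat16"},   # placeholder assumption
}


def select_moe_backend(dtype: str,
                       preferred: str = "cutlass",
                       fallback: str = "triton") -> str:
    """Return the preferred backend if it is trusted for `dtype`,
    otherwise the fallback backend; raise if neither is trusted."""
    if dtype in TRUSTED_DTYPES.get(preferred, set()):
        return preferred
    if dtype in TRUSTED_DTYPES.get(fallback, set()):
        return fallback
    raise ValueError(f"no trusted MoE backend for dtype {dtype!r}")


if __name__ == "__main__":
    # BF16 is routed to the fallback; FP16 stays on the preferred backend.
    print(select_moe_backend("bfloat16"))
    print(select_moe_backend("float16"))
```

Keeping the decision in a single table would make it easy to re-enable CUTLASS for BF16 once the accuracy issue is fixed.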
Actual behavior
N/A
Additional notes
N/A
Status
Done