fix regression issue and command in mix-precision example #2317
Signed-off-by: He, Xin3 <[email protected]>
Better to switch the AR dependency to the pip-released v0.8 version once AR v0.8 is released.
...ples/pytorch/nlp/huggingface_models/language-modeling/quantization/mix-precision/quantize.py
The merge can wait until the binary is published.
User description
Type of Change
bug fix
PR Type
Enhancement, Bug fix
Description
- Added `mem_per_param_scale` and `enable_torch_compile` arguments
- Updated dtype handling for `uNVFP4` and `NVFP4+`
- Fixed regression issues in dtype mapping and layer configuration
- Updated README to include `enable_torch_compile` in the example command
File Walkthrough
quantize.py: Add `mem_per_param_scale` and `enable_torch_compile`
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/mix-precision/quantize.py
- Added `mem_per_param_scale` and `enable_torch_compile` arguments
- Updated dtype handling for `uNVFP4` and `NVFP4+`
README.md: Update README with `enable_torch_compile`
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/mix-precision/README.md
- Updated README to include `enable_torch_compile` in the example command
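
For reviewers who want a picture of how the two new arguments might be exposed on the command line, here is a minimal sketch using `argparse`. It is not the code from this PR: the `--` flag spellings, the `float` type, the `None` default, the help strings, and the `build_parser` helper are all assumptions made for illustration, and the actual `quantize.py` may define them differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing only the two arguments named in this PR."""
    parser = argparse.ArgumentParser(
        description="Illustrative sketch of the mix-precision example arguments."
    )
    # Assumption: a numeric scale; the real type and default are not shown in the PR text.
    parser.add_argument(
        "--mem_per_param_scale",
        type=float,
        default=None,
        help="Assumed: scale factor for the per-parameter memory estimate.",
    )
    # Assumption: a boolean switch that is off unless the flag is passed.
    parser.add_argument(
        "--enable_torch_compile",
        action="store_true",
        help="Assumed: opt in to torch.compile during quantization.",
    )
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
example_args = build_parser().parse_args(
    ["--enable_torch_compile", "--mem_per_param_scale", "13"]
)
```

Using `action="store_true"` keeps compilation opt-in, which would match the README change of adding the flag explicitly to the example command.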