
convert : remove input_scale for dequantized fp8 modelopt #22356

Merged

CISC merged 3 commits into master from cisc/convert-modelopt-input-scale-fp8 on Apr 27, 2026

Conversation

@CISC (Member) commented on Apr 25, 2026

Overview

Fixes #22346

Additional information

Refactors scale tensor writing into reusable methods and removes input_scale for dequantized FP8 modelopt tensors.
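For context, here is a minimal sketch of the idea (hypothetical helper names, not the actual convert_hf_to_gguf.py code): once FP8 modelopt weights are dequantized during conversion, weight_scale is folded into the weights and input_scale is no longer meaningful, so neither scale tensor should be written to the output model.

```python
# Hypothetical sketch, not the convert_hf_to_gguf.py implementation:
# drop the modelopt scale tensors once the FP8 weights are dequantized.
import torch


def dequant_fp8(weight: torch.Tensor, weight_scale: torch.Tensor) -> torch.Tensor:
    """Fold the per-tensor weight_scale back into the FP8 weight."""
    return weight.to(torch.float32) * weight_scale.to(torch.float32)


def dequantize_modelopt_tensors(tensors: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
    """Return only the tensors that should be written after dequantization.

    weight_scale is consumed by dequant_fp8, and input_scale (an
    activation scale used by runtime FP8 kernels) is dropped entirely.
    """
    out: dict[str, torch.Tensor] = {}
    for name, tensor in tensors.items():
        if name.endswith((".weight_scale", ".input_scale")):
            continue  # scale tensors are not written for dequantized weights
        scale = tensors.get(name + "_scale")
        if name.endswith(".weight") and scale is not None:
            tensor = dequant_fp8(tensor, scale)
        out[name] = tensor
    return out
```

The actual change refactors the scale-tensor writing into reusable converter methods; the sketch only illustrates which tensors end up being written.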


The github-actions bot added the python label (python script changes) on Apr 25, 2026.
CISC changed the title from "convert : support input_scale for fp8 modelopt" to "convert : remove input_scale for dequantized fp8 modelopt" on Apr 26, 2026.
@danbev (Member) left a comment

Verified locally that the model conversion works 👍

CISC merged commit d13540b into master on Apr 27, 2026 (9 checks passed).
CISC deleted the cisc/convert-modelopt-input-scale-fp8 branch on April 27, 2026 at 06:45.
IntelNav pushed a commit to IntelNav/llama.cpp that referenced this pull request Apr 29, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
samuraieng pushed a commit to samuraieng/llama.cpp that referenced this pull request May 6, 2026
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
meh pushed a commit to meh/llama.cpp that referenced this pull request May 10, 2026

Labels

python (python script changes)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Misc. bug: convert_hf_to_gguf.py no longer supports Nemotron 3 super FP8 or NVFP4

3 participants