
vLLM FP8 quantized support for SFT/GRPO #313

Merged: danielhanchen merged 20 commits into unslothai:main from Datta0:vllm_fp8 on Oct 16, 2025
Conversation

@Datta0 (Collaborator) commented Oct 6, 2025

if extra_in_new:
    for attr in sorted(extra_in_new):
        print(f"EXTRA ATTRIBUTE: {name}.{attr} (exists in new model but not original)")
    print(f'Found some extra attributes like: {list(extra_in_new)[:5]}...')
Contributor commented:

Why?
@Datta0 (Collaborator, Author) replied:
We're copying over the quant method in some places; that is the difference between the expected and the created model. We didn't want to spam the console with every layer printing the same message:

Extra attribute quant method in model.model.layers.x
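The rationale above (print a short sample instead of one line per layer) can be sketched as follows; the helper name `report_extra_attributes` and its inputs are hypothetical illustrations, not code from this PR:

```python
# Hypothetical sketch: compare attribute names between an original module and
# a re-created (quantized) one, and summarize the extras compactly rather than
# emitting one console line per layer.

def report_extra_attributes(name, orig_attrs, new_attrs, sample_size=5):
    """Return attributes present only on the new module; print a short summary."""
    extra_in_new = set(new_attrs) - set(orig_attrs)
    if extra_in_new:
        sample = sorted(extra_in_new)[:sample_size]
        print(f"{name}: {len(extra_in_new)} extra attribute(s), e.g. {sample}")
    return extra_in_new

# Illustrative usage: the re-created layer carries an extra quant attribute.
extra = report_extra_attributes(
    "model.model.layers.0",
    orig_attrs=["weight", "bias"],
    new_attrs=["weight", "bias", "quant_method"],
)
```

Capping the sample keeps the diagnostic visible without repeating an identical line for every transformer layer.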

Comment threads (outdated): unsloth_zoo/empty_model.py (3), unsloth_zoo/vllm_utils.py (7)
@danielhanchen (Contributor) left a review comment:

Nice

Comment threads (outdated): unsloth_zoo/vllm_utils.py (4)
@Datta0 changed the title from "vLLM FP8-E4M3 block quantized support" to "vLLM FP8 quantized support for SFT/GRPO" on Oct 15, 2025
Comment thread (outdated): unsloth_zoo/empty_model.py
@danielhanchen (Contributor) left a review comment:

Small changes

@danielhanchen merged commit 4613671 into unslothai:main on Oct 16, 2025

2 participants