[Hardware][AMD][Bugfix] Fix PTPC FP8 quantization#32813

Open
mawong-amd wants to merge 1 commit into vllm-project:main from ROCm:fix_ptpc_quantization

@mawong-amd mawong-amd commented Jan 21, 2026

Purpose

Fixes PTPC FP8 quantization, and with it the AMD Quantization Tests group, following the refactoring done in #32189. PTPCFp8LinearMethod now inherits from Fp8OnlineLinearMethod rather than Fp8LinearMethod.
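
For readers unfamiliar with the change, here is a minimal sketch of why swapping the base class matters. The three classes below are stand-ins that only borrow the names from this PR; the `prepare_weight` hook and its return value are hypothetical and are not taken from vLLM. The sketch also assumes the two bases are unrelated, which may not hold in the real code.

```python
class Fp8LinearMethod:
    """Stand-in base: assumes the checkpoint already holds FP8 weights."""

    def prepare_weight(self, weight):
        # Hypothetical hook; the real method names in vLLM may differ.
        return {"data": weight, "quantized_at_load": False}


class Fp8OnlineLinearMethod:
    """Stand-in base: quantizes higher-precision weights while loading."""

    def prepare_weight(self, weight):
        # Hypothetical hook; marks the weight as quantized on the fly.
        return {"data": weight, "quantized_at_load": True}


class PTPCFp8LinearMethod(Fp8OnlineLinearMethod):
    """After the fix: picks up the online-quantization behavior."""


method = PTPCFp8LinearMethod()
result = method.prepare_weight([0.5, -1.25])
print(result["quantized_at_load"])
print(isinstance(method, Fp8LinearMethod))
```

With the old base class, the subclass would have inherited the pre-quantized path instead, which is the mismatch this PR describes.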

Test Plan

pytest -sv quantization/test_ptpc_fp8.py
The above is implicitly run as part of AMD CI's Quantization Tests group.

Test Result

The test and test group both pass.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mawong-amd mawong-amd changed the title from "Fix PTPC quantization" to "[Hardware][AMD][Bugfix] Fix PTPC quantization" Jan 21, 2026
@mawong-amd mawong-amd changed the title from "[Hardware][AMD][Bugfix] Fix PTPC quantization" to "[Hardware][AMD][Bugfix] Fix PTPC FP8 quantization" Jan 21, 2026
@mergify mergify bot added the rocm (Related to AMD ROCm) and bug (Something isn't working) labels Jan 21, 2026
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly fixes a bug in the PTPC FP8 quantization implementation. By changing the base class of PTPCFp8LinearMethod from Fp8LinearMethod to Fp8OnlineLinearMethod, the method now correctly inherits the behavior for online quantization, which is its intended purpose. This aligns with the fact that PTPC performs dynamic, on-the-fly quantization of weights rather than loading pre-quantized checkpoints. The change is logical, well-contained, and directly addresses the issue described. I find no issues with this correction.

@mawong-amd
Contributor Author

Closing since PTPC FP8 is being deprecated soon: #32700

@mawong-amd mawong-amd closed this Jan 22, 2026
@mawong-amd mawong-amd deleted the fix_ptpc_quantization branch January 22, 2026 06:01
@mawong-amd mawong-amd restored the fix_ptpc_quantization branch March 2, 2026 17:50
@mawong-amd mawong-amd reopened this Mar 2, 2026
Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>
@mawong-amd mawong-amd force-pushed the fix_ptpc_quantization branch from 4e61450 to 61d73ae Compare March 2, 2026 17:50
Labels

bug Something isn't working rocm Related to AMD ROCm
