Add hardware check to fp8 quant #1314

jainapurva · 2024-11-19T21:57:34Z

Add a hardware check so that fp8 quantization only runs on compatible hardware.

Issue: #1188

pytorch-bot · 2024-11-19T21:57:37Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1314

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

[DomainsOnly] Jobs fail with GLIBC version not found

❌ 4 New Failures

As of commit 88b6ba1 with merge base b714026:

NEW FAILURES - The following jobs have failed:

Run Regression Tests / test (CPU 2.5.1, linux.4xlarge, torch==2.5.1 --index-url https://download.pytorch.org/whl/cpu, cpu) / linux-job (gh)
test/dtypes/test_affine_quantized_float.py::TestAffineQuantizedFloat8Compile::test_unsupported_granularity
Run Regression Tests / test (CUDA 2.5.1, linux.g5.12xlarge.nvidia.gpu, torch==2.5.1 --index-url https://download.pytorch... / linux-job (gh)
test/dtypes/test_affine_quantized_float.py::TestAffineQuantizedFloat8Compile::test_unsupported_granularity
Run Regression Tests / test-nightly (CPU Nightly, linux.4xlarge, --pre torch==2.6.0.dev20241101 --index-url https://down... / linux-job (gh)
test/dtypes/test_affine_quantized_float.py::TestAffineQuantizedFloat8Compile::test_unsupported_granularity
Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch==2.6.0.dev20241101 --index-... / linux-job (gh)
test/dtypes/test_affine_quantized_float.py::TestAffineQuantizedFloat8Compile::test_unsupported_granularity

This comment was automatically generated by Dr. CI and updates every 15 minutes.

torchao/quantization/quant_api.py

drisspg · 2024-11-19T22:50:08Z

@@ -939,6 +940,9 @@ def float8_dynamic_activation_float8_weight(
        mm_config (Float8MMConfig): Configuration for the matrix multiplication. Default uses fast accumulation.

    """
+    assert (
+        is_cuda_8_9
+    ), "Float8 dynamic activation quantization is only supported on CUDA 8.9 and above"


This should also be supported on AMD. We should probably update this check.

cc @jeffdaily
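
To illustrate the suggestion above, here is a minimal sketch of a check that accepts both NVIDIA and AMD devices. It is not torchao's API: `supports_fp8`, `current_device_supports_fp8`, and `ROCM_FP8_ARCHS` are hypothetical names, and the assumption that `gfx942` (MI300-class) is the only fp8-capable ROCm architecture is mine, not something this PR states. The NVIDIA threshold of compute capability 8.9 comes from the assert in the diff.

```python
# Hypothetical helper, not part of torchao. The arch set below is an assumption.
ROCM_FP8_ARCHS = {"gfx942"}  # assumed: MI300-class GPUs


def supports_fp8(capability=None, rocm_arch=None):
    """Return True if the described device is assumed to support fp8.

    capability: (major, minor) CUDA compute capability, or None.
    rocm_arch:  ROCm architecture string such as "gfx942:sramecc+:xnack-", or None.
    """
    if rocm_arch:
        # The arch string may carry feature suffixes after a colon; compare the base name.
        return rocm_arch.split(":")[0] in ROCM_FP8_ARCHS
    return capability is not None and tuple(capability) >= (8, 9)


def current_device_supports_fp8():
    """Query PyTorch for the current device and apply supports_fp8."""
    import torch  # imported lazily so supports_fp8 stays usable without torch

    if not torch.cuda.is_available():
        return False
    if torch.version.hip is not None:  # ROCm build of PyTorch
        arch = torch.cuda.get_device_properties(0).gcnArchName
        return supports_fp8(rocm_arch=arch)
    return supports_fp8(capability=torch.cuda.get_device_capability())
```

Keeping the decision in a pure function means it can be unit-tested on CPU-only CI runners, while the assert in `float8_dynamic_activation_float8_weight` would call something like `current_device_supports_fp8()`.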

facebook-github-bot added the CLA Signed label Nov 19, 2024

jainapurva requested a review from drisspg November 19, 2024 21:57

jainapurva added the topic: bug fix label Nov 19, 2024

Add hardware check to fp8 quant

88b6ba1

jainapurva force-pushed the fp8_check branch from 90b6c38 to 88b6ba1 November 19, 2024 22:02

jerryzh168 reviewed Nov 19, 2024

drisspg reviewed Nov 19, 2024
