
[FLOAT8] Add Hardware Compatibility Check for FP8 Quantization #1188

Open
drisspg opened this issue Oct 29, 2024 · 1 comment
Assignees
Labels
float8, good first issue (Good for newcomers)

Comments

@drisspg
Contributor

drisspg commented Oct 29, 2024

Add Hardware Compatibility Check for FP8 Quantization

Issue Summary

In our current implementation, we provide three APIs for model computation in FP8 format. However, for dynamic activation quantization, these FP8 computations are only supported on NVIDIA GPUs with the SM89 or SM90 architecture. When models are quantized to FP8 on unsupported hardware, errors surface only at runtime, which can lead to confusion and wasted resources.

Proposed Solution

Check at the model quantization stage whether the target hardware supports FP8 computation, and raise an error if it does not. This way, users are informed immediately that their hardware cannot handle FP8 quantization, rather than discovering it at runtime. We could also point users to weight-only quantization, which has broader hardware support.

APIs where the error should be raised:

    "float8_dynamic_activation_float8_weight",
    "float8_static_activation_float8_weight"
drisspg added the good first issue and float8 labels on Oct 29, 2024
@petrex
Collaborator

petrex commented Oct 31, 2024

Good idea! I will add an architecture check for AMD GPUs as well.
