Move and rename GranularityType -> Granularity (#1038)
* Make module swap the main QAT flow again

  Summary: Following #987, this commit makes module swap the main QAT flow today. We remove all the tensor subclass fake quantize injection logic, since it is not needed in either the short-term or the long-term plans for QAT. In the short term, we will continue to use a full module swap flow, and we will only migrate to the long-term flow once there is general distributed support for tensor subclasses and tensor subclass composability provides meaningful benefits.

  Test Plan: python test/quantization/test_qat.py

* Move and rename GranularityType -> Granularity

  Summary: Move GranularityType to quant_primitives.py to be consistent with other similar fields like MappingType and ZeroPointDomain.

  Test Plan: CI
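For readers following the rename, here is a minimal sketch of how the relocated classes might be imported and used after this change. It assumes the classes are exposed from torchao.quantization.quant_primitives, as the commit summary suggests; the import path and example values are illustrative, not taken from this diff.

from torchao.quantization.quant_primitives import (
    Granularity,
    PerAxis,
    PerGroup,
    PerTensor,
)

# The granularities are frozen dataclasses, so instances are immutable,
# hashable, and compare by value.
assert PerGroup(group_size=32) == PerGroup(group_size=32)
assert isinstance(PerAxis(axis=0), Granularity)
assert isinstance(PerTensor(), Granularity)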
1 parent 107e378 · commit 0f6bae5 · Showing 15 changed files with 143 additions and 111 deletions.
@@ -0,0 +1,76 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.

# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

from dataclasses import dataclass


@dataclass(frozen=True)
class Granularity:
    """
    Base class for representing the granularity of quantization.

    This class serves as a parent for specific granularity types used in
    quantization operations, such as per-tensor or per-axis quantization.
    """
    pass


@dataclass(frozen=True)
class PerTensor(Granularity):
    """
    Represents per-tensor granularity in quantization.

    This granularity type calculates the quantization parameters
    based on the entire tensor.
    """
    pass


@dataclass(frozen=True)
class PerAxis(Granularity):
    """
    Represents per-axis granularity in quantization.

    This granularity type calculates different quantization parameters
    along a specified axis of the tensor.

    For example, if the input tensor has shape [8, 16] and axis=0, then
    the quantization parameters are calculated for each row of the tensor,
    giving a total of 8 quantization parameters.

    Attributes:
        axis (int): The axis along which reduction is performed.
    """
    axis: int


@dataclass(frozen=True)
class PerGroup(Granularity):
    """
    Represents per-channel group granularity in quantization.

    This granularity type calculates different quantization parameters
    for each group of <group_size> elements.

    For example, if the input tensor has shape [8, 16] and the group size is 4,
    then the input tensor is reshaped to [32, 4] and quantization parameters
    are calculated for each group of 4 elements, giving a total of 32
    quantization parameters.

    Attributes:
        group_size (int): The size of each quantization group.
    """
    group_size: int


class PerRow(Granularity):
    """
    Represents row-wise granularity in quantization.

    This is a special case of per-axis quantization and is unique to Float8
    matmuls, where the input is quantized with a block_size of
    (1, ..., input.shape[-1]) and the weight is quantized with a block_size
    of (1, weight.shape[1]).
    """
    pass
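To make the docstring examples above concrete, the following is a hypothetical sketch (not an API introduced by this commit) of how a caller could translate each granularity into the block_size tuple used by affine quantization primitives. block_size_for is an invented helper name, and the mapping simply restates the docstrings; it uses the classes defined in the file above.

def block_size_for(granularity: Granularity, shape: tuple) -> tuple:
    # Hypothetical helper: derive a block_size for a tensor of the given shape,
    # where one set of quantization parameters is shared per block.
    if isinstance(granularity, PerTensor):
        # A single set of parameters for the whole tensor.
        return tuple(shape)
    if isinstance(granularity, PerAxis):
        # One set of parameters per slice along the given axis.
        return tuple(1 if i == granularity.axis else s for i, s in enumerate(shape))
    if isinstance(granularity, PerGroup):
        # Groups of group_size elements along the last dimension.
        return tuple([1] * (len(shape) - 1) + [granularity.group_size])
    if isinstance(granularity, PerRow):
        # Float8 row-wise: one set of parameters per row.
        return tuple([1] * (len(shape) - 1) + [shape[-1]])
    raise ValueError(f"unsupported granularity: {granularity}")

# For an [8, 16] tensor this reproduces the counts in the docstrings:
#   PerAxis(axis=0)        -> block_size (1, 16) ->  8 sets of parameters
#   PerGroup(group_size=4) -> block_size (1, 4)  -> 32 sets of parameters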