
[be] Add SimpleFakeQuantize for QAT #114

Closed

wants to merge 1 commit into from

Conversation

andrewor14
Contributor

Summary: This commit adds a simpler version of toq.FakeQuantize to be used for various flavors of QAT. In the future we should deprecate toq.FakeQuantize in favor of this new class.

Test Plan:
python test/quantization/test_qat_quant_api.py

Reviewers: jerryzh168

Subscribers: jerryzh168, supriyar

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 2, 2024
@andrewor14 andrewor14 requested a review from jerryzh168 April 2, 2024 22:51
Comment on lines +61 to +66
observer_attrs = [
    "ch_axis", "dtype", "qscheme", "quant_min", "quant_max",
    "eps", "is_dynamic", "scale", "zero_point",
]
if name in observer_attrs:
    return getattr(self.observer, name)
Contributor

Maybe just use if hasattr(self.observer, name) instead of hardcoding.
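A minimal sketch of the reviewer's suggestion, assuming the wrapper holds the observer in a `self.observer` attribute as in the PR's snippet (class and helper names here are hypothetical, not the PR's code):

```python
class ObserverDelegator:
    """Hypothetical wrapper that forwards unknown attributes to an observer."""

    def __init__(self, observer):
        # Write via __dict__ so __getattr__ is never triggered during init.
        self.__dict__["observer"] = observer

    def __getattr__(self, name):
        # __getattr__ fires only when normal attribute lookup fails,
        # so attributes defined on the wrapper itself still take priority.
        if hasattr(self.observer, name):
            return getattr(self.observer, name)
        raise AttributeError(name)
```

This avoids keeping the hardcoded list in sync with the observer's attribute set, at the cost of forwarding any observer attribute, not just the intended ones.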

Comment on lines +54 to +55
def calculate_qparams(self) -> Tuple[torch.Tensor, torch.Tensor]:
    return self.observer.calculate_qparams()
Contributor

We could potentially reroute function calls as well, I think; maybe add a TODO here?
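The rerouting the reviewer mentions falls out of attribute delegation for free: bound methods are fetched as attributes, so a delegating `__getattr__` forwards calls too. A hypothetical sketch (names are illustrative, not the PR's code):

```python
class Inner:
    """Stands in for the wrapped observer."""

    def calculate_qparams(self):
        # Dummy qparams for illustration: (scale, zero_point).
        return (0.1, 0)


class Wrapper:
    """Delegates unknown attribute lookups, including method calls."""

    def __init__(self, inner):
        self.inner = inner

    def __getattr__(self, name):
        # A bound method is just an attribute, so w.calculate_qparams()
        # is rerouted to the inner object without an explicit wrapper method.
        return getattr(self.inner, name)
```

With this pattern, an explicit `calculate_qparams` passthrough like the one above may become unnecessary.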

)


class SimpleFakeQuantize(FakeQuantizeBase):
Contributor

Do we have docs on what an Observer and a fake quant op is?
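For readers with the same question: in PyTorch quantization, an observer records tensor statistics (typically min/max) to derive quantization parameters, while a fake-quantize op simulates quantization during training by quantizing and immediately dequantizing, so the model sees quantization error but stays in floating point. A torch-free illustrative sketch of the core math (not the PR's implementation):

```python
def fake_quantize(x, scale, zero_point, quant_min=-128, quant_max=127):
    """Illustrative scalar fake-quantize: quantize, clamp, dequantize."""
    q = round(x / scale) + zero_point        # quantize to an integer level
    q = max(quant_min, min(quant_max, q))    # clamp to the int8 range
    return (q - zero_point) * scale          # dequantize back to float
```

For example, with scale=0.1 and zero_point=0, an input of 0.34 snaps to the nearest representable level, 0.3, and out-of-range values saturate at quant_max * scale.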

Contributor

@cpuhrsch cpuhrsch left a comment

Let's spend some time chatting about the higher level design behind this. Do you have some docs I could read up on?

@andrewor14
Contributor Author

Closing after some offline discussions. We'll put all observation/fake-quantize logic in the linear module itself so we don't need another class.
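A hedged sketch of the direction described in the closing comment, with all names hypothetical (this is not the code the PR ultimately became): the observe and fake-quantize steps live inline in the linear layer's forward, with no separate FakeQuantize class.

```python
class QATLinear:
    """Toy linear layer with inlined observe + fake-quantize (illustrative)."""

    def __init__(self, weight):
        self.weight = weight          # plain list of floats for illustration
        self.w_min = None
        self.w_max = None

    def _observe(self):
        # "Observer" step folded into the module: track the weight range.
        self.w_min = min(self.weight)
        self.w_max = max(self.weight)

    def forward(self, x):
        self._observe()
        # Symmetric int8-style scale; the `or 1.0` guards all-zero weights.
        scale = max(abs(self.w_min), abs(self.w_max)) / 127 or 1.0
        # Fake-quantize the weights inline, then apply the linear op.
        fq_w = [round(w / scale) * scale for w in self.weight]
        return sum(xi * wi for xi, wi in zip(x, fq_w))
```

The appeal of this design is that the module owns its own quantization state, so there is no attribute-forwarding wrapper to maintain.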

@andrewor14 andrewor14 closed this Apr 4, 2024
@andrewor14 andrewor14 deleted the simple_fq branch April 4, 2024 15:25
yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024
Update readme

Update README.md (pytorch#113)

update README.md

Update README.md (pytorch#114)

Update README.md (pytorch#115)

Update Readme.md