
[be] Add SimpleFakeQuantize for QAT #114

Closed

wants to merge 1 commit into from

Conversation

andrewor14
Contributor

Summary: This commit adds a simpler version of toq.FakeQuantize to be used for various flavors of QAT. In the future we should deprecate toq.FakeQuantize in favor of this new class.

Test Plan:
python test/quantization/test_qat_quant_api.py

Reviewers: jerryzh168

Subscribers: jerryzh168, supriyar

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 2, 2024
@andrewor14 andrewor14 requested a review from jerryzh168 April 2, 2024 22:51
Comment on lines +61 to +66
observer_attrs = [
    "ch_axis", "dtype", "qscheme", "quant_min", "quant_max",
    "eps", "is_dynamic", "scale", "zero_point",
]
if name in observer_attrs:
    return getattr(self.observer, name)
Contributor

Maybe just use if hasattr(self.observer, name) instead of hardcoding.
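A minimal sketch of the reviewer's suggestion, assuming the wrapper holds the observer in a `self.observer` attribute as in the PR's snippet (class and helper names here are hypothetical, not the PR's code):

```python
class ObserverDelegator:
    """Hypothetical wrapper that forwards unknown attributes to an observer."""

    def __init__(self, observer):
        # Write via __dict__ so __getattr__ is never triggered during init.
        self.__dict__["observer"] = observer

    def __getattr__(self, name):
        # __getattr__ fires only when normal attribute lookup fails,
        # so attributes defined on the wrapper itself still take priority.
        if hasattr(self.observer, name):
            return getattr(self.observer, name)
        raise AttributeError(name)
```

This avoids keeping the hardcoded list in sync with the observer's attribute set, at the cost of forwarding any observer attribute, not just the intended ones.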

Comment on lines +54 to +55
def calculate_qparams(self) -> Tuple[torch.Tensor, torch.Tensor]:
    return self.observer.calculate_qparams()
Contributor

We could potentially reroute function calls as well, I think; maybe add a TODO here?
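The rerouting the reviewer mentions falls out of attribute delegation for free: bound methods are fetched as attributes, so a delegating `__getattr__` forwards calls too. A hypothetical sketch (names are illustrative, not the PR's code):

```python
class Inner:
    """Stands in for the wrapped observer."""

    def calculate_qparams(self):
        # Dummy qparams for illustration: (scale, zero_point).
        return (0.1, 0)


class Wrapper:
    """Delegates unknown attribute lookups, including method calls."""

    def __init__(self, inner):
        self.inner = inner

    def __getattr__(self, name):
        # A bound method is just an attribute, so w.calculate_qparams()
        # is rerouted to the inner object without an explicit wrapper method.
        return getattr(self.inner, name)
```

With this pattern, an explicit `calculate_qparams` passthrough like the one above may become unnecessary.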

)


class SimpleFakeQuantize(FakeQuantizeBase):
Contributor

Do we have docs on what an Observer and a fake quant op is?
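For readers with the same question: in PyTorch quantization, an observer records tensor statistics (typically min/max) to derive quantization parameters, while a fake-quantize op simulates quantization during training by quantizing and immediately dequantizing, so the model sees quantization error but stays in floating point. A torch-free illustrative sketch of the core math (not the PR's implementation):

```python
def fake_quantize(x, scale, zero_point, quant_min=-128, quant_max=127):
    """Illustrative scalar fake-quantize: quantize, clamp, dequantize."""
    q = round(x / scale) + zero_point        # quantize to an integer level
    q = max(quant_min, min(quant_max, q))    # clamp to the int8 range
    return (q - zero_point) * scale          # dequantize back to float
```

For example, with scale=0.1 and zero_point=0, an input of 0.34 snaps to the nearest representable level, 0.3, and out-of-range values saturate at quant_max * scale.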

Contributor

@cpuhrsch cpuhrsch left a comment

Let's spend some time chatting about the higher level design behind this. Do you have some docs I could read up on?

@andrewor14
Contributor Author

Closing after some offline discussions. We'll put all observation/fake-quantize logic in the linear module itself so we don't need another class.
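A hedged sketch of the direction described in the closing comment, with all names hypothetical (this is not the code the PR ultimately became): the observe and fake-quantize steps live inline in the linear layer's forward, with no separate FakeQuantize class.

```python
class QATLinear:
    """Toy linear layer with inlined observe + fake-quantize (illustrative)."""

    def __init__(self, weight):
        self.weight = weight          # plain list of floats for illustration
        self.w_min = None
        self.w_max = None

    def _observe(self):
        # "Observer" step folded into the module: track the weight range.
        self.w_min = min(self.weight)
        self.w_max = max(self.weight)

    def forward(self, x):
        self._observe()
        # Symmetric int8-style scale; the `or 1.0` guards all-zero weights.
        scale = max(abs(self.w_min), abs(self.w_max)) / 127 or 1.0
        # Fake-quantize the weights inline, then apply the linear op.
        fq_w = [round(w / scale) * scale for w in self.weight]
        return sum(xi * wi for xi, wi in zip(x, fq_w))
```

The appeal of this design is that the module owns its own quantization state, so there is no attribute-forwarding wrapper to maintain.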

@andrewor14 andrewor14 closed this Apr 4, 2024
@andrewor14 andrewor14 deleted the simple_fq branch April 4, 2024 15:25
yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024
Update readme

Update README.md (pytorch#113)

update README.md

Update README.md (pytorch#114)

Update README.md (pytorch#115)

Update Readme.md