What is the purpose of weight quantization and activation quantization? #3176
Unanswered
CoinCheung asked this question in Q&A
Hi,
I could not find enough documentation on weight/activation quantization. I can think of two uses for it: the first is to speed up training with 8-bit or 4-bit computation, for which we need to quantize the weights or activations. The second is so-called quantization-aware training (QAT), in which we want the model to adapt to the quantization precision during the training process. Could you tell me which of these is the purpose of weight/activation quantization in DeepSpeed?
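
To make the distinction concrete, here is a minimal sketch of what I mean by each usage. This is plain PyTorch, not DeepSpeed's implementation; the helper names (`fake_quantize`, `int8_matmul`) are my own, and it assumes simple per-tensor symmetric quantization:

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Usage 2 (QAT-style): quantize-dequantize. The compute stays in float,
    but values are snapped to the quantization grid so the model adapts to
    low precision during training. (Real QAT wraps the round in a
    straight-through estimator so gradients can flow through it.)"""
    qmax = 2 ** (num_bits - 1) - 1                    # e.g. 127 for 8 bits
    scale = x.abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax, qmax)
    return q * scale                                   # dequantize back to float

def int8_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Usage 1 (speed-up style): actually run the matmul on integer tensors
    and rescale afterwards. Real speed-ups come from hardware int8 GEMM
    kernels; this only illustrates the arithmetic, on CPU."""
    qmax = 127
    sa = a.abs().max().clamp(min=1e-8) / qmax
    sb = b.abs().max().clamp(min=1e-8) / qmax
    qa = torch.clamp(torch.round(a / sa), -qmax, qmax).to(torch.int32)
    qb = torch.clamp(torch.round(b / sb), -qmax, qmax).to(torch.int32)
    # accumulate in int32, then rescale the result back to float
    return (qa @ qb).float() * (sa * sb)
```

In the first case the quantized weights/activations are fed to low-precision kernels for speed; in the second the network only simulates quantization so it is robust to it at inference time. My question is which of these DeepSpeed's weight/activation quantization is meant for.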