Quantize vit_b_16 tutorial - Part 1 #60
Conversation
tutorials/quantize_vit/bfloat16_code (Outdated)
V0315 17:55:08.077000 140041589225280 torch/_inductor/graph.py:1258] [0/0] [__output_code] from torch._inductor.wrapper_benchmark import compiled_module_main
V0315 17:55:08.077000 140041589225280 torch/_inductor/graph.py:1258] [0/0] [__output_code] compiled_module_main('None', benchmark_compiled_module)
V0315 17:55:08.077000 140041589225280 torch/_inductor/graph.py:1258] [0/0] [__output_code]
I0315 17:55:08.079000 140041589225280 torch/_inductor/graph.py:1264] [0/0] [__output_code] Output code written to: /tmp/torchinductor_cpuhrsch/2i/c2ixftylrwvvc3swfutdqklg6xb2w47xlwmfdmtgktp4yb4kzkro.py
Maybe check in this file instead?
/tmp/torchinductor_cpuhrsch/2i/c2ixftylrwvvc3swfutdqklg6xb2w47xlwmfdmtgktp4yb4kzkro.py
Yes, I can make it a .py file. I don't think the name is great, though; torch should have a more official way of writing these out to a predictable location.
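For reference, a minimal sketch of one way to get the generated code into a predictable location today, assuming the TORCH_COMPILE_DEBUG environment variable available in recent PyTorch builds; the toy function is just a stand-in for the tutorial's model:

```python
import os

# Assumption: TORCH_COMPILE_DEBUG makes Inductor write its debug artifacts,
# including output_code.py, under ./torch_compile_debug/ instead of /tmp.
# Set it before importing torch so the config picks it up.
os.environ["TORCH_COMPILE_DEBUG"] = "1"

import torch

def f(x):
    return torch.relu(x) + 1

compiled = torch.compile(f)
compiled(torch.randn(8, 8))
# Look for torch_compile_debug/run_<timestamp>/torchinductor/*/output_code.py,
# which could then be checked in alongside the tutorial.
```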
tutorials/quantize_vit/run.sh (Outdated)
# Store the output code for further inspection
TORCH_LOGS='output_code' python run_vit_b.py 2> bfloat16_code
TORCH_LOGS='output_code' python run_vit_b_quant.py 2> quant_code
I think what people might expect to see here is the fused quant and dequant in the generated code. We could add some comments to the checked-in code so people can inspect it; at least, that's what I'd expect people to look for if they want to make sure torch.compile and quantization compose together.
I'll try to add more pictures and also isolate the kernel. It's much clearer from the traces.
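As an illustration of the trace-based approach, here is a minimal sketch using torch.profiler; the toy model stands in for the compiled, quantized ViT, and kernel names in the real trace will differ:

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Stand-in for the tutorial's compiled, quantized ViT.
model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).eval()
x = torch.randn(8, 64)

# Capture a trace; on GPU you would add ProfilerActivity.CUDA.
with profile(activities=[ProfilerActivity.CPU]) as prof:
    with torch.no_grad():
        model(x)

# Open the JSON in chrome://tracing or https://ui.perfetto.dev to
# isolate individual kernels, as suggested above.
prof.export_chrome_trace("trace.json")
```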
## Quantization code - start
from torchao.quantization import quant_api
quant_api.change_linear_weights_to_int8_dqtensors(model)
@cpuhrsch not sure if we discussed this already, but do you have any thoughts/comments on putting these under the unified quantization API (https://github.com/pytorch-labs/ao/blob/main/torchao/quantization/quant_api.py#L52)?
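For context, a hedged sketch of how the snippet above might fit into the full tutorial flow, assuming torchvision's vit_b_16 and a CUDA device; the actual run_vit_b_quant.py may differ in details:

```python
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights
from torchao.quantization import quant_api

# Baseline model in bfloat16 on GPU (assumed setup).
model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
model = model.to(device="cuda", dtype=torch.bfloat16).eval()

## Quantization code - start
quant_api.change_linear_weights_to_int8_dqtensors(model)
## Quantization code - end

# Compile so Inductor can fuse the quant/dequant ops into surrounding kernels.
model = torch.compile(model, mode="max-autotune")

x = torch.randn(1, 3, 224, 224, device="cuda", dtype=torch.bfloat16)
with torch.no_grad():
    model(x)
```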