Add layout option to woq int4 api #670

Merged (3 commits) on Aug 14, 2024

Diogo-V (Contributor) commented Aug 13, 2024

Summary

This PR updates the int4_weight_only API by introducing a layout_type parameter. This will allow users to quantize models to int4 while being able to choose between available layouts.

This change also breaks backward compatibility for callers that pass the inner_k_tiles parameter explicitly:

# old
def int4_weight_only(group_size=128, inner_k_tiles=8):
    ...

quantize_(my_model, int4_weight_only(inner_k_tiles=8))

# new
def int4_weight_only(group_size=128, layout_type=TensorCoreTiledLayoutType(inner_k_tiles=8)):
    ...

quantize_(my_model, int4_weight_only(layout_type=TensorCoreTiledLayoutType(inner_k_tiles=8)))

BC Breaking notes for Release statement

  • inner_k_tiles was deprecated in favor of layout_type, enabling users to select from various layout options #670
# for torchao 0.5
from torchao.quantization import quantize_, int4_weight_only
from torchao.dtypes import TensorCoreTiledLayoutType
quantize_(my_model, int4_weight_only(layout_type=TensorCoreTiledLayoutType(inner_k_tiles=8)))

# for torchao 0.4
from torchao.quantization import quantize_, int4_weight_only
quantize_(my_model, int4_weight_only(inner_k_tiles=8))
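
For code that has to run against both releases, the two call styles above could be selected by version. The following is only a sketch under stated assumptions: `parse_version` and `int4_weight_only_kwargs` are hypothetical helpers, not part of torchao, and `make_layout` stands in for `TensorCoreTiledLayoutType` so the snippet runs without torchao installed.

```python
import re
from typing import Any, Callable, Dict, Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '0.5.0' or '0.4.0+git1234'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def int4_weight_only_kwargs(
    torchao_version: str,
    inner_k_tiles: int,
    make_layout: Callable[[int], Any],
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for int4_weight_only.

    make_layout is a stand-in for constructing TensorCoreTiledLayoutType,
    so this sketch does not import torchao itself.
    """
    if parse_version(torchao_version) >= (0, 5):
        # 0.5 and later: inner_k_tiles travels inside the layout object.
        return {"layout_type": make_layout(inner_k_tiles)}
    # 0.4 and earlier: inner_k_tiles is a direct keyword argument.
    return {"inner_k_tiles": inner_k_tiles}
```

With torchao installed, the version string could come from `importlib.metadata.version("torchao")` (assuming the package is installed under that distribution name), and `make_layout` could be `lambda k: TensorCoreTiledLayoutType(inner_k_tiles=k)`; the resulting dict would then be unpacked into `int4_weight_only(**kwargs)`.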

Tasks

  • Update the function's interface
  • Update the documentation
  • Update the tests

PRs pending on this one

Questions for Reviewers

  • I am not familiar with how releases and backwards compatibility are maintained in PyTorch. Since this is a breaking change, how should we handle delivering it? Is there a specific flag or if statement that I should add somewhere?

Let me know if there is anything that I need to update. I would be happy to perform any necessary changes.
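
For context on the "flag/if statement" question: one common pattern for softening this kind of change is a shim that keeps accepting the old keyword for a release cycle and emits a warning. This is not what this PR does (it removes inner_k_tiles outright); the sketch below is purely illustrative. `int4_weight_only_compat` is a hypothetical name, and the `TensorCoreTiledLayoutType` defined here is a local stand-in that assumes only the inner_k_tiles field matters.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TensorCoreTiledLayoutType:
    """Local stand-in for torchao's class of the same name (assumption:
    only the inner_k_tiles field is relevant to this sketch)."""
    inner_k_tiles: int = 8


def int4_weight_only_compat(
    group_size: int = 128,
    layout_type: Optional[TensorCoreTiledLayoutType] = None,
    inner_k_tiles: Optional[int] = None,
):
    """Hypothetical shim: accept the old keyword, but warn about it."""
    if inner_k_tiles is not None:
        if layout_type is not None:
            raise ValueError("pass either layout_type or inner_k_tiles, not both")
        warnings.warn(
            "inner_k_tiles is deprecated; pass "
            "layout_type=TensorCoreTiledLayoutType(inner_k_tiles=...) instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Translate the old keyword into the new layout object.
        layout_type = TensorCoreTiledLayoutType(inner_k_tiles=inner_k_tiles)
    if layout_type is None:
        layout_type = TensorCoreTiledLayoutType()
    # A real implementation would build the quantization callable here;
    # the sketch returns the resolved arguments so they can be inspected.
    return group_size, layout_type
```

The trade-off is extra code to carry and later delete, which may not be worthwhile when few call sites use the old keyword.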

pytorch-bot bot commented Aug 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/670

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 500a456 with merge base 88a263a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 13, 2024
@Diogo-V Diogo-V changed the title feat: add layout option to woq int4 api Add layout option to woq int4 api Aug 13, 2024
@@ -21,7 +21,11 @@
import torch.nn.functional as F
from typing import Any, Callable, Union, Dict, Optional

from torchao.dtypes import PlainLayoutType
from torchao.dtypes import (
to_affine_quantized,
Contributor:

thanks, if there is no circular dep you can remove the import from other functions as well, e.g. int8_weight_only

Contributor Author:

done

@jerryzh168 jerryzh168 added the topic: bc-breaking Use this tag if this PR breaks backward compatibility label Aug 13, 2024
jerryzh168 (Contributor) commented Aug 13, 2024

thanks. Although this is bc-breaking, I don't think many call sites change inner_k_tiles, so it probably won't affect many people

jerryzh168 (Contributor) commented Aug 14, 2024

could you write a bc-breaking note for this? See the format in the BC-breaking section of https://github.com/pytorch/ao/releases, something like:

# for torchao 0.5
...

# for torchao 0.4
...

@Diogo-V Diogo-V marked this pull request as ready for review August 14, 2024 00:21
Diogo-V (Contributor Author) commented Aug 14, 2024

on it! where should I place the bc-breaking notes? as a comment on this PR or someplace else?

Diogo-V (Contributor Author) commented Aug 14, 2024

  • inner_k_tiles was deprecated in favor of layout_type, enabling users to select from various layout options #670
# for torchao 0.5
from torchao.quantization import quantize_, int4_weight_only
from torchao.dtypes import TensorCoreTiledLayoutType
quantize_(my_model, int4_weight_only(layout_type=TensorCoreTiledLayoutType(inner_k_tiles=8)))

# for torchao 0.4
from torchao.quantization import quantize_, int4_weight_only
quantize_(my_model, int4_weight_only(inner_k_tiles=8))

@Diogo-V Diogo-V requested a review from jerryzh168 August 14, 2024 00:36
jerryzh168 (Contributor):

> on it! where should I place the bc-breaking notes? as a comment on this PR or someplace else?

you can put this in PR summary

jerryzh168 (Contributor) left a comment

looks good, thanks!

Diogo-V (Contributor Author) commented Aug 14, 2024

done ✅
feel free to merge it or let me know if there is anything else that I might need to change

msaroufim (Member):

@jerryzh168 do we need to communicate this change to anyone?

jerryzh168 (Contributor):

> @jerryzh168 do we need to communicate this change to anyone?

seems fine I think, I didn't see people using inner_k_tiles yet, but we can probably update huggingface when 0.5 is released

@jerryzh168 jerryzh168 merged commit 009f55f into pytorch:main Aug 14, 2024
14 checks passed
msaroufim added a commit that referenced this pull request Aug 14, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024
* add ET runner to benchmark

* remove spurios end

* add mps runner and groupsize kludge

* adjust groupsize

* fortify runners

* handle device for export_et