Add uint1 to uint7 dtypes #117208
Conversation
Summary: These dtypes are added since we see more demand for these sub-byte dtypes, especially with the popularity of LLMs (https://pytorch.org/blog/accelerating-generative-ai-2/#step-4-reducing-the-size-of-the-weights-even-more-with-int4-quantization-and-gptq-2021-toks). Note these are just placeholders; operator support for these dtypes will be implemented with tensor subclasses, e.g. torch.empty(..., dtype=torch.uint1) will return a tensor subclass of uint1 that supports operations like bitwise ops, add, mul etc. (to be added later). Also note that these are not quantized data types; we'll implement quantization logic with tensor subclasses backed by these dtypes as well, e.g. `Int4GroupedQuantization(torch.Tensor)` will be implemented with torch.uint4 tensors (see pytorch/ao#13 as an example). Test Plan: CIs
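To make the placeholder-plus-subclass plan concrete, here is a minimal sketch of a wrapper subclass over the new dtype; the `UInt1Tensor` name and the uint8-backed packing are illustrative assumptions, not code from this PR:

```python
import torch

class UInt1Tensor(torch.Tensor):
    @staticmethod
    def __new__(cls, packed: torch.Tensor):
        # one logical uint1 element per bit of the packed uint8 storage
        return torch.Tensor._make_wrapper_subclass(
            cls, (packed.numel() * 8,), dtype=torch.uint1
        )

    def __init__(self, packed: torch.Tensor):
        self.packed = packed  # the real bits live in a plain uint8 tensor

t = UInt1Tensor(torch.zeros(2, dtype=torch.uint8))
print(t.dtype, t.shape)  # torch.uint1 torch.Size([16])
```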
Add a test that the Python bindings are working and you can make a wrapper subclass with it
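A rough sketch of what such a test could look like (the landed version is test_uint1_7_dtype in test/test_quantization.py; this standalone variant is only illustrative):

```python
import torch

def test_uint1_7_bindings():
    for bits in range(1, 8):
        dt = getattr(torch, f"uint{bits}")  # torch.uint1 ... torch.uint7
        assert isinstance(dt, torch.dtype)

        # a wrapper subclass can be constructed with the new dtype
        class Wrapper(torch.Tensor):
            @staticmethod
            def __new__(cls, elem, dtype):
                return torch.Tensor._make_wrapper_subclass(cls, elem.shape, dtype=dtype)

        w = Wrapper(torch.empty(2, 2, dtype=torch.uint8), dt)
        assert w.dtype == dt

test_uint1_7_bindings()
```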
Should there also be opaque dtypes for 2-bit/3-bit/4-bit sub-byte types? I think the quantization efforts now go even there... Regarding 1-bit, there also exists "One-bit Adam" (I think it's even implemented somewhere in Meta's experimental optimizer repos), which uses 1-bit - that could be a good showcase for uint1 or for a bitmap/bittensor dtype.

Related: one nasty thing is that if you google uint1 or uint4, exotic 1-byte and 4-byte namings come out: https://people.montefiore.uliege.be/boigelot/research/lash/man/uint.html which is not very nice...

What are the presupposed indexing semantics for these sub-byte types (if any standardized indexing is supposed at all)? E.g. should uint1tensor[3] retrieve the third bit of the first byte, or the third encompassing byte?
yeah we have these in the PR - it adds all dtypes from uint1 to uint7
do you mean the lash-dtype naming?
I don't think we want to support sub-byte or non-byte-aligned indexing, but we can support byte-aligned indexing by unpacking, see: https://github.com/pytorch-labs/ao/pull/13/files#diff-109a7f01577eb57b0d9facb5e1c17c23158f544b7203cda513075487a389b2f6R160-R165
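For reference, a small sketch of byte-aligned access by unpacking, assuming two uint4 values per uint8 byte with the low nibble first (the layout in the linked code may differ):

```python
import torch

def unpack_uint4(packed: torch.Tensor) -> torch.Tensor:
    low = packed & 0x0F          # first uint4 value in each byte
    high = (packed >> 4) & 0x0F  # second uint4 value in each byte
    return torch.stack([low, high], dim=-1).flatten(-2)

packed = torch.tensor([0x21, 0x43], dtype=torch.uint8)
print(unpack_uint4(packed))  # tensor([1, 2, 3, 4], dtype=torch.uint8)
```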
Yeah, the indexing / virtual vs actual byte shape is important for use cases like BitTensor/BitMap, where semantically they want to be like a compressed BoolTensor.

Regarding the naming, I mean that uint1/uint4 would probably google badly, as there currently exist some trash mentions in other contexts where 1/4 stands for bytes and not bits, and uint4_t does not exist in the C context.

Regarding uint1, I would also suggest having some alias or subclass in core like torch.bit or torch.bitmap or torch.bitset or similar, which suggests a higher-level usage - a compressing BoolTensor might be a relatively frequent high-level use case, also with pack/unpack and RoaringBitmap-like ops, and then maybe some classical morphological / binary image processing ops, maybe also some LSH/hashing ops.
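To illustrate the compressed-BoolTensor idea with today's primitives, a hand-rolled pack/unpack over uint8 storage; pack_bits/unpack_bits are hypothetical helpers, and no torch.bit dtype exists:

```python
import torch

BIT_WEIGHTS = torch.tensor([1, 2, 4, 8, 16, 32, 64, 128], dtype=torch.uint8)

def pack_bits(mask: torch.Tensor) -> torch.Tensor:
    # 8 bools -> 1 byte (little-endian bit order within each byte)
    assert mask.dtype == torch.bool and mask.numel() % 8 == 0
    return (mask.view(-1, 8).to(torch.uint8) * BIT_WEIGHTS).sum(dim=1).to(torch.uint8)

def unpack_bits(packed: torch.Tensor) -> torch.Tensor:
    shifts = torch.arange(8, dtype=torch.uint8)
    return ((packed.unsqueeze(-1) >> shifts) & 1).bool().flatten()

mask = torch.rand(64) > 0.5
assert torch.equal(unpack_bits(pack_bits(mask)), mask)  # round-trips at 1/8 the storage
```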
I think uint1 to uint7 is consistent with C/C++ naming of uint8, uint16, uint32 dtypes
can you write a quick code example of what you want to do (what is the higher-level usage)? I think these could be built as tensor subclasses in general
For non-aligned indexing, I think the most plausible implementation strategy in this direction is an introduction of a 1/8 SymInt (similar to SingletonSymNode we use to represent ragged dimension). Then, if I have a 8 element uint1 tensor, the storage offset of the 1-index element is 1/8, 2-index is 2/8, and so forth. TBH, @jerryzh168 and co are not that interested in the packed bool tensor use case, so someone else is probably going to have to implement it.
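A pure-Python illustration of the offset arithmetic this implies (not a SymInt implementation): the integer part of i/8 selects the byte and the remainder selects the bit.

```python
def uint1_location(index: int) -> tuple[int, int]:
    # storage offset index/8 -> (byte, bit-within-byte)
    return divmod(index, 8)

assert uint1_location(3) == (0, 3)   # bit 3 of byte 0
assert uint1_location(11) == (1, 3)  # bit 3 of byte 1
```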
My vote is for torch.bit
Summary: As a follow-up to #117208, this PR adds a UInt4Tensor in Python; it can be used to construct a uint4 tensor and supports some basic operations like view, slice etc. We can extend this to support different quantized tensors as mentioned in the previous PR. Later:
* tensor factory support for uint4 and other sub-byte dtypes
* other sub-byte tensor subclass support

Test Plan: python test/test_tensors.py -k test_constructor
Pull Request resolved: #117557
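As a standalone illustration of byte-aligned slicing over packed uint4 storage (this helper approximates the idea; it is not UInt4Tensor's actual implementation):

```python
import torch

def slice_packed_uint4(packed: torch.Tensor, start: int, stop: int) -> torch.Tensor:
    # two uint4 elements per byte, so logical endpoints must be even
    assert start % 2 == 0 and stop % 2 == 0, "only byte-aligned slices"
    return packed[start // 2 : stop // 2]

packed = torch.arange(8, dtype=torch.uint8)  # backs 16 logical uint4 elements
print(slice_packed_uint4(packed, 4, 8))      # bytes 2..3 -> elements 4..7
```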
Summary: Similar to #117208, we want to add int1 to int7 for edge use cases for weight quantization (https://www.internalfb.com/diff/D62464487)
Test Plan: python test/test_quantization.py -k test_uint4_int4_dtype
Pull Request resolved: #136301
Approved by: https://github.com/ezyang
Summary: Similar to #117208, we want to add int1 to int7 for edge use cases for weight quantization
Test Plan: python test/test_quantization.py -k test_uint4_int4_dtype
Differential Revision: [D64344944](https://our.internmc.facebook.com/intern/diff/D64344944)
Pull Request resolved: #137928
Approved by: https://github.com/malfet
Stack from ghstack (oldest at bottom):
Summary:
These dtypes are added since we see more demand for these sub-byte dtypes, especially with
the popularity of LLMs (https://pytorch.org/blog/accelerating-generative-ai-2/#step-4-reducing-the-size-of-the-weights-even-more-with-int4-quantization-and-gptq-2021-toks).
Note these are just placeholders; operator support for these dtypes will be implemented with tensor subclasses,
e.g. torch.empty(..., dtype=torch.uint1) will return a tensor subclass of uint1 that supports operations like bitwise ops, add, mul etc. (to be added later).
Also note that these are not quantized data types; we'll implement quantization logic with tensor subclasses backed by these dtypes as well,
e.g. `Int4GroupedQuantization(torch.Tensor)` will be implemented with torch.uint4 tensors (see pytorch/ao#13 as an example).

Test Plan:
CIs
python test/test_quantization.py -k test_uint1_7_dtype
Reviewers:
Subscribers:
Tasks:
Tags: