[NF4] Add quantize_() API support for NF4
#1216
Merged
Background
When I was working on INT8 training stuff a while back, I realized there was no obstacle to making NF4 training work with the quantize_() API (i.e. swapping the plain tensor with a tensor subclass). This is because autograd + compile still works correctly with __torch_function__ (which might not have been the case back when NF4 was first implemented) -> this eliminates the need for a custom nn.Module for downstream users, as well as the need to explicitly call linear_nf4().

Usage
This would also compose nicely with LoRA, making QLoRA implementation more seamless. e.g.
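The composition above can be sketched with a self-contained toy. None of the names below are the real torchao API (NF4Tensor, Linear, quantize_, and the filter_fn parameter are all stand-ins for illustration): the point is the pattern this PR enables, where a quantize_()-style call swaps plain weights for a tensor subclass in place, and a filter lets LoRA adapter weights stay in full precision, which is exactly the QLoRA split.

```python
class NF4Tensor:
    """Toy stand-in for a quantized tensor subclass. In the real design the
    subclass overrides __torch_function__ so autograd + compile keep working;
    here it just wraps the data to make the swap visible."""
    def __init__(self, data, block_size=64):
        self.data = list(data)
        self.block_size = block_size  # illustrative quantization parameter

class Linear:
    """Toy module holding a plain weight and optional child modules."""
    def __init__(self, weight):
        self.weight = weight
        self.children = []

def quantize_(module, subclass, filter_fn=lambda m: True):
    """In-place API sketch: walk the module tree and swap each matching
    plain weight for the tensor subclass -- no custom nn.Module and no
    explicit linear_nf4() call needed."""
    if filter_fn(module):
        module.weight = subclass(module.weight)
    for child in module.children:
        quantize_(child, subclass, filter_fn)

# QLoRA-style composition: quantize the base layer to NF4, but leave the
# LoRA adapter's weight in full precision by filtering on a tag.
base = Linear([1.0, 2.0, 3.0])
adapter = Linear([0.0, 0.0, 0.0])
adapter.is_lora = True
base.children.append(adapter)

quantize_(base, NF4Tensor, filter_fn=lambda m: not getattr(m, "is_lora", False))

print(type(base.weight).__name__)     # the base weight is now an NF4Tensor
print(type(adapter.weight).__name__)  # the adapter weight is untouched
```

Because the swap is purely in-place on existing modules, the same model object can be handed to a LoRA wrapper before or after quantization, which is what makes the QLoRA combination seamless.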