Status of prototype features #1807
Comments
I don't think this should be a separate feature. If we decide to polish this, IMO it should be a setting in the regular float8/int8/mx flows to use fused eager mode kernels. I'd want to make sure overall UX does not regress with respect to activation checkpointing and that performance is actually compelling on real models important today (large enough gemms) before shipping this.
IMO the main blocker to promoting this from prototype is better composability with AC, for which we need to implement the feature request in pytorch/pytorch#144928. From my conversations with Jeffrey, my understanding is that he agrees it would be a useful feature, has some ideas in mind about how to implement it, and is planning to do it some time this half (cc @soulitzer please correct me if I'm mistaken about this).
This is an interesting idea as well, I'd be interested in exploring that once the AC API described in the feature request has landed.
+1
Agree with you on the things we should deprecate/delete. We can perhaps do them during our next BE day/week? cc @andrewor14 For the rest, like quantization algorithms or sparsity, I'll defer to @jerryzh168 and @jcaip to share their thoughts.
cc @msaroufim For sparsity: 2:4, marlin, and BSR have all been promoted out of prototype; the only things that remain are:
I was parsing through our prototype folder and wanted to give my take on what should be promoted, deleted or requires further discussion
correct and pass reference checks relative to the original repos, we should promote those out of prototype. In particular, the benefits we should lean into are accelerated performance using torch.compile and serialization support with the HF hub @jerryzh168
I'd love to hear more from folks, especially if you disagree with anything!
cc @supriyar @jerryzh168 @drisspg @vkuzo @gau-nernst