Support for T5 Architecture #384
Hi @niranjanakella! Thank you for opening this issue. Just to clarify, would this be a quantized or non-quantized implementation?
@EricLBuehler A non-quantized f16/f32 implementation takes precedence for now, but if possible, a quantized implementation would also be welcome. I would also like to know whether LoRA adapters can be loaded at runtime without merging them into the model. That would be a huge game changer for many applications, given that developers often train multiple adapters. It would be great to be able to attach several adapters at runtime.
Sounds great, I'll get started on an implementation.
We actually have this feature already! There are 2 ways to do this:
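The two ways are not spelled out in this thread, but as a rough sketch of the underlying idea being requested (applying a LoRA delta at inference time rather than merging it into the base weights), here is a minimal, self-contained Rust example. This is purely illustrative and not the mistral.rs API; every type and function name below is hypothetical.

```rust
// Illustrative sketch only (hypothetical names, not the mistral.rs API).
// A LoRA adapter contributes y = W x + scale * B (A x), so the low-rank
// (A, B) pair stays separate from the frozen base weight W and can be
// attached, swapped, or stacked at runtime without merging.

type Mat = Vec<Vec<f32>>; // row-major matrix

fn matvec(m: &Mat, x: &[f32]) -> Vec<f32> {
    m.iter()
        .map(|row| row.iter().zip(x).map(|(w, v)| w * v).sum())
        .collect()
}

struct LoraAdapter {
    a: Mat,     // r x in_dim (down-projection)
    b: Mat,     // out_dim x r (up-projection)
    scale: f32, // alpha / r
}

struct LinearWithAdapters {
    w: Mat,                     // out_dim x in_dim base weight, never modified
    adapters: Vec<LoraAdapter>, // adapters currently attached, selectable at runtime
}

impl LinearWithAdapters {
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        let mut y = matvec(&self.w, x); // base path
        for ad in &self.adapters {
            let h = matvec(&ad.a, x);      // project down to rank r
            let delta = matvec(&ad.b, &h); // project back up
            for (yi, di) in y.iter_mut().zip(delta) {
                *yi += ad.scale * di;      // add adapter contribution
            }
        }
        y
    }
}

fn main() {
    // Toy 2x2 identity base weight with a single rank-1 adapter attached.
    let layer = LinearWithAdapters {
        w: vec![vec![1.0, 0.0], vec![0.0, 1.0]],
        adapters: vec![LoraAdapter {
            a: vec![vec![1.0, 1.0]],       // 1 x 2
            b: vec![vec![0.5], vec![0.5]], // 2 x 1
            scale: 1.0,
        }],
    };
    println!("{:?}", layer.forward(&[1.0, 2.0])); // [2.5, 3.5]
}
```

Because the base weight is left untouched, serving multiple adapters amounts to choosing which `(A, B)` pairs to apply for a given request, which is why no merge step is needed.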
Hi @niranjanakella! Sorry for the delay; I have been busy with the Idefics 2 implementation (#309). I should have a prototype ready tonight, though!
@EricLBuehler No problem, sounds good. I am looking forward to trying it out soon.
See: #432.
Hello @EricLBuehler, opening this issue to track support for the T5 Seq2Seq model architecture in mistral.rs, as discussed.
Relates to: #156