Conversation

Contributor

keithschacht commented Jul 2, 2024

docs: update description of preparing quantized model in usage section.

Correct some outdated references to files within the llama.cpp repo, and update the example to use a smaller model.
@keithschacht
Contributor Author

@yoshoku I don't understand your commitlint. I get the line-length rule, but it's also referring to some subject and type, which eludes me. You're welcome to reject this PR if you don't care about updating this. I was mostly keeping these notes for myself as I was trying to get your project working, since some of the references had changed within llama_cpp.

Owner

yoshoku commented Jul 3, 2024

@krschacht Thank you for your contribution. llama_cpp.rb adopts Conventional Commits (https://www.conventionalcommits.org/en/v1.0.0/). For example, the commit message for this change might be: docs: update description of preparing quantized model in usage section.
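
(For reference, the message structure that spec defines is:

```
<type>[optional scope]: <description>

[optional body]

[optional footer(s)]
```

so a docs type plus a short description, as in the example above, should satisfy the commitlint check.)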

keithschacht changed the title from "Correct README.md" to "docs: update README on preparing quantized model" Jul 3, 2024
Contributor Author

keithschacht commented Jul 3, 2024

@yoshoku Ah, got it. Somehow I never knew about conventional commits! TIL :) I just updated the PR so hopefully it's ready to go.

BTW, do you have other plans for this project? I was very excited to find this. I found it while looking for the Ruby equivalent of Python's https://github.com/jncraton/languagemodels. I really like that this pip package uses CTranslate2 as the backend, and I haven't found a Ruby gem providing bindings for CTranslate2.

The one issue I ran into with your project was when I tried to load a model like LaMini-Flan-T5-248M. I'm new to this space, but apparently Llama is its own model architecture whereas T5 is a different one, so I can't use llama_cpp.rb to run a T5 model.

Owner

yoshoku commented Jul 4, 2024

@krschacht I wanted you to fix the git commit message, not the pull request description. But I understood the gist of the pull request, so I fixed the README with you as a co-author in 8edfd6d. I am going to close this pull request, but please do not take it personally.

yoshoku closed this Jul 4, 2024
@keithschacht
Contributor Author

@yoshoku I don't mind at all! Linters... I'm not looking for points. :) I also run a project and I jump in to help get PRs over the line all the time. It's often easier.

I really am interested in whether you have other plans for this project. I was thinking about trying to create a version of what you did, but for CTranslate2. But maybe I'm wrong and your llama_cpp bindings can also work for T5 models? Anyway, I'm curious where you plan to take your project.

Owner

yoshoku commented Jul 5, 2024

@krschacht I think it would be a good idea to create bindings for CTranslate2, but I am pretty busy these days so I probably will not have time to do it.
llama.cpp only recently added support for the T5 architecture, so llama_cpp.rb does not yet support it: ggml-org/llama.cpp#8141. I plan to add bindings for newly added functions such as llama_model_has_encoder, but I cannot guarantee that the example scripts will also support the T5 architecture, as this depends on my free time.
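
(For context, here is a minimal sketch of how that newly added upstream function is used via the llama.cpp C API. This assumes a llama.cpp build that already includes ggml-org/llama.cpp#8141, and "model.gguf" is just an illustrative path, not a real file:)

```c
// Sketch only: detect whether a GGUF model is encoder-decoder (T5-style).
// Assumes llama.cpp with ggml-org/llama.cpp#8141 merged; "model.gguf" is illustrative.
#include <stdio.h>
#include "llama.h"

int main(void) {
    llama_backend_init();

    struct llama_model_params params = llama_model_default_params();
    struct llama_model * model = llama_load_model_from_file("model.gguf", params);
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // T5-style encoder-decoder models return true and need a llama_encode
    // pass before llama_decode; LLaMA-style decoder-only models return false.
    if (llama_model_has_encoder(model)) {
        printf("encoder-decoder model (e.g. T5)\n");
    } else {
        printf("decoder-only model (e.g. LLaMA)\n");
    }

    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

(A Ruby binding would presumably wrap llama_model_has_encoder as a method on the model object, but as noted above, that binding does not exist in llama_cpp.rb yet.)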

@keithschacht
Contributor Author

I didn't realize llama_cpp had recently added support for T5. I found an interesting thread discussing the performance of CTranslate2 vs. llama_cpp, and it sounds like some of the performance optimizations within CTranslate2 are already in llama_cpp, so the performance may not be that different.

Anyway, I appreciate you creating this project and sharing this insight. Consider me a motivated and interested "user" in case you ever need help with testing, bugs, implementing specific pieces, etc. I'll keep playing with llama_cpp.rb now that I have it all working!
