Conversation

Contributor

keithschacht commented Jul 2, 2024

docs: update description of preparing quantized model in usage section.

Correct some outdated references to files within the llama.cpp repo, and update the example to use a smaller model.
@keithschacht
Contributor Author

@yoshoku I don't understand your commitlint. I get the line-length rule, but it's also referring to some subject and type, which eludes me. You're welcome to reject this PR if you don't care about updating this. I was mostly keeping these notes for myself as I was trying to get your project working, since some of the references had changed within llama_cpp.

Owner

yoshoku commented Jul 3, 2024

@krschacht Thank you for your contribution. llama_cpp.rb adopts Conventional Commits (https://www.conventionalcommits.org/en/v1.0.0/). For example, the commit message for this change might be: docs: update description of preparing quantized model in usage section.
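
(For reference, the message structure that spec defines is:

```
<type>[optional scope]: <description>

[optional body]

[optional footer(s)]
```

so a docs type plus a short description, as in the example above, should satisfy the commitlint check.)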

keithschacht changed the title from "Correct README.md" to "docs: update README on preparing quantized model" Jul 3, 2024
Contributor Author

keithschacht commented Jul 3, 2024

@yoshoku Ah, got it. Somehow I never knew about conventional commits! TIL :) I just updated the PR so hopefully it's ready to go.

BTW, do you have other plans for this project? I was very excited to find this. I found it while looking for the Ruby equivalent of Python's https://github.com/jncraton/languagemodels. I really like that this pip package uses CTranslate2 as the backend, and I haven't found a Ruby gem providing bindings for CTranslate2.

The one issue I ran into with your project was when I tried to load a model like LaMini-Flan-T5-248M. I'm new to this space, but apparently Llama is its own model architecture whereas T5 is a different one, so I can't use llama_cpp.rb to run a T5 model.

Owner

yoshoku commented Jul 4, 2024

@krschacht I wanted you to fix the git commit message, not the pull request description. But I understood the gist of the pull request, so I fixed the README with you as a co-author in 8edfd6d. I am going to close this pull request, but please do not take it personally.

yoshoku closed this Jul 4, 2024
@keithschacht
Contributor Author

@yoshoku I don't mind at all! Linters... I'm not looking for points. :) I also run a project and I jump in to help get PRs over the line all the time. It's often easier.

I really am interested in whether you have other plans for this project. I was thinking about trying to create a version of what you did, but for CTranslate2. But maybe I'm wrong and your llama_cpp bindings can also work for T5 models? Anyway, I'm curious where you plan to take your project.

Owner

yoshoku commented Jul 5, 2024

@krschacht I think it would be a good idea to create bindings for CTranslate2, but I am pretty busy these days so I probably will not have time to do it.
llama.cpp only recently added support for the T5 architecture, so llama_cpp.rb does not yet support it: ggml-org/llama.cpp#8141. I plan to add bindings for newly added functions such as llama_model_has_encoder, but I cannot guarantee that the example scripts will also support the T5 architecture, as this depends on my free time.
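
(For context, here is a minimal sketch of how that newly added upstream function is used via the llama.cpp C API. This assumes a llama.cpp build that already includes ggml-org/llama.cpp#8141, and "model.gguf" is just an illustrative path, not a real file:)

```c
// Sketch only: detect whether a GGUF model is encoder-decoder (T5-style).
// Assumes llama.cpp with ggml-org/llama.cpp#8141 merged; "model.gguf" is illustrative.
#include <stdio.h>
#include "llama.h"

int main(void) {
    llama_backend_init();

    struct llama_model_params params = llama_model_default_params();
    struct llama_model * model = llama_load_model_from_file("model.gguf", params);
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // T5-style encoder-decoder models return true and need a llama_encode
    // pass before llama_decode; LLaMA-style decoder-only models return false.
    if (llama_model_has_encoder(model)) {
        printf("encoder-decoder model (e.g. T5)\n");
    } else {
        printf("decoder-only model (e.g. LLaMA)\n");
    }

    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

(A Ruby binding would presumably wrap llama_model_has_encoder as a method on the model object, but as noted above, that binding does not exist in llama_cpp.rb yet.)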

@keithschacht
Contributor Author

I didn't realize llama_cpp had recently added support for T5. I found an interesting thread discussing the performance of CTranslate2 vs. llama_cpp, and it sounds like some of the performance optimizations within CTranslate2 are already in llama_cpp, so the performance may not be that different.

Anyway, I appreciate you creating this project and sharing this insight. Consider me a motivated and interested "user" in case you ever need help with testing, bugs, implementing specific pieces, etc. I'll keep playing with llama_cpp.rb now that I have it all working!
