Conversation

cebtenzzre (Member)

Building on ggml-org/llama.cpp#4978 and ggml-org/llama.cpp#5650, I was finally able to implement a version of ggml-org/llama.cpp#3626 that upstream accepted, in ggml-org/llama.cpp#5670.

Now MPT Chat has gone from 3.64 GiB to 3.54 GiB on disk, without breaking upstream compatibility in either direction.

cebtenzzre requested a review from manyoso on February 22, 2024 at 22:20
cebtenzzre (Member, Author)

Due to the model3.json change, I'll hold off on merging this until we're ready to make a new release.
