Conversation

@rampa3 (Contributor) commented Jan 17, 2026

Description
This PR adds the Q4_K_M GGUF quantization of Qwen3-Coder-30B-A3B-Instruct from Unsloth to the model gallery, based on a request in the #model-requests channel of the LocalAI Discord.

Notes for Reviewers
Model installation was tested on a locally hosted gallery, and the merge syntax was compared structurally against other Qwen 3 merge entries before opening the PR.
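For reference, a rough sketch of how a locally hosted gallery can be wired up for this kind of test. This assumes LocalAI's GALLERIES environment variable, which takes a JSON list of gallery sources; the image tag and the URL of the locally served index.yaml are placeholders, not the setup actually used here:

# Hypothetical docker-compose sketch for testing a gallery fork locally.
# Assumes GALLERIES accepts a JSON list of {name, url} gallery sources;
# the URL below stands in for wherever the forked index.yaml is served.
services:
  localai:
    image: localai/localai:latest
    ports:
      - "8080:8080"
    environment:
      - 'GALLERIES=[{"name":"test-gallery","url":"http://host.docker.internal:8000/index.yaml"}]'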

Signed commits

  • Yes, I signed my commits.

… request

Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>
netlify bot commented Jan 17, 2026

Deploy Preview for localai ready!

🔨 Latest commit: 37375b7
🔍 Latest deploy log: https://app.netlify.com/projects/localai/deploys/696bf7d766049f0008d6ab5f
😎 Deploy Preview: https://deploy-preview-8082--localai.netlify.app

      uri: huggingface://mradermacher/boomerang-qwen3-4.9B-GGUF/boomerang-qwen3-4.9B.Q4_K_M.gguf
- !!merge <<: *qwen3
  name: "qwen3-coder-30b-a3b-instruct"
  icon: https://cdn-avatars.huggingface.co/v1/production/uploads/620760a26e3b7210c2ff1943/-s1gyJfvbE1RgO5iBeNOi.png
@mudler (Owner) commented

A model config url is missing here. Take a look at #8088 for an example of a model import.

@rampa3 (Contributor, Author) commented

Whoops! Let me fix it quickly.

@rampa3 (Contributor, Author) commented

Added url: "github:mudler/LocalAI/gallery/qwen3.yaml@master".
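For context, the amended entry would look roughly like this (a sketch pieced together from the review snippet above; the remaining fields of the real entry, such as the file list, are omitted):

- !!merge <<: *qwen3
  url: "github:mudler/LocalAI/gallery/qwen3.yaml@master"
  name: "qwen3-coder-30b-a3b-instruct"
  icon: https://cdn-avatars.huggingface.co/v1/production/uploads/620760a26e3b7210c2ff1943/-s1gyJfvbE1RgO5iBeNOi.png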

@rampa3 (Contributor, Author) commented

I thought this carried over from the main Qwen 3 entry, since it is a merged entry. My bad.

@mudler (Owner) commented Jan 17, 2026

Ouch, you are actually right. I was misled by the description - I thought you were adding the model without the anchor. You are right: if there is an anchor, everything is carried over and the other entries are overridden. Sorry!
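For anyone reading along, a minimal sketch of the YAML merge-key behavior being described here; the field values are illustrative, not the exact gallery entries:

- &qwen3                        # base entry, anchored for reuse
  url: "github:mudler/LocalAI/gallery/qwen3.yaml@master"
  license: apache-2.0
- !!merge <<: *qwen3            # copies every key from the anchored entry...
  name: "qwen3-coder-30b-a3b-instruct"   # ...while keys set locally override it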

@rampa3 (Contributor, Author) commented

Should I undo it then?

Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>
@mudler merged commit 897ad17 into mudler:master Jan 18, 2026
32 checks passed
@mudler (Owner) commented Jan 18, 2026

Thank you @rampa3!

@rampa3 deleted the add_Qwen3-Coder-30B-A3B-Instruct branch January 18, 2026 08:32
@reneleonhardt (Contributor) commented

@rampa3 Thank you very much!! Is the awesome Qwen3-Next in the #model-requests channel by any chance? 😄

The MLX 4-bit quantization is only 45 GB, for example, and runs smooth as butter.
https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct

@rampa3 (Contributor, Author) commented Jan 18, 2026

> @rampa3 Thank you very much!! Is the awesome Qwen3-Next in the #model-requests channel by any chance? 😄
>
> The MLX 4-bit quantization is only 45 GB, for example, and runs smooth as butter. https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct

Not sure... Though if it is not, you will have to ask someone else to add it, even in GGUF Q4_K_M - that quantization weighs 48.5 GB, so I could not test a model config I wrote for it, since my system has 32 GB of RAM. Already with qwen3-coder-30b-a3b-instruct I was waiting for the kernel to pull the plug on it, as it sat at almost 100% RAM together with the rest of the stuff I run in the background on my laptop.

Also, I am a PC user, so I cannot do anything with MLX. When I test models, I do so against cpu-llama-cpp, so I hope there won't be any hitches that don't happen on the PC side.

@rampa3 (Contributor, Author) commented Jan 18, 2026

Just checked - it is not there, @reneleonhardt. I have submitted a request for it there.
