fix: update GGUF save paths to use ~/.unsloth/llama.cpp with Windows support #4138

Merged: danielhanchen merged 8 commits into main from feature/llama-cpp-windows-support on Mar 3, 2026

@rolandtannous
Contributor

Description

Aligns save_pretrained_gguf and push_to_hub_gguf in unsloth/save.py with the updated unsloth_zoo/llama_cpp.py, which now builds and installs llama.cpp components into ~/.unsloth/llama.cpp instead of the current working directory.

Changes

  • Import LLAMA_CPP_DEFAULT_DIR and IS_WINDOWS from unsloth_zoo.llama_cpp to reference the correct llama.cpp install path
  • Update example usage paths in save_pretrained_gguf to use os.path.join with platform-correct binary locations (build/bin/Release/ on Windows, root dir on Linux) and .exe suffix on Windows
  • Update error message in save_to_gguf to provide platform-appropriate manual build instructions (cmake on Windows, make on Linux)
  • Update README template in push_to_hub_gguf to use platform-neutral binary names without hardcoded path prefixes
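The path rules in the second bullet can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `example_binary_path` is hypothetical, `DEFAULT_DIR` is reconstructed from the ~/.unsloth/llama.cpp location named above rather than imported from unsloth_zoo, and `llama-cli` is just a sample binary name.

```python
import os

# Assumption: mirrors the ~/.unsloth/llama.cpp location described above;
# the real constant (LLAMA_CPP_DEFAULT_DIR) lives in unsloth_zoo.llama_cpp.
DEFAULT_DIR = os.path.join(os.path.expanduser("~"), ".unsloth", "llama.cpp")


def example_binary_path(name, is_windows, base_dir=DEFAULT_DIR):
    """Hypothetical helper: build a platform-appropriate path to a llama.cpp binary."""
    if is_windows:
        # Windows builds place binaries under build/bin/Release and add .exe
        return os.path.join(base_dir, "build", "bin", "Release", name + ".exe")
    # On Linux the binary sits in the install root
    return os.path.join(base_dir, name)


print(example_binary_path("llama-cli", is_windows=False))
print(example_binary_path("llama-cli", is_windows=True))
```

Passing `is_windows` explicitly keeps the sketch easy to check on any machine; the PR itself relies on the imported IS_WINDOWS constant.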

What did NOT change

The core GGUF conversion logic (save_to_gguf → check_llama_cpp / install_llama_cpp / convert_to_gguf / quantize_gguf) already delegates to unsloth_zoo.llama_cpp functions, which transparently pick up the new ~/.unsloth/llama.cpp path. No logic changes were needed, only changes to user-facing strings.

Related

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the GGUF saving and pushing functionalities to seamlessly integrate with a new, standardized llama.cpp installation path located at ~/.unsloth/llama.cpp. It significantly improves cross-platform compatibility by introducing Windows-specific handling for binary paths and compilation instructions, ensuring a more robust user experience for GGUF model conversion and sharing.

Highlights

  • llama.cpp path alignment: Updated GGUF save and push functions to align with the new ~/.unsloth/llama.cpp installation directory for llama.cpp components.
  • Windows compatibility: Introduced Windows-specific logic for llama.cpp binary paths, .exe suffixes, and manual compilation instructions.
  • Improved error messages: Enhanced error messages for save_to_gguf to provide platform-appropriate manual build commands for llama.cpp.
  • README template updates: Modified the README template in push_to_hub_gguf to use platform-neutral llama.cpp binary names without hardcoded path prefixes.

Changelog
  • unsloth/save.py
    • Imported LLAMA_CPP_DEFAULT_DIR and IS_WINDOWS constants.
    • Modified the save_to_gguf error message to provide platform-specific llama.cpp compilation instructions (cmake for Windows, make for Linux).
    • Updated example usage strings in unsloth_save_pretrained_gguf to correctly reference llama.cpp binaries based on the operating system and the new default installation directory.
    • Adjusted the README template in unsloth_push_to_hub_gguf to remove hardcoded llama.cpp/ prefixes from example commands, making them platform-agnostic.
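
A rough sketch of how a platform-specific build hint with a quoted install path might be assembled. The exact strings in save.py are not shown in this thread, so everything here is an assumption: `manual_build_hint` is a hypothetical helper, and the cmake/make commands are the generic llama.cpp build steps rather than the wording the PR uses.

```python
import shlex


def manual_build_hint(llama_cpp_dir, is_windows):
    """Hypothetical helper: return illustrative manual build steps for llama.cpp."""
    if is_windows:
        # Double quotes keep paths with spaces intact in cmd.exe and PowerShell
        steps = [
            f'cd "{llama_cpp_dir}"',
            "cmake -B build",
            "cmake --build build --config Release",
        ]
    else:
        # shlex.quote adds single quotes only when the path needs them
        steps = [f"cd {shlex.quote(llama_cpp_dir)}", "make"]
    return "\n".join(steps)


print(manual_build_hint("/home/my user/.unsloth/llama.cpp", is_windows=False))
```

Building the message in one helper, with a single branch on the platform, also avoids repeating the instructions at each error site.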
Activity
  • No specific activity has been recorded for this pull request yet.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

The pull request updates GGUF save paths to use ~/.unsloth/llama.cpp and adds Windows support, which is a great improvement for cross-platform compatibility. The changes are well-aligned with the description, updating error messages and example usage paths to be platform-aware. I've found a small area for improvement regarding code duplication in the error handling logic, which can be refactored for better maintainability. Overall, this is a solid contribution.

Comment thread unsloth/save.py Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9241e66be9

Comment thread unsloth/save.py Outdated
Collaborator

@Datta0 Datta0 left a comment

LGTM

danielhanchen and others added 2 commits March 3, 2026 10:47
- H4: Add defensive try/except for LLAMA_CPP_DEFAULT_DIR and IS_WINDOWS imports
  with fallback defaults, so save.py works even if zoo PR #526 is not merged yet
- H5: Fix Kaggle error path using plain "Error: {e}" instead of f"Error: {e}",
  so the actual exception is shown to users
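
The H5 bug is easy to reproduce in isolation. The two functions below are hypothetical stand-ins for the Kaggle error path, not the code in save.py; they only show why the missing f prefix hid the exception.

```python
def error_message_buggy(e):
    # Plain string: "{e}" is kept literally, so the exception text is lost
    return "Error: {e}"


def error_message_fixed(e):
    # f-string: "{e}" is replaced with str(e)
    return f"Error: {e}"


exc = ValueError("disk full")
print(error_message_buggy(exc))  # Error: {e}
print(error_message_fixed(exc))  # Error: disk full
```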
@danielhanchen
Copy link
Copy Markdown
Member

Review: Fixes applied

Pushed fixes for the issues identified during review of the companion zoo PR #526:

Fixes in this commit

  • H4: Defensive try/except for LLAMA_CPP_DEFAULT_DIR and IS_WINDOWS imports, with fallback defaults so save.py works even if zoo PR #526 is not yet merged
  • H5: Fixed Kaggle error path using plain "Error: {e}" instead of f"Error: {e}" -- the actual exception was never shown to users
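
The H4 pattern might look roughly like the sketch below. The fallback values are assumptions based on the ~/.unsloth/llama.cpp path discussed in this PR; the actual defaults chosen in save.py are not shown here.

```python
import os
import sys

try:
    # Available once the companion unsloth_zoo change is installed
    from unsloth_zoo.llama_cpp import LLAMA_CPP_DEFAULT_DIR, IS_WINDOWS
except ImportError:
    # Assumed fallbacks so the module still imports against an older unsloth_zoo
    LLAMA_CPP_DEFAULT_DIR = os.path.join(
        os.path.expanduser("~"), ".unsloth", "llama.cpp"
    )
    IS_WINDOWS = sys.platform == "win32"

print(LLAMA_CPP_DEFAULT_DIR, IS_WINDOWS)
```

Catching ImportError covers both a missing package and a package that lacks the two names, since "cannot import name" is raised as ImportError as well.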

Testing

  • Llama-3.2-1B: train (21 steps) + GGUF export (q8_0) -- PASS (1260 MB .gguf file)
  • Gemma3-4B Vision: train (21 steps) + GGUF export (q8_0) -- PASS (3939 MB text + 812 MB mmproj)
  • Both correctly use ~/.unsloth/llama.cpp as install location
  • Both correctly show the new path in example usage output

Note

The Gemma3 compiler has a pre-existing syntax error bug: an unmatched ")" at line 843 of the generated unsloth_compiled_module_gemma3.py. This is a main-branch issue unrelated to this PR, and it requires UNSLOTH_COMPILE_DISABLE=1 as a workaround.
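
For anyone reproducing the Gemma3 test, the workaround can be applied from Python as sketched below. It is an assumption here that the flag has to be set before unsloth is imported; exporting the variable in the shell before launching the script is equivalent.

```python
import os

# Assumption: the flag is read when unsloth is imported, so set it first
os.environ["UNSLOTH_COMPILE_DISABLE"] = "1"

# import unsloth  # would follow here in a real training script
print(os.environ["UNSLOTH_COMPILE_DISABLE"])
```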

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5915d43742

Comment thread unsloth/save.py
@danielhanchen danielhanchen merged commit 5738d36 into main Mar 3, 2026
4 checks passed
@danielhanchen danielhanchen deleted the feature/llama-cpp-windows-support branch March 3, 2026 14:34
abiswas-realadvice pushed a commit to abiswas-realadvice/unsloth that referenced this pull request May 14, 2026
…support (unslothai#4138)

* fix: update GGUF save paths to use ~/.unsloth/llama.cpp with Windows support

* fix: quote LLAMA_CPP_DEFAULT_DIR in fallback shell commands to handle paths with spaces

* refactor: deduplicate platform-specific build instructions in quantization error message

* chore: remove accidentally committed PR description file

* Fix import safety and f-string bugs in save.py

- H4: Add defensive try/except for LLAMA_CPP_DEFAULT_DIR and IS_WINDOWS imports
  with fallback defaults, so save.py works even if zoo PR unslothai#526 is not merged yet
- H5: Fix Kaggle error path using plain "Error: {e}" instead of f"Error: {e}",
  so the actual exception is shown to users

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>