
ci: stop a partial mmproj cache from poisoning Mac Studio GGUF CI #5459

Merged: danielhanchen merged 1 commit into main from ci-bump-mmproj-cache-version-after-partial on May 15, 2026

@danielhanchen
Member

Summary

  • The "JSON, images" Mac Studio GGUF CI job hit a stale cache entry for the multimodal gemma-4-E2B-it bundle that contains only the main GGUF, not the mmproj sibling. Because `cache-hit == 'true'`, the download step was skipped, and the next step's `ls` exited non-zero on the missing `mmproj-F16.gguf`. This reproduces across PRs because the cache key depends only on env vars, not on the branch.
  • Three layered guards: bump cache key v1 -> v2 to invalidate the poisoned entry; add a `verify-cache` step that requires BOTH files before trusting cache-hit, falling through to download otherwise; add a `hashFiles` mmproj-presence check on the save step so a partial mmproj download never lands back in the cache.

Test plan

  • Mac Studio GGUF CI "JSON, images" job passes on dependent PRs.

The "JSON, images" Mac Studio GGUF CI job hit a stale cache for
${{ runner.os }}-gguf-...-mmproj-F16.gguf-v1 that contains only the
main GGUF, not the mmproj sibling. cache-hit==true so the download
step was skipped, then the post-load `ls` failed:
  ls: ...gguf-cache/mmproj-F16.gguf: No such file or directory

Three guards layered:

1) Bump cache key v1 -> v2 to invalidate the poisoned entry on the
   GitHub-hosted side.
2) New verify-cache step explicitly checks BOTH files are present
   before trusting cache-hit. If not, fall through to download.
3) Save step gains a hashFiles() check on the mmproj path so a
   partial mmproj download cannot land back in the cache.

Behaviour on a clean run is unchanged: cache hit + verify ok skips
the re-download, a partial hit triggers a fresh download, and a
successful download saves a complete archive.
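
The verify-then-fall-through behaviour of guard 2 could look roughly like the shell sketch below. It is only an illustration under stated assumptions: the function names `cache_is_complete` and `should_skip_download`, and the idea of passing the cache directory and file names as arguments, are hypothetical and not taken from the workflow itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the workflow): succeed only when every
# expected file exists in the cache directory and is non-empty.
cache_is_complete() {
  local dir="$1"; shift
  local f
  for f in "$@"; do
    # -s is true only for an existing file with size greater than zero.
    [ -s "$dir/$f" ] || return 1
  done
  return 0
}

# Hypothetical decision function: print "true" when the download can be
# skipped, "false" when the job should fall through to a fresh download.
# $1 is the cache-hit string reported by the restore step.
should_skip_download() {
  local cache_hit="$1" dir="$2"; shift 2
  if [ "$cache_hit" = "true" ] && cache_is_complete "$dir" "$@"; then
    echo "true"
  else
    echo "false"
  fi
}
```

In a workflow step, the printed value could be appended to `$GITHUB_OUTPUT` so that the download step's `if:` condition reads it instead of trusting `cache-hit` alone.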
@gemini-code-assist
Contributor

Note

Gemini is unable to generate a review for this pull request due to the file types involved not being currently supported.

@danielhanchen danielhanchen merged commit 90ac4c8 into main May 15, 2026
7 of 9 checks passed
@danielhanchen danielhanchen deleted the ci-bump-mmproj-cache-version-after-partial branch May 15, 2026 18:02

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 319dcb3103

  # poisoning the cache for the next run.
  - name: Save GGUF + mmproj files
-   if: always() && steps.download-gguf.outcome != 'skipped' && hashFiles('gguf-cache/**/*.gguf') != ''
+   if: always() && steps.download-gguf.outcome != 'skipped' && hashFiles('gguf-cache/**/*.gguf') != '' && hashFiles(format('gguf-cache/{0}', env.MMPROJ_FILE)) != ''

P2: Require the model GGUF before saving the cache

In the JSON/images workflow, a run where mmproj-F16.gguf downloads successfully but the main GGUF download fails or is cancelled still satisfies this `if:`. The mmproj file itself matches hashFiles('gguf-cache/**/*.gguf'), and the added check only proves that the mmproj exists. That allows actions/cache/save to publish a v2 cache containing only the mmproj, which the new verify step will reject on every later run, forcing a full re-download instead of ever producing a usable cache. Require GGUF_FILE explicitly here as well.
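
The gap described in this review comment can be reproduced outside GitHub Actions with a small shell sketch. This is an approximation, not the workflow's code: `pr_gate` mimics the merged condition (any `.gguf` under the directory, plus the mmproj file), `strict_gate` mimics the suggested fix (both named files), and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Approximation of the merged save condition: some .gguf file exists
# anywhere under the cache directory AND the mmproj file exists.
# The mmproj file itself satisfies the first half of this check.
pr_gate() {
  local dir="$1" mmproj="$2"
  [ -n "$(find "$dir" -type f -name '*.gguf' 2>/dev/null | head -n 1)" ] \
    && [ -s "$dir/$mmproj" ]
}

# Approximation of the suggested fix: require the main model file and
# the mmproj file by name, so neither can stand in for the other.
strict_gate() {
  local dir="$1" model="$2" mmproj="$3"
  [ -s "$dir/$model" ] && [ -s "$dir/$mmproj" ]
}
```

With a directory that holds only `mmproj-F16.gguf`, `pr_gate` succeeds while `strict_gate` fails, which is the partial-save case the comment warns about.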

Stanley00 pushed a commit to stanley-fork/unsloth that referenced this pull request May 16, 2026
…othai#5475)

The `JSON, images` job in `studio-mac-inference-smoke.yml` (Job 3
of Mac Studio GGUF CI) downloads ~4 GB on a cache miss: 3 GB
gemma-4-E2B-it-UD-Q4_K_XL.gguf + ~1 GB mmproj-F16.gguf. The 30 min
cap was tight even with `HF_HUB_ENABLE_HF_TRANSFER=1` and parallel
downloads, and timed out the cache-miss run on PR unslothai#5430 mid-download
(run 25950714888) before Studio install or the smoke assertions ran.

Once the actions/cache restore hits, the job comes in under 10 min,
so 45 min only costs runner time on the first run after a cache
key bump (v1->v2 was just bumped in unslothai#5459, which is what produced
this failure). Jobs 1 (openai-anthropic, 270M model) and 2
(tool-calling, ~1.5 GB model) are not bumped -- their 25 min cap
has been comfortable.