Add Studio PR-time CI: pin enforcement, frontend, backend, wheel smoke #5298

Merged
danielhanchen merged 24 commits into main from ci/studio-pr-checks
May 6, 2026

@danielhanchen
Member

Summary

The repo currently has no PR-time CI. Only release-desktop.yml (manual workflow_dispatch) and stale.yml (issue pinger) exist. studio/backend/tests/ already contains 35 test files (~860 tests collected) that never run automatically. studio/frontend/package.json has lint, typecheck, biome:check, and build scripts that are not gated on PRs either. That gap is what let 2026.5.1 ship with the broken Studio chat-history bundle.

This PR adds four ubuntu-latest workflows. All CPU-only, free for the public repo, and validated locally before push.

studio-pin-enforce.yml

Greps studio/frontend/package.json for caret/tilde ranges on the @assistant-ui/* surface and assistant-stream. Blocks the exact regression vector that produced 2026.5.1 (^0.12.19 resolving to a breaking 0.12.28). About 10 lines.

studio-frontend-ci.yml

npm ci (strict lockfile), tree-clean check after, typecheck, vite build, bundle grep for the Studio unstable_Provider: call site (<= 3 hits = OK, >= 4 = the 2026.5.1 regression), 75 MB dist/ budget. biome:check runs as continue-on-error: true because the codebase currently has accumulated drift (~470 errors, ~1650 warnings); the count surfaces in the PR check UI without forcing a fleet-wide cleanup. Drop the bypass once the count is zero.

studio-backend-ci.yml

Runs the existing studio/backend/tests/ suite on Python 3.10 / 3.11 / 3.12. Excludes test_studio_api.py (live model + GGUF download, too heavy for free runners) and llama_cpp_load_progress_live (spawns a real llama.cpp process). ruff check studio/backend runs non-blocking until the existing drift is cleared.

wheel-smoke.yml

Builds the wheel via python -m build after the frontend build, then verifies the produced wheel:

  • ships studio/frontend/package-lock.json
  • ships studio/frontend/dist/index.html
  • does NOT ship studio/frontend/node_modules/
  • does NOT ship studio/frontend/bun.lock
  • main JS bundle has < 4 unstable_Provider: hits

Then installs the wheel into a fresh venv with a lightweight dep set and imports studio.backend.main. This is the single workflow that would have blocked 2026.5.1 before twine upload.

Each workflow has concurrency cancellation on the same ref.

What this catches

  • Any future PR that re-introduces a caret on the assistant-ui surface (the 2026.5.1 root cause).
  • package.json and package-lock.json drifting out of sync (the build-time install vector for 2026.5.1).
  • The Studio chat-history regression at the bundle level on the frontend job, and at the wheel level on the wheel-smoke job, before twine upload.
  • studio/frontend/dist/ accidentally growing past 75 MB (e.g. unused deps like next ^16.1.6 that the bundle has been carrying).
  • The 860 backend tests that have been written but were never running.
  • Studio backend imports breaking on the published wheel under fresh-venv conditions (caught the node_modules shipping in 2026.5.1).

Local validation summary

| check | result |
|---|---|
| pin enforcement grep | PASS |
| frontend npm ci (1083 packages, 16 s) | PASS, tree clean |
| frontend typecheck | PASS |
| frontend vite build (2.8 s, dist=48 MB, main bundle 3.27 MB) | PASS |
| bundle unstable_Provider: grep | PASS, hits=1 |
| bundle size budget (75 MB) | PASS, 48 MB |
| biome:check | FAIL (~470 errors); reported as non-blocking |
| backend pytest with surgical filter | 861 pass, 4 skipped, 5 deselected |
| python -m build (13 s) | PASS, produced unsloth-2026.5.2-py3-none-any.whl |
| wheel content sanity | all 5 checks PASS |
| from studio.backend.main import app from installed wheel | PASS |

Test plan

  • After merge, watch the first PR that touches studio/frontend/** to confirm the frontend workflow runs end-to-end on ubuntu-latest.
  • After merge, watch a PR that touches studio/backend/** to confirm the backend matrix runs across Python 3.10 / 3.11 / 3.12.
  • After merge, watch a PR that touches pyproject.toml to confirm wheel-smoke.yml runs.
  • Open a follow-up PR that adds a caret range back on @assistant-ui/react to confirm studio-pin-enforce.yml blocks it. Close without merging.

Follow-ups (not in this PR)

  • studio-coldstart.yml (Python matrix boots Studio + curls /api/health). Validated locally; held out of this PR until first cold-start run is observed in CI, since the dep installation is the only piece I haven't end-to-end timed on a fresh GitHub runner.
  • Drive the biome count to zero, then drop continue-on-error.
  • Drive the ruff count to zero on studio/backend, then drop || true.
  • Move dependabot config off bun to npm for studio/frontend to match the new package-lock.json. (Or commit a bun.lock and keep both, and pick one as the source of truth in CI.)

The repo currently has no PR-time CI; only release-desktop.yml (manual) and
stale.yml (issue pinger). studio/backend/tests/ has 35 test files (~860
tests collected) that never run automatically. Frontend lint/typecheck/build
scripts exist in package.json but are not gated on PRs either. This is the
gap that let 2026.5.1 ship with the broken Studio chat-history bundle.

Adds four ubuntu-latest workflows, all CPU-only and free for public repos:

studio-pin-enforce.yml
  Greps studio/frontend/package.json for caret/tilde ranges on the
  @assistant-ui surface (and assistant-stream). Blocks the exact regression
  vector that produced 2026.5.1 (^0.12.19 resolving to a breaking 0.12.28).

studio-frontend-ci.yml
  npm ci (strict lockfile), tree-clean check after, typecheck, vite build,
  bundle grep for the Studio unstable_Provider call site (<= 3 hits = OK,
  >= 4 = the 2026.5.1 regression), 75 MB dist budget, biome non-blocking.
  Uploads dist on failure.

studio-backend-ci.yml
  Runs the existing studio/backend/tests/ suite on Python 3.10/3.11/3.12.
  Excludes test_studio_api.py (live model + GGUF download) and
  llama_cpp_load_progress_live (spawns a real llama.cpp). Local run on this
  branch: 861 pass, 4 skipped, 5 deselected. ruff non-blocking.

wheel-smoke.yml
  python -m build, then verifies the produced wheel:
    - ships studio/frontend/package-lock.json
    - ships studio/frontend/dist/index.html
    - does NOT ship studio/frontend/node_modules/
    - does NOT ship studio/frontend/bun.lock
    - main JS bundle has < 4 unstable_Provider hits
  Then installs the wheel into a fresh venv with a lightweight dep set and
  imports studio.backend.main. Locally validated against the wheel built
  from this branch.

Each workflow has concurrency cancellation on the same ref. biome and ruff
are gated as non-blocking until the existing accumulated drift is cleared
(~470 biome errors today); remove the bypass in a follow-up.

Notes verified locally:

  - pin enforcement: PASS (carets dropped on this branch)
  - frontend npm ci -> typecheck -> build -> grep -> budget: PASS
  - bundle: 48 MB, hits=1
  - backend pytest: 861 pass; 1 GPU-pollution failure seen locally that
    won't reproduce on GPU-less runners such as ubuntu-latest
  - wheel build: 13s, produces unsloth-2026.5.2-py3-none-any.whl
  - wheel content sanity: all five checks PASS
@gemini-code-assist
Contributor

Note

Gemini is unable to generate a review for this pull request due to the file types involved not being currently supported.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ccbf4dee79

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

'numpy<3' fastapi uvicorn 'pydantic>=2' pydantic-settings \
typer pyjwt structlog python-multipart aiofiles sqlalchemy \
cryptography mammoth unpdf matplotlib pandas
/tmp/v/bin/pip install --no-deps dist/unsloth-*.whl

P1: Install declared deps before fresh-venv import

In this fresh venv, pip install --no-deps dist/unsloth-*.whl explicitly skips the package dependencies (pip install --help says --no-deps means "Don't install package dependencies"), but the hand-written lightweight list above omits declared/runtime imports needed by studio.backend.main (for example pyyaml from pyproject.toml, reached through routes.training -> utils.models.model_config import yaml, and httpx from routes.inference). As written, this new wheel-smoke job will fail on PRs/pushes at the import step instead of providing a useful smoke check; install the project/Studio requirements or include the missing deps before using --no-deps.
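
A minimal sketch of how the smoke job could report missing runtime modules up front, before the import of studio.backend.main fails deep inside the import chain. The helper name `missing_modules` is hypothetical and not part of the PR. The names to check in CI would be the ones called out above (for example `yaml` and `httpx`); the demo below uses a stdlib module and a deliberately fake name so it runs anywhere.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Demo with one stdlib module and one made-up name (assumption: the fake
# name is not installed anywhere).
result = missing_modules(["json", "definitely_not_a_real_module_xyz"])
print("missing:", result)
```

Run inside the fresh venv before the wheel import, a non-empty result would point at the incomplete dependency list rather than at the code under test.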

First CI run on PR #5298 surfaced two real gaps:

1. pytest collection failed at `import yaml` in utils/models/model_config.
   Locally my workspace venv had pyyaml as a transitive dependency; CI's
   clean Python 3.10/3.11/3.12 environments didn't, so collection hit
   ModuleNotFoundError on the very first test module. The same error broke
   the wheel-smoke `from studio.backend.main import app` step.

2. Once the import chain was complete, ~9 tests still failed because they
   exercise GPU-only paths or live transformers introspection that can't run
   on a GPU-less `ubuntu-latest` runner regardless of code correctness:
     - TestGpuAutoSelection
     - TestPreSpawnGpuResolution
     - TestPerGpuFitGuardAllCounts
     - TestTransformersIntrospection
     - test_returns_cuda_when_cuda_available
     - test_calls_cuda_cache_when_cuda

Fix:
- Backend CI installs `studio/backend/requirements/studio.txt` (the
  declared backend dep set) + the extras the import chain needs but
  studio.txt omits (python-multipart, sqlalchemy, cryptography, pyyaml,
  jinja2, mammoth, unpdf, requests, etc.) + torch CPU wheel + transformers.
- Refine the pytest -k filter to deselect the GPU/introspection-bound
  classes by name. Deselections are commented inline with the reason.
- wheel-smoke uses the same dep set so the import smoke matches.

Locally validated against the freshly-built unsloth-2026.5.2 wheel:
  831 passed, 5 skipped, 35 deselected, 0 failed in 47s
  Studio backend imports cleanly in a fresh venv after the wheel install.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4180beb0d1

run: |
  set -e
  JS=$(ls dist/assets/index-*.js | head -1)
  HITS=$(grep -c 'unstable_Provider:' "$JS" || echo 0)

P2: Count bundle occurrences instead of matching lines

In the frontend workflow, the regression gate uses grep -c, which I checked via grep --help counts selected lines, not occurrences. Vite's production bundle is minified, so if the bad bundle has four unstable_Provider: tokens on the same generated line, HITS is 1 and this check passes even though the scenario it is meant to block is present; count occurrences instead (for example with grep -o ... | wc -l) before comparing to 3.
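
To illustrate the difference between counting matching lines and counting occurrences, here is a small sketch. The minified string is made up for the example; only the `unstable_Provider:` token comes from the workflow.

```python
def count_matching_lines(text, token):
    """Number of lines that contain the token (what `grep -c` reports)."""
    return sum(1 for line in text.splitlines() if token in line)


def count_occurrences(text, token):
    """Number of non-overlapping occurrences of the token in the whole text."""
    return text.count(token)


# A single minified line carrying four call sites (illustrative content).
minified = (
    "a({unstable_Provider:x});b({unstable_Provider:y});"
    "c({unstable_Provider:z});d({unstable_Provider:w})"
)
TOKEN = "unstable_Provider:"
print("lines:", count_matching_lines(minified, TOKEN))      # lines: 1
print("occurrences:", count_occurrences(minified, TOKEN))   # occurrences: 4
```

A line-based count of 1 would pass the `> 3` gate here, while the occurrence count of 4 would fail it.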

YAML's | block-scalar fed the newlines verbatim into the -k argument and
pytest rejected it as 'Wrong expression passed to -k'. Same logical filter
on one line.
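
As a sketch of the one-line form, the snippet below builds a `-k` expression from a list of names and flattens a multi-line expression onto a single line. The helper names are hypothetical, and the two class names are only a subset of the deselections listed earlier; the real filter in the workflow may differ.

```python
def build_k_expression(deselected):
    """Join test or class names into a single-line pytest -k expression."""
    return " and ".join(f"not {name}" for name in deselected)


def collapse_whitespace(expression):
    """Flatten a multi-line expression (e.g. from a YAML block scalar) onto one line."""
    return " ".join(expression.split())


names = ["TestGpuAutoSelection", "TestPreSpawnGpuResolution"]
one_line = build_k_expression(names)
print(one_line)

# The same filter as it would arrive from a `|` block scalar, newlines included.
multiline = "not TestGpuAutoSelection\nand not TestPreSpawnGpuResolution\n"
print(collapse_whitespace(multiline))
```

Both calls yield the same newline-free expression.
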
Adds a per-job 'name:' to all four workflows so the PR check list reads:

  Studio pin enforcement / @assistant-ui must be pinned exactly
  Studio frontend CI / Frontend build + bundle sanity
  Studio backend CI / Backend pytest (Python 3.10|3.11|3.12)
  Studio backend CI / Backend ruff lint (non-blocking)
  Wheel build + smoke / Wheel build + content sanity + import smoke

Instead of the default '<workflow> / <job-key>' which was opaque
('check', 'build', 'pytest (3.10)', 'ruff', 'wheel').
Comment thread .github/workflows/studio-backend-ci.yml Fixed
Comment on lines +86 to +97
name: Backend ruff lint (non-blocking)
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-python@v5
    with:
      python-version: '3.12'
      cache: 'pip'
  - run: pip install ruff
  - name: ruff check (non-blocking until accumulated drift is cleared)
    run: ruff check studio/backend || true
Comment on lines +28 to +93
name: Frontend build + bundle sanity
runs-on: ubuntu-latest
timeout-minutes: 10
defaults:
  run:
    working-directory: studio/frontend
steps:
  - uses: actions/checkout@v4

  - uses: actions/setup-node@v4
    with:
      node-version: '22'
      cache: 'npm'
      cache-dependency-path: studio/frontend/package-lock.json

  - name: Lockfile must agree with package.json (npm ci is strict)
    run: npm ci --no-fund --no-audit

  - name: npm ci must not have modified the working tree
    working-directory: ${{ github.workspace }}
    run: |
      if ! git diff --quiet -- studio/frontend; then
        echo "::error::npm ci modified files; commit the updated lockfile"
        git status -- studio/frontend
        exit 1
      fi

  - name: Typecheck
    run: npm run typecheck

  - name: Build
    run: npm run build

  - name: Built bundle must not contain Studio's unstable_Provider call site
    run: |
      set -e
      JS=$(ls dist/assets/index-*.js | head -1)
      HITS=$(grep -c 'unstable_Provider:' "$JS" || echo 0)
      echo "main bundle: $JS"
      echo "unstable_Provider: hits=$HITS (assistant-ui internals contribute up to 3)"
      if [ "$HITS" -gt 3 ]; then
        echo "::error file=studio/frontend/src/features/chat/runtime-provider.tsx::Studio bundle still passes unstable_Provider through useRemoteThreadListRuntime; this is the 2026.5.1 chat-history regression. Pass adapters directly into useLocalRuntime instead."
        exit 1
      fi

  - name: Bundle size budget (75 MB)
    run: |
      SIZE=$(du -sb dist | cut -f1)
      BUDGET=$((75 * 1024 * 1024))
      echo "dist size: $SIZE bytes ($((SIZE/1024/1024)) MB), budget: $BUDGET bytes (75 MB)"
      if [ "$SIZE" -gt "$BUDGET" ]; then
        echo "::error::studio/frontend/dist/ exceeded the 75 MB budget. Drop dead deps (e.g. the unused next dep) or split chunks."
        exit 1
      fi

  - name: Biome (non-blocking until accumulated drift is cleared)
    continue-on-error: true
    run: npm run biome:check

  - name: Upload built dist on failure
    if: failure()
    uses: actions/upload-artifact@v4
    with:
      name: studio-frontend-dist
      path: studio/frontend/dist
      retention-days: 3
Comment thread .github/workflows/studio-pin-enforce.yml Fixed
Comment on lines +37 to +124
name: Wheel build + content sanity + import smoke
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
  - uses: actions/checkout@v4

  - uses: actions/setup-node@v4
    with:
      node-version: '22'
      cache: 'npm'
      cache-dependency-path: studio/frontend/package-lock.json

  - uses: actions/setup-python@v5
    with:
      python-version: '3.12'

  - name: Build frontend
    run: |
      cd studio/frontend
      npm ci --no-fund --no-audit
      npm run build

  - name: Build wheel + sdist
    run: |
      python -m pip install --upgrade pip build
      rm -rf dist build ./*.egg-info
      python -m build

  - name: Wheel content sanity
    run: |
      python - <<'PY'
      import zipfile, glob, sys
      w = glob.glob("dist/unsloth-*.whl")
      if not w:
          print("FAIL: no wheel produced"); sys.exit(2)
      w = w[0]
      print(f"wheel: {w}")
      with zipfile.ZipFile(w) as z:
          n = z.namelist()
          checks = {
              "lockfile shipped": any(s.endswith("studio/frontend/package-lock.json") for s in n),
              "frontend dist shipped": any(s.endswith("studio/frontend/dist/index.html") for s in n),
              "no node_modules": not any("studio/frontend/node_modules/" in s for s in n),
              "no bun.lock": not any(s.endswith("studio/frontend/bun.lock") for s in n),
          }
          js = [s for s in n
                if "studio/frontend/dist/assets/" in s
                and s.endswith(".js")
                and "/index-" in s]
          if not js:
              print("FAIL: no main bundle index-*.js in wheel"); sys.exit(2)
          data = z.read(js[0]).decode("utf-8", "replace")
          hits = data.count("unstable_Provider:")
          print(f"main bundle: {js[0]}")
          print(f"unstable_Provider hits: {hits} (>=4 indicates 2026.5.1 regression)")
          checks["bundle has no Studio unstable_Provider call site"] = (hits < 4)

      print()
      for k, v in checks.items():
          print(f" [{'PASS' if v else 'FAIL'}] {k}")
      sys.exit(0 if all(checks.values()) else 1)
      PY

  - name: Studio backend import smoke
    # Imports `studio.backend.main:app` from the freshly-installed wheel in
    # a clean venv. This catches the class of bug that 2026.5.1 shipped with:
    # frontend dist missing, package-lock.json missing, or the wheel's Python
    # source tree broken in a way that surfaces only at app construction time.
    run: |
      python -m venv /tmp/v
      /tmp/v/bin/pip install --upgrade pip
      /tmp/v/bin/pip install -r studio/backend/requirements/studio.txt
      /tmp/v/bin/pip install \
        python-multipart aiofiles sqlalchemy cryptography \
        pyyaml jinja2 mammoth unpdf requests \
        'numpy<3'
      /tmp/v/bin/pip install --no-deps dist/unsloth-*.whl
      # Run from /tmp so Python imports the installed package, not the source tree.
      cd /tmp
      /tmp/v/bin/python -c "from studio.backend.main import app; print('Studio backend OK:', app.title)"

  - name: Upload wheel on failure
    if: failure()
    uses: actions/upload-artifact@v4
    with:
      name: unsloth-wheel
      path: dist/
      retention-days: 7

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b496be35d3

- name: Forbid carets/tildes on @assistant-ui/* and assistant-stream
  run: |
    set -e
    if grep -nE '"(@assistant-ui/[a-z-]+|assistant-stream)":[[:space:]]*"[\^~]' studio/frontend/package.json; then

P2: Reject all non-exact assistant-ui ranges

This guard only matches leading ^ or ~, so a PR can still set @assistant-ui/react or assistant-stream to another non-exact npm range such as >=0.12.19, 0.12.x, *, or latest and the workflow will pass even though fresh installs can again resolve a different 0.x release. Since the job is named as exact pin enforcement and is meant to block the dependency drift that broke Studio, this should validate exact semver literals rather than just two range prefixes.
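
A minimal sketch of what validating exact semver literals could look like, as opposed to rejecting two range prefixes. The function names, the regex, and the sample manifest are illustrative assumptions, not code from this PR; only the guarded package names and the `0.12.19` version appear in the discussion above.

```python
import re

# Exact version literal: MAJOR.MINOR.PATCH with optional pre-release/build suffix.
EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$")


def is_guarded(name):
    """True for packages on the assistant-ui surface that must be pinned exactly."""
    return name.startswith("@assistant-ui/") or name == "assistant-stream"


def unpinned(package_json):
    """Return {name: spec} for guarded packages whose spec is not an exact version."""
    bad = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in package_json.get(section, {}).items():
            if is_guarded(name) and not EXACT_VERSION.match(spec):
                bad[name] = spec
    return bad


# Illustrative manifest; versions other than 0.12.19 are made up.
sample = {
    "dependencies": {
        "@assistant-ui/react": "0.12.19",
        "@assistant-ui/react-markdown": "0.12.x",
        "assistant-stream": ">=0.12.19",
        "react": "^19.0.0",
    }
}
violations = unpinned(sample)
print(violations)
```

In a workflow step the manifest would be loaded with `json.load` from studio/frontend/package.json, and a non-empty result would fail the job. Specs such as `*` and `latest` are rejected by the same rule because they are not version literals.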

Verified locally: 831 backend tests pass under Python 3.13 with the same
filter set used for 3.10 / 3.11 / 3.12.
Two new workflows. Both CPU-only, both free on `ubuntu-latest`.

studio-inference-smoke.yml
  The only workflow we have that proves "Studio actually works", as opposed
  to "the bundle parses" or "the imports succeed":
    - runs install.sh --local --no-torch (lean Studio install)
    - downloads unsloth/gemma-4-E2B-it-GGUF UD-IQ3_XXS into actions/cache
    - boots Studio in api-only mode
    - logs in with the bootstrap password, changes it, re-logs
    - POST /api/inference/load on the GGUF
    - POST /api/inference/chat/completions and asserts a non-empty
      assistant response
  Validated end-to-end locally on a fresh main install: model loaded,
  chat completion returned `Hello!` against the same GGUF the workflow
  uses.

studio-tauri-smoke.yml
  PR-time variant of release-desktop.yml. Linux-only debug build
  (`tauri build --debug --no-bundle`) on ubuntu-22.04. Catches
  src-tauri Cargo.toml / Rust source breakage, tauri.conf.json drift,
  and frontend-distDir wiring. Pinned to the same Tauri CLI version
  (2.10.1) as release-desktop.yml so CLI bumps surface in CI before
  they break the release pipeline. Mac and Windows desktop builds
  stay manual via release-desktop.yml because they need code-signing
  secrets.
Comment on lines +44 to +165
name: Studio boots, loads a GGUF, answers a chat completion
runs-on: ubuntu-latest
timeout-minutes: 25
steps:
  - uses: actions/checkout@v4

  - name: Linux dependencies for llama.cpp prebuilt
    run: |
      sudo apt-get update
      sudo apt-get install -y --no-install-recommends \
        libcurl4-openssl-dev libssl-dev jq

  - uses: actions/setup-node@v4
    with:
      node-version: '22'
      cache: 'npm'
      cache-dependency-path: studio/frontend/package-lock.json

  - uses: actions/setup-python@v5
    with:
      python-version: '3.12'
      cache: 'pip'

  - name: Cache GGUF model file
    id: cache-gguf
    uses: actions/cache@v4
    with:
      path: gguf-cache
      key: ${{ runner.os }}-gguf-${{ env.GGUF_REPO }}-${{ env.GGUF_FILE }}-v1

  - name: Download GGUF if cache miss
    if: steps.cache-gguf.outputs.cache-hit != 'true'
    run: |
      python -m pip install --upgrade huggingface_hub hf_transfer
      mkdir -p gguf-cache
      HF_HUB_ENABLE_HF_TRANSFER=1 \
        huggingface-cli download "$GGUF_REPO" "$GGUF_FILE" --local-dir gguf-cache

  - name: Install Studio (--local, --no-torch keeps the install lean)
    run: bash install.sh --local --no-torch

  - name: Reset auth + start Studio in the background
    run: |
      unsloth studio reset-password
      mkdir -p logs
      UNSLOTH_API_ONLY=1 unsloth studio -H 127.0.0.1 -p "$STUDIO_PORT" \
        > logs/studio.log 2>&1 &
      echo "STUDIO_PID=$!" >> "$GITHUB_ENV"

  - name: Wait for /api/health
    run: |
      for i in $(seq 1 60); do
        if curl -fs "http://127.0.0.1:${STUDIO_PORT}/api/health" > /tmp/health.json; then
          echo "ready after ${i}s"
          cat /tmp/health.json
          jq -e '.status == "healthy"' /tmp/health.json
          exit 0
        fi
        sleep 1
      done
      echo "Studio did not become healthy in 60s"
      tail -200 logs/studio.log
      exit 1

  - name: Login + change bootstrap password
    run: |
      PW=$(cat ~/.unsloth/studio/auth/.bootstrap_password)
      NEW="CIPasswordSmoke12345!"
      TOKEN=$(curl -fs -X POST "http://127.0.0.1:${STUDIO_PORT}/api/auth/login" \
        -H 'content-type: application/json' \
        -d "{\"username\":\"unsloth\",\"password\":\"$PW\"}" | jq -r .access_token)
      curl -fs -X POST "http://127.0.0.1:${STUDIO_PORT}/api/auth/change-password" \
        -H "Authorization: Bearer $TOKEN" -H 'content-type: application/json' \
        -d "{\"current_password\":\"$PW\",\"new_password\":\"$NEW\"}" > /dev/null
      # Re-login to clear must_change_password flag.
      NEW_TOKEN=$(curl -fs -X POST "http://127.0.0.1:${STUDIO_PORT}/api/auth/login" \
        -H 'content-type: application/json' \
        -d "{\"username\":\"unsloth\",\"password\":\"$NEW\"}" | jq -r .access_token)
      echo "TOKEN=$NEW_TOKEN" >> "$GITHUB_ENV"

  - name: Load the GGUF into Studio
    run: |
      GGUF_PATH="$GITHUB_WORKSPACE/gguf-cache/${GGUF_FILE}"
      ls -lh "$GGUF_PATH"
      curl -fs -X POST "http://127.0.0.1:${STUDIO_PORT}/api/inference/load" \
        -H "Authorization: Bearer $TOKEN" -H 'content-type: application/json' \
        --max-time 600 \
        -d "{\"model_path\":\"$GGUF_PATH\",\"is_lora\":false,\"max_seq_length\":2048}" \
        | jq '{status, display_name, is_gguf, context_length}'

  - name: Send a chat completion + assert non-empty response
    run: |
      RESP=$(curl -fs -X POST "http://127.0.0.1:${STUDIO_PORT}/api/inference/chat/completions" \
        -H "Authorization: Bearer $TOKEN" -H 'content-type: application/json' \
        --max-time 900 \
        -d '{
          "messages":[{"role":"user","content":"Say hello in one short sentence."}],
          "max_tokens":40,
          "stream":false
        }')
      echo "raw response: $RESP"
      CONTENT=$(echo "$RESP" | jq -r '.choices[0].message.content // empty')
      echo "model response: $CONTENT"
      if [ -z "$CONTENT" ]; then
        echo "::error::Empty assistant response from Studio"
        exit 1
      fi

  - name: Stop Studio
    if: always()
    run: |
      kill "${STUDIO_PID}" || true
      sleep 2
      ss -tln | grep ":${STUDIO_PORT}" || true

  - name: Upload Studio log on failure
    if: failure()
    uses: actions/upload-artifact@v4
    with:
      name: studio-inference-log
      path: logs/studio.log
      retention-days: 7
Comment on lines +32 to +105
name: Tauri Linux debug build (no codesign)
runs-on: ubuntu-22.04
timeout-minutes: 25
steps:
  - uses: actions/checkout@v4

  - name: Linux native deps for Tauri / WebKit2GTK
    run: |
      sudo apt-get update
      sudo apt-get install -y \
        libwebkit2gtk-4.1-dev libayatana-appindicator3-dev \
        librsvg2-dev libxdo-dev libssl-dev patchelf

  - uses: actions/setup-node@v4
    with:
      node-version: '24'
      cache: 'npm'
      cache-dependency-path: studio/frontend/package-lock.json

  - uses: dtolnay/rust-toolchain@stable

  - uses: swatinem/rust-cache@v2
    with:
      workspaces: studio/src-tauri -> target

  - name: Install pinned Tauri CLI (matches release-desktop.yml)
    run: npm install --save-dev --prefix studio @tauri-apps/cli@2.10.1

  - name: Verify pinned Tauri CLI version
    run: |
      out="$(npx --prefix studio tauri --version)"
      echo "$out"
      [ "$out" = "tauri-cli 2.10.1" ] || { echo "::error::expected tauri-cli 2.10.1, got $out"; exit 1; }

  - name: Frontend build (npm ci, vite)
    working-directory: studio/frontend
    run: |
      npm ci --no-fund --no-audit
      npm run build
      test -f dist/index.html

  - name: Tauri debug build (Linux, no bundle, no codesign)
    # `--debug` + `--no-bundle` keeps this lean: compiles the Rust crate,
    # confirms the frontend dist is wired into Tauri, but skips the AppImage
    # / .deb production. Code signing is irrelevant because we never produce
    # a distributable artifact.
    env:
      TAURI_SIGNING_PRIVATE_KEY: ''
      TAURI_SIGNING_PRIVATE_KEY_PASSWORD: ''
    run: npx --prefix studio tauri build --debug --no-bundle

  - name: Inspect produced binary
    run: |
      BIN=$(find studio/src-tauri/target/debug -maxdepth 1 -type f -executable 2>/dev/null \
        | grep -Ev '\.(d|so|dylib|dll)$' \
        | grep -Ev '/(deps|build|examples)$' \
        | head -1)
      echo "binary: $BIN"
      if [ -z "$BIN" ]; then
        echo "::error::Tauri debug binary not produced"
        ls -la studio/src-tauri/target/debug/ || true
        exit 1
      fi
      file "$BIN"
      du -h "$BIN"

  - uses: actions/upload-artifact@v4
    if: failure()
    with:
      name: tauri-debug-build
      path: |
        studio/src-tauri/target/debug
        studio/frontend/dist
      retention-days: 3
huggingface_hub 1.13.0 dropped the huggingface-cli entrypoint. The
replacement is the 'hf' CLI shipped with the same package. Same args,
just s/huggingface-cli/hf/.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a049773f6c

run: |
  python -m pip install --upgrade pip
  # Studio's declared backend deps:
  pip install -r studio/backend/requirements/studio.txt

P1: Install the checked-out package before backend tests

When this job runs on a clean runner, the dependency step installs backend requirements but never installs the checked-out repo or adds the repo root to PYTHONPATH. The tests then run from studio/backend, so imports of repo-root modules such as unsloth_cli.commands.studio in studio/backend/tests/test_desktop_auth.py fail with ModuleNotFoundError instead of testing the PR; this also means unsloth/** changes are not exercised against the checkout. Install the local package/editable checkout or export the repo root on PYTHONPATH before pytest.

The inference-smoke job runs on ubuntu-latest (CPU-only, x86_64), which
is exactly the host shape that should pick up ggml-org/llama.cpp's
bin-ubuntu-x64.tar.gz prebuilt directly. If install.sh ever falls back
to a source build on this runner, the studio/setup.sh routing has
regressed and every CPU-only Linux user is paying a 3 minute compile
cost again.

Tee install.sh output to logs/install.log, then fail the job if the log
contains "falling back to source build" or is missing the success
marker "prebuilt installed and validated" / "prebuilt up to date and
validated".

Also include logs/install.log in the failure artifact so the prebuilt
diagnostics are uploaded alongside studio.log when the job fails.
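
The log gate described above can be sketched as follows. The marker strings are the ones quoted in this message; the function name and the sample log lines are hypothetical, and the actual workflow step may implement the check differently (for example with grep).

```python
FALLBACK_MARKER = "falling back to source build"
SUCCESS_MARKERS = (
    "prebuilt installed and validated",
    "prebuilt up to date and validated",
)


def check_install_log(log_text):
    """Return a list of problems found in the install log (empty list means OK)."""
    problems = []
    if FALLBACK_MARKER in log_text:
        problems.append("install fell back to a source build")
    if not any(marker in log_text for marker in SUCCESS_MARKERS):
        problems.append("no prebuilt success marker found")
    return problems


# Made-up log excerpts for illustration only.
good_log = "downloading llama.cpp prebuilt\nprebuilt installed and validated\n"
bad_log = "prebuilt unavailable, falling back to source build\n"
print(check_install_log(good_log))
print(check_install_log(bad_log))
```

Checking both conditions means a log that is silent about the prebuilt (neither marker present) still fails the job.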
@danielhanchen
Member Author

Pushed a small follow-up commit on this branch that hooks the new studio-inference-smoke.yml job into the routing fix tracked in #5302:

logs/install.log is now teed during the install step and the job fails if it contains falling back to source build or is missing the prebuilt-success marker. ubuntu-latest is CPU-only x86_64, so once #5302 lands the runner picks up ggml-org/llama.cpp's bin-ubuntu-x64.tar.gz directly. If anyone reverts that routing the job goes red here instead of just running ~3 minutes slower.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8c992c7b1

env:
  TAURI_SIGNING_PRIVATE_KEY: ''
  TAURI_SIGNING_PRIVATE_KEY_PASSWORD: ''
run: npx --prefix studio tauri build --debug --no-bundle

P1: Run the Tauri build from the Studio project directory

When this workflow runs, this step is executed from the repository root, but npx --prefix studio only points npm at the studio prefix for resolving the binary; it does not change the process working directory. The existing release workflow handles this by setting projectPath: studio before invoking npx --prefix . tauri, and this repo has studio/src-tauri/tauri.conf.json rather than a root src-tauri, so this smoke job will fail to recognize the Tauri project on every PR it is meant to gate. Set working-directory: studio or cd studio before running the build.

Drops the Gemma 4 E2B GGUF (~2.3 GB) for unsloth/Qwen3.5-2B-GGUF
(UD-IQ3_XXS, ~890 MiB). Cache-miss download is roughly a third of
what it was, and CPU inference on ubuntu-latest finishes well
inside the 25 minute job budget.

Verified locally: load via /api/inference/load returns
status=loaded, is_gguf=true, supports_reasoning=true,
supports_tools=true; chat completion returns a non-empty assistant
message ("Hello!").
…rkflow

The "@assistant-ui must be pinned exactly" check was its own ~7 second
workflow, doing a single grep on studio/frontend/package.json. Move it
into studio-frontend-ci.yml as a pre-install step (right after
checkout, before any node setup so a violation fails fast). One fewer
top-level check row on every PR, same coverage.

Add a FIXME so this step is dropped once @assistant-ui/* and
assistant-stream leave 0.x: on 1.x, caret ranges are conventional and
this becomes overzealous.
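
The pin check described above is a grep in the actual workflow. As a hedged sketch of the same rule, the Python below parses the manifest instead. `find_unpinned` is a hypothetical name, and treating any spec that starts with `^` or `~` as a range is an assumption about what the grep matches.

```python
import json
import re

# Assumption: a spec that starts with ^ or ~ is a range; anything else counts as pinned.
RANGE_PREFIX = re.compile(r"^\s*[\^~]")


def find_unpinned(package_json_text):
    """Return {package: spec} for guarded packages that use caret/tilde ranges."""
    manifest = json.loads(package_json_text)
    offenders = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            # Only the @assistant-ui/* surface and assistant-stream are guarded.
            guarded = name.startswith("@assistant-ui/") or name == "assistant-stream"
            if guarded and RANGE_PREFIX.match(spec):
                offenders[name] = spec
    return offenders
```

Given a manifest where `@assistant-ui/react` is declared as `^0.12.19`, the helper reports that entry and ignores caret ranges on unrelated packages.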
The top-level tests/ tree was previously not run anywhere. 23 of its
files are CPU-friendly with the right harness: pure-Python helpers,
ast walks, installer logic, and CLI shape tests. Locally validated:
302 passed, 9 skipped, 12 deselected in ~7 seconds on Python 3.12.

Three pieces:

1. tests/conftest.py -- GPU-free harness, mirrors the conftest landed
   in unslothai/unsloth-zoo PR #624. Pre-loads unsloth_zoo.device_type
   and unsloth.device_type under a temporarily-mocked
   torch.cuda.is_available() so each module's @cache permanently
   captures "cuda" and the import chain succeeds on a CPU runner.
   Also stubs torch.cuda.get_device_capability /
   is_bf16_supported / mem_get_info, which unsloth/__init__.py and
   unsloth_zoo.temporary_patches probe at import time when
   DEVICE_TYPE == "cuda". On a real accelerator the harness is
   skipped and detection runs normally.

2. Two existing tests were leaking sys.modules state across the
   session because they injected stubs without an __spec__ and
   without restoration:

     - tests/test_raw_text.py shoved a "datasets" stub into
       sys.modules. transformers' import_utils later did
       importlib.util.find_spec("datasets") and got
       ValueError: datasets.__spec__ is None.

     - tests/python/test_fast_sentence_transformer_redirect_lifecycle.py
       shoved "transformers", "sentence_transformers", and
       "sentence_transformers.models" stubs in. Subsequent tests
       that did `import transformers` got the non-package stub.

   Fix: set __spec__ on stubs, plus an autouse fixture in the
   sentence-transformer test file that restores the three keys
   after each test.

3. .github/workflows/studio-backend-ci.yml gains a third job,
   `Repo tests (CPU)`, that installs the same dep set as the
   backend-pytest matrix (Python 3.12 only -- the tests are
   version-independent), exports PYTHONPATH=studio so tests/python/*
   can import install_python_stack, and runs the 23-file subset
   above with `-m 'not server and not e2e'`.
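
The sys.modules leak in item 2 can be illustrated with a small sketch. It is not the code from the PR, which uses an autouse pytest fixture. `make_stub` and `stubbed_modules` are hypothetical names, and a context manager stands in for the fixture so the example runs without pytest.

```python
import importlib.machinery
import importlib.util
import sys
import types
from contextlib import contextmanager


def make_stub(name):
    """Create a stub module whose __spec__ is set, so find_spec() accepts it."""
    module = types.ModuleType(name)
    # A bare ModuleType has __spec__ = None, which makes
    # importlib.util.find_spec(name) raise ValueError.
    module.__spec__ = importlib.machinery.ModuleSpec(name, loader=None)
    return module


@contextmanager
def stubbed_modules(*names):
    """Install stubs for the given names and restore sys.modules afterwards."""
    missing = object()
    saved = {name: sys.modules.get(name, missing) for name in names}
    try:
        for name in names:
            sys.modules[name] = make_stub(name)
        yield
    finally:
        # Put back whatever was there before, or remove the key entirely.
        for name, original in saved.items():
            if original is missing:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = original
```

Wrapping the body of an autouse fixture in `with stubbed_modules(...): yield` would give the per-test restoration the commit message describes.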
Comment on lines +93 to +149
    name: Repo tests (CPU)
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
          cache: 'pip'

      - name: Install deps (shared shape with backend pytest job)
        run: |
          python -m pip install --upgrade pip
          pip install -r studio/backend/requirements/studio.txt
          pip install \
            python-multipart aiofiles sqlalchemy cryptography \
            pyyaml jinja2 mammoth unpdf requests typer \
            'numpy<3' pytest pytest-asyncio httpx
          pip install --index-url https://download.pytorch.org/whl/cpu 'torch>=2.4,<2.11'
          pip install 'transformers>=4.51,<5.5'
          pip install -e . --no-deps

      - name: Repo tests (CPU)
        env:
          # tests/python/* import install_python_stack from studio/.
          PYTHONPATH: ${{ github.workspace }}/studio
          # Skip lazy compilation work the unsloth import chain wants to
          # do at import time on a real GPU.
          UNSLOTH_COMPILE_DISABLE: '1'
        run: |
          python -m pytest -q --tb=short -m 'not server and not e2e' \
            tests/test_loader_glob_skip.py \
            tests/test_raw_text.py \
            tests/test_cli_export_unpacking.py \
            tests/test_studio_install_workspace_guard.py \
            tests/test_studio_root_resilience.py \
            tests/python/test_install_python_stack.py \
            tests/python/test_no_torch_filtering.py \
            tests/python/test_studio_import_no_torch.py \
            tests/python/test_tokenizers_and_torch_constraint.py \
            tests/python/test_unsloth_run_tool_policy_resolver.py \
            tests/python/test_flash_attn_install_python_stack.py \
            tests/python/test_cross_platform_parity.py \
            tests/python/test_fast_sentence_transformer_redirect_lifecycle.py \
            tests/studio/test_chat_preset_builtin_invariants.py \
            tests/studio/test_cli_studio_defaults.py \
            tests/studio/test_studio_gguf_export_script_pin.py \
            tests/studio/test_cli_repo_variant.py \
            tests/studio/test_cli_run_alias.py \
            tests/studio/test_llama_cpp_wall_clock_cap.py \
            tests/studio/test_cancel_atomicity.py \
            tests/studio/test_cancel_id_wiring.py \
            tests/studio/test_stream_cancel_registration_timing.py \
            tests/studio/test_studio_text_descender_clipping.py

  ruff:

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3da1eeaed6

'numpy<3' pytest pytest-asyncio httpx
pip install --index-url https://download.pytorch.org/whl/cpu 'torch>=2.4,<2.11'
pip install 'transformers>=4.51,<5.5'
pip install -e . --no-deps

P1: Install unsloth_zoo before running selected repo tests

The repo-cpu-tests job installs this repo with pip install -e . --no-deps, so declared dependencies are intentionally skipped, but the same job explicitly runs tests/studio/test_studio_gguf_export_script_pin.py, which imports unsloth_zoo.llama_cpp directly. On a clean runner this module is not present, so the job fails with ModuleNotFoundError before validating the PR changes; install unsloth_zoo (or stop skipping deps) before this test list.


Comment thread tests/conftest.py
Comment on lines +80 to +83
import torch

_orig_is_avail = torch.cuda.is_available
torch.cuda.is_available = lambda: True # type: ignore[assignment]

P2: Handle missing torch in CPU test harness preload

The new root tests/conftest.py is executed at collection time and, on non-accelerator hosts, _preload_device_type("unsloth") unconditionally imports torch; when torch is not installed (a common lightweight/no-torch test setup), pytest now aborts during collection with ModuleNotFoundError before any tests run. This regresses the intended CPU-safe harness behavior and should fall back to the stub path when torch is unavailable.


The CPU job at run 25422050018 broke at conftest collection: the
preload of unsloth.device_type pulled in `from unsloth_zoo.utils import
Version` and ubuntu-latest didn't have unsloth_zoo on the path because
it is an optional dep of unsloth. Two fixes:

1. Install unsloth_zoo>=2026.5.1 alongside the other deps in the Repo
   tests (CPU) job (it's also what unsloth's optional `huggingface`
   extra pins).

2. Wrap the body of _preload_device_type in conftest.py in a try/except
   so any import failure (missing prereq, broken module, etc.) cleanly
   returns False instead of aborting the entire collection. The caller
   already falls back to the stub device_type module on False, so the
   net behavior is "best effort: real device_type if possible, stub
   otherwise" instead of "abort the test session".
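
The "best effort" shape of fix 2 might look roughly like the sketch below. It is not the body of `_preload_device_type`. `preload_with_patched_probe` is a hypothetical helper, the `owner`/`attr` pair stands in for `torch.cuda` / `is_available`, and restoring the original probe in a `finally` is an assumption about how the temporary mock is undone.

```python
import importlib


def preload_with_patched_probe(module_name, owner, attr):
    """Import module_name while owner.attr reports True; never raise.

    Returns True if the import succeeded, False on any failure, so the
    caller can fall back to a stub instead of aborting collection.
    """
    original = getattr(owner, attr)
    setattr(owner, attr, lambda: True)  # temporary mock of the availability probe
    try:
        importlib.import_module(module_name)
        return True
    except Exception:
        # Missing prerequisite, broken module, etc.: report failure quietly.
        return False
    finally:
        # Always restore the real probe, whether or not the import worked.
        setattr(owner, attr, original)
```

Catching `Exception` rather than only `ImportError` mirrors the commit's intent that any import failure degrades to the stub path.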
When DEVICE_COUNT is 0 (CPU host: no visible NVIDIA / AMD / Intel GPU)
the dict comprehension {... for i in range(0)} was empty and the
subsequent max(_CUDA_STREAMS.keys()) raised
ValueError: max() iterable argument is empty
during module import. That made unsloth.kernels.utils unimportable on
any CPU runner, which in turn blocked all of tests/saving/**, three
top-level tests/test_*.py, and tests/qlora/test_unsloth_qlora_train_and_merge.py
from even collecting on CPU CI.

Wrap the per-device-index dict comprehension and max() machinery in
a DEVICE_COUNT > 0 guard. When DEVICE_COUNT is 0 fall back to empty
containers (CUDA_STREAMS = (), WEIGHT_BUFFERS = [], ABSMAX_BUFFERS = []).
The consumer functions further down in this module index these arrays
by device_index but only during real GPU work, so the empty fallbacks
never get touched on a CPU host.

GPU-safety verified locally: with 8 visible CUDA devices, CUDA_STREAMS
has 8 entries (identical to before this PR). With CUDA_VISIBLE_DEVICES=""
the module imports cleanly, CUDA_STREAMS is (), and the previously
blocked tests now collect (test_get_model_name passes 38 subtests,
test_resolve_model_class passes 9, test_model_registry collects all 8
parametrizations).

Same shape applied to the DEVICE_TYPE == "xpu" branch for symmetry.
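
A simplified sketch of the guard, assuming the per-device table is built from a dict keyed by device index as the message describes. `build_stream_table` and `make_stream` are hypothetical stand-ins, not names from unsloth/kernels/utils.py.

```python
def build_stream_table(device_count, make_stream):
    """Build a tuple of per-device streams indexed by device_index.

    With device_count == 0 the dict below would be empty and max() on its
    keys would raise ValueError, so the guard returns an empty tuple first.
    """
    if device_count <= 0:
        return ()
    by_index = {i: make_stream(i) for i in range(device_count)}
    # Size the table from the largest index, then fill each slot.
    table = [None] * (max(by_index.keys()) + 1)
    for index, stream in by_index.items():
        table[index] = stream
    return tuple(table)
```

On a CPU host the empty tuple is never indexed, because the consumers only run during real GPU work.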
@danielhanchen danielhanchen requested a review from Datta0 as a code owner May 6, 2026 08:59
danielhanchen and others added 7 commits May 6, 2026 10:32
Three changes, locally validated end-to-end (779 passed, 11 skipped,
23 deselected, 0 failed across all three steps):

1. Repo tests (CPU, auto-discovered): replace the explicit 23-file
   list with `pytest tests/` plus a small set of `--ignore` and
   `--deselect` flags. New tests under tests/python, tests/studio
   (excluding the two state-sensitive files), and top-level
   tests/test_*.py are picked up automatically with no workflow edit.

   --ignore covers:
     - tests/qlora and tests/saving: GPU-bound by design
     - tests/utils: helpers folder, not tests
     - tests/sh: shell suite handled in its own step
     - two state-polluting hardware-spoof files (next step)
   -m 'not server and not e2e': honours markers already declared
     in tests/python/conftest.py
   --deselect: test_model_registration / test_all_model_registration
     hit huggingface_hub live; they belong on a network job

2. Hardware-spoof tests (state-sensitive, run in isolation):
   tests/studio/test_hardware_dispatch_matrix.py and
   tests/studio/test_is_mlx_dispatch_gate.py mutate module globals
   in studio.backend.utils.hardware.hardware (IS_ROCM, DEVICE) via
   their spoof fixtures, and the leak crosses file boundaries.
   Running them in their own pytest invocation avoids polluting the
   main sweep. Both pass cleanly in isolation: 28 passed, 1 skipped.

3. Shell installer tests: explicitly enumerated subset that does not
   depend on install.ps1 layout (test_install_host_defaults.sh has
   drifted; that's a separate followup).

Test fixes folded in to keep the run green:
  - tests/studio/install/test_rocm_support.py::TestAmdGpuMonitoring
    ::test_amd_primary_gpu_with_mock now clears
    HIP/ROCR/CUDA_VISIBLE_DEVICES via monkeypatch so
    _first_visible_amd_gpu_id() does not short-circuit when the runner
    sets CUDA_VISIBLE_DEVICES="" to suppress CUDA.
  - tests/studio/test_hardware_dispatch_matrix.py::spoof_hardware
    fixture now stubs torch.cuda.get_device_properties when
    cuda_available is True so detect_hardware()'s device_name probe
    does not call into _cuda_init() on a CPU runner.

Run 25430652224 collected three test modules that import unsloth and
crashed at unsloth_zoo/vision_utils.py:68 with
  ModuleNotFoundError: No module named 'torchvision'

unsloth_zoo.vision_utils unconditionally imports torchvision at module
scope, and unsloth.models._utils pulls vision_utils in. The Repo tests
(CPU) job installed torch from the CPU index but not torchvision, so
any test that imports unsloth.models.* failed at collection.

Add torchvision<0.26 to the same pip install --index-url
https://download.pytorch.org/whl/cpu line.

Run 25430982243 collected three test modules that import unsloth and
crashed at unsloth/models/_utils.py:1166 with
  ModuleNotFoundError: No module named 'bitsandbytes'

The bnb import there is unconditional. Recent bnb versions (>=0.45)
ship a CPU build so the wheel installs on a free Linux runner and the
import resolves; the kernels still raise on use but the module
collects, which is enough for these CPU tests.

Add 'bitsandbytes>=0.45' to the Repo tests (CPU) deps.

Workflow renames (top-level `name:` keys; affects PR check rows):
  Studio backend CI    -> Backend CI
  Studio frontend CI   -> Frontend CI
  Studio inference smoke -> Studio GGUF CI
  Studio Tauri smoke   -> Studio Tauri CI
  Wheel build + smoke  -> Wheel CI

Backend CI's matrix job goes from "Backend pytest (Python 3.10)" to
just "(Python 3.10)" so the GitHub UI row reads
"Backend CI / (Python 3.10)" rather than the old verbose form.

Production guard for CPU torch (run 25431126138):

unsloth/kernels/utils.py:165 was an unconditional
  _gpu_getCurrentRawStream = torch._C._cuda_getCurrentRawStream
which raised AttributeError on a CPU-only torch wheel because the
compiled CUDA backend is absent. Three test modules (test_get_model_name,
test_model_registry, test_resolve_model_class) crashed at collection
because their import chain reaches this line.

Add a hasattr probe: when torch is built without CUDA, fall through to
a no-op binding that returns 0. _get_tensor_stream is only invoked
during real GPU work, so the no-op is never executed on a CPU host.

GPU-safety verified locally: with 8 visible CUDA devices the binding
still resolves to the real torch._C._cuda_getCurrentRawStream
(behaviour identical to before this PR). The XPU branch is untouched.
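
The hasattr probe can be sketched as follows. To stay runnable without torch, the example takes the `torch._C` namespace as a parameter; `resolve_raw_stream_getter` is a hypothetical name, and the no-op's exact signature is an assumption.

```python
def resolve_raw_stream_getter(torch_c):
    """Return the raw-stream getter, or a no-op when CUDA is not compiled in.

    torch_c stands in for torch._C. A CPU-only build lacks the
    _cuda_getCurrentRawStream attribute, so accessing it directly would
    raise AttributeError at import time.
    """
    if hasattr(torch_c, "_cuda_getCurrentRawStream"):
        return torch_c._cuda_getCurrentRawStream
    # No-op binding: accepts any arguments and returns 0. It is only reachable
    # from GPU code paths, so it should never run on a CPU host.
    return lambda *_args: 0
```

In the real module the call would presumably be `resolve_raw_stream_getter(torch._C)`, evaluated once at import time.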
Comment on lines +37 to +86
    name: (Python ${{ matrix.python }})
    runs-on: ubuntu-latest
    timeout-minutes: 15
    strategy:
      fail-fast: false
      matrix:
        python: ['3.10', '3.11', '3.12', '3.13']
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '${{ matrix.python }}'
          cache: 'pip'

      - name: Install backend test dependencies (CPU only)
        run: |
          python -m pip install --upgrade pip
          # Studio's declared backend deps:
          pip install -r studio/backend/requirements/studio.txt
          # Extras that studio.txt does not list but the import chain needs
          # (python-multipart for FastAPI form/file uploads, sqlalchemy/cryptography
          # for the auth DB, yaml/jinja2 for utils.models.model_config, etc.):
          pip install \
            python-multipart aiofiles sqlalchemy cryptography \
            pyyaml jinja2 mammoth unpdf requests \
            'numpy<3' pytest pytest-asyncio httpx
          # Torch CPU + transformers are required by a chunk of the backend test
          # suite (gpu_selection, kv_cache_estimation, utils). CPU-only torch
          # keeps the install ~250 MB / ~1 min on a clean runner.
          pip install --index-url https://download.pytorch.org/whl/cpu 'torch>=2.4,<2.11'
          pip install 'transformers>=4.51,<5.5'

      - name: Backend tests
        working-directory: studio/backend
        # Locally validated against this dep set: 831 passed, 5 skipped, 35 deselected.
        # Deselections (all environment-specific, would never pass on a GPU-less
        # `ubuntu-latest` runner regardless of code correctness):
        #   - llama_cpp_load_progress_live: spawns a real llama.cpp process
        #   - TestGpuAutoSelection / TestPreSpawnGpuResolution / TestPerGpuFitGuardAllCounts:
        #     require live transformers config introspection on real GPUs
        #   - TestTransformersIntrospection: same
        #   - test_returns_cuda_when_cuda_available / test_calls_cuda_cache_when_cuda:
        #     assume CUDA-capable GPU
        run: |
          python -m pytest tests/ -q --tb=short \
            --ignore=tests/test_studio_api.py \
            -k 'not llama_cpp_load_progress_live and not TestGpuAutoSelection and not TestPreSpawnGpuResolution and not TestPerGpuFitGuardAllCounts and not TestTransformersIntrospection and not test_returns_cuda_when_cuda_available and not test_calls_cuda_cache_when_cuda'

  repo-cpu-tests:
@danielhanchen danielhanchen merged commit a56c959 into main May 6, 2026
16 checks passed
@danielhanchen danielhanchen deleted the ci/studio-pr-checks branch May 6, 2026 11:42
danielhanchen added a commit to ctsstc/unsloth that referenced this pull request May 7, 2026
The previous review-time `Sync .github/workflows with upstream
author branch` commit on this branch (3ea7f4f) removed 5 Studio CI
workflow files that the author actually had on their tip and
regressed release-desktop.yml from 9738 to 7914 bytes. Cause was
a stale fetch of the author ref in the orchestrator: the author
merged main (carrying the 5 newly-added workflow files from unslothai#5298)
into this branch DURING the review run, but Step 24c saw the
pre-merge snapshot and `git rm`'d files that were in fact present
on the author tip.

Restoring the affected files verbatim from `unslothai/unsloth:main`
(which is what they should match according to the author's earlier
merge commit fac2dc0):

  release-desktop.yml      (9738 bytes -- reverts -42 +3 regression)
  studio-backend-ci.yml    (8841 bytes -- new from PR unslothai#5298)
  studio-frontend-ci.yml   (4102 bytes -- new from PR unslothai#5298)
  studio-inference-smoke.yml (7277 bytes -- new from PR unslothai#5298)
  studio-tauri-smoke.yml   (3581 bytes -- new from PR unslothai#5298)
  wheel-smoke.yml          (4704 bytes -- new from PR unslothai#5298)

After this commit the diff vs main contains only the cursor-pointer
changes the PR is actually about.
danielhanchen added a commit that referenced this pull request May 7, 2026
The first run on unslothai/notebooks@main surfaces 27 errors + 6
warnings, all real (peft 0.19+ / torchao floor missing in 6 nb/
notebooks the previous template fixes did not reach, 14 git+ HEAD
installs in hand-tuned exception notebooks, 6 torch/torchcodec ABI
mismatches, 1 transformers/tokenizers --no-deps drift). Mirror the
same continue-on-error pattern PR #5298 used for biome:check on the
frontend so the count surfaces in the PR check UI without forcing
the backlog to be cleaned in the same change. Drop continue-on-error
once the count hits zero.