Add Studio PR-time CI: pin enforcement, frontend, backend, wheel smoke by danielhanchen · Pull Request #5298 · unslothai/unsloth

danielhanchen · 2026-05-06T04:15:59Z

Summary

The repo currently has no PR-time CI. Only release-desktop.yml (manual workflow_dispatch) and stale.yml (issue pinger) exist. studio/backend/tests/ already contains 35 test files (~860 tests collected) that never run automatically. studio/frontend/package.json has lint, typecheck, biome:check, and build scripts that are not gated on PRs either. That gap is what let 2026.5.1 ship with the broken Studio chat-history bundle.

This PR adds four ubuntu-latest workflows. All CPU-only, free for the public repo, and validated locally before push.

`studio-pin-enforce.yml`

Greps studio/frontend/package.json for caret/tilde ranges on the @assistant-ui/* surface and assistant-stream. Blocks the exact regression vector that produced 2026.5.1 (^0.12.19 resolving to a breaking 0.12.28). About 10 lines.

`studio-frontend-ci.yml`

npm ci (strict lockfile), tree-clean check after, typecheck, vite build, bundle grep for the Studio unstable_Provider: call site (<= 3 hits = OK, >= 4 = the 2026.5.1 regression), 75 MB dist/ budget. biome:check runs as continue-on-error: true because the codebase currently has accumulated drift (~470 errors, ~1650 warnings); the count surfaces in the PR check UI without forcing a fleet-wide cleanup. Drop the bypass once the count is zero.

`studio-backend-ci.yml`

Runs the existing studio/backend/tests/ suite on Python 3.10 / 3.11 / 3.12. Excludes test_studio_api.py (live model + GGUF download, too heavy for free runners) and llama_cpp_load_progress_live (spawns a real llama.cpp process). ruff check studio/backend runs non-blocking until the existing drift is cleared.

`wheel-smoke.yml`

Builds the wheel via python -m build after the frontend build, then verifies the produced wheel:

- ships studio/frontend/package-lock.json
- ships studio/frontend/dist/index.html
- does NOT ship studio/frontend/node_modules/
- does NOT ship studio/frontend/bun.lock
- main JS bundle has < 4 unstable_Provider: hits

Then installs the wheel into a fresh venv with a lightweight dep set and imports studio.backend.main. This is the single workflow that would have blocked 2026.5.1 before twine upload.

Each workflow has concurrency cancellation on the same ref.

What this catches

- Any future PR that re-introduces a caret on the assistant-ui surface (the 2026.5.1 root cause).
- package.json and package-lock.json drifting out of sync (the build-time install vector for 2026.5.1).
- The Studio chat-history regression at the bundle level on the frontend job, and at the wheel level on the wheel-smoke job, before twine upload.
- studio/frontend/dist/ accidentally growing past 75 MB (e.g. unused deps like next ^16.1.6 that the bundle has been carrying).
- The 860 backend tests that have been written but were never running.
- Studio backend imports breaking on the published wheel under fresh-venv conditions (caught the node_modules shipping in 2026.5.1).

Local validation summary

| check | result |
| --- | --- |
| pin enforcement grep | PASS |
| frontend `npm ci` (1083 packages, 16 s) | PASS, tree clean |
| frontend `typecheck` | PASS |
| frontend `vite build` (2.8 s, dist=48 MB, main bundle 3.27 MB) | PASS |
| bundle `unstable_Provider:` grep | PASS, hits=1 |
| bundle size budget (75 MB) | PASS, 48 MB |
| `biome:check` | FAIL (~470 errors); reported as non-blocking |
| backend pytest with surgical filter | 861 pass, 4 skipped, 5 deselected |
| `python -m build` (13 s) | PASS, produced `unsloth-2026.5.2-py3-none-any.whl` |
| wheel content sanity | all 5 checks PASS |
| `from studio.backend.main import app` from installed wheel | PASS |

Test plan

- After merge, watch the first PR that touches studio/frontend/** to confirm the frontend workflow runs end-to-end on ubuntu-latest.
- After merge, watch a PR that touches studio/backend/** to confirm the backend matrix runs across Python 3.10 / 3.11 / 3.12.
- After merge, watch a PR that touches pyproject.toml to confirm wheel-smoke.yml runs.
- Open a follow-up PR that adds a caret range back on @assistant-ui/react to confirm studio-pin-enforce.yml blocks it. Close without merging.

Follow-ups (not in this PR)

- studio-coldstart.yml (Python matrix boots Studio + curls /api/health). Validated locally; held out of this PR until first cold-start run is observed in CI, since the dep installation is the only piece I haven't end-to-end timed on a fresh GitHub runner.
- Drive the biome count to zero, then drop continue-on-error.
- Drive the ruff count to zero on studio/backend, then drop || true.
- Move dependabot config off bun to npm for studio/frontend to match the new package-lock.json. (Or commit a bun.lock and keep both, and pick one as the source of truth in CI.)

The repo currently has no PR-time CI; only release-desktop.yml (manual) and stale.yml (issue pinger). studio/backend/tests/ has 35 test files (~860 tests collected) that never run automatically. Frontend lint/typecheck/build scripts exist in package.json but are not gated on PRs either. This is the gap that let 2026.5.1 ship with the broken Studio chat-history bundle.

Adds four ubuntu-latest workflows, all CPU-only and free for public repos:

studio-pin-enforce.yml
Greps studio/frontend/package.json for caret/tilde ranges on the @assistant-ui surface (and assistant-stream). Blocks the exact regression vector that produced 2026.5.1 (^0.12.19 resolving to a breaking 0.12.28).

studio-frontend-ci.yml
npm ci (strict lockfile), tree-clean check after, typecheck, vite build, bundle grep for the Studio unstable_Provider call site (<= 3 hits = OK, >= 4 = the 2026.5.1 regression), 75 MB dist budget, biome non-blocking. Uploads dist on failure.

studio-backend-ci.yml
Runs the existing studio/backend/tests/ suite on Python 3.10/3.11/3.12. Excludes test_studio_api.py (live model + GGUF download) and llama_cpp_load_progress_live (spawns a real llama.cpp). Local run on this branch: 861 pass, 4 skipped, 5 deselected. ruff non-blocking.

wheel-smoke.yml
python -m build, then verifies the produced wheel:
- ships studio/frontend/package-lock.json
- ships studio/frontend/dist/index.html
- does NOT ship studio/frontend/node_modules/
- does NOT ship studio/frontend/bun.lock
- main JS bundle has < 4 unstable_Provider hits

Then installs the wheel into a fresh venv with a lightweight dep set and imports studio.backend.main. Locally validated against the wheel built from this branch.

Each workflow has concurrency cancellation on the same ref. biome and ruff are gated as non-blocking until the existing accumulated drift is cleared (~470 biome errors today); remove the bypass in a follow-up.

Notes verified locally:
- pin enforcement: PASS (carets dropped on this branch)
- frontend npm ci -> typecheck -> build -> grep -> budget: PASS
- bundle: 48 MB, hits=1
- backend pytest: 861 pass, 1 GPU-pollution failure not reproducible on GPU-less runners (won't reproduce on ubuntu-latest)
- wheel build: 13s, produces unsloth-2026.5.2-py3-none-any.whl
- wheel content sanity: all five checks PASS

gemini-code-assist · 2026-05-06T04:16:04Z

Note

Gemini is unable to generate a review for this pull request due to the file types involved not being currently supported.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ccbf4dee79

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-06T04:19:56Z

+            'numpy<3' fastapi uvicorn 'pydantic>=2' pydantic-settings \
+            typer pyjwt structlog python-multipart aiofiles sqlalchemy \
+            cryptography mammoth unpdf matplotlib pandas
+          /tmp/v/bin/pip install --no-deps dist/unsloth-*.whl


Install declared deps before fresh-venv import

In this fresh venv, pip install --no-deps dist/unsloth-*.whl explicitly skips the package dependencies (pip install --help says --no-deps means "Don't install package dependencies"), but the hand-written lightweight list above omits declared/runtime imports needed by studio.backend.main (for example pyyaml from pyproject.toml, reached through routes.training -> utils.models.model_config import yaml, and httpx from routes.inference). As written, this new wheel-smoke job will fail on PRs/pushes at the import step instead of providing a useful smoke check; install the project/Studio requirements or include the missing deps before using --no-deps.
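One way to surface this kind of gap before the import step is to probe the venv for the top-level modules the import chain needs. The following is a minimal sketch, not part of this PR: `missing_modules` and the `REQUIRED` list are hypothetical, and only `yaml` and `httpx` are taken from the comment above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            # find_spec locates a top-level module without importing it.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Broken or half-initialised modules are treated as missing.
            found = False
        if not found:
            missing.append(name)
    return missing


# Hypothetical list; the real import chain of studio.backend.main may need more.
REQUIRED = ["yaml", "httpx"]

if __name__ == "__main__":
    gaps = missing_modules(REQUIRED)
    if gaps:
        print("missing modules:", ", ".join(gaps))
    else:
        print("all required modules are importable")
```

Run with the venv's interpreter (for example `/tmp/v/bin/python`), this reports the missing names up front instead of failing on the first `ModuleNotFoundError` during the smoke import.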


First CI run on PR #5298 surfaced two real gaps:

1. pytest collection failed at `import yaml` in utils/models/model_config. Locally my workspace venv had pyyaml from a transitive; CI's clean Python 3.10/3.11/3.12 didn't, so collection hit ModuleNotFoundError on the very first test module. Same blew up the wheel-smoke `from studio.backend.main import app` step.
2. Once the import chain was complete, ~9 tests still failed because they exercise GPU-only paths or live transformers introspection that can't run on a GPU-less `ubuntu-latest` runner regardless of code correctness:
   - TestGpuAutoSelection
   - TestPreSpawnGpuResolution
   - TestPerGpuFitGuardAllCounts
   - TestTransformersIntrospection
   - test_returns_cuda_when_cuda_available
   - test_calls_cuda_cache_when_cuda

Fix:

- Backend CI installs `studio/backend/requirements/studio.txt` (the declared backend dep set) + the extras the import chain needs but studio.txt omits (python-multipart, sqlalchemy, cryptography, pyyaml, jinja2, mammoth, unpdf, requests, etc.) + torch CPU wheel + transformers.
- Refine the pytest -k filter to deselect the GPU/introspection-bound classes by name. Deselections are commented inline with the reason.
- wheel-smoke uses the same dep set so the import smoke matches.

Locally validated against the freshly-built unsloth-2026.5.2 wheel: 831 passed, 5 skipped, 35 deselected, 0 failed in 47s. Studio backend imports cleanly in a fresh venv after the wheel install.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4180beb0d1


chatgpt-codex-connector · 2026-05-06T04:38:00Z

+        run: |
+          set -e
+          JS=$(ls dist/assets/index-*.js | head -1)
+          HITS=$(grep -c 'unstable_Provider:' "$JS" || echo 0)


Count bundle occurrences instead of matching lines

In the frontend workflow, the regression gate uses grep -c, which I checked via grep --help counts selected lines, not occurrences. Vite's production bundle is minified, so if the bad bundle has four unstable_Provider: tokens on the same generated line, HITS is 1 and this check passes even though the scenario it is meant to block is present; count occurrences instead (for example with grep -o ... | wc -l) before comparing to 3.
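The difference between the two counting modes can be illustrated with a short sketch. The `minified` string below is a made-up stand-in for a one-line Vite bundle, and the two helper names are hypothetical; they only mimic what `grep -c` and `grep -o ... | wc -l` report.

```python
def count_matching_lines(text: str, token: str) -> int:
    """Mimic `grep -c`: number of lines that contain the token at least once."""
    return sum(1 for line in text.splitlines() if token in line)


def count_occurrences(text: str, token: str) -> int:
    """Mimic `grep -o ... | wc -l`: number of non-overlapping occurrences."""
    return text.count(token)


# Hypothetical minified bundle: four call sites on a single generated line.
minified = (
    "a({unstable_Provider:w}),b({unstable_Provider:x}),"
    "c({unstable_Provider:y}),d({unstable_Provider:z})"
)

if __name__ == "__main__":
    token = "unstable_Provider:"
    print("matching lines:", count_matching_lines(minified, token))
    print("occurrences:   ", count_occurrences(minified, token))
```

With a threshold of 3, the line count would pass this input while the occurrence count would correctly fail it.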


YAML's | block-scalar fed the newlines verbatim into the -k argument and pytest rejected it as 'Wrong expression passed to -k'. Same logical filter on one line.
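A sketch of how such a filter can be assembled as a single line, assuming the deselections are kept as a list of names. The helper names are hypothetical, and the names in `DESELECTED` are taken from the earlier commit messages, not from the workflow file itself, so the real expression may differ.

```python
def build_k_expression(deselected):
    """Join test or class names into one single-line pytest -k expression."""
    return " and ".join(f"not {name}" for name in deselected)


def collapse_to_one_line(expression: str) -> str:
    """Fold a multi-line expression (e.g. from a YAML `|` block) onto one line."""
    return " ".join(expression.split())


# Names mentioned in this PR's commit messages; the actual filter may contain more.
DESELECTED = [
    "test_studio_api",
    "llama_cpp_load_progress_live",
    "TestGpuAutoSelection",
    "TestPreSpawnGpuResolution",
]

if __name__ == "__main__":
    expr = build_k_expression(DESELECTED)
    assert "\n" not in expr  # pytest rejects a -k expression containing newlines
    print(f'pytest -k "{expr}"')
```

In the workflow itself, a YAML folded scalar (`>-`) gives the same result without a helper, since it joins the lines with spaces.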

Adds a per-job 'name:' to all four workflows so the PR check list reads:

  Studio pin enforcement / @assistant-ui must be pinned exactly
  Studio frontend CI / Frontend build + bundle sanity
  Studio backend CI / Backend pytest (Python 3.10|3.11|3.12)
  Studio backend CI / Backend ruff lint (non-blocking)
  Wheel build + smoke / Wheel build + content sanity + import smoke

Instead of the default '<workflow> / <job-key>' which was opaque ('check', 'build', 'pytest (3.10)', 'ruff', 'wheel').

+    name: Backend ruff lint (non-blocking)
+    runs-on: ubuntu-latest
+    timeout-minutes: 5
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+          cache: 'pip'
+      - run: pip install ruff
+      - name: ruff check (non-blocking until accumulated drift is cleared)
+        run: ruff check studio/backend || true


+    name: Frontend build + bundle sanity
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    defaults:
+      run:
+        working-directory: studio/frontend
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '22'
+          cache: 'npm'
+          cache-dependency-path: studio/frontend/package-lock.json
+
+      - name: Lockfile must agree with package.json (npm ci is strict)
+        run: npm ci --no-fund --no-audit
+
+      - name: npm ci must not have modified the working tree
+        working-directory: ${{ github.workspace }}
+        run: |
+          if ! git diff --quiet -- studio/frontend; then
+            echo "::error::npm ci modified files; commit the updated lockfile"
+            git status -- studio/frontend
+            exit 1
+          fi
+
+      - name: Typecheck
+        run: npm run typecheck
+
+      - name: Build
+        run: npm run build
+
+      - name: Built bundle must not contain Studio's unstable_Provider call site
+        run: |
+          set -e
+          JS=$(ls dist/assets/index-*.js | head -1)
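+          # Count occurrences, not matching lines: a minified bundle can hold many tokens on one line.
+          HITS=$( { grep -o 'unstable_Provider:' "$JS" || true; } | wc -l)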
+          echo "main bundle: $JS"
+          echo "unstable_Provider: hits=$HITS (assistant-ui internals contribute up to 3)"
+          if [ "$HITS" -gt 3 ]; then
+            echo "::error file=studio/frontend/src/features/chat/runtime-provider.tsx::Studio bundle still passes unstable_Provider through useRemoteThreadListRuntime; this is the 2026.5.1 chat-history regression. Pass adapters directly into useLocalRuntime instead."
+            exit 1
+          fi
+
+      - name: Bundle size budget (75 MB)
+        run: |
+          SIZE=$(du -sb dist | cut -f1)
+          BUDGET=$((75 * 1024 * 1024))
+          echo "dist size: $SIZE bytes ($((SIZE/1024/1024)) MB), budget: $BUDGET bytes (75 MB)"
+          if [ "$SIZE" -gt "$BUDGET" ]; then
+            echo "::error::studio/frontend/dist/ exceeded the 75 MB budget. Drop dead deps (e.g. the unused next dep) or split chunks."
+            exit 1
+          fi
+
+      - name: Biome (non-blocking until accumulated drift is cleared)
+        continue-on-error: true
+        run: npm run biome:check
+
+      - name: Upload built dist on failure
+        if: failure()
+        uses: actions/upload-artifact@v4
+        with:
+          name: studio-frontend-dist
+          path: studio/frontend/dist
+          retention-days: 3


+    name: Wheel build + content sanity + import smoke
+    runs-on: ubuntu-latest
+    timeout-minutes: 15
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '22'
+          cache: 'npm'
+          cache-dependency-path: studio/frontend/package-lock.json
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - name: Build frontend
+        run: |
+          cd studio/frontend
+          npm ci --no-fund --no-audit
+          npm run build
+
+      - name: Build wheel + sdist
+        run: |
+          python -m pip install --upgrade pip build
+          rm -rf dist build ./*.egg-info
+          python -m build
+
+      - name: Wheel content sanity
+        run: |
+          python - <<'PY'
+          import zipfile, glob, sys
+          w = glob.glob("dist/unsloth-*.whl")
+          if not w:
+              print("FAIL: no wheel produced"); sys.exit(2)
+          w = w[0]
+          print(f"wheel: {w}")
+          with zipfile.ZipFile(w) as z:
+              n = z.namelist()
+              checks = {
+                "lockfile shipped":      any(s.endswith("studio/frontend/package-lock.json") for s in n),
+                "frontend dist shipped": any(s.endswith("studio/frontend/dist/index.html")    for s in n),
+                "no node_modules":       not any("studio/frontend/node_modules/" in s for s in n),
+                "no bun.lock":           not any(s.endswith("studio/frontend/bun.lock")       for s in n),
+              }
+              js = [s for s in n
+                    if "studio/frontend/dist/assets/" in s
+                    and s.endswith(".js")
+                    and "/index-" in s]
+              if not js:
+                  print("FAIL: no main bundle index-*.js in wheel"); sys.exit(2)
+              data = z.read(js[0]).decode("utf-8", "replace")
+              hits = data.count("unstable_Provider:")
+              print(f"main bundle: {js[0]}")
+              print(f"unstable_Provider hits: {hits} (>=4 indicates 2026.5.1 regression)")
+              checks["bundle has no Studio unstable_Provider call site"] = (hits < 4)
+
+              print()
+              for k, v in checks.items():
+                  print(f"  [{'PASS' if v else 'FAIL'}] {k}")
+              sys.exit(0 if all(checks.values()) else 1)
+          PY
+
+      - name: Studio backend import smoke
+        # Imports `studio.backend.main:app` from the freshly-installed wheel in
+        # a clean venv. This catches the class of bug that 2026.5.1 shipped with:
+        # frontend dist missing, package-lock.json missing, or the wheel's Python
+        # source tree broken in a way that surfaces only at app construction time.
+        run: |
+          python -m venv /tmp/v
+          /tmp/v/bin/pip install --upgrade pip
+          /tmp/v/bin/pip install -r studio/backend/requirements/studio.txt
+          /tmp/v/bin/pip install \
+            python-multipart aiofiles sqlalchemy cryptography \
+            pyyaml jinja2 mammoth unpdf requests \
+            'numpy<3'
+          /tmp/v/bin/pip install --no-deps dist/unsloth-*.whl
+          # Run from /tmp so Python imports the installed package, not the source tree.
+          cd /tmp
+          /tmp/v/bin/python -c "from studio.backend.main import app; print('Studio backend OK:', app.title)"
+
+      - name: Upload wheel on failure
+        if: failure()
+        uses: actions/upload-artifact@v4
+        with:
+          name: unsloth-wheel
+          path: dist/
+          retention-days: 7


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b496be35d3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-06T05:02:19Z

+      - name: Forbid carets/tildes on @assistant-ui/* and assistant-stream
+        run: |
+          set -e
+          if grep -nE '"(@assistant-ui/[a-z-]+|assistant-stream)":[[:space:]]*"[\^~]' studio/frontend/package.json; then


Reject all non-exact assistant-ui ranges

This guard only matches leading ^ or ~, so a PR can still set @assistant-ui/react or assistant-stream to another non-exact npm range such as >=0.12.19, 0.12.x, *, or latest and the workflow will pass even though fresh installs can again resolve a different 0.x release. Since the job is named as exact pin enforcement and is meant to block the dependency drift that broke Studio, this should validate exact semver literals rather than just two range prefixes.
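A minimal sketch of what validating exact semver literals could look like, assuming the check reads `dependencies` and `devDependencies` from a parsed package.json. The function names and the sample data are hypothetical; only the package names and the `^0.12.19` / `0.12.28` versions come from this PR.

```python
import re

# An exact pin: MAJOR.MINOR.PATCH with optional prerelease and build metadata.
EXACT_VERSION = re.compile(
    r"\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?"
)


def is_exact_version(spec: str) -> bool:
    """True only for a plain semver literal, not a range, wildcard or dist-tag."""
    return EXACT_VERSION.fullmatch(spec) is not None


def is_guarded(name: str) -> bool:
    """Packages the pin check applies to."""
    return name.startswith("@assistant-ui/") or name == "assistant-stream"


def non_exact_pins(package_json: dict) -> dict:
    """Return {package: spec} for every guarded package that is not pinned exactly."""
    bad = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in package_json.get(section, {}).items():
            if is_guarded(name) and not is_exact_version(spec):
                bad[name] = spec
    return bad


if __name__ == "__main__":
    # Hypothetical package.json contents for illustration.
    sample = {
        "dependencies": {
            "@assistant-ui/react": "^0.12.19",
            "assistant-stream": "0.12.28",
            "react": "^19.0.0",
        }
    }
    print(non_exact_pins(sample))
```

Because the check is an allow-list of exact literals, specs such as `>=0.12.19`, `0.12.x`, `*` and `latest` are rejected without listing each range syntax.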


Verified locally: 831 backend tests pass under Python 3.13 with the same filter set used for 3.10 / 3.11 / 3.12.

Two new workflows. Both CPU-only, both free on `ubuntu-latest`.

studio-inference-smoke.yml
The only workflow we have that proves "Studio actually works", as opposed to "the bundle parses" or "the imports succeed":
- runs install.sh --local --no-torch (lean Studio install)
- downloads unsloth/gemma-4-E2B-it-GGUF UD-IQ3_XXS into actions/cache
- boots Studio in api-only mode
- logs in with the bootstrap password, changes it, re-logs
- POST /api/inference/load on the GGUF
- POST /api/inference/chat/completions and asserts a non-empty assistant response

Validated end-to-end locally on a fresh main install: model loaded, chat completion returned `Hello!` against the same GGUF the workflow uses.

studio-tauri-smoke.yml
PR-time variant of release-desktop.yml. Linux-only debug build (`tauri build --debug --no-bundle`) on ubuntu-22.04. Catches src-tauri Cargo.toml / Rust source breakage, tauri.conf.json drift, and frontend-distDir wiring. Pinned to the same Tauri CLI version (2.10.1) as release-desktop.yml so CLI bumps surface in CI before they break the release pipeline. Mac and Windows desktop builds stay manual via release-desktop.yml because they need code-signing secrets.

+    name: Studio boots, loads a GGUF, answers a chat completion
+    runs-on: ubuntu-latest
+    timeout-minutes: 25
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Linux dependencies for llama.cpp prebuilt
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y --no-install-recommends \
+            libcurl4-openssl-dev libssl-dev jq
+
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '22'
+          cache: 'npm'
+          cache-dependency-path: studio/frontend/package-lock.json
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+          cache: 'pip'
+
+      - name: Cache GGUF model file
+        id: cache-gguf
+        uses: actions/cache@v4
+        with:
+          path: gguf-cache
+          key: ${{ runner.os }}-gguf-${{ env.GGUF_REPO }}-${{ env.GGUF_FILE }}-v1
+
+      - name: Download GGUF if cache miss
+        if: steps.cache-gguf.outputs.cache-hit != 'true'
+        run: |
+          python -m pip install --upgrade huggingface_hub hf_transfer
+          mkdir -p gguf-cache
+          HF_HUB_ENABLE_HF_TRANSFER=1 \
+            huggingface-cli download "$GGUF_REPO" "$GGUF_FILE" --local-dir gguf-cache
+
+      - name: Install Studio (--local, --no-torch keeps the install lean)
+        run: bash install.sh --local --no-torch
+
+      - name: Reset auth + start Studio in the background
+        run: |
+          unsloth studio reset-password
+          mkdir -p logs
+          UNSLOTH_API_ONLY=1 unsloth studio -H 127.0.0.1 -p "$STUDIO_PORT" \
+            > logs/studio.log 2>&1 &
+          echo "STUDIO_PID=$!" >> "$GITHUB_ENV"
+
+      - name: Wait for /api/health
+        run: |
+          for i in $(seq 1 60); do
+            if curl -fs "http://127.0.0.1:${STUDIO_PORT}/api/health" > /tmp/health.json; then
+              echo "ready after ${i}s"
+              cat /tmp/health.json
+              jq -e '.status == "healthy"' /tmp/health.json
+              exit 0
+            fi
+            sleep 1
+          done
+          echo "Studio did not become healthy in 60s"
+          tail -200 logs/studio.log
+          exit 1
+
+      - name: Login + change bootstrap password
+        run: |
+          PW=$(cat ~/.unsloth/studio/auth/.bootstrap_password)
+          NEW="CIPasswordSmoke12345!"
+          TOKEN=$(curl -fs -X POST "http://127.0.0.1:${STUDIO_PORT}/api/auth/login" \
+            -H 'content-type: application/json' \
+            -d "{\"username\":\"unsloth\",\"password\":\"$PW\"}" | jq -r .access_token)
+          curl -fs -X POST "http://127.0.0.1:${STUDIO_PORT}/api/auth/change-password" \
+            -H "Authorization: Bearer $TOKEN" -H 'content-type: application/json' \
+            -d "{\"current_password\":\"$PW\",\"new_password\":\"$NEW\"}" > /dev/null
+          # Re-login to clear must_change_password flag.
+          NEW_TOKEN=$(curl -fs -X POST "http://127.0.0.1:${STUDIO_PORT}/api/auth/login" \
+            -H 'content-type: application/json' \
+            -d "{\"username\":\"unsloth\",\"password\":\"$NEW\"}" | jq -r .access_token)
+          echo "TOKEN=$NEW_TOKEN" >> "$GITHUB_ENV"
+
+      - name: Load the GGUF into Studio
+        run: |
+          GGUF_PATH="$GITHUB_WORKSPACE/gguf-cache/${GGUF_FILE}"
+          ls -lh "$GGUF_PATH"
+          curl -fs -X POST "http://127.0.0.1:${STUDIO_PORT}/api/inference/load" \
+            -H "Authorization: Bearer $TOKEN" -H 'content-type: application/json' \
+            --max-time 600 \
+            -d "{\"model_path\":\"$GGUF_PATH\",\"is_lora\":false,\"max_seq_length\":2048}" \
+            | jq '{status, display_name, is_gguf, context_length}'
+
+      - name: Send a chat completion + assert non-empty response
+        run: |
+          RESP=$(curl -fs -X POST "http://127.0.0.1:${STUDIO_PORT}/api/inference/chat/completions" \
+            -H "Authorization: Bearer $TOKEN" -H 'content-type: application/json' \
+            --max-time 900 \
+            -d '{
+              "messages":[{"role":"user","content":"Say hello in one short sentence."}],
+              "max_tokens":40,
+              "stream":false
+            }')
+          echo "raw response: $RESP"
+          CONTENT=$(echo "$RESP" | jq -r '.choices[0].message.content // empty')
+          echo "model response: $CONTENT"
+          if [ -z "$CONTENT" ]; then
+            echo "::error::Empty assistant response from Studio"
+            exit 1
+          fi
+
+      - name: Stop Studio
+        if: always()
+        run: |
+          kill "${STUDIO_PID}" || true
+          sleep 2
+          ss -tln | grep ":${STUDIO_PORT}" || true
+
+      - name: Upload Studio log on failure
+        if: failure()
+        uses: actions/upload-artifact@v4
+        with:
+          name: studio-inference-log
+          path: logs/studio.log
+          retention-days: 7


+    name: Tauri Linux debug build (no codesign)
+    runs-on: ubuntu-22.04
+    timeout-minutes: 25
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Linux native deps for Tauri / WebKit2GTK
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y \
+            libwebkit2gtk-4.1-dev libayatana-appindicator3-dev \
+            librsvg2-dev libxdo-dev libssl-dev patchelf
+
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '24'
+          cache: 'npm'
+          cache-dependency-path: studio/frontend/package-lock.json
+
+      - uses: dtolnay/rust-toolchain@stable
+
+      - uses: swatinem/rust-cache@v2
+        with:
+          workspaces: studio/src-tauri -> target
+
+      - name: Install pinned Tauri CLI (matches release-desktop.yml)
+        run: npm install --save-dev --prefix studio @tauri-apps/cli@2.10.1
+
+      - name: Verify pinned Tauri CLI version
+        run: |
+          out="$(npx --prefix studio tauri --version)"
+          echo "$out"
+          [ "$out" = "tauri-cli 2.10.1" ] || { echo "::error::expected tauri-cli 2.10.1, got $out"; exit 1; }
+
+      - name: Frontend build (npm ci, vite)
+        working-directory: studio/frontend
+        run: |
+          npm ci --no-fund --no-audit
+          npm run build
+          test -f dist/index.html
+
+      - name: Tauri debug build (Linux, no bundle, no codesign)
+        # `--debug` + `--no-bundle` keeps this lean: compiles the Rust crate,
+        # confirms the frontend dist is wired into Tauri, but skips the AppImage
+        # / .deb production. Code signing is irrelevant because we never produce
+        # a distributable artifact.
+        env:
+          TAURI_SIGNING_PRIVATE_KEY: ''
+          TAURI_SIGNING_PRIVATE_KEY_PASSWORD: ''
+        run: npx --prefix studio tauri build --debug --no-bundle
+
+      - name: Inspect produced binary
+        run: |
+          BIN=$(find studio/src-tauri/target/debug -maxdepth 1 -type f -executable 2>/dev/null \
+                | grep -Ev '\.(d|so|dylib|dll)$' \
+                | grep -Ev '/(deps|build|examples)$' \
+                | head -1)
+          echo "binary: $BIN"
+          if [ -z "$BIN" ]; then
+            echo "::error::Tauri debug binary not produced"
+            ls -la studio/src-tauri/target/debug/ || true
+            exit 1
+          fi
+          file "$BIN"
+          du -h "$BIN"
+
+      - uses: actions/upload-artifact@v4
+        if: failure()
+        with:
+          name: tauri-debug-build
+          path: |
+            studio/src-tauri/target/debug
+            studio/frontend/dist
+          retention-days: 3


huggingface_hub 1.13.0 dropped the huggingface-cli entrypoint. The replacement is the 'hf' CLI shipped with the same package. Same args, just s/huggingface-cli/hf/.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a049773f6c


chatgpt-codex-connector · 2026-05-06T05:22:43Z

+        run: |
+          python -m pip install --upgrade pip
+          # Studio's declared backend deps:
+          pip install -r studio/backend/requirements/studio.txt


Install the checked-out package before backend tests

When this job runs on a clean runner, the dependency step installs backend requirements but never installs the checked-out repo or adds the repo root to PYTHONPATH. The tests then run from studio/backend, so imports of repo-root modules such as unsloth_cli.commands.studio in studio/backend/tests/test_desktop_auth.py fail with ModuleNotFoundError instead of testing the PR; this also means unsloth/** changes are not exercised against the checkout. Install the local package/editable checkout or export the repo root on PYTHONPATH before pytest.


The inference-smoke job runs on ubuntu-latest (CPU-only, x86_64), which is exactly the host shape that should pick up ggml-org/llama.cpp's bin-ubuntu-x64.tar.gz prebuilt directly. If install.sh ever falls back to a source build on this runner, the studio/setup.sh routing has regressed and every CPU-only Linux user is paying a 3 minute compile cost again. Tee install.sh output to logs/install.log, then fail the job if the log contains "falling back to source build" or is missing the success marker "prebuilt installed and validated" / "prebuilt up to date and validated". Also include logs/install.log in the failure artifact so the prebuilt diagnostics are uploaded alongside studio.log when the job fails.
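The log gate described above can be sketched as a small function. The marker strings are the ones quoted in this commit message; the function name and return shape are hypothetical and are not the workflow's actual implementation.

```python
FALLBACK_MARKER = "falling back to source build"
SUCCESS_MARKERS = (
    "prebuilt installed and validated",
    "prebuilt up to date and validated",
)


def check_install_log(log_text: str) -> list:
    """Return a list of problems found in the install log; empty means the gate passes."""
    problems = []
    if FALLBACK_MARKER in log_text:
        problems.append("install fell back to a llama.cpp source build")
    if not any(marker in log_text for marker in SUCCESS_MARKERS):
        problems.append("no prebuilt success marker found in the log")
    return problems


if __name__ == "__main__":
    import sys
    from pathlib import Path

    # Assumes the install step teed its output to logs/install.log.
    log_path = Path("logs/install.log")
    text = log_path.read_text(errors="replace") if log_path.exists() else ""
    issues = check_install_log(text)
    for issue in issues:
        print(f"::error::{issue}")
    sys.exit(1 if issues else 0)
```

Checking both the negative and the positive marker means a silently truncated or empty log also fails the job, instead of passing because the fallback string happens to be absent.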

danielhanchen · 2026-05-06T06:13:23Z

Pushed a small follow-up commit on this branch that hooks the new studio-inference-smoke.yml job into the routing fix tracked in #5302:

logs/install.log is now teed during the install step and the job fails if it contains falling back to source build or is missing the prebuilt-success marker. ubuntu-latest is CPU-only x86_64, so once #5302 lands the runner picks up ggml-org/llama.cpp's bin-ubuntu-x64.tar.gz directly. If anyone reverts that routing the job goes red here instead of just running ~3 minutes slower.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8c992c7b1


chatgpt-codex-connector · 2026-05-06T06:15:24Z

+        env:
+          TAURI_SIGNING_PRIVATE_KEY: ''
+          TAURI_SIGNING_PRIVATE_KEY_PASSWORD: ''
+        run: npx --prefix studio tauri build --debug --no-bundle


Run the Tauri build from the Studio project directory

When this workflow runs, this step is executed from the repository root, but npx --prefix studio only points npm at the studio prefix for resolving the binary; it does not change the process working directory. The existing release workflow handles this by setting projectPath: studio before invoking npx --prefix . tauri, and this repo has studio/src-tauri/tauri.conf.json rather than a root src-tauri, so this smoke job will fail to recognize the Tauri project on every PR it is meant to gate. Set working-directory: studio or cd studio before running the build.

Drops the Gemma 4 E2B GGUF (~2.3 GB) for unsloth/Qwen3.5-2B-GGUF (UD-IQ3_XXS, ~890 MiB). Cache-miss download is roughly a third of what it was, and CPU inference on ubuntu-latest finishes well inside the 25 minute job budget. Verified locally: load via /api/inference/load returns status=loaded, is_gguf=true, supports_reasoning=true, supports_tools=true; chat completion returns a non-empty assistant message ("Hello!").

…rkflow

The "@assistant-ui must be pinned exactly" check was its own ~7 second workflow, doing a single grep on studio/frontend/package.json. Move it into studio-frontend-ci.yml as a pre-install step (right after checkout, before any node setup, so a violation fails fast). One fewer top-level check row on every PR, same coverage.

Add a FIXME so this step is dropped once @assistant-ui/* and assistant-stream leave 0.x: on 1.x, caret ranges are conventional and this becomes overzealous.
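For readers who want to see the shape of the pin check, here is a rough sketch. The workflow itself uses a grep; this version parses the JSON instead, so it is an equivalent illustration rather than the shipped step. The function names and the choice of `dependencies` / `devDependencies` sections are assumptions.

```python
import json


def is_pinned_surface(name: str) -> bool:
    """True for packages the check covers: @assistant-ui/* and assistant-stream."""
    return name.startswith("@assistant-ui/") or name == "assistant-stream"


def find_unpinned(package_json_text: str) -> list[str]:
    """List covered packages whose version spec starts with a caret or tilde."""
    manifest = json.loads(package_json_text)
    offenders = []
    # Assumption: only these two sections are inspected.
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if is_pinned_surface(name) and spec.startswith(("^", "~")):
                offenders.append(f"{name}@{spec}")
    return sorted(offenders)
```

A non-empty result would correspond to the failing case, such as the ^0.12.19 range mentioned in the PR summary.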

The top-level tests/ tree was previously not run anywhere. 23 of its files are CPU-friendly with the right harness: pure-Python helpers, ast walks, installer logic, and CLI shape tests. Locally validated: 302 passed, 9 skipped, 12 deselected in ~7 seconds on Python 3.12.

Three pieces:

1. tests/conftest.py -- GPU-free harness, mirrors the conftest landed in unslothai/unsloth-zoo PR #624. Pre-loads unsloth_zoo.device_type and unsloth.device_type under a temporarily-mocked torch.cuda.is_available() so each module's @cache permanently captures "cuda" and the import chain succeeds on a CPU runner. Also stubs torch.cuda.get_device_capability / is_bf16_supported / mem_get_info, which unsloth/__init__.py and unsloth_zoo.temporary_patches probe at import time when DEVICE_TYPE == "cuda". On a real accelerator the harness is skipped and detection runs normally.

2. Two existing tests were leaking sys.modules state across the session because they injected stubs without an __spec__ and without restoration:
   - tests/test_raw_text.py shoved a "datasets" stub into sys.modules. transformers' import_utils later did importlib.util.find_spec("datasets") and got ValueError: datasets.__spec__ is None.
   - tests/python/test_fast_sentence_transformer_redirect_lifecycle.py shoved "transformers", "sentence_transformers", and "sentence_transformers.models" stubs in. Subsequent tests that did `import transformers` got the non-package stub.

   Fix: set __spec__ on stubs, plus an autouse fixture in the sentence-transformer test file that restores the three keys after each test.

3. .github/workflows/studio-backend-ci.yml gains a third job, `Repo tests (CPU)`, that installs the same dep set as the backend-pytest matrix (Python 3.12 only -- the tests are version-independent), exports PYTHONPATH=studio so tests/python/* can import install_python_stack, and runs the 23-file subset above with `-m 'not server and not e2e'`.
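The sys.modules fix in item 2 can be illustrated with a small sketch. The commit uses an autouse pytest fixture; the version below uses a context manager so it stays self-contained, and the helper names (`make_stub`, `stubbed_modules`, `spec_is_resolvable`) are hypothetical.

```python
import importlib.machinery
import importlib.util
import sys
import types
from contextlib import contextmanager


def make_stub(name: str) -> types.ModuleType:
    """Create a stub module with a real __spec__ so find_spec() does not raise."""
    stub = types.ModuleType(name)
    stub.__spec__ = importlib.machinery.ModuleSpec(name, loader=None)
    return stub


@contextmanager
def stubbed_modules(*names: str):
    """Install stubs in sys.modules and restore the previous entries on exit."""
    missing = object()
    saved = {name: sys.modules.get(name, missing) for name in names}
    try:
        for name in names:
            sys.modules[name] = make_stub(name)
        yield
    finally:
        for name, original in saved.items():
            if original is missing:
                sys.modules.pop(name, None)  # the key did not exist before
            else:
                sys.modules[name] = original  # put the real module back


def spec_is_resolvable(name: str) -> bool:
    """Mirror the probe style described above: find_spec on an installed name."""
    return importlib.util.find_spec(name) is not None
```

With a stub created via a bare `types.ModuleType(name)`, `__spec__` is None and `importlib.util.find_spec` raises ValueError, which matches the failure quoted above.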

+    name: Repo tests (CPU)
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+          cache: 'pip'
+
+      - name: Install deps (shared shape with backend pytest job)
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r studio/backend/requirements/studio.txt
+          pip install \
+            python-multipart aiofiles sqlalchemy cryptography \
+            pyyaml jinja2 mammoth unpdf requests typer \
+            'numpy<3' pytest pytest-asyncio httpx
+          pip install --index-url https://download.pytorch.org/whl/cpu 'torch>=2.4,<2.11'
+          pip install 'transformers>=4.51,<5.5'
+          pip install -e . --no-deps
+
+      - name: Repo tests (CPU)
+        env:
+          # tests/python/* import install_python_stack from studio/.
+          PYTHONPATH: ${{ github.workspace }}/studio
+          # Skip lazy compilation work the unsloth import chain wants to
+          # do at import time on a real GPU.
+          UNSLOTH_COMPILE_DISABLE: '1'
+        run: |
+          python -m pytest -q --tb=short -m 'not server and not e2e' \
+            tests/test_loader_glob_skip.py \
+            tests/test_raw_text.py \
+            tests/test_cli_export_unpacking.py \
+            tests/test_studio_install_workspace_guard.py \
+            tests/test_studio_root_resilience.py \
+            tests/python/test_install_python_stack.py \
+            tests/python/test_no_torch_filtering.py \
+            tests/python/test_studio_import_no_torch.py \
+            tests/python/test_tokenizers_and_torch_constraint.py \
+            tests/python/test_unsloth_run_tool_policy_resolver.py \
+            tests/python/test_flash_attn_install_python_stack.py \
+            tests/python/test_cross_platform_parity.py \
+            tests/python/test_fast_sentence_transformer_redirect_lifecycle.py \
+            tests/studio/test_chat_preset_builtin_invariants.py \
+            tests/studio/test_cli_studio_defaults.py \
+            tests/studio/test_studio_gguf_export_script_pin.py \
+            tests/studio/test_cli_repo_variant.py \
+            tests/studio/test_cli_run_alias.py \
+            tests/studio/test_llama_cpp_wall_clock_cap.py \
+            tests/studio/test_cancel_atomicity.py \
+            tests/studio/test_cancel_id_wiring.py \
+            tests/studio/test_stream_cancel_registration_timing.py \
+            tests/studio/test_studio_text_descender_clipping.py
+
+  ruff:


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3da1eeaed6

chatgpt-codex-connector · 2026-05-06T07:29:36Z

+            'numpy<3' pytest pytest-asyncio httpx
+          pip install --index-url https://download.pytorch.org/whl/cpu 'torch>=2.4,<2.11'
+          pip install 'transformers>=4.51,<5.5'
+          pip install -e . --no-deps


Install unsloth_zoo before running selected repo tests

The repo-cpu-tests job installs this repo with pip install -e . --no-deps, so declared dependencies are intentionally skipped, but the same job explicitly runs tests/studio/test_studio_gguf_export_script_pin.py, which imports unsloth_zoo.llama_cpp directly. On a clean runner this module is not present, so the job fails with ModuleNotFoundError before validating the PR changes; install unsloth_zoo (or stop skipping deps) before this test list.

chatgpt-codex-connector · 2026-05-06T07:29:36Z

+        import torch
+
+        _orig_is_avail = torch.cuda.is_available
+        torch.cuda.is_available = lambda: True  # type: ignore[assignment]


Handle missing torch in CPU test harness preload

The new root tests/conftest.py is executed at collection time and, on non-accelerator hosts, _preload_device_type("unsloth") unconditionally imports torch; when torch is not installed (a common lightweight/no-torch test setup), pytest now aborts during collection with ModuleNotFoundError before any tests run. This regresses the intended CPU-safe harness behavior and should fall back to the stub path when torch is unavailable.

The CPU job at run 25422050018 broke at conftest collection: the preload of unsloth.device_type pulled in `from unsloth_zoo.utils import Version` and ubuntu-latest didn't have unsloth_zoo on the path because it is an optional dep of unsloth.

Two fixes:

1. Install unsloth_zoo>=2026.5.1 alongside the other deps in the Repo tests (CPU) job (it's also what unsloth's optional `huggingface` extra pins).

2. Wrap the body of _preload_device_type in conftest.py in a try/except so any import failure (missing prereq, broken module, etc.) cleanly returns False instead of aborting the entire collection. The caller already falls back to the stub device_type module on False, so the net behavior is "best effort: real device_type if possible, stub otherwise" instead of "abort the test session".
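The "best effort" pattern from fix 2 looks roughly like this. The real helper is `_preload_device_type` in tests/conftest.py, whose body is not shown here; `preload_module` below is a hypothetical stand-in that only demonstrates the try/except shape.

```python
import importlib


def preload_module(module_name: str) -> bool:
    """Try to import a module; report failure instead of raising.

    Returns True when the import succeeds and False on any import-time
    error, so a caller can fall back to a stub rather than abort.
    """
    try:
        importlib.import_module(module_name)
    except Exception:
        # Missing prerequisite, broken module, and so on: signal "use the stub".
        return False
    return True
```

Catching `Exception` broadly is deliberate in this sketch, since the commit message says any import failure should return False.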

When DEVICE_COUNT is 0 (CPU host: no visible NVIDIA / AMD / Intel GPU) the dict comprehension {... for i in range(0)} was empty and the subsequent max(_CUDA_STREAMS.keys()) raised ValueError: max() iterable argument is empty during module import. That made unsloth.kernels.utils unimportable on any CPU runner, which in turn blocked all of tests/saving/**, three top-level tests/test_*.py, and tests/qlora/test_unsloth_qlora_train_and_merge.py from even collecting on CPU CI.

Wrap the per-device-index dict comprehension and max() machinery in a DEVICE_COUNT > 0 guard. When DEVICE_COUNT is 0, fall back to empty containers (CUDA_STREAMS = (), WEIGHT_BUFFERS = [], ABSMAX_BUFFERS = []). The consumer functions further down in this module index these arrays by device_index but only during real GPU work, so the empty fallbacks never get touched on a CPU host.

GPU-safety verified locally: with 8 visible CUDA devices, CUDA_STREAMS has 8 entries (identical to before this PR). With CUDA_VISIBLE_DEVICES="" the module imports cleanly, CUDA_STREAMS is (), and the previously blocked tests now collect (test_get_model_name passes 38 subtests, test_resolve_model_class passes 9, test_model_registry collects all 8 parametrizations).

Same shape applied to the DEVICE_TYPE == "xpu" branch for symmetry.
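A stripped-down illustration of the guard follows. It is not the code in unsloth/kernels/utils.py: the function name, the `get_stream` callback, and the table layout are assumptions chosen so the example runs without torch.

```python
def build_stream_table(device_count, get_stream=lambda index: f"stream-{index}"):
    """Build a per-device table, falling back to an empty tuple with no devices."""
    if device_count > 0:
        streams = {index: get_stream(index) for index in range(device_count)}
        # max() is only safe here because the dict is known to be non-empty.
        table = [None] * (max(streams.keys()) + 1)
        for index, stream in streams.items():
            table[index] = stream
        return tuple(table)
    # CPU host: nothing to index, and max() on an empty dict would raise.
    return ()
```

Without the `device_count > 0` guard, the `max(streams.keys())` call is the line that raises ValueError on an empty dict.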

Three changes, locally validated end-to-end (779 passed, 11 skipped, 23 deselected, 0 failed across all three steps):

1. Repo tests (CPU, auto-discovered): replace the explicit 23-file list with `pytest tests/` plus a small set of `--ignore` and `--deselect` flags. New tests under tests/python, tests/studio (excluding the two state-sensitive files), and top-level tests/test_*.py are picked up automatically with no workflow edit.

   --ignore covers:
   - tests/qlora and tests/saving: GPU-bound by design
   - tests/utils: helpers folder, not tests
   - tests/sh: shell suite handled in its own step
   - two state-polluting hardware-spoof files (next step)

   -m 'not server and not e2e': honours markers already declared in tests/python/conftest.py

   --deselect: test_model_registration / test_all_model_registration hit huggingface_hub live; they belong on a network job

2. Hardware-spoof tests (state-sensitive, run in isolation): tests/studio/test_hardware_dispatch_matrix.py and tests/studio/test_is_mlx_dispatch_gate.py mutate module globals in studio.backend.utils.hardware.hardware (IS_ROCM, DEVICE) via their spoof fixtures, and the leak crosses file boundaries. Running them in their own pytest invocation avoids polluting the main sweep. Both pass cleanly in isolation: 28 passed, 1 skipped.

3. Shell installer tests: explicitly enumerated subset that does not depend on install.ps1 layout (test_install_host_defaults.sh has drifted; that's a separate followup).

Test fixes folded in to keep the run green:

- tests/studio/install/test_rocm_support.py::TestAmdGpuMonitoring::test_amd_primary_gpu_with_mock now clears HIP/ROCR/CUDA_VISIBLE_DEVICES via monkeypatch so _first_visible_amd_gpu_id() does not short-circuit when the runner sets CUDA_VISIBLE_DEVICES="" to suppress CUDA.
- tests/studio/test_hardware_dispatch_matrix.py::spoof_hardware fixture now stubs torch.cuda.get_device_properties when cuda_available is True so detect_hardware()'s device_name probe does not call into _cuda_init() on a CPU runner.

Run 25430652224 collected three test modules that import unsloth and crashed at unsloth_zoo/vision_utils.py:68 with ModuleNotFoundError: No module named 'torchvision'.

unsloth_zoo.vision_utils unconditionally imports torchvision at module scope, and unsloth.models._utils pulls vision_utils in. The Repo tests (CPU) job installed torch from the CPU index but not torchvision, so any test that imports unsloth.models.* failed at collection.

Add torchvision<0.26 to the same pip install --index-url https://download.pytorch.org/whl/cpu line.

Run 25430982243 collected three test modules that import unsloth and crashed at unsloth/models/_utils.py:1166 with ModuleNotFoundError: No module named 'bitsandbytes'.

The bnb import there is unconditional. Recent bnb versions (>=0.45) ship a CPU build so the wheel installs on a free Linux runner and the import resolves; the kernels still raise on use but the module collects, which is enough for these CPU tests.

Add 'bitsandbytes>=0.45' to the Repo tests (CPU) deps.

Workflow renames (top-level `name:` keys; affects PR check rows):

    Studio backend CI      -> Backend CI
    Studio frontend CI     -> Frontend CI
    Studio inference smoke -> Studio GGUF CI
    Studio Tauri smoke     -> Studio Tauri CI
    Wheel build + smoke    -> Wheel CI

Backend CI's matrix job goes from "Backend pytest (Python 3.10)" to just "(Python 3.10)" so the GitHub UI row reads "Backend CI / (Python 3.10)" rather than the old verbose form.

Production guard for CPU torch (run 25431126138): unsloth/kernels/utils.py:165 was an unconditional

    _gpu_getCurrentRawStream = torch._C._cuda_getCurrentRawStream

which raised AttributeError on a CPU-only torch wheel because the compiled CUDA backend is absent. Three test modules (test_get_model_name, test_model_registry, test_resolve_model_class) crashed at collection because their import chain reaches this line.

Add a hasattr probe: when torch is built without CUDA, fall through to a no-op binding that returns 0. _get_tensor_stream is only invoked during real GPU work, so the no-op is never executed on a CPU host.

GPU-safety verified locally: with 8 visible CUDA devices the binding still resolves to the real torch._C._cuda_getCurrentRawStream (behaviour identical to before this PR). The XPU branch is untouched.
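The hasattr probe can be sketched without importing torch by passing in a stand-in for `torch._C`. The function name `resolve_raw_stream_getter` and the no-op's argument handling are assumptions; only the attribute name and the "returns 0" fallback come from the commit message.

```python
from types import SimpleNamespace


def resolve_raw_stream_getter(torch_c):
    """Return the CUDA raw-stream getter if the build has one, else a no-op."""
    if hasattr(torch_c, "_cuda_getCurrentRawStream"):
        return torch_c._cuda_getCurrentRawStream
    # CPU-only build: the symbol is absent, so bind a no-op that returns 0.
    return lambda *args, **kwargs: 0


# Stand-ins for torch._C on a CPU-only wheel and on a CUDA build.
cpu_build = SimpleNamespace()
cuda_build = SimpleNamespace(_cuda_getCurrentRawStream=lambda index: 1000 + index)

cpu_getter = resolve_raw_stream_getter(cpu_build)
cuda_getter = resolve_raw_stream_getter(cuda_build)
```

Resolving the binding once at import time keeps the hot path unchanged on GPU hosts, which matches the GPU-safety note above.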

+    name: (Python ${{ matrix.python }})
+    runs-on: ubuntu-latest
+    timeout-minutes: 15
+    strategy:
+      fail-fast: false
+      matrix:
+        python: ['3.10', '3.11', '3.12', '3.13']
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '${{ matrix.python }}'
+          cache: 'pip'
+
+      - name: Install backend test dependencies (CPU only)
+        run: |
+          python -m pip install --upgrade pip
+          # Studio's declared backend deps:
+          pip install -r studio/backend/requirements/studio.txt
+          # Extras that studio.txt does not list but the import chain needs
+          # (python-multipart for FastAPI form/file uploads, sqlalchemy/cryptography
+          #  for the auth DB, yaml/jinja2 for utils.models.model_config, etc.):
+          pip install \
+            python-multipart aiofiles sqlalchemy cryptography \
+            pyyaml jinja2 mammoth unpdf requests \
+            'numpy<3' pytest pytest-asyncio httpx
+          # Torch CPU + transformers are required by a chunk of the backend test
+          # suite (gpu_selection, kv_cache_estimation, utils). CPU-only torch
+          # keeps the install ~250 MB / ~1 min on a clean runner.
+          pip install --index-url https://download.pytorch.org/whl/cpu 'torch>=2.4,<2.11'
+          pip install 'transformers>=4.51,<5.5'
+
+      - name: Backend tests
+        working-directory: studio/backend
+        # Locally validated against this dep set: 831 passed, 5 skipped, 35 deselected.
+        # Deselections (all environment-specific, would never pass on a GPU-less
+        # `ubuntu-latest` runner regardless of code correctness):
+        #   - llama_cpp_load_progress_live: spawns a real llama.cpp process
+        #   - TestGpuAutoSelection / TestPreSpawnGpuResolution / TestPerGpuFitGuardAllCounts:
+        #       require live transformers config introspection on real GPUs
+        #   - TestTransformersIntrospection: same
+        #   - test_returns_cuda_when_cuda_available / test_calls_cuda_cache_when_cuda:
+        #       assume CUDA-capable GPU
+        run: |
+          python -m pytest tests/ -q --tb=short \
+            --ignore=tests/test_studio_api.py \
+            -k 'not llama_cpp_load_progress_live and not TestGpuAutoSelection and not TestPreSpawnGpuResolution and not TestPerGpuFitGuardAllCounts and not TestTransformersIntrospection and not test_returns_cuda_when_cuda_available and not test_calls_cuda_cache_when_cuda'
+
+  repo-cpu-tests:


The previous review-time `Sync .github/workflows with upstream author branch` commit on this branch (3ea7f4f) removed 5 Studio CI workflow files that the author actually had on their tip and regressed release-desktop.yml from 9738 to 7914 bytes. Cause was a stale fetch of the author ref in the orchestrator: the author merged main (carrying the 5 newly-added workflow files from unslothai#5298) into this branch DURING the review run, but Step 24c saw the pre-merge snapshot and `git rm`'d files that were in fact present on the author tip.

Restoring the affected files verbatim from `unslothai/unsloth:main` (which is what they should match according to the author's earlier merge commit fac2dc0):

    release-desktop.yml        (9738 bytes -- reverts -42 +3 regression)
    studio-backend-ci.yml      (8841 bytes -- new from PR unslothai#5298)
    studio-frontend-ci.yml     (4102 bytes -- new from PR unslothai#5298)
    studio-inference-smoke.yml (7277 bytes -- new from PR unslothai#5298)
    studio-tauri-smoke.yml     (3581 bytes -- new from PR unslothai#5298)
    wheel-smoke.yml            (4704 bytes -- new from PR unslothai#5298)

After this commit the diff vs main contains only the cursor-pointer changes the PR is actually about.

The first run on unslothai/notebooks@main surfaces 27 errors + 6 warnings, all real (peft 0.19+ / torchao floor missing in 6 nb/ notebooks the previous template fixes did not reach, 14 git+ HEAD installs in hand-tuned exception notebooks, 6 torch/torchcodec ABI mismatches, 1 transformers/tokenizers --no-deps drift). Mirror the same continue-on-error pattern PR #5298 used for biome:check on the frontend so the count surfaces in the PR check UI without forcing the backlog to be cleaned in the same change. Drop continue-on-error once the count hits zero.

github-advanced-security AI found potential problems May 6, 2026

chatgpt-codex-connector Bot reviewed May 6, 2026

danielhanchen added 2 commits May 6, 2026 04:38

CI: collapse multiline pytest -k expression to a single line

c14c016

YAML's | block scalar fed the newlines verbatim into the -k argument, and pytest rejected it with 'Wrong expression passed to -k'. The fix keeps the same logical filter on a single line.
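One way to avoid this class of mistake is to generate the expression instead of hand-wrapping it. The sketch below is illustrative only; both helper names are hypothetical and the workflow itself simply keeps the expression on one line.

```python
def build_k_expression(excluded_names):
    """Join test names into a single-line 'not A and not B' filter for pytest -k."""
    return " and ".join(f"not {name}" for name in excluded_names)


def collapse_whitespace(expression):
    """Flatten a multi-line expression so no newline reaches the -k argument."""
    return " ".join(expression.split())
```

Either helper yields a string with no embedded newlines, which is the property the fix restores.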

github-advanced-security AI found potential problems May 6, 2026

chatgpt-codex-connector Bot reviewed May 6, 2026

danielhanchen added 2 commits May 6, 2026 05:09

CI: add Python 3.13 to backend pytest matrix

d598173

Verified locally: 831 backend tests pass under Python 3.13 with the same filter set used for 3.10 / 3.11 / 3.12.

github-advanced-security AI found potential problems May 6, 2026

CI: use 'hf download' instead of deprecated 'huggingface-cli download'

a049773

huggingface_hub 1.13.0 dropped the huggingface-cli entrypoint. The replacement is the 'hf' CLI shipped with the same package. Same args, just s/huggingface-cli/hf/.

chatgpt-codex-connector Bot reviewed May 6, 2026

danielhanchen added 6 commits May 6, 2026 06:17

Tighten prebuilt-assertion comment in studio-inference-smoke

a7a221b

Merge remote-tracking branch 'origin/main' into ci/studio-pr-checks

3ace480

CI: add workflow_dispatch to inference-smoke for manual cache pre-warm

fb3b798

danielhanchen requested a review from rolandtannous as a code owner May 6, 2026 07:24

[pre-commit.ci] auto fixes from pre-commit.com hooks

3da1eea

for more information, see https://pre-commit.ci

github-advanced-security AI found potential problems May 6, 2026

chatgpt-codex-connector Bot reviewed May 6, 2026

danielhanchen added 2 commits May 6, 2026 07:39

danielhanchen requested a review from Datta0 as a code owner May 6, 2026 08:59

danielhanchen and others added 7 commits May 6, 2026 10:32

Merge remote-tracking branch 'origin/main' into ci/studio-pr-checks

855995d

[pre-commit.ci] auto fixes from pre-commit.com hooks

b6d916d

for more information, see https://pre-commit.ci

[pre-commit.ci] auto fixes from pre-commit.com hooks

541627c

for more information, see https://pre-commit.ci

github-advanced-security AI found potential problems May 6, 2026

danielhanchen merged commit a56c959 into main May 6, 2026
16 checks passed

danielhanchen deleted the ci/studio-pr-checks branch May 6, 2026 11:42

danielhanchen mentioned this pull request May 6, 2026

CI: scope GITHUB_TOKEN permissions, add MLX CI, unblock ~60 skipped tests #5312

Merged

4 tasks
