PR #697 Studio probe on macos-14 (do not merge) by danielhanchen · Pull Request #157 · danielhanchen/unsloth-staging-2

danielhanchen · 2026-05-27T13:43:42Z

Companion to staging-2#156 (cross-OS shim path). This run drives Unsloth Studio on real Apple Silicon macos-14 with unslothai/unsloth-zoo#697 overlaid on top of the install -- the highest-fidelity validation we can get without local Mac hardware.

What this CI does

1. bash install.sh --local --no-torch -- canonical Studio install on macOS.
2. Force-reinstall unsloth-zoo from the head of PR unslothai/unsloth-zoo#697 (Lyxot:fix/mlx-save-gguf-export-parity).
3. pip install mlx mlx-lm mlx-vlm -- skipped by --no-torch, needed for real-MLX probes.
4. tests/pr697/probe_real_mlx.py -- 17 probes exercising every PR-697 helper against real Apple Silicon mlx kernels (not the torch shim).
5. Boot Studio (UNSLOTH_API_ONLY=1) and confirm /api/health stays healthy with PR-697 overlaid.
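The final step (boot, then confirm /api/health stays healthy) can be sketched as a small polling loop. This is a minimal illustration, not the workflow's actual check: the function name `wait_for_health`, the timeout values, and the assumption that "healthy" means an HTTP 200 response are all hypothetical; only the `/api/health` path comes from the description above.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(base_url: str, timeout_s: float = 60.0, interval_s: float = 1.0) -> float:
    """Poll <base_url>/api/health until it answers HTTP 200.

    Returns the number of seconds waited; raises TimeoutError if the
    endpoint never answers 200 within timeout_s.
    """
    url = base_url.rstrip("/") + "/api/health"
    start = time.monotonic()
    last_error = None
    while time.monotonic() - start < timeout_s:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return time.monotonic() - start
        except (urllib.error.URLError, OSError) as exc:
            # Server not listening yet, or it answered with an error status; retry.
            last_error = exc
        time.sleep(interval_s)
    raise TimeoutError(f"{url} not healthy after {timeout_s}s (last error: {last_error!r})")
```

A CI step would call something like `wait_for_health("http://127.0.0.1:<port>")` after launching Studio in the background, where the port is whatever the launch step chose.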

Why this complements staging-2#156

| | staging-2#156 (closed, green) | this PR |
|---|---|---|
| Repo | unsloth-zoo PR head | unsloth main (Studio) |
| MLX backend | torch-spoof shim | real mlx / mlx-lm / mlx-vlm wheels |
| Tests | 23 author tests | 17 contract probes against real symbols |
| Studio | not exercised | install + boot + `/api/health` |
| OS | macos-14 + ubuntu + windows | macos-14 only (high-signal) |

Throwaway branch. Do not merge.

Companion to the pr697-cross-os branch (which exercises the 23 author tests via the torch shim). This branch drives Unsloth Studio on Apple Silicon:

1. install.sh --local --no-torch (Studio install + venv).
2. Force-reinstall unsloth-zoo from PR unslothai#697 head (Lyxot:fix/mlx-save-gguf-export-parity).
3. Install mlx / mlx-lm / mlx-vlm wheels (skipped by --no-torch).
4. Run tests/pr697/probe_real_mlx.py against REAL Apple Silicon mlx kernels -- 17 probes covering subpackage imports, every PR-697 helper, and the contract of each of the 11 fixes.
5. Boot Studio (UNSLOTH_API_ONLY=1) and confirm /api/health stays healthy with the PR-697 unsloth-zoo overlaid -- no boot regression.

Sweep all 26 existing staging-2 workflows to keep the run focused. studio_test_kit/ vendored from workspace root for future Playwright UI walkthroughs (not exercised in this iteration to fit the macos-14 budget).

Throwaway branch. Do not merge.

gemini-code-assist

Code Review

This pull request introduces the studio_test_kit package, which provides a comprehensive suite of Playwright-based tools and examples for driving Unsloth Studio end-to-end. The review feedback focuses on improving cross-platform compatibility (specifically removing setsid to support macOS), enhancing CI robustness (using force checkout and streaming long-running installation logs to prevent memory buffering), and improving error diagnostics by capturing and printing stderr on subprocess and network failures.

gemini-code-assist · 2026-05-27T13:45:26Z

+    cmd = ["setsid", "-f", "bash", "-c",
+           f'{shlex.quote(bin_path)} studio -p {port} '
+           f'2>&1 | tee -a {shlex.quote(str(log_path))}']


The setsid command-line utility is not available by default on macOS, which will cause launch_studio to fail with a FileNotFoundError on macOS systems. Since subprocess.Popen is already called with start_new_session=True (which internally invokes the setsid() system call to start the process in a new session and process group), prefixing the command with setsid -f is redundant and breaks macOS compatibility. Removing setsid -f ensures the tool runs successfully on macOS.

Suggested change

-    cmd = ["setsid", "-f", "bash", "-c",
-           f'{shlex.quote(bin_path)} studio -p {port} '
-           f'2>&1 | tee -a {shlex.quote(str(log_path))}']
+    cmd = ["bash", "-c",
+           f'{shlex.quote(bin_path)} studio -p {port} '
+           f'2>&1 | tee -a {shlex.quote(str(log_path))}']
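The claim that `start_new_session=True` already does what the `setsid` prefix was meant to do can be checked directly on a POSIX system. The sketch below is illustrative only: `build_studio_cmd` and `child_is_session_leader` are hypothetical names, not part of studio_test_kit. The first rebuilds the command from the suggestion without the prefix; the second spawns a child in a new session and asks it whether it is its own session leader.

```python
import shlex
import subprocess
import sys


def build_studio_cmd(bin_path: str, port: int, log_path: str) -> list[str]:
    """Same bash pipeline as in the review suggestion, with no `setsid` prefix."""
    return [
        "bash", "-c",
        f"{shlex.quote(bin_path)} studio -p {port} "
        f"2>&1 | tee -a {shlex.quote(log_path)}",
    ]


def child_is_session_leader() -> bool:
    """Spawn a child with start_new_session=True and ask whether it leads its own session."""
    probe = "import os; print(os.getsid(0) == os.getpid())"
    out = subprocess.run(
        [sys.executable, "-c", probe],
        start_new_session=True,  # POSIX only: the child calls setsid() before exec
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip() == "True"
```

On Linux and macOS `child_is_session_leader()` should report that the child is a session leader, so the external `setsid` binary adds nothing; on Windows `start_new_session` has no effect and this check does not apply.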

gemini-code-assist · 2026-05-27T13:45:26Z

+def _run(cmd: str | list[str], cwd: Optional[Path] = None, env: Optional[dict] = None,
+         check: bool = True, timeout: Optional[int] = None) -> subprocess.CompletedProcess:
+    if isinstance(cmd, str):
+        cmd_list = shlex.split(cmd)
+    else:
+        cmd_list = cmd
+    full_env = {**os.environ, **(env or {})}
+    return subprocess.run(
+        cmd_list, cwd=cwd, env=full_env, check=check, timeout=timeout,
+        text=True, capture_output=True,
+    )


Running subprocess.run with capture_output=True for long-running commands like install_studio (which can take up to 30 minutes) buffers all stdout and stderr in memory. This not only consumes unnecessary memory but also prevents the installation progress from being streamed to the console/CI logs, making it look like the process is hung and extremely difficult to debug if it fails. Making capture_output configurable allows us to disable it for the installation command.

Suggested change

-def _run(cmd: str | list[str], cwd: Optional[Path] = None, env: Optional[dict] = None,
-         check: bool = True, timeout: Optional[int] = None) -> subprocess.CompletedProcess:
-    if isinstance(cmd, str):
-        cmd_list = shlex.split(cmd)
-    else:
-        cmd_list = cmd
-    full_env = {**os.environ, **(env or {})}
-    return subprocess.run(
-        cmd_list, cwd=cwd, env=full_env, check=check, timeout=timeout,
-        text=True, capture_output=True,
-    )
+def _run(cmd: str | list[str], cwd: Optional[Path] = None, env: Optional[dict] = None,
+         check: bool = True, timeout: Optional[int] = None, capture_output: bool = True) -> subprocess.CompletedProcess:
+    if isinstance(cmd, str):
+        cmd_list = shlex.split(cmd)
+    else:
+        cmd_list = cmd
+    full_env = {**os.environ, **(env or {})}
+    return subprocess.run(
+        cmd_list, cwd=cwd, env=full_env, check=check, timeout=timeout,
+        text=True, capture_output=capture_output,
+    )
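To make the difference between the two modes concrete, here is a reduced sketch of the same idea. The helper name `run` and the toy child command are placeholders, not the kit's `_run`. With `capture_output=True` the child's output is held in memory and returned on the result object; with `capture_output=False` it goes straight to the parent's stdout/stderr (and so to the CI log) and the result's `stdout`/`stderr` attributes are `None`.

```python
import subprocess
import sys
from typing import Optional


def run(cmd: list[str], capture_output: bool = True,
        timeout: Optional[float] = None) -> subprocess.CompletedProcess:
    """Thin wrapper mirroring the suggested signature, minus cwd/env handling."""
    return subprocess.run(
        cmd, check=True, timeout=timeout, text=True, capture_output=capture_output,
    )


child = [sys.executable, "-c", "print('hello')"]

buffered = run(child)                        # output buffered in buffered.stdout
streamed = run(child, capture_output=False)  # output printed live; nothing retained
```

The trade-off is that a streamed run leaves nothing to inspect programmatically afterwards, so callers that parse output should keep the default.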

gemini-code-assist · 2026-05-27T13:45:26Z

+    _run(
+        ["bash", str(install_sh), "--local"],
+        cwd=repo,
+        env={"UNSLOTH_STUDIO_HOME": str(home)},
+        timeout=60 * 30,
+    )


Pass capture_output=False to _run so that the installation progress is streamed directly to the console/CI logs instead of being buffered in memory for up to 30 minutes.

Suggested change

-    _run(
-        ["bash", str(install_sh), "--local"],
-        cwd=repo,
-        env={"UNSLOTH_STUDIO_HOME": str(home)},
-        timeout=60 * 30,
-    )
+    _run(
+        ["bash", str(install_sh), "--local"],
+        cwd=repo,
+        env={"UNSLOTH_STUDIO_HOME": str(home)},
+        timeout=60 * 30,
+        capture_output=False,
+    )

gemini-code-assist · 2026-05-27T13:45:26Z

+
+    if reuse_clone and (repo / ".git").exists():
+        _run(["git", "fetch", "origin", branch], cwd=repo)
+        _run(["git", "checkout", branch], cwd=repo)


In automated environments or CI pipelines, git checkout can fail if there are untracked files or local modifications that would be overwritten. Using the --force (-f) flag ensures the checkout succeeds reliably.

Suggested change

-        _run(["git", "checkout", branch], cwd=repo)
+        _run(["git", "checkout", "-f", branch], cwd=repo)

gemini-code-assist · 2026-05-27T13:45:26Z

+        "-c:v", "libx264", "-pix_fmt", "yuv420p", "-crf", "23",
+        str(out_mp4),
+    ]
+    subprocess.run(cmd, check=True, capture_output=True, text=True)


When ffmpeg fails, subprocess.run raises a CalledProcessError. Because capture_output=True is set, the actual error output from ffmpeg is hidden from the console/CI logs, making it difficult to diagnose transcoding issues. Catching CalledProcessError and printing e.stderr before re-raising provides much better visibility.

Suggested change

-    subprocess.run(cmd, check=True, capture_output=True, text=True)
+    try:
+        subprocess.run(cmd, check=True, capture_output=True, text=True)
+    except subprocess.CalledProcessError as e:
+        import sys
+        print(f"ffmpeg hstack failed: {e.stderr}", file=sys.stderr)
+        raise

gemini-code-assist · 2026-05-27T13:45:26Z

+    subprocess.run(
+        ["ffmpeg", "-y", "-i", str(src_webm),
+         "-c:v", "libx264", "-pix_fmt", "yuv420p", "-crf", "23", str(out_mp4)],
+        check=True, capture_output=True, text=True,
+    )


Catch CalledProcessError and print e.stderr to make troubleshooting ffmpeg failures in CI easier.

Suggested change

-    subprocess.run(
-        ["ffmpeg", "-y", "-i", str(src_webm),
-         "-c:v", "libx264", "-pix_fmt", "yuv420p", "-crf", "23", str(out_mp4)],
-        check=True, capture_output=True, text=True,
-    )
+    try:
+        subprocess.run(
+            ["ffmpeg", "-y", "-i", str(src_webm),
+             "-c:v", "libx264", "-pix_fmt", "yuv420p", "-crf", "23", str(out_mp4)],
+            check=True, capture_output=True, text=True,
+        )
+    except subprocess.CalledProcessError as e:
+        import sys
+        print(f"ffmpeg transcode failed: {e.stderr}", file=sys.stderr)
+        raise
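Since the two ffmpeg call sites receive the same try/except treatment, one option is to factor the pattern into a shared helper. The sketch below is a possible shape for that, not code from the PR: `run_logged` and its `label` parameter are hypothetical names, and the behaviour is simply "capture output, and on a non-zero exit print the captured stderr before re-raising".

```python
import subprocess
import sys


def run_logged(cmd: list[str], label: str) -> subprocess.CompletedProcess:
    """Run cmd with captured output; on failure, surface stderr before re-raising."""
    try:
        return subprocess.run(cmd, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as e:
        # e.stderr holds the child's captured stderr because capture_output=True.
        print(f"{label} failed (exit {e.returncode}): {e.stderr}", file=sys.stderr)
        raise
```

Each call site would then reduce to something like `run_logged(cmd, "ffmpeg hstack")`, keeping the exception type unchanged for any caller that already handles `CalledProcessError`.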

gemini-code-assist · 2026-05-27T13:45:26Z

+        r.raise_for_status()
+        b = r.json()


If the authentication request fails (e.g., due to invalid credentials), r.raise_for_status() will raise an httpx.HTTPStatusError. Catching this error and including the response body (which often contains the error details like {"detail": "..."}) makes debugging authentication failures much easier.

-        r.raise_for_status()
+        try:
+            r.raise_for_status()
+        except httpx.HTTPStatusError as e:
+            raise RuntimeError(f"Login failed ({e.response.status_code}): {e.response.text}") from e
         b = r.json()

…studio install.sh creates the venv directly at $UNSLOTH_STUDIO_HOME/unsloth_studio/ (not the .venv_* pattern from older docs). Confirmed via install log on the prior run. Also expose $UNSLOTH_STUDIO_HOME/bin on PATH so the unsloth CLI shim is visible to subsequent steps.

danielhanchen · 2026-05-27T13:55:43Z

Green on commit 97ac228: 18/18 PR-697 probes PASSED against REAL Apple Silicon mlx 0.x / mlx-lm / mlx-vlm 0.5.0 wheels, AND Studio /api/health returned healthy after 23s with the PR-697 unsloth-zoo overlay in place.

Run: https://github.com/danielhanchen/unsloth-staging-2/actions/runs/26515290288

Contract probes covered:

- All post-migration subpackage imports (unsloth_zoo.mlx.{utils,loader,runtime,compile,trainer,cce})
- Every PR-697 helper present in the right module (fix #1, #4, #5, #6, #7, #8, #9, #10, #11)
- VLM config save uses mlx_vlm + preserves quantization_config (fix #1)
- Text-only routes through mlx_lm (fix #1 negative -- no regression)
- _mlx_arrays_match value check on real mlx rank-2 arrays (fix #5)
- _rewrite_mlx_vlm_tensor_for_gguf 3-tuple contract (fix #5)
- _has_vision_config across nested/top-level/malformed (fix #1)
- _get_model_config dataclass extraction (fix #11)
- _read_json_file returns {} for missing + binary (fix #8)
- _copy_source_sidecars copies non-weight sidecars, skips weights, handles non-dir src (fix #9)
- _sync_gguf_nextn_layer_config does not raise on synthetic model (fix #10)
- _MlxVlmSanitizeProxy constructable (fix #6)
- save_pretrained_gguf / push_to_hub_gguf accept first_conversion (fix #4)
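For readers unfamiliar with this style of contract probe, two of the checks listed above can be illustrated with stand-ins. Nothing below is taken from unsloth-zoo or from probe_real_mlx.py: `accepts_kwarg`, `read_json_or_empty`, and `fake_save_pretrained_gguf` are hypothetical names, and the stand-in only mimics the stated contracts ("accepts first_conversion", "returns {} for missing + binary").

```python
import inspect
import json
import tempfile
from pathlib import Path


def accepts_kwarg(fn, name: str) -> bool:
    """True if fn declares a parameter called `name` or takes **kwargs."""
    params = inspect.signature(fn).parameters
    return name in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )


def read_json_or_empty(path: Path) -> dict:
    """Hypothetical stand-in for the stated contract: {} on missing or unparseable input."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):  # ValueError covers JSON and UTF-8 decode errors
        return {}
    return data if isinstance(data, dict) else {}


def fake_save_pretrained_gguf(save_directory, first_conversion=None):
    """Placeholder carrying the keyword the probe looks for; not the real function."""
    return save_directory, first_conversion


# Signature probe: passes without ever calling the function under test.
assert accepts_kwarg(fake_save_pretrained_gguf, "first_conversion")

# Behaviour probe: a missing file and a binary file both yield {}.
with tempfile.TemporaryDirectory() as tmp:
    binary = Path(tmp) / "config.json"
    binary.write_bytes(b"\x00\xff\xfe\x80")
    assert read_json_or_empty(Path(tmp) / "missing.json") == {}
    assert read_json_or_empty(binary) == {}
```

A real probe would import the actual callable from the installed package and pass it to the same kind of check, which is what lets it run quickly without loading a model.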

Closing per staging-fork convention; never merged.

gemini-code-assist Bot reviewed May 27, 2026

danielhanchen closed this May 27, 2026

danielhanchen wants to merge 2 commits into main from pr697-studio-probe
