Skip to content

PR #697 Studio probe on macos-14 (do not merge)#157

Closed
danielhanchen wants to merge 2 commits into
mainfrom
pr697-studio-probe
Closed

PR #697 Studio probe on macos-14 (do not merge)#157
danielhanchen wants to merge 2 commits into
mainfrom
pr697-studio-probe

Conversation

@danielhanchen
Copy link
Copy Markdown
Owner

Companion to staging-2#156 (cross-OS shim path). This run drives Unsloth Studio on real Apple Silicon macos-14 with unslothai/unsloth-zoo#697 overlaid on top of the install -- the highest-fidelity validation we can get without local Mac hardware.

What this CI does

  1. bash install.sh --local --no-torch -- canonical Studio install on macOS.
  2. Force-reinstall unsloth-zoo from PR docker images unslothai/unsloth#697 head (Lyxot:fix/mlx-save-gguf-export-parity).
  3. pip install mlx mlx-lm mlx-vlm -- skipped by --no-torch, needed for real-MLX probes.
  4. tests/pr697/probe_real_mlx.py -- 17 probes exercising every PR-697 helper against real Apple Silicon mlx kernels (not the torch shim).
  5. Boot Studio (UNSLOTH_API_ONLY=1) and confirm /api/health stays healthy with PR-697 overlaid.

Why this complements staging-2#156

staging-2#156 (closed, green) this PR
Repo unsloth-zoo PR head unsloth main (Studio)
MLX backend torch-spoof shim real mlx / mlx-lm / mlx-vlm wheels
Tests 23 author tests 17 contract probes against real symbols
Studio not exercised install + boot + /api/health
OS macos-14 + ubuntu + windows macos-14 only (high-signal)

Throwaway branch. Do not merge.

Companion to the pr697-cross-os branch (which exercises the 23 author tests
via the torch shim). This branch drives Unsloth Studio on Apple Silicon:

  1. install.sh --local --no-torch (Studio install + venv).
  2. Force-reinstall unsloth-zoo from PR unslothai#697 head (Lyxot:fix/mlx-save-gguf-export-parity).
  3. Install mlx / mlx-lm / mlx-vlm wheels (skipped by --no-torch).
  4. Run tests/pr697/probe_real_mlx.py against REAL Apple Silicon mlx kernels
     -- 17 probes covering subpackage imports, every PR-697 helper, and the
     contract of each of the 11 fixes.
  5. Boot Studio (UNSLOTH_API_ONLY=1) and confirm /api/health stays healthy
     with the PR-697 unsloth-zoo overlaid -- no boot regression.

Sweep all 26 existing staging-2 workflows to keep the run focused.

studio_test_kit/ vendored from workspace root for future Playwright UI
walkthroughs (not exercised in this iteration to fit the macos-14 budget).

Throwaway branch. Do not merge.
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the studio_test_kit package, which provides a comprehensive suite of Playwright-based tools and examples for driving Unsloth Studio end-to-end. The review feedback focuses on improving cross-platform compatibility (specifically removing setsid to support macOS), enhancing CI robustness (using force checkout and streaming long-running installation logs to prevent memory buffering), and improving error diagnostics by capturing and printing stderr on subprocess and network failures.

Comment on lines +163 to +165
cmd = ["setsid", "-f", "bash", "-c",
f'{shlex.quote(bin_path)} studio -p {port} '
f'2>&1 | tee -a {shlex.quote(str(log_path))}']
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The setsid command-line utility is not available by default on macOS, which will cause launch_studio to fail with a FileNotFoundError on macOS systems. Since subprocess.Popen is already called with start_new_session=True (which internally invokes the setsid() system call to start the process in a new session and process group), prefixing the command with setsid -f is redundant and breaks macOS compatibility. Removing setsid -f ensures the tool runs successfully on macOS.

Suggested change
cmd = ["setsid", "-f", "bash", "-c",
f'{shlex.quote(bin_path)} studio -p {port} '
f'2>&1 | tee -a {shlex.quote(str(log_path))}']
cmd = ["bash", "-c",
f'{shlex.quote(bin_path)} studio -p {port} '
f'2>&1 | tee -a {shlex.quote(str(log_path))}']

Comment on lines +42 to +52
def _run(cmd: str | list[str], cwd: Optional[Path] = None, env: Optional[dict] = None,
check: bool = True, timeout: Optional[int] = None) -> subprocess.CompletedProcess:
if isinstance(cmd, str):
cmd_list = shlex.split(cmd)
else:
cmd_list = cmd
full_env = {**os.environ, **(env or {})}
return subprocess.run(
cmd_list, cwd=cwd, env=full_env, check=check, timeout=timeout,
text=True, capture_output=True,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Running subprocess.run with capture_output=True for long-running commands like install_studio (which can take up to 30 minutes) buffers all stdout and stderr in memory. This not only consumes unnecessary memory but also prevents the installation progress from being streamed to the console/CI logs, making it look like the process is hung and extremely difficult to debug if it fails. Making capture_output configurable allows us to disable it for the installation command.

Suggested change
def _run(cmd: str | list[str], cwd: Optional[Path] = None, env: Optional[dict] = None,
check: bool = True, timeout: Optional[int] = None) -> subprocess.CompletedProcess:
if isinstance(cmd, str):
cmd_list = shlex.split(cmd)
else:
cmd_list = cmd
full_env = {**os.environ, **(env or {})}
return subprocess.run(
cmd_list, cwd=cwd, env=full_env, check=check, timeout=timeout,
text=True, capture_output=True,
)
def _run(cmd: str | list[str], cwd: Optional[Path] = None, env: Optional[dict] = None,
check: bool = True, timeout: Optional[int] = None, capture_output: bool = True) -> subprocess.CompletedProcess:
if isinstance(cmd, str):
cmd_list = shlex.split(cmd)
else:
cmd_list = cmd
full_env = {**os.environ, **(env or {})}
return subprocess.run(
cmd_list, cwd=cwd, env=full_env, check=check, timeout=timeout,
text=True, capture_output=capture_output,
)

Comment on lines +83 to +88
_run(
["bash", str(install_sh), "--local"],
cwd=repo,
env={"UNSLOTH_STUDIO_HOME": str(home)},
timeout=60 * 30,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Pass capture_output=False to _run so that the installation progress is streamed directly to the console/CI logs instead of being buffered in memory for up to 30 minutes.

Suggested change
_run(
["bash", str(install_sh), "--local"],
cwd=repo,
env={"UNSLOTH_STUDIO_HOME": str(home)},
timeout=60 * 30,
)
_run(
["bash", str(install_sh), "--local"],
cwd=repo,
env={"UNSLOTH_STUDIO_HOME": str(home)},
timeout=60 * 30,
capture_output=False,
)


if reuse_clone and (repo / ".git").exists():
_run(["git", "fetch", "origin", branch], cwd=repo)
_run(["git", "checkout", branch], cwd=repo)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In automated environments or CI pipelines, git checkout can fail if there are untracked files or local modifications that would be overwritten. Using the --force (-f) flag ensures the checkout succeeds reliably.

Suggested change
_run(["git", "checkout", branch], cwd=repo)
_run(["git", "checkout", "-f", branch], cwd=repo)

"-c:v", "libx264", "-pix_fmt", "yuv420p", "-crf", "23",
str(out_mp4),
]
subprocess.run(cmd, check=True, capture_output=True, text=True)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When ffmpeg fails, subprocess.run raises a CalledProcessError. Because capture_output=True is set, the actual error output from ffmpeg is hidden from the console/CI logs, making it difficult to diagnose transcoding issues. Catching CalledProcessError and printing e.stderr before re-raising provides much better visibility.

Suggested change
subprocess.run(cmd, check=True, capture_output=True, text=True)
try:
subprocess.run(cmd, check=True, capture_output=True, text=True)
except subprocess.CalledProcessError as e:
import sys
print(f"ffmpeg hstack failed: {e.stderr}", file=sys.stderr)
raise

Comment on lines +120 to +124
subprocess.run(
["ffmpeg", "-y", "-i", str(src_webm),
"-c:v", "libx264", "-pix_fmt", "yuv420p", "-crf", "23", str(out_mp4)],
check=True, capture_output=True, text=True,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Catch CalledProcessError and print e.stderr to make troubleshooting ffmpeg failures in CI easier.

Suggested change
subprocess.run(
["ffmpeg", "-y", "-i", str(src_webm),
"-c:v", "libx264", "-pix_fmt", "yuv420p", "-crf", "23", str(out_mp4)],
check=True, capture_output=True, text=True,
)
try:
subprocess.run(
["ffmpeg", "-y", "-i", str(src_webm),
"-c:v", "libx264", "-pix_fmt", "yuv420p", "-crf", "23", str(out_mp4)],
check=True, capture_output=True, text=True,
)
except subprocess.CalledProcessError as e:
import sys
print(f"ffmpeg transcode failed: {e.stderr}", file=sys.stderr)
raise

Comment thread studio_test_kit/auth.py
Comment on lines +73 to +74
r.raise_for_status()
b = r.json()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If the authentication request fails (e.g., due to invalid credentials), r.raise_for_status() will raise an httpx.HTTPStatusError. Catching this error and including the response body (which often contains the error details like {"detail": "..."}) makes debugging authentication failures much easier.

        try:
            r.raise_for_status()
        except httpx.HTTPStatusError as e:
            raise RuntimeError(f"Login failed ({e.response.status_code}): {e.response.text}") from e
        b = r.json()

…studio

install.sh creates the venv directly at $UNSLOTH_STUDIO_HOME/unsloth_studio/
(not the .venv_* pattern from older docs). Confirmed via install log on
the prior run. Also expose $UNSLOTH_STUDIO_HOME/bin on PATH so the
unsloth CLI shim is visible to subsequent steps.
@danielhanchen
Copy link
Copy Markdown
Owner Author

Green on commit 97ac228: 18/18 PR-697 probes PASSED against REAL Apple Silicon mlx 0.x / mlx-lm / mlx-vlm 0.5.0 wheels, AND Studio /api/health returned healthy after 23s with the PR-697 unsloth-zoo overlay in place.

Run: https://github.com/danielhanchen/unsloth-staging-2/actions/runs/26515290288

Contract probes covered:

Closing per staging-fork convention; never merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants