Skip to content

Studio: IME / multilingual composer regression test + RTL dir="auto"#44

Closed
danielhanchen wants to merge 9 commits into
mainfrom
pr-5485-head
Closed

Studio: IME / multilingual composer regression test + RTL dir="auto"#44
danielhanchen wants to merge 9 commits into
mainfrom
pr-5485-head

Conversation

@danielhanchen

Copy link
Copy Markdown
Collaborator

Staging mirror of unslothai/unsloth#5485

Original PR: unslothai/unsloth#5485
Author: danielhanchen

This is a staging copy for review and editing. Once finalized, changes will be pushed back to the original PR.


Original description

Summary

  • Adds tests/studio/playwright_chat_ime_i18n.py: a Playwright smoke that exercises the chat composer textarea across 31 scripts (covering more than 90 percent of the world's population by speaker count) and reproduces the issue #5318 stuck-composition pattern.
  • Wires the smoke into .github/workflows/studio-ui-smoke.yml as a third Studio boot on port 18896.
  • Adds dir="auto" to the main chat composer, the inline edit composer, and the compare-mode composer so RTL scripts (Arabic, Hebrew, Persian, Urdu) flow right to left without forcing the whole UI into RTL.

Why

  • Before #5327 there was no Studio-owned regression coverage for the IME composition path. The bug shipped because @assistant-ui/react floated from 0.12.19 to 0.12.28 (#5296) and surfaced a library bug; the existing chat / extra UI smokes did not exercise composition events.
  • The new test catches both that class of regression and any future Unicode mangling regression (NFC re-normalisation, UTF-16 surrogate splits, combining-mark reorderings) without needing to load a GGUF model.
  • dir="auto" is a one-line opt-in to Unicode bidi auto-detection. Without it, typing Arabic / Hebrew / Persian / Urdu shows correct characters but the cursor, punctuation, and line wrap flow left to right, which is jarring for RTL users.

What the smoke checks

  1. ASCII keyboard typing baseline.
  2. Static guard that the composer textarea carries dir="auto".
  3. Paste round trip across 31 scripts (English, Mandarin, Spanish, Hindi, Arabic, Bengali, Portuguese, Russian, Japanese, Punjabi, German, Javanese, Korean, French, Turkish, Vietnamese, Urdu, Tamil, Telugu, Marathi, Italian, Thai, Polish, Ukrainian, Persian, Dutch, Hebrew, Greek, Indonesian, Swahili, plus an emoji / ZWJ / flag entry).
  4. Normal IME composition sequence committing 你好.
  5. Issue #5318 stuck-composition repro: two compositionstart events with no matching compositionend, then verify the field still accepts abc and a follow-up keystroke d, and that the Send button is not stuck disabled.

Cost

Model-free on purpose, since the bug surface is the React composer rather than inference. About 60 seconds of wall clock on a warm runner, plus the third Studio boot (which is fast because the install is already warm from the earlier steps in the same job).

Verified locally

  • Against main (HEAD with #5327): ascii=OK paste=31/31 normal_composition=OK stuck_recovery=OK, zero page errors, zero console errors.
  • Against the pre-#5327 base (b65a7450): paste 31/31 still passes (paste path is unaffected), normal composition step FAILS with readback '' missing '你好' exactly as expected for the regression we want to catch.

Test plan

  • Studio UI CI passes on ubuntu-latest.
  • tests/studio/playwright_chat_ime_i18n.py step produces screenshots under logs/playwright_ime/.
  • No new console errors surface on main builds.

This PR tracks the moving review branch (pr-5485-head). Iteration fix commits land here directly. Review-added tests are in a separate PR.

Changed files:

  • .github/workflows/studio-ui-smoke.yml
  • studio/frontend/src/components/assistant-ui/thread.tsx
  • studio/frontend/src/features/chat/shared-composer.tsx
  • tests/studio/playwright_chat_ime_i18n.py

pre-commit-ci Bot and others added 3 commits May 16, 2026 12:41
…#5477)

* studio/frontend: drop unused dependencies, move type pkg to devDeps

Removes 11 declared deps that are not imported anywhere in src/, the
Tauri config, src-tauri Rust, backend, scripts, CI workflows, or
sibling workspaces. Moves @types/canvas-confetti to devDependencies
since it ships TypeScript types only.

Removed from dependencies:
  @assistant-ui/react-markdown   (no imports; not a peer of any used pkg)
  @assistant-ui/react-streamdown (no imports; not a peer of any used pkg)
  @langchain/core                (no imports anywhere)
  @streamdown/cjk                (no imports; not a peer of streamdown)
  @radix-ui/react-checkbox       (re-exported by the radix-ui umbrella;
                                  no direct imports)
  @radix-ui/react-label          (same)
  @radix-ui/react-select         (same)
  @radix-ui/react-separator      (same)
  date-fns                       (already a direct dep of react-day-picker)
  remark-gfm                     (already a direct dep of streamdown)

Removed from devDependencies:
  playwright                     (CI installs the pip playwright; the
                                  npm one is unused)

Moved to devDependencies:
  @types/canvas-confetti         (TypeScript types only; not a runtime dep)

Verified with npm install + npm run build (tsc -b && vite build),
clean exit, dist/ produced. Live unsloth studio launch returns 200
on /, on the main JS / CSS bundles, and on /api/health.

* studio/frontend: keep @radix-ui packages (per maintainer)

Maintainer asked to keep the four @radix-ui packages this PR was
originally dropping:

  @radix-ui/react-checkbox  ^1.3.3
  @radix-ui/react-label     ^2.1.8
  @radix-ui/react-select    ^2.2.6
  @radix-ui/react-separator ^1.1.8

Restored to dependencies and refreshed the lockfile. Build still
green (1044 packages, vite build 2.1s, same dist contents).
@danielhanchen

Copy link
Copy Markdown
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request streamlines the frontend by removing several unused dependencies and enhances internationalization support by adding dir="auto" to chat composers for RTL language detection. It also introduces a robust regression test suite for IME composition and multilingual text processing. Review feedback identifies PEP 8 style violations regarding keyword argument spacing and suggests refining exception handling in the new test script to avoid masking potential issues with broad catch blocks.

NEW = os.environ["STUDIO_NEW_PW"]
ART_DIR = os.environ.get("PW_ART_DIR", "logs/playwright_ime")
ART = Path(ART_DIR)
ART.mkdir(parents = True, exist_ok = True)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

low

According to PEP 8, there should be no spaces around the = sign when used to indicate a keyword argument. This violation is pervasive throughout the file (e.g., lines 121, 125, 147, 152, 183, etc.).

Suggested change
ART.mkdir(parents = True, exist_ok = True)
ART.mkdir(parents=True, exist_ok=True)
References
  1. PEP 8: Pet Peeves - Don't use spaces around the = sign when used to indicate a keyword argument. (link)

Comment on lines +205 to +208
try:
page.wait_for_load_state("networkidle", timeout = 30_000)
except Exception:
pass

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

low

Catching all exceptions with except Exception: pass is generally discouraged as it can hide unexpected issues. It is better to catch specific exceptions like playwright.sync_api.Error or TimeoutError if you expect a timeout during wait_for_load_state.

Comment on lines +225 to +230
except Exception as e:
form_err = e
info(
f"change-password attempt {_form_attempt + 1} failed: "
f"{type(e).__name__}: {str(e)[:200]}"
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

low

Catching Exception is very broad. It's recommended to catch more specific exceptions that you expect might occur during the password change process (e.g., network errors or specific Playwright errors) to avoid masking other bugs.

Comment on lines +249 to +252
try:
page.wait_for_load_state("networkidle", timeout = 30_000)
except Exception:
pass

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

low

Similar to the previous instance, catching all exceptions here might hide real issues. Consider catching only expected errors like TimeoutError.

stoicAI1776 and others added 4 commits May 16, 2026 06:18
Builds on the intelgputorch210 fix in the previous commit by also
filling in the four other intel-gpu-torch* wheel stacks that PyTorch
has published but Unsloth was not yet exposing:

* intelgputorch271 / intel-gpu-torch271:
    torch 2.7.1+xpu  with pytorch-triton-xpu 3.3.1
* intelgputorch291 / intel-gpu-torch291:
    torch 2.9.1+xpu  with pytorch-triton-xpu 3.5.0
* intelgputorch2110 / intel-gpu-torch2110:
    torch 2.11.0+xpu with triton-xpu 3.7.0
* intelgputorch2120 / intel-gpu-torch2120:
    torch 2.12.0+xpu with triton-xpu 3.7.1

Each torch + triton pair was cross-checked against the wheel's PEP 658
.metadata sidecar to confirm the upstream Requires-Dist matches what
this block pins. Verified via `uv pip compile` across cp310/cp311/cp312/
cp313 on Linux x86_64 and Windows AMD64.
Adds a Playwright smoke that exercises the chat composer textarea
across 31 scripts (covering >90% of the world's population) and
reproduces the issue #5318 stuck-composition pattern. Also adds
dir="auto" to the main, edit, and compare composers so RTL scripts
(Arabic, Hebrew, Persian, Urdu) flow right-to-left without forcing
the whole UI into RTL.

The new test is wired into the existing studio-ui-smoke workflow as
a third Studio boot. It is model-free (the bug surface is the React
composer, not inference) so it adds ~60 s wall clock on a warm
runner.
The Playwright script for the IME / multilingual composer smoke runs
the change-password form in first-run bootstrap mode and only needs
the new password; it never reads the old bootstrap password. Drop
the corresponding $OLD shell capture, the ::add-mask:: line, the
STUDIO_IME_OLD_PW export, and the STUDIO_OLD_PW env handoff from
the Drive IME step.
@danielhanchen

Copy link
Copy Markdown
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request streamlines the project's dependencies by removing several unused packages and moving type definitions to development dependencies. It also improves RTL language support in the chat composer by adding the dir="auto" attribute and introduces a comprehensive regression test for IME composition and multilingual text handling. Feedback was provided to ensure the new test file adheres to PEP 8 style conventions regarding keyword argument spacing.

NEW = os.environ["STUDIO_NEW_PW"]
ART_DIR = os.environ.get("PW_ART_DIR", "logs/playwright_ime")
ART = Path(ART_DIR)
ART.mkdir(parents = True, exist_ok = True)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

low

For consistency with the PEP 8 style guide, please remove the spaces around the equals sign for keyword arguments (e.g., func(arg=value) instead of func(arg = value)).

This convention should be applied to all function calls with keyword arguments throughout this file.

Suggested change
ART.mkdir(parents = True, exist_ok = True)
ART.mkdir(parents=True, exist_ok=True)

Reinstates the bootstrap password export and env mapping that the IME /
multilingual composer Playwright smoke reads at module import time, so
the Linux Studio UI CI job stops failing with KeyError: 'STUDIO_OLD_PW'
before Chromium launches. Matches the env contract used by the existing
chat and extra UI smokes.
Compress the multi-paragraph rationale on the new IME / i18n CI step
and on the chat-composer dir="auto" comment to one or two load-bearing
lines; the issue #5318 and PR #5327 references stay since they pin the
specific upstream bug + fix.
@danielhanchen

Copy link
Copy Markdown
Collaborator Author

Fixes pushed to unslothai/unsloth#5485.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants