Studio: IME / multilingual composer regression test + RTL dir="auto"#5485

Merged: danielhanchen merged 1 commit into main from studio/ime-i18n-regression on May 17, 2026

@danielhanchen

Member

Summary

  • Adds tests/studio/playwright_chat_ime_i18n.py: a Playwright smoke that exercises the chat composer textarea across 31 scripts (covering more than 90 percent of the world's population by speaker count) and reproduces the stuck-composition pattern from issue #5318 ([Bug] Japanese IME activation makes the chat input field unable to accept any text).
  • Wires the smoke into .github/workflows/studio-ui-smoke.yml as a third Studio boot on port 18896.
  • Adds dir="auto" to the main chat composer, the inline edit composer, and the compare-mode composer so RTL scripts (Arabic, Hebrew, Persian, Urdu) flow right to left without forcing the whole UI into RTL.
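
For readers unfamiliar with dir="auto": for a textarea, the browser picks the direction from the first strongly directional character of the value. The sketch below approximates that rule in Python with unicodedata. The function name auto_direction is hypothetical, and the sketch ignores the finer points of the HTML algorithm, so treat it as an illustration rather than the browser's implementation.

```python
import unicodedata


def auto_direction(text: str) -> str:
    """Approximate dir="auto": direction of the first strong character.

    Bidi class "L" is strong left-to-right; "R" and "AL" are strong
    right-to-left. Digits, spaces, and punctuation are neutral or weak
    and are skipped. With no strong character the result is "ltr".
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return "ltr"


if __name__ == "__main__":
    for sample in ("hello", "مرحبا", "שלום", "123 سلام", ""):
        print(repr(sample), "->", auto_direction(sample))
```

Because only the field's own content decides the direction, the rest of the UI stays left to right, which matches the behaviour described in the bullet above.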

Why

  • Before #5327 (fix: harden Studio IME composer sends) there was no Studio-owned regression coverage for the IME composition path. The bug shipped because @assistant-ui/react floated from 0.12.19 to 0.12.28 in #5296 (Fix Studio chat history and attachments with newer assistant-ui) and surfaced a library bug; the existing chat / extra UI smokes did not exercise composition events.
  • The new test catches both that class of regression and any future Unicode mangling regression (NFC re-normalisation, UTF-16 surrogate splits, combining-mark reorderings) without needing to load a GGUF model.
  • dir="auto" is a one-line opt-in to Unicode bidi auto-detection. Without it, typing Arabic / Hebrew / Persian / Urdu shows correct characters but the cursor, punctuation, and line wrap flow left to right, which is jarring for RTL users.
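
To make the "Unicode mangling" classes above concrete, here is a minimal sketch of how a round-trip comparison could tell them apart. The helper name classify_round_trip and its return labels are hypothetical and are not taken from the test file; the only assumption is that the test compares the pasted string with the value read back from the textarea.

```python
import unicodedata
from typing import Optional


def classify_round_trip(expected: str, actual: str) -> Optional[str]:
    """Return None when the round trip is exact, else a coarse label."""
    if expected == actual:
        return None
    # Same text after NFC means the field re-normalised or reordered
    # combining marks: canonically equivalent, different code points.
    if unicodedata.normalize("NFC", expected) == unicodedata.normalize("NFC", actual):
        return "normalisation"
    # A lone surrogate or U+FFFD suggests a UTF-16 pair was split.
    if any(0xD800 <= ord(ch) <= 0xDFFF or ch == "\ufffd" for ch in actual):
        return "surrogate"
    return "content"


if __name__ == "__main__":
    print(classify_round_trip("e\u0301", "\u00e9"))          # decomposed vs composed
    print(classify_round_trip("a\u0323\u0307", "a\u0307\u0323"))  # reordered marks
    print(classify_round_trip("\U0001F600", "\ud83d"))       # half of a surrogate pair
```

An exact string comparison is the strict check; the labels only help explain a failure once the strict check has already tripped.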

What the smoke checks

  1. ASCII keyboard typing baseline.
  2. Static guard that the composer textarea carries dir="auto".
  3. Paste round trip across 31 scripts (English, Mandarin, Spanish, Hindi, Arabic, Bengali, Portuguese, Russian, Japanese, Punjabi, German, Javanese, Korean, French, Turkish, Vietnamese, Urdu, Tamil, Telugu, Marathi, Italian, Thai, Polish, Ukrainian, Persian, Dutch, Hebrew, Greek, Indonesian, Swahili, plus an emoji / ZWJ / flag entry).
  4. Normal IME composition sequence committing 你好.
  5. Stuck-composition repro for issue #5318 ([Bug] Japanese IME activation makes the chat input field unable to accept any text): fire two compositionstart events with no matching compositionend, then verify the field still accepts abc and a follow-up keystroke d, and that the Send button is not stuck disabled.
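
The stuck-composition step could be driven roughly as follows. This is a sketch under assumptions, not the code in playwright_chat_ime_i18n.py: the helper names are hypothetical, and the textarea argument is assumed to be a Playwright Locator (anything with an evaluate(expression) method works).

```python
import json
from typing import Any, List


def composition_event_js(event_type: str, data: str = "") -> str:
    """Build a JS arrow function that dispatches one CompositionEvent."""
    init = json.dumps({"bubbles": True, "cancelable": True, "data": data})
    return (
        "el => el.dispatchEvent("
        f"new CompositionEvent({json.dumps(event_type)}, {init}))"
    )


def stuck_composition_script() -> List[str]:
    """Two compositionstart events and deliberately no compositionend."""
    return [
        composition_event_js("compositionstart"),
        composition_event_js("compositionstart"),
    ]


def simulate_stuck_composition(textarea: Any) -> None:
    """Run the script against a locator-like object."""
    for expression in stuck_composition_script():
        textarea.evaluate(expression)
```

With Playwright this might be called as simulate_stuck_composition(page.locator("textarea").first), where the selector is an assumption; the assertions on abc, the follow-up d, and the Send button would come afterwards.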

Cost

Model-free on purpose, since the bug surface is the React composer rather than inference. About 60 seconds of wall clock on a warm runner, plus the third Studio boot (which is fast because the install is already warm from the earlier steps in the same job).

Verified locally

Test plan

  • Studio UI CI passes on ubuntu-latest.
  • tests/studio/playwright_chat_ime_i18n.py step produces screenshots under logs/playwright_ime/.
  • No new console errors surface on main builds.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request improves support for right-to-left (RTL) languages by adding the dir="auto" attribute to composer inputs in the frontend. It also introduces a comprehensive Playwright test suite, playwright_chat_ime_i18n.py, designed to catch regressions in IME composition and multilingual text handling. Review feedback suggests refactoring event listener attachments to ensure they persist after page recovery in tests, adhering to PEP 8 by removing spaces around keyword argument assignments, and avoiding silent exception handling in console log collection.

Comment on lines +233 to +238
page,
ctx,
default_timeout_ms = 60_000,
info = lambda m: print(f"[ime] recovery: {m}", flush = True),
)
if form_err is not None:

Contributor

high

When recover_or_replace_page replaces the page object (e.g., after a crash or navigation failure), the event listeners for pageerror and console attached at lines 164 and 175 are lost. This means subsequent errors on the recovered page will not be captured. Consider refactoring the listener attachment into a helper function and calling it both after initial page creation and after recovery.

References
  1. Centralize recurring or complex logical checks into a single helper function and reuse it across the codebase to ensure consistency and simplify maintenance.
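
One way to act on this suggestion is a single helper that wires both listeners and is called for every page object, original or recovered. The sketch below is illustrative only: attach_listeners and the two list arguments are hypothetical names, and it relies only on Playwright's page.on("pageerror", ...) and page.on("console", ...) events plus the type and text properties of console messages.

```python
from typing import Any, List


def attach_listeners(page: Any, page_errors: List[str], console_errors: List[str]) -> None:
    """Attach pageerror and console collectors to a page-like object."""

    def on_page_error(err: Any) -> None:
        page_errors.append(str(err))

    def on_console(msg: Any) -> None:
        try:
            if msg.type == "error":
                console_errors.append(msg.text)
        except Exception as exc:
            # Record the failure instead of swallowing it silently.
            console_errors.append(f"<unreadable console message: {exc}>")

    page.on("pageerror", on_page_error)
    page.on("console", on_console)
```

The caller would invoke attach_listeners(page, ...) once after creating the page and again on whatever page recover_or_replace_page returns, so errors on a replacement page are still captured.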

NEW = os.environ["STUDIO_NEW_PW"]
ART_DIR = os.environ.get("PW_ART_DIR", "logs/playwright_ime")
ART = Path(ART_DIR)
ART.mkdir(parents = True, exist_ok = True)

Contributor

medium

Adhere to the PEP 8 style guide by avoiding spaces around the = sign for keyword arguments. This is a consistent pattern throughout the file that should be corrected for better adherence to Python standards.

Suggested change
- ART.mkdir(parents = True, exist_ok = True)
+ ART.mkdir(parents=True, exist_ok=True)
References
  1. Avoid spaces around the = sign when used to indicate a keyword argument. (link)
  2. Prioritize consistency with established coding patterns within a module.

Comment on lines +172 to +173
except Exception:
    return

Contributor

medium

Avoid using broad, silent exception handlers. Instead, log the exception (even at a debug level) to aid in future debugging if the console message property access fails.

Suggested change
- except Exception:
-     return
+ except Exception as e:
+     info(f"Error collecting console message: {e}")
+     return
References
  1. Avoid using broad, silent exception handlers like except Exception: pass. Instead, log the exception, even if at a debug level, to aid in future debugging.

@danielhanchen

Member Author

This PR appears to address open issue(s). The duplicate detector matched the following open issues with HIGH confidence:

  • unslothai/unsloth#5262 (@king5699): Issue reports Studio chat input cannot accept Chinese IME or paste; PR adds IME/paste regression coverage and composer dir handling.

If this PR fixes any of them, consider adding closes #N / resolves #N to the description so the issue auto-closes on merge. If the match is wrong, ignore this comment.

@danielhanchen danielhanchen added the auto-reviewing Auto-review in progress label May 16, 2026
@danielhanchen danielhanchen force-pushed the studio/ime-i18n-regression branch from 052dd41 to 61dab76 Compare May 16, 2026 13:55

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 61dab767d6

Comment thread pyproject.toml Outdated
"unsloth[intelgputorch271]"
]
intelgputorch291 = [
intelgputorch210 = [

P1: Restore removed Intel XPU extras

This hunk drops all of the Intel XPU optional-dependency groups for torch 2.7.1, 2.9.1, 2.11.0, and 2.12.0 (plus their intel-gpu-* aliases), even though the change is only about Studio IME coverage. After this, users installing those advertised variants such as unsloth[intel-gpu-torch291] or unsloth[intel-gpu-torch2120] will no longer get the pinned torch/triton/torchvision XPU wheels and will fall back to the base package, breaking those Intel configurations.

@danielhanchen danielhanchen added auto-approved Auto-review approved the PR and removed auto-reviewing Auto-review in progress labels May 16, 2026
@danielhanchen

Member Author

Auto-review verdict: Approved

Adds dir="auto" to the three Studio chat composer textareas so Arabic/Hebrew/Persian/Urdu input flows right-to-left, and wires a model-free Playwright smoke (multilingual paste + stuck-IME composition repro for issue #5318) into the Studio UI CI job. Useful because it both fixes a user-visible RTL bug and locks down the previously-fixed stuck-composition regression that backend/frontend CI both missed.

Reason: All real review findings fixed; PR adds correct RTL bidi support and a regression smoke for a real upstream IME bug, with no remaining defects.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba4f1c0918

Comment thread .github/workflows/studio-ui-smoke.yml Outdated
- name: Drive IME + multilingual paste regression with Playwright
env:
BASE_URL: http://127.0.0.1:18896
STUDIO_OLD_PW: ${{ env.STUDIO_IME_OLD_PW }}

P1: Remove dead STUDIO_OLD_PW from IME Playwright step

The IME smoke script only reads BASE_URL and STUDIO_NEW_PW, but this step still injects STUDIO_OLD_PW; the newly added guard test (test_ime_workflow_step_does_not_set_studio_old_pw) explicitly asserts that this variable is absent, so the test suite now fails immediately on this mismatch. Dropping this env var is required to keep the new regression checks green.

Comment thread .github/workflows/studio-ui-smoke.yml Outdated
NEW="CIIme-$(python -c 'import secrets; print(secrets.token_urlsafe(16))')"
echo "::add-mask::$OLD"
echo "::add-mask::$NEW"
echo "STUDIO_IME_OLD_PW=$OLD" >> "$GITHUB_ENV"

P1: Stop exporting unused STUDIO_IME_OLD_PW

This workflow still exports STUDIO_IME_OLD_PW even though the new IME path no longer consumes it, and the new test (test_ime_pass_password_step_does_not_export_old_pw) now fails because of this exact line. Keeping the obsolete export breaks the regression test contract introduced in the same commit.
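
A static guard of the kind these two findings refer to can be written without a YAML parser. The sketch below is a guess at the shape of such a test, not the code of test_ime_workflow_step_does_not_set_studio_old_pw: find_step and step_mentions are hypothetical helpers that scan the workflow text for a named step and look for a variable inside it.

```python
import re
from typing import Optional


def find_step(workflow_text: str, name_fragment: str) -> Optional[str]:
    """Return the text of the first step whose name contains the fragment."""
    # Split just before each "- name:" line so each chunk holds one step.
    chunks = re.split(r"(?m)^(?=[ \t]*- name:)", workflow_text)
    for chunk in chunks:
        lines = chunk.splitlines()
        first_line = lines[0] if lines else ""
        if "- name:" in first_line and name_fragment in first_line:
            return chunk
    return None


def step_mentions(workflow_text: str, name_fragment: str, variable: str) -> bool:
    """True when the named step exists and references the variable."""
    step = find_step(workflow_text, name_fragment)
    return step is not None and variable in step
```

A guard test would read .github/workflows/studio-ui-smoke.yml and assert that step_mentions(text, "Drive IME", "STUDIO_OLD_PW") is False. Text scanning is cruder than parsing the YAML, but it keeps the check dependency-free.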

@danielhanchen danielhanchen force-pushed the studio/ime-i18n-regression branch from f89359d to f4be9b6 Compare May 17, 2026 08:36
@danielhanchen

Member Author

Force-pushed a clean rebuild on top of current main. The previous tip had drifted across iterations and ended up reverting #4931 (intelgputorch271 extras), #5477 (studio/frontend dep cleanup), and #5478 (deterministic dep-removal check), because the staging working copy was based on a snapshot of main that predated those merges. The final diff now stops at the five files this PR actually owns:

  • .github/workflows/studio-ui-smoke.yml
  • studio/frontend/src/components/assistant-ui/thread.tsx
  • studio/frontend/src/features/chat/shared-composer.tsx
  • tests/studio/playwright_chat_ime_i18n.py
  • tests/studio/test_composer_rtl_bidi_attribute.py

No source edits beyond the dir="auto" lines and the new CI step. pyproject.toml, studio/frontend/package*.json, scripts/check_frontend_dep_removal.py, tests/studio/test_frontend_dep_removal.py, and .github/workflows/studio-frontend-ci.yml are no longer touched.

Adds dir="auto" to the main, edit, and compare chat composers so RTL
scripts (Arabic, Hebrew, Persian, Urdu) flow right to left without
forcing the rest of the UI into RTL. Wires a model-free Playwright
smoke (multilingual paste round trip across 31 scripts + a stuck-IME
composition repro for issue #5318 / PR #5327) into the Studio UI CI
job as a third Studio boot, plus a pure-Python static-guard test that
locks down dir="auto" on all three composers and the minimal env
contract for the smoke.
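
As an illustration of the static guard mentioned in the commit message, a pure-Python check can count dir="auto" attributes in the composer sources. This is a sketch and not the contents of tests/studio/test_composer_rtl_bidi_attribute.py: the helper names are hypothetical, and how the three composers are split across the two .tsx files is an assumption the caller supplies.

```python
import re
from typing import Dict, List

# Matches dir="auto", dir='auto', and dir={"auto"}, but not data-dir="auto".
DIR_AUTO = re.compile(r"""(?<![\w-])dir\s*=\s*(?:"auto"|'auto'|\{\s*"auto"\s*\})""")


def count_dir_auto(source: str) -> int:
    """Count dir="auto" attributes in a JSX/TSX source string."""
    return len(DIR_AUTO.findall(source))


def files_below_minimum(sources: Dict[str, str], minimum: Dict[str, int]) -> List[str]:
    """Return files that carry fewer dir="auto" attributes than required."""
    return [
        name
        for name, needed in minimum.items()
        if count_dir_auto(sources.get(name, "")) < needed
    ]
```

A test would load the two frontend files, pass the expected per-file counts, and assert that files_below_minimum returns an empty list.
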
@danielhanchen danielhanchen force-pushed the studio/ime-i18n-regression branch from f4be9b6 to a8772e1 Compare May 17, 2026 09:16

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8772e185a

Comment on lines +228 to +232
page = recover_or_replace_page(
page,
ctx,
default_timeout_ms = 60_000,
info = lambda m: print(f"[ime] recovery: {m}", flush = True),

P2: Re-navigate after composer mount recovery

If the first composer wait times out or the renderer/page is replaced, this recovery call does not pass goto_url=BASE, so recover_or_replace_page can return a fresh blank page (or leave the page on a stale route) and the second attempt just waits for the textarea in the wrong document. The existing chat and extra UI scripts use goto_url=BASE for this exact mount-recovery path; without it the new IME smoke can fail instead of recovering from the transient mount/renderer issues it is trying to tolerate.
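
The point about re-navigation can be shown with a simplified stand-in. The sketch below is not the repository's recover_or_replace_page helper, whose implementation is not shown here; wait_for_composer is a hypothetical function that only uses Playwright's page.goto and page.wait_for_selector, and the "textarea" selector is an assumption.

```python
from typing import Any, Optional


def wait_for_composer(
    page: Any,
    base_url: str,
    selector: str = "textarea",
    timeout_ms: int = 60_000,
    attempts: int = 2,
) -> None:
    """Wait for the composer, re-navigating to base_url before each retry."""
    last_error: Optional[BaseException] = None
    for attempt in range(attempts):
        try:
            if attempt > 0:
                # Without this, the retry may wait in a blank or stale document.
                page.goto(base_url)
            page.wait_for_selector(selector, timeout=timeout_ms)
            return
        except Exception as exc:
            last_error = exc
    raise RuntimeError(f"composer did not mount after {attempts} attempts") from last_error
```

In the actual script the equivalent fix would presumably be passing goto_url=BASE to recover_or_replace_page, as the existing chat and extra UI scripts are said to do.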

@danielhanchen danielhanchen merged commit 0542dc0 into main May 17, 2026
34 checks passed
@danielhanchen danielhanchen deleted the studio/ime-i18n-regression branch May 17, 2026 11:20

Labels

auto-addresses-issue (Pre-flight: appears to address an open issue), auto-approved (Auto-review approved the PR)
