Studio: stop the model from replying twice when it refuses by oobabooga · Pull Request #5775 · unslothai/unsloth

oobabooga · 2026-05-26T04:30:42Z

When a GGUF model in Studio refuses a request, it can reply twice in one turn: e.g. "I will not respond to that." then "I cannot fulfill this request.", each with its own thinking block.

Cause: the agentic loop re-prompts the model when it announces a plan ("I'll...", "Let me...") without calling a tool, detected by the _INTENT_SIGNAL regex. Its i will pattern also matches refusals like "I will not...", so a refusal is misread as a plan and triggers a second generation.

Fix: a negative lookahead so negated forms ("I will not", "I'll never") no longer match. Genuine plans like "I will search" still re-prompt.

Only fires with tools enabled. Related: #5714, #5549, #5706 rework the same path for other cases; none cover refusals.

Reproduction

Load a GGUF model in Studio (e.g. gemma-4-E2B-it-GGUF) and turn on tools ("Think, Search, Code").
Send a request the model refuses with a reply starting "I will not...".
It replies twice in one turn, each with its own thinking block.

chatgpt-codex-connector · 2026-05-26T04:30:46Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

gemini-code-assist

Code Review

This pull request updates the regular expression in llama_cpp.py to include a negative lookahead that drops negated forms (like "not" or "never") so that model refusals do not trigger a re-prompt. The reviewer suggested expanding this negative lookahead to cover other common refusal phrasings such as "decline", "refuse", or "be unable" to make the detection more robust.

gemini-code-assist · 2026-05-26T04:31:43Z

+    # Negative lookahead drops negated forms ("I will not", "I'll never")
+    # so a refusal doesn't trigger a re-prompt.
+    r"\b(i['\u2019](ll|m going to|m gonna)|i am (going to|gonna)|i will|i shall|let me|allow me)\b(?!\s+(?:not|never)\b)"


A refusal-verb list is bypassed by one intervening word ("I will respectfully decline", "I will have to decline"), so it never converges. The general case belongs in the "don't re-prompt after a final answer" path (the #5714 direction), not in this regex. Keeping it to "not"/"never", which handles the common case.

MLX CI on Mac M1 + Backend CI (both Repo tests CPU and Python 3.10/11/12/13) have been red on every push to main for days. None of the underlying code is wrong; three test files have stale anchors / assertions left behind by PR #5537 (max_steps bump) and PR #5775 (composer + provision-desktop-auth). 1. tests/studio/run_real_mlx_smoke.py:393 PR #5537 bumped max_steps from 7 to 30 for seed-robust convergence but left `assert len(losses_per_step) == 7`. With logging_steps=1 the callback fires once per step; 30 entries, not 7. Track config.max_steps so the gate auto-follows future bumps. 2. tests/studio/test_composer_rtl_bidi_attribute.py:29 PR #5775 changed the composer aria-label from the literal `aria-label="Message input"` to a JSX ternary `aria-label={overlay ? "Image edit instructions" : "Message input"}`. Anchor on the inner string literal `"Message input"` instead. 3. studio/backend/tests/test_desktop_auth.py:487 The guarded_import in test_provision_desktop_auth_writes_secret_and_creates_db_without_backend_deps blocks any import whose name == "utils", including the relative `from .utils import echo` inside typer._click.decorators (typer 0.25+). Gate the block on level == 0 so only absolute imports of `utils` / `auth` / `fastapi` / `structlog` are rejected; relative imports inside third-party packages pass through. All three tests pass locally; the MLX one is a mechanical 7->config.max_steps swap and will be exercised by MLX CI on this PR.

…d CI) (#5803) * tests: unblock three stale assertions broken on main MLX CI on Mac M1 + Backend CI (both Repo tests CPU and Python 3.10/11/12/13) have been red on every push to main for days. None of the underlying code is wrong; three test files have stale anchors / assertions left behind by PR #5537 (max_steps bump) and PR #5775 (composer + provision-desktop-auth). 1. tests/studio/run_real_mlx_smoke.py:393 PR #5537 bumped max_steps from 7 to 30 for seed-robust convergence but left `assert len(losses_per_step) == 7`. With logging_steps=1 the callback fires once per step; 30 entries, not 7. Track config.max_steps so the gate auto-follows future bumps. 2. tests/studio/test_composer_rtl_bidi_attribute.py:29 PR #5775 changed the composer aria-label from the literal `aria-label="Message input"` to a JSX ternary `aria-label={overlay ? "Image edit instructions" : "Message input"}`. Anchor on the inner string literal `"Message input"` instead. 3. studio/backend/tests/test_desktop_auth.py:487 The guarded_import in test_provision_desktop_auth_writes_secret_and_creates_db_without_backend_deps blocks any import whose name == "utils", including the relative `from .utils import echo` inside typer._click.decorators (typer 0.25+). Gate the block on level == 0 so only absolute imports of `utils` / `auth` / `fastapi` / `structlog` are rejected; relative imports inside third-party packages pass through. All three tests pass locally; the MLX one is a mechanical 7->config.max_steps swap and will be exercised by MLX CI on this PR. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…k-glm-kimi Resolves three conflicts against the updated 5620 base (which itself merged main after main moved on with PRs unslothai#5735 / unslothai#5775 / unslothai#5803 etc. touching the same routes/inference.py and tool_call_parser.py surface): * studio/backend/core/inference/tool_call_parser.py ``_TOOL_CLOSED_PATS``: kept 5624's full set (Mistral pre-v11 array, Mistral v11+ name{json}, DeepSeek envelope, Kimi section) on top of the 3-pattern base. New 5620 base reverted to the 3 base patterns because main never carried the tool-format extensions. * studio/backend/routes/inference.py Merged the two regex bodies: kept 5624's elaborate python_tag ``(?:[^<]|<(?!\|))*`` clause and the new-family closed-pair patterns (DeepSeek envelope, Kimi section). DROPPED the inlined Mistral patterns in favour of the base's ``_strip_tool_xml`` helper which delegates Mistral handling to the parser module's ``_strip_mistral_closed_calls`` -- the non-greedy ``\{.*?\}`` form truncates at the first ``}`` of a nested JSON arg, so balanced brace/bracket scanning is correct here. Also kept the base's orphan-close and tail-only ``</parameter>`` patterns from the speculative-buffer split boundary work. Net: 9 call sites continue to use ``_strip_tool_xml(...)`` (Mistral-safe). * studio/backend/tests/test_safetensors_tool_loop.py ``TestRoutesPythonTagStrip``: kept the base's wording for the section header (the two were near-identical) and switched the helper ``_strip`` back to ``_strip_tool_xml`` since the helper is restored. Also retargeted ``test_pr5624_regressions.py``'s routes-layer strip tests to ``_strip_tool_xml`` for consistency with the restored helper. Tests: pytest studio/backend/tests/test_safetensors_tool_loop.py studio/backend/tests/test_safetensors_capability_advertise.py studio/backend/tests/test_pr5624_regressions.py -q -> 170 passed in 1.93s pytest studio/backend/tests/ -q -k 'not gpu and not llama_cpp_integration' -> 2034 passed, 15 failed (pre-existing on the 5620 base; same set as before the merge: test_training_worker_flash_attn, test_desktop_auth, test_studio_api integration shims).

Studio: stop the model from replying twice when it refuses

86f7db8

oobabooga requested a review from rolandtannous as a code owner May 26, 2026 04:30

gemini-code-assist Bot reviewed May 26, 2026

View reviewed changes

danielhanchen merged commit ef67442 into unslothai:main May 26, 2026
1 check passed

danielhanchen mentioned this pull request May 27, 2026

tests: unblock three stale assertions broken on main (MLX CI + Backend CI) #5803

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Studio: stop the model from replying twice when it refuses#5775

Studio: stop the model from replying twice when it refuses#5775
danielhanchen merged 1 commit into
unslothai:mainfrom
oobabooga:fix/reprompt-negated-intent

oobabooga commented May 26, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 26, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 26, 2026

Uh oh!

oobabooga May 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

oobabooga commented May 26, 2026

Reproduction

Uh oh!

chatgpt-codex-connector Bot commented May 26, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

oobabooga May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants