Chat-template repair: warn-by-default, AST classification, dict support by danielhanchen · Pull Request #5049 · unslothai/unsloth

danielhanchen · 2026-04-16T07:56:20Z

Follow-up hardening on top of PR #4426 (which fixed the #4150 RuntimeError for ChatML LoRA reloads). Addresses the six follow-ups identified during the #4426 investigation.

Behavior changes

1. Warn-by-default instead of RuntimeError

Before: if the tokenizer's chat_template ignored add_generation_prompt and _fix_chat_template could not repair it, fix_chat_template raised a hard RuntimeError that blocked model loading entirely. In the #4150 scenario this meant a tokenizer saved by a downstream tool (LlamaFactory, Axolotl) could not be reloaded at all.

After: repair failure emits a logger.warning_once and returns the original template. Model loading continues; the user can either manually set tokenizer.chat_template or supply their own prompt for inference. UNSLOTH_STRICT_CHAT_TEMPLATE=1 restores the pre-warn hard fail for users who want it.

2. Local path vs HF hub distinguished in the message

Before: Please file a bug report to the maintainers of <name_or_path>. In the #4150 case <name_or_path> was the user's own saves/Llama-3.1-8B-Instruct/lora/train_.../ folder -- the user could not file a bug against their own directory.

After: messages are branched on os.path.isdir(name_or_path). Local paths get a message pointing at the likely downstream tool that re-serialized the tokenizer; HF hub IDs get a message pointing at the upstream model maintainers.

3. Dict / list chat_template now handled

Before: passing a tokenizer whose chat_template is a dict (e.g. Hermes-3 {default, tool_use}) or list to _fix_chat_template crashed with AttributeError: 'dict' object has no attribute 'find'. Surfaced during #4426 Phase 3 testing.

After: fix_chat_template detects dict / list shape and fixes each variant independently, preserving the outer structure.

Internals

_find_end_position now matches all four Jinja whitespace-control variants ({% %}, {%- %}, {% -%}, {%- -%}) and returns the rightmost endfor/endif. Previously {%- endfor -%} (both-side dash, used by Qwen3-Guard) was silently bypassed because the old str.find was only looking for two specific dash styles.
_has_add_generation_prompt_block uses Jinja AST (jinja2.nodes.If / Name walk) instead of substring matching. Templates that mention add_generation_prompt only inside a comment are correctly classified as "no block"; templates that use whitespace-control variants on the if/endif tags are correctly classified as "has block".
_template_ends_with_toplevel_for gates the GH#4150 ChatML repair on the AST: only fires when the last structural top-level node is a For loop (the standard ChatML shape), ignoring trailing pure-whitespace Output nodes. Templates wrapped in an outer If (Qwen3-Guard) are now explicitly skipped at the _fix_chat_template level, not just at load_correct_tokenizer's name-based exemption. This means direct callers of _fix_chat_template also get correct behavior.
_validate_patched_template renders the patched template with and without add_generation_prompt and confirms the patched output responds to the flag by appending (not replacing) content. If validation fails, the patch is discarded and the warn path is taken instead.

Verification

Expanded regression suite across five test files (paths are in the workspace, not committed to this PR):

test_fix_chat_template_pr4426.py: 42/42 template-matrix cells (Hermes, Magnum, Phi-4 single + multi-sep, dot-access, split-literal, trailing whitespace, trailing Jinja comment, two trap templates, already-fixed idempotency, Llama-3 / Gemma-3 / Qwen2.5 regressions).
test_load_correct_tokenizer_pr4426.py: 5/5 integration loads including two synthetic broken stubs that would have crashed on main.
test_chat_template_followups.py: 10/10 new tests covering each follow-up (dash variants, AST classification, dict + list handling, warn vs strict env var, local vs HF message, validation rejection, Qwen3-Guard NOP, dash-both ChatML repair).
test_mistral_pr4426.py: 5 Mistral-family templates byte-identical (Mistral-7B-v0.3, Mistral-Nemo, Mistral-Small-24B, Mixtral).
test_qwen_pr4426.py: 14 Qwen-family templates byte-identical (Qwen1.5, Qwen2, Qwen2.5-Instruct/Coder/Math/VL, Qwen3, Qwen3-Coder, QwQ-32B, Qwen3Guard-Gen).

No change to the ChatML repair heuristic itself: PR #4426 continues to repair the broken Hermes / Magnum / Phi-4 shapes exactly as before.

Follow-ups intentionally NOT included

Replacing the string-surgery rewrite with a full Jinja AST unparse is a separate, larger change. Jinja2 has no built-in unparse, so it would require either a manual visitor that emits Jinja source for each node type, or using the AST only as a classifier (as this PR does for detection and gating) while keeping string surgery for the actual write. This PR takes the latter approach; a full unparse can be a later refactor.

Follow-up hardening on top of PR #4426 (which fixed the #4150 RuntimeError for ChatML LoRA reloads). Behavior changes: - Warn-by-default instead of RuntimeError. When fix_chat_template cannot repair a broken template, emit a warning and return the original. Set UNSLOTH_STRICT_CHAT_TEMPLATE=1 to restore the pre-warn hard fail. Fixes the UX where a missing `{% if add_generation_prompt %}` block on a saved LoRA (typical after LlamaFactory / Axolotl re-serialize) would block model loading entirely. - Local path vs HF hub distinguished in the warning message. For local paths the message points at the likely downstream tool; for HF IDs it points at the upstream model maintainers. Previously both said "file a bug report to the maintainers of <path>" even when <path> was the user's own saves/ directory. - Dict / list chat_template now handled. Hermes-3 ships with {default, tool_use} and the previous code crashed with AttributeError: 'dict' object has no attribute 'find' when entering _fix_chat_template with a dict. Each variant is now fixed independently; structure is preserved. Internals: - _find_end_position now matches all four Jinja whitespace-control variants ({% %}, {%- %}, {% -%}, {%- -%}) and returns the rightmost endfor/endif so multi-for templates aren't locked onto the first loop. Previously {%- endfor -%} (both-side dash, used by Qwen3-Guard) was silently bypassed. - _has_add_generation_prompt_block uses Jinja AST via jinja2.nodes.If/Name walks instead of substring matching, so templates that hide the block behind comments or dash-style variants are classified correctly. - _template_ends_with_toplevel_for gates the GH#4150 ChatML repair on the AST: only fires when the last structural top-level node is a For (standard ChatML shape), ignoring trailing pure-whitespace output nodes. Templates wrapped in an outer If (Qwen3-Guard) are now explicitly skipped at the _fix_chat_template level as well, not just at load_correct_tokenizer's name-based exemption. - _validate_patched_template renders the patched template with and without add_generation_prompt and confirms the patched output responds to the flag by appending (not replacing) content. If validation fails, the patch is discarded and we fall through to the warn path. Verified with an expanded regression suite in tests/: - test_fix_chat_template_pr4426.py: 42/42 template-matrix cells - test_load_correct_tokenizer_pr4426.py: 5/5 tokenizer loads - test_chat_template_followups.py: 10/10 new follow-up tests - test_mistral_pr4426.py: 5 Mistral variants byte-identical - test_qwen_pr4426.py: 14 Qwen variants byte-identical (Qwen1.5, Qwen2, Qwen2.5-Instruct/Coder/Math/VL, Qwen3, Qwen3-Coder, QwQ, Qwen3-Guard-Gen)

for more information, see https://pre-commit.ci

gemini-code-assist

Code Review

This pull request refactors the chat template repair logic to be more robust, utilizing regex and Jinja AST parsing to handle various whitespace-control variants and multi-variant template formats (dicts and lists). It also introduces a validation step for repairs, a strict mode toggle via environment variables, and improved diagnostic messaging that distinguishes between local and upstream sources. A review comment suggests refining the warning message for non-functional template blocks to improve clarity when a block exists but fails to produce the expected output.

gemini-code-assist · 2026-04-16T08:00:34Z

+        "Unsloth: The tokenizer `{name}` does not have a "
+        "{{% if add_generation_prompt %}} block for generation purposes, and "
+        "automatic repair was not possible. The model will still load, but "
+        "`apply_chat_template(add_generation_prompt=True)` may not produce a "
+        "correct assistant-turn marker. {hint} Set "
+        "UNSLOTH_STRICT_CHAT_TEMPLATE=1 to raise instead of warn."
+    ).format(name = name_or_path, hint = source_hint)


The warning message for repaired=False states that the tokenizer "does not have a block", which is slightly misleading when called from the logic path where the block exists but is non-functional (line 965). Consider adjusting the message or adding a parameter to _format_chat_template_message to distinguish between a missing block and one that was provided but is incorrect, as this helps users better understand the underlying issue with their tokenizer configuration.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab99ae11b4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-16T08:00:50Z

+            if _has_add_generation_prompt_block(tmpl):
+                fixed[key] = tmpl
+                continue
+            new_tmpl = _fix_chat_template(tmpl)
+            if _has_add_generation_prompt_block(new_tmpl):


Gate variant rewrites on actual add_generation_prompt behavior

This branch rewrites a dict variant whenever _has_add_generation_prompt_block is false, but that helper only detects {% if ... %} blocks and misses valid inline uses like {{ ... if add_generation_prompt else ... }}. In that case a working template can be rewritten by _fix_chat_template, and wrapping the whole expression in a new {% if add_generation_prompt %} drops the original else path (for example an EOS suffix), changing rendered prompts. The string path avoids this via the no/yes render probe, but dict/list variants currently skip that safety check.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-16T08:00:50Z

            else:
+                fixed[key] = tmpl


Preserve strict/warn diagnostics when variant repair fails

If _fix_chat_template cannot repair a dict/list variant, this code silently keeps the broken template (fixed[key] = tmpl) and continues. That bypasses the new warn/strict behavior used for string templates, so users can still get add_generation_prompt no-op behavior with no warning, and UNSLOTH_STRICT_CHAT_TEMPLATE=1 has no effect for these variants. At minimum the same failure diagnostic path should run when a variant remains unrepaired.

Useful? React with 👍 / 👎.

If tokenizer.chat_template is a property or otherwise read-only, the validation helper would crash with AttributeError when trying to temporarily set the patched template. Catch the assignment failure and return False (skip validation), and best-effort restore in the finally block.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 25c4b58439

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-16T08:13:46Z

+            if _has_add_generation_prompt_block(tmpl):
+                fixed[key] = tmpl
+                continue


Validate dict variants even when block already exists

In the new dict/list handling path, variants are treated as healthy as soon as _has_add_generation_prompt_block(tmpl) is true, but this only checks syntax presence and never verifies behavior. If a variant has a malformed or inert add_generation_prompt block (renders the same output for both flag values), this branch skips the warn/strict diagnostics used for string templates and silently leaves generation markers broken. This can still surface at inference time as missing assistant prefixes while appearing “fixed” during load.

Useful? React with 👍 / 👎.

… non-ChatML templates The previous `_infer_assistant_separator` was a four-tier regex heuristic that only worked on ChatML-shaped templates and forced a hard `<|im_start|>` / `<|im_end|>` presence gate on Case 2 repair. This meant a Llama-3, Gemma, or Phi-3 template stripped of its generation-prompt block by a downstream tool (LlamaFactory, Axolotl, etc.) would still warn-and-return even though the structural shape is identical to the ChatML case the PR already handles. This replaces the regex with `_derive_assistant_prefix_by_render`: render the template with two dialogs that differ only in assistant content, then `os.path.commonprefix` on the tails captures the exact assistant-turn prefix the template emits. The template itself is ground truth, so non-ChatML shapes work as long as the assistant block is a literal the template emits once per message. Three guards keep the derivation safe: A. both assistant renders extend the base render (no reordering); B. the divergence point is exactly the content-insertion site (sentinel follows the common prefix); C. a user-role cross-check: if a render with a user sentinel also emits the same prefix, role has no effect on output and we reject. A render failure on [user, user] (e.g. Gemma's `raise_exception` alternation check) is evidence that role matters; we accept. Sentinels differ at character 0 so `commonprefix` cannot absorb them, and trailing whitespace/comments after the last `{% endfor %}` are stripped before probing (they would appear in base but not after the appended assistant turn and break Guard A). `_fix_chat_template` and `_repair_string_template` now thread an `is_sharegpt` kwarg; `_fix_chat_template` retries once with `is_sharegpt=True` if the first probe returns None (dual-probe fallback for dict/list callers). The ChatML `<|im_start|>` / `<|im_end|>` hard gate in Case 2 is dropped. `_infer_assistant_separator` is deleted. Verified via: - tests/test_fix_chat_template_pr4426.py: 51/51 cells (new Llama-3, Gemma, Phi-3 broken-template rows all repair FIX-OK) - tests/test_load_correct_tokenizer_pr4426.py: 5/5 - tests/test_chat_template_followups.py: 18/18 (T11-T18 cover non-ChatML repair + probe failure modes) - tests/test_mistral_pr4426.py: 5/5 byte-identical - tests/test_qwen_pr4426.py: 14/14 byte-identical (Qwen3-Guard AST gate still rejects) - tests/hermes3_lora_pr4426.py reload: patched template ends with `<|im_start|>assistant\n`, inference returns sensible output. - temp/sim/battery.py: 79/79 followup; vs baseline: 0 regressions, 9 improvements. - Spot-check probe on real stripped tokenizers (Hermes-3, Phi-4, Llama-3.2-1B, Gemma-3-1B): all derive the expected prefix.

for more information, see https://pre-commit.ci

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 64db5e6db3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-04-16T09:27:53Z

+    except Exception:
+        return chat_template


Warn or fail when add_generation_prompt=True probe throws

If apply_chat_template(..., add_generation_prompt=True) raises while the False probe succeeds, this except silently returns the original template and skips both the new warning path and strict-mode enforcement. That leaves templates with a broken generation branch undiagnosed during load, and they will still fail later at inference when callers enable add_generation_prompt. This affects any tokenizer whose generation-specific branch errors only when the flag is true.

Useful? React with 👍 / 👎.

…comment-safe end scan Resolves three reviewer findings on PR #5049 (`fix/chat-template-followups`): Finding #1 [10/10]: dict/list variants now route through `_fix_chat_template_for_tokenizer` via a new `_VariantTokenizerProxy` adapter. Previously the dict/list branches called `_fix_chat_template` directly, silently bypassing the warn/strict (`UNSLOTH_STRICT_CHAT_TEMPLATE`) contract, the `no == yes` diagnostic, broken-existing-block detection, and `_validate_patched_template` guard. The proxy swaps `base.chat_template` to the variant string before each `apply_chat_template` call so tokenizer globals (`bos_token`, custom filters, `raise_exception`) remain available; if the base is read-only it falls back to isolated Jinja rendering. Finding #2 [1/10]: `_has_add_generation_prompt_block` now requires the `If` body to contain at least one `Output` node (a new `_if_body_emits_content` helper walks descendants). This distinguishes a real generation-prompt block from a header guard like `{% if not add_generation_prompt is defined %}{% set ... %}{% endif %}` (body contains only `Assign`) which references the name but emits nothing. Also dropped a now-redundant `"add_generation_prompt" not in scrubbed` guard in `_fix_chat_template` Case 2 so header-guarded templates still get repaired. Finding #4 [1/10]: `_find_end_position` now replaces Jinja comments with equal-length whitespace before scanning for `{% endfor %}` / `{% endif %}` tokens. This prevents a trailing comment containing those tokens from being picked as the real end tag. Positions in the padded string map 1:1 to positions in the original template. Tests: - tests/test_chat_template_followups.py: 21/21 (T19 strict-mode dict variant, T20 header-guard repair, T21 comment-endfor trap added; T4/T5 stubs updated with a working apply_chat_template that routes through Jinja). - tests/test_fix_chat_template_pr4426.py: 51/51 cells unchanged. - tests/test_load_correct_tokenizer_pr4426.py: 5/5. - tests/test_mistral_pr4426.py: 5/5 byte-identical. - tests/test_qwen_pr4426.py: 14/14 byte-identical. - temp/sim/battery.py: 79/79 followup; 0 regressions vs baseline. - Phase 3 Hermes-3 broken-LoRA reload: inference still returns `'The answer to the equation 2+2 is 4.'`. - Spot-checks on Hermes-3 / Phi-4 / Llama-3.2-1B / Gemma-3-1B real stripped templates: probe still derives the expected prefix.

for more information, see https://pre-commit.ci

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5ce5c4e9c6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-04-16T10:27:38Z

+        env = jinja2.Environment(
+            autoescape = False,
+            keep_trailing_newline = True,
+        )
+        tmpl = env.from_string(probe_template)


Render probe templates in a sandboxed Jinja environment

The repair probe now renders chat_template with a plain jinja2.Environment, which executes model-provided template code outside a sandbox. In the load path, chat_template can come from untrusted tokenizer artifacts, so a malicious template can run arbitrary Python during this probe when the no == yes repair path is hit. This is a security regression from relying on the tokenizer's normal sandboxed rendering; use jinja2.sandbox.ImmutableSandboxedEnvironment (or delegate probing to tokenizer.apply_chat_template) for these internal renders.

Useful? React with 👍 / 👎.

danielhanchen · 2026-04-16T10:29:45Z

/gemini review

gemini-code-assist

Code Review

This pull request significantly refactors the chat template patching logic in unsloth/tokenizer_utils.py, replacing regex-based inference with a more robust approach using Jinja AST analysis and render-diffing. It introduces support for multi-variant templates and a strict mode for template validation. Feedback suggests enhancing the isolated Jinja environments used for probing and validation by including common globals like raise_exception to improve compatibility with templates that enforce role alternation.

gemini-code-assist · 2026-04-16T10:35:30Z

+        env = jinja2.Environment(
+            autoescape = False,
+            keep_trailing_newline = True,
+        )


The isolated Jinja environment used for prefix derivation lacks common globals like raise_exception, which are frequently used in model templates (e.g., Mistral, Gemma) to enforce role alternation. If a template calls raise_exception during the probe render, _derive_assistant_prefix_by_render will return None and skip the repair.

Consider adding a dummy raise_exception function to env.globals to improve compatibility with these templates.

gemini-code-assist · 2026-04-16T10:35:31Z

+            env = jinja2.Environment(
+                autoescape = False,
+                keep_trailing_newline = True,
+            )


Similar to the prefix derivation probe, this fallback Jinja environment for read-only tokenizers lacks access to tokenizer-specific globals and filters. While acceptable as a fallback, adding raise_exception to env.globals here would allow the validation and diagnostic logic to work for a wider range of templates when the base tokenizer is immutable.

Pure comment minimization across `_find_end_position`, `_has_add_generation_prompt_block`, `_if_body_emits_content`, `_derive_assistant_prefix_by_render`, `_fix_chat_template` Case 2, and `_VariantTokenizerProxy`. No behavior change; same intent, fewer lines. All 21 follow-up tests and the 51-cell Phase 1 matrix still pass.

for more information, see https://pre-commit.ci

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba3d1ad085

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-04-16T10:47:14Z

+        if isinstance(node, jinja2.nodes.Output):
+            return True


Ignore whitespace-only Output nodes in AGP block detection

_if_body_emits_content treats any jinja2.nodes.Output as real content, including newline/indentation TemplateData. For templates that only contain a header guard like {% if not add_generation_prompt is defined %}\n{% set ... %}\n{% endif %}, this returns true even though the branch emits no generation marker, so _has_add_generation_prompt_block misclassifies the template as already having a usable generation block. In the no == yes path, _fix_chat_template_for_tokenizer then skips repair and leaves add_generation_prompt ineffective, which breaks prompt construction for affected tokenizers.

Useful? React with 👍 / 👎.

Three real bugs from the 10-agent Opus review: 1. Probe now uses `jinja2.sandbox.SandboxedEnvironment` instead of bare `jinja2.Environment`. The probe renders at model-load time (before the user calls `apply_chat_template`), so it was a new eager code-execution surface that the base HF tokenizer loading does not have. SandboxedEnvironment blocks attribute-chain exploits at negligible cost. 2. `_repair_string_template` now tries validation with both `is_sharegpt=False` and `is_sharegpt=True`. Previously, when `_fix_chat_template` internally fell back to the other schema via its dual-probe, the outer validation still used the caller's original `is_sharegpt` -- rendering with the wrong message keys and spuriously dropping a valid repair. 3. `_has_add_generation_prompt_block` now skips `If` nodes whose test is a `Not` expression. A negated gate like `{% if not add_generation_prompt %}{{ x }}{% endif %}` fires when agp=False, so its emitting body is not a generation block -- but the old code counted any Name reference regardless of polarity. Cleanup: removed unused `self._label`, added `\r` escape in generation-block literal, switched variant labels to `!r` formatting, removed redundant `import os as _os`.

for more information, see https://pre-commit.ci

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bec15e865e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-04-16T11:13:35Z

+            env = jinja2.Environment(
+                autoescape = False,
+                keep_trailing_newline = True,
+            )


Render proxy fallback with a sandboxed Jinja environment

Fresh evidence after the earlier sandbox fix: _VariantTokenizerProxy.apply_chat_template still uses a plain jinja2.Environment in its read-only fallback, so model-provided template code is rendered unsandboxed during load-time repair probes. This is reachable when base.chat_template cannot be reassigned, and it reintroduces code-execution risk from untrusted tokenizer templates; the fallback should use a sandboxed environment (or avoid direct Jinja rendering entirely).

Useful? React with 👍 / 👎.

Two critical findings from the 20-reviewer pass: 1. [20/20] The proxy read-only fallback used bare `jinja2.Environment`, not sandboxed. All 20 reviewers independently reproduced marker-file creation via `cycler.__init__.__globals__['os'].system(...)` during `fix_chat_template()`. Fixed: fallback now uses `from jinja2.sandbox import SandboxedEnvironment`. 2. [14/20] The render-diff probe did `import jinja2` then referenced `jinja2.sandbox.SandboxedEnvironment`. `jinja2.sandbox` is a submodule that is NOT auto-imported by `import jinja2` on Jinja 3.1.6. This caused `AttributeError` (swallowed by `except Exception`), making the entire Case 2 repair path silently return None in a clean process. The 6 reviewers who saw it work had `jinja2.sandbox` pre-imported by an earlier module in their process. Fixed: both the probe and the proxy fallback now use `from jinja2.sandbox import SandboxedEnvironment`.

for more information, see https://pre-commit.ci

danielhanchen · 2026-04-16T11:51:08Z

/gemini review

gemini-code-assist

Code Review

This pull request refactors the chat template patching logic in unsloth/tokenizer_utils.py to improve robustness and support for complex tokenizer configurations. Key changes include the introduction of AST-based template analysis, a render-diff mechanism for inferring assistant prefixes, and a proxy class to handle multi-variant tokenizers. It also adds an environment variable to control strictness and provides more descriptive warning messages for incomplete templates. I have no feedback to provide.

danielhanchen · 2026-04-16T13:49:14Z

Auto-review verdict: Changes requested

Reason: no verdict parsed; defaulting based on commit history

* Fix review findings for PR #49 1. Sandbox fallback Jinja env in _VariantTokenizerProxy.apply_chat_template (use SandboxedEnvironment, matching _derive_assistant_prefix_by_render) 2. Unwrap benign outer-If guards in _template_ends_with_toplevel_for so templates like {% if messages %}{% for ... %}{% endfor %}{% endif %} are still repairable (preserves Qwen3-Guard rejection via else-branch and add_generation_prompt-name checks) 3. Preserve raw name_or_path in _VariantTokenizerProxy._source_path so local-path detection works for dict/list variant tokenizers 4. Context-aware strict-mode messages: omit "will still load" and "Set UNSLOTH_STRICT_CHAT_TEMPLATE=1" when already raising * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

danielhanchen requested a review from mmathew23 as a code owner April 16, 2026 07:56

[pre-commit.ci] auto fixes from pre-commit.com hooks

ab99ae1

for more information, see https://pre-commit.ci

gemini-code-assist Bot reviewed Apr 16, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Apr 16, 2026

View reviewed changes

danielhanchen and others added 2 commits April 16, 2026 09:20

[pre-commit.ci] auto fixes from pre-commit.com hooks

64db5e6

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed Apr 16, 2026

View reviewed changes

danielhanchen and others added 2 commits April 16, 2026 10:22

[pre-commit.ci] auto fixes from pre-commit.com hooks

5ce5c4e

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed Apr 16, 2026

View reviewed changes

gemini-code-assist Bot reviewed Apr 16, 2026

View reviewed changes

danielhanchen and others added 2 commits April 16, 2026 10:41

[pre-commit.ci] auto fixes from pre-commit.com hooks

ba3d1ad

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed Apr 16, 2026

View reviewed changes

danielhanchen and others added 2 commits April 16, 2026 11:08

[pre-commit.ci] auto fixes from pre-commit.com hooks

bec15e8

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed Apr 16, 2026

View reviewed changes

danielhanchen mentioned this pull request Apr 16, 2026

Chat-template repair: warn-by-default, AST classification, dict support danielhanchen/unsloth-staging-2#49

Closed

danielhanchen added the auto-reviewing Auto-review in progress label Apr 16, 2026

danielhanchen and others added 2 commits April 16, 2026 11:29

[pre-commit.ci] auto fixes from pre-commit.com hooks

c0cb430

for more information, see https://pre-commit.ci

danielhanchen added a commit to danielhanchen/unsloth-staging-2 that referenced this pull request Apr 16, 2026

Add review tests for PR unslothai#5049

c122b08

gemini-code-assist Bot reviewed Apr 16, 2026

View reviewed changes

danielhanchen added a commit to danielhanchen/unsloth-staging-2 that referenced this pull request Apr 16, 2026

Add review tests for PR unslothai#5049

fbbfa37

danielhanchen added a commit to danielhanchen/unsloth-staging-2 that referenced this pull request Apr 16, 2026

Add review tests for PR unslothai#5049

490555c

danielhanchen merged commit c5be8b1 into main Apr 16, 2026
5 checks passed

danielhanchen deleted the fix/chat-template-followups branch April 16, 2026 12:52

danielhanchen added a commit to danielhanchen/unsloth-staging-2 that referenced this pull request Apr 16, 2026

Add review tests for PR unslothai#5049

5c75969

danielhanchen added a commit to danielhanchen/unsloth-staging-2 that referenced this pull request Apr 16, 2026

Add review tests for PR unslothai#5049

e827a0e

danielhanchen mentioned this pull request Apr 16, 2026

Fix review findings for chat-template repair (#5049) #5056

Merged

danielhanchen added auto-review-failed Auto-review rejected the PR and removed auto-reviewing Auto-review in progress labels Apr 16, 2026

Uh oh!

Conversation

danielhanchen commented Apr 16, 2026

Behavior changes

1. Warn-by-default instead of RuntimeError

2. Local path vs HF hub distinguished in the message

3. Dict / list chat_template now handled

Internals

Verification

Follow-ups intentionally NOT included

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

danielhanchen commented Apr 16, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

danielhanchen commented Apr 16, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review