t3219: fix CodeRabbit review feedback on prompt-guard patterns#4692
t3219: fix CodeRabbit review feedback on prompt-guard patterns#4692alex-solovyev merged 1 commit intomainfrom
Conversation
- Move _PG_YAML_PATTERNS_LOADED flag assignment to after successful parse+cache in _pg_load_yaml_patterns(); previously set after file discovery but before parsing, so a parse failure would permanently disable YAML loading on subsequent calls (transient failures treated as permanent) - Replace byte-level \xNN hex escapes in zero-width character YAML pattern with literal Unicode chars (U+200B ZWSP, U+200C ZWNJ, U+200D ZWJ, U+FEFF BOM) for portability across rg/grep/ggrep; byte-level escapes match individual bytes, not multi-byte UTF-8 codepoints Closes #3219
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses previously unactioned feedback, enhancing the robustness and correctness of the prompt-guard system. It rectifies a caching logic flaw that could prematurely disable YAML pattern loading and refines a zero-width character detection pattern for improved cross-platform compatibility. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Important Review skippedAuto reviews are limited based on label configuration. 🚫 Review skipped — only excluded labels are configured. (1)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughUpdated the zero-width character pattern from hex-escaped byte sequences to literal Unicode characters for portability, and repositioned the YAML patterns loaded flag to trigger only after successful parsing rather than after file discovery, preventing transient failures from permanently disabling pattern loading. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Suggested labels
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sat Mar 14 07:19:20 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
There was a problem hiding this comment.
Code Review
This pull request effectively addresses two separate issues. The first fix correctly defers setting the _PG_YAML_PATTERNS_LOADED flag in prompt-guard-helper.sh until after patterns are successfully parsed and cached, which robustly handles transient parsing failures. The second fix in prompt-injection-patterns.yaml corrects a faulty regex pattern for zero-width characters. While the fix is correct, I've suggested a small improvement to use YAML's Unicode escape sequences for better maintainability. Overall, these are solid improvements.
| # Literal Unicode chars (U+200B ZWSP, U+200C ZWNJ, U+200D ZWJ, U+FEFF BOM) | ||
| # for portability across rg/grep/ggrep — byte-level \xNN escapes match | ||
| # individual bytes, not multi-byte UTF-8 codepoints. | ||
| pattern: '[]' |
There was a problem hiding this comment.
For better readability and maintainability, consider using YAML's standard Unicode escape sequences within a double-quoted string. This approach makes the pattern explicit and avoids potential issues with editors or source control tools mishandling the literal, invisible zero-width characters.
pattern: "[\u200B\u200C\u200D\uFEFF]"
|
…ns (#4692) - Move _PG_YAML_PATTERNS_LOADED flag assignment to after successful parse+cache in _pg_load_yaml_patterns(); previously set after file discovery but before parsing, so a parse failure would permanently disable YAML loading on subsequent calls (transient failures treated as permanent) - Replace byte-level \xNN hex escapes in zero-width character YAML pattern with literal Unicode chars (U+200B ZWSP, U+200C ZWNJ, U+200D ZWJ, U+FEFF BOM) for portability across rg/grep/ggrep; byte-level escapes match individual bytes, not multi-byte UTF-8 codepoints Closes #3219



Summary
Addresses unactioned CodeRabbit review feedback from PR #2715 (issue #3219).
Changes
_PG_YAML_PATTERNS_LOADEDflag moved after successful parse+cacheBug:
_PG_YAML_PATTERNS_LOADED="true"was set after file discovery succeeds but before parsing. If parsing produced no patterns (e.g., malformed YAML, empty file), the flag was already set totruewith an empty cache. Subsequent calls would hit the cache check, find_PG_YAML_PATTERNS_CACHEempty, and return 1 — permanently disabling YAML loading for the session.Fix: Move the flag assignment to after
_PG_YAML_PATTERNS_CACHE="$patterns"so it is only set on a successful parse+cache. Transient parse failures now allow retries.Zero-width character pattern uses literal Unicode chars
Bug: The YAML pattern
'[\xE2\x80\x8B\xE2\x80\x8C\xE2\x80\x8D\xEF\xBB\xBF]'used byte-level\xNNhex escapes. In a character class, these match individual bytes (0xE2, 0x80, 0x8B, etc.), not the multi-byte UTF-8 codepoints U+200B/U+200C/U+200D/U+FEFF.Fix: Replace with literal Unicode characters (U+200B ZWSP, U+200C ZWNJ, U+200D ZWJ, U+FEFF BOM) — same approach used for the homoglyph patterns. Verified:
rgcorrectly matches the literal chars against test input.Verification
shellcheck: SC1091 info only (pre-existing, not a violation)prompt-guard-helper.sh status: 119 YAML patterns loaded correctlyprompt-guard-helper.sh test: all PASSCloses #3219
Summary by CodeRabbit
Release Notes
Bug Fixes
Updates