Skip to content

t1375.1: Add YAML pattern loading and Lasso pattern merge to prompt-guard#2715

Merged
alex-solovyev merged 8 commits intomainfrom
feature/yaml-pattern-loading
Mar 2, 2026
Merged

t1375.1: Add YAML pattern loading and Lasso pattern merge to prompt-guard#2715
alex-solovyev merged 8 commits intomainfrom
feature/yaml-pattern-loading

Conversation

@alex-solovyev
Copy link
Collaborator

@alex-solovyev alex-solovyev commented Mar 2, 2026

Summary

  • Add _pg_load_yaml_patterns() with auto-detect (script-relative, deployed, env var) and inline fallback when YAML unavailable
  • Create .agents/configs/prompt-injection-patterns.yaml with 108 patterns across 8 categories (39 existing + 69 net-new from Lasso Security's claude-hooks, MIT licensed)
  • Add scan-stdin subcommand for pipeline use (curl -s URL | prompt-guard-helper.sh scan-stdin)
  • Extend test suite from 28 to 74 tests covering all new pattern categories

New Pattern Categories (from Lasso)

Category Net-new patterns
Homoglyph attacks (Cyrillic/Greek) 2
Fake JSON/XML system roles 3
HTML/code comment injection 4
Priority manipulation 4
Fake delimiter markers 5
Split personality / evil twin 3
Acrostic/steganographic 1
Fake previous conversation claims 4
Restriction bypass variants 5
False authority claims 6
Instruction boundary manipulation 3
Reset/clear/wipe attempts 4
Encoding variants (hex, base64, URL) 4
Other (replace, redefine, etc.) 21

Architecture

Pattern loading priority:
  1. YAML file (auto-detected or PROMPT_GUARD_YAML_PATTERNS env var)
  2. Inline _pg_get_patterns() fallback (always available)
  3. Custom patterns (PROMPT_GUARD_CUSTOM_PATTERNS, always loaded if set)

Verification

  • 74/74 tests pass (prompt-guard-helper.sh test)
  • ShellCheck clean (only pre-existing SC1091 info)
  • Inline fallback verified by temporarily removing YAML file
  • scan-stdin tested with both injection and clean content
  • status command shows YAML vs inline source and pattern counts

Closes #2698

Summary by CodeRabbit

  • New Features

    • stdin scanning for prompt-injection detection.
    • External YAML-backed pattern repository with auto-detection, caching, and custom pattern overrides.
    • Expanded detection patterns covering instruction overrides, jailbreaks, encoding obfuscation, delimiter injections, system-prompt extraction, social engineering, data exfiltration, and context manipulation.
  • Documentation

    • Enhanced CLI help and status output showing active pattern sources and counts.
  • Tests

    • Added tests for YAML-driven patterns, stdin scanning, fallback behavior, and severity handling.
  • Chores

    • Improved logging/audit reporting of findings and severity metrics.

@github-actions github-actions bot added the enhancement Auto-created from TODO.md tag label Mar 2, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 2, 2026

Warning

Rate limit exceeded

@alex-solovyev has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 5 minutes and 20 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 7f33771 and 3c42354.

📒 Files selected for processing (1)
  • .agents/scripts/prompt-guard-helper.sh

Walkthrough

Adds a new YAML-based prompt-injection pattern repository and integrates YAML-driven loading into the prompt-guard helper script with inline-pattern fallback, plus a new scan-stdin subcommand, custom-pattern support, status/help updates, and tests for YAML-driven scanning.

Changes

Cohort / File(s) Summary
Pattern Repository
\.agents/configs/prompt-injection-patterns.yaml
New ~533-line YAML containing categorized prompt-injection patterns (severity, description, PCRE). Merges existing inline patterns with ~29 net-new Lasso-origin patterns across categories like instruction_override, role_play, delimiter_injection, encoding_tricks, system_prompt_extraction, social_engineering, data_exfiltration, and context_manipulation. Includes licensing and contributor guidance.
Scanner Enhancement
\.agents/scripts/prompt-guard-helper.sh
Added YAML discovery/loading (_pg_find_yaml_patterns(), _pg_load_yaml_patterns()), caching, YAML/env-controlled source (PROMPT_GUARD_YAML_PATTERNS), inline-pattern fallback, unified stream-based scanning (_pg_scan_patterns_from_stream()), custom-pattern support (PROMPT_GUARD_CUSTOM_PATTERNS), new CLI command scan-stdin with cmd_scan_stdin(), enhanced status/help output, extended logging/audit, and added tests/demos covering YAML and stdin scanning.

Sequence Diagram(s)

sequenceDiagram
    participant Input as Untrusted Content
    participant Loader as Pattern Loader
    participant YAML as YAML Patterns
    participant Inline as Inline Patterns (Fallback)
    participant Scanner as Pattern Scanner
    participant Reporter as Finding Reporter

    Input->>Loader: submit stdin/file content
    Loader->>YAML: attempt load (cache check)
    alt YAML available
        YAML-->>Loader: return YAML patterns
    else YAML unavailable
        Loader->>Inline: load inline fallback patterns
        Inline-->>Loader: return inline patterns
    end
    Loader->>Scanner: provide patterns + PROMPT_GUARD_CUSTOM_PATTERNS
    Scanner->>Scanner: execute PCRE matches
    Scanner->>Reporter: emit findings with severities
    Reporter-->>Input: output status, findings, exit code
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

A YAML garden grows, patterns set to guard,
Lasso threads and aidevops stitched in the yard,
stdin now checks what agents dare to read,
Fallbacks stand ready when configs recede,
Cheers — regex sentries keep the pipeline hard.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main changes: YAML pattern loading and Lasso pattern merge into the prompt-guard helper script.
Linked Issues check ✅ Passed All objectives from issue #2698 are met: YAML pattern loading with fallback implemented, 108 patterns merged (39 existing + 69 from Lasso), scan-stdin subcommand added, and inline patterns preserved as fallback.
Out of Scope Changes check ✅ Passed All changes are within scope—YAML pattern loading, Lasso pattern integration, and scan-stdin subcommand directly address #2698 objectives with no unrelated modifications.
Docstring Coverage ✅ Passed Docstring coverage is 90.91% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/yaml-pattern-loading

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the prompt-guard system's ability to detect prompt injection attacks by introducing a flexible YAML-based pattern loading system. This change allows for easier maintenance and expansion of detection capabilities, incorporating a substantial number of new patterns from Lasso Security. Additionally, a new scan-stdin command improves the utility of the prompt-guard-helper.sh script by enabling its use in automated pipelines for scanning arbitrary content.

Highlights

  • YAML Pattern Loading: Implemented a YAML-based pattern loading mechanism for prompt injection detection, allowing for external and dynamic pattern updates with auto-detection and inline fallback.
  • Expanded Pattern Database: Introduced a comprehensive prompt-injection-patterns.yaml file, merging existing patterns with 69 net-new patterns from Lasso Security, significantly expanding detection coverage to 108 patterns across 8 categories.
  • New scan-stdin Subcommand: Added a new scan-stdin subcommand to prompt-guard-helper.sh, enabling pipeline integration for scanning content from standard input.
  • Enhanced Test Coverage: Expanded the test suite from 28 to 74 tests to ensure robust coverage of the newly added patterns and functionalities.
Changelog
  • .agents/configs/prompt-injection-patterns.yaml
    • Added a new YAML file containing 108 prompt injection patterns across 8 categories, including 69 net-new patterns from Lasso Security.
  • .agents/scripts/prompt-guard-helper.sh
    • Updated script description and comments to reflect new features and expanded detection scope.
    • Introduced PROMPT_GUARD_YAML_PATTERNS environment variable for specifying a custom YAML patterns file.
    • Implemented _pg_find_yaml_patterns for auto-detecting the YAML patterns file location (env var, script-relative, deployed).
    • Added _pg_load_yaml_patterns function to parse YAML files and extract patterns, with caching.
    • Modified _pg_scan_message to prioritize loading patterns from the YAML file, falling back to inline patterns if the YAML file is not found or parsable.
    • Created _pg_scan_patterns_from_stream to generalize pattern scanning from a stream.
    • Added cmd_scan_stdin to allow scanning content piped via standard input.
    • Updated cmd_status to display information about the active pattern source (YAML or inline) and pattern counts.
    • Expanded cmd_test with 46 new test cases covering the newly integrated Lasso patterns and the scan-stdin functionality.
    • Updated cmd_help documentation to include the new scan-stdin command, revised severity level descriptions, and details on pattern loading priority.
    • Integrated scan-stdin into the main command dispatch logic.
Activity
  • No activity (comments, reviews, or other interactions) has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 111 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Mon Mar 2 02:06:36 UTC 2026: Code review monitoring started
Mon Mar 2 02:06:36 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 111

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 111
  • VULNERABILITIES: 0

Generated on: Mon Mar 2 02:06:39 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly enhances prompt-guard-helper.sh by introducing pattern loading from a comprehensive YAML file and adding a scan-stdin command for pipeline integration. However, the reliance on a custom, fragile Bash-based YAML parser and insecure handling of untrusted data in internal pipes introduce risks of security bypasses, log injection, and denial of service. My feedback focuses on increasing the robustness of the new YAML parser and ensuring adherence to project conventions for error handling to improve debuggability.

Comment on lines +170 to +210
while IFS= read -r line; do
# Skip comments and empty lines
[[ "$line" =~ ^[[:space:]]*# ]] && continue
[[ -z "${line// /}" ]] && continue

# Category header (top-level key ending with colon, no leading whitespace)
if [[ "$line" =~ ^([a-z_]+):$ ]]; then
current_category="${BASH_REMATCH[1]}"
continue
fi

# List item start (- severity: ...)
if [[ "$line" =~ ^[[:space:]]*-[[:space:]]*severity:[[:space:]]*\"?([A-Z]+)\"?$ ]]; then
# Emit previous pattern if complete
if [[ -n "$severity" && -n "$pattern" && -n "$current_category" ]]; then
patterns+="${severity}|${current_category}|${description}|${pattern}"$'\n'
fi
severity="${BASH_REMATCH[1]}"
description=""
pattern=""
continue
fi

# Description field
if [[ "$line" =~ ^[[:space:]]*description:[[:space:]]*\"(.+)\"$ ]]; then
description="${BASH_REMATCH[1]}"
continue
fi

# Pattern field (single-quoted — YAML standard for regex)
if [[ "$line" =~ ^[[:space:]]*pattern:[[:space:]]*\'(.+)\'$ ]]; then
pattern="${BASH_REMATCH[1]}"
continue
fi

# Pattern field (double-quoted)
if [[ "$line" =~ ^[[:space:]]*pattern:[[:space:]]*\"(.+)\"$ ]]; then
pattern="${BASH_REMATCH[1]}"
continue
fi
done <"$yaml_file"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The custom YAML parser implemented in _pg_load_yaml_patterns is highly fragile and prone to security bypasses. Its restrictive regular expressions, particularly for description and pattern fields, fail to account for valid YAML variations like trailing comments, different quoting styles, or escaped characters. This can lead to critical patterns not being loaded as expected, creating a false sense of security. It is strongly recommended to use a robust, dedicated YAML parser like yq or to significantly harden the regex patterns, for example, by using negated character sets like [^"*] instead of greedy matches (.+) for quoted strings.

Comment on lines +660 to +664
local content
if ! content=$(cat); then
_pg_log_error "Failed to read from stdin"
return 1
fi

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The cmd_scan_stdin function reads the entire content of stdin into a shell variable using content=$(cat). This approach is vulnerable to Denial of Service (DoS) via memory exhaustion if the input is excessively large. Since this tool is designed to process untrusted external content (as noted in the script's comments), it should handle large inputs gracefully, either by limiting the maximum input size or by processing the stream in chunks.

Comment on lines 409 to +411
matched_text=$(_pg_extract_match "$pattern" "$message") || matched_text="[match]"
echo "${severity}|${category}|${description}|${matched_text}"
found=1
_pg_scan_found=1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The internal data flow in _pg_scan_patterns_from_stream constructs a pipe-delimited string by concatenating untrusted data (matched_text) directly into the output: echo "${severity}|${category}|${description}|${matched_text}". If the matched text from the untrusted message contains pipe characters (|) or newlines, it will corrupt the internal data format. This can be exploited to trick the severity assessment logic (e.g., by injecting a fake CRITICAL finding) or to perform log injection in the _pg_log_attempt function. All untrusted data should be sanitized or escaped (e.g., by removing or encoding delimiters and newlines) before being included in delimited formats.


# Try YAML patterns first (comprehensive), fall back to inline (core set)
local yaml_patterns
yaml_patterns=$(_pg_load_yaml_patterns 2>/dev/null) || true

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Suppressing stderr with 2>/dev/null here hides potentially useful diagnostic messages from _pg_load_yaml_patterns, such as warnings about parsing errors or info about falling back to inline patterns. This violates the project's general rule to avoid blanket error suppression for better debuggability. Please remove the redirection to allow these messages to be visible. The || true is sufficient to prevent script exit on failure.

Suggested change
yaml_patterns=$(_pg_load_yaml_patterns 2>/dev/null) || true
yaml_patterns=$(_pg_load_yaml_patterns) || true
References
  1. In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup. Do not suppress stderr with '2>/dev/null' so that actual syntax or system errors remain visible for debugging.


if [[ -n "$yaml_file" ]]; then
local yaml_patterns
yaml_patterns=$(_pg_load_yaml_patterns 2>/dev/null) || yaml_patterns=""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Suppressing stderr with 2>/dev/null here hides potentially useful diagnostic messages from _pg_load_yaml_patterns, such as warnings about parsing errors. This violates the project's general rule to avoid blanket error suppression for better debuggability. Please remove the redirection.

Suggested change
yaml_patterns=$(_pg_load_yaml_patterns 2>/dev/null) || yaml_patterns=""
yaml_patterns=$(_pg_load_yaml_patterns) || yaml_patterns=""
References
  1. Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.

@alex-solovyev
Copy link
Collaborator Author

Pulse supervisor: qlty check is failing with 'Build errored' (infrastructure error at qlty.sh, not a code issue). All other checks pass. Dispatching worker to push a no-op commit to re-trigger CI, or this can be merged once qlty recovers.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
.agents/scripts/prompt-guard-helper.sh (2)

660-670: Consider warning when stdin is a terminal.

If a user runs prompt-guard-helper.sh scan-stdin without piping content, the script will block indefinitely waiting for input. Adding a TTY check would improve the user experience.

♻️ Add TTY detection warning
 cmd_scan_stdin() {
 	local content
+	if [[ -t 0 ]]; then
+		_pg_log_warn "Reading from stdin (Ctrl+D to end, Ctrl+C to cancel)"
+	fi
 	if ! content=$(cat); then
 		_pg_log_error "Failed to read from stdin"
 		return 1
 	fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/prompt-guard-helper.sh around lines 660 - 670, In
cmd_scan_stdin, detect whether stdin is a TTY before attempting to read (use a
check like [ -t 0 ]) and if it is a terminal, log a warning via _pg_log_error or
a warning helper (e.g., _pg_log_warn) that no piped input was detected and
return non-zero instead of blocking; update the flow in cmd_scan_stdin so the
TTY check runs before calling cat and only proceeds to read when stdin is not a
terminal.

171-174: Empty line check may miss tab-only lines.

Line 174 uses ${line// /} which strips only spaces. Lines containing only tabs won't be detected as empty.

♻️ Consistent whitespace handling
 	while IFS= read -r line; do
 		# Skip comments and empty lines
 		[[ "$line" =~ ^[[:space:]]*# ]] && continue
-		[[ -z "${line// /}" ]] && continue
+		[[ "$line" =~ ^[[:space:]]*$ ]] && continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/prompt-guard-helper.sh around lines 171 - 174, The
empty-line check currently strips only spaces using "${line// /}" so tab-only
lines slip through; update the condition that checks variable "line" to detect
any whitespace-only lines (spaces, tabs) — for example, replace the space-only
parameter-expansion check with a regex or whitespace-aware expansion that
matches ^[[:space:]]*$ or otherwise removes both spaces and tabs before testing
and keep the surrounding while-loop and comment-skip logic intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/configs/prompt-injection-patterns.yaml:
- Around line 327-334: The PCRE2-style \x{...} escapes in the homoglyph patterns
(the entries with description "Cyrillic homoglyphs detected" and "Greek
homoglyphs detected") are not portable to ggrep/grep fallbacks; replace those
\x{....} sequences with portable alternatives—either explicit Unicode characters
inside the character classes (paste the actual Cyrillic and Greek letters
instead of \x{0430}/\x{03B1}, etc.) or a POSIX/PCRE-compatible property/class
(e.g., use \p{Cyrillic} or \p{Greek} if your grep variant supports \p{}), and
ensure the change works with the fallback chooser (_pg_detect_grep_cmd) so the
patterns match under ripgrep and the grep/ggrep fallbacks.

In @.agents/scripts/prompt-guard-helper.sh:
- Around line 1078-1082: Update the misleading header comment so it matches the
actual test expectations: change the phrase "should ALLOW on moderate" to
indicate WARN (exit code 2) for LOW patterns under the moderate policy; this
affects the section containing the _test_expect calls (e.g., _test_expect
"Purpose-based framing" 2 ... and _test_expect "Acrostic instruction" 2 ...), so
edit only the human-readable header text to reflect that LOW severity yields
WARN (exit 2).

---

Nitpick comments:
In @.agents/scripts/prompt-guard-helper.sh:
- Around line 660-670: In cmd_scan_stdin, detect whether stdin is a TTY before
attempting to read (use a check like [ -t 0 ]) and if it is a terminal, log a
warning via _pg_log_error or a warning helper (e.g., _pg_log_warn) that no piped
input was detected and return non-zero instead of blocking; update the flow in
cmd_scan_stdin so the TTY check runs before calling cat and only proceeds to
read when stdin is not a terminal.
- Around line 171-174: The empty-line check currently strips only spaces using
"${line// /}" so tab-only lines slip through; update the condition that checks
variable "line" to detect any whitespace-only lines (spaces, tabs) — for
example, replace the space-only parameter-expansion check with a regex or
whitespace-aware expansion that matches ^[[:space:]]*$ or otherwise removes both
spaces and tabs before testing and keep the surrounding while-loop and
comment-skip logic intact.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 33df162 and 0ced3f9c74df4d94a927cf6a995863decb42ac91.

📒 Files selected for processing (2)
  • .agents/configs/prompt-injection-patterns.yaml
  • .agents/scripts/prompt-guard-helper.sh

@alex-solovyev
Copy link
Collaborator Author

qlty check — persistent infrastructure failure

Re-triggered CI with a no-op commit (0ced3f9c). All checks pass except qlty check, which fails with the same Build errored infrastructure error on both the original and re-triggered runs:

This is a qlty.sh service-side issue, not a code problem. All other quality gates (SonarCloud, Codacy, CodeFactor, CodeRabbit, Socket Security, Framework Validation, Version Consistency) pass.

PR is ready to merge once qlty service recovers, or can be merged with admin override bypassing the qlty required check.

alex-solovyev added a commit that referenced this pull request Mar 2, 2026
Address CodeRabbit review requesting file-based scanner variant
documentation. The code block now shows all three subcommands:
- scan: small inline strings (shell variables, short text)
- scan-file: large/file-based content (downloads, uploads, PR diffs)
- scan-stdin: piped content in pipelines (added by t1375.1 / PR #2715)

Also updates the external content lookup table to reference all variants.
@alex-solovyev
Copy link
Collaborator Author

CodeRabbit nitpicks addressed (2ed7ac2f)

Both review suggestions have been implemented:

  1. TTY detection in cmd_scan_stdin() — Added [ -t 0 ] guard at the top of the function. If stdin is a terminal (not piped input), it now logs a warning with usage hint and returns 1 immediately, preventing the function from blocking indefinitely waiting for input.

  2. Whitespace check in _pg_load_yaml_patterns() — Replaced [[ -z "${line// /}" ]] (which only strips spaces) with [[ "$line" =~ ^[[:space:]]*$ ]] to correctly skip lines containing only tabs or mixed whitespace.

All 74 tests pass. ShellCheck clean.

@alex-solovyev
Copy link
Collaborator Author

Pulse dispatched a fix worker to address CodeRabbit CHANGES_REQUESTED: TTY check in scan-stdin, whitespace regex fix in YAML parser, PCRE2 portability fix for homoglyph patterns, and misleading comment correction. Resolves #2697.

@alex-solovyev
Copy link
Collaborator Author

Supervisor pulse killed the previous worker (PID 1675382, uptime 2:57, struggle_ratio 4621 — 4621 messages / 1 commit). Thrashing detected. Re-dispatching with fresh worker.

@alex-solovyev
Copy link
Collaborator Author

@coderabbitai review

All requested changes from the CHANGES_REQUESTED review (on commit 0ced3f9c) have been addressed in subsequent commits:

  • TTY detection in cmd_scan_stdin(): commit 2ed7ac2f — added [ -t 0 ] guard with warning and return 1
  • Whitespace check fix: commit 2ed7ac2f — replaced ${line// /} with [[ "$line" =~ ^[[:space:]]*$ ]]
  • PCRE2 portability fix for homoglyph patterns: commit ca44d999
  • Misleading comment correction: commit ca44d999
  • Additional TTY refinement: commit d1dbdf75

All 74 tests pass. ShellCheck clean. Please re-review the latest commit (d1dbdf75).

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 2, 2026

@alex-solovyev: Thank you for the comprehensive summary of the fixes! I can see you've systematically addressed all the feedback from the previous review:

✅ TTY detection in cmd_scan_stdin() with proper guard and user-friendly error
✅ Whitespace regex fix using proper [[:space:]] character class
✅ PCRE2 portability improvements for homoglyph patterns
✅ Misleading comment corrections
✅ Additional TTY refinements

I'll trigger a fresh review of the latest commit (d1dbdf75) to verify all changes and provide final feedback.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

alex-solovyev added a commit that referenced this pull request Mar 2, 2026
…pt-guard-helper.sh

Implement stdin-reading variants so docs references to 'scan-stdin' are valid
without waiting for PR #2715 to merge. Each reads piped content via cat and
delegates to the existing cmd_scan/cmd_check/cmd_sanitize functions.

Also removes the 'requires v1.x+' caveat from build-plus.md and build.txt
since the subcommand now exists on this branch.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
.agents/configs/prompt-injection-patterns.yaml (1)

436-440: Zero-width character pattern has limited UTF-8 effectiveness (pre-existing).

The pattern at line 440 [\xE2\x80\x8B...] treats UTF-8 multi-byte sequences as individual bytes in a character class. This matches any single byte E2, 80, 8B, etc., rather than the actual zero-width Unicode characters (U+200B, U+200C, U+200D, U+FEFF).

This is a pre-existing limitation from the inline fallback patterns. For improved detection, consider using literal Unicode characters (same approach as homoglyphs):

💡 Optional: Use literal zero-width characters for accurate detection
   - severity: LOW
     description: "Zero-width characters"
-    pattern: '[\xE2\x80\x8B\xE2\x80\x8C\xE2\x80\x8D\xEF\xBB\xBF]'
+    pattern: '[​‌‍]'

Note: The replacement pattern contains literal U+200B (ZWSP), U+200C (ZWNJ), U+200D (ZWJ), and U+FEFF (BOM) — they appear invisible but are present.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/configs/prompt-injection-patterns.yaml around lines 436 - 440, The
"Zero-width characters" pattern currently uses byte-wise escapes
'[\xE2\x80\x8B\xE2\x80\x8C\xE2\x80\x8D\xEF\xBB\xBF]' which matches raw UTF-8
bytes instead of the actual Unicode codepoints; update the pattern to match the
real zero-width characters by replacing that character class with either literal
Unicode characters (U+200B U+200C U+200D U+FEFF) or with Unicode escapes such as
'\u200B\u200C\u200D\uFEFF' (ensuring the YAML quoting is valid) so the pattern
detects ZWSP, ZWNJ, ZWJ and BOM correctly in the "Zero-width characters" entry.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.agents/configs/prompt-injection-patterns.yaml:
- Around line 436-440: The "Zero-width characters" pattern currently uses
byte-wise escapes '[\xE2\x80\x8B\xE2\x80\x8C\xE2\x80\x8D\xEF\xBB\xBF]' which
matches raw UTF-8 bytes instead of the actual Unicode codepoints; update the
pattern to match the real zero-width characters by replacing that character
class with either literal Unicode characters (U+200B U+200C U+200D U+FEFF) or
with Unicode escapes such as '\u200B\u200C\u200D\uFEFF' (ensuring the YAML
quoting is valid) so the pattern detects ZWSP, ZWNJ, ZWJ and BOM correctly in
the "Zero-width characters" entry.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ced3f9c74df4d94a927cf6a995863decb42ac91 and d1dbdf75674b374b63a14927d924a313c958af0a.

📒 Files selected for processing (2)
  • .agents/configs/prompt-injection-patterns.yaml
  • .agents/scripts/prompt-guard-helper.sh

@alex-solovyev
Copy link
Collaborator Author

Merging: qlty check shows 'Build errored' which is an infrastructure-side error (not a code issue — the qlty build system errored, not our code). All other checks pass. Review is APPROVED. Resolves #2698.

@alex-solovyev
Copy link
Collaborator Author

PR has merge conflicts with main — needs rebase before merge. Dispatching worker to resolve.

…d (t1375.1)

Add _pg_load_yaml_patterns() with auto-detect and inline fallback, create
prompt-injection-patterns.yaml with 108 patterns (39 existing + 69 from
Lasso Security's claude-hooks MIT patterns covering homoglyph/Unicode,
fake JSON/XML system roles, HTML/code comment injection, priority
manipulation, fake delimiters, split personality, acrostic/steganographic,
and fake previous conversation claims), add scan-stdin subcommand for
pipeline use (curl | scan-stdin), extend test suite to 74 tests.

Closes #2698
…ompt-guard

- cmd_scan_stdin(): add TTY detection to warn and return early instead of
  blocking indefinitely when stdin is a terminal
- _pg_load_yaml_patterns(): replace space-only strip (${line// /}) with
  [[:space:]] regex to also catch tab-only lines
- prompt-injection-patterns.yaml: replace PCRE2-only \x{...} escapes in
  homoglyph patterns with literal Unicode chars for portability across
  rg, grep -P, and grep -E engines
- prompt-guard-helper.sh: fix misleading test header — LOW patterns under
  moderate policy produce WARN (exit 2), not ALLOW
Address CodeRabbit review: when stdin is a terminal, show a helpful
hint (Ctrl+D to end, Ctrl+C to cancel) instead of refusing to run.
This allows interactive use while still guiding the user.
@alex-solovyev alex-solovyev force-pushed the feature/yaml-pattern-loading branch from d1dbdf7 to 7f33771 Compare March 2, 2026 05:15
@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 111 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Mon Mar 2 05:15:54 UTC 2026: Code review monitoring started
Mon Mar 2 05:15:55 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 111

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 111
  • VULNERABILITIES: 0

Generated on: Mon Mar 2 05:15:57 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@alex-solovyev
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 2, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@alex-solovyev
Copy link
Collaborator Author

Supervisor pulse killed thrashing worker (PID 1872783, struggle_ratio 4831, 0 commits in 3h) and applied fixes directly:

  1. Duplicate YAML status section in cmd_status() — already removed in a prior commit (lines 1023-1031 no longer exist)
  2. Duplicate PROMPT_GUARD_YAML_PATTERNS in header comment (lines 35/37) — consolidated to single entry in commit 5b6bbf7
  3. Duplicate PROMPT_GUARD_YAML_PATTERNS in help text (lines 1474/1476) — consolidated to single entry in commit 5b6bbf7

CodeRabbit re-review triggered. New monitor worker dispatched to merge once approved.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 111 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Mon Mar 2 05:50:28 UTC 2026: Code review monitoring started
Mon Mar 2 05:50:28 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 111

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 111
  • VULNERABILITIES: 0

Generated on: Mon Mar 2 05:50:31 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@alex-solovyev
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 2, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
.agents/scripts/prompt-guard-helper.sh (1)

150-229: ⚠️ Potential issue | 🔴 Critical

Remove the duplicate _pg_load_yaml_patterns() definition; it disables YAML auto-detection.

Line 249 redefines _pg_load_yaml_patterns() and overrides the new implementation from Line 150. That breaks the no-arg calls at Line 517 and Line 980, so YAML loading/status can silently degrade.

Proposed fix
-# YAML pattern file path (Lasso-compatible format)
-PROMPT_GUARD_YAML_PATTERNS="${PROMPT_GUARD_YAML_PATTERNS:-}"
-
-# Load patterns from YAML file (Lasso-compatible format)
-# YAML format: list of objects with fields: pattern, description, severity, category
-# Falls back silently if yq/python not available or file missing
-_pg_load_yaml_patterns() {
-	local yaml_file="$1"
-	...
-}
+# (Removed duplicate legacy implementation.
+# Keep the primary implementation at Lines 150-229.)
#!/bin/bash
set -euo pipefail
file=".agents/scripts/prompt-guard-helper.sh"

echo "Expect exactly ONE function definition:"
rg -n '^_pg_load_yaml_patterns\(\)' "$file"

echo
echo "Show all call sites:"
rg -n '_pg_load_yaml_patterns' "$file"

echo
echo "Inspect both definition regions:"
nl -ba "$file" | sed -n '140,290p'

As per coding guidelines: .agents/scripts/*.sh automation scripts should prioritize reliability and robustness.

Also applies to: 243-283, 515-518, 973-981

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/prompt-guard-helper.sh around lines 150 - 229, There are two
definitions of _pg_load_yaml_patterns causing the later one to override the new
implementation and break no-arg YAML auto-detection; remove or consolidate the
duplicate so only the correct implementation remains (keep the version that uses
_PG_YAML_PATTERNS_LOADED and _PG_YAML_PATTERNS_CACHE with the YAML parsing
logic), ensure callers of _pg_load_yaml_patterns (the no-arg invocations) still
work, and verify by searching for all occurrences of _pg_load_yaml_patterns and
confirming a single definition and that the caching vars
(_PG_YAML_PATTERNS_LOADED/_PG_YAML_PATTERNS_CACHE) are set/used as in the
intended implementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/prompt-guard-helper.sh:
- Around line 751-753: The cmd_scan_stdin() function must reject TTY input
immediately instead of warning and continuing; change the TTY branch that
currently calls _pg_log_warn to call an error logger (e.g., _pg_log_error) with
a clear message like "scan-stdin requires piped input, not a TTY" and then exit
with a non-zero status (e.g., exit 1) so the function fails fast when [[ -t 0 ]]
is true; update the logic around the existing _pg_log_warn and the subsequent
cat usage in cmd_scan_stdin to enforce this early exit.
- Around line 151-165: The flag _PG_YAML_PATTERNS_LOADED is set too early
causing transient _pg_find_yaml_patterns failures to be treated as permanent;
change the flow in the function that checks/loads YAML patterns so you only set
_PG_YAML_PATTERNS_LOADED="true" after _pg_find_yaml_patterns succeeds and the
resulting patterns are stored in _PG_YAML_PATTERNS_CACHE (use _pg_log_info on
failure as currently done), leaving the loaded flag unset on error to allow
retries; reference _PG_YAML_PATTERNS_LOADED, _PG_YAML_PATTERNS_CACHE,
_pg_find_yaml_patterns and _pg_log_info when making this change.

---

Duplicate comments:
In @.agents/scripts/prompt-guard-helper.sh:
- Around line 150-229: There are two definitions of _pg_load_yaml_patterns
causing the later one to override the new implementation and break no-arg YAML
auto-detection; remove or consolidate the duplicate so only the correct
implementation remains (keep the version that uses _PG_YAML_PATTERNS_LOADED and
_PG_YAML_PATTERNS_CACHE with the YAML parsing logic), ensure callers of
_pg_load_yaml_patterns (the no-arg invocations) still work, and verify by
searching for all occurrences of _pg_load_yaml_patterns and confirming a single
definition and that the caching vars
(_PG_YAML_PATTERNS_LOADED/_PG_YAML_PATTERNS_CACHE) are set/used as in the
intended implementation.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7f33771 and 5b6bbf7.

📒 Files selected for processing (1)
  • .agents/scripts/prompt-guard-helper.sh

…ordering

- Remove legacy _pg_load_yaml_patterns() definition (lines 249-282) that
  overrode the new implementation and broke no-arg YAML auto-detection
- Move _PG_YAML_PATTERNS_LOADED=true to after _pg_find_yaml_patterns succeeds,
  preventing transient failures from permanently disabling YAML loading
- cmd_scan_stdin: reject TTY input with error+exit instead of warn+continue
@alex-solovyev
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 2, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 111 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Mon Mar 2 06:00:04 UTC 2026: Code review monitoring started
Mon Mar 2 06:00:05 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 111

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 111
  • VULNERABILITIES: 0

Generated on: Mon Mar 2 06:00:08 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.agents/scripts/prompt-guard-helper.sh (2)

357-374: 🛠️ Refactor suggestion | 🟠 Major

Simplify _pg_get_patterns() after removing duplicate function.

Once the yq/python _pg_load_yaml_patterns() at lines 249–282 is removed, this function's YAML-loading logic (lines 359–369) becomes orphaned — it calls a signature that no longer exists.

Since _pg_scan_message() already handles YAML-vs-inline routing at lines 516–524, this function can be simplified to just return inline patterns:

Simplify to inline-only fallback
 _pg_get_patterns() {
-	# Try YAML patterns first if configured
-	local yaml_file="${PROMPT_GUARD_YAML_PATTERNS:-}"
-	if [[ -n "$yaml_file" ]]; then
-		local yaml_patterns
-		yaml_patterns=$(_pg_load_yaml_patterns "$yaml_file" 2>/dev/null) || true
-		if [[ -n "$yaml_patterns" ]]; then
-			echo "$yaml_patterns"
-			_pg_log_info "Loaded patterns from YAML: $yaml_file" 2>/dev/null || true
-			return 0
-		fi
-		_pg_log_warn "YAML patterns unavailable ($yaml_file), falling back to inline patterns" 2>/dev/null || true
-	fi
-
-	# Fallback: inline patterns (always available)
+	# Inline patterns (YAML handled by _pg_scan_message)
 	_pg_get_inline_patterns
 	return 0
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/prompt-guard-helper.sh around lines 357 - 374,
_pg_get_patterns currently tries to call the removed helper
_pg_load_yaml_patterns and contains orphaned YAML-loading logic; simplify it to
always return inline patterns by removing the YAML-related block and just
invoking _pg_get_inline_patterns and returning its result, leaving YAML vs
inline routing to _pg_scan_message (which already handles
PROMPT_GUARD_YAML_PATTERNS).

1384-1406: ⚠️ Potential issue | 🟡 Minor

Test YAML format doesn't match pure-bash parser expectations.

The test creates a YAML file with a flat patterns: list (lines 1384–1390), but the pure-bash parser (lines 177–210) expects category-keyed blocks:

yaml_test:
  - severity: "HIGH"
    description: "Test YAML pattern"
    pattern: 'YAML_TEST_PATTERN_12345'

Once the duplicate yq/python function is removed, this test will fail because the pure-bash parser won't recognize the flat structure.

Update test YAML to match expected format
 		cat >"$tmp_yaml" <<'YAML_EOF'
-patterns:
+yaml_test:
   - pattern: "YAML_TEST_PATTERN_12345"
+    severity: "HIGH"
     description: "Test YAML pattern"
-    severity: "HIGH"
-    category: "yaml_test"
 YAML_EOF
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/prompt-guard-helper.sh around lines 1384 - 1406, The
temporary YAML used for the unit test (written to $tmp_yaml and assigned to
PROMPT_GUARD_YAML_PATTERNS before calling _pg_scan_message) uses a flat
"patterns:" list but the pure-bash parser expects a category-keyed mapping
(e.g., "yaml_test:" followed by a list with items containing severity,
description, and pattern). Update the here-doc that creates "$tmp_yaml" so it
uses the category-keyed format (yaml_test: - severity: "HIGH" description: "Test
YAML pattern" pattern: 'YAML_TEST_PATTERN_12345') so the parser invoked by
_pg_scan_message recognizes the test entry; keep the surrounding logic that sets
PROMPT_GUARD_YAML_PATTERNS and restores $saved_yaml unchanged.
♻️ Duplicate comments (3)
.agents/scripts/prompt-guard-helper.sh (3)

751-753: ⚠️ Potential issue | 🟠 Major

cmd_scan_stdin() must reject TTY input and fail fast.

This function is explicitly for pipeline use (curl ... | prompt-guard-helper.sh scan-stdin). When stdin is a TTY, the current code logs a warning but proceeds to cat, causing the function to block unexpectedly. The warning message misleads users into thinking interactive input is supported.

Per coding guidelines: .agents/scripts/*.sh must enforce proper exit codes and robustness for automation.

Fail fast on TTY
 cmd_scan_stdin() {
 	if [[ -t 0 ]]; then
-		_pg_log_warn "Reading from stdin (Ctrl+D to end, Ctrl+C to cancel)"
+		_pg_log_error "scan-stdin requires piped input, not a terminal"
+		return 1
 	fi

,

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/prompt-guard-helper.sh around lines 751 - 753, The
cmd_scan_stdin() function currently treats a TTY as a warning and continues to
cat stdin, which blocks; change it to fail fast when stdin is a TTY by replacing
the warning branch with an error: call _pg_log_error (or similar) with a clear
message that TTY input is unsupported and then exit/return immediately with a
non-zero status (e.g., exit 2) so the function never proceeds to cat; update any
tests or callers of cmd_scan_stdin() that expect a non-zero exit on TTY.

243-283: ⚠️ Potential issue | 🔴 Critical

Critical: Remove duplicate _pg_load_yaml_patterns() definition — new implementation is dead code.

Lines 249–282 redefine _pg_load_yaml_patterns() to expect a $1 (yaml_file path), overriding the new pure-bash implementation at lines 150–229. In bash, the last definition wins.

The call at line 517 invokes _pg_load_yaml_patterns 2>/dev/null without arguments, but this effective function expects $1. With empty $1, the file check fails immediately, and YAML patterns never load.

Additionally, line 244 re-declares PROMPT_GUARD_YAML_PATTERNS (already declared at line 67).

Remove the duplicate function and redundant variable
-# YAML pattern file path (Lasso-compatible format)
-PROMPT_GUARD_YAML_PATTERNS="${PROMPT_GUARD_YAML_PATTERNS:-}"
-
-# Load patterns from YAML file (Lasso-compatible format)
-# YAML format: list of objects with fields: pattern, description, severity, category
-# Falls back silently if yq/python not available or file missing
-_pg_load_yaml_patterns() {
-	local yaml_file="$1"
-
-	if [[ ! -f "$yaml_file" ]]; then
-		return 1
-	fi
-
-	# Try yq first (fastest)
-	if command -v yq &>/dev/null; then
-		yq -r '.patterns[] | "\(.severity // "MEDIUM")|\(.category // "yaml_pattern")|\(.description // "YAML pattern")|\(.pattern)"' "$yaml_file" 2>/dev/null && return 0
-	fi
-
-	# Try python3 with PyYAML
-	if command -v python3 &>/dev/null; then
-		python3 -c "
-import yaml, sys
-try:
-    with open('$yaml_file') as f:
-        data = yaml.safe_load(f)
-    for p in data.get('patterns', []):
-        sev = p.get('severity', 'MEDIUM').upper()
-        cat = p.get('category', 'yaml_pattern')
-        desc = p.get('description', 'YAML pattern')
-        pat = p.get('pattern', '')
-        if pat:
-            print(f'{sev}|{cat}|{desc}|{pat}')
-except Exception:
-    sys.exit(1)
-" 2>/dev/null && return 0
-	fi
-
-	# No YAML parser available
-	return 1
-}

,

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/prompt-guard-helper.sh around lines 243 - 283, Remove the
duplicate definition of _pg_load_yaml_patterns and the redundant
PROMPT_GUARD_YAML_PATTERNS re-declaration: keep the original pure-bash
implementation (the one that correctly handles being called with no arguments)
and delete the later yq/python-based redefinition so the call site
`_pg_load_yaml_patterns 2>/dev/null` works as intended; also remove the second
`PROMPT_GUARD_YAML_PATTERNS="${PROMPT_GUARD_YAML_PATTERNS:-}"` line to avoid
shadowing the earlier declaration.

151-165: ⚠️ Potential issue | 🟠 Major

Move _PG_YAML_PATTERNS_LOADED assignment after successful pattern loading.

Setting _PG_YAML_PATTERNS_LOADED="true" at line 159 before file resolution and parsing means transient failures (e.g., network-mounted filesystem temporarily unavailable) are treated as permanent. Subsequent calls will skip retry and return failure immediately.

Per coding guidelines, .agents/scripts/*.sh requires error recovery mechanisms.

Suggested fix
 _pg_load_yaml_patterns() {
 	if [[ "$_PG_YAML_PATTERNS_LOADED" == "true" ]]; then
 		if [[ -n "$_PG_YAML_PATTERNS_CACHE" ]]; then
 			echo "$_PG_YAML_PATTERNS_CACHE"
 			return 0
 		fi
 		return 1
 	fi
 
-	_PG_YAML_PATTERNS_LOADED="true"
-
 	local yaml_file
 	yaml_file=$(_pg_find_yaml_patterns) || {
 		_pg_log_info "YAML patterns not found, using inline fallback"
 		return 1
 	}

Then at line 224, after caching:

 	# Cache for subsequent calls
 	_PG_YAML_PATTERNS_CACHE="$patterns"
+	_PG_YAML_PATTERNS_LOADED="true"
 
 	# Remove trailing newline
 	echo "${patterns%$'\n'}"

,

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/prompt-guard-helper.sh around lines 151 - 165, Currently
_PG_YAML_PATTERNS_LOADED is set to "true" before attempting to resolve and parse
YAML patterns, which prevents retries on transient failures; move the assignment
so it occurs only after successful loading and caching of patterns: call
_pg_find_yaml_patterns and verify success, populate _PG_YAML_PATTERNS_CACHE,
then set _PG_YAML_PATTERNS_LOADED="true" (and only then return the cached
value); update logic around the _pg_find_yaml_patterns invocation and the cache
handling to ensure failures do not flip _PG_YAML_PATTERNS_LOADED.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/prompt-guard-helper.sh:
- Around line 1016-1020: The reference to yaml_patterns at the echo block can
crash the logic because yaml_patterns is only set inside the if [[ -n
"$yaml_file" ]] branch; initialize yaml_patterns (and related counters like
yaml_total) to sensible defaults (e.g., yaml_patterns='' and yaml_total=0)
before the if [[ -n "$yaml_file" ]] block so the later check if [[ -n
"$yaml_file" && -n "${yaml_patterns:-}" ]] behaves correctly even when no YAML
file was provided; update any downstream logic that appends or counts patterns
to use these initialized variables (symbols: yaml_patterns, yaml_total,
yaml_file).

---

Outside diff comments:
In @.agents/scripts/prompt-guard-helper.sh:
- Around line 357-374: _pg_get_patterns currently tries to call the removed
helper _pg_load_yaml_patterns and contains orphaned YAML-loading logic; simplify
it to always return inline patterns by removing the YAML-related block and just
invoking _pg_get_inline_patterns and returning its result, leaving YAML vs
inline routing to _pg_scan_message (which already handles
PROMPT_GUARD_YAML_PATTERNS).
- Around line 1384-1406: The temporary YAML used for the unit test (written to
$tmp_yaml and assigned to PROMPT_GUARD_YAML_PATTERNS before calling
_pg_scan_message) uses a flat "patterns:" list but the pure-bash parser expects
a category-keyed mapping (e.g., "yaml_test:" followed by a list with items
containing severity, description, and pattern). Update the here-doc that creates
"$tmp_yaml" so it uses the category-keyed format (yaml_test: - severity: "HIGH"
description: "Test YAML pattern" pattern: 'YAML_TEST_PATTERN_12345') so the
parser invoked by _pg_scan_message recognizes the test entry; keep the
surrounding logic that sets PROMPT_GUARD_YAML_PATTERNS and restores $saved_yaml
unchanged.

---

Duplicate comments:
In @.agents/scripts/prompt-guard-helper.sh:
- Around line 751-753: The cmd_scan_stdin() function currently treats a TTY as a
warning and continues to cat stdin, which blocks; change it to fail fast when
stdin is a TTY by replacing the warning branch with an error: call _pg_log_error
(or similar) with a clear message that TTY input is unsupported and then
exit/return immediately with a non-zero status (e.g., exit 2) so the function
never proceeds to cat; update any tests or callers of cmd_scan_stdin() that
expect a non-zero exit on TTY.
- Around line 243-283: Remove the duplicate definition of _pg_load_yaml_patterns
and the redundant PROMPT_GUARD_YAML_PATTERNS re-declaration: keep the original
pure-bash implementation (the one that correctly handles being called with no
arguments) and delete the later yq/python-based redefinition so the call site
`_pg_load_yaml_patterns 2>/dev/null` works as intended; also remove the second
`PROMPT_GUARD_YAML_PATTERNS="${PROMPT_GUARD_YAML_PATTERNS:-}"` line to avoid
shadowing the earlier declaration.
- Around line 151-165: Currently _PG_YAML_PATTERNS_LOADED is set to "true"
before attempting to resolve and parse YAML patterns, which prevents retries on
transient failures; move the assignment so it occurs only after successful
loading and caching of patterns: call _pg_find_yaml_patterns and verify success,
populate _PG_YAML_PATTERNS_CACHE, then set _PG_YAML_PATTERNS_LOADED="true" (and
only then return the cached value); update logic around the
_pg_find_yaml_patterns invocation and the cache handling to ensure failures do
not flip _PG_YAML_PATTERNS_LOADED.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7f33771 and 5b6bbf7.

📒 Files selected for processing (1)
  • .agents/scripts/prompt-guard-helper.sh

- _pg_get_patterns: remove orphaned YAML-loading logic (called old
  _pg_load_yaml_patterns signature with arg); YAML routing is handled
  by _pg_scan_message which calls _pg_load_yaml_patterns directly
- Test YAML: switch from flat patterns: list to category-keyed block
  format matching the pure-bash parser expectations; severity must be
  the list item start trigger; remove yq/python3 availability gate
  since pure-bash parser needs no external tools
- Reset _PG_YAML_PATTERNS_LOADED/_CACHE around test to prevent cache
  contamination between test runs
@alex-solovyev
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 2, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 2, 2026

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 111 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Mon Mar 2 06:06:15 UTC 2026: Code review monitoring started
Mon Mar 2 06:06:19 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 111

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 111
  • VULNERABILITIES: 0

Generated on: Mon Mar 2 06:06:21 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud
Copy link

sonarqubecloud bot commented Mar 2, 2026

@alex-solovyev
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 2, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Auto-created from TODO.md tag

Projects

None yet

Development

Successfully merging this pull request may close these issues.

t1375.1: Add YAML pattern loading and merge Lasso patterns

1 participant