
t4954: Add section 8.4 column-level query discipline and URL credential detection#4956

Merged
marcusquinn merged 2 commits into main from
bugfix/t4954-column-query-discipline
Mar 15, 2026

Conversation

@alex-solovyev
Collaborator

@alex-solovyev alex-solovyev commented Mar 15, 2026

Summary

  • Add section 8.4 to prompts/build.txt — column-level query discipline for application config tables (webhook settings, integration records, OAuth configs) that may contain embedded credentials in URL query parameters
  • Extend prompt-guard-helper.sh with credential_exposure pattern category detecting secrets in URL query params (?secret=, ?token=, ?api_key=, ?password=, ?access_token=, ?auth=, ?client_secret=, ?webhook_secret=)
  • Add URL credential redaction to the sanitize function — replaces 8+ character values with [REDACTED]
  • Add patterns to prompt-injection-patterns.yaml (YAML source) for comprehensive coverage
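As a rough illustration of the detection step described above, the following sketch flags text that contains one of the listed query parameters with a value of 8 or more characters. The function name `has_url_credential` is hypothetical and the regex is assembled from the parameter list in this summary; it is not the code in prompt-guard-helper.sh.

```shell
# Hypothetical helper (not from the PR): returns 0 when the input contains a
# credential-like URL query parameter whose value is at least 8 characters.
has_url_credential() {
    # Parameter names come from the PR summary; {8,} reflects the documented
    # "8+ character" threshold.
    printf '%s' "$1" | grep -Eq '[?&](secret|token|api_key|password|access_token|auth|client_secret|webhook_secret)=[^&[:space:]]{8,}'
}

has_url_credential 'https://hooks.example.com/in?token=PLACEHOLDER_TOKEN_VALUE' && echo "flagged"
has_url_credential 'https://example.com/page?secret=abc' || echo "clean"
```

The second call prints "clean" because `abc` is shorter than the 8-character floor, which is the behaviour the short-value test in this PR is meant to pin down.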

Approach

Option C from the issue — layered defense:

  1. Prompt rule (section 8.4): teaches metadata-first querying habit — query schema/keys first, then extract only non-credential fields via targeted selectors. Prevents most cases by changing the default approach from "query everything" to "query metadata first, selectively extract."
  2. Output redaction tooling: catches what the rule misses via deterministic pattern matching on URL query parameters. Different failure mode from the prompt rule (pattern matching vs. judgment), so correlated failures are unlikely.
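The metadata-first habit from layer 1 can be sketched for a single URL field: print the base URL and the names of its query parameters, but never the values. This is only an illustration under assumptions; `url_metadata` is a hypothetical helper and section 8.4 itself is a prompt rule, not a script.

```shell
# Hypothetical helper: print a URL's base (no query string) and the names,
# but not the values, of its query parameters.
url_metadata() {
    local url="$1" query pair
    printf 'base: %s\n' "${url%%[?]*}"   # everything before the first '?'
    case "$url" in
    *[?]*) ;;
    *) return 0 ;;                       # no query string, nothing to list
    esac
    query="${url#*[?]}"                  # everything after the first '?'
    query="${query%%[#]*}"               # drop any fragment
    while [ -n "$query" ]; do
        pair="${query%%&*}"              # first name=value pair
        printf 'param: %s\n' "${pair%%=*}"
        case "$query" in
        *"&"*) query="${query#*&}" ;;    # move on to the next pair
        *) query="" ;;
        esac
    done
}

url_metadata 'https://hooks.example.com/in?token=PLACEHOLDER_TOKEN_VALUE&format=json'
# base: https://hooks.example.com/in
# param: token
# param: format
```

Seeing `token` in the key list is enough to know the field carries a credential, without the value ever reaching the conversation.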

Testing

  • 9 new tests added (7 detection + 2 sanitization), all passing
  • Zero regressions: main has 93/21/114 (pass/fail/total), this branch has 102/21/123
  • ShellCheck clean (only pre-existing SC1091 info)

Files Changed

| File | Change |
| --- | --- |
| .agents/prompts/build.txt | Add section 8.4 with column-level query discipline rules |
| .agents/scripts/prompt-guard-helper.sh | Add credential_exposure patterns, URL redaction in sanitize, 9 tests |
| .agents/configs/prompt-injection-patterns.yaml | Add credential_exposure category with 8 URL param patterns |

Closes #4954

Summary by CodeRabbit

  • New Features

    • Detects credential leakage via URL query parameters (secrets, tokens, API keys, passwords, access tokens) and redacts sensitive values while preserving parameter names.
  • Documentation

    • Expanded guidance on handling embedded credentials in application configuration with concrete safe/unsafe query examples and remediation steps.
  • Tests

    • Added tests verifying detection and redaction of URL-based credentials.

…credential detection (#4954)

Add layered defense (Option C) against credential exposure from application
config tables that store secrets in URL query parameters.

Layer 1 - Prompt rule (build.txt section 8.4):
- Never fetch raw record values from webhook/integration/OAuth config tables
- Query schema/keys first, then extract only non-credential fields
- Treat any URL field in application config as potentially containing credentials

Layer 2 - Output redaction tooling:
- Add credential_exposure category to prompt-guard-helper.sh (inline + YAML)
- Detect secrets in URL query params: secret=, token=, api_key=, password=,
  access_token=, auth=, client_secret=, webhook_secret=
- Redact matched values in sanitize function (replaces 8+ char values with [REDACTED])
- 9 new tests (7 detection + 2 sanitization), all passing, zero regressions

Closes #4954
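
A minimal sketch of the layer 2 redaction described in this commit message, assuming a wrapper function named `redact_url_credentials` (hypothetical). The regex uses only the parameter names listed above and keeps the parameter name while replacing the value.

```shell
# Hypothetical wrapper: replace credential values of 8+ characters in URL
# query parameters with [REDACTED], keeping the parameter name.
redact_url_credentials() {
    printf '%s' "$1" | sed -E 's/([?&](secret|token|api_key|password|access_token|auth|client_secret|webhook_secret)=)[^&[:space:]]{8,}/\1[REDACTED]/g'
}

redact_url_credentials 'https://hooks.example.com/in?token=PLACEHOLDER_TOKEN_VALUE&format=json'
# https://hooks.example.com/in?token=[REDACTED]&format=json
```

Values shorter than 8 characters are left untouched, so a short non-secret value such as `?token=abc` passes through unchanged.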
@github-actions github-actions bot added the enhancement Auto-created from TODO.md tag label Mar 15, 2026
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the system's ability to prevent sensitive credential exposure, particularly from application configuration data. It implements a two-pronged approach: first, by educating the system on safer querying practices to avoid fetching raw credential fields, and second, by introducing robust pattern-matching and redaction tools to automatically identify and mask credentials found in URL query parameters within output, thereby bolstering overall security.

Highlights

  • New Prompt Rule: Introduced section 8.4 in prompts/build.txt to guide column-level query discipline for application configuration tables, preventing accidental credential exposure.
  • Credential Detection: Extended prompt-guard-helper.sh with a credential_exposure pattern category to detect secrets in URL query parameters (e.g., ?secret=, ?token=).
  • URL Redaction: Added functionality to the sanitize function in prompt-guard-helper.sh to redact 8+ character credential values found in URL query parameters, replacing them with [REDACTED].
  • Pattern Coverage: Updated prompt-injection-patterns.yaml with new patterns under the credential_exposure category for comprehensive detection.
Changelog
  • .agents/configs/prompt-injection-patterns.yaml
    • Expanded with a new credential_exposure category containing 8 specific patterns for detecting secrets in URL query parameters.
  • .agents/prompts/build.txt
    • Added section 8.4, which outlines column-level query discipline to prevent credential exposure in application configuration tables.
  • .agents/scripts/prompt-guard-helper.sh
    • Integrated the credential_exposure pattern category.
    • Implemented URL credential redaction within the _pg_sanitize_message function.
    • Included 9 new tests for detection and sanitization.
Activity
  • Added 9 new tests (7 for detection, 2 for sanitization), all of which are passing.
  • Confirmed zero regressions: the branch shows 102 passing, 21 failing, and 123 total tests, with the same 21 failures as the main branch.
  • Ensured the codebase remains ShellCheck clean, with only pre-existing SC1091 informational messages.

@coderabbitai
Contributor

coderabbitai bot commented Mar 15, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Adds credential-exposure detection and redaction: new t4954 patterns for URL query secrets, prompt guidance section 8.4 to query config metadata instead of raw values, and sanitizer updates to redact credential query parameters before outputs enter the conversation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Pattern Detection: .agents/configs/prompt-injection-patterns.yaml | Appends a new CREDENTIAL_EXPOSURE (t4954) category with eight MEDIUM-severity PCRE patterns for URL query parameters (secret, token, api_key/apikey/api-key, password, access_token, auth/authorization, client_secret, webhook_secret). |
| Prompt Guidance: .agents/prompts/build.txt | Adds section 8.4 describing querying application config safely: prefer schema/keys metadata, targeted selectors, examples for WordPress/FluentForm webhooks, and guidance to redact credentials before display. |
| Output Sanitization & Tests: .agents/scripts/prompt-guard-helper.sh | Includes credential_exposure in inline patterns, implements _pg_sanitize_message redaction for the new query-parameter names (replacing values with [REDACTED]), and adds tests verifying detection and sanitization behavior. |

Sequence Diagram

sequenceDiagram
    participant Agent as Agent Process
    participant Cmd as Command Execution
    participant RawOut as Raw Output
    participant Guard as prompt-guard-helper (Sanitizer)
    participant Conv as Conversation Context

    Agent->>Cmd: Execute DB query (e.g., SELECT webhook config)
    Cmd->>RawOut: Returns config JSON with URL query params
    RawOut->>Guard: Stream contains secret=VALUE or token=VALUE
    Guard->>Guard: Match credential_exposure patterns
    Guard->>Guard: Redact values → secret=[REDACTED]
    Guard->>Conv: Emit sanitized output
    Conv->>Agent: Safe transcript for agent reasoning

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes


🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly describes the main change: adding section 8.4 for column-level query discipline and URL credential detection, which is the primary objective across all modified files. |
| Linked Issues check | ✅ Passed | The PR fully implements Option C (layered defense): section 8.4 adds prompt-level column-level query discipline, credential_exposure patterns detect URL query secrets, and sanitization redacts exposed credentials. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the core objective of preventing credential exposure via application config queries. No extraneous modifications or scope creep detected. |


@github-actions
Contributor

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 362 code smells

[INFO] Recent monitoring activity:
Sun Mar 15 20:35:39 UTC 2026: Code review monitoring started
Sun Mar 15 20:35:39 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 362

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 362
  • VULNERABILITIES: 0

Generated on: Sun Mar 15 20:35:42 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces important safeguards against credential exposure from application configuration. The new prompt rules, detection patterns, and output sanitization create a robust, layered defense. My review focuses on ensuring consistency across these new mechanisms. I've pointed out that the key= parameter, mentioned in the new prompt guidelines, is missing from the detection patterns and sanitization logic. Addressing this will make the protection more comprehensive and consistent with the documentation.

Comment on lines +636 to +637
description: "URL query param: api_key/apikey"
pattern: '[?&](api_key|apikey|api-key)=[^&\s]{8,}'

security-medium medium

For consistency with the new rule in .agents/prompts/build.txt (section 8.4, line 295), which mentions ?key=, this pattern should also detect key=. The current pattern only detects api_key, apikey, and api-key. The description should also be updated to reflect this change.

    description: "URL query param: key/api_key/apikey"
    pattern: '[?&](key|api_key|apikey|api-key)=[^&\s]{8,}'
References
  1. This rule emphasizes the need to include all known variations of sensitive patterns (like key=) for comprehensive security detection and sanitization.
  2. This rule promotes consistency by ensuring that if a new sensitive parameter (key=) is identified, all relevant detection and sanitization logic across the codebase is updated.
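
To make the suggested change concrete, here is a small sketch of the alternation with `key` added. The function name `matches_key_param` is hypothetical; the regex is the reviewer's suggested pattern with `\s` written as `[:space:]` so that it works with `grep -E`.

```shell
# Hypothetical check using the suggested alternation with `key` added.
matches_key_param() {
    printf '%s' "$1" | grep -Eq '[?&](key|api_key|apikey|api-key)=[^&[:space:]]{8,}'
}

matches_key_param 'https://maps.example.com/tiles?key=PLACEHOLDER_KEY_VALUE' && echo "key= flagged"
matches_key_param 'https://example.com/zoo?monkey=PLACEHOLDER_VALUE' || echo "monkey= ignored"
```

Because the name must directly follow `?` or `&`, a parameter such as `monkey=` is not caught by the `key` alternative. The trade-off is that a bare `key=` is a generic name, so any non-secret value of 8 or more characters under that name would also be flagged.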

LOW|system_prompt_extraction|Prompt leak via translation|([Tt]ranslate|[Cc]onvert)\s+(your\s+)?(system\s+)?(prompt|instructions|rules)\s+(to|into)\s+(French|Spanish|Chinese|another\s+language)
MEDIUM|credential_exposure|URL query param: secret|[?&]secret=[^&\s]{8,}
MEDIUM|credential_exposure|URL query param: token|[?&]token=[^&\s]{8,}
MEDIUM|credential_exposure|URL query param: key|[?&](api_key|apikey|api-key)=[^&\s]{8,}

security-medium medium

There are a couple of inconsistencies on this line:

  1. The description is "URL query param: key", but the pattern does not match key=.
  2. The pattern is missing key= which is mentioned as a sensitive parameter in .agents/prompts/build.txt.

To ensure all documented credential parameters are detected, please update the pattern and description for consistency.

Suggested change
-MEDIUM|credential_exposure|URL query param: key|[?&](api_key|apikey|api-key)=[^&\s]{8,}
+MEDIUM|credential_exposure|URL query param: key/api_key|[?&](key|api_key|apikey|api-key)=[^&\s]{8,}


# Redact credential values in URL query parameters (t4954)
# Matches ?secret=VALUE or &token=VALUE etc. and replaces VALUE with [REDACTED]
sanitized=$(printf '%s' "$sanitized" | sed -E 's/([?&](secret|token|api_key|apikey|api-key|password|access_token|auth|authorization|client_secret|webhook_secret)=)[^&[:space:]]{8,}/\1[REDACTED]/g')

security-medium medium

The sanitization regex is missing the key parameter, which is mentioned as a sensitive parameter in the new section 8.4 of .agents/prompts/build.txt. To ensure consistent redaction of all potential credentials, key should be added to the list of parameters in the regex.

Suggested change
-sanitized=$(printf '%s' "$sanitized" | sed -E 's/([?&](secret|token|api_key|apikey|api-key|password|access_token|auth|authorization|client_secret|webhook_secret)=)[^&[:space:]]{8,}/\1[REDACTED]/g')
+sanitized=$(printf '%s' "$sanitized" | sed -E 's/([?&](key|secret|token|api_key|apikey|api-key|password|access_token|auth|authorization|client_secret|webhook_secret)=)[^&[:space:]]{8,}/\1[REDACTED]/g')

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/prompt-guard-helper.sh:
- Around line 1496-1501: Replace the real-looking credential literals used in
the _test_expect calls (e.g., test descriptions "URL with ?secret= param", "URL
with &token= param", etc.) with clearly fake placeholders that preserve
length/shape for regex testing (for example "REDACTED_SECRET_PLACEHOLDER",
"REDACTED_TOKEN_PLACEHOLDER", "REDACTED_AWS_KEY_PLACEHOLDER",
"REDACTED_PASSWORD_PLACEHOLDER", "REDACTED_JWT_PLACEHOLDER",
"REDACTED_CLIENT_SECRET_PLACEHOLDER"); update the six _test_expect invocations
so the URL query values are non-sensitive placeholders while still exercising
the same patterns and add a brief comment near these _test_expect lines
referencing that real secrets must be stored securely (env/secret manager)
rather than in test literals.
- Line 1502: The test call to _test_expect ("Short param value (no match)") uses
"key=abc" which is not a tracked/detected parameter so the length check isn't
exercised; update the URL argument passed to _test_expect to use one of the
script's tracked parameter names (the same param name used in other tests in
this file) and give it a short value under the threshold (e.g., param=ab) so the
short-value branch in the parameter length validation is actually tested.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8710458e-994f-47f8-ae19-67bf8b5fc3af

📥 Commits

Reviewing files that changed from the base of the PR and between fdaa862 and 2e13f6c.

📒 Files selected for processing (3)
  • .agents/configs/prompt-injection-patterns.yaml
  • .agents/prompts/build.txt
  • .agents/scripts/prompt-guard-helper.sh

_test_expect "URL with ?password= param" 2 "https://service.example.com/auth?password=SuperSecret123!"
_test_expect "URL with ?access_token= param" 2 "https://api.example.com/data?access_token=eyJhbGciOiJIUzI1NiJ9"
_test_expect "URL with ?client_secret= param" 2 "https://oauth.example.com/token?client_secret=cs_abcdef123456789"
_test_expect "Short param value (no match)" 0 "https://example.com/page?key=abc"
Contributor

⚠️ Potential issue | 🟡 Minor

Test intent mismatch: this case does not validate the “short value” threshold.

Line 1502 uses key=abc, but key is not one of the detected parameters, so this passes even if length logic regresses. Use a tracked parameter with a short value.

Suggested patch
-	_test_expect "Short param value (no match)" 0 "https://example.com/page?key=abc"
+	_test_expect "Short param value (no match)" 0 "https://example.com/page?secret=abc"
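
The reasoning behind this patch can be sketched with two stand-in detectors, one with the 8-character floor and one where the floor has regressed. All names here (`detect_ok`, `detect_regressed`, `negative_test_passes`) are hypothetical and much simpler than the real `_test_expect` harness.

```shell
# Two hypothetical detectors: one with the 8-character floor, one regressed.
detect_ok()        { printf '%s' "$1" | grep -Eq '[?&](secret|token)=[^&[:space:]]{8,}'; }
detect_regressed() { printf '%s' "$1" | grep -Eq '[?&](secret|token)=[^&[:space:]]+'; }

# A negative test passes when the detector does NOT match the URL.
# Usage: negative_test_passes <detector> <url>
negative_test_passes() {
    ! "$1" "$2"
}

for detector in detect_ok detect_regressed; do
    for url in 'https://example.com/page?key=abc' 'https://example.com/page?secret=abc'; do
        if negative_test_passes "$detector" "$url"; then result=pass; else result=FAIL; fi
        printf '%s %s -> %s\n' "$detector" "${url##*[?]}" "$result"
    done
done
# detect_ok key=abc -> pass
# detect_ok secret=abc -> pass
# detect_regressed key=abc -> pass
# detect_regressed secret=abc -> FAIL
```

Only the `secret=abc` case distinguishes the two detectors; the `key=abc` case passes either way because `key` is not a tracked name in these stand-ins, so it would not catch a regression in the length check.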

…es and fix short-value threshold test

- Lines 1497-1501: prefix test literals with FAKE_SK_LIVE_, FAKE_AKIA_, FAKE_JWT_, FAKE_CS_
  to make it unambiguous these are test fixtures, not real credentials (CodeRabbit CHANGES_REQUESTED)
- Line 1502: change 'key=abc' to 'secret=abc' — 'key' is not a tracked parameter so the
  test never validated the short-value threshold; 'secret' is tracked (line 330) and 'abc'
  (3 chars) is below the 8-char minimum, correctly producing no match

Closes #4954
@github-actions
Contributor

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 364 code smells

[INFO] Recent monitoring activity:
Sun Mar 15 20:48:02 UTC 2026: Code review monitoring started
Sun Mar 15 20:48:03 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 364

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 364
  • VULNERABILITIES: 0

Generated on: Sun Mar 15 20:48:05 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud

@marcusquinn marcusquinn merged commit 8afdae8 into main Mar 15, 2026
18 of 19 checks passed
@marcusquinn marcusquinn deleted the bugfix/t4954-column-query-discipline branch March 15, 2026 20:51
alex-solovyev added a commit that referenced this pull request Mar 15, 2026
- Add `key` to credential_exposure detection pattern (inline + YAML)
- Add `key` to URL credential sanitization regex
- Replace credential-like test literals with PLACEHOLDER_ prefixed values
  to avoid Gitleaks/secret-scanner false positives
- Add test for new key= parameter detection
- Short param test already used tracked param (secret=abc), confirmed correct

Closes #4959
alex-solovyev added a commit that referenced this pull request Mar 15, 2026
#4967)

- Add `key` to credential_exposure detection pattern (inline + YAML)
- Add `key` to URL credential sanitization regex
- Replace credential-like test literals with PLACEHOLDER_ prefixed values
  to avoid Gitleaks/secret-scanner false positives
- Add test for new key= parameter detection
- Short param test already used tracked param (secret=abc), confirmed correct

Closes #4959

Labels

enhancement Auto-created from TODO.md tag needs-review-fixes


Development

Successfully merging this pull request may close these issues.

Secret exposure gap: DB queries returning application config with embedded credentials (post-t4939)

2 participants