t2846: Add secret-leaking prevention guardrails for agent conversations by alex-solovyev · Pull Request #2847 · marcusquinn/aidevops

alex-solovyev · 2026-03-04T20:18:51Z

Summary

- Adds comprehensive rules to prompts/build.txt preventing agents from suggesting or running commands that expose secret values in conversation transcripts
- Replaces the previous single-line command blocklist with a principle-based approach covering password managers, env dumps, container inspection, cloud CLI tools, and scripting one-liners
- Adds safe alternatives for debugging env var issues (key names only, never values)
- Adds pre-staging guidance so credential lookups happen in the user's terminal, not in conversation
- Adds credential-paste detection: when a user pastes a credential value, the agent warns about compromise and suggests rotation
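
As a rough illustration of the "key names only, never values" idea, the sketch below reports whether an exported variable is set without printing its value. The `var_status` helper and the `DEMO_TOKEN` name are hypothetical and are not part of prompts/build.txt; the sketch assumes `printenv` is available.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): report whether an exported
# variable is set, without ever printing its value.
var_status() {
  # printenv exits non-zero when the name is not in the environment;
  # its output (the value) is discarded, never shown.
  if printenv "$1" >/dev/null 2>&1; then
    printf '%s: set\n' "$1"
  else
    printf '%s: unset\n' "$1"
  fi
}

# Demo with a fake value; only the name and its status are printed.
DEMO_TOKEN="not-a-real-secret"
export DEMO_TOKEN
var_status DEMO_TOKEN
var_status DEMO_MISSING_VAR
```

A check like this lets the agent confirm that a variable reached the process environment while the value itself stays out of the transcript.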

Why

During an ILDS session, the agent repeatedly suggested commands (gopass show, pm2 env, cat dump.pm2) that would expose secret values in the conversation transcript. Credentials were exposed and had to be rotated. The existing rule (line 168) listed only 3 specific commands; this PR expands it into a principle-based approach meant to cover the full attack surface.

Design decisions

- Principle over blocklist: The rule says "apply judgment to ANY command that could print credentials" rather than relying on an exhaustive list. This follows the "Intelligence Over Determinism" framework principle: the examples teach the pattern, but the agent should catch novel violations too.
- Safe alternatives included: Rather than just saying "don't do X", the rules show exactly how to debug env var issues safely (key names only).
- No new scripts or automation: This is a prompt-level guardrail, not a deterministic filter. The agent's judgment is the enforcement mechanism, which is appropriate for this class of problem (infinite command variations that could leak secrets).

Closes #2846

Summary by CodeRabbit

Documentation
- Expanded and clarified security guidelines for handling sensitive credentials, including prohibited commands, safer debugging practices, and procedures for managing pasted credentials.

…s (t2846)

Add comprehensive rules to prompts/build.txt preventing agents from suggesting or running commands that expose secret values in conversation transcripts. Replaces the previous single-line blocklist with:
- Principle-based rule (not just a command blocklist) with common violations
- Safe alternatives for debugging env var issues (key names only)
- Pre-staging guidance for credential lookups in user's terminal
- Credential-paste detection with rotation warning

Closes #2846

coderabbitai · 2026-03-04T20:19:10Z

Warning

Rate limit exceeded

@alex-solovyev has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 11 minutes and 53 seconds before requesting another review.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8f3d2195-4ee5-4a31-8e02-73be19a63296

📥 Commits

Reviewing files that changed from the base of the PR and between 2de205e and a207e7e.

📒 Files selected for processing (1)

.agents/prompts/build.txt

Walkthrough

This PR enhances security guidance in the agent prompt configuration by replacing a single-line prohibition on secret-exposing commands with a detailed security policy block. The expansion covers threat modeling, root causes, prohibited command patterns, safer debugging practices, and credential handling protocols.

Changes

Cohort / File(s)	Summary
Secret-handling security policy `.agents/prompts/build.txt`	Replaced brief defensive guideline with comprehensive section `#8` detailing secret value leaking prevention: includes threat description, root cause analysis, expanded prohibited command list (`gopass show`, `pm2 env`, `echo $SECRET` variants), safer debugging alternatives, credential lookup protocols, and guidance for handling user-pasted credentials.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

PR #2039: Centralizes security/secret-handling rules in prompts/build.txt as single source-of-truth
PR #2048: Tightens language in the same prompts/build.txt secrets/security guidance block
PR #2710: Adds new security-rule blocks to prompts/build.txt (prompt-injection rules alongside secret-handling)

Suggested labels

enhancement

Poem

🔐 Secrets sealed within the shell,
No gopass show shall break the spell,
Credentials guarded, warnings clear,
The agent learns what not to sear.
t2846 protects us all. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title directly addresses the main change: adding secret-leaking prevention guardrails to agent conversation prompts, which aligns with the primary objective of preventing credential exposure.
Linked Issues check	✅ Passed	The PR implements all key coding requirements from issue `#2846`: principle-based command blocklist, safe env-var debugging alternatives, credential-paste detection warnings, and pre-staging guidance in prompts/build.txt.
Out of Scope Changes check	✅ Passed	All changes are within scope: the PR modifies only prompts/build.txt with security guardrails directly addressing secret-leaking prevention, with no unrelated alterations.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

gemini-code-assist · 2026-03-04T20:19:16Z

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the security posture of agent conversations by implementing robust guardrails against accidental secret exposure. It shifts from a limited blocklist to a comprehensive, principle-based strategy, ensuring that sensitive information like credentials remains protected within the conversation context. The changes aim to prevent incidents where agents might inadvertently reveal secrets, thereby safeguarding user data and system integrity.

Highlights

- Secret Leak Prevention Rules: Added comprehensive rules to prompts/build.txt to prevent agents from suggesting or running commands that expose secret values in conversation transcripts.
- Principle-Based Approach: Replaced the previous single-line command blocklist with a principle-based approach covering various secret exposure vectors like password managers, env dumps, container inspection, and cloud CLI tools.
- Safe Debugging Alternatives: Included safe alternatives for debugging environment variable issues, focusing on showing key names only, never values.
- Credential Lookup Guidance: Provided pre-staging guidance for credential lookups, instructing users to perform these actions in their terminal rather than in conversation.
- Credential-Paste Detection: Implemented credential-paste detection, where the agent warns users about compromise and suggests rotation if a credential value is pasted directly into the conversation.

Changelog

.agents/prompts/build.txt
- Expanded secret-leaking prevention rules, replacing a simple blocklist with a principle-based approach.
- Added safe alternatives for environment variable debugging.
- Introduced credential-paste detection and warning mechanisms.

github-actions · 2026-03-04T20:19:28Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 107 code smells

[INFO] Recent monitoring activity:
Wed Mar 4 20:19:24 UTC 2026: Code review monitoring started
Wed Mar 4 20:19:24 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 107

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 107
VULNERABILITIES: 0

Generated on: Wed Mar 4 20:19:26 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

gemini-code-assist

Code Review

This pull request significantly enhances the security guardrails for the agent by providing detailed, principle-based rules to prevent secret leakage in conversations. The new instructions are comprehensive, covering a wide range of commands and scenarios, and providing safe alternatives. I've added a couple of suggestions to improve the robustness and consistency of the example commands, which should help the agent interpret them more reliably.

Note: Security Review has been skipped due to the limited scope of the PR.

gemini-code-assist · 2026-03-04T20:20:21Z

.agents/prompts/build.txt

+  - `heroku config`, `vercel env pull`, `fly secrets list` (with values)
+  - Any `grep` or `rg` command targeting files known to contain secrets
+- When debugging env var issues, show key NAMES only, never values:
+  - SAFE: `pm2 show <app> | grep -oP '^\s+\K[A-Z_]+(?=\s)'` (key names only)


The provided grep command for pm2 relies on parsing the text output, which can be fragile if the pm2 version or configuration changes the output format. A more robust method would be to use pm2's JSON output feature and parse it with jq, which appears to be a tool used in this project. This avoids reliance on text formatting and makes the safe alternative more reliable.

- SAFE: `pm2 show <app> --json | jq -r '.[0].pm2_env | keys_unsorted[]'` (key names only, robust)

gemini-code-assist · 2026-03-04T20:20:21Z

.agents/prompts/build.txt

+  - SAFE: `pm2 show <app> | grep -oP '^\s+\K[A-Z_]+(?=\s)'` (key names only)
+  - SAFE: `printenv | cut -d= -f1 | sort` (list env var names without values)
+  - SAFE: `grep -oP '^[A-Z_]+(?==)' .env` (key names from .env without values)
+  - SAFE: `docker inspect <c> --format '{{range .Config.Env}}{{println .}}{{end}}' | cut -d= -f1`


For consistency with the unsafe command example on line 183 (docker inspect <container>), it would be clearer to use <container> as the placeholder here instead of <c>. Consistent placeholders help the agent generalize better from examples.

- SAFE: `docker inspect <container> --format '{{range .Config.Env}}{{println .}}{{end}}' | cut -d= -f1`

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/prompts/build.txt:
- Around line 188-193: The policy text has a contradiction: the rule "Any `grep`
or `rg` command targeting files known to contain secrets" is absolute while the
SAFE examples (e.g., `grep -oP '^[A-Z_]+(?==)' .env`, `printenv | cut -d= -f1 |
sort`, and the `pm2` example) explicitly allow name-only inspections; update the
wording to forbid any grep/rg that can expose secret VALUES but permit name-only
inspections using patterns or pipelines that explicitly strip values (mention
the safe patterns shown: `-oP` with a regex capturing only keys or piping to
`cut -d= -f1`), and revise the unsafe rule to read something like "Disallow
grep/rg that may display secret values; allow grep/rg only when using patterns
or processing steps that guarantee values are not printed (see `grep -oP
'^[A-Z_]+(?==)' .env`, `printenv | cut -d= -f1`)." Ensure the SAFE example lines
remain and the unsafe line is replaced with the clarified prohibition text.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8ff0b261-c8f5-4db3-a23e-3da5faadb9dc

📥 Commits

Reviewing files that changed from the base of the PR and between 26803f3 and 2de205e.

📒 Files selected for processing (1)

.agents/prompts/build.txt

…(t2846)

- Replace fragile pm2 grep text-parsing with pm2 JSON output + jq
- Use consistent <container> placeholder matching unsafe example on line 183
- Resolve policy contradiction: revise absolute grep/rg ban to forbid printing secret VALUES while explicitly allowing key-name-only patterns

github-actions · 2026-03-04T20:28:17Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 107 code smells

[INFO] Recent monitoring activity:
Wed Mar 4 20:28:12 UTC 2026: Code review monitoring started
Wed Mar 4 20:28:13 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 107

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 107
VULNERABILITIES: 0

Generated on: Wed Mar 4 20:28:15 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

Revise the grep/rg prohibition to use explicit allow/disallow language instead of a parenthetical exception. The rule now reads: disallow commands that may display secret values, allow grep/rg only when using patterns or processing steps that guarantee values are not printed. SAFE examples preserved unchanged.
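
To make the allow/disallow distinction concrete, here is a minimal sketch that contrasts a disallowed grep with a key-name-only pipeline. The `env_file_keys` helper, the temp file, and its fake values are hypothetical and do not appear in prompts/build.txt. It uses `grep -E` piped to `cut -d= -f1` rather than the `-oP` form from the SAFE examples, on the assumption that PCRE support may not be present in every grep build.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): list key names from a
# KEY=VALUE file. Values pass through the pipe but are cut off
# before anything is printed.
env_file_keys() {
  grep -E '^[A-Z_][A-Z0-9_]*=' "$1" | cut -d= -f1
}

# Demo file with fake values; nothing here is a real credential.
demo_env="$(mktemp)"
printf 'DB_PASSWORD=not-a-real-secret\n# a comment\nAPI_KEY=also=fake\n' > "$demo_env"

# DISALLOWED against real files: prints whole lines, values included.
#   grep 'API_KEY' "$demo_env"

# ALLOWED: only the key names are printed.
env_file_keys "$demo_env"
rm -f "$demo_env"
```

Because `cut -d= -f1` keeps only the text before the first `=`, a value that itself contains `=` is still dropped in full.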

github-actions · 2026-03-04T20:34:21Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 107 code smells

[INFO] Recent monitoring activity:
Wed Mar 4 20:34:17 UTC 2026: Code review monitoring started
Wed Mar 4 20:34:18 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 107

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 107
VULNERABILITIES: 0

Generated on: Wed Mar 4 20:34:20 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-03-04T20:35:16Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions bot added the enhancement label (Auto-created from TODO.md tag) Mar 4, 2026

alex-solovyev mentioned this pull request Mar 4, 2026

[Supervisor:alex-solovyev] 13 PRs, 16 assigned, 6 workers at 06:15 UTC #2646

Closed

gemini-code-assist bot reviewed Mar 4, 2026

coderabbitai bot requested changes Mar 4, 2026

alex-solovyev merged commit f9b52ec into main Mar 4, 2026
12 checks passed

alex-solovyev deleted the feature/secret-leaking-guardrails branch March 4, 2026 21:41

alex-solovyev mentioned this pull request Mar 4, 2026

Secret leaking prevention: add guardrails for credential exposure in agent conversations #2846

Closed

marcusquinn mentioned this pull request Mar 5, 2026

quality-debt: .agents/prompts/build.txt — PR #2847 review feedback (medium) #2861

Closed

coderabbitai bot mentioned this pull request Mar 7, 2026

t1412.4: Runtime content scanning for worker pipelines #3098

Closed

alex-solovyev merged 3 commits into main from feature/secret-leaking-guardrails
