
t1349: fix: strengthen webfetch guidance to prevent 404 errors #2462

Merged: alex-solovyev merged 2 commits into main from bugfix/webfetch-404-guidance on Feb 27, 2026
alex-solovyev (Collaborator) commented Feb 27, 2026

Summary

  • Added a webfetch decision table to build-plus.md research section — maps each content need (GitHub files, library docs, PR/issue details) to the correct tool (gh api, Context7 MCP) with explicit "NOT" column showing what to avoid
  • Fixed context-guardrails.md which recommended raw.githubusercontent.com as a positive example — replaced with gh api and Context7 MCP alternatives
  • Fixed context-builder.md which recommended webfetch on github.com URLs — replaced with gh api repos/{owner}/{repo}/contents/README.md
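
As a rough illustration of the replacement pattern in the bullets above, the sketch below wraps the `gh api` contents call in two small shell helpers. The names `contents_endpoint` and `fetch_file` are hypothetical and do not appear in the PR. The sketch assumes `gh` is installed and authenticated, and that the path points at a regular file small enough for the API to return its content inline.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the PR) illustrating the gh api pattern.

# Build the REST path for a file in a repository.
# Arguments: owner, repo, path within the repo.
contents_endpoint() {
  printf 'repos/%s/%s/contents/%s' "$1" "$2" "$3"
}

# Fetch a file and decode it. The contents endpoint returns JSON whose
# .content field is base64-encoded, hence the decode step.
fetch_file() {
  gh api "$(contents_endpoint "$1" "$2" "$3")" --jq '.content' | base64 -d
}

# Example (requires network access and gh authentication):
#   fetch_file cli cli README.md
```

Keeping the endpoint construction separate from the network call makes the path easy to inspect before anything is fetched.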

Context

Session miner detected 117 webfetch 404 errors out of 250 calls (47% failure rate). 70% of failures were agents inventing raw.githubusercontent.com paths that don't exist. build.txt already had comprehensive error prevention guidance (lines 97-109) — this PR fixes the downstream agent docs that still recommended the bad patterns.
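
The figures quoted above can be rechecked with a few lines of shell. This is only an arithmetic sketch; the helper names `failure_rate` and `failed_share` are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that recompute the numbers quoted in the PR description.

# Percentage of failed calls, to one decimal place.
failure_rate() {
  awk -v fail="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * fail / total }'
}

# Number of failures that a given percentage share corresponds to, rounded.
failed_share() {
  awk -v pct="$1" -v fail="$2" 'BEGIN { printf "%.0f\n", pct * fail / 100 }'
}

failure_rate 117 250   # prints 46.8, which the PR rounds to 47%
failed_share 70 117    # prints 82, i.e. roughly 82 of the 117 failures
```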

Files Changed

| File | Change |
|------|--------|
| `.agents/build-plus.md` | Added 7-row decision table for external content lookup |
| `.agents/tools/context/context-guardrails.md` | Replaced raw.githubusercontent.com positive example with gh api + Context7 |
| `.agents/tools/context/context-builder.md` | Replaced webfetch GitHub URL with gh api command |
| `todo/tasks/t1349-brief.md` | Task brief |

Closes #2461

Summary by CodeRabbit

  • Documentation
    • Updated research and investigation guidance with structured tool recommendations and decision tables
    • Improved protocols for accessing library documentation and repository information
    • Enhanced guidance clarity with best practice recommendations for various research tasks

…URLs

Session miner detected 117 webfetch 404 errors (47% failure rate), with
70% caused by agents inventing raw.githubusercontent.com paths. Three
agent docs still recommended the bad patterns.

Changes:
- build-plus.md: add decision table for external content lookup (gh api,
  Context7 MCP vs webfetch) in the research section
- context-guardrails.md: replace raw.githubusercontent.com positive
  example with gh api alternative and Context7 MCP recommendation
- context-builder.md: replace webfetch GitHub URL with gh api command

build.txt already had comprehensive error prevention guidance (lines
97-109) — no changes needed there.

Closes #2461
@github-actions github-actions bot added the bug Auto-created from TODO.md tag label Feb 27, 2026
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a high rate of webfetch 404 errors, primarily caused by agents incorrectly guessing raw.githubusercontent.com paths. The changes provide clearer, more robust guidance for agents on how to fetch external content, steering them towards using gh api for GitHub content and Context7 MCP for library documentation, thereby improving reliability and reducing wasted resources.

Highlights

  • Webfetch Guidance: Added a webfetch decision table to the build-plus.md research section, mapping content needs to the correct tools (gh api, Context7 MCP) and explicitly stating what to avoid to prevent 404 errors.
  • Context Guardrails Update: Fixed context-guardrails.md by replacing raw.githubusercontent.com as a positive example with gh api and Context7 MCP alternatives, addressing a source of frequent failures.
  • Context Builder Refinement: Updated context-builder.md to replace the recommendation of using webfetch on github.com URLs with the more robust gh api repos/{owner}/{repo}/contents/README.md command.

Changelog

  • .agents/build-plus.md
    • Added a detailed decision table for external content lookup, specifying when to use gh api, Context7 MCP, or webfetch.
    • Included an explanation of why webfetch fails frequently and the advantages of gh api and Context7 MCP.
  • .agents/tools/context/context-builder.md
    • Replaced the webfetch command for fetching a repository's README with a gh api command that retrieves and decodes the content.
  • .agents/tools/context/context-guardrails.md
    • Removed webfetch examples using raw.githubusercontent.com as positive patterns.
    • Added new examples promoting Context7 MCP for library documentation and gh api for GitHub content.
    • Explicitly marked raw.githubusercontent.com URLs as an 'AVOID' pattern due to high failure rates.
  • todo/tasks/t1349-brief.md
    • Added a new task brief document outlining the problem, solution, approach, and acceptance criteria for fixing webfetch 404 errors.

Activity

  • A session miner detected 117 webfetch 404 errors out of 250 calls (47% failure rate), with 70% of failures attributed to agents inventing raw.githubusercontent.com paths.
  • Issue GH#2461 was filed to track and address this problem.
  • The pull request was created to implement stronger guidance across agent documentation to prevent these errors.

coderabbitai bot (Contributor) commented Feb 27, 2026

Warning

Rate limit exceeded

@alex-solovyev has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 58 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between 4c7395e and 76e817c.

📒 Files selected for processing (4)
  • .agents/build-plus.md
  • .agents/tools/context/context-builder.md
  • .agents/tools/context/context-guardrails.md
  • todo/tasks/t1349-brief.md

Walkthrough

The PR updates agent guidance documentation to prevent webfetch 404 errors by replacing raw GitHub URL fetching with gh api and Context7 MCP approaches. Changes span three guidance files and a task checklist, introducing structured tool recommendation tables and examples for authenticated content retrieval.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| Guidance Documentation: Research & Content Retrieval (`.agents/build-plus.md`, `.agents/tools/context/context-builder.md`, `.agents/tools/context/context-guardrails.md`) | Replaced webfetch commands for raw GitHub URLs with gh api alternatives for README/content retrieval. Added tool recommendation tables and warnings against raw.githubusercontent.com URLs. Introduced Context7 MCP for library documentation sourcing. |
| Task Documentation (`todo/tasks/t1349-brief.md`) | Added task checklist and acceptance criteria for webfetch guidance updates, including verification snippets and references to existing build.txt guardrails. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~18 minutes

Possibly related PRs

  • marcusquinn/aidevops#2048 — Modifies .agents/build-plus.md with overlapping guidance tightening in the same research section.
  • marcusquinn/aidevops#117 — Updates the same context guardrails and context-builder files to adjust webfetch and content-retrieval guidance toward authenticated sources.

Suggested labels

bug, needs-review

Poem

404 errors haunt no more,
Raw GitHub URLs ignored at the door 🚪
gh api arrives with auth in hand ✨
Context7 curates docs so grand 📚
117 failures fade away! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: strengthening webfetch guidance to prevent 404 errors by steering away from raw GitHub URLs toward safer alternatives. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #2461 objectives: adds webfetch decision tables to build-plus.md, replaces raw.githubusercontent.com examples with gh api alternatives in context-guardrails.md and context-builder.md, preventing 404 errors. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to agent guidance documentation aligned with issue #2461; the task brief in todo/tasks/t1349-brief.md documents the changes and is properly scoped. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 36 code smells

[INFO] Recent monitoring activity:
Fri Feb 27 12:26:45 UTC 2026: Code review monitoring started
Fri Feb 27 12:26:45 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 36

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 36
  • VULNERABILITIES: 0

Generated on: Fri Feb 27 12:26:48 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request provides excellent and much-needed clarification on the correct tools for fetching external content, particularly from GitHub. The changes across the markdown files are clear, well-reasoned, and directly address the high failure rate of webfetch. The new decision table in build-plus.md is especially helpful. I have one minor suggestion to correct a command example in the new documentation.

| Need | Use | NOT |
|------|-----|-----|
| GitHub file content | `gh api repos/{owner}/{repo}/contents/{path}` | `webfetch` on `raw.githubusercontent.com` |
| GitHub repo overview | `gh api repos/{owner}/{repo}` + `--jq '.description'` | `webfetch` on `github.com` URLs |


medium

The + in this gh api command example appears to be a typo and would cause the command to fail if executed literally. It should be removed to ensure the example is correct.

Suggested change
-| GitHub repo overview | `gh api repos/{owner}/{repo}` + `--jq '.description'` | `webfetch` on `github.com` URLs |
+| GitHub repo overview | `gh api repos/{owner}/{repo} --jq '.description'` | `webfetch` on `github.com` URLs |
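
The first row of the decision table quoted in this thread (file content via `gh api` instead of `raw.githubusercontent.com`) can be sketched as a URL rewrite. The helper name `raw_url_to_gh_api` is hypothetical. It assumes URLs of the form `https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{path}` with a ref that contains no slashes, and it only prints the suggested command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite a raw.githubusercontent.com URL into the
# equivalent gh api call. Assumes the ref (branch, tag, or SHA) has no slashes.
raw_url_to_gh_api() {
  url="${1#https://raw.githubusercontent.com/}"
  # If stripping the prefix changed nothing, the URL is not a raw GitHub URL.
  [ "$url" != "$1" ] || { echo "not a raw.githubusercontent.com URL" >&2; return 1; }
  owner="${url%%/*}"; url="${url#*/}"
  repo="${url%%/*}";  url="${url#*/}"
  ref="${url%%/*}";   path="${url#*/}"
  # The contents endpoint accepts the ref as a query parameter.
  printf "gh api 'repos/%s/%s/contents/%s?ref=%s' --jq '.content' | base64 -d\n" \
    "$owner" "$repo" "$path" "$ref"
}

# Example:
#   raw_url_to_gh_api https://raw.githubusercontent.com/cli/cli/trunk/docs/README.md
```

A rewritten command still returns a 404 if the path was invented, so it does not replace checking that the file exists.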

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
.agents/tools/context/context-guardrails.md (1)

128-130: Use /readme endpoint instead to ensure README detection regardless of case or filename variations.

The /readme endpoint is GitHub's dedicated API for retrieving README content and automatically handles filename variations (case-insensitive, different extensions). Using /contents/README.md requires an exact, case-sensitive filename match and returns a 404 on repositories that name their README differently (for example README.rst or readme.md).

Proposed change
 # BETTER - use gh api for GitHub content (handles auth, structured JSON)
-gh api repos/{owner}/{repo}/contents/README.md --jq '.content' | base64 -d
+gh api repos/{owner}/{repo}/readme --jq '.content' | base64 -d
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/tools/context/context-guardrails.md around lines 128 - 130, Replace
the GitHub contents API usage that requires an exact filename with the dedicated
README endpoint: locate the line containing the gh api call that references
"repos/{owner}/{repo}/contents/README.md --jq '.content' | base64 -d" and change
it to use the "/readme" endpoint so README detection is case-insensitive and
handles alternate filenames; keep the existing --jq '.content' and base64 decode
step so behavior (decoding the returned content) remains the same.
.agents/tools/context/context-builder.md (1)

88-88: Use the GitHub /readme endpoint instead of hardcoding README.md.

Line 88 hardcodes README.md, which fails on repositories with variant README formats (e.g., .rst, .txt). The GitHub API /readme endpoint automatically locates the README regardless of extension and is the official approach for this use case.

Proposed doc fix
-1. **Fetch README first** - `gh api repos/{owner}/{repo}/contents/README.md --jq '.content' | base64 -d` (~1-5K tokens)
+1. **Fetch README first** - `gh api repos/{owner}/{repo}/readme --jq '.content' | base64 -d` (~1-5K tokens)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/tools/context/context-builder.md at line 88, Replace the hardcoded
`README.md` curl/gh api command shown under "Fetch README first" (the line
containing `gh api repos/{owner}/{repo}/contents/README.md --jq '.content' |
base64 -d`) with a call to the GitHub `/readme` endpoint so the README is
resolved regardless of extension; update the text and example command to use the
`/readme` endpoint and show decoding the returned base64 content (keep the same
output/intent but reference `/readme` instead of `contents/README.md`).
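
Both nitpicks above recommend the `/readme` endpoint. A minimal sketch of that call with an explicit failure check follows. The function name `fetch_readme` is hypothetical. The sketch assumes `gh` is installed and authenticated and that `gh api` exits non-zero on an HTTP error. Capturing the response before decoding avoids the pipeline reporting only the exit status of `base64`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): fetch a repository README through
# the /readme endpoint and report lookup failures explicitly.
# Arguments: owner, repo.
fetch_readme() {
  # Capture first so that a failed gh call is not hidden behind the pipe.
  encoded="$(gh api "repos/$1/$2/readme" --jq '.content')" || {
    echo "README lookup failed for $1/$2" >&2
    return 1
  }
  printf '%s\n' "$encoded" | base64 -d
}

# Example (requires network access and gh authentication):
#   fetch_readme cli cli
```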
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@todo/tasks/t1349-brief.md`:
- Around line 30-58: Multiple fenced code blocks in this section violate MD031
(blanks-around-fences) because no blank line separates them from the surrounding
list items; locate each YAML fenced block (e.g., the ones showing verify: method:
codebase pattern: "gh api.*repos" path: ".agents/build-plus.md" and the other
verify blocks) and insert a single blank line immediately before each opening
fence and after each closing fence so every fenced block is surrounded by blank
lines; apply the same fix to all fenced blocks in this fragment to satisfy the
lint rule.


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a6163f2 and 4c7395e.

📒 Files selected for processing (4)
  • .agents/build-plus.md
  • .agents/tools/context/context-builder.md
  • .agents/tools/context/context-guardrails.md
  • todo/tasks/t1349-brief.md

Comment on lines +30 to +58
- [ ] build-plus.md contains a "what to use instead of webfetch" decision table
  ```yaml
  verify:
    method: codebase
    pattern: "gh api.*repos"
    path: ".agents/build-plus.md"
  ```
- [ ] context-guardrails.md no longer recommends raw.githubusercontent.com
  ```yaml
  verify:
    method: codebase
    pattern: "raw\\.githubusercontent\\.com"
    path: ".agents/tools/context/context-guardrails.md"
    expect: absent
  ```
- [ ] context-builder.md uses gh api instead of webfetch for GitHub repos
  ```yaml
  verify:
    method: codebase
    pattern: "webfetch.*github\\.com/\\{user\\}"
    path: ".agents/tools/context/context-builder.md"
    expect: absent
  ```
- [ ] No raw.githubusercontent.com recommended as a positive pattern in any agent doc
  ```yaml
  verify:
    method: bash
    run: "rg 'raw\\.githubusercontent\\.com' .agents/ --type md -l | grep -v 'build.txt' | xargs -I{} rg -c 'NEVER|CAUTION|avoid|bad|wrong|DON.T' {} | grep ':0$' && exit 1 || exit 0"
  ```
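
One caveat about the last `run:` command above: by default `rg -c` prints nothing for a file with zero matches (unless `--include-zero` is passed) and omits the file name when given a single file, so `grep ':0$'` may never match and the check could pass regardless. The sketch below is one possible alternative that uses only `grep`. The function name `check_raw_url_mentions` is hypothetical, and `-r` and `--include` are GNU/BSD grep extensions rather than POSIX.

```shell
#!/usr/bin/env bash
# Hypothetical re-implementation of the check (not from the PR).
# Prints every Markdown file under the given directory that mentions
# raw.githubusercontent.com but contains none of the warning keywords,
# and returns 1 if any such file is found.
check_raw_url_mentions() {
  dir="$1"
  offenders="$(
    grep -rlE --include='*.md' 'raw\.githubusercontent\.com' "$dir" |
    while IFS= read -r file; do
      # Keep only the files that lack every warning keyword.
      grep -qE 'NEVER|CAUTION|avoid|bad|wrong|DON.T' "$file" || printf '%s\n' "$file"
    done
  )"
  if [ -n "$offenders" ]; then
    printf '%s\n' "$offenders"
    return 1
  fi
}

# Example:
#   check_raw_url_mentions .agents/
```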

⚠️ Potential issue | 🟡 Minor

Fix markdown fence spacing to satisfy your own lint criterion.

MD031 (blanks-around-fences) is triggered throughout this section, so “Lint clean” (Line 59) is currently not met.

<details>
<summary>Proposed formatting fix pattern</summary>

 - [ ] build-plus.md contains a "what to use instead of webfetch" decision table
+
   ```yaml
   verify:
     method: codebase
     pattern: "gh api.*repos"
     path: ".agents/build-plus.md"
   ```

Apply the same blank-line pattern to the other fenced blocks in this section.
</details>

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 GitHub Check: Codacy Static Code Analysis</summary>

[notice] 31-31: todo/tasks/t1349-brief.md#L31
Fenced code blocks should be surrounded by blank lines

---

[notice] 36-36: todo/tasks/t1349-brief.md#L36
Fenced code blocks should be surrounded by blank lines

</details>
<details>
<summary>🪛 markdownlint-cli2 (0.21.0)</summary>

[warning] 31-31: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 36-36: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 38-38: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 44-44: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 46-46: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 52-52: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 54-54: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 58-58: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

</details>

</details>
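
The MD031 warnings listed above can be approximated locally with a short awk script. This is a simplified sketch, not a replacement for markdownlint: the function name `check_fence_spacing` is hypothetical, it recognises only backtick fences, and it treats any line starting with three backticks inside a block as the closing fence.

```shell
#!/usr/bin/env bash
# Hypothetical MD031-style check (not from the PR): report fence lines that
# are not surrounded by blank lines. Argument: path to a Markdown file.
check_fence_spacing() {
  awk '
    function blank(s) { return s ~ /^[ \t]*$/ }
    {
      # The previous line closed a fence, so this line must be blank.
      if (need_blank_after && !blank($0))
        printf "%d: no blank line after closing fence\n", NR - 1
      need_blank_after = 0

      # Three backticks after optional indentation, written as bracket
      # expressions so the pattern itself contains no literal fence.
      if ($0 ~ /^[ \t]*[`][`][`]/) {
        if (!in_fence) {
          if (NR > 1 && !blank(prev))
            printf "%d: no blank line before opening fence\n", NR
          in_fence = 1
        } else {
          in_fence = 0
          need_blank_after = 1
        }
      }
      prev = $0
    }
  ' "$1"
}

# Example:
#   check_fence_spacing todo/tasks/t1349-brief.md
```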


…amples

- Use GitHub /readme endpoint instead of /contents/README.md for case-insensitive README detection
- Remove errant + in gh api command example in build-plus.md decision table
- Fix MD031 blank lines around fenced code blocks in t1349-brief.md
@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 36 code smells

[INFO] Recent monitoring activity:
Fri Feb 27 12:45:43 UTC 2026: Code review monitoring started
Fri Feb 27 12:45:44 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 36

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 36
  • VULNERABILITIES: 0

Generated on: Fri Feb 27 12:45:46 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud

@alex-solovyev alex-solovyev merged commit 632cad2 into main Feb 27, 2026
10 of 11 checks passed
@github-actions github-actions bot mentioned this pull request Feb 27, 2026
5 tasks
@marcusquinn marcusquinn deleted the bugfix/webfetch-404-guidance branch March 3, 2026 03:23

Labels

bug Auto-created from TODO.md tag

Development

Successfully merging this pull request may close these issues.

t1349: fix: webfetch 404 errors

1 participant