@michaelneale michaelneale commented Nov 24, 2025

Fixes #5752, which means goose may get clearer content back for agent consumption.

Please explain the motivation behind the feature request.
I’d like goose to hint to web servers that it prefers markdown, but will also take other content types. This is helpful when requesting / reading docs and is much more token efficient than the agent reading HTML/JS/CSS

Describe the solution you'd like
Simply adding Accept: text/markdown to outgoing requests would be sufficient.

Additional context

https://x.com/bcherny/status/1988860326306087102?s=46

(brings it in line with direction of other agents).

Copilot AI review requested due to automatic review settings November 24, 2025 03:18
@michaelneale michaelneale changed the title from "suggests using text/markdown when fetching content" to "chore: suggest using text/markdown when fetching content" Nov 24, 2025
Copilot AI left a comment

Pull request overview

This PR adds support for requesting markdown content from web servers by including an Accept: text/markdown, */* header in HTTP requests. This aligns with modern AI agent practices and improves token efficiency when fetching documentation, as markdown is more compact than HTML/JS/CSS.

  • Adds Accept: text/markdown, */* header to the web_scrape tool's HTTP requests
  • Updates documentation to suggest using the Accept header when fetching web content via shell commands
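
As a minimal sketch of what this header hint looks like on the wire (not the actual `web_scrape` implementation, which this thread does not show), the following std-only Rust formats a raw HTTP/1.1 GET request carrying the header. `ACCEPT_MARKDOWN` and `build_get_request` are hypothetical names chosen for illustration.

```rust
/// Accept header value described in this PR: prefer markdown, accept anything.
const ACCEPT_MARKDOWN: &str = "text/markdown, */*";

/// Hypothetical helper (not part of goose): formats a raw HTTP/1.1 GET
/// request that advertises a preference for markdown.
pub fn build_get_request(host: &str, path: &str) -> String {
    format!(
        "GET {} HTTP/1.1\r\nHost: {}\r\nAccept: {}\r\nConnection: close\r\n\r\n",
        path, host, ACCEPT_MARKDOWN
    )
}
```

Because the value ends with the `*/*` wildcard, a server that has no markdown representation can still answer with its default content type rather than rejecting the request.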

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| crates/goose-mcp/src/computercontroller/mod.rs | Implements the Accept header in the web_scrape function's HTTP client request |
| crates/goose-mcp/src/developer/rmcp_developer.rs | Documents the Accept header usage for shell-based web content fetching |

sourcing files do not persist between tool calls. So you may need to repeat them each time by
stringing together commands.
If fetching web content, consider adding Accept: text/markdown header
Contributor

Should we add the reason why to add the header? Right now it would take some extra context to know why it’s needed or when it’s appropriate, so maybe inlining that context here is good?

Collaborator Author

Yeah, I suspect so. I am always wary of "adding more" to a prompt that effectively ends up in a system prompt, so I am trying to be brief. Any ideas for the densest possible way to say that? (I am also not sure the shell() tool is the right place, but there is no developer tool that is "fetch"; there are a ton of extensions that are, and the computer controller one...) Maybe we don't even need it here, and over time models will know (via pre-training) to do this when appropriate? (i.e., don't touch this one, but leave the one in computer controller) @simonsickle ?

Contributor

I thought about this a bit more. When I write rules, I typically follow this pattern: (a) short, (b) conditional, (c) with explicit carve-outs.

so, my attempt here following that pattern is

“For HTTP GETs that retrieve human-readable documentation or prose (e.g., READMEs, guides, specs, blog posts), include ‘Accept: text/markdown, */*’ to request markdown when available (token-efficient, easy to parse). Do not add this header for API/JSON requests, binary/media downloads, or HTML scraping.”

This tells the LLM when and why to use the header, and also when not to use it (if you ask it to scrape HTML or call a JSON endpoint specifically).
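
The conditional rule with carve-outs could be sketched as a small decision function. This is an illustration only: `RequestKind` and `accept_header_for` are hypothetical names, and the categories simply mirror the wording proposed above rather than anything in the goose codebase.

```rust
/// Hypothetical request categories mirroring the proposed rule's wording.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestKind {
    /// Human-readable docs or prose: READMEs, guides, specs, blog posts.
    Documentation,
    /// API or JSON endpoints.
    ApiJson,
    /// Binary or media downloads.
    BinaryDownload,
    /// Requests where the raw HTML itself is wanted.
    HtmlScrape,
}

/// Returns the Accept header value to send, or None for the carve-outs.
pub fn accept_header_for(kind: RequestKind) -> Option<&'static str> {
    match kind {
        RequestKind::Documentation => Some("text/markdown, */*"),
        RequestKind::ApiJson | RequestKind::BinaryDownload | RequestKind::HtmlScrape => None,
    }
}
```

In a prompt the same logic is carried by the sentence itself; the sketch just makes the "when" and "when not" branches explicit.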

Do we have any CLI benchmarks we could test this with?

Collaborator Author

There is tbench, but we are still working on (a) getting it regular and reliable (it costs about $200+ each run and takes some time) and (b) letting it run on branches too (which may be overkill for that). That seems like a bit much wording for that prompt; I would rather it be part of a bigger developer MCP refactoring (i.e., take things out, add other things). Perhaps it is time to bite the bullet and have a built-in fetch tool? (I expect that is now common?)

Collaborator

Interesting discussion here, but in the interest of testing this out I am going to merge with it as-is. We can always iterate on prompts.

Collaborator

@DOsinga DOsinga left a comment

I think converting HTML to markdown would be even better, but that could be done in a follow-up.

@alexhancock alexhancock merged commit 89aa38c into main Nov 25, 2025
27 of 28 checks passed
@alexhancock alexhancock deleted the micn/markdown-header branch November 25, 2025 14:35
Development

Successfully merging this pull request may close these issues.

When goose makes a web request, hint to the server that markdown is preferred