chore(tests/mcp): testing for MCP sampling #5456

alexhancock · 2025-10-29T21:12:54Z

Two approaches to testing that ensure MCP sampling continues to work (following the merge of #5367):

1. mcp_integration_test.rs - Makes sure all codepaths related to sampling work correctly, assuming the LLM responds predictably. There are no assertions on content and no integration with a real provider; it just checks that the messages are handled properly.
2. A new scripts/test_mcp.sh script, similar to scripts/test_providers.sh, which does two things:
   - Calls a tool that causes sampling to occur, across a range of models
   - Uses a separate fast model as a judge to determine whether the interaction went through and actually performed sampling
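
For readers who want a picture of the first approach, here is a minimal sketch of the idea: a mock provider returns a canned reply and records what it was asked, so a test can check message handling without a real model. Everything here is hypothetical and simplified for illustration (the `Message` type, the `Provider` trait, `MockProvider::new`, and `handle_sampling_request`); it is not the MockProvider added in this PR, and it does not reflect goose's actual provider or ExtensionManager APIs.

```rust
use std::cell::RefCell;

/// One chat message in a sampling exchange (hypothetical, simplified type).
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub text: String,
}

/// Hypothetical stand-in for an LLM provider interface.
pub trait Provider {
    fn complete(&self, system: &str, messages: &[Message]) -> Result<Message, String>;
}

/// Mock that always answers with the same text and records each request,
/// so tests can inspect what the sampling path forwarded.
pub struct MockProvider {
    reply: String,
    pub seen: RefCell<Vec<Vec<Message>>>,
}

impl MockProvider {
    pub fn new(reply: &str) -> Self {
        MockProvider {
            reply: reply.to_string(),
            seen: RefCell::new(Vec::new()),
        }
    }
}

impl Provider for MockProvider {
    fn complete(&self, _system: &str, messages: &[Message]) -> Result<Message, String> {
        // Record the request, then return the predictable canned reply.
        self.seen.borrow_mut().push(messages.to_vec());
        Ok(Message {
            role: "assistant".to_string(),
            text: self.reply.clone(),
        })
    }
}

/// Hypothetical handler for a sampling request coming from an MCP server:
/// rejects empty requests, otherwise forwards the messages to the provider.
pub fn handle_sampling_request(
    provider: &dyn Provider,
    system: &str,
    messages: &[Message],
) -> Result<Message, String> {
    if messages.is_empty() {
        return Err("sampling request contained no messages".to_string());
    }
    provider.complete(system, messages)
}
```

A test built on this would construct `MockProvider::new("canned reply")`, pass it to `handle_sampling_request`, and then assert on the returned role and on the contents of `seen`, rather than on anything a real model would say. That mirrors the "no assertions on content" intent described above.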

alexhancock · 2025-10-30T20:59:09Z

@jamadeo @DOsinga testing is now set up here for sampling.

Douwe and I discussed merging test_providers.sh, test_subrecipes.sh, and test_mcp.sh into a unified set of tests matrixed across the providers/models, with some ability to choose what runs where, but that makes more sense in a follow-up.

Copilot

Pull Request Overview

This pull request adds MCP (Model Context Protocol) sampling test functionality to the Goose project. The changes enable testing of MCP sampling capabilities across multiple AI providers by validating that models can properly respond to sampling requests initiated through MCP tools.

Key changes include:

- New test script test_mcp.sh that validates MCP sampling across multiple providers (Anthropic, Google, OpenRouter, OpenAI, Tetrate, Databricks)
- MockProvider implementation for testing MCP sampling in integration tests without requiring actual API calls
- Updated test replay files to reflect MCP sampling test results
- Integration of the MCP test script into the CI/CD workflow

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| scripts/test_mcp.sh | New bash script that tests MCP sampling across multiple providers using a judge model to validate responses |
| crates/goose/tests/mcp_integration_test.rs | Adds MockProvider implementation and integrates it with ExtensionManager for MCP sampling tests |
| crates/goose/tests/mcp_replays/npx-y@modelcontextprotocol_server-everything | Updated replay file showing successful sampleLLM tool invocation with Great Gatsby quote |
| crates/goose/tests/mcp_replays/npx-y@modelcontextprotocol_server-everything.results.json | Updated results showing successful sampling response |
| crates/goose/tests/mcp_replays/uvxmcp-server-fetch | Updated replay file with cleaner output (removed verbose setup logs) |
| crates/goose/tests/mcp_replays/uvxmcp-server-fetch.results.json | Updated error message for fetch failure |
| crates/goose/tests/mcp_replays/cargorun--quiet-pgoose-server--bingoosed--mcpdeveloper | Updated replay file with newer timestamps and different user environment |
| crates/goose/tests/mcp_replays/cargorun--quiet-pgoose-server--bingoosed--mcpdeveloper.results.json | Updated list_windows output with more window items |
| .github/workflows/pr-smoke-test.yml | Adds MCP test step to CI workflow with required API keys and configuration |

alexhancock requested a review from jamadeo October 29, 2025 21:12

alexhancock marked this pull request as draft October 29, 2025 21:13

alexhancock mentioned this pull request Oct 29, 2025

feat(mcp): support sampling in a scoped way #5367

Merged

alexhancock removed the request for review from jamadeo October 29, 2025 21:13

alexhancock force-pushed the alexhancock/mcp-sampling-testing-approaches branch 5 times, most recently from f0d73d5 to dcb2259 Compare October 30, 2025 15:25

alexhancock requested a review from jamadeo October 30, 2025 15:28

alexhancock changed the title ~~tests: testing approaches for mcp sampling~~ chore(tests/mcp): testing for MCP sampling Oct 30, 2025

alexhancock requested a review from DOsinga October 30, 2025 15:29

alexhancock marked this pull request as ready for review October 30, 2025 15:38

alexhancock force-pushed the alexhancock/mcp-sampling-testing-approaches branch 2 times, most recently from 1cc48e1 to 5810586 Compare October 30, 2025 18:02

alexhancock force-pushed the alexhancock/mcp-sampling-testing-approaches branch from 5810586 to 47b8945 Compare November 3, 2025 16:31

Copilot AI review requested due to automatic review settings November 3, 2025 16:31

chore(tests/mcp): testing for MCP sampling

ca1b761

alexhancock force-pushed the alexhancock/mcp-sampling-testing-approaches branch from 47b8945 to ca1b761 Compare November 3, 2025 16:33

Copilot AI reviewed Nov 3, 2025

DOsinga approved these changes Nov 3, 2025

alexhancock merged commit c1c1371 into main Nov 3, 2025
18 of 19 checks passed

github-actions bot mentioned this pull request Nov 5, 2025

chore(release): release version 1.13.0 (minor) #5582

Merged

fbalicchia pushed a commit to fbalicchia/goose that referenced this pull request Nov 7, 2025

chore(tests/mcp): testing for MCP sampling (block#5456)

5ddb0e8

Signed-off-by: fbalicchia <[email protected]>

BlairAllan pushed a commit to BlairAllan/goose that referenced this pull request Nov 29, 2025

chore(tests/mcp): testing for MCP sampling (block#5456)

349d04b

Signed-off-by: Blair Allan <[email protected]>
