Conversation
Co-authored-by: Sanyam Kapoor <sanyamk@nvidia.com> Signed-off-by: George Armstrong <georgea@nvidia.com>
📝 WalkthroughWalkthroughAdds a new MCP server tool for Tavily web search integration. The implementation provides an async search handler that calls the Tavily API with authentication, returns standardized responses via an Changes
Sequence DiagramsequenceDiagram
actor Client
participant MCP Server
participant Handler
participant Tavily API
Client->>MCP Server: Call tavily-search tool with query
MCP Server->>Handler: answer(query)
Handler->>Handler: Validate API key
Handler->>Tavily API: POST /search (with query & API key)
alt API Success
Tavily API-->>Handler: 200 + search results
Handler->>Handler: Parse results
Handler-->>MCP Server: ExecutionResult(result=data)
else API Error
Tavily API-->>Handler: Non-200 status
Handler-->>MCP Server: ExecutionResult(error=message)
end
MCP Server-->>Client: Return ExecutionResult
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
nemo_skills/mcp/servers/tavily_search_tool.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
nemo_skills/mcp/servers/tavily_search_tool.py (1)
nemo_skills/mcp/tool_providers.py (2)
MCPClientTool(26-124)apply_config_updates(54-58)
🪛 Ruff (0.14.8)
nemo_skills/mcp/servers/tavily_search_tool.py
94-94: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: pre-commit
- GitHub Check: unit-tests
🔇 Additional comments (4)
nemo_skills/mcp/servers/tavily_search_tool.py (4)
30-33: LGTM!Clean dataclass for standardizing tool responses with optional error/result fields.
36-39: LGTM!Global API key pattern is appropriate here since it's set once at startup before the server begins processing requests.
74-85: LGTM!The class correctly extends
MCPClientTooland configures the stdio transport. Note that the subprocess will rely on theTAVILY_API_KEYenvironment variable since no--api-keyargument is passed—this is appropriate as it avoids exposing the key in process listings.
88-99: LGTM!The CLI setup provides flexible API key configuration with proper validation. The static analysis hint (TRY003) about exception message length is a minor style preference—the current inline message is clear and appropriate for this context.
| async def answer( | ||
| query: Annotated[str, Field(description="Search query.")], | ||
| ) -> ExecutionResult: | ||
| """Get a summary of search results from the web using Tavily.""" | ||
|
|
||
| api_url = "https://api.tavily.com/search" | ||
|
|
||
| headers = { | ||
| "Authorization": f"Bearer {TAVILY_API_KEY}", | ||
| "Content-Type": "application/json", | ||
| } | ||
|
|
||
| payload = { | ||
| "query": query, | ||
| # "auto_parameters": False, | ||
| "search_depth": "basic", | ||
| "include_answer": "basic", ## or advanced. | ||
| } | ||
|
|
||
| async with httpx.AsyncClient() as client: | ||
| response = await client.post(api_url, headers=headers, json=payload) | ||
| if response.status_code != 200: | ||
| return {"error": response.json()["error"]} | ||
|
|
||
| result = response.json()["answer"] | ||
|
|
||
| return {"result": result} |
There was a problem hiding this comment.
Return type mismatch: function returns dicts but declares ExecutionResult.
The function signature declares ExecutionResult as the return type, but lines 67 and 71 return plain dictionaries. This creates a type inconsistency.
Apply this diff to return proper ExecutionResult instances:
- async with httpx.AsyncClient() as client:
- response = await client.post(api_url, headers=headers, json=payload)
+ async with httpx.AsyncClient(timeout=30.0) as client:
+ response = await client.post(api_url, headers=headers, json=payload)
if response.status_code != 200:
- return {"error": response.json()["error"]}
+ error_detail = response.json().get("error", response.text)
+ return ExecutionResult(error=str(error_detail))
- result = response.json()["answer"]
+ result = response.json()["answer"]
- return {"result": result}
+ return ExecutionResult(result=result)This also:
- Adds a timeout to prevent indefinite hangs.
- Uses
.get()with fallback for safer error extraction. - Moves
response.json()inside theasync withblock for consistency.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| async def answer( | |
| query: Annotated[str, Field(description="Search query.")], | |
| ) -> ExecutionResult: | |
| """Get a summary of search results from the web using Tavily.""" | |
| api_url = "https://api.tavily.com/search" | |
| headers = { | |
| "Authorization": f"Bearer {TAVILY_API_KEY}", | |
| "Content-Type": "application/json", | |
| } | |
| payload = { | |
| "query": query, | |
| # "auto_parameters": False, | |
| "search_depth": "basic", | |
| "include_answer": "basic", ## or advanced. | |
| } | |
| async with httpx.AsyncClient() as client: | |
| response = await client.post(api_url, headers=headers, json=payload) | |
| if response.status_code != 200: | |
| return {"error": response.json()["error"]} | |
| result = response.json()["answer"] | |
| return {"result": result} | |
| async def answer( | |
| query: Annotated[str, Field(description="Search query.")], | |
| ) -> ExecutionResult: | |
| """Get a summary of search results from the web using Tavily.""" | |
| api_url = "https://api.tavily.com/search" | |
| headers = { | |
| "Authorization": f"Bearer {TAVILY_API_KEY}", | |
| "Content-Type": "application/json", | |
| } | |
| payload = { | |
| "query": query, | |
| # "auto_parameters": False, | |
| "search_depth": "basic", | |
| "include_answer": "basic", ## or advanced. | |
| } | |
| async with httpx.AsyncClient(timeout=30.0) as client: | |
| response = await client.post(api_url, headers=headers, json=payload) | |
| if response.status_code != 200: | |
| error_detail = response.json().get("error", response.text) | |
| return ExecutionResult(error=str(error_detail)) | |
| result = response.json()["answer"] | |
| return ExecutionResult(result=result) |
| async with httpx.AsyncClient() as client: | ||
| response = await client.post(api_url, headers=headers, json=payload) | ||
| if response.status_code != 200: | ||
| return {"error": response.json()["error"]} | ||
|
|
||
| result = response.json()["answer"] |
There was a problem hiding this comment.
Add error handling for JSON parsing and missing keys.
If the Tavily API returns a non-JSON error response or an unexpected JSON structure, the current code will raise unhandled exceptions (JSONDecodeError or KeyError).
Consider wrapping the response parsing in try-except:
async with httpx.AsyncClient(timeout=30.0) as client:
response = await client.post(api_url, headers=headers, json=payload)
+ try:
+ data = response.json()
+ except Exception as e:
+ return ExecutionResult(error=f"Failed to parse response: {e}")
+
if response.status_code != 200:
- error_detail = response.json().get("error", response.text)
+ error_detail = data.get("error", response.text)
return ExecutionResult(error=str(error_detail))
- result = response.json()["answer"]
+ result = data.get("answer")
+ if result is None:
+ return ExecutionResult(error="No answer in response")
return ExecutionResult(result=result)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In nemo_skills/mcp/servers/tavily_search_tool.py around lines 64 to 69, the code
assumes response.json() succeeds and the "error" or "answer" keys exist; wrap
the JSON parsing and key access in a try/except block that catches
JSONDecodeError and KeyError (or use response.is_json/content-type check),
attempt to parse JSON safely for both non-200 and 200 cases, and return a clear
error dict when parsing fails or keys are missing (e.g., include response.text
and status_code), otherwise extract and return result = parsed.get("answer")
after validating it's present.
Signed-off-by: George Armstrong <georgea@nvidia.com> WIP return string Signed-off-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com> try direct tool call Signed-off-by: George Armstrong <georgea@nvidia.com> remove comment line Signed-off-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com> Co-authored-by: Sanyam Kapoor <sanyamk@nvidia.com> Signed-off-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com> Co-authored-by: Sanyam Kapoor <sanyamk@nvidia.com> Signed-off-by: wasiahmad <wasiahmad@ucla.edu>
Signed-off-by: George Armstrong <georgea@nvidia.com> Co-authored-by: Sanyam Kapoor <sanyamk@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com> Co-authored-by: Sanyam Kapoor <sanyamk@nvidia.com> Signed-off-by: wasiahmad <wasiahmad@ucla.edu>
Signed-off-by: George Armstrong <georgea@nvidia.com> Co-authored-by: Sanyam Kapoor <sanyamk@nvidia.com> Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com> Co-authored-by: Sanyam Kapoor <sanyamk@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com> Co-authored-by: Sanyam Kapoor <sanyamk@nvidia.com>
Signed-off-by: George Armstrong <georgea@nvidia.com> Co-authored-by: Sanyam Kapoor <sanyamk@nvidia.com> Signed-off-by: dgitman <dgitman@nvidia.com>
Adds Tavily Search tool implementation from @activatedgeek with a change to the tool name to match the Tavily remote MCP.
The
exclude_domains_configparameter is a required configuration for the Tavily search tool. It must point to a JSON file specifying domains to exclude from all search queries.This is intentionally mandatory to prevent accidentally training models on content from restricted domains. Users must explicitly configure domain exclusions before using the search tool.
Usage with nemo-skills
Specify
exclude_domains_configviatool_overrides:ns generate \ --cluster local \ --model Qwen/Qwen3-8B \ --server_type vllm \ --server_gpus 1 \ --server_args '--enable-auto-tool-choice --tool-call-parser hermes' \ --input_file data.jsonl \ --output_dir outputs \ ++tool_modules=[nemo_skills.mcp.servers.tavily_search_tool.TavilySearchTool] \ ++tool_overrides.TavilySearchTool.exclude_domains_config=/path/to/exclude_domains.jsonOr via Python API:
Exclude Domains JSON Format
The JSON file should follow this structure:
{ "notices": [ { "properties": [ { "type": "domain", "value": "example.com" }, { "type": "domain", "value": "restricted-site.org" } ] } ] }Behavior
ValueErrorand refuses to runexclude_domainsargument is hidden viahide_args, so the model cannot see or override the exclusion listSummary by CodeRabbit
Release Notes
✏️ Tip: You can customize this high-level summary in your review settings.