feat(mcp): introduce `get_wren_guide` tool for default prompt by goldmedal · Pull Request #1360 · Canner/wren-engine

goldmedal · 2025-10-30T05:44:16Z

Introduce the `get_wren_guide` tool to provide a guide and tips for AI agents.
Fix the `dryRun` issue.
Fix the `get_available_tables` issue.

Summary by CodeRabbit

New Features
- Retrieve available functions from the engine
- Retrieve current data source type
- Get a customized usage guide/tips per data source (including source-specific tips)
- Dry-run now returns actual query validation output
- Available tables now returned as a structured list
Documentation
- Expanded environment variable setup guidance with multiple configuration options and example env block
- Added VsCode MCP Extension to compatible agents
- Added usage tips, recommended prompts, examples and follow-ups
Chores
- Updated MCP dependency to a newer version

coderabbitai · 2025-10-30T05:44:26Z

Walkthrough

This PR updates docs and server behavior: README expands env/config guidance and usage tips; wren.py adds GET /functions helper and endpoints (get_available_functions, get_current_data_source_type, get_wren_guide), renames dry_run→dryRun, adds header x-wren-fallback_disable, and changes get_available_tables return type to list[str]; pyproject bumps mcp[cli].

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation `mcp-server/README.md` | Expanded environment variable setup (root `.env` and MCP config), injected example env block (`WREN_URL`, `CONNECTION_INFO_FILE`, `MDL_PATH`), added VsCode MCP Extension to agent list, plus usage tips, a recommended prompt, and optional follow-ups. |
| Server API & Logic `mcp-server/app/wren.py` | Added `make_get_available_functions_request()` helper; added endpoints `get_available_functions()`, `get_current_data_source_type()`, `get_wren_guide()`; renamed query param `dry_run` → `dryRun`; added header `x-wren-fallback_disable: "true"` to requests; changed dry-run behavior to return Wren's textual response; changed `get_available_tables()` return type to `list[str]`. |
| Dependencies `mcp-server/pyproject.toml` | Upgraded dependency `mcp[cli]` from `>=1.4.1` to `>=1.19.0`. |

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant MCP_Server as "MCP Server"
    participant Wren as "Wren Engine"

    rect rgb(210,230,255)
    note over Client,MCP_Server: Get Available Functions (new)
    Client->>MCP_Server: get_available_functions()
    MCP_Server->>Wren: GET /functions
    Wren-->>MCP_Server: functions list
    MCP_Server-->>Client: return functions
    end

    rect rgb(220,255,230)
    note over Client,Wren: Dry-run query (modified)
    Client->>MCP_Server: query(sql, dryRun=true)
    MCP_Server->>Wren: POST /query (header: x-wren-fallback_disable: "true", param dryRun=true)
    Wren-->>MCP_Server: textual validation response
    MCP_Server-->>Client: return validation text
    end

    rect rgb(255,245,200)
    note over Client,MCP_Server: Get Guide (new, assembled server-side)
    Client->>MCP_Server: get_wren_guide()
    MCP_Server->>MCP_Server: assemble guide (data-source specific)
    MCP_Server-->>Client: return guide text
    end
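The dry-run step in the diagram can be sketched in Python as follows. This is a minimal sketch, not the code from `wren.py`: `build_dry_run_request`, the base URL, and the JSON body field names are assumptions made for illustration. Only the `POST /query` path, the `dryRun` query parameter, and the `x-wren-fallback_disable: "true"` header come from the walkthrough.

```python
import json
from urllib.parse import urlencode


def build_dry_run_request(base_url: str, sql: str, manifest_str: str) -> dict:
    """Describe the dry-run request without sending it (hypothetical helper)."""
    # The query parameter is camelCase `dryRun`, not `dry_run`.
    query_string = urlencode({"dryRun": "true"})
    return {
        "method": "POST",
        "url": f"{base_url}/query?{query_string}",
        # The header value is the string "true", as shown in the diagram.
        "headers": {"x-wren-fallback_disable": "true"},
        # Body field names are assumptions; check the engine API for the real ones.
        "body": json.dumps({"sql": sql, "manifestStr": manifest_str}),
    }


request = build_dry_run_request("http://localhost:8000", "SELECT 1", "<base64 MDL>")
print(request["url"])
```

Keeping request construction separate from sending makes the parameter rename and the new header easy to check in a unit test.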

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Attention areas:
- Verify callers of get_available_tables() for updated list[str] return type.
- Confirm any tool/registration exposure for new endpoints (get_available_functions, get_wren_guide, get_current_data_source_type).
- Validate renaming of query param dry_run → dryRun and addition of x-wren-fallback_disable header.
- Check compatibility with bumped mcp[cli] version.

Possibly related PRs

feat(mcp): introduce MCP server #1094: Implements similar additions/changes to mcp-server/app/wren.py (new endpoints, renaming to dryRun, added headers) and appears directly related.

Suggested reviewers

douenergy

Poem

🐰 a hop, a nibble, then a cheer,

New guides and functions now appear,
dryRun whispers where old flags stood,
Tables tidy, lists feel good,
I bounce — the Wren grows wise and clear. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "feat(mcp): introduce `get_wren_guide` tool for default prompt" directly corresponds to the primary objective listed in the PR description, which is to introduce the new `get_wren_guide` tool. The title is specific and clear about what feature is being added and its purpose, without using vague or generic terms. While the PR includes additional changes such as dependency updates, bug fixes to dryRun and get_available_tables, and other new endpoints, the title appropriately captures the main feature that drives this changeset. The title is concise (61 characters, 7 words) and would help teammates quickly understand the primary purpose of the PR when scanning history. |

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

mcp-server/README.md (1)
127-133: Add a language hint to the fenced block

markdownlint reports MD040 because this fenced block has no language identifier. Please add one (for example, `text`) so the linter passes.
-```
+```text
 Use the get_wren_guide() tool to learn how to use Wren Engine and discover available tools and examples.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c14e99 and eeadb90.

⛔ Files ignored due to path filters (1)

mcp-server/uv.lock is excluded by !**/*.lock

📒 Files selected for processing (3)

mcp-server/README.md (4 hunks)
mcp-server/app/wren.py (5 hunks)
mcp-server/pyproject.toml (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

mcp-server/app/wren.py (3)

ibis-server/wren/session/__init__.py (1)

sql (37-52)

wren-core/core/src/mdl/mod.rs (1)

mdl (197-232)

wren-core-py/src/extractor.rs (1)

mdl_base64 (174-234)

🪛 markdownlint-cli2 (0.18.1)

mcp-server/README.md

130-130: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: ci

mcp-server/app/wren.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

mcp-server/app/wren.py (1)

329-329: Already fixed: String comparison is correct.

The past review comment flagged this line for using `is` instead of `==`, but the current code correctly uses `==` for string equality comparison.
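A short illustration of why equality is the right check here. This is a sketch under the assumption that the data source name is derived at runtime (the review elsewhere shows it coming from `mdl_schema["dataSource"].lower()`); `is_snowflake` is a hypothetical helper, not a function from the PR.

```python
def is_snowflake(data_source: str) -> bool:
    """Compare by value; `==` is the right operator for string equality."""
    return data_source == "snowflake"


# `.lower()` builds a new string object at runtime, so an identity check
# against another string is not reliable even when the characters match.
expected = "snowflake"
loaded = "SNOWFLAKE".lower()

print(is_snowflake(loaded))  # equality compares contents
print(loaded is expected)    # identity compares objects; typically False on CPython
```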

🧹 Nitpick comments (2)

mcp-server/app/wren.py (2)
197-197: LGTM! Accurate type signature.

The return type change from str to list[str] correctly reflects what the function returns, improving type safety and IDE support.

Minor: Remove trailing whitespace.

Line 203 contains trailing spaces.

Apply this diff:
-    return [table["name"] for table in mdl["models"]]   
+    return [table["name"] for table in mdl["models"]]
Also applies to: 203-203
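The list-returning behaviour can be sketched on its own. The comprehension matches the line quoted in the diff; the function name `get_table_names` and the sample MDL fragment are assumptions for illustration.

```python
def get_table_names(mdl: dict) -> list[str]:
    """Return model names as a list rather than a single formatted string."""
    return [table["name"] for table in mdl["models"]]


# Hypothetical MDL fragment; real manifests carry more fields per model.
sample_mdl = {"models": [{"name": "orders"}, {"name": "customers"}]}

names = get_table_names(sample_mdl)
print(names)  # ['orders', 'customers']
```

Callers that previously treated the result as a string (for example, by concatenating it into a prompt) would need to join or iterate over the list instead.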

22-29: Consider failing fast if required configuration is missing.

If MDL_PATH is not provided, data_source (line 25) and mdl_base64 (line 26) remain undefined, causing NameError when any tool that depends on them is called. Consider raising an error at startup or providing default values if the MCP server cannot function without this configuration.

Apply this diff to fail fast with a clear error message:
 if mdl_path:
     with open(mdl_path) as f:
         mdl_schema = json.load(f)
         data_source = mdl_schema["dataSource"].lower()
         mdl_base64 = dict_to_base64_string(mdl_schema)
         print(f"Loaded MDL {f.name}")  # noqa: T201
 else:
-    print("No MDL_PATH environment variable found")
+    raise RuntimeError("MDL_PATH environment variable is required but not found")
Alternatively, if the server should support running without MDL_PATH for certain operations, initialize with safe defaults:
 if mdl_path:
     with open(mdl_path) as f:
         mdl_schema = json.load(f)
         data_source = mdl_schema["dataSource"].lower()
         mdl_base64 = dict_to_base64_string(mdl_schema)
         print(f"Loaded MDL {f.name}")  # noqa: T201
 else:
     print("No MDL_PATH environment variable found")
+    data_source = None
+    mdl_base64 = None
Then add guards in functions that require these values.
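The "safe defaults plus guards" option could look roughly like this. It is a sketch under stated assumptions: `load_mdl` and `require_data_source` are hypothetical names, and the body of `dict_to_base64_string` is a guess at what the real helper does (JSON-encode, then base64-encode).

```python
import base64
import json
import os
import tempfile
from typing import Optional, Tuple


def dict_to_base64_string(data: dict) -> str:
    # Assumed implementation: JSON-encode, then base64-encode.
    return base64.b64encode(json.dumps(data).encode("utf-8")).decode("utf-8")


def load_mdl(mdl_path: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (data_source, mdl_base64), or (None, None) when no path is set."""
    if not mdl_path:
        return None, None
    with open(mdl_path) as f:
        mdl_schema = json.load(f)
    return mdl_schema["dataSource"].lower(), dict_to_base64_string(mdl_schema)


def require_data_source(data_source: Optional[str]) -> str:
    """Guard for tools that cannot work without a loaded MDL."""
    if data_source is None:
        raise RuntimeError("MDL_PATH is not set; no data source is available")
    return data_source


# Usage with a throwaway MDL file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"dataSource": "SNOWFLAKE", "models": []}, tmp)

loaded_source, loaded_mdl = load_mdl(tmp.name)
missing_source, missing_mdl = load_mdl(None)
os.remove(tmp.name)
print(loaded_source, missing_source)
```

With this shape, both names always exist after startup, and each tool decides whether a missing MDL is an error or a case to report back to the user.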

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eeadb90 and 37d91e4.

📒 Files selected for processing (1)

mcp-server/app/wren.py (5 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

mcp-server/app/wren.py (2)

ibis-server/wren/session/__init__.py (1)

sql (37-52)

wren-core-py/src/extractor.rs (1)

mdl_base64 (174-234)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: ci

🔇 Additional comments (5)

mcp-server/app/wren.py (5)

40-40: LGTM! Backend compatibility improvements.

The addition of the x-wren-fallback_disable header and the parameter rename from dry_run to dryRun align with backend API requirements.

Also applies to: 44-44

85-95: LGTM! Consistent implementation.

The new helper function follows the established pattern for making HTTP requests and appropriately uses GET for fetching available functions.
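For orientation, a GET helper of this kind might be shaped like the sketch below. The URL layout, the `Accept` header, and the name `build_functions_request` are assumptions; the PR only states that the helper issues `GET /functions`, and the actual HTTP client used in `wren.py` is not shown here.

```python
from urllib.request import Request


def build_functions_request(base_url: str, data_source: str) -> Request:
    """Build (but do not send) a GET request for the function list."""
    # Hypothetical path layout; the real route is defined by the engine.
    url = f"{base_url}/{data_source}/functions"
    return Request(url, headers={"Accept": "application/json"}, method="GET")


req = build_functions_request("http://localhost:8000", "snowflake")
print(req.get_method(), req.full_url)
```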

162-163: LGTM! Improved validation feedback.

Returning the actual response text instead of a static message provides more informative feedback to users about query validation results.

302-308: LGTM! Useful new tool.

The function provides a straightforward way for users to discover available functions for their data source type.

317-362: LGTM! Comprehensive usage guide.

The guide provides helpful tips for using Wren Engine, with data source-specific guidance for Snowflake and generic fallback for other data sources.

mcp-server/app/wren.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

mcp-server/app/wren.py (2)
22-26: Critical: Initialize data_source at module level to prevent NameError.

This issue was flagged in a previous review but remains unresolved. The variable data_source is only assigned when mdl_path is truthy (line 25). If the MDL_PATH environment variable is not set or empty, data_source remains undefined, causing a NameError at runtime when accessed on lines 44, 63, 77, 90, 315, 326-328, 331, and 340-341.

Apply this diff to initialize data_source at module level:
 MDL_SCHEMA_PATH = "mdl.schema.json"
 connection_info_path = os.getenv("CONNECTION_INFO_FILE")
 # TODO: maybe we should log the number of tables and columns
 mdl_path = os.getenv("MDL_PATH")
+data_source = None
 
 if mdl_path:
310-317: The None check will fail if data_source is undefined.

Line 315 checks `if data_source is None:`, but if `MDL_PATH` was not set, `data_source` is never assigned, so a `NameError` is raised before the comparison executes. Once `data_source` is initialized at module level (as noted in the previous comment), this check will function correctly.
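The difference between an unassigned name and a name bound to `None` can be reproduced in isolation. This sketch runs two tiny stand-ins for the module-level code with `exec`; the snippets and the `none_check` helper are illustrative only, not code from the PR.

```python
WITHOUT_DEFAULT = """
if mdl_path:
    data_source = "snowflake"
"""

WITH_DEFAULT = "data_source = None\n" + WITHOUT_DEFAULT


def none_check(module_source: str, mdl_path):
    """Run module-level code, then evaluate `data_source is None`."""
    namespace = {"mdl_path": mdl_path}
    exec(module_source, namespace)
    try:
        return eval("data_source is None", namespace)
    except NameError:
        return "NameError"


print(none_check(WITHOUT_DEFAULT, None))     # NameError
print(none_check(WITH_DEFAULT, None))        # True
print(none_check(WITH_DEFAULT, "mdl.json"))  # False
```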

🧹 Nitpick comments (1)

mcp-server/app/wren.py (1)

332-337: Fix grammar in user-facing documentation.

Several grammatical issues reduce clarity:

Lines 335-336: "For process" should be "To process"
Lines 347, 355: "Avoid to use" should be "Avoid using"

Apply this diff:

-        3. For process semi-structure data type (e.g. `VARIANT`), you can use `get_path` function to extract the value from the semi-structure data.
-        4. For process array data type (e.g. `ARRAY`), you can use `UNNEST` function to flatten the array data. `UNNEST` only accepts array column as input. If you extract an array value by `get_path` function, you need to cast it to array type (by `to_array` function) before using `UNNEST`.
+        3. To process semi-structured data types (e.g. `VARIANT`), you can use the `get_path` function to extract values from the semi-structured data.
+        4. To process array data types (e.g. `ARRAY`), you can use the `UNNEST` function to flatten array data. `UNNEST` only accepts array columns as input. If you extract an array value with the `get_path` function, you need to cast it to an array type (using the `to_array` function) before using `UNNEST`.

-    Avoid to use database specific SQL syntax in your Wren SQL.
+    Avoid using database-specific SQL syntax in your Wren SQL.

-    5. Avoid to use `LATERAL` statement in your queries, as Wren Engine may not support it well. Use normal `JOIN` or `CROSS JOIN UNNEST` instead.
+    5. Avoid using `LATERAL` statements in your queries, as Wren Engine may not support them well. Use normal `JOIN` or `CROSS JOIN UNNEST` instead.

Also applies to: 347-347, 355-355

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 37d91e4 and b28d8c7.

📒 Files selected for processing (1)

mcp-server/app/wren.py (5 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

mcp-server/app/wren.py (2)

wren-core/core/src/mdl/mod.rs (1)

mdl (197-232)

wren-core-py/src/extractor.rs (1)

mdl_base64 (174-234)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: ci

🔇 Additional comments (5)

mcp-server/app/wren.py (5)

85-95: LGTM!

The new make_get_available_functions_request function follows the same pattern as other request functions in the file, using consistent error handling and HTTP client usage.

162-163: Good improvement: returning actual response text.

Returning response.text instead of a static validation message provides more actionable feedback to users about what the dry run validated.

197-197: Return type correctly updated to match implementation.

Changing the return type from str to list[str] aligns with the actual return value on line 203. Note that this is a breaking change to the function signature that may affect existing callers.

Also applies to: 203-203

302-308: LGTM!

The new get_available_functions tool is well-documented and follows the established patterns in the file.

40-40: Header name is correct; no changes needed.

The header x-wren-fallback_disable with an underscore is the intentional design of the Wren Engine API. It is defined as a constant in ibis-server/app/dependencies.py and used consistently throughout the test suite. Similar headers like x-wren-db-statement_timeout follow the same underscore naming pattern, confirming this is the codebase's deliberate convention, not a deviation.

coderabbitai · 2025-10-30T06:35:22Z

mcp-server/app/wren.py

+    tips = f"""
+    ## Tips for using Wren Engine with {data_source.capitalize()}
+    You are connected to a {data_source.capitalize()} database via Wren Engine.
+    Here are some tips to use {data_source.capitalize()} effectively:
+    """
+
+    if data_source == "snowflake":
+        tips += """
+        1. Snowflake supports semi-structured data types like VARIANT, OBJECT, and ARRAY. You can use these data types to store and query JSON data.
+        2. Snowflake has a rich set of built-in functions to process semi-structured data. You can use functions like GET_PATH, TO_VARIANT, TO_ARRAY, etc.
+        3. For process semi-structure data type (e.g. `VARIANT`), you can use `get_path` function to extract the value from the semi-structure data.
+        4. For process array data type (e.g. `ARRAY`), you can use `UNNEST` function to flatten the array data. `UNNEST` only accepts array column as input. If you extract an array value by `get_path` function, you need to cast it to array type (by `to_array` function) before using `UNNEST`.
+        """
+    else:
+        tips += f"""
+        1. Use {data_source.capitalize()}'s specific functions and features to optimize your queries.
+        2. Refer to {data_source.capitalize()}'s documentation for more details on how to use its features effectively.
+        """


⚠️ Potential issue | 🔴 Critical

Add guard for None data source to prevent AttributeError.

Lines 326-328 and 340-341 call data_source.capitalize() without checking if data_source is None. After fixing the module-level initialization, you must add a guard in this function to handle the case where MDL hasn't been deployed.

Apply this diff to add the guard:

 @mcp.tool()
 async def get_wren_guide() -> str:
     """
     Understand how to use Wren Engine effectively to query your database
     """
+
+    if data_source is None:
+        return "No data source connected. Please deploy the MDL first and assign `dataSource` field."
     tips = f"""
     ## Tips for using Wren Engine with {data_source.capitalize()}

🤖 Prompt for AI Agents

In mcp-server/app/wren.py around lines 325 to 342, the code calls data_source.capitalize() and compares data_source to "snowflake" without guarding against data_source being None; add a guard at the top of this block that normalizes data_source into a safe variable (e.g. if not data_source: safe_name = "Unknown" and safe_source = "" or safe_source = data_source.lower()) or return/emit a default tips string when MDL isn't deployed, then use safe_name for display (safe_name.capitalize()) and safe_source for the snowflake comparison (safe_source == "snowflake") so no AttributeError occurs when data_source is None.
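The guard described in this comment can be sketched as a standalone, synchronous function. `build_guide` is a hypothetical name and the tip text is abbreviated; the early-return message is the one proposed in the diff.

```python
from typing import Optional


def build_guide(data_source: Optional[str]) -> str:
    """Assemble data-source tips, returning early when no MDL is loaded."""
    if data_source is None:
        return "No data source connected. Please deploy the MDL first and assign `dataSource` field."

    name = data_source.capitalize()
    tips = f"## Tips for using Wren Engine with {name}\n"
    if data_source == "snowflake":
        tips += "1. Use `get_path` to extract values from `VARIANT` columns.\n"
    else:
        tips += f"1. Refer to {name}'s documentation for source-specific features.\n"
    return tips


print(build_guide(None))
print(build_guide("snowflake"))
```

Returning early keeps every later `data_source.capitalize()` call safe without threading a second "safe name" variable through the function.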

douenergy · 2025-10-31T08:36:01Z

Thanks @goldmedal

…#1360) Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

goldmedal added 2 commits October 30, 2025 13:36

enhance tools for snowflake usage

bf07ffe

enhance readme

eeadb90

github-actions bot added documentation Improvements or additions to documentation dependencies Pull requests that update a dependency file labels Oct 30, 2025

goldmedal changed the title ~~feat(mcp): introduce usage guide tool for default prompt~~ feat(mcp): introduce get_wren_guide tool for default prompt Oct 30, 2025

coderabbitai bot reviewed Oct 30, 2025

Update mcp-server/app/wren.py

37d91e4

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai bot reviewed Oct 30, 2025

add check for data source check

b28d8c7

coderabbitai bot reviewed Oct 30, 2025

goldmedal requested a review from douenergy October 31, 2025 02:26

douenergy approved these changes Oct 31, 2025

douenergy merged commit ae298f4 into Canner:main Oct 31, 2025
5 checks passed

goldmedal deleted the feat/enhance-mcp-for-snowflake branch November 5, 2025 05:37

douenergy merged 4 commits into Canner:main from goldmedal:feat/enhance-mcp-for-snowflake
