📝 Walkthrough

This PR adds a new AGENTS.md documentation file providing developer guidance for Codex interactions, and substantially rewrites README.md to reposition the project from MCP-centric messaging toward a broader agent-building narrative, including architecture details and updated use-case descriptions.

**Estimated code review effort**: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (inconclusive)
Actionable comments posted: 4
🧹 Nitpick comments (1)

AGENTS.md (1)

**80-85: Platform-specific prerequisites.**

The macOS-specific setup instructions are clear. Consider noting whether these are optional or required depending on the target platform.

Optional clarification:

```diff
 **Prerequisites for local strategy (one-time setup):**
+*(Required only when building for Linux on macOS)*
 brew install zig
 rustup target add aarch64-unknown-linux-gnu   # Apple Silicon
 rustup target add x86_64-unknown-linux-gnu    # Intel Mac
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@AGENTS.md` around lines 80-85: Update the "Prerequisites for local strategy (one-time setup)" section to explicitly state whether the macOS-specific commands are required or optional depending on the target platform; mention that `brew install zig` and the two `rustup target add ...` commands apply only to macOS (Apple Silicon vs Intel) and are unnecessary on other OSes, and add a short note indicating which targets are required for cross-compilation versus
optional developer convenience.

---

ℹ️ Review info

⚙️ Run configuration

**Configuration used**: defaults
**Review profile**: CHILL
**Plan**: Pro
**Run ID**: `ae269e45-1f7a-4199-8038-6e5fb1294643`

📥 Commits

Reviewing files that changed from the base of the PR and between 3dda278e18284fd8cbbb1744e9bb1821ccc24e76 and 7868c240c1aaed3265309c820361c30e68275b0b.

📒 Files selected for processing (2)

* `AGENTS.md`
* `README.md`
> ## Project Overview
>
> Wren Engine (OSS) is an open source semantic engine for MCP clients and AI agents. It translates SQL queries through a semantic layer (MDL - Modeling Definition Language) and executes them against 22+ data sources (PostgreSQL, BigQuery, Snowflake, Spark, etc.). The engine is powered by Apache DataFusion (Canner fork).
**Inconsistent data source count.**

Line 7 claims "22+ data sources" but README.md lines 162-178 only list 17 sources. Verify the correct count and ensure consistency across documentation.

Suggested verification:

```bash
#!/bin/bash
# Description: Count actual connector implementations
echo "Connectors in ibis-server:"
fd -t f -e py . ibis-server/app/model/metadata/ --exec basename {} .py | grep -v '__' | grep -v 'factory' | grep -v 'base' | sort | wc -l
echo -e "\nListed in README:"
echo "17"
```

🧰 Tools

🪛 LanguageTool

[grammar] ~7-~7: Use a hyphen to join words.
Context: ...t Overview Wren Engine (OSS) is an open source semantic engine for MCP clients a...
(QB_NEW_EN_HYPHEN)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AGENTS.md` at line 7, The AGENTS.md claim "22+ data sources" is inconsistent
with README.md's list of 17 connectors; verify the actual connector count by
counting connector implementations (e.g., files under
ibis-server/app/model/metadata) and either update AGENTS.md to the correct
number or expand the README list to match; specifically edit the phrase "Wren
Engine (OSS) ... executes them against 22+ data sources" in AGENTS.md and the
connector list in README.md (lines enumerating sources) so both reflect the
verified count and ensure any summary header (e.g., "22+ data sources") and the
detailed list stay consistent.
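A check like the one this prompt describes can be scripted so the claim and the list cannot drift apart again. The sketch below is a hypothetical helper with illustrative data only (the real values live in AGENTS.md and README.md); it flags an "N+" claim that exceeds the number of sources actually listed:

```python
import re

def claim_is_consistent(claim: str, listed_sources: list[str]) -> bool:
    """True when an 'N+ data sources' claim does not exceed the listed count."""
    match = re.search(r"(\d+)\+\s+data sources", claim)
    if match is None:
        raise ValueError("no 'N+ data sources' claim found")
    return len(listed_sources) >= int(match.group(1))

# Illustrative values only; README.md actually enumerates 17 sources.
claim = "executes them against 22+ data sources"
sources = ["PostgreSQL", "BigQuery", "Snowflake"]
print(claim_is_consistent(claim, sources))  # False: 3 listed, 22+ claimed
```

Wired into CI, this turns the "verify the count" step into a failing check rather than a manual review comment.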
> `just format` # ruff auto-fix + taplo
>
> Available test markers: `postgres`, `mysql`, `mssql`, `bigquery`, `snowflake`, `clickhouse`, `trino`, `oracle`, `athena`, `duckdb`, `athena_spark`, `databricks`, `spark`, `doris`, `local_file`, `s3_file`, `gcs_file`, `minio_file`, `functions`, `profile`, `cache`, `unit`, `enterprise`, `beta`.
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Description: Extract registered pytest markers from pyproject.toml
rg -A 50 '^\[tool\.pytest\.ini_options\]' ibis-server/pyproject.toml | rg 'markers\s*=' -A 30
```

Repository: Canner/wren-engine
Length of output: 1142

**Update test marker list to match pytest configuration.**

The listed markers contain inaccuracies: `athena_spark`, `profile`, `cache`, `unit`, and `enterprise` are not registered in the pytest configuration, while `canner` and `redshift` are missing from the list. Update to: `postgres`, `mysql`, `mssql`, `bigquery`, `snowflake`, `clickhouse`, `trino`, `oracle`, `athena`, `duckdb`, `databricks`, `spark`, `doris`, `local_file`, `s3_file`, `gcs_file`, `minio_file`, `functions`, `beta`, `canner`, `redshift`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AGENTS.md` at line 57, Update the test marker list string that starts with
"Available test markers:" to match the pytest config by replacing the inaccurate
entries and adding the missing ones so it reads: postgres, mysql, mssql,
bigquery, snowflake, clickhouse, trino, oracle, athena, duckdb, databricks,
spark, doris, local_file, s3_file, gcs_file, minio_file, functions, beta,
canner, redshift; modify the marker list text in the AGENTS.md line containing
that sentence to exactly reflect this corrected set.
> **wren-core internals** (`wren-core/core/src/`):
> - `mdl/` — Core MDL processing: `WrenMDL` (manifest + symbol table), `AnalyzedWrenMDL` (with lineage), function definitions (scalar/aggregate/window per dialect), type planning
> - `logical_plan/analyze/` — DataFusion analyzer rules: `ModelAnalyzeRule` (TableScan → ModelPlanNode), scope tracking, access control (RLAC/CLAC), view expansion, relationship chain resolution
> - `logical_plan/optimize/` — Optimization passes: type coercion, timestamp simplification
> - `sql/` — SQL parsing and analysis
>
> **ibis-server internals** (`ibis-server/app/`):
> - `routers/v3/connector.py` — Main API endpoints (query, validate, dry-plan, metadata)
> - `model/metadata/` — Per-connector implementations (22 connectors), each with its own metadata handling
> - `model/metadata/factory.py` — Connector instantiation
> - `mdl/` — MDL processing: `core.py` (session context), `rewriter.py` (query rewriting), `substitute.py` (model substitution)
> - `custom_ibis/`, `custom_sqlglot/` — Ibis and SQLGlot extensions for Wren-specific behavior
>
> **Manifest types** (`wren-core-base/src/mdl/`):
> - `manifest.rs` — `Manifest`, `Model`, `Column`, `Metric`, `Relationship`, `View`, `RowLevelAccessControl`, `ColumnLevelAccessControl`
> - `builder.rs` — Fluent `ManifestBuilder` API
> - Uses `wren-manifest-macro` for auto-generating Pydantic-compatible Python classes
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Description: Verify key architecture paths mentioned in documentation
paths=(
  "wren-core/core/src/mdl"
  "wren-core/core/src/logical_plan/analyze"
  "wren-core/core/src/logical_plan/optimize"
  "wren-core/core/src/sql"
  "ibis-server/app/routers/v3/connector.py"
  "ibis-server/app/model/metadata/factory.py"
  "ibis-server/app/mdl/core.py"
  "wren-core-base/src/mdl/manifest.rs"
  "wren-core-base/src/mdl/builder.rs"
)
for path in "${paths[@]}"; do
  if [ -e "$path" ]; then
    echo "✓ $path"
  else
    echo "✗ Missing: $path"
  fi
done
```

Repository: Canner/wren-engine
Length of output: 394

🏁 Script executed:

```bash
# Find sql-related directories and files in wren-core/core/src
find wren-core/core/src -type d -name "*sql*" 2>/dev/null | head -20
```

Repository: Canner/wren-engine
Length of output: 44

🏁 Script executed:

```bash
# Search for sql parser/analyzer in the codebase
rg -l "sql.*pars|parser.*sql" wren-core/core/src --type rust | head -20
```

Repository: Canner/wren-engine
Length of output: 514

🏁 Script executed:

```bash
# Check the directory structure of wren-core/core/src to understand module organization
ls -la wren-core/core/src/
```

Repository: Canner/wren-engine
Length of output: 338

🏁 Script executed:

```bash
# Check the module structure in lib.rs
cat wren-core/core/src/lib.rs
```

Repository: Canner/wren-engine
Length of output: 384

🏁 Script executed:

```bash
# Search for sql-related code across wren-core/core/src
rg -i "sql" wren-core/core/src --type rust -l | head -20
```

Repository: Canner/wren-engine
Length of output: 989

**Remove inaccurate path reference for SQL parsing.**

The `wren-core/core/src/sql/` module does not exist. SQL parsing is handled via DataFusion's `sqlparser` library (re-exported in `lib.rs`), not a dedicated wren-core module. Remove this bullet point or replace it with a clarification that SQL parsing is delegated to DataFusion.

Current (incorrect) snippet:

- `sql/` — SQL parsing and analysis

All other referenced paths are accurate and verified.
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@AGENTS.md` around lines 110-126: Remove or update the incorrect "`sql/` — SQL parsing and analysis" bullet in the list of wren-core internals: delete that line and replace it with a brief clarification that SQL parsing is delegated to DataFusion's sqlparser (re-exported in `lib.rs`) rather than handled by a `wren-core/core/src/sql/` module, so references to SQL parsing should point to the DataFusion/sqlparser re-export in `lib.rs`.
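Stale path bullets like this one are easy to catch mechanically. The sketch below is a hypothetical docs check (the bullet syntax mirrors the AGENTS.md list; a temporary directory stands in for `wren-core/core/src/`) that flags any `` `dir/` `` bullet whose directory does not exist:

```python
import re
import tempfile
from pathlib import Path

def missing_path_bullets(doc_text: str, src_root: Path) -> list[str]:
    """Return bullet paths such as `sql/` that do not exist under src_root."""
    flagged = []
    for line in doc_text.splitlines():
        # Match bullets of the form: - `some/dir/` — description
        m = re.match(r"-\s+`([\w/]+/)`", line.strip())
        if m and not (src_root / m.group(1)).is_dir():
            flagged.append(m.group(1))
    return flagged

# Demo against a throwaway tree standing in for wren-core/core/src/.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "mdl").mkdir()
    doc = "- `mdl/` — Core MDL processing\n- `sql/` — SQL parsing and analysis"
    print(missing_path_bullets(doc, root))  # ['sql/']
```

Run against the real tree, this would have surfaced the nonexistent `sql/` bullet the same way the verification scripts above did.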
> ## Supported Data Sources
>
> Wren Engine is built to work across modern data stacks, including warehouses, databases, and file-based sources.
>
> Current open source support includes connectors such as:
>
> - Amazon S3
> - Apache Spark
> - Athena
> - BigQuery
> - Databricks
> - DuckDB
> - Google Cloud Storage
> - Local files
> - MinIO
> - MySQL
> - Oracle
> - PostgreSQL
> - SQL Server
> - Snowflake
> - Trino
> - Redshift
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Description: Check for connector implementations in ibis-server
fd -t f -e py . ibis-server/app/model/metadata/ --exec basename {} .py | sort | head -20
```

Repository: Canner/wren-engine
Length of output: 216

**Remove DuckDB from the list or add a DuckDB connector implementation.**

The README lists DuckDB as a supported data source, but no DuckDB connector exists in `ibis-server/app/model/metadata/`. Either remove it from the supported sources list or ensure the connector implementation is present. Additionally, ClickHouse and Doris connectors exist but are missing from the documentation.
🧰 Tools
🪛 LanguageTool
[grammar] ~160-~160: Use a hyphen to join words.
Context: ...s, and file-based sources. Current open source support includes connectors such ...
(QB_NEW_EN_HYPHEN)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` around lines 156 - 178, The README's "Supported Data Sources"
section mentions DuckDB but there is no DuckDB connector implementation under
ibis-server/app/model/metadata/, so either remove "DuckDB" from that list or add
a proper connector module (implement the DuckDB connector in
ibis-server/app/model/metadata/ with the same interface/exports as existing
connectors) and ensure it is registered where other connectors are discovered;
additionally add missing documented entries for ClickHouse and Doris to the same
"Supported Data Sources" list to reflect actual connectors present.
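The DuckDB, ClickHouse, and Doris findings are the same doc-drift problem in both directions, and both can be diffed at once. A sketch with illustrative subsets only (the real module names live under `ibis-server/app/model/metadata/`; the `aliases` map is a hypothetical helper for display names that differ from module names, e.g. "SQL Server" vs `mssql`):

```python
def doc_vs_connectors(readme_sources: set[str],
                      connector_modules: set[str],
                      aliases: dict[str, str]) -> tuple[set[str], set[str]]:
    """Diff documented source names against connector module names."""
    normalized = {aliases.get(name, name).lower().replace(" ", "_")
                  for name in readme_sources}
    undocumented = connector_modules - normalized   # implemented, not in README
    unimplemented = normalized - connector_modules  # in README, no module
    return undocumented, unimplemented

# Illustrative subsets only.
readme = {"DuckDB", "PostgreSQL", "MySQL"}
modules = {"postgres", "mysql", "clickhouse"}
undoc, unimpl = doc_vs_connectors(readme, modules, {"PostgreSQL": "postgres"})
print(sorted(undoc))   # ['clickhouse']
print(sorted(unimpl))  # ['duckdb']
```

The `unimplemented` set captures the DuckDB case (documented with no connector) and `undocumented` captures ClickHouse/Doris (connectors missing from the README).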