From 5f877b78bfd8daac2e274f7e9f3050c11ff3d271 Mon Sep 17 00:00:00 2001
From: Jax Liu <liugs963@gmail.com>
Date: Wed, 11 Mar 2026 10:58:42 +0800
Subject: [PATCH 1/3] docs: expand CLAUDE.md with full build, architecture, and
 skills reference

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 .claude/CLAUDE.md | 180 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 155 insertions(+), 25 deletions(-)

diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index 84bbb94df..6b3672491 100644
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -1,28 +1,94 @@
 # CLAUDE.md
 
-Wren Engine (OSS) — open-source semantic SQL engine for MCP clients and AI agents. Translates queries through MDL (Modeling Definition Language) against 22+ data sources, powered by Apache DataFusion (Canner fork).
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+Wren Engine is a semantic engine for MCP clients and AI agents. It translates SQL queries through a semantic layer (MDL - Modeling Definition Language) and executes them against 22+ data sources (PostgreSQL, BigQuery, Snowflake, Spark, etc.). The engine is powered by Apache DataFusion (Canner fork).
 
 ## Repository Structure
 
-- **wren-core/** — Rust semantic engine: MDL analysis, query planning, DataFusion integration
-- **wren-core-base/** — Shared Rust crate: manifest types (`Model`, `Column`, `Metric`, `Relationship`, `View`)
-- **wren-core-py/** — PyO3 Python bindings for wren-core, built with Maturin
-- **ibis-server/** — FastAPI REST server: query execution, validation, 22 connector backends
-- **mcp-server/** — MCP server exposing Wren Engine to AI agents
-- **skills/** — User-facing MCP skills (wren-sql, wren-quickstart, etc.)
+Five main modules:
+
+- **wren-core/** — Rust semantic engine (Cargo workspace: `core/`, `sqllogictest/`, `benchmarks/`, `wren-example/`). Handles MDL analysis, query planning, logical plan optimization, and SQL generation via DataFusion.
+- **wren-core-base/** — Shared Rust crate with manifest types (`Model`, `Column`, `Metric`, `Relationship`, `View`). Has optional `python-binding` feature for PyO3 compatibility.
+- **wren-core-py/** — PyO3 bindings exposing wren-core to Python. Built with Maturin.
+- **ibis-server/** — FastAPI web server (Python 3.11). Provides REST API for query execution, validation, and metadata. Uses Ibis framework for data source connectivity.
+- **mcp-server/** — MCP server exposing Wren Engine to AI agents (Claude, Cline, Cursor).
+
+Supporting modules: `wren-core-legacy/` (Java engine, fallback for v2 queries), `mock-web-server/`, `benchmark/`, `example/`.
+
+## Build & Development Commands
 
-Supporting: `wren-core-legacy/` (Java fallback for v2), `mock-web-server/`, `benchmark/`, `example/`
+### wren-core (Rust)
+```bash
+cd wren-core
+cargo check --all-targets           # Compile check
+cargo test --lib --tests --bins     # Run tests (set RUST_MIN_STACK=8388608)
+cargo fmt --all                     # Format Rust code
+cargo clippy --all-targets --all-features -- -D warnings  # Lint
+taplo fmt                           # Format Cargo.toml files
+```
+
+Most unit tests are in `wren-core/core/src/mdl/mod.rs`. SQL end-to-end tests use sqllogictest files in `wren-core/sqllogictest/test_files/`.
 
-## Build Quick Reference
+### wren-core-py (Python bindings)
+```bash
+cd wren-core-py
+just install                        # Poetry install
+just develop                        # Build dev wheel with maturin
+just test-rs                        # Rust tests (cargo test --no-default-features)
+just test-py                        # Python tests (pytest)
+just test                           # Both
+just format                         # cargo fmt + ruff + taplo
+```
+
+### ibis-server (FastAPI)
+```bash
+cd ibis-server
+just install                        # Poetry install + build wren-core-py wheel + cython rebuild
+just dev                            # Dev server on port 8000
+just run                            # Production server on port 8000
+just test <MARKER>                  # Run pytest with marker (e.g., just test postgres)
+just lint                           # ruff format check + ruff check
+just format                         # ruff auto-fix + taplo
+```
 
-| Module | Install | Test | Format / Lint |
-|--------|---------|------|---------------|
-| wren-core | — | `cargo test --lib --tests --bins` | `cargo fmt --all` / `cargo clippy` |
-| wren-core-py | `just install` | `just test` | `just format` |
-| ibis-server | `just install` | `just test <MARKER>` | `just format` / `just lint` |
-| mcp-server | `uv sync` | — | — |
+Available test markers: `postgres`, `mysql`, `mssql`, `bigquery`, `snowflake`, `clickhouse`, `trino`, `oracle`, `athena`, `duckdb`, `athena_spark`, `databricks`, `spark`, `local_file`, `s3_file`, `gcs_file`, `minio_file`, `functions`, `profile`, `cache`, `unit`, `enterprise`, `beta`.
+
+### mcp-server
+Uses `uv` for dependency management. See `mcp-server/README.md`.
+
+### Docker (ibis-server image)
+
+```bash
+cd ibis-server
+just docker-build                                      # current platform, Rust built locally
+just docker-build linux/amd64                          # single specific platform
+just docker-build linux/amd64,linux/arm64 --push       # multi-arch (must --push, cannot load locally)
+```
+
+**Two build strategies controlled by `WHEEL_SOURCE` build-arg:**
+
+| Scenario | Strategy | Speed |
+|----------|----------|-------|
+| Target == host platform | `WHEEL_SOURCE=local` — Rust built on host via maturin+zig | Fast (reuses host cargo cache) |
+| Cross-platform / multi-arch | `WHEEL_SOURCE=docker` — Rust built inside Docker via BuildKit cache mounts | Slow first build, incremental after |
+
+`just docker-build` auto-detects host platform and chooses the right strategy.
+
+**Prerequisites for local strategy (one-time setup):**
+```bash
+brew install zig
+rustup target add aarch64-unknown-linux-gnu   # Apple Silicon
+rustup target add x86_64-unknown-linux-gnu    # Intel Mac
+```
 
-> Detailed commands, env vars, and test markers → see each module's README
+**Key constraints:**
+- Multi-arch builds always use `WHEEL_SOURCE=docker` (Rust compiled inside Docker)
+- Multi-arch images cannot be loaded locally — `--push` to a registry is required
+- BuildKit must be enabled (`DOCKER_BUILDKIT=1` or Docker Desktop default)
+- Build contexts required: `wren-core-py`, `wren-core`, `wren-core-base`, `mcp-server` (all relative to `ibis-server/`)
 
 ## Architecture: Query Flow
 
@@ -32,19 +98,83 @@ SQL Query → ibis-server (FastAPI v3 router)
   → wren-core-py (PyO3 FFI)
   → wren-core (Rust: MDL analysis → logical plan → optimization)
   → DataFusion (query planning)
-  → Connector (Ibis/sqlglot → dialect SQL)
+  → Connector (data source-specific SQL via Ibis/sqlglot)
   → Native execution (Postgres, BigQuery, etc.)
+  → Response (with optional query caching)
 ```
 
-Fallback: if wren-core (v3) fails, ibis-server retries via wren-core-legacy (Java, v2).
+If wren-core (v3) fails, ibis-server falls back to the legacy Java engine (v2).
 
-Key files: `ibis-server/app/routers/v3/connector.py`, `wren-core/core/src/logical_plan/analyze/`
+## Key Architecture Details
+
+**wren-core internals** (`wren-core/core/src/`):
+- `mdl/` — Core MDL processing: `WrenMDL` (manifest + symbol table), `AnalyzedWrenMDL` (with lineage), function definitions (scalar/aggregate/window per dialect), type planning
+- `logical_plan/analyze/` — DataFusion analyzer rules: `ModelAnalyzeRule` (TableScan → ModelPlanNode), scope tracking, access control (RLAC/CLAC), view expansion, relationship chain resolution
+- `logical_plan/optimize/` — Optimization passes: type coercion, timestamp simplification
+- `sql/` — SQL parsing and analysis
+
+**ibis-server internals** (`ibis-server/app/`):
+- `routers/v3/connector.py` — Main API endpoints (query, validate, dry-plan, metadata)
+- `model/metadata/` — Per-connector implementations (22 connectors), each with its own metadata handling
+- `model/metadata/factory.py` — Connector instantiation
+- `mdl/` — MDL processing: `core.py` (session context), `rewriter.py` (query rewriting), `substitute.py` (model substitution)
+- `custom_ibis/`, `custom_sqlglot/` — Ibis and SQLGlot extensions for Wren-specific behavior
+
+**Manifest types** (`wren-core-base/src/mdl/`):
+- `manifest.rs` — `Manifest`, `Model`, `Column`, `Metric`, `Relationship`, `View`, `RowLevelAccessControl`, `ColumnLevelAccessControl`
+- `builder.rs` — Fluent `ManifestBuilder` API
+- Uses `wren-manifest-macro` for auto-generating Pydantic-compatible Python classes
+
+## Running ibis-server Tests Locally
+
+Required environment variables (see `.github/workflows/ibis-ci.yml` for CI values):
+```bash
+export QUERY_CACHE_STORAGE_TYPE=local
+export WREN_ENGINE_ENDPOINT=http://localhost:8080
+export WREN_WEB_ENDPOINT=http://localhost:3000
+export PROFILING_STORE_PATH=file:///tmp/profiling
+```
+
+On macOS, `psycopg2` may fail to build due to missing OpenSSL linkage:
+```bash
+LDFLAGS="-L$(brew --prefix openssl)/lib" CPPFLAGS="-I$(brew --prefix openssl)/include" just install
+```
+
+Connector tests use testcontainers (Docker required). For example, to run the tests for a single connector:
+```bash
+just test clickhouse   # runs pytest -m clickhouse
+```
+
+TPCH test data is generated via DuckDB's TPCH extension (`CALL dbgen(sf=0.01)`) and loaded into the testcontainer at module scope. See `tests/routers/v3/connector/clickhouse/conftest.py` for the pattern.
+
+## Known wren-core Limitations
+
+**ModelAnalyzeRule — correlated subquery column resolution**: The `ModelAnalyzeRule` in `wren-core/core/src/logical_plan/analyze/` cannot resolve outer column references inside correlated subqueries. It only sees the subquery's own table scope. This affects TPCH Q2, Q4, Q15, Q17, Q20, Q21, Q22. See `ibis-server/tests/routers/v3/connector/clickhouse/TPCH_ISSUES.md`.
 
 ## Conventions
 
-- **Commits**: Conventional commits — `feat:`, `fix:`, `chore:`, `refactor:`, `test:`, `docs:`, `perf:`, `deps:`
-- **Releases**: Automated via release-please
-- **Rust**: `cargo fmt`, `clippy -D warnings`, `taplo fmt` for TOML
-- **Python**: `ruff` (line-length 88, Python 3.11 target), Poetry for deps
-- **Snapshot tests**: wren-core uses `insta`
-- **CI**: Rust CI on `wren-core/**`; ibis CI on all PRs; core-py CI on `wren-core-py/**` or `wren-core/**`
+- **Commits**: Conventional commits (`feat:`, `fix:`, `chore:`, `refactor:`, `test:`, `docs:`, `perf:`, `deps:`). Releases are automated via release-please.
+- **Rust**: Format with `cargo fmt`, lint with `clippy -D warnings`, TOML formatting with `taplo`.
+- **Python**: Format and lint with `ruff` (line-length 88, target Python 3.11). Poetry for dependency management.
+- **DataFusion fork**: `https://github.com/Canner/datafusion.git` branch `canner/v49.0.1`. Also forked Ibis: `https://github.com/Canner/ibis.git` branch `canner/10.8.1`.
+- **Snapshot testing**: wren-core uses `insta` for Rust snapshot tests.
+- **CI**: Rust CI runs on `wren-core/**` changes. ibis CI runs on all PRs. Core-py CI runs on `wren-core-py/**` or `wren-core/**` changes.
+
+## Skills
+
+Project-level skills are stored in `.claude/skills/`. Use these when working with Wren Engine SQL:
+
+- **wren-text-to-sql** — Rules for generating SQL queries targeting Wren Engine (MDL models, filter strategies, data types, aggregation). Trigger when asked to write SQL for Wren.
+- **wren-sql-correction** — Diagnostic workflow for fixing SQL errors across parsing, planning, transpiling, and execution stages. Trigger when debugging a Wren SQL error.
+- **wren-bigquery-dialect** — BigQuery-specific SQL rules (TIMESTAMP intervals, type casting, DATE_DIFF argument order, GROUP BY alias restrictions). Trigger when `dataSource` is BigQuery.
+- **wren-array-types** — ARRAY literal syntax and UNNEST patterns. Trigger when writing SQL involving array columns.
+- **wren-calculated-fields** — How to interpret and use pre-computed Calculated Field columns in MDL. Trigger when the schema contains columns marked as Calculated Fields.
+- **wren-date-time** — DATE_TRUNC, EXTRACT, DATE_DIFF, interval arithmetic, epoch conversion. Trigger when writing date/time SQL.
+- **wren-semi-structured-types** — GET_PATH, AS_VARCHAR/AS_INTEGER/AS_ARRAY for JSON/VARIANT/OBJECT columns. Trigger when querying semi-structured data.
+- **wren-structured-types** — STRUCT type definition and dot-notation field access. Trigger when querying STRUCT columns.
+- **wren-mcp-usage** — Wren MCP server setup, tool reference, connection config, and query workflow. Trigger when working on or asking about the MCP server.
+
+Additional skills in `skills/` (agentskills.io format, portable across agent tools):
+
+- **generate-mdl** (`skills/generate-mdl/SKILL.md`) — Step-by-step workflow for generating a Wren MDL manifest from a database via ibis-server metadata endpoints (no local DB drivers required). Trigger when a user wants to create a new MDL, onboard a new data source, or scaffold a manifest from an existing database.
+- **wren-project** (`skills/wren-project/SKILL.md`) — Save, load, and build Wren MDL manifests as YAML project directories. Trigger when a user wants to persist MDL as YAML files, load a YAML project, or compile to `target/mdl.json`.

From d6c5c9a4854ab19c1b4eb366919ef530a2601b19 Mon Sep 17 00:00:00 2001
From: Jax Liu <liugs963@gmail.com>
Date: Wed, 11 Mar 2026 11:02:27 +0800
Subject: [PATCH 2/3] chore: remove nonexistent skills

---
 .claude/CLAUDE.md | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index 6b3672491..7597e8d48 100644
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -159,22 +159,3 @@ TPCH test data is generated via DuckDB's TPCH extension (`CALL dbgen(sf=0.01)`)
 - **DataFusion fork**: `https://github.com/Canner/datafusion.git` branch `canner/v49.0.1`. Also forked Ibis: `https://github.com/Canner/ibis.git` branch `canner/10.8.1`.
 - **Snapshot testing**: wren-core uses `insta` for Rust snapshot tests.
 - **CI**: Rust CI runs on `wren-core/**` changes. ibis CI runs on all PRs. Core-py CI runs on `wren-core-py/**` or `wren-core/**` changes.
-
-## Skills
-
-Project-level skills are stored in `.claude/skills/`. Use these when working with Wren Engine SQL:
-
-- **wren-text-to-sql** — Rules for generating SQL queries targeting Wren Engine (MDL models, filter strategies, data types, aggregation). Trigger when asked to write SQL for Wren.
-- **wren-sql-correction** — Diagnostic workflow for fixing SQL errors across parsing, planning, transpiling, and execution stages. Trigger when debugging a Wren SQL error.
-- **wren-bigquery-dialect** — BigQuery-specific SQL rules (TIMESTAMP intervals, type casting, DATE_DIFF argument order, GROUP BY alias restrictions). Trigger when `dataSource` is BigQuery.
-- **wren-array-types** — ARRAY literal syntax and UNNEST patterns. Trigger when writing SQL involving array columns.
-- **wren-calculated-fields** — How to interpret and use pre-computed Calculated Field columns in MDL. Trigger when the schema contains columns marked as Calculated Fields.
-- **wren-date-time** — DATE_TRUNC, EXTRACT, DATE_DIFF, interval arithmetic, epoch conversion. Trigger when writing date/time SQL.
-- **wren-semi-structured-types** — GET_PATH, AS_VARCHAR/AS_INTEGER/AS_ARRAY for JSON/VARIANT/OBJECT columns. Trigger when querying semi-structured data.
-- **wren-structured-types** — STRUCT type definition and dot-notation field access. Trigger when querying STRUCT columns.
-- **wren-mcp-usage** — Wren MCP server setup, tool reference, connection config, and query workflow. Trigger when working on or asking about the MCP server.
-
-Additional skills in `skills/` (agentskills.io format, portable across agent tools):
-
-- **generate-mdl** (`skills/generate-mdl/SKILL.md`) — Step-by-step workflow for generating a Wren MDL manifest from a database via ibis-server metadata endpoints (no local DB drivers required). Trigger when a user wants to create a new MDL, onboard a new data source, or scaffold a manifest from an existing database.
-- **wren-project** (`skills/wren-project/SKILL.md`) — Save, load, and build Wren MDL manifests as YAML project directories. Trigger when a user wants to persist MDL as YAML files, load a YAML project, or compile to `target/mdl.json`.

From e8d57a7a7439816a3afca928cefe75c2031c5676 Mon Sep 17 00:00:00 2001
From: Jax Liu <liugs963@gmail.com>
Date: Wed, 11 Mar 2026 11:05:33 +0800
Subject: [PATCH 3/3] docs: add doris test marker and clarify project
 description

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 .claude/CLAUDE.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index 7597e8d48..ba9eedd50 100644
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
-Wren Engine is a semantic engine for MCP clients and AI agents. It translates SQL queries through a semantic layer (MDL - Modeling Definition Language) and executes them against 22+ data sources (PostgreSQL, BigQuery, Snowflake, Spark, etc.). The engine is powered by Apache DataFusion (Canner fork).
+Wren Engine (OSS) is an open-source semantic engine for MCP clients and AI agents. It translates SQL queries through a semantic layer (MDL - Modeling Definition Language) and executes them against 22+ data sources (PostgreSQL, BigQuery, Snowflake, Spark, etc.). The engine is powered by Apache DataFusion (Canner fork).
 
 ## Repository Structure
 
@@ -54,7 +54,7 @@ just lint                           # ruff format check + ruff check
 just format                         # ruff auto-fix + taplo
 ```
 
-Available test markers: `postgres`, `mysql`, `mssql`, `bigquery`, `snowflake`, `clickhouse`, `trino`, `oracle`, `athena`, `duckdb`, `athena_spark`, `databricks`, `spark`, `local_file`, `s3_file`, `gcs_file`, `minio_file`, `functions`, `profile`, `cache`, `unit`, `enterprise`, `beta`.
+Available test markers: `postgres`, `mysql`, `mssql`, `bigquery`, `snowflake`, `clickhouse`, `trino`, `oracle`, `athena`, `duckdb`, `athena_spark`, `databricks`, `spark`, `doris`, `local_file`, `s3_file`, `gcs_file`, `minio_file`, `functions`, `profile`, `cache`, `unit`, `enterprise`, `beta`.
 
 ### mcp-server
 Uses `uv` for dependency management. See `mcp-server/README.md`.