docs: expand CLAUDE.md with full build, architecture, and skills reference #1441
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,28 +1,94 @@ | ||
| # CLAUDE.md | ||
|
|
||
| Wren Engine (OSS) - open-source semantic SQL engine for MCP clients and AI agents. Translates queries through MDL (Modeling Definition Language) against 22+ data sources, powered by Apache DataFusion (Canner fork). | ||
| This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. | ||
|
|
||
| ## Project Overview | ||
|
|
||
| Wren Engine (OSS) is an open source semantic engine for MCP clients and AI agents. It translates SQL queries through a semantic layer (MDL - Modeling Definition Language) and executes them against 22+ data sources (PostgreSQL, BigQuery, Snowflake, Spark, etc.). The engine is powered by Apache DataFusion (Canner fork). | ||
|
|
||
| ## Repository Structure | ||
|
|
||
| - **wren-core/** - Rust semantic engine: MDL analysis, query planning, DataFusion integration | ||
| - **wren-core-base/** - Shared Rust crate: manifest types (`Model`, `Column`, `Metric`, `Relationship`, `View`) | ||
| - **wren-core-py/** - PyO3 Python bindings for wren-core, built with Maturin | ||
| - **ibis-server/** - FastAPI REST server: query execution, validation, 22 connector backends | ||
| - **mcp-server/** - MCP server exposing Wren Engine to AI agents | ||
| - **skills/** - User-facing MCP skills (wren-sql, wren-quickstart, etc.) | ||
| Four main modules: | ||
|
|
||
| - **wren-core/** - Rust semantic engine (Cargo workspace: `core/`, `sqllogictest/`, `benchmarks/`, `wren-example/`). Handles MDL analysis, query planning, logical plan optimization, and SQL generation via DataFusion. | ||
| - **wren-core-base/** - Shared Rust crate with manifest types (`Model`, `Column`, `Metric`, `Relationship`, `View`). Has optional `python-binding` feature for PyO3 compatibility. | ||
| - **wren-core-py/** - PyO3 bindings exposing wren-core to Python. Built with Maturin. | ||
| - **ibis-server/** - FastAPI web server (Python 3.11). Provides REST API for query execution, validation, and metadata. Uses Ibis framework for data source connectivity. | ||
| - **mcp-server/** - MCP server exposing Wren Engine to AI agents (Claude, Cline, Cursor). | ||
|
|
||
| Supporting modules: `wren-core-legacy/` (Java engine, fallback for v2 queries), `mock-web-server/`, `benchmark/`, `example/`. | ||
|
|
||
| ## Build & Development Commands | ||
|
|
||
| ### wren-core (Rust) | ||
| ```bash | ||
| cd wren-core | ||
| cargo check --all-targets # Compile check | ||
| cargo test --lib --tests --bins # Run tests (set RUST_MIN_STACK=8388608) | ||
| cargo fmt --all # Format Rust code | ||
| cargo clippy --all-targets --all-features -- -D warnings # Lint | ||
| taplo fmt # Format Cargo.toml files | ||
| ``` | ||
|
|
||
| Most unit tests are in `wren-core/core/src/mdl/mod.rs`. SQL end-to-end tests use sqllogictest files in `wren-core/sqllogictest/test_files/`. | ||
|
|
||
| ### wren-core-py (Python bindings) | ||
| ```bash | ||
| cd wren-core-py | ||
| just install # Poetry install | ||
| just develop # Build dev wheel with maturin | ||
| just test-rs # Rust tests (cargo test --no-default-features) | ||
| just test-py # Python tests (pytest) | ||
| just test # Both | ||
| just format # cargo fmt + ruff + taplo | ||
| ``` | ||
|
|
||
| ### ibis-server (FastAPI) | ||
| ```bash | ||
| cd ibis-server | ||
| just install # Poetry install + build wren-core-py wheel + cython rebuild | ||
| just dev # Dev server on port 8000 | ||
| just run # Production server on port 8000 | ||
| just test <MARKER> # Run pytest with marker (e.g., just test postgres) | ||
| just lint # ruff format check + ruff check | ||
| just format # ruff auto-fix + taplo | ||
| ``` | ||
|
|
||
| Available test markers: `postgres`, `mysql`, `mssql`, `bigquery`, `snowflake`, `clickhouse`, `trino`, `oracle`, `athena`, `duckdb`, `athena_spark`, `databricks`, `spark`, `doris`, `local_file`, `s3_file`, `gcs_file`, `minio_file`, `functions`, `profile`, `cache`, `unit`, `enterprise`, `beta`. | ||
|
|
||
| ### mcp-server | ||
| Uses `uv` for dependency management. See `mcp-server/README.md`. | ||
|
|
||
| ### Docker (ibis-server image) | ||
|
|
||
| ```bash | ||
| cd ibis-server | ||
| just docker-build # current platform, Rust built locally | ||
| just docker-build linux/amd64 # single specific platform | ||
| just docker-build linux/amd64,linux/arm64 --push # multi-arch (must --push, cannot load locally) | ||
| ``` | ||
|
|
||
| **Two build strategies controlled by `WHEEL_SOURCE` build-arg:** | ||
|
|
||
| Supporting: `wren-core-legacy/` (Java fallback for v2), `mock-web-server/`, `benchmark/`, `example/` | ||
| | Scenario | Strategy | Speed | | ||
| |----------|----------|-------| | ||
| | Target == host platform | `WHEEL_SOURCE=local` - Rust built on host via maturin+zig | Fast (reuses host cargo cache) | | ||
| | Cross-platform / multi-arch | `WHEEL_SOURCE=docker` - Rust built inside Docker via BuildKit cache mounts | Slow first build, incremental after | | ||
|
|
||
| ## Build Quick Reference | ||
| `just docker-build` auto-detects host platform and chooses the right strategy. | ||
|
|
||
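The selection rule in the strategy table can be sketched as a small shell function. This is an illustration only: `choose_wheel_source` is a hypothetical name, the host platform is passed in by hand, and the actual `just docker-build` recipe may detect the host and pick the strategy differently.

```bash
# Hypothetical helper (not part of the justfile): pick a WHEEL_SOURCE value.
#   $1 = target platform list, e.g. linux/amd64 or linux/amd64,linux/arm64
#   $2 = host platform, e.g. linux/arm64 (assumed to be known by the caller)
choose_wheel_source() {
    target="$1"
    host="$2"
    case "$target" in
        *,*)     echo docker ;;  # multi-arch: Rust is compiled inside Docker
        "$host") echo local ;;   # target matches host: reuse the host cargo cache
        *)       echo docker ;;  # single cross-platform target
    esac
}

choose_wheel_source linux/arm64 linux/arm64
choose_wheel_source linux/amd64,linux/arm64 linux/arm64
```

The comma check comes first so that a multi-arch list never falls through to the host comparison.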
| | Module | Install | Test | Format / Lint | | ||
| |--------|---------|------|---------------| | ||
| | wren-core | - | `cargo test --lib --tests --bins` | `cargo fmt --all` / `cargo clippy` | | ||
| | wren-core-py | `just install` | `just test` | `just format` | | ||
| | ibis-server | `just install` | `just test <MARKER>` | `just format` / `just lint` | | ||
| | mcp-server | `uv sync` | - | - | | ||
| **Prerequisites for local strategy (one-time setup):** | ||
| ```bash | ||
| brew install zig | ||
| rustup target add aarch64-unknown-linux-gnu # Apple Silicon | ||
| rustup target add x86_64-unknown-linux-gnu # Intel Mac | ||
| ``` | ||
|
Comment on lines +80 to +85 (Contributor):
Scope these prerequisites to macOS or add a Linux variant.
Suggested wording:
-**Prerequisites for local strategy (one-time setup):**
+**Prerequisites for local strategy on macOS (one-time setup):**
If Linux is also supported for the local strategy, add the equivalent package-manager command there as well. |
||
|
|
||
| > Detailed commands, env vars, and test markers: see each module's README | ||
| **Key constraints:** | ||
| - Multi-arch builds always use `WHEEL_SOURCE=docker` (Rust compiled inside Docker) | ||
| - Multi-arch images cannot be loaded locally; `--push` to a registry is required | ||
| - BuildKit must be enabled (`DOCKER_BUILDKIT=1` or Docker Desktop default) | ||
| - Build contexts required: `wren-core-py`, `wren-core`, `wren-core-base`, `mcp-server` (all relative to `ibis-server/`) | ||
|
|
||
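The `--push` constraint for multi-arch builds could be checked before invoking the build. The guard below is a sketch under assumptions: `require_push_for_multiarch` is a hypothetical function, not something the justfile is known to contain, and it treats any comma-separated platform list as multi-arch.

```bash
# Hypothetical pre-build guard mirroring the "--push is required" constraint.
#   $1  = platform list (comma-separated means multi-arch)
#   $2+ = remaining arguments that would be passed to the build
require_push_for_multiarch() {
    platforms="$1"
    shift
    case "$platforms" in
        *,*)
            for arg in "$@"; do
                [ "$arg" = "--push" ] && return 0
            done
            echo "multi-arch build needs --push (image cannot be loaded locally)" >&2
            return 1
            ;;
    esac
    return 0  # single-platform builds have no --push requirement
}

require_push_for_multiarch linux/amd64,linux/arm64 --push && echo "ok to build"
```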
| ## Architecture: Query Flow | ||
|
|
||
|
|
@@ -32,19 +98,64 @@ SQL Query → ibis-server (FastAPI v3 router) | |
| → wren-core-py (PyO3 FFI) | ||
| → wren-core (Rust: MDL analysis → logical plan → optimization) | ||
| → DataFusion (query planning) | ||
| → Connector (Ibis/sqlglot → dialect SQL) | ||
| → Connector (data source-specific SQL via Ibis/sqlglot) | ||
| → Native execution (Postgres, BigQuery, etc.) | ||
| → Response (with optional query caching) | ||
| ``` | ||
|
|
||
| If wren-core (v3) fails, ibis-server falls back to the legacy Java engine (v2). | ||
|
|
||
| ## Key Architecture Details | ||
|
|
||
| **wren-core internals** (`wren-core/core/src/`): | ||
| - `mdl/` - Core MDL processing: `WrenMDL` (manifest + symbol table), `AnalyzedWrenMDL` (with lineage), function definitions (scalar/aggregate/window per dialect), type planning | ||
| - `logical_plan/analyze/` - DataFusion analyzer rules: `ModelAnalyzeRule` (TableScan → ModelPlanNode), scope tracking, access control (RLAC/CLAC), view expansion, relationship chain resolution | ||
| - `logical_plan/optimize/` - Optimization passes: type coercion, timestamp simplification | ||
| - `sql/` - SQL parsing and analysis | ||
|
|
||
| **ibis-server internals** (`ibis-server/app/`): | ||
| - `routers/v3/connector.py` - Main API endpoints (query, validate, dry-plan, metadata) | ||
| - `model/metadata/` - Per-connector implementations (22 connectors), each with its own metadata handling | ||
| - `model/metadata/factory.py` - Connector instantiation | ||
| - `mdl/` - MDL processing: `core.py` (session context), `rewriter.py` (query rewriting), `substitute.py` (model substitution) | ||
| - `custom_ibis/`, `custom_sqlglot/` - Ibis and SQLGlot extensions for Wren-specific behavior | ||
|
|
||
| **Manifest types** (`wren-core-base/src/mdl/`): | ||
| - `manifest.rs` - `Manifest`, `Model`, `Column`, `Metric`, `Relationship`, `View`, `RowLevelAccessControl`, `ColumnLevelAccessControl` | ||
| - `builder.rs` - Fluent `ManifestBuilder` API | ||
| - Uses `wren-manifest-macro` for auto-generating Pydantic-compatible Python classes | ||
|
|
||
| ## Running ibis-server Tests Locally | ||
|
|
||
| Required environment variables (see `.github/workflows/ibis-ci.yml` for CI values): | ||
| ```bash | ||
| export QUERY_CACHE_STORAGE_TYPE=local | ||
| export WREN_ENGINE_ENDPOINT=http://localhost:8080 | ||
| export WREN_WEB_ENDPOINT=http://localhost:3000 | ||
| export PROFILING_STORE_PATH=file:///tmp/profiling | ||
| ``` | ||
|
|
||
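Before running the tests, it can help to confirm that the four variables listed above are actually set. The following is a minimal sketch: `check_test_env` is a hypothetical helper, and only the variable names are taken from the list above.

```bash
# Hypothetical preflight check; only the variable names come from the list above.
check_test_env() {
    missing=0
    for name in QUERY_CACHE_STORAGE_TYPE WREN_ENGINE_ENDPOINT \
                WREN_WEB_ENDPOINT PROFILING_STORE_PATH; do
        # eval gives POSIX-portable indirect expansion of the variable named in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

if check_test_env; then
    echo "environment looks complete"
fi
```

The function reports every missing variable instead of stopping at the first one, so a single run shows everything that still needs to be exported.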
| On macOS, `psycopg2` may fail to build due to missing OpenSSL linkage: | ||
| ```bash | ||
| LDFLAGS="-L$(brew --prefix openssl)/lib" CPPFLAGS="-I$(brew --prefix openssl)/include" just install | ||
| ``` | ||
|
|
||
| Connector tests use testcontainers (Docker required). Example running a single connector: | ||
| ```bash | ||
| just test clickhouse # runs pytest -m clickhouse | ||
| ``` | ||
|
|
||
| Fallback: if wren-core (v3) fails, ibis-server retries via wren-core-legacy (Java, v2). | ||
| TPCH test data is generated via DuckDB's TPCH extension (`CALL dbgen(sf=0.01)`) and loaded into the testcontainer at module scope. See `tests/routers/v3/connector/clickhouse/conftest.py` for the pattern. | ||
|
|
||
| ## Known wren-core Limitations | ||
|
|
||
| Key files: `ibis-server/app/routers/v3/connector.py`, `wren-core/core/src/logical_plan/analyze/` | ||
| **ModelAnalyzeRule - correlated subquery column resolution**: The `ModelAnalyzeRule` in `wren-core/core/src/logical_plan/analyze/` cannot resolve outer column references inside correlated subqueries. It only sees the subquery's own table scope. This affects TPCH Q2, Q4, Q15, Q17, Q20, Q21, Q22. See `ibis-server/tests/routers/v3/connector/clickhouse/TPCH_ISSUES.md`. | ||
|
|
||
| ## Conventions | ||
|
|
||
| - **Commits**: Conventional commits: `feat:`, `fix:`, `chore:`, `refactor:`, `test:`, `docs:`, `perf:`, `deps:` | ||
| - **Releases**: Automated via release-please | ||
| - **Rust**: `cargo fmt`, `clippy -D warnings`, `taplo fmt` for TOML | ||
| - **Python**: `ruff` (line-length 88, Python 3.11 target), Poetry for deps | ||
| - **Snapshot tests**: wren-core uses `insta` | ||
| - **CI**: Rust CI on `wren-core/**`; ibis CI on all PRs; core-py CI on `wren-core-py/**` or `wren-core/**` | ||
| - **Commits**: Conventional commits (`feat:`, `fix:`, `chore:`, `refactor:`, `test:`, `docs:`, `perf:`, `deps:`). Releases are automated via release-please. | ||
| - **Rust**: Format with `cargo fmt`, lint with `clippy -D warnings`, TOML formatting with `taplo`. | ||
| - **Python**: Format and lint with `ruff` (line-length 88, target Python 3.11). Poetry for dependency management. | ||
| - **DataFusion fork**: `https://github.com/Canner/datafusion.git` branch `canner/v49.0.1`. Also forked Ibis: `https://github.com/Canner/ibis.git` branch `canner/10.8.1`. | ||
| - **Snapshot testing**: wren-core uses `insta` for Rust snapshot tests. | ||
| - **CI**: Rust CI runs on `wren-core/**` changes. ibis CI runs on all PRs. Core-py CI runs on `wren-core-py/**` or `wren-core/**` changes. | ||
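The commit-prefix convention above can be checked mechanically. The sketch below assumes only the eight prefixes listed in the conventions; `is_conventional` is a hypothetical helper, and it deliberately ignores scoped forms such as `feat(core):`, which the list above does not mention.

```bash
# Hypothetical check: does a commit subject start with one of the listed types?
# Scoped subjects like "feat(core): ..." are NOT accepted by this sketch.
is_conventional() {
    case "$1" in
        feat:*|fix:*|chore:*|refactor:*|test:*|docs:*|perf:*|deps:*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_conventional "docs: expand CLAUDE.md with full build, architecture, and skills reference"; then
    echo "subject follows the convention"
fi
```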
Fix the module count in this section.
Line 11 says "Four main modules," but the list contains five entries (`wren-core`, `wren-core-base`, `wren-core-py`, `ibis-server`, `mcp-server`). Please make the heading and bullets agree.