This repository was archived by the owner on May 7, 2026. It is now read-only.
Merged
18 changes: 10 additions & 8 deletions cli-skills/wren-usage/SKILL.md
@@ -13,9 +13,12 @@ The `wren` CLI queries databases through an MDL (Model Definition Language) sema

Two files drive everything (auto-discovered from `~/.wren/`):
- `mdl.json` — the semantic model
- `connection_info.json` — database credentials
- `connection_info.json` — database credentials + `datasource` field (e.g. `"datasource": "postgres"`)

The data source is always read from `connection_info.json`. There is no `--datasource` flag on execution commands (`query`, `dry-run`, `validate`). Only `dry-plan` accepts `--datasource` / `-d` as an override (for transpile-only use without a connection file).

For memory-specific decisions, see [references/memory.md](references/memory.md).
For SQL syntax, CTE-based modeling, and error diagnosis, see [references/wren-sql.md](references/wren-sql.md).

---

@@ -47,8 +50,6 @@ GROUP BY 1 ORDER BY 2 DESC LIMIT 5'

**SQL rules:**
- Target MDL model names, not database tables
- Use `CAST(x AS type)`, not `::type`
- Avoid correlated subqueries — use JOINs or CTEs
- Write dialect-neutral SQL — the engine translates

### Step 4 — Handle the result
@@ -74,15 +75,17 @@ GROUP BY 1 ORDER BY 2 DESC LIMIT 5'
### Connection error

1. Check: `cat ~/.wren/connection_info.json`
2. Test: `wren --sql "SELECT 1"`
3. Valid datasource values: `postgres`, `mysql`, `bigquery`, `snowflake`, `clickhouse`, `trino`, `mssql`, `databricks`, `redshift`, `spark`, `athena`, `oracle`, `duckdb`
2. Verify the `datasource` field is present and valid
3. Test: `wren --sql "SELECT 1"`
4. Valid datasource values: `postgres`, `mysql`, `bigquery`, `snowflake`, `clickhouse`, `trino`, `mssql`, `databricks`, `redshift`, `spark`, `athena`, `oracle`, `duckdb`
5. Both flat format (`{"datasource": ..., "host": ...}`) and MCP envelope format (`{"datasource": ..., "properties": {...}}`) are accepted
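
The datasource checks in the list above can be sketched in Python. This is an illustrative sketch, not the CLI's own validation code: the function name `check_connection_info` is hypothetical, and the set of accepted values is copied from the list above.

```python
# Datasource values copied from the list above.
VALID_DATASOURCES = {
    "postgres", "mysql", "bigquery", "snowflake", "clickhouse", "trino",
    "mssql", "databricks", "redshift", "spark", "athena", "oracle", "duckdb",
}


def check_connection_info(info: dict) -> list:
    """Return a list of problems found in a parsed connection_info dict.

    Hypothetical helper: it only checks the `datasource` field, not the
    connector-specific fields.
    """
    problems = []
    datasource = info.get("datasource")
    if datasource is None:
        problems.append("missing 'datasource' field")
    elif datasource not in VALID_DATASOURCES:
        problems.append(f"unknown datasource: {datasource!r}")
    return problems
```

To use it, load `~/.wren/connection_info.json` with `json.load` and pass the resulting dict in; an empty list means the `datasource` field looks fine.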

### SQL syntax / planning error

1. Isolate the layer:
- `wren dry-plan --sql "..."` — if this fails, it is an MDL-level issue
- If dry-plan succeeds but execution fails, the DB rejects the translated SQL
2. Common fixes: replace `::` with `CAST()`, replace correlated subqueries with JOINs
2. Compare dry-plan output with the DB error message — see [references/wren-sql.md](references/wren-sql.md) for the CTE rewrite pipeline and common error patterns

---

@@ -117,7 +120,7 @@ wren --sql "SELECT * FROM <changed_model> LIMIT 1"

```
Get data back → wren --sql "..."
See translated SQL only → wren dry-plan --sql "..."
See translated SQL only → wren dry-plan --sql "..." (accepts -d <datasource> if no connection file)
Validate against DB → wren dry-run --sql "..."
Schema context → wren memory fetch -q "..."
Filter by type/model → wren memory fetch -q "..." --type T --model M --threshold 0
@@ -134,5 +137,4 @@ Re-index after MDL change → wren memory index
- Do not guess model or column names — check context first
- Do not store queries the user has not confirmed — success != correctness
- Do not re-index before every query — once per MDL change
- Do not use database-specific syntax — write ANSI SQL
- Do not pass passwords via `--connection-info` if shell history is shared — use `--connection-file`
108 changes: 108 additions & 0 deletions cli-skills/wren-usage/references/wren-sql.md
@@ -0,0 +1,108 @@
# Wren SQL — How CTE-Based Modeling Works

Wren Engine rewrites your SQL by injecting CTEs (Common Table Expressions) that expand each MDL model into its underlying database query. Understanding this mechanism helps you diagnose errors and write correct SQL.

---

## The rewrite pipeline

```
Your SQL (target dialect, e.g. Postgres)
→ parse & qualify all column references (sqlglot)
→ identify which models and columns are referenced
→ per model: wren-core expands the model definition → CTE
→ inject model CTEs into your query
→ output final SQL in target dialect
```

**Example:** Given an MDL with model `orders` backed by table `public.orders` with columns `o_orderkey`, `o_custkey`, `o_totalprice`:

```sql
-- You write:
SELECT o_custkey, SUM(o_totalprice) FROM orders GROUP BY 1

-- Engine produces (via dry-plan):
WITH "orders" AS (
  SELECT "public"."orders"."o_orderkey",
         "public"."orders"."o_custkey",
         "public"."orders"."o_totalprice"
  FROM "public"."orders"
)
SELECT o_custkey, SUM(o_totalprice) FROM orders GROUP BY 1
```

The CTE named `"orders"` shadows the model name, so the rest of your SQL runs against the CTE as if it were a table.
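
The shadowing idea can be sketched with plain string building in Python. This is a toy illustration under stated assumptions, not the engine's implementation: the real rewriter parses SQL with sqlglot and delegates model expansion to wren-core, while the hypothetical helpers `build_model_cte` and `inject_model_ctes` below only concatenate text and do not handle user SQL that already starts with `WITH`.

```python
def build_model_cte(model: str, table: str, columns: list) -> str:
    """Render one model as a CTE selecting its columns from the backing table.

    Hypothetical helper; `table` is a dotted name such as "public.orders".
    """
    quoted_table = ".".join(f'"{part}"' for part in table.split("."))
    select_list = ",\n         ".join(f'{quoted_table}."{col}"' for col in columns)
    return f'"{model}" AS (\n  SELECT {select_list}\n  FROM {quoted_table}\n)'


def inject_model_ctes(user_sql: str, ctes: list) -> str:
    """Prepend model CTEs so model names in user_sql resolve to the CTEs."""
    return "WITH " + ",\n".join(ctes) + "\n" + user_sql


orders_cte = build_model_cte(
    "orders", "public.orders", ["o_orderkey", "o_custkey", "o_totalprice"]
)
final_sql = inject_model_ctes(
    "SELECT o_custkey, SUM(o_totalprice) FROM orders GROUP BY 1", [orders_cte]
)
print(final_sql)
```

Because the user's query is appended unchanged, `FROM orders` now binds to the CTE rather than to a database table, which is the shadowing effect described above.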

---

## What the rewriter handles

| Feature | Supported |
|---------|-----------|
| `SELECT *` from a model | Yes — expands to all non-hidden, non-relationship columns |
| JOINs between models | Yes — each model gets its own CTE |
| Subqueries referencing models | Yes — outer column references are resolved |
| Table aliases (`FROM orders o`) | Yes — alias tracking maps back to models |
| User-defined CTEs (`WITH x AS (...)`) | Yes — model CTEs are prepended before user CTEs |
| `RECURSIVE` WITH clauses | Yes — preserved |
| Calculated fields / metrics | Yes — wren-core expands them inside the model CTE |
| `COUNT(*)` without columns | Yes — model CTE selects `1` (only needs rows) |

---

## SQL rules for writing queries

1. **Use model names, not database table names** — write `FROM orders`, not `FROM public.orders`
2. **Write dialect-neutral SQL** — the engine translates to the target database dialect
3. **Column names must match the MDL** — use the names defined in `mdl.json`, not the underlying database column names
4. **Hidden columns are excluded** — columns with `"isHidden": true` are not available in `SELECT *`
5. **Relationship columns are excluded** — relationship fields don't appear as selectable columns; use JOINs instead

---

## Diagnosing errors with dry-plan

`dry-plan` shows the expanded SQL without executing it. This separates MDL-level issues from database-level issues.

### Step 1 — Run dry-plan

```bash
wren dry-plan --sql "SELECT o_custkey, SUM(o_totalprice) FROM orders GROUP BY 1"
```

### Step 2 — Interpret the result

| dry-plan result | Meaning | Fix |
|-----------------|---------|-----|
| **Succeeds** with valid SQL | MDL layer is fine; if execution fails, the database rejects the translated SQL | Read the DB error against the dry-plan output — the issue is in the generated SQL or DB state |
| **Fails** with "No model references found" | Your FROM clause doesn't match any MDL model name | Check model names: `wren memory fetch -q "<name>" --type model --threshold 0` |
| **Fails** with column error | A column you referenced doesn't exist in the model | Check columns: `wren memory fetch -q "<col>" --model <name> --threshold 0` |
| **Fails** with qualify error | sqlglot can't resolve an ambiguous or unknown column | Qualify the column explicitly: `model_name.column_name` |

### Step 3 — Compare dry-plan output with DB error

When execution fails but dry-plan succeeds, compare them side by side:

```bash
# Get the expanded SQL
wren dry-plan --sql "SELECT ..." 2>&1

# Run against DB and capture the error
wren --sql "SELECT ..." 2>&1
```

Common patterns:
- **Type mismatch**: The CTE exposes the raw column type; a function may not accept it in the target dialect
- **Missing table**: The underlying table referenced in the model definition doesn't exist in the database
- **Permission denied**: The DB user lacks access to the underlying tables
- **Syntax difference**: Rare — usually means a sqlglot dialect translation gap
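
The three steps above can be scripted. The sketch below is hypothetical and assumes that `wren` exits with a non-zero status when a command fails; that behavior is not stated in this document, so verify it before relying on the result. Only the two invocations shown above (`wren dry-plan --sql` and `wren --sql`) are used, and the `run` parameter exists so the logic can be exercised without the CLI installed.

```python
import subprocess
from typing import Callable, Tuple

Runner = Callable[[list], Tuple[int, str]]


def run_cli(args: list) -> Tuple[int, str]:
    """Run a command and return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(args, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def isolate_layer(sql: str, run: Runner = run_cli) -> str:
    """Classify a failing query as an MDL-level or database-level problem.

    Assumption: a non-zero exit code means the command failed.
    """
    code, _ = run(["wren", "dry-plan", "--sql", sql])
    if code != 0:
        return "mdl"  # dry-plan failed: model, column, or qualify problem
    code, _ = run(["wren", "--sql", sql])
    if code != 0:
        return "database"  # dry-plan succeeded but the DB rejected the SQL
    return "ok"
```

A result of `"database"` is the case where comparing the dry-plan output with the DB error message is worthwhile.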

---

## Fallback behavior

If the rewriter detects no model references in your SQL (e.g. `SELECT 1` or queries against raw database tables), it falls back to passing the entire query through wren-core's `transform_sql()` directly. This means:

- Queries that don't reference any MDL model still work
- The fallback path does NOT use CTE injection — it transforms the whole query at once
- If you expect model expansion but get none, check that your FROM clause uses model names from the MDL
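
The choice between the two paths can be sketched as a set intersection. This is a hypothetical illustration: it assumes the MDL JSON has a top-level `models` list whose entries carry a `name` key, and that the table names referenced by the query have already been extracted by a SQL parser. Neither `model_names` nor `pick_rewrite_path` is part of the CLI.

```python
def model_names(mdl: dict) -> set:
    """Collect model names from a parsed MDL dict (assumed layout)."""
    return {model["name"] for model in mdl.get("models", [])}


def pick_rewrite_path(referenced_tables: set, mdl: dict) -> str:
    """Return which path a query would take, given the tables it references."""
    if referenced_tables & model_names(mdl):
        return "cte_injection"  # at least one MDL model is referenced
    return "fallback"  # e.g. SELECT 1, or raw database tables only
```

If a query you expected to expand lands on `"fallback"`, the name in its FROM clause does not match any model name in the MDL.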
18 changes: 15 additions & 3 deletions wren/docs/cli.md
@@ -26,6 +26,7 @@ Translate MDL SQL to the native dialect SQL for your data source. No database co

```bash
wren dry-plan --sql 'SELECT order_id FROM "orders"'
wren dry-plan --sql 'SELECT order_id FROM "orders"' -d postgres # explicit datasource, no connection file needed
```

## `wren dry-run`
@@ -47,13 +48,14 @@ wren validate --sql 'SELECT * FROM "NonExistent"'

## Overriding defaults

All flags are optional when `~/.wren/mdl.json` and `~/.wren/connection_info.json` exist:
All flags are optional when `~/.wren/mdl.json` and `~/.wren/connection_info.json` exist.

The data source is always read from the `datasource` field in `connection_info.json` (or the inline `--connection-info` value). Only `dry-plan` accepts `--datasource` / `-d` as an override for transpile-only use without a connection file.

```bash
wren --sql '...' \
--mdl /path/to/other-mdl.json \
--connection-file /path/to/prod-connection_info.json \
--datasource postgres
--connection-file /path/to/prod-connection_info.json
```

Or pass connection info inline:
@@ -63,6 +65,16 @@ wren --sql 'SELECT COUNT(*) FROM "orders"' \
--connection-info '{"datasource":"mysql","host":"localhost","port":3306,"database":"mydb","user":"root","password":"secret"}'
```

Both flat and MCP/web envelope formats are accepted:

```bash
# Flat format
{"datasource": "postgres", "host": "localhost", "port": 5432, ...}

# Envelope format (auto-unwrapped)
{"datasource": "duckdb", "properties": {"url": "/data", "format": "duckdb"}}
```

---

## `wren memory` — Schema & Query Memory
36 changes: 36 additions & 0 deletions wren/docs/connections.md
@@ -2,6 +2,42 @@

The `connection_info.json` file (or `--connection-info` / `--connection-file` flags) requires a `datasource` field plus the connector-specific fields below.

## Accepted formats

**Flat format** — all fields at the top level:

```json
{
  "datasource": "postgres",
  "host": "localhost",
  "port": 5432,
  "database": "mydb",
  "user": "postgres",
  "password": "secret"
}
```

**Envelope format** — connector fields nested under `properties` (used by MCP server and Wren web):

```json
{
  "datasource": "postgres",
  "properties": {
    "host": "localhost",
    "port": 5432,
    "database": "mydb",
    "user": "postgres",
    "password": "secret"
  }
}
```

Both formats are accepted. The CLI auto-flattens the envelope format.
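
Auto-flattening can be sketched as follows. This is a minimal sketch of the idea, not the CLI's actual code: `flatten_connection_info` is a hypothetical name, and letting top-level keys win over keys of the same name inside `properties` is an assumption.

```python
def flatten_connection_info(info: dict) -> dict:
    """Lift fields nested under `properties` to the top level.

    Flat input is returned as a copy. Assumption: on a key clash,
    the top-level value takes precedence.
    """
    properties = info.get("properties")
    if not isinstance(properties, dict):
        return dict(info)  # already flat
    flat = dict(properties)
    flat.update({k: v for k, v in info.items() if k != "properties"})
    return flat


envelope = {
    "datasource": "postgres",
    "properties": {"host": "localhost", "port": 5432},
}
print(flatten_connection_info(envelope))
```

After flattening, both formats yield the same dict, so the rest of a connection routine needs to handle only the flat shape.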

---

## Per-connector fields

## MySQL

```json