
chore(skills): add wren-dlt-connector skill v1.0#1535

Merged
douenergy merged 2 commits into Canner:main from goldmedal:chore/dlt-skill
Apr 9, 2026

Conversation

Contributor

goldmedal commented Apr 9, 2026

Summary

  • Add wren-dlt-connector skill (v1.0) with version/license/metadata in SKILL.md frontmatter
  • Register the new skill in versions.json, index.json, SKILLS.md, README.md, and install.sh
  • Passes check-versions.sh validation

Test plan

  • bash skills/check-versions.sh — all versions match
  • bash skills/install.sh wren-dlt-connector — installs correctly with dependency

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added wren-dlt-connector skill to import SaaS data via dlt into DuckDB and auto-generate a Wren semantic project; included in the default install set.
  • Documentation

    • Added full skill guide, usage references for popular dlt sources, CLI introspection workflow, evaluation test cases, and troubleshooting guidance.

Add version/license/metadata to SKILL.md frontmatter and register the
new skill in versions.json, index.json, SKILLS.md, README.md, and
install.sh.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
github-actions bot added the `documentation` label Apr 9, 2026
Contributor

coderabbitai bot commented Apr 9, 2026

📝 Walkthrough

A new skill wren-dlt-connector is added to the skills registry, enabling users to connect SaaS data sources via dlt pipelines into DuckDB and auto-generate Wren semantic projects. This includes metadata updates, documentation, a Python introspection script, and evaluation configurations.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Skill Registry & Metadata**<br>`skills/README.md`, `skills/SKILLS.md`, `skills/index.json`, `skills/versions.json`, `skills/install.sh` | Added wren-dlt-connector (version 1.0) entry with description, tags, and dependency on wren-generate-mdl. Updated ALL_SKILLS default list and installer logic to handle remote index.json. |
| **Skill Documentation**<br>`skills/wren-dlt-connector/SKILL.md`, `skills/wren-dlt-connector/references/dlt_sources.md` | Added comprehensive skill guide (four-phase workflow, constraints, commands, troubleshooting) and a reference of popular dlt sources with auth/credentials and pipeline examples. |
| **Skill Implementation & Tests**<br>`skills/wren-dlt-connector/scripts/introspect_dlt.py`, `skills/wren-dlt-connector/evals/evals.json` | Added CLI Python script to introspect dlt-produced DuckDB, normalize types, detect relationships, and generate Wren v2 project files; added three eval scenarios covering common ingestion workflows. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant dlt as dlt Pipeline
    participant DuckDB
    participant Introspector as introspect_dlt.py
    participant Wren as Wren CLI

    User->>dlt: configure & run pipeline
    dlt->>DuckDB: write tables (.duckdb)
    User->>Introspector: run introspect_dlt.py --duckdb-path
    Introspector->>DuckDB: read information_schema
    Introspector->>Introspector: normalize types & detect relationships
    Introspector->>Wren: write wren_project.yml + models + relationships
    User->>Wren: wren context build && run validation queries
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Suggested labels

python

Suggested reviewers

  • douenergy
  • wwwy3y3

Poem

🐰 I hopped through schemas, chased tables by name,
Dlt seeds the DuckDB, Wren tends the game.
Models appear as carrots, relationships bloom,
A tiny rabbit cheers—your pipeline found room! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 77.78%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title accurately describes the main change: adding a new skill named wren-dlt-connector at version 1.0, which aligns with the changeset across all skill registration files and documentation. |




coderabbitai bot left a comment


Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/install.sh`:
- Line 16: Dependency expansion is skipped in remote mode because index.json is
only read from the local filesystem. In the install flow that handles remote
execution, fetch index.json and parse it into the same variable/data structure
used for local installs before the dependency-resolution logic (which reads
ALL_SKILLS and expands required skills) runs, so that both local and remote
installs resolve dependencies from the same data and a remote install of
wren-dlt-connector correctly pulls in wren-generate-mdl.

In `@skills/wren-dlt-connector/references/dlt_sources.md`:
- Around lines 87-103: The Slack usage example calls datetime(2024, 1, 1)
without importing datetime, which raises a NameError. Add
`from datetime import datetime` to the example header, alongside the existing
imports for dlt and slack_source, so the start_date argument passed to
slack_source(...) works.
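The fix amounts to a single added import. A minimal sketch of the corrected example header, with the dlt and slack_source imports (assumed from the original snippet) left as comments since only the datetime part is at issue:

```python
# Corrected header: the original example called datetime(2024, 1, 1) without
# importing it, so the only change needed is the import below.
from datetime import datetime

# import dlt                        # assumed imports from the original example
# from slack import slack_source    # (verified-sources module; not needed here)

# The value the example passes as start_date=... to slack_source(...):
start_date = datetime(2024, 1, 1)
print(start_date.isoformat())  # → 2024-01-01T00:00:00
```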

In `@skills/wren-dlt-connector/scripts/introspect_dlt.py`:
- Around lines 193-210: Model identity and parent/child relationship detection
are built from the bare t.name, and output files are written using t.name,
which causes cross-schema collisions and unsafe file paths. Replace every use
of plain t.name (the table_names set, the parent-detection loop over parts,
Relationship entries, and output path construction) with a schema-qualified
identifier (e.g., t.schema or t.schema_name combined with t.name), and
sanitize any filesystem segment derived from an identifier (via a slugify /
safe-name routine or pathlib-safe escaping) so lookup and path writing use the
same schema-qualified, sanitized token.
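A minimal sketch of the kind of qualification-and-sanitization helper the comment asks for; the function names are illustrative, not taken from introspect_dlt.py:

```python
import re

def safe_segment(name: str) -> str:
    """Reduce an identifier to a filesystem-safe path segment."""
    # Collapse any run of characters outside [A-Za-z0-9_] into a single "_".
    return re.sub(r"[^A-Za-z0-9_]+", "_", name).strip("_") or "unnamed"

def qualified(schema: str, table: str) -> str:
    """Schema-qualified token used for both lookup and output paths."""
    return f"{safe_segment(schema)}__{safe_segment(table)}"

# Two tables named "orders" in different schemas no longer collide,
# and dots/slashes never leak into directory names:
print(qualified("stripe_data", "orders"))  # stripe_data__orders
print(qualified("shopify.v2", "orders"))   # shopify_v2__orders
```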

In `@skills/wren-dlt-connector/SKILL.md`:
- Around lines 107-110: The validation text says "Print how many rows were
loaded", but the sample SQL actually prints column counts per table. Reword
that line (and the similar phrasing at lines 115-125) to something like "Print
the number of columns per table", and make sure the bullet under "After the
run, confirm:" refers to column counts rather than rows, matching the sample
SQL output.
- Around lines 223-231: The join example uses the hidden internal columns
`_dlt_parent_id` and `_dlt_id`, which conflicts with the rule that `_dlt_*`
columns are not exposed in Wren models. Rewrite the example queries to join on
visible model columns (explicit foreign keys or publicly exposed id fields
such as `parent_id` and `id`), or add a note showing how to reference
relationships via available fields instead of `_dlt_parent_id`/`_dlt_id`.
- Around lines 180-182: When loading profiles, yaml.safe_load can return None
for an empty file. Make the result fall back to an empty dict (e.g.,
`yaml.safe_load(profiles_file.read_text()) or {}`) so the variable `existing`
is always a dict, then proceed to call existing.setdefault("profiles", {}) as
before.
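The guard in the last comment can be sketched as follows; `load_profiles` is a hypothetical wrapper (PyYAML assumed) standing in for the code that reads profiles_file:

```python
import yaml  # PyYAML; assumed available since the skill already parses YAML

def load_profiles(text: str) -> dict:
    # yaml.safe_load returns None for an empty document; the `or {}`
    # fallback keeps `existing` a dict so setdefault never raises.
    existing = yaml.safe_load(text) or {}
    existing.setdefault("profiles", {})
    return existing

print(load_profiles(""))                      # {'profiles': {}}
print(load_profiles("profiles:\n  dev: {}"))  # {'profiles': {'dev': {}}}
```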

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6ca96bbd-731b-4a03-92a2-c47274536d11

📥 Commits

Reviewing files that changed from the base of the PR and between 9d2639f and c91b92f.

📒 Files selected for processing (9)
  • skills/README.md
  • skills/SKILLS.md
  • skills/index.json
  • skills/install.sh
  • skills/versions.json
  • skills/wren-dlt-connector/SKILL.md
  • skills/wren-dlt-connector/evals/evals.json
  • skills/wren-dlt-connector/references/dlt_sources.md
  • skills/wren-dlt-connector/scripts/introspect_dlt.py

- Fetch index.json in remote install mode for dependency resolution
- Add missing datetime import in Slack dlt_sources example
- Schema-scope relationship detection to prevent cross-schema collisions
- Sanitize filesystem paths and qualify model names on collision
- Fix "rows loaded" text to match actual column-count output
- Guard yaml.safe_load against empty profiles.yml
- Replace _dlt_* column join example with queryable model examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/wren-dlt-connector/SKILL.md`:
- Around lines 35-37: The code fence containing the ATTACH DATABASE
'stripe_data.duckdb' AS "stripe_data" (READ_ONLY) statement lacks a language
identifier; change the opening fence from a bare ``` to ```sql so the snippet
is marked as SQL for linting and readability in SKILL.md.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 671f8422-fd69-4132-8c74-c4db6d9ce212

📥 Commits

Reviewing files that changed from the base of the PR and between c91b92f and 959499f.

📒 Files selected for processing (4)
  • skills/install.sh
  • skills/wren-dlt-connector/SKILL.md
  • skills/wren-dlt-connector/references/dlt_sources.md
  • skills/wren-dlt-connector/scripts/introspect_dlt.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • skills/install.sh
  • skills/wren-dlt-connector/scripts/introspect_dlt.py

Comment on lines +35 to +37
```
ATTACH DATABASE 'stripe_data.duckdb' AS "stripe_data" (READ_ONLY)
```
Contributor


⚠️ Potential issue | 🟡 Minor

Add a language identifier to the SQL code fence.

This fence is a regular snippet (not a blockquote notification fence), so adding `sql` improves lint compliance and readability.

Proposed fix

````diff
-```
+```sql
 ATTACH DATABASE 'stripe_data.duckdb' AS "stripe_data" (READ_ONLY)
````

🧰 Tools

🪛 markdownlint-cli2 (0.22.0)

[warning] 35-35: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @skills/wren-dlt-connector/SKILL.md around lines 35-37: The SQL code fence
containing the ATTACH DATABASE 'stripe_data.duckdb' AS "stripe_data"
(READ_ONLY) statement should include a language identifier; update the fenced
block to start with ```sql instead of a bare ``` so the snippet is marked as
SQL for linting and readability in SKILL.md.


goldmedal requested a review from douenergy Apr 9, 2026 08:38
douenergy merged commit 8f20a33 into Canner:main Apr 9, 2026
5 checks passed

Labels

documentation Improvements or additions to documentation


2 participants