docs: restructure as single source of truth with auto-sync #1530

Merged
goldmedal merged 1 commit into main from docs/restructure-single-source
Apr 8, 2026

@goldmedal goldmedal commented Apr 8, 2026

Summary

  • Restructure docs/ to mirror the doc website layout (get_started/, concept/, guide/, reference/)
  • Move 5 existing doc files into the new structure; backport 11 pages from the doc website
  • Convert all absolute Docusaurus links to relative .md links (works in both GitHub preview and Docusaurus)
  • Add scripts/sync-docs.sh for local manual sync via gh CLI
  • Add .github/workflows/sync-docs.yml — auto-creates a PR on the doc website when docs change on main
  • Target repo/branch configured via repository variables (DOCS_REPO, DOCS_REPO_BRANCH), not hardcoded
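
The relative-link convention in the third bullet can be checked mechanically. The following is a minimal sketch, assuming links are written inline as `[text](path.md)` or `[text](path.md#anchor)`; the function name `check_md_links` is hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash
# check_md_links DIR: print "file: target" for every relative .md link under DIR
# whose target file does not exist. Returns 1 if any link is broken.
check_md_links() {
  local root="$1" status=0 file link target
  while IFS= read -r file; do
    while IFS= read -r link; do
      # grep yields "](path.md#anchor)"; strip the leading "](" and trailing ")".
      target="${link:2:${#link}-3}"
      target="${target%%#*}"   # drop an optional "#anchor"
      case "$target" in
        http://*|https://*|/*) continue ;;  # only relative links are checked
      esac
      if [ ! -f "$(dirname "$file")/$target" ]; then
        echo "${file}: ${target}"
        status=1
      fi
    done < <(grep -oE '\]\([^)[:space:]]+\.md(#[^)]*)?\)' "$file" || true)
  done < <(find "$root" -type f -name '*.md' | sort)
  return "$status"
}
```

Running `check_md_links docs` from the repository root would list any link whose target is missing, which is the kind of error fixed in docs/guide/modeling/model.md.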

Setup required

After merging, set the repository variables:

gh variable set DOCS_REPO -R Canner/wren-engine --body '<owner/repo>'
gh variable set DOCS_REPO_BRANCH -R Canner/wren-engine --body 'master'

And ensure the CROSS_REPO_TOKEN secret has push + PR access to the doc website repo.
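
Because the target is read from repository variables rather than hardcoded, a missing or malformed value only surfaces when the sync runs. The following is a minimal sketch of an early guard, assuming the values reach the script as environment variables of the same names; `require_sync_config` is a hypothetical helper, not something this PR adds.

```shell
#!/usr/bin/env bash
# require_sync_config: fail early when the sync target is not configured.
# ASSUMPTION: DOCS_REPO ("owner/repo") and DOCS_REPO_BRANCH are exported into
# the environment, e.g. from the repository variables set above.
require_sync_config() {
  local missing=0 name
  local repo_re='^[^/[:space:]]+/[^/[:space:]]+$'
  for name in DOCS_REPO DOCS_REPO_BRANCH; do
    # ${!name:-} reads the variable whose name is stored in $name.
    if [ -z "${!name:-}" ]; then
      echo "error: ${name} is not set" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ] || return 1
  # DOCS_REPO must look like owner/repo: exactly one slash, no whitespace.
  if [[ ! "$DOCS_REPO" =~ $repo_re ]]; then
    echo "error: DOCS_REPO must be in owner/repo form, got '${DOCS_REPO}'" >&2
    return 1
  fi
}
```

Calling `require_sync_config || exit 1` before any clone or push keeps a misconfigured run from failing halfway through.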

Test plan

  • Verify all relative links resolve in the GitHub PR file preview
  • Run scripts/sync-docs.sh locally (dry-run) to confirm diff is clean against the doc website
  • After merge, verify the GitHub Action triggers and creates a correct sync PR
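
The dry-run check in the second item amounts to comparing the synced directories on both sides. The following is a minimal sketch, assuming a local checkout of the doc website is available; `docs_diff` and its two path arguments are hypothetical, and the actual `scripts/sync-docs.sh` may work differently.

```shell
#!/usr/bin/env bash
# docs_diff SRC DST: compare the four synced directories between the engine's
# docs/ tree (SRC) and a doc-website checkout (DST) without modifying either.
# Prints the differences and returns 1 if any directory differs.
docs_diff() {
  local src="$1" dst="$2" dir status=0
  for dir in get_started concept guide reference; do
    if [ ! -d "${dst}/${dir}" ]; then
      echo "only in source: ${dir}/"
      status=1
    elif ! diff -rq "${src}/${dir}" "${dst}/${dir}"; then
      # diff -rq lists differing or one-sided files, one per line.
      status=1
    fi
  done
  return "$status"
}
```

An empty output with exit status 0 is what "diff is clean" means here; any listed file is a page that the sync would change.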

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Documentation

    • Added comprehensive documentation covering installation, quick start, and database connection workflows
    • Introduced new concept guides explaining context, MDL (Modeling Definition Language), and benefits for AI agents
    • Added architecture documentation detailing the engine's modular design and data flows
    • Published CLI and skills reference guides with command usage and examples
    • Added memory and profile management guides
  • Chores

    • Implemented automated documentation synchronization to the website

Add scripts/sync-docs.sh for local manual sync via gh CLI.
@github-actions github-actions bot added the documentation (Improvements or additions to documentation) and ci labels Apr 8, 2026
coderabbitai bot commented Apr 8, 2026

Walkthrough

This PR establishes documentation infrastructure and content for Wren Engine. It introduces a GitHub Actions workflow to automatically synchronize documentation changes from the main branch to a separate documentation website repository, along with supporting configuration and tooling. It also adds comprehensive documentation covering installation, core concepts, usage guides, and CLI references.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Documentation Sync Automation | .github/workflows/sync-docs.yml, scripts/sync-docs.sh, docs/.sync.yml | GitHub Actions workflow, bash script, and configuration for automated docs synchronization to external documentation site. Workflow triggers on docs changes to main, validates diffs, and conditionally creates PRs; script enables local testing of sync. |
| Docs Landing Page | docs/README.md | Updated to point to external documentation site (docs.getwren.ai), document GitHub Actions sync behavior, reorganize navigation into grouped topics (Get Started, Concepts, Guides, Reference), and list non-synced files. |
| Concept Documentation | docs/concept/what_is_context.md, docs/concept/what_is_mdl.md, docs/concept/benefits_llm.md, docs/concept/architecture.md | Four new concept pages introducing context as structured business understanding, MDL as modeling language, LLM benefits and agent support, and detailed architecture/control flow with CLI, planning pipeline, connectors, and memory layer. |
| Getting Started Documentation | docs/get_started/installation.md, docs/get_started/connect.md | Two new guides covering CLI installation with prerequisites and extras, skill setup, and end-to-end workflow for connecting to databases, creating profiles, initializing MDL projects, and executing queries. |
| Guide Documentation | docs/guide/modeling/overview.md, docs/guide/profiles.md, docs/guide/memory.md | Three new guides: MDL modeling primitives and use cases, profile-based database connection management, and LanceDB-backed memory layer for schema context and query history. |
| Reference Documentation | docs/reference/cli.md, docs/reference/skills.md | Two new reference pages: comprehensive CLI command documentation (query, dry-plan, dry-run, memory subcommands) and skills framework describing available skills and installation/update mechanisms. |
| Documentation Link Fixes | docs/guide/modeling/model.md | Updated internal documentation link from ./relationship.md to ./relation.md in column reference table. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

documentation, ci

Suggested reviewers

  • douenergy

Poem

🐰 A hop through the docs, now neatly arranged,
From concepts to guides, beautifully changed!
With sync-automation and profiles so neat,
Our documentation's now synced and sweet! ✨
A rabbit-approved workflow, no hops to regret! 🌟

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main objective: restructuring docs as a single source of truth with automatic synchronization to the documentation website. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (5)
docs/concept/what_is_context.md (1)

3-3: Minor: Consider simplifying "in order to" → "to".

Line 3 uses "in order to work" which could be simplified to "to work" for more concise writing. This is a stylistic preference, not an error.

✏️ Proposed simplification
-In Wren Engine, context is the structured business understanding an AI agent needs in order to work with data correctly.
+In Wren Engine, context is the structured business understanding an AI agent needs to work with data correctly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/concept/what_is_context.md` at line 3, Replace the phrase "in order to
work with data correctly" with the more concise "to work with data correctly" in
the sentence that begins "In Wren Engine, context is the structured business
understanding..." so the line reads "...context is the structured business
understanding an AI agent needs to work with data correctly." This is a simple
stylistic change—locate that sentence in what_is_context.md and update the
wording accordingly.
docs/concept/architecture.md (1)

99-117: Add language identifiers to fenced code blocks.

The ASCII/text diagrams in code blocks should specify a language identifier to satisfy markdown linters and improve clarity. While they render correctly without it, adding text would be more explicit.

📝 Proposed fix to add language identifiers

For line 99:

-```
+```text
 User SQL (e.g. SELECT * FROM orders WHERE status = 'pending')

Apply the same pattern to the other three code blocks at lines 176, 188, and 199.

Also applies to: 176-184, 188-195, 199-204

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/concept/architecture.md` around lines 99 - 117, Add the language
identifier `text` to each fenced code block that contains the ASCII/text
diagrams so markdown linters are satisfied; specifically update the block
starting with "User SQL (e.g. SELECT * FROM orders WHERE status = 'pending')"
and the subsequent diagram block that begins with the CTE/WITH example (the
block showing WITH "orders" AS (...)), plus the other three diagram blocks
referenced in the comment, by changing ``` to ```text for those fenced code
blocks (leave the diagram content unchanged).
scripts/sync-docs.sh (1)

82-85: Script may fail if PR already exists for this branch.

Same issue as the workflow: gh pr create will error if a PR from sync/engine-docs-${SHORT_SHA} already exists (e.g., on script re-run). Consider checking for existing PRs first or using --fill with an existing PR check.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/sync-docs.sh` around lines 82 - 85, The PR creation step using gh pr
create (the PR_URL assignment) will fail if a PR already exists for branch
sync/engine-docs-${SHORT_SHA}; update the script to first check for an existing
PR for that head (e.g., use gh pr list or gh pr view filtered by --head
"sync/engine-docs-${SHORT_SHA}" or by searching title/branch) and if found
capture its URL into PR_URL instead of calling gh pr create, otherwise call gh
pr create as before; reference the existing symbols PR_URL, gh pr create,
SHORT_SHA and TARGET_BRANCH when implementing this conditional flow.
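
The conditional flow described above could look roughly like the sketch below. `ensure_pr` is a hypothetical helper, and the title and body strings are illustrative. It relies only on `gh pr list` (`--head`, `--state`, `--json`, `--jq`) and `gh pr create`, which prints the new PR's URL on success.

```shell
#!/usr/bin/env bash
# ensure_pr REPO BRANCH BASE: print the URL of the open PR whose head is BRANCH,
# creating the PR only when none exists.
ensure_pr() {
  local repo="$1" branch="$2" base="$3" url
  # An empty result means no open PR has this head branch.
  url="$(gh pr list -R "$repo" --head "$branch" --state open \
           --json url --jq '.[0].url // empty')"
  if [ -z "$url" ]; then
    url="$(gh pr create -R "$repo" --head "$branch" --base "$base" \
             --title "docs: sync Wren Engine docs from wren-engine" \
             --body "Auto-synced from wren-engine.")"
  fi
  echo "$url"
}
```

In the script this might be used as `PR_URL="$(ensure_pr "$DOCS_REPO" "sync/engine-docs-${SHORT_SHA}" "$TARGET_BRANCH")"`, assuming the target repository is held in a variable such as `DOCS_REPO`, so a re-run reports the existing PR instead of failing.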
.github/workflows/sync-docs.yml (2)

34-37: Directory list is duplicated in three places.

The sync directories are hardcoded here, in scripts/sync-docs.sh (line 30), and declared in docs/.sync.yml. If the sync scope changes, all three must be updated in lockstep.

Consider extracting to a shared source (e.g., reading from .sync.yml via yq) or at minimum adding a comment cross-referencing the other locations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/sync-docs.yml around lines 34 - 37, The hardcoded
directory list (get_started, concept, guide, reference) is duplicated across
.github/workflows/sync-docs.yml, scripts/sync-docs.sh and docs/.sync.yml; update
the workflow to read the canonical list from docs/.sync.yml (e.g., parse with
yq) or at minimum add a clear cross-reference comment pointing to
scripts/sync-docs.sh and docs/.sync.yml; locate the loop in sync-docs.yml (the
for dir in ...; do block) and replace the hardcoded list with a command that
loads directories from docs/.sync.yml via yq (or add the comment) and ensure the
same symbol names (get_started, concept, guide, reference) are used
consistently.
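
One way to get a single source for the directory list is to have both the workflow and the script read it from docs/.sync.yml. The review suggests yq; the sketch below uses awk only to stay dependency-free. The layout it parses (a top-level `directories:` key holding a list) is an assumption for illustration, since the real contents of docs/.sync.yml are not shown here, and `read_sync_dirs` is a hypothetical name.

```shell
#!/usr/bin/env bash
# read_sync_dirs FILE: print one directory per line from a top-level
# "directories:" list. ASSUMPTION: the key name and layout are illustrative;
# adapt them to the real docs/.sync.yml.
read_sync_dirs() {
  awk '
    /^directories:[ \t]*$/ { in_list = 1; next }   # start of the list
    in_list && /^[ \t]*-[ \t]*/ {                  # a "- item" entry
      sub(/^[ \t]*-[ \t]*/, "")
      sub(/[ \t]+$/, "")
      print
      next
    }
    in_list && /^[^ \t#]/ { in_list = 0 }          # next top-level key ends it
  ' "$1"
}
```

The loop could then become `while IFS= read -r dir; do ...; done < <(read_sync_dirs docs/.sync.yml)` in both places, leaving the YAML file as the only list to maintain.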

51-61: Workflow may fail on re-run if branch already exists.

If the workflow is re-triggered for the same commit (e.g., manual re-run), git push origin "${BRANCH}" will fail because the branch already exists on the remote. Similarly, gh pr create will fail if a PR from that branch is already open.

Consider adding guards:

🛠️ Proposed fix
          BRANCH="sync/engine-docs-${GITHUB_SHA::8}"
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git checkout -b "${BRANCH}"
          git add -A
          git commit -m "docs: sync from wren-engine@${GITHUB_SHA::8}"
-         git push origin "${BRANCH}"
-         gh pr create \
+         git push origin "${BRANCH}" --force-with-lease
+         
+         # Skip PR creation if one already exists for this branch
+         if ! gh pr list --head "${BRANCH}" --json number --jq '.[0].number' | grep -q .; then
+           gh pr create \
            --title "docs: sync Wren Engine docs from wren-engine" \
            --body "Auto-synced from [wren-engine@\`${GITHUB_SHA::8}\`](https://github.com/Canner/wren-engine/commit/${GITHUB_SHA})." \
            --base "${{ vars.DOCS_REPO_BRANCH }}"
+         fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/sync-docs.yml around lines 51 - 61, The workflow will fail
on re-run if the branch or PR already exists; update the job around the BRANCH
variable and the git/gh steps to guard against duplicates by first checking for
the remote branch and existing PR before pushing/creating: verify if remote
branch "${BRANCH}" exists and skip or update it (instead of blindly running git
push origin "${BRANCH}"), and before running gh pr create, check for an open PR
from that branch (use gh pr view / gh pr list or the GitHub API) and only call
gh pr create when no PR exists; ensure the logic still creates the branch and
commits when needed and handles idempotent re-runs gracefully.
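
The remote-branch check mentioned in the prompt can be done with `git ls-remote --exit-code`, which exits with status 2 when no ref matches. The following is a minimal sketch with hypothetical function names; it skips the push on a re-run, which is one possible policy, whereas the proposed fix above uses `--force-with-lease` instead.

```shell
#!/usr/bin/env bash
# remote_branch_exists REMOTE BRANCH: succeed if BRANCH already exists on REMOTE.
remote_branch_exists() {
  git ls-remote --exit-code --heads "$1" "refs/heads/$2" >/dev/null 2>&1
}

# push_sync_branch REMOTE BRANCH: push BRANCH unless the remote already has it.
push_sync_branch() {
  local remote="$1" branch="$2"
  if remote_branch_exists "$remote" "$branch"; then
    echo "skip: ${branch} already exists on ${remote}"
  else
    git push --quiet "$remote" "$branch"
    echo "pushed: ${branch}"
  fi
}
```

Skipping is only safe when the branch name encodes the commit, as `sync/engine-docs-${GITHUB_SHA::8}` does, because the existing branch then already holds the same content.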
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/guide/profiles.md`:
- Around line 87-93: The JSON examples include invalid `//` comments; remove the
inline `// Flat format` and `// Envelope format (auto-unwrapped)` from inside
the JSON fences, split them into two separate ```json``` code blocks (one
containing {"datasource":"postgres",...} and the other containing
{"datasource":"duckdb",...}) and add a short plain-text sentence after the
blocks stating "The first example shows the flat format; the second shows the
envelope format (auto-unwrapped)." to preserve the explanations; update the
section in profiles.md where the two commented JSON examples appear.

In `@docs/reference/cli.md`:
- Around line 65-71: Replace the invalid commented JSON examples under the "Flat
format" and "Envelope format (auto-unwrapped)" blocks by removing the bash-code
comments, marking the fences as ```json, and providing valid JSON objects (e.g.,
include full fields instead of "..." in the flat example such as "database",
"user", "password"); specifically update the block titled "Flat format" to a
proper JSON code fence with a complete object and update the "Envelope format
(auto-unwrapped)" block to a ```json fence containing
{"datasource":"duckdb","properties":{"url":"/data","format":"duckdb"}} so both
snippets are valid JSON and free of inline comments.
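
Both JSON findings (profiles.md and cli.md) come down to `//` comment lines inside fences marked as JSON. The following is a minimal sketch of a check that would catch them, assuming fences open with three backticks followed by `json`; `find_json_comments` is a hypothetical helper, not part of this PR.

```shell
#!/usr/bin/env bash
# find_json_comments FILE: print "LINE: text" for every line inside a json code
# fence that starts with "//". Returns 1 if any such line is found.
find_json_comments() {
  awk -v bt='`' '
    BEGIN {
      fence_re = "^[ \t]*" bt bt bt          # any fence line
      json_re  = fence_re "json[ \t]*$"      # a fence that opens a json block
    }
    $0 ~ fence_re {
      if (fence != "") fence = ""            # closing fence
      else fence = ($0 ~ json_re) ? "json" : "other"
      next
    }
    # Only whole-line comments count, so "//" inside a URL value is ignored.
    fence == "json" && /^[ \t]*\/\// { print NR ": " $0; found = 1 }
    END { exit (found ? 1 : 0) }
  ' "$1"
}
```

Running it over docs/guide/profiles.md and docs/reference/cli.md would point at the `// Flat format` and `// Envelope format (auto-unwrapped)` lines if they sit inside JSON fences.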

In `@scripts/sync-docs.sh`:
- Around line 53-56: Guard against empty variables before calling rm by
validating TARGET and each dir are non-empty and not "/" and then use safe
parameter expansion; for example, in the loop over SYNC_DIRS check [ -n
"${TARGET:-}" ] && [ -n "${dir:-}" ] and optionally [ "${TARGET}" != "/" ] and [
"${dir}" != "" ] before running rm, or replace the risky call with a safe
expansion like rm -rf -- "${TARGET:?}/${dir:?}" (and ensure REPO_ROOT is also
validated similarly) so rm cannot accidentally target the filesystem root;
update the loop that references SYNC_DIRS, TARGET, and REPO_ROOT accordingly.
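
The guard described above can be sketched as follows. `clean_sync_dirs` is a hypothetical wrapper around the `rm` call; it combines an explicit check on the target with `${var:?}` expansion, which aborts when a variable is unset or empty (in a non-interactive script this terminates the script).

```shell
#!/usr/bin/env bash
# clean_sync_dirs TARGET DIR...: remove TARGET/DIR for each DIR, refusing to run
# when TARGET is empty or "/", and aborting if any DIR is empty.
clean_sync_dirs() {
  local target="${1:-}" dir
  shift || true
  if [ -z "$target" ] || [ "$target" = "/" ]; then
    echo "error: refusing to clean unsafe target '${target}'" >&2
    return 1
  fi
  for dir in "$@"; do
    # ${var:?} fails the expansion on an empty value, so the path can never
    # collapse to "/" or to the target root itself.
    rm -rf -- "${target:?}/${dir:?}"
  done
}
```

If `SYNC_DIRS` is a bash array, the existing loop could be replaced by `clean_sync_dirs "$TARGET" "${SYNC_DIRS[@]}"`.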


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d91f4d8d-0fac-4401-985c-44dc78f14ef1

📥 Commits

Reviewing files that changed from the base of the PR and between 40bf46f and 1c070ed.

📒 Files selected for processing (20)
  • .github/workflows/sync-docs.yml
  • docs/.sync.yml
  • docs/README.md
  • docs/concept/architecture.md
  • docs/concept/benefits_llm.md
  • docs/concept/what_is_context.md
  • docs/concept/what_is_mdl.md
  • docs/get_started/connect.md
  • docs/get_started/installation.md
  • docs/get_started/quickstart.md
  • docs/guide/memory.md
  • docs/guide/modeling/model.md
  • docs/guide/modeling/overview.md
  • docs/guide/modeling/relation.md
  • docs/guide/modeling/view.md
  • docs/guide/modeling/wren_project.md
  • docs/guide/profiles.md
  • docs/reference/cli.md
  • docs/reference/skills.md
  • scripts/sync-docs.sh

@goldmedal goldmedal requested a review from douenergy April 8, 2026 08:02
@goldmedal goldmedal merged commit 9d2639f into main Apr 8, 2026
8 checks passed
@goldmedal goldmedal deleted the docs/restructure-single-source branch April 8, 2026 13:44

Labels

ci, documentation (Improvements or additions to documentation)
