
feat(wren): add memory list, forget, dump & load commands#1531

Open
goldmedal wants to merge 4 commits into Canner:main from goldmedal:feat/memory-list-forget-dump-load

Conversation


@goldmedal goldmedal commented Apr 8, 2026

Summary

  • Add wren memory list to browse stored NL-SQL pairs with --source, --limit, --offset, --output flags
  • Add wren memory forget with interactive checkbox UI (InquirerPy) and non-interactive modes (--id, --source --force) for precise pair deletion
  • Add wren memory dump to export pairs as YAML (defaults to project queries.yml, or stdout with -o -)
  • Add wren memory load FILE to import from YAML with three modes: skip-duplicates (default), --upsert, --overwrite
  • Integrate queries.yml into project lifecycle: scaffolded by context init, auto-loaded by memory index (skippable via --no-queries)
  • Add interactive optional dependency group (InquirerPy>=0.3.4) with graceful fallback
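For orientation, here is a minimal sketch of what the scaffolded queries.yml and one curated pair might look like. The top-level keys (version, pairs) come from this PR's scaffold description, and the pair fields (nl, sql, source) from the keys the store code reads; the query itself is a hypothetical placeholder:

```yaml
# Scaffold created by `wren context init` is `version: 1` with an empty pairs list.
version: 1
pairs:
  # A curated NL-SQL pair; `source` defaults to "user" when omitted.
  - nl: total revenue by month        # hypothetical example query
    sql: >-
      SELECT date_trunc('month', created_at) AS month,
             SUM(amount) AS revenue
      FROM orders
      GROUP BY 1
    source: user
```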

Test plan

  • 19 new unit tests covering store methods (list, forget, dump, load) and YAML round-trip
  • All 61 memory tests pass (just test-memory)
  • Lint passes (just lint)
  • Manual test: wren memory list, wren memory list --source seed --output json
  • Manual test: wren memory forget --id 0 --force, wren memory forget --source seed --force
  • Manual test: wren memory dump -o queries.yml then wren memory load queries.yml
  • Manual test: wren memory load queries.yml --dry-run, --upsert, --overwrite
  • Manual test: wren context init creates queries.yml
  • Manual test: wren memory index auto-loads queries.yml

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Project init now creates a starter queries.yml for curated NL→SQL pairs.
    • New memory commands: list, forget, dump, and load to browse, delete, export, and import query pairs (interactive, id-, and source-based modes).
  • Options
    • memory index adds --no-queries to skip best-effort auto-loading of queries.
  • Chores
    • Added optional "interactive" install extra (included in the "all" extra).
  • Tests
    • Added unit tests for query management, import/export, deletion, and CLI YAML round-trip.
  • Documentation
    • Expanded memory guide with workflows, commands, and queries.yml practices.

@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label Apr 8, 2026

coderabbitai bot commented Apr 8, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 59643930-ea14-4b84-8828-b7a0e344c6a0

📥 Commits

Reviewing files that changed from the base of the PR and between f70985d and 2cacbad.

📒 Files selected for processing (2)
  • docs/guide/memory.md
  • wren/src/wren/memory/store.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • wren/src/wren/memory/store.py

📝 Walkthrough

Walkthrough

Adds query-pair management: new MemoryStore query-history APIs, CLI subcommands to list/forget/dump/load pairs with YAML import/export and timestamps, scaffolded queries.yml on context init, and an optional interactive extra (InquirerPy).

Changes

  • Dependency Management: wren/pyproject.toml
    Added [project.optional-dependencies].interactive = ["InquirerPy>=0.3.4"] and included interactive in the all extra.
  • Project Scaffolding: wren/src/wren/context_cli.py
    context init now creates a root queries.yml scaffold (version: 1, pairs: []) and lists it in the generated-files output.
  • Memory CLI: wren/src/wren/memory/cli.py
    Added --no-queries to memory index; new subcommands memory list, memory forget, memory dump, memory load with YAML parsing/dumping, timestamps, --upsert/--overwrite/--dry-run, --source filtering, output handling, and optional interactive selection via InquirerPy.
  • Memory Store APIs: wren/src/wren/memory/store.py
    Added query-history APIs: list_queries, count_queries_by_source, forget_queries_by_ids, forget_queries_by_source, dump_queries, _existing_pairs_index, and load_queries (supports skip/upsert/overwrite semantics and export ordering).
  • Tests: wren/tests/unit/test_memory.py
    New unit tests for listing/pagination/filtering, delete-by-id/source, counting, dump/load behaviors, dedupe/upsert/overwrite semantics, YAML round-trip, and _existing_pairs_index.
  • Docs: docs/guide/memory.md
    Expanded docs covering list/forget/dump/load commands, queries.yml scaffolding and auto-load notes, CLI flags, interactive UX guidance, and agent memory hygiene patterns.
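The dedupe machinery behind load_queries can be pictured with a small standalone sketch. The structure names (exact_set, nl_to_rowid) and row fields (nl_query, sql_query) follow identifiers that appear later in this review thread; the function body itself is an assumption, not the actual store implementation:

```python
def existing_pairs_index(rows):
    """Hypothetical sketch of _existing_pairs_index: build two lookup
    structures from the stored rows.

    exact_set holds (nl, sql) tuples for exact-duplicate skipping;
    nl_to_rowid maps an nl query to its row id for upsert replacement.
    """
    exact_set = set()
    nl_to_rowid = {}
    for row_id, row in enumerate(rows):
        exact_set.add((row["nl_query"], row["sql_query"]))
        nl_to_rowid[row["nl_query"]] = row_id
    return exact_set, nl_to_rowid


rows = [
    {"nl_query": "count users", "sql_query": "SELECT COUNT(*) FROM users"},
    {"nl_query": "list orders", "sql_query": "SELECT * FROM orders"},
]
exact_set, nl_to_rowid = existing_pairs_index(rows)

# Skip-duplicates mode: an incoming pair is skipped only on an exact (nl, sql) match.
incoming = {"nl_query": "count users", "sql_query": "SELECT COUNT(*) FROM users"}
is_duplicate = (incoming["nl_query"], incoming["sql_query"]) in exact_set
```

Upsert mode would instead consult nl_to_rowid to find which existing row to replace when only the nl matches.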

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI as "CLI"
    participant Inquirer as "InquirerPy"
    participant Store as "MemoryStore"
    participant DB as "LanceDB/Storage"

    alt Non-interactive ID (--id)
        User->>CLI: memory forget --id 3 [--force]
        CLI->>Store: forget_queries_by_ids([3])
        Store->>DB: DELETE rows by positional index
        DB-->>Store: deleted count
        Store-->>CLI: deleted count
    else Source + Force (--source --force)
        User->>CLI: memory forget --source appX --force
        CLI->>Store: forget_queries_by_source("appX")
        Store->>DB: DELETE where tags LIKE "source:appX"
        DB-->>Store: deleted count
        Store-->>CLI: deleted count
    else Interactive
        User->>CLI: memory forget
        CLI->>Store: list_queries()
        Store->>DB: SELECT paginated pairs
        DB-->>Store: pairs
        Store-->>CLI: pairs
        CLI->>Inquirer: prompt_checkbox(pairs)
        Inquirer-->>User: display & collect selection
        User->>Inquirer: selects
        Inquirer-->>CLI: selected IDs
        CLI->>Store: forget_queries_by_ids(selected IDs)
        Store->>DB: DELETE rows by positional index
        DB-->>Store: deleted count
        Store-->>CLI: deleted count
    end
    CLI-->>User: report result
sequenceDiagram
    actor User
    participant CLI as "CLI"
    participant YAML as "yaml.safe_load / yaml.dump"
    participant Store as "MemoryStore"
    participant DB as "LanceDB/Storage"

    rect rgba(100, 150, 200, 0.5)
    Note over CLI,YAML: Load workflow
    User->>CLI: memory load queries.yml [--upsert/--overwrite/--dry-run]
    CLI->>YAML: parse file -> pairs
    YAML-->>CLI: pairs list
    CLI->>Store: load_queries(pairs, overwrite, upsert, dry_run)
    Store->>DB: SELECT existing pairs
    DB-->>Store: existing rows
    Store->>Store: _existing_pairs_index() (dedupe/upsert mapping)
    alt dry_run
        Store-->>CLI: summary (loaded/updated/skipped)
    else apply changes
        Store->>DB: INSERT/DELETE as needed
        DB-->>Store: result counts
        Store-->>CLI: loaded/updated/skipped counts
    end
    end

    rect rgba(150, 200, 100, 0.5)
    Note over CLI,YAML: Dump workflow
    User->>CLI: memory dump [--source X] [--output file]
    CLI->>Store: dump_queries(source)
    Store->>DB: SELECT pairs (omit vector)
    DB-->>Store: pairs
    Store-->>CLI: pairs
    CLI->>YAML: _pairs_to_yaml(pairs)
    YAML-->>CLI: YAML string
    alt output specified
        CLI->>FS: write to file or stdout
    else default
        CLI->>FS: write to project/queries.yml
    end
    CLI-->>User: done
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Suggested reviewers

  • douenergy

Poem

🐇 I hopped through code with queries in tow,
I seeded a YAML where small pairs grow.
List, dump, load, forget — tidy paths to explore,
Checkboxes clicked, we prune the memory store.
Hop in, friend, the warren's neater than before.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 53.49%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and specifically summarizes the main changes: adding four new memory subcommands (list, forget, dump, load), which aligns with the substantial work across CLI, store, tests, and documentation.


@goldmedal goldmedal force-pushed the feat/memory-list-forget-dump-load branch from 91b9707 to 99366b9 on April 8, 2026 at 09:09

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (4)
wren/src/wren/memory/store.py (2)

372-392: Table rebuild on delete is acceptable but worth noting.

The forget_queries_by_ids implementation drops and recreates the entire table to delete rows. This is simple and correct, but:

  • Loses any LanceDB-specific metadata or indexes
  • Not efficient for large tables

For the expected use case (small number of curated queries), this is likely fine. If this becomes a bottleneck, consider using LanceDB's native delete capability if available in newer versions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wren/src/wren/memory/store.py` around lines 372 - 392, forget_queries_by_ids
currently drops and recreates the entire _QUERY_TABLE (using self._db.drop_table
and create_table with _query_table_schema), which loses LanceDB-specific
metadata/indexes and is inefficient for large tables; modify this function to
use LanceDB's native row deletion API (or incremental delete methods) to remove
rows by their positional IDs while preserving table metadata and indexes, and
only fall back to the rebuild approach when the LanceDB version/engine doesn't
support native deletes.

473-478: O(n²) complexity when upserting multiple rows.

The load_queries upsert path calls _existing_pairs_index() after each deletion, rebuilding the entire index. For n upserts, this results in O(n²) complexity.

Consider collecting all row IDs to delete first, then performing a single batch delete, then inserting all new rows:

♻️ Suggested optimization for batch upsert
         if upsert:
-            if nl in nl_to_rowid:
-                self.forget_queries_by_ids([nl_to_rowid[nl]])
-                # Re-build index after deletion (row positions shift).
-                _, nl_to_rowid = self._existing_pairs_index()
-                updated += 1
-            else:
-                loaded += 1
+            # First pass: collect rows to delete
+            ids_to_delete = []
+            for p in pairs:
+                nl = p["nl"]
+                if nl in nl_to_rowid:
+                    ids_to_delete.append(nl_to_rowid[nl])
+            
+            # Batch delete
+            if ids_to_delete:
+                self.forget_queries_by_ids(ids_to_delete)
+                updated = len(ids_to_delete)
+            
+            # Insert all pairs
+            for p in pairs:
+                tags = f"source:{p.get('source', 'user')}"
+                self.store_query(
+                    nl_query=p["nl"],
+                    sql_query=p["sql"],
+                    datasource=p.get("datasource"),
+                    tags=tags,
+                )
+            loaded = len(pairs) - updated
+            return {"loaded": loaded, "skipped": 0, "updated": updated}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wren/src/wren/memory/store.py` around lines 473 - 478, The upsert path in
load_queries is rebuilding the full _existing_pairs_index() after every single
forget_queries_by_ids call, causing O(n²) behavior; instead, collect all row IDs
to remove by looking up nl in nl_to_rowid, call forget_queries_by_ids once with
the batch of IDs, then call _existing_pairs_index() a single time before
performing the new inserts and updating updated; update the logic around
nl_to_rowid, forget_queries_by_ids, and the section that increments updated to
operate on the batched delete result rather than per-item.
wren/src/wren/memory/cli.py (2)

489-494: Edge case: _parse_source may not handle multi-word tags correctly.

If tags contains multiple space-separated tags like "source:user verified", the function works correctly. However, if the tags are joined by a delimiter other than whitespace, e.g. comma-separated as "verified,source:user", the whitespace split would fail to extract the source.

Given that the store itself uses the format f"source:{...}" consistently, this is unlikely to be an issue in practice, but worth noting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wren/src/wren/memory/cli.py` around lines 489 - 494, _parse_source currently
only splits tags by whitespace and misses cases like "verified,source:user";
update the function to robustly find a source token anywhere in the tags string
(e.g., use a regex search for a word boundary followed by "source:" and capture
the value, such as r'\bsource:([^,\s]+)') and return that capture if found,
otherwise return "user". Modify the _parse_source implementation to use this
search approach so it works with comma-separated or mixed delimiters while
keeping the same return behavior.

161-220: Consider narrowing the exception handling for queries.yml auto-load.

The broad except (SystemExit, Exception) at line 219 silently swallows all errors, including potential bugs like KeyError or TypeError in the load logic. While best-effort loading is the intent, completely hiding errors makes debugging difficult.

Consider logging a debug message or at least catching more specific exceptions:

♻️ Suggested improvement
         try:
             from wren.context import discover_project_path  # noqa: PLC0415

             project_path = discover_project_path(explicit=None)
             queries_file = project_path / "queries.yml"
             if queries_file.exists():
                 raw = queries_file.read_text(encoding="utf-8")
                 doc = yaml.safe_load(raw)
                 if doc and isinstance(doc, dict) and doc.get("pairs"):
                     load_result = mem_store.load_queries(doc["pairs"], upsert=False)
                     loaded = load_result["loaded"]
                     skipped = load_result["skipped"]
                     if loaded:
                         typer.echo(
                             f"Loaded {loaded} pair(s) from queries.yml"
                             f" ({skipped} skipped).",
                             err=True,
                         )
-        except (SystemExit, Exception):
-            pass  # queries.yml loading is best-effort
+        except (SystemExit, FileNotFoundError, PermissionError, yaml.YAMLError):
+            pass  # queries.yml loading is best-effort; ignore known failure modes
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wren/src/wren/memory/cli.py` around lines 161 - 220, The final broad except
(SystemExit, Exception) around the queries.yml auto-load should be replaced with
narrower, explicit exception handling: catch import/IO errors (e.g.,
FileNotFoundError, PermissionError, IsADirectoryError, UnicodeDecodeError), YAML
parsing errors (yaml.YAMLError), and the specific runtime errors you expect from
parsing/loading (e.g., KeyError, TypeError, ValueError) instead of Exception;
when one of these occurs, log a brief debug/error message including the
exception details (use typer.echo(..., err=True) or the project's logger) so
failures are visible but do not fail the command; update the try/except that
wraps discover_project_path, queries_file.read_text, yaml.safe_load, and
mem_store.load_queries and include the exception variable in the log to aid
debugging while preserving best-effort behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@wren/src/wren/memory/store.py`:
- Around line 353-362: list_queries currently resets the DataFrame index
(reset_index(drop=True)) so the assigned _row_id (offset + i) reflects the
position in the filtered/sorted view, which no longer matches the original table
indices used by forget_queries_by_ids; change list_queries to preserve the
original DataFrame index when sorting (e.g., use df.sort_values("created_at",
ascending=False).reset_index() or otherwise retain the original index column),
then when building results set each record's _row_id from that original index
value (the preserved index column) instead of offset + i so the IDs align with
the unfiltered table used by forget_queries_by_ids.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b4f0daaf-45e7-4d9c-8ce4-bc3e4ee2d762

📥 Commits

Reviewing files that changed from the base of the PR and between 40bf46f and 91b9707.

⛔ Files ignored due to path filters (1)
  • wren/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • wren/pyproject.toml
  • wren/src/wren/context_cli.py
  • wren/src/wren/memory/cli.py
  • wren/src/wren/memory/store.py
  • wren/tests/unit/test_memory.py


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@wren/src/wren/memory/store.py`:
- Around line 466-485: The loop builds exact_set once from existing rows but
never updates it when new rows are inserted, so duplicate pairs in the incoming
pairs list are all inserted; update exact_set (and nl_to_rowid when appropriate)
immediately after inserting a new pair so subsequent iterations see the newly
inserted pair: inside the non-upsert branch where you do loaded += 1, add
exact_set.add((nl, sql)) after the insert; similarly in the upsert branch when
you insert (the path where nl was not in nl_to_rowid and you increment loaded)
add exact_set.add((nl, sql)) and nl_to_rowid[nl] = new_row_id (or call
_existing_pairs_index() if you prefer) so duplicates in the same input batch are
detected and skipped.
- Line 399: The WHERE clause in wren/src/wren/memory/store.py currently uses a
loose LIKE pattern that can match substrings; update the assignment of the local
variable where (the line using f"tags LIKE '%source:{_esc(source)}%'") to use an
exact tag match consistent with list_queries — i.e., match the full tag token
delimited by tag boundaries (commas/starts/ends) rather than any substring.
Locate the assignment to where in the function that builds the query and replace
the pattern so it only selects rows where a tag equals "source:{source}" (or
matches as a complete token using leading/trailing delimiters), preserving the
use of _esc(source) for escaping.
- Around line 350-351: The current filter df =
df[df["tags"].str.contains(f"source:{source}", na=False)] does substring
matching and will match unintended sources; change the filter on the "tags"
column to match the exact source token (e.g., use a regex with boundaries and
escape the source value or split the comma-separated tags and check membership)
so only tags like "source:<exact>" match; update the expression that references
df["tags"].str.contains(...) accordingly (preserve df and "tags" column usage).
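The exact-token matching asked for above can be sketched in plain Python. tags_have_source is a hypothetical helper, not the store's actual predicate builder, but it shows why whole-token comparison avoids the substring false positives flagged in these comments:

```python
def tags_have_source(tags: str, source: str) -> bool:
    """Split the tags field on commas and whitespace, then compare whole
    tokens, so "source:user2" never matches source "user"."""
    tokens = tags.replace(",", " ").split()
    return f"source:{source}" in tokens


# The substring approach wrongly matches a different source...
substring_hit = "source:user" in "source:user2 verified"
# ...while the token approach does not.
token_hit = tags_have_source("source:user2 verified", "user")
```

The same token-boundary idea applies to the SQL side: the LIKE pattern should match the complete "source:{source}" tag delimited by commas or string ends, not any substring.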

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0e959a77-b6ac-41dc-88ab-688b4797598c

📥 Commits

Reviewing files that changed from the base of the PR and between 91b9707 and 99366b9.

⛔ Files ignored due to path filters (1)
  • wren/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • wren/pyproject.toml
  • wren/src/wren/context_cli.py
  • wren/src/wren/memory/cli.py
  • wren/src/wren/memory/store.py
  • wren/tests/unit/test_memory.py
✅ Files skipped from review due to trivial changes (4)
  • wren/src/wren/context_cli.py
  • wren/pyproject.toml
  • wren/tests/unit/test_memory.py
  • wren/src/wren/memory/cli.py


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
wren/src/wren/memory/cli.py (1)

393-403: Consider using _parse_source for consistency.

The source extraction logic here uses substring matching ("source:seed" in tags) while _parse_source (line 501-506) parses more robustly by splitting on whitespace. For consistency and to handle edge cases, consider reusing _parse_source:

 def _format_choice_label(row: dict, max_nl: int = 40, max_sql: int = 50) -> str:
     """Format a row as a human-readable choice label."""
-    source = "user"
-    tags = row.get("tags", "")
-    if "source:seed" in tags:
-        source = "seed"
-    elif "source:view" in tags:
-        source = "view"
+    source = _parse_source(row.get("tags", ""))
     nl = row.get("nl_query", "")[:max_nl]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wren/src/wren/memory/cli.py` around lines 393 - 403, _format_choice_label
currently determines source by substring checks on the tags string; replace that
ad-hoc logic with a call to the existing _parse_source helper to ensure
consistent parsing: call _parse_source(row.get("tags", "")) (or pass tags
variable) and use its return value (falling back to "user" if it returns falsy)
instead of the manual "source:seed" / "source:view" checks; keep the rest of the
function unchanged so nl and sql slicing behavior remains intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0d084560-fc36-4071-990b-c47db8e9875c

📥 Commits

Reviewing files that changed from the base of the PR and between 99366b9 and db5afe2.

📒 Files selected for processing (2)
  • wren/src/wren/memory/cli.py
  • wren/src/wren/memory/store.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • wren/src/wren/memory/store.py

goldmedal and others added 3 commits on April 8, 2026 at 21:47
Enable browsing, deleting, exporting, and importing NL-SQL pairs in
the LanceDB query_history. This gives users precise control over memory
quality instead of relying solely on `memory reset`.

New CLI commands:
- `wren memory list` — browse pairs with --source, --limit, --offset
- `wren memory forget` — interactive (InquirerPy) or non-interactive
  (--id/--source --force) deletion
- `wren memory dump` — export pairs to YAML (project queries.yml or stdout)
- `wren memory load FILE` — import from YAML with skip/upsert/overwrite modes

Project integration:
- `context init` scaffolds queries.yml
- `memory index` auto-loads queries.yml (--no-queries to skip)
- `memory dump` defaults to project queries.yml when inside a project

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix _row_id mismatch: preserve original DataFrame index in list_queries
  so forget_queries_by_ids deletes the correct rows when source filter is
  applied
- Use exact tag match (== / =) instead of substring matching (contains /
  LIKE) across all source-filtering methods
- Prevent duplicate pairs within input list during default load mode by
  updating exact_set after each insert
- Batch upsert: collect all IDs to delete first, single batch delete,
  then insert all — eliminates O(n²) index rebuilding
- Narrow exception handling in queries.yml auto-load to explicit types

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…es.yml

Add documentation for the new memory management commands and project
integration workflow: browsing pairs, interactive/non-interactive forget,
YAML dump/load, queries.yml scaffolding, auto-load on index, agent
hygiene tips, and full command reference table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@goldmedal goldmedal force-pushed the feat/memory-list-forget-dump-load branch from db5afe2 to f70985d on April 8, 2026 at 13:48
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Apr 8, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/guide/memory.md`:
- Around line 234-244: Add a language specifier to the fenced code block that
shows the directory tree (the block starting with triple backticks and the
project_root/ tree) by changing the opening fence from ``` to ```text (or
```plaintext) so the markdown linter recognizes it as a non-code block; leave
the block contents unchanged and only update the opening fence.

In `@wren/src/wren/memory/store.py`:
- Around line 470-489: The upsert block inserts duplicates when the input pairs
contains repeated nl_query values because nl_to_rowid only covers existing rows;
fix by deduplicating or tracking during the upsert: before deleting/ inserting,
build a map from nl (use p["nl"] / nl_query) to the last pair (or iterate pairs
and keep the last occurrence) so you only delete once and insert once per nl, or
alternatively maintain an inserted_nl set inside the insertion loop (around
store_query calls) to skip subsequent pairs with the same nl; update references
to nl_to_rowid, ids_to_delete, and the store_query loop accordingly (use the
deduped_pairs map or inserted_nl set to compute updated/loaded counts
correctly).
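The deduplication suggested above (keep only the last occurrence of each nl before the batch delete and insert) reduces to a one-line dict comprehension in Python, since dicts preserve insertion order and later assignments to the same key overwrite earlier ones; the sample pairs are hypothetical:

```python
pairs = [
    {"nl": "count users", "sql": "SELECT COUNT(*) FROM users"},
    {"nl": "count users", "sql": "SELECT COUNT(1) FROM users"},  # later duplicate wins
    {"nl": "list orders", "sql": "SELECT * FROM orders"},
]

# Keying on nl keeps exactly one entry per query, the last one seen,
# while preserving the first-seen position of each key.
deduped = list({p["nl"]: p for p in pairs}.values())
```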

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: db25b879-dddc-46d4-bbb2-2d1cab3677e1

📥 Commits

Reviewing files that changed from the base of the PR and between db5afe2 and f70985d.

⛔ Files ignored due to path filters (1)
  • wren/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • docs/guide/memory.md
  • wren/pyproject.toml
  • wren/src/wren/context_cli.py
  • wren/src/wren/memory/cli.py
  • wren/src/wren/memory/store.py
  • wren/tests/unit/test_memory.py
✅ Files skipped from review due to trivial changes (1)
  • wren/src/wren/context_cli.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • wren/tests/unit/test_memory.py
  • wren/src/wren/memory/cli.py

- Upsert mode now deduplicates input pairs by nl_query (last wins)
  before batch delete + insert, preventing duplicate rows when the
  same nl_query appears multiple times in the input
- Add language specifier to fenced code block in memory docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
