feat(memory): add observer-driven recall planning and retrieval prototype by Aaronontheweb · Pull Request #198 · netclaw-dev/netclaw

Aaronontheweb · 2026-03-11T00:06:43Z

Summary

add sidecar-driven memory observation and recall planning with deterministic policy gates, expiry handling, and redesigned eval coverage
sync memory and identity operational guidance, OpenSpec tasks, and rollout eval tooling for the new recall model
add an isolated SQLite-backed deterministic retrieval proof of concept to validate hot-path alternatives against realistic hit/no-hit cases

Validation

dotnet test src/Netclaw.Actors.Tests/Netclaw.Actors.Tests.csproj --filter "FullyQualifiedName~MemoryRedesignedEvalSuiteTests|FullyQualifiedName~MemoryEvalSeedSuiteTests|FullyQualifiedName~MemoryPolicyGatesTests|FullyQualifiedName~SqliteMemoryToolsTests|FullyQualifiedName~SQLiteMemoryStoreTests|FullyQualifiedName~MemorySidecarPromptBuilderTests|FullyQualifiedName~LlmSessionIntegrationTests"
dotnet test src/Netclaw.Daemon.Tests/Netclaw.Daemon.Tests.csproj --filter "FullyQualifiedName~DaemonRuntimeStatusServiceTests"
dotnet test src/Netclaw.MemoryRetrievalPoC.Tests/Netclaw.MemoryRetrievalPoC.Tests.csproj
openspec validate "add-memory-observer-and-recall-planner" --type change --strict
dotnet slopwatch analyze

Aaronontheweb · 2026-03-11T00:10:10Z

+{
+    private static readonly Regex MarkerRegex = new("\\b[A-Z][A-Z0-9_]{2,}\\b", RegexOptions.Compiled);
+    private static readonly Regex TokenRegex = new("[A-Za-z0-9][A-Za-z0-9_-]*", RegexOptions.Compiled);
+    private static readonly HashSet<string> StopWords =


Consider replacing the hand-curated StopWords set with a public baseline list (for example Lucene/Snowball English stop words) plus a Netclaw-specific overlay. That would make this prototype easier to evolve deterministically: generic filler terms come from a standard source, while domain/prompt boilerplate terms like reply, sentence, and similar eval phrasing stay test-driven in a local overlay.
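One way the baseline-plus-overlay idea could look is sketched below. `StopWordSets` is a hypothetical helper, the baseline array is a tiny illustrative sample rather than the actual Lucene or Snowball list, and the overlay holds only the two terms named in the comment (`reply`, `sentence`). The token pattern is copied from the `TokenRegex` shown in the diff.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Usage: tokenize a prompt and drop both generic and domain stop words.
var tokens = StopWordSets.FilterTokens("Reply in one sentence: what is the hotel preference?");
Console.WriteLine(string.Join(", ", tokens));

// Hypothetical helper; names and word lists are illustrative assumptions.
public static class StopWordSets
{
    // Same token pattern as the TokenRegex in the diff above.
    private static readonly Regex TokenRegex =
        new("[A-Za-z0-9][A-Za-z0-9_-]*", RegexOptions.Compiled);

    // Stand-in for a public baseline list; a real one would be loaded from a standard source.
    private static readonly string[] Baseline =
        { "a", "an", "and", "in", "is", "of", "the", "to", "what" };

    // Netclaw-specific overlay for prompt and eval boilerplate, kept small and test-driven.
    private static readonly string[] NetclawOverlay = { "reply", "sentence" };

    // Union of both lists, compared case-insensitively.
    public static readonly HashSet<string> All =
        new(Baseline.Concat(NetclawOverlay), StringComparer.OrdinalIgnoreCase);

    // Returns lowercase tokens in prompt order with stop words removed.
    public static List<string> FilterTokens(string text) =>
        TokenRegex.Matches(text)
            .Select(m => m.Value.ToLowerInvariant())
            .Where(t => !All.Contains(t))
            .ToList();
}
```

Keeping the two lists separate means the generic list can be swapped or updated without touching the domain terms, and each overlay entry can be justified by a specific eval case.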

Aaronontheweb · 2026-03-11T14:09:34Z

Added a deterministic retrieval architecture write-up and a chain of executable PoCs that shift the discussion away from per-turn LLM recall planning and toward a layered hot-path design.

What is now on this branch:
- proposed real-system architecture
- concrete scenario bank for DM/channel/thread retrieval behavior
- executable PoCs covering:
  - deterministic reranking
  - bundle retrieval for composite prompts
  - explainable retrieval traces
  - deterministic scope/request planning
  - coarse candidate selection
  - end-to-end pipeline snapshots

Main architectural conclusion from the experiments:
1. hard scope should be runtime-owned (workspace/channel/thread/DM), not LLM-owned
2. soft scope should come from thread/topic/title + prompt activation
3. SQLite should do cheap candidate narrowing first
4. deterministic reranking should handle most retrieval
5. some prompts are not top-N doc search problems; they are bundle/slot retrieval problems
6. LLMs are better placed at write time (metadata extraction) than in the read-path hot loop

If you want a review path through the code, I’d start with:
-
-
-
-
-
-

The intent here is not to claim the production design is finished, but to make the retrieval problem executable, inspectable, and test-driven before changing the real hot path.
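The first four conclusions can be pictured as a small layered pipeline. The sketch below is not the PoC code: `MemoryItem`, `RequestPlan`, and `RecallPipeline` are hypothetical names, and an in-memory list stands in for the SQLite narrowing step so the example stays self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var memories = new List<MemoryItem>
{
    new("m1", "dm:alice", new[] { "travel", "hotel" }, "Prefers quiet hotels near transit"),
    new("m2", "dm:alice", new[] { "food" }, "Allergic to shellfish"),
    new("m3", "channel:ops", new[] { "travel" }, "Team offsite is in Lisbon"),
};

// The hard scope value is supplied by the runtime context, never by model output.
var requestPlan = new RequestPlan("dm:alice", new[] { "travel", "hotel" });
foreach (var hit in RecallPipeline.Recall(memories, requestPlan, 2))
    Console.WriteLine($"{hit.Id}: {hit.Text}");

// Hypothetical shapes for illustration only.
public sealed record MemoryItem(string Id, string Scope, string[] Facets, string Text);
public sealed record RequestPlan(string HardScope, string[] SoftFacets);

public static class RecallPipeline
{
    public static List<MemoryItem> Recall(IEnumerable<MemoryItem> store, RequestPlan plan, int topN)
    {
        return store
            // 1. Hard scope: an exact-match gate that nothing can bypass.
            .Where(m => m.Scope == plan.HardScope)
            // 2. Cheap candidate narrowing: keep items sharing at least one soft-scope facet.
            .Select(m => (Item: m, Overlap: m.Facets
                .Intersect(plan.SoftFacets, StringComparer.OrdinalIgnoreCase).Count()))
            .Where(c => c.Overlap > 0)
            // 3. Deterministic rerank: facet overlap first, then Id as a stable tie-break.
            .OrderByDescending(c => c.Overlap)
            .ThenBy(c => c.Item.Id, StringComparer.Ordinal)
            .Take(topN)
            .Select(c => c.Item)
            .ToList();
    }
}
```

With the sample data only `m1` is returned: `m3` fails the hard-scope gate and `m2` shares no facet with the plan. The explicit tie-break on `Id` is what keeps repeated runs identical.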

Aaronontheweb · 2026-03-12T22:10:37Z

Pushed latest memory + provider changes. Current PR checks are failing due to GitHub Actions billing/account limits, not code-level test failures. Latest local validation completed successfully for the touched areas: the scan reported "Scan complete: 0 issue(s) found", the targeted memory tests passed, and the build produced the following output:
Determining projects to restore...
All projects are up-to-date for restore.
Netclaw.Tools.Abstractions -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Tools.Abstractions/bin/Debug/net10.0/Netclaw.Tools.Abstractions.dll
Netclaw.Configuration -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Configuration/bin/Debug/net10.0/Netclaw.Configuration.dll
Netclaw.Search -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Search/bin/Debug/net10.0/Netclaw.Search.dll
Netclaw.Security -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Security/bin/Debug/net10.0/Netclaw.Security.dll
Netclaw.Tools.Generators -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Tools.Generators/bin/Debug/netstandard2.0/Netclaw.Tools.Generators.dll
Netclaw.Actors -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Actors/bin/Debug/net10.0/Netclaw.Actors.dll
Netclaw.Channels -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Channels/bin/Debug/net10.0/Netclaw.Channels.dll
Netclaw.Daemon -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Daemon/bin/Debug/net10.0/netclawd.dll

Build succeeded.
0 Warning(s)
0 Error(s)

Time Elapsed 00:00:15.08.

Aaronontheweb added 6 commits March 10, 2026 00:17

feat(memory): scaffold observer-driven recall planning

1323574

Add sidecar-based memory observation and recall planning so durable facts and evidence can be routed through deterministic gates instead of weak lexical recall alone.

feat(memory): add expiry-aware evidence handling

4db16fb

Derive default expiry windows for evidence and trace memories, align curation classes with the new memory model, and exclude expired items from automatic recall.
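A minimal sketch of class-based expiry, assuming a hypothetical `MemoryClass` enum and `ExpiryPolicy` helper. The 30-day and 7-day windows are placeholder values chosen for illustration; the commit does not state the actual defaults.

```csharp
using System;

var sampleCreated = new DateTimeOffset(2026, 3, 1, 0, 0, 0, TimeSpan.Zero);
var sampleNow = new DateTimeOffset(2026, 3, 12, 0, 0, 0, TimeSpan.Zero);
foreach (var cls in Enum.GetValues<MemoryClass>())
    Console.WriteLine($"{cls}: recallable={ExpiryPolicy.IsRecallable(cls, sampleCreated, sampleNow)}");

// Hypothetical curation classes for illustration.
public enum MemoryClass { DurableFact, Evidence, Trace }

public static class ExpiryPolicy
{
    // Placeholder windows (assumptions); null means the class does not expire by default.
    public static TimeSpan? DefaultWindow(MemoryClass cls) => cls switch
    {
        MemoryClass.Evidence => (TimeSpan?)TimeSpan.FromDays(30),
        MemoryClass.Trace => TimeSpan.FromDays(7),
        _ => null,
    };

    // Derives the expiry instant from the creation time and the class default.
    public static DateTimeOffset? ExpiresAt(MemoryClass cls, DateTimeOffset createdAt)
    {
        var window = DefaultWindow(cls);
        if (window is null) return null;
        return createdAt + window.Value;
    }

    // Automatic recall skips anything whose expiry instant has been reached.
    public static bool IsRecallable(MemoryClass cls, DateTimeOffset createdAt, DateTimeOffset now)
    {
        var expiresAt = ExpiresAt(cls, createdAt);
        return expiresAt is null || now < expiresAt.Value;
    }
}
```

For the sample dates, the trace memory created on March 1 is no longer recallable on March 12, while the evidence and durable-fact entries still are.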

feat(memory): complete observer recall planning change

3fb8551

Finish the sidecar-driven memory model with stale evidence controls, identity-boundary routing, redesigned eval coverage, and synced operational guidance for rollout.

fix(evals): honor configured warmup timeouts

995145d

Use the configured prompt timeout for eval warmup requests so local Ollama smoke runs do not fail before the actual cases execute.

test(retrieval): add deterministic memory retrieval prototype

4dcf897

Add an isolated SQLite-backed proof of concept that seeds realistic memories and validates deterministic graph-based recall against hit and no-hit expectations.

fix(memory): harden sidecar runtime handling

86d5008

Use configured recall planner timeouts, normalize common sidecar JSON wrappers, and add focused tests for sidecar response parsing.
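The commit does not list which wrappers are normalized. The sketch below assumes two common ones, a Markdown code fence and surrounding prose, and handles both by extracting the outermost `{ ... }` span before parsing. `SidecarJson.TryNormalize` is a hypothetical name, and parsing uses `System.Text.Json`.

```csharp
using System;
using System.Text.Json;

// Build a fenced reply without writing a literal fence inside this example.
var fence = new string('`', 3);
var sidecarReply = "Here is the plan:\n" + fence + "json\n{\"facets\":[\"travel\"]}\n" + fence;
if (SidecarJson.TryNormalize(sidecarReply, out var cleaned))
    Console.WriteLine(cleaned);

// Hypothetical helper; the wrapper shapes handled here are assumptions.
public static class SidecarJson
{
    public static bool TryNormalize(string raw, out string normalized)
    {
        normalized = string.Empty;
        if (string.IsNullOrWhiteSpace(raw)) return false;

        // Drop anything before the first '{' and after the last '}'.
        int start = raw.IndexOf('{');
        int end = raw.LastIndexOf('}');
        if (start < 0 || end <= start) return false;

        var candidate = raw.Substring(start, end - start + 1);
        try
        {
            using var doc = JsonDocument.Parse(candidate);
            if (doc.RootElement.ValueKind != JsonValueKind.Object) return false;
            normalized = doc.RootElement.GetRawText();
            return true;
        }
        catch (JsonException)
        {
            // Unparseable output is rejected rather than guessed at.
            return false;
        }
    }
}
```

Returning `false` on any parse failure keeps the caller's fallback path explicit, which is the kind of behavior focused parsing tests can pin down.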

Aaronontheweb added 8 commits March 11, 2026 01:49

test(retrieval): evolve deterministic recall prototype

9caa4ca

Expand the retrieval proof of concept with realistic travel preference cases, inferred facet propagation, and diversified result selection to explore deterministic hot-path alternatives.

test(retrieval): add bundle-aware deterministic recall

8e4f06d

Extend the retrieval proof of concept with inferred facets, dynamic slot detection, and bundle retrieval for composite prompts such as travel planning and preference recall.
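Bundle retrieval differs from top-N search in that the result is one best memory per requested slot, including an explicit gap when a slot has nothing. A minimal sketch, assuming hypothetical `SlotMemory` and `BundleRetriever` types and taking the detected slots as input rather than deriving them from the prompt:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var memories = new List<SlotMemory>
{
    new("airline", "Prefers aisle seats", 0.9),
    new("airline", "Flew economy last year", 0.4),
    new("hotel", "Prefers quiet hotels", 0.8),
    new("food", "Vegetarian", 0.7),
};

// Slots a trip-planning prompt might activate (assumed input for this sketch).
var bundle = BundleRetriever.Retrieve(memories, new[] { "airline", "hotel", "car" });
foreach (var (slot, memory) in bundle)
    Console.WriteLine($"{slot}: {memory?.Text ?? "(no memory)"}");

// Hypothetical shape for illustration only.
public sealed record SlotMemory(string Slot, string Text, double Score);

public static class BundleRetriever
{
    // Returns one entry per requested slot, in request order; null marks an unfilled slot.
    public static List<(string Slot, SlotMemory Memory)> Retrieve(
        IEnumerable<SlotMemory> candidates, IEnumerable<string> slots)
    {
        // Pick the best candidate per slot: highest score, then text as a stable tie-break.
        var bySlot = candidates
            .GroupBy(m => m.Slot, StringComparer.OrdinalIgnoreCase)
            .ToDictionary(
                g => g.Key,
                g => g.OrderByDescending(m => m.Score)
                      .ThenBy(m => m.Text, StringComparer.Ordinal)
                      .First(),
                StringComparer.OrdinalIgnoreCase);

        return slots
            .Select(s => (Slot: s, Memory: bySlot.TryGetValue(s, out var m) ? m : null))
            .ToList();
    }
}
```

A plain top-N ranking over the same data could return both airline memories and drop the hotel one; grouping by slot avoids that, and the empty `car` slot stays visible to the caller.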

test(retrieval): add explainable deterministic recall traces

9b369ed

Extend the retrieval proof of concept with explanation output for ranked hits, inferred facets, bundle slots, and corpus-derived neighborhoods so behavior can be inspected case by case.

test(retrieval): reduce hardcoded recall grouping rules

9995225

Shift more of the deterministic retrieval proof of concept toward query-signature-driven facet activation and corpus-derived neighborhoods while preserving the current retrieval and bundle behavior.

test(retrieval): add deterministic scope request planner

491d1fa

Extend the retrieval proof of concept with a deterministic planning layer that derives hard scope, soft scope, retrieval mode, facets, and anchor hints from runtime context and prompt text.
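A rough sketch of what such a planning layer might do, with no LLM call involved. `RuntimeContext`, `RecallPlan`, `RequestPlanner`, the scope string format, and the bundle cue words are all hypothetical; facets and anchor hints are omitted to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var runtimeCtx = new RuntimeContext("ws1", "C42", "T7", "Lisbon trip planning");
var recallPlan = RequestPlanner.Plan(runtimeCtx, "Plan my trip: flights and a hotel");
Console.WriteLine($"{recallPlan.HardScope} | {recallPlan.Mode} | {string.Join(",", recallPlan.SoftTerms)}");

// Hypothetical shapes for illustration only.
public sealed record RuntimeContext(string Workspace, string Channel, string Thread, string ThreadTitle);
public sealed record RecallPlan(string HardScope, string Mode, IReadOnlyList<string> SoftTerms);

public static class RequestPlanner
{
    private static readonly Regex TokenRegex =
        new("[A-Za-z0-9][A-Za-z0-9_-]*", RegexOptions.Compiled);

    // Assumed cue words that switch retrieval from top-N to bundle mode.
    private static readonly HashSet<string> BundleCues =
        new(StringComparer.OrdinalIgnoreCase) { "plan", "trip", "itinerary" };

    public static RecallPlan Plan(RuntimeContext ctx, string prompt)
    {
        // Hard scope is built only from runtime-owned identifiers.
        var hardScope = $"{ctx.Workspace}/{ctx.Channel}/{ctx.Thread}";

        // Soft scope: distinct lowercase tokens from thread title plus prompt, sorted for determinism.
        var terms = TokenRegex.Matches(ctx.ThreadTitle + " " + prompt)
            .Select(m => m.Value.ToLowerInvariant())
            .Distinct()
            .OrderBy(t => t, StringComparer.Ordinal)
            .ToList();

        var mode = terms.Any(BundleCues.Contains) ? "bundle" : "top-n";
        return new RecallPlan(hardScope, mode, terms);
    }
}
```

Because the plan is a pure function of the runtime context and prompt text, the same inputs always produce the same plan, which makes it straightforward to snapshot in tests.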

test(retrieval): add deterministic candidate selection stage

5ea44cc

Extend the retrieval proof of concept with a coarse candidate filter driven by the request plan so planning, filtering, reranking, and bundle retrieval can be exercised together.

test(retrieval): add end-to-end deterministic recall snapshot

e9d3957

Add an executable end-to-end trace for the retrieval proof of concept covering request planning, candidate selection, reranking, bundle assembly, and explain output for a representative trip-planning scenario.

docs(research): propose deterministic memory retrieval architecture

0335b0a

Document a production architecture based on the retrieval proof-of-concept work, covering hard and soft scope, candidate selection, deterministic reranking, bundle retrieval, and explainable tracing.

Aaronontheweb added 10 commits March 11, 2026 15:59

docs(research): refine deterministic memory retrieval design

92a48b8

Update the retrieval architecture with stronger entity and speaker-priority guidance, define the write-time extractor contract, and outline the first minimal production slice for deterministic request planning.

spec(memory): plan deterministic retrieval integration

57586fa

Add an implementation-ready OpenSpec change for deterministic memory retrieval, including proposal, design, tasks, and spec deltas for memory, session, and testing behavior.

feat(memory): add deterministic retrieval planning slice

4dee6ff

Add runtime-owned hard-scope and soft-scope request planning, feature-flagged request-plan logging, and structured degraded-stage observability for the first deterministic recall integration slice.

feat(memory): persist deterministic retrieval metadata

08b9bcb

Add retrieval metadata fields to memory proposals and SQLite persistence, enforce fail-closed validation for malformed durable memory, and update test seeds to the new contract.
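Fail-closed validation means a proposal that is missing required metadata is rejected outright instead of being stored with defaults. A minimal sketch, assuming a hypothetical `MemoryProposal` shape; the required fields shown here are illustrative, not the actual contract.

```csharp
using System;
using System.Collections.Generic;

var ok = new MemoryProposal("durable_fact", "Prefers aisle seats", "user:alice", new[] { "travel" });
var bad = new MemoryProposal("durable_fact", "Prefers aisle seats", "", Array.Empty<string>());
Console.WriteLine(ProposalValidator.Validate(ok).Count);   // 0 problems
Console.WriteLine(ProposalValidator.Validate(bad).Count);  // 2 problems: subject and facets

// Hypothetical proposal shape; field names are assumptions.
public sealed record MemoryProposal(string Kind, string Text, string Subject, string[] Facets);

public static class ProposalValidator
{
    // Returns the list of problems; callers persist only when it is empty.
    public static List<string> Validate(MemoryProposal p)
    {
        var errors = new List<string>();
        if (p is null) { errors.Add("proposal is null"); return errors; }
        if (string.IsNullOrWhiteSpace(p.Kind)) errors.Add("kind is required");
        if (string.IsNullOrWhiteSpace(p.Text)) errors.Add("text is required");
        if (string.IsNullOrWhiteSpace(p.Subject)) errors.Add("subject is required");
        if (p.Facets is null || p.Facets.Length == 0) errors.Add("at least one facet is required");
        return errors;
    }

    // Fail closed: any validation problem means nothing is written.
    public static bool TryAccept(MemoryProposal p, ICollection<MemoryProposal> store)
    {
        if (Validate(p).Count > 0) return false;
        store.Add(p);
        return true;
    }
}
```

Rejecting at write time keeps the deterministic read path simple, since every stored row can be assumed to carry the metadata that ranking depends on.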

feat(memory): wire deterministic recall candidate selection

4062f72

Add deterministic candidate filtering on the recall path, persist retrieval metadata needed for ranking, and update the eval harness to parse the new recall telemetry format.

test(memory): validate deterministic formation metadata

8263748

Persist slot metadata through SQLite writes and add formation-then-recall tests for high-signal travel preference cases so the post-extraction deterministic pipeline is verified end to end.

feat(memory): improve sidecar formation and project recall quality

708af18

fix(providers): add openai-compatible local endpoint support

e7a4335

Merge branch 'fix/lemonade-chat-provider' into dev

e6961c8

Merge branch 'dev' into feature/add-memory-observer-recall-planner

768f3ba

Aaronontheweb added 3 commits March 12, 2026 22:51

feat(providers): add raw openai-compatible transport

608c4f2

fix(memory): widen subject recall across domains

d5958f7

Merge origin/dev into feature/add-memory-observer-recall-planner

91932d2

Aaronontheweb marked this pull request as ready for review March 13, 2026 00:27

test(cli): include openai-compatible provider in listings

db33b7f

Aaronontheweb enabled auto-merge (squash) March 13, 2026 00:30

Aaronontheweb merged commit ee1589f into dev Mar 13, 2026
3 checks passed

Aaronontheweb deleted the feature/add-memory-observer-recall-planner branch March 13, 2026 00:33

Aaronontheweb mentioned this pull request Mar 13, 2026

chore(release): prepare v0.5.0 #221

Merged

4 tasks

Aaronontheweb mentioned this pull request Apr 10, 2026

refactor(memory): delete dead sidecar infrastructure and HardScope plumbing #586

Merged

5 tasks

Aaronontheweb merged 28 commits into dev from feature/add-memory-observer-recall-planner
