feat(memory): add observer-driven recall planning and retrieval prototype #198

Merged
Aaronontheweb merged 28 commits into dev from feature/add-memory-observer-recall-planner on Mar 13, 2026

Conversation

@Aaronontheweb

Collaborator

Summary

  • add sidecar-driven memory observation and recall planning with deterministic policy gates, expiry handling, and redesigned eval coverage
  • sync memory and identity operational guidance, OpenSpec tasks, and rollout eval tooling for the new recall model
  • add an isolated SQLite-backed deterministic retrieval proof of concept to validate hot-path alternatives against realistic hit/no-hit cases

Validation

  • dotnet test src/Netclaw.Actors.Tests/Netclaw.Actors.Tests.csproj --filter "FullyQualifiedName~MemoryRedesignedEvalSuiteTests|FullyQualifiedName~MemoryEvalSeedSuiteTests|FullyQualifiedName~MemoryPolicyGatesTests|FullyQualifiedName~SqliteMemoryToolsTests|FullyQualifiedName~SQLiteMemoryStoreTests|FullyQualifiedName~MemorySidecarPromptBuilderTests|FullyQualifiedName~LlmSessionIntegrationTests"
  • dotnet test src/Netclaw.Daemon.Tests/Netclaw.Daemon.Tests.csproj --filter "FullyQualifiedName~DaemonRuntimeStatusServiceTests"
  • dotnet test src/Netclaw.MemoryRetrievalPoC.Tests/Netclaw.MemoryRetrievalPoC.Tests.csproj
  • openspec validate "add-memory-observer-and-recall-planner" --type change --strict
  • dotnet slopwatch analyze

Add sidecar-based memory observation and recall planning so durable facts and evidence can be routed through deterministic gates instead of weak lexical recall alone.
Derive default expiry windows for evidence and trace memories, align curation classes with the new memory model, and exclude expired items from automatic recall.
Finish the sidecar-driven memory model with stale evidence controls, identity-boundary routing, redesigned eval coverage, and synced operational guidance for rollout.
Use the configured prompt timeout for eval warmup requests so local Ollama smoke runs do not fail before the actual cases execute.
Add an isolated SQLite-backed proof of concept that seeds realistic memories and validates deterministic graph-based recall against hit and no-hit expectations.
Use configured recall planner timeouts, normalize common sidecar JSON wrappers, and add focused tests for sidecar response parsing.
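The wrapper normalization mentioned in the last commit above could look roughly like the following. This is a minimal sketch, not the code from this PR: the class name `SidecarJson`, its method names, and the assumption that a "wrapper" means a Markdown code fence or surrounding prose around a single JSON object are all hypothetical.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper; the names and the wrapper shapes handled here are
// assumptions for illustration, not code taken from this PR.
public static class SidecarJson
{
    // Drops a Markdown code fence or surrounding prose by keeping only the
    // text between the first '{' and the last '}'.
    public static string NormalizeWrapper(string raw)
    {
        if (string.IsNullOrWhiteSpace(raw))
            throw new FormatException("Sidecar response was empty.");

        int start = raw.IndexOf('{');
        int end = raw.LastIndexOf('}');
        if (start < 0 || end <= start)
            throw new FormatException("Sidecar response contained no JSON object.");

        return raw.Substring(start, end - start + 1);
    }

    // Parses the normalized payload. Malformed JSON still surfaces as a
    // JsonException, so the caller can treat it as a failed sidecar turn.
    public static JsonDocument Parse(string raw) =>
        JsonDocument.Parse(NormalizeWrapper(raw));
}
```

Calling `SidecarJson.Parse` on a reply such as `Here is the plan: {"action":"recall"} Done.` yields a document whose root has an `action` property. Slicing between the outermost braces is deliberately crude; a response containing two separate JSON objects would need a stricter approach.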
{
private static readonly Regex MarkerRegex = new("\\b[A-Z][A-Z0-9_]{2,}\\b", RegexOptions.Compiled);
private static readonly Regex TokenRegex = new("[A-Za-z0-9][A-Za-z0-9_-]*", RegexOptions.Compiled);
private static readonly HashSet<string> StopWords =

Collaborator Author

Consider replacing the hand-curated StopWords set with a public baseline list (for example Lucene/Snowball English stop words) plus a Netclaw-specific overlay. That would make this prototype easier to evolve deterministically: generic filler terms come from a standard source, while domain/prompt boilerplate terms like reply, sentence, and similar eval phrasing stay test-driven in a local overlay.
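One way to read the baseline-plus-overlay suggestion is sketched below. It is an illustration under stated assumptions: `StopWordFilter` is a hypothetical name, the baseline array is a small hand-picked subset standing in for a full public list, and the overlay holds only the two terms named in the comment above. The token pattern is the one shown in the diff.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Sketch only: names and list contents are assumptions, not the PR's code.
public static class StopWordFilter
{
    private static readonly Regex TokenRegex =
        new("[A-Za-z0-9][A-Za-z0-9_-]*", RegexOptions.Compiled);

    // Stand-in for a published English list (in practice this would be
    // loaded from a versioned resource rather than typed by hand).
    private static readonly string[] BaselineStopWords =
        { "a", "an", "and", "the", "of", "to", "in", "is", "for", "with" };

    // Project-specific boilerplate terms, kept separate so tests can pin them.
    private static readonly string[] OverlayStopWords = { "reply", "sentence" };

    private static readonly HashSet<string> StopWords = new(
        BaselineStopWords.Concat(OverlayStopWords),
        StringComparer.OrdinalIgnoreCase);

    // Returns lower-cased tokens in prompt order with stop words removed.
    public static IReadOnlyList<string> Tokenize(string text) =>
        TokenRegex.Matches(text)
            .Select(m => m.Value.ToLowerInvariant())
            .Where(t => !StopWords.Contains(t))
            .ToList();
}
```

With these lists, `StopWordFilter.Tokenize("Reply in one sentence with the preferred airline")` keeps `one`, `preferred`, and `airline`. Keeping the overlay in its own array means a change to the generic list and a change to the domain terms show up as separate diffs.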

Expand the retrieval proof of concept with realistic travel preference cases, inferred facet propagation, and diversified result selection to explore deterministic hot-path alternatives.
Extend the retrieval proof of concept with inferred facets, dynamic slot detection, and bundle retrieval for composite prompts such as travel planning and preference recall.
Extend the retrieval proof of concept with explanation output for ranked hits, inferred facets, bundle slots, and corpus-derived neighborhoods so behavior can be inspected case by case.
Shift more of the deterministic retrieval proof of concept toward query-signature-driven facet activation and corpus-derived neighborhoods while preserving the current retrieval and bundle behavior.
Extend the retrieval proof of concept with a deterministic planning layer that derives hard scope, soft scope, retrieval mode, facets, and anchor hints from runtime context and prompt text.
Extend the retrieval proof of concept with a coarse candidate filter driven by the request plan so planning, filtering, reranking, and bundle retrieval can be exercised together.
Add an executable end-to-end trace for the retrieval proof of concept covering request planning, candidate selection, reranking, bundle assembly, and explain output for a representative trip-planning scenario.
Document a production architecture based on the retrieval proof-of-concept work, covering hard and soft scope, candidate selection, deterministic reranking, bundle retrieval, and explainable tracing.
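The bundle retrieval idea in these commits, where a composite prompt is answered by filling named slots rather than by returning the top N documents, might be sketched as follows. The `Memory` record, its `Slot` tag, and the precomputed `Score` are hypothetical shapes chosen for illustration; the PR does not show its actual types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types: Memory, its Slot tag, and Score are illustrative only.
public sealed record Memory(string Id, string Slot, double Score, string Text);

public static class BundleRetrieval
{
    // For each requested slot, keeps the single best-scoring memory.
    // Ties break on Id (ordinal) so repeated runs return the same bundle.
    // Slots with no candidate are simply absent from the result.
    public static IReadOnlyDictionary<string, Memory> Assemble(
        IEnumerable<Memory> candidates, IReadOnlyCollection<string> slots)
    {
        return candidates
            .Where(m => slots.Contains(m.Slot))
            .GroupBy(m => m.Slot)
            .ToDictionary(
                g => g.Key,
                g => g.OrderByDescending(m => m.Score)
                      .ThenBy(m => m.Id, StringComparer.Ordinal)
                      .First());
    }
}
```

For a trip-planning prompt that needs `airline` and `seat`, a hotel memory is ignored even if it scores higher than the seat memory, which is the behavior a plain top-N ranking would not give.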
@Aaronontheweb

Collaborator Author

Added a deterministic retrieval architecture write-up and a chain of executable PoCs that shift the discussion away from per-turn LLM recall planning and toward a layered hot-path design.

What is now on this branch:
- proposed real-system architecture
- concrete scenario bank for DM/channel/thread retrieval behavior
- executable PoCs covering:
  - deterministic reranking
  - bundle retrieval for composite prompts
  - explainable retrieval traces
  - deterministic scope/request planning
  - coarse candidate selection
  - end-to-end pipeline snapshots

Main architectural conclusion from the experiments:
1. hard scope should be runtime-owned (workspace/channel/thread/DM), not LLM-owned
2. soft scope should come from thread/topic/title + prompt activation
3. SQLite should do cheap candidate narrowing first
4. deterministic reranking should handle most retrieval
5. some prompts are not top-N doc search problems; they are bundle/slot retrieval problems
6. LLMs are better placed at write time (metadata extraction) than in the read-path hot loop

If you want a review path through the code, I'd start with:
-
-
-
-
-
-

The intent here is not to claim the production design is finished, but to make the retrieval problem executable, inspectable, and test-driven before changing the real hot path.
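The first four conclusions in the comment above (runtime-owned hard scope, soft scope as a ranking signal, cheap narrowing first, deterministic reranking) can be illustrated with a small in-memory sketch. Everything here is an assumption made for the example: the `ScopedMemory` and `RequestPlan` records, the integer scoring weights, and the use of a LINQ `Where` in place of the SQLite query the comment describes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for illustration; field names are assumptions.
public sealed record ScopedMemory(string Id, string Workspace, string Channel, string[] Facets);
public sealed record RequestPlan(string Workspace, string Channel, string[] Facets);

public static class DeterministicRecall
{
    public static IReadOnlyList<ScopedMemory> Recall(
        IEnumerable<ScopedMemory> store, RequestPlan plan, int limit)
    {
        var wanted = new HashSet<string>(plan.Facets, StringComparer.OrdinalIgnoreCase);

        return store
            // Hard scope: set by the runtime and never relaxed. In a real
            // store this would be a WHERE clause, not an in-memory filter.
            .Where(m => m.Workspace == plan.Workspace)
            // Soft scope and facets only influence the rank.
            .Select(m => (Item: m, Rank: ScoreOf(m, plan, wanted)))
            .Where(x => x.Rank > 0)
            .OrderByDescending(x => x.Rank)
            // Ordinal Id tie-break keeps the output stable across runs.
            .ThenBy(x => x.Item.Id, StringComparer.Ordinal)
            .Take(limit)
            .Select(x => x.Item)
            .ToList();
    }

    // Integer weights (assumed values) avoid floating-point ordering surprises.
    private static int ScoreOf(ScopedMemory m, RequestPlan plan, HashSet<string> wanted)
    {
        int facetHits = m.Facets.Count(f => wanted.Contains(f));
        int channelBoost = m.Channel == plan.Channel ? 1 : 0;
        return facetHits == 0 ? 0 : facetHits * 2 + channelBoost;
    }
}
```

A memory from another workspace never appears regardless of how well its facets match, while a memory from another channel in the same workspace can still be returned at a lower rank.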

Update the retrieval architecture with stronger entity and speaker-priority guidance, define the write-time extractor contract, and outline the first minimal production slice for deterministic request planning.
Add an implementation-ready OpenSpec change for deterministic memory retrieval, including proposal, design, tasks, and spec deltas for memory, session, and testing behavior.
Add runtime-owned hard-scope and soft-scope request planning, feature-flagged request-plan logging, and structured degraded-stage observability for the first deterministic recall integration slice.
Add retrieval metadata fields to memory proposals and SQLite persistence, enforce fail-closed validation for malformed durable memory, and update test seeds to the new contract.
Add deterministic candidate filtering on the recall path, persist retrieval metadata needed for ranking, and update the eval harness to parse the new recall telemetry format.
Persist slot metadata through SQLite writes and add formation-then-recall tests for high-signal travel preference cases so the post-extraction deterministic pipeline is verified end to end.
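Fail-closed validation, as named in the commits above, means a proposal is rejected whenever a required field is missing or malformed instead of being stored with defaults. A minimal sketch follows; the `MemoryProposal` fields and the specific rules are hypothetical, since the PR's actual contract is not shown here.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical contract: field names and rules are assumptions chosen to
// illustrate fail-closed behavior, not the PR's actual schema.
public sealed record MemoryProposal(string? Content, string? Scope, IReadOnlyList<string>? Facets);

public static class MemoryProposalValidator
{
    // Returns true only when every required field is present and well formed.
    // Any doubt rejects the proposal, so malformed memory is never persisted.
    public static bool TryValidate(MemoryProposal? proposal, out string error)
    {
        if (proposal is null) { error = "proposal missing"; return false; }
        if (string.IsNullOrWhiteSpace(proposal.Content)) { error = "content missing"; return false; }
        if (string.IsNullOrWhiteSpace(proposal.Scope)) { error = "scope missing"; return false; }
        if (proposal.Facets is null || proposal.Facets.Count == 0) { error = "facets missing"; return false; }

        foreach (var facet in proposal.Facets)
        {
            if (string.IsNullOrWhiteSpace(facet)) { error = "blank facet"; return false; }
        }

        error = string.Empty;
        return true;
    }
}
```

The caller would persist a proposal only when `TryValidate` returns true and log the `error` string otherwise, so a bad extraction degrades to "nothing written" rather than to a partially populated row.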
@Aaronontheweb

Collaborator Author

Pushed the latest memory and provider changes. Current PR checks are failing because of GitHub Actions billing/account limits, not code-level test failures. The latest local validation completed successfully for the touched areas: a scan (Scan complete: 0 issue(s) found), the targeted memory tests, and a build with the following output:
Determining projects to restore...
All projects are up-to-date for restore.
Netclaw.Tools.Abstractions -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Tools.Abstractions/bin/Debug/net10.0/Netclaw.Tools.Abstractions.dll
Netclaw.Configuration -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Configuration/bin/Debug/net10.0/Netclaw.Configuration.dll
Netclaw.Search -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Search/bin/Debug/net10.0/Netclaw.Search.dll
Netclaw.Security -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Security/bin/Debug/net10.0/Netclaw.Security.dll
Netclaw.Tools.Generators -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Tools.Generators/bin/Debug/netstandard2.0/Netclaw.Tools.Generators.dll
Netclaw.Actors -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Actors/bin/Debug/net10.0/Netclaw.Actors.dll
Netclaw.Channels -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Channels/bin/Debug/net10.0/Netclaw.Channels.dll
Netclaw.Daemon -> /home/petabridge/repositories/stannardlabs/netclaw/src/Netclaw.Daemon/bin/Debug/net10.0/netclawd.dll

Build succeeded.
0 Warning(s)
0 Error(s)

Time Elapsed 00:00:15.08.

@Aaronontheweb Aaronontheweb marked this pull request as ready for review March 13, 2026 00:27
@Aaronontheweb Aaronontheweb enabled auto-merge (squash) March 13, 2026 00:30
@Aaronontheweb Aaronontheweb merged commit ee1589f into dev Mar 13, 2026
3 checks passed
@Aaronontheweb Aaronontheweb deleted the feature/add-memory-observer-recall-planner branch March 13, 2026 00:33
@Aaronontheweb Aaronontheweb mentioned this pull request Mar 13, 2026
4 tasks