fix(memory): add composite-score floor to recall ranker (#582) by Aaronontheweb · Pull Request #585 · netclaw-dev/netclaw

Aaronontheweb · 2026-04-10T19:55:52Z

Summary

Adds a composite-score floor in SQLiteMemoryRecallCoordinator so automatic recall can return zero items when nothing is a strong enough match (default 10.0; nullable Session.Tuning.MinimumRecallCompositeScore override for power users).
Filters sentence-start capitalized stopwords (How, Can, Which, The, My, etc.) out of the planner's anchor-hint regex — they were being treated as semantic anchors worth +8 selector points plus +3.5 soft-scope points, pulling unrelated ops docs into recall on any English question.
Disables DomainAffinityWeight in DeterministicCandidateSelector. The domain scoping concept is half-implemented (tracked in #584, "Memory domain scoping is half-implemented: all sessions collapse to project:default": Protocol.SessionId.ToMemoryDomain() always returns project:default), and the +5 affinity boost was making the floor unable to discriminate in-domain single-lexical collisions from legitimate two-lexical cross-domain matches.
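
To make the floor behavior concrete, here is a minimal sketch in Python (the real coordinator is C# and its code is not shown on this page). The `Candidate` type, the `apply_floor` helper, and the choice of `>=` for "clears the bar" are assumptions made for illustration; only the 10.0 default and the nullable override come from the summary above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

DEFAULT_FLOOR = 10.0  # default floor stated in this PR


@dataclass(frozen=True)
class Candidate:
    # Hypothetical shape; the real candidate type is not shown in the PR.
    memory_id: str
    composite_score: float


def apply_floor(
    candidates: List[Candidate], override: Optional[float] = None
) -> Tuple[List[Candidate], int, float]:
    """Drop candidates below the floor.

    Returns (kept, filtered_count, applied_floor). A None override models
    the nullable Session.Tuning.MinimumRecallCompositeScore setting.
    """
    floor = DEFAULT_FLOOR if override is None else override
    # Assumption: a score exactly equal to the floor is accepted.
    kept = [c for c in candidates if c.composite_score >= floor]
    return kept, len(candidates) - len(kept), floor
```

With this shape, a turn whose best candidate scores below the floor yields an empty list rather than a forced top-N, which is the behavior change the first bullet describes.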

New test coverage:

MemoryRecallScenarioTests (new): 17-scenario theory against a 16-memory corpus modeled on the pollution shape from #582 ("Memory recall unconditionally injects top-N items with no score floor, polluting unrelated sessions"): ops/eval trivia plus two topical bands. Seeded in project:default so it actually reproduces the production coordinator's hard-scope normalization.
DeterministicCandidateSelectorTests: four score-geometry cases documenting the lexical/facet/anchor gradient as ordered inequalities.
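
The "ordered inequalities" idea can be sketched as follows. This is a simplified additive model in Python, not the selector's actual scoring code: the anchor weight (+8) and the affinity boost (+5) come from this PR, while the `LEXICAL` and `FACET` values and the `score` helper are hypothetical, chosen only so that the model matches the accept/reject behavior described for the 10.0 floor.

```python
# ANCHOR and DOMAIN_AFFINITY are taken from the PR text.
# LEXICAL and FACET are hypothetical values for illustration only.
LEXICAL = 6.0
FACET = 5.0
ANCHOR = 8.0
DOMAIN_AFFINITY = 5.0
FLOOR = 10.0


def score(lexical=0, facet=0, anchor=0, in_domain=False, affinity_enabled=False):
    """Simplified additive score; the real selector may combine signals differently."""
    total = lexical * LEXICAL + facet * FACET + anchor * ANCHOR
    if affinity_enabled and in_domain:
        total += DOMAIN_AFFINITY
    return total


# Geometry around the floor with affinity disabled (the state after this PR).
assert score(lexical=1) < FLOOR               # single-lexical only: rejected
assert score(lexical=2) >= FLOOR              # two-lexical: accepted
assert score(lexical=1, facet=1) >= FLOOR     # lexical + facet: accepted
assert score(lexical=1, anchor=1) >= FLOOR    # lexical + anchor: accepted

# With affinity enabled, an in-domain single-lexical collision clears the
# floor, which is the discrimination problem the PR describes.
assert score(lexical=1, in_domain=True, affinity_enabled=True) >= FLOOR
```

Writing the cases as inequalities rather than exact totals means the tests keep passing if individual weights are retuned, as long as the relative ordering against the floor holds.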

The memory_retrieval_final log now reports filteredByFloor={Count} appliedFloor={Floor:F1} so the floor's behavior is observable in daemon logs.
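
As a rough illustration of what the two new fields look like when rendered, the sketch below mimics the `{Floor:F1}` format (one decimal place) in Python. Only the two field names and the F1 format come from this PR; the `format_floor_fields` helper is hypothetical, and the rest of the memory_retrieval_final line is not shown here.

```python
def format_floor_fields(filtered_count: int, applied_floor: float) -> str:
    # ".1f" is the Python analogue of .NET's "F1": fixed-point, one decimal.
    return f"filteredByFloor={filtered_count} appliedFloor={applied_floor:.1f}"
```

For example, three candidates dropped under the default floor would render as `filteredByFloor=3 appliedFloor=10.0`.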

Closes #582. Opens #584 for the domain-scoping follow-up.

Test plan

dotnet test src/Netclaw.Actors.Tests/Netclaw.Actors.Tests.csproj — 843 passed / 0 failed
dotnet slopwatch analyze — 0 issues
Baseline run on the unpatched coordinator reproduced 7 scenario failures (the known pollution patterns); all 7 now pass
CI green on this PR

Automatic recall was unconditionally injecting the top-N candidates on every turn regardless of how weak the match was, so sparse or eval-seeded DBs polluted unrelated sessions with operational trivia. Three related fixes land together:

1. Apply a minimum composite-score floor in SQLiteMemoryRecallCoordinator so recall can legitimately return zero items when nothing clears the bar. Default floor is 10.0 (rejects single-lexical-only matches; accepts two-lexical, lexical+facet, or lexical+anchor matches). Power-user override: Session.Tuning.MinimumRecallCompositeScore (nullable, no value written in config by default).
2. Filter sentence-start capitalized stopwords out of the planner's anchor-hint regex. Words like "How", "Can", "Which", "The", "My" were being treated as semantic anchors worth +8 selector points plus +3.5 soft-scope points, pulling unrelated ops/eval docs into recall on any English question. Adds a planner-local AnchorHintStopWords set that is broader than the tokenizer's narrow stopword list so noun meanings of "can" and "will" remain lexically matchable.
3. Disable DomainAffinityWeight in DeterministicCandidateSelector. The concept is half-implemented (#584): Protocol.SessionId.ToMemoryDomain() unconditionally returns project:default, so affinity was a coin flip rather than a real "prefer same-project memories" signal, and the +5 boost made the floor unable to discriminate in-domain single-lexical collisions from legitimate two-lexical cross-domain matches.

New test coverage in Netclaw.Actors.Tests:

- MemoryRecallScenarioTests: 17-scenario theory driven by a 16-memory corpus that mirrors the #582 pollution shape (ops/eval band + two topical bands). Seeds into project:default so it actually reproduces the production coordinator's hard-scope normalization, not a cross-domain regime that silently measures different math.
- DeterministicCandidateSelectorTests: four score-geometry cases documenting the lexical/facet/anchor gradient as ordered inequalities so the floor can be tuned against stable reference points.

memory_retrieval_final log now reports filteredByFloor and appliedFloor so the floor's behavior is observable in daemon logs.

Closes #582.
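
The stopword fix in item 2 can be sketched like this. It is a Python illustration under stated assumptions, not the planner's code: the `CAPITALIZED` regex and the `anchor_hints` helper are hypothetical, and the stopword set below contains only the five words this PR names, whereas the real AnchorHintStopWords set is described as broader.

```python
import re

# Only the words named in this PR; the real set is larger.
ANCHOR_HINT_STOP_WORDS = frozenset({"how", "can", "which", "the", "my"})

# Hypothetical anchor-hint pattern: capitalized tokens of two or more characters.
CAPITALIZED = re.compile(r"\b[A-Z][A-Za-z0-9]+\b")


def anchor_hints(query: str) -> list:
    """Return capitalized tokens that are not planner-level stopwords."""
    return [
        word
        for word in CAPITALIZED.findall(query)
        if word.lower() not in ANCHOR_HINT_STOP_WORDS
    ]
```

Under these assumptions, `anchor_hints("How can I restart the Akka cluster?")` keeps only `Akka`, so the sentence-initial "How" no longer earns anchor points. Because the set is local to the planner, the tokenizer's own stopword list is untouched and a lowercase "can" or "will" in a memory can still match lexically.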

Aaronontheweb enabled auto-merge (squash) April 10, 2026 19:56

Merge branch 'dev' into claude-wt-memory-cleanup

77bb6bd

Aaronontheweb merged commit 99b9053 into dev Apr 10, 2026
3 checks passed

Aaronontheweb deleted the claude-wt-memory-cleanup branch April 10, 2026 20:33

Aaronontheweb mentioned this pull request Apr 10, 2026

refactor(memory): delete dead sidecar infrastructure and HardScope plumbing #586

Merged

5 tasks

Aaronontheweb merged 2 commits into dev from claude-wt-memory-cleanup
