refactor: harden Mem0 adapter with pre-PR review fixes by Aureliolo · Pull Request #332 · Aureliolo/synthorg

Aureliolo · 2026-03-12T23:25:38Z

Summary

Remove vendor-specific defaults from Mem0EmbedderConfig — provider and model are now required fields (no "openai" / "text-embedding-3-small" defaults), enforcing the vendor-agnostic rule
Harden adapter with defensive parsing (invalid categories, malformed datetimes, missing IDs, empty content), path traversal validation, explicit ImportError handling, real health check probes, and immutable metadata construction
Improve logging — replace logger.exception with logger.warning for self-raised errors, add structured error/error_type kwargs to all re-raise logs
Update factory to require embedder config parameter (no vendor defaults baked in)
Remove dead code (store_request_to_mem0_args) and extract _validate_add_result helper
Add comprehensive tests for all new defensive paths, protocol conformance, and edge cases

Changes

File	Change
`backends/mem0/config.py`	`Mem0EmbedderConfig` fields now required; `Mem0BackendConfig` path traversal validator; `build_config_from_company_config` takes `embedder` kwarg
`backends/mem0/adapter.py`	16 fixes: immutable metadata, `ImportError` branch, health probe, publisher check in retract, generator expression in `search_shared`, meaningful variable names, defensive `_extract_category`, `_validate_add_result` helper
`backends/mem0/mappers.py`	Defensive `try/except` in `parse_mem0_datetime` and `parse_mem0_metadata`; ID/content validation in `mem0_result_to_entry`; remove dead `store_request_to_mem0_args`
`memory/factory.py`	Accept optional `embedder` param, validate type, raise `MemoryConfigError` if missing for mem0 backend
`memory/__init__.py`	Export `Mem0EmbedderConfig`
Tests (6 files)	+14 new test cases, fixtures updated for required embedder, vendor names removed from assertions

Test plan

uv run ruff check src/ tests/ — passes
uv run ruff format src/ tests/ — clean
uv run mypy src/ tests/ — no errors
uv run pytest tests/ -n auto --cov=ai_company --cov-fail-under=80 — 7213 passed, 94.59% coverage

Review coverage

Pre-reviewed by 9 agents (code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, docs-consistency). 33 findings identified, 32 implemented, 1 deferred (_client: Any — Mem0 has no typed interface).

Closes #206

github-actions · 2026-03-12T23:25:52Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

Scorecard details

Package

Version

Score

Details

pip/backoff

2.2.1

🟢 3.4

Details

Check	Score	Reason
Code-Review	⚠️ 2	Found 7/24 approved changesets -- score normalized to 2
Maintained	⚠️ 0	project is archived
Packaging	⚠️ -1	packaging workflow not detected
Token-Permissions	⚠️ 0	detected GitHub workflow tokens with excessive permissions
Dangerous-Workflow	🟢 10	no dangerous workflow patterns detected
Binary-Artifacts	🟢 10	no binaries found in the repo
Pinned-Dependencies	⚠️ 0	dependency not pinned by hash detected -- score normalized to 0
CII-Best-Practices	⚠️ 0	no effort to earn an OpenSSF best practices badge detected
Security-Policy	⚠️ 0	security policy file not detected
Fuzzing	⚠️ 0	project is not fuzzed
License	🟢 10	license file detected
Signed-Releases	⚠️ -1	no releases found
Branch-Protection	⚠️ 0	branch protection not enabled on development/release branches
SAST	🟢 7	SAST tool detected but not run on all commits

pip/greenlet

3.3.2

Unknown

pip/grpcio

1.78.0

Unknown

pip/h2

4.3.0

Unknown

pip/hpack

4.1.0

Unknown

pip/hyperframe

6.1.0

Unknown

pip/mem0ai

1.0.5

Unknown

pip/numpy

2.4.3

Unknown

pip/portalocker

3.2.0

Unknown

pip/posthog

7.9.12

Unknown

pip/protobuf

5.29.6

Unknown

pip/pytz

2026.1.post1

Unknown

pip/qdrant-client

1.17.0

Unknown

pip/sqlalchemy

2.0.48

Unknown

Scanned Files

uv.lock

coderabbitai · 2026-03-12T23:25:57Z

Caution

Review failed

The head commit changed during the review from f7d017c to 28eba7a.

📝 Walkthrough

Walkthrough

Adds a concrete Mem0 memory backend: new Mem0 backend package (config, mappers, adapter), factory dispatch and exports, tests (unit + integration), docs/CI updates, and a pyproject dependency for mem0ai.

Changes

Cohort / File(s)	Summary
Dependency & CI `pyproject.toml`, `.github/workflows/dependency-review.yml`	Adds `mem0ai==1.0.5`, mypy override for `mem0.*`, updates dependency-review allow-list and licenses.
Top-level & backend exports `src/ai_company/memory/__init__.py`, `src/ai_company/memory/backends/__init__.py`, `src/ai_company/memory/backends/mem0/__init__.py`	Re-exports Mem0 symbols (`Mem0EmbedderConfig`, `Mem0MemoryBackend`) from package initializers.
Mem0 Backend: adapter `src/ai_company/memory/backends/mem0/adapter.py`	New `Mem0MemoryBackend` implementing lifecycle, connection guards, per-agent CRUD, shared-knowledge ops, metadata handling, validation, and observability hooks.
Mem0 Backend: config `src/ai_company/memory/backends/mem0/config.py`	Adds `Mem0EmbedderConfig` and `Mem0BackendConfig` (Pydantic), path traversal checks, and helpers `build_mem0_config_dict` and `build_config_from_company_config`.
Mem0 Backend: mappers `src/ai_company/memory/backends/mem0/mappers.py`	Mapping utilities for metadata, datetime parsing, relevance normalization, result→MemoryEntry conversion, query translation, post-filters, and validation helpers.
Factory integration `src/ai_company/memory/factory.py`	`create_memory_backend` now accepts an `embedder` arg and dispatches to construct and return a `Mem0MemoryBackend`, with embedder validation and new observability events.
Observability `src/ai_company/observability/events/memory.py`	Adds `MEMORY_BACKEND_CONFIG_INVALID` event constant.
Tests — integration & unit `tests/integration/memory/test_mem0_backend.py`, `tests/unit/memory/backends/mem0/`, `tests/unit/memory/`	Extensive new tests covering adapter behavior, config validation, mappers, factory, and retrieval pipeline flows using mocked Mem0 client.
Docs & README `CLAUDE.md`, `README.md`, `docs/design/memory.md`, `docs/roadmap/index.md`	Docs updated to mark Mem0 implemented/optional and adjust roadmap/status text.
Other `tests/unit/memory/test_init.py`	Tests updated to assert new top-level exports (`Mem0EmbedderConfig`, `Mem0MemoryBackend`).

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Backend as Mem0MemoryBackend
    participant Mem0 as Mem0<br/>(Memory Instance)
    participant Qdrant as Qdrant<br/>(Vector Store)
    participant SQLite as SQLite<br/>(History DB)

    rect rgba(52, 152, 219, 0.5)
        Note over Client,SQLite: Store Memory Flow
        Client->>Backend: store(agent_id, request)
        Backend->>Mem0: add(namespace, name, description, metadata)
        Mem0->>Qdrant: embed & index description/metadata
        Mem0->>SQLite: persist history record
        Mem0-->>Backend: result with memory_id
        Backend-->>Client: memory_id
    end

    rect rgba(46, 204, 113, 0.5)
        Note over Client,SQLite: Retrieve Memory Flow
        Client->>Backend: retrieve(agent_id, query)
        Backend->>Mem0: search(namespace, text, limit, filters)
        Mem0->>Qdrant: vector search on embeddings
        Qdrant-->>Mem0: ranked results
        Mem0->>SQLite: fetch full records & metadata
        Mem0-->>Backend: results with scores & metadata
        Backend->>Backend: parse_mem0_datetime / parse_mem0_metadata
        Backend->>Backend: apply_post_filters
        Backend-->>Client: filtered MemoryEntry tuples
    end

sequenceDiagram
    actor PublisherAgent as Publisher<br/>(Agent A)
    actor SearcherAgent as Searcher<br/>(Agent B)
    participant Backend as Mem0MemoryBackend
    participant Mem0 as Mem0<br/>(Memory Instance)
    participant Qdrant as Qdrant<br/>(Vector Store)

    rect rgba(231, 76, 60, 0.5)
        Note over PublisherAgent,Qdrant: Publish Shared Memory
        PublisherAgent->>Backend: publish(agent_id, request)
        Backend->>Backend: build_mem0_metadata (publisher_key=agent_id)
        Backend->>Mem0: add(SHARED_NAMESPACE, description, metadata)
        Mem0->>Qdrant: embed & index with publisher metadata
        Mem0-->>Backend: memory_id
        Backend-->>PublisherAgent: memory_id
    end

    rect rgba(155, 89, 182, 0.5)
        Note over SearcherAgent,Qdrant: Search Shared Memory (Exclude Self)
        SearcherAgent->>Backend: search_shared(query, exclude_agent=agent_id)
        Backend->>Mem0: search(SHARED_NAMESPACE, text, limit)
        Mem0->>Qdrant: vector search
        Qdrant-->>Mem0: results with scores
        Mem0-->>Backend: results
        Backend->>Backend: filter out publisher==exclude_agent
        Backend-->>SearcherAgent: shared entries (excluding self)
    end

    rect rgba(241, 196, 15, 0.5)
        Note over PublisherAgent,Qdrant: Retract Shared Memory
        PublisherAgent->>Backend: retract(agent_id, memory_id)
        Backend->>Mem0: get(SHARED_NAMESPACE, memory_id)
        Mem0-->>Backend: result with publisher metadata
        Backend->>Backend: validate publisher_key == agent_id
        Backend->>Mem0: delete(SHARED_NAMESPACE, memory_id)
        Mem0->>Qdrant: remove indexed entry
        Mem0-->>Backend: success
        Backend-->>PublisherAgent: true
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 34.40% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately reflects the main change: hardening the Mem0 adapter with pre-PR review fixes, addressing defensive parsing, logging improvements, and validation enhancements.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset, detailing vendor defaults removal, adapter hardening, logging improvements, factory updates, test additions, and providing clear verification of the changes.
Linked Issues check	✅ Passed	The PR successfully implements the concrete Mem0MemoryBackend adapter meeting all requirements: MemoryBackend protocol implementation [`#206`], MemoryCapabilities protocol support, SharedKnowledgeStore integration, retrieval pipeline integration, and comprehensive test coverage.
Out of Scope Changes check	✅ Passed	All changes are scoped to the Mem0 adapter implementation and hardening. Documentation updates, workflow adjustments for dependencies, and test infrastructure additions are directly supporting the core implementation objective.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/mem0-adapter

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch feat/mem0-adapter

📝 Coding Plan

Generate coding plan for human review comments

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2026-03-12T23:26:41Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors and hardens the Mem0 memory adapter, enhancing its reliability and maintainability. The changes ensure a more robust and vendor-agnostic integration with Mem0, improving error handling, configuration clarity, and overall system resilience. This work addresses pre-PR review findings and includes comprehensive testing to validate the improvements.

Highlights

Vendor-Agnostic Configuration: Removed vendor-specific defaults from Mem0EmbedderConfig, making provider and model required fields to enforce a vendor-agnostic approach.
Adapter Hardening: Improved the Mem0 adapter with defensive parsing for invalid categories, malformed datetimes, missing IDs, and empty content, alongside path traversal validation and explicit ImportError handling.
Improved Logging: Replaced logger.exception with logger.warning for self-raised errors and added structured error/error_type kwargs to all re-raise logs for better observability.
Factory Update: Updated the memory factory to require an embedder configuration parameter for the Mem0 backend, removing baked-in vendor defaults.
Code Cleanup and Refactoring: Removed dead code (store_request_to_mem0_args) and extracted a _validate_add_result helper function for clarity and maintainability.
Comprehensive Testing: Added extensive test cases covering new defensive paths, protocol conformance, and various edge cases to ensure robustness.

Changelog

pyproject.toml
- Added mem0ai to project dependencies.
- Configured MyPy to ignore missing imports for mem0 module.
src/ai_company/memory/init.py
- Updated docstring to include Mem0MemoryBackend.
- Exported Mem0EmbedderConfig and Mem0MemoryBackend.
src/ai_company/memory/backends/init.py
- Added Mem0MemoryBackend to the list of concrete backend implementations.
src/ai_company/memory/backends/mem0/init.py
- Defined the Mem0 backend package structure, exporting its adapter and config components.
src/ai_company/memory/backends/mem0/adapter.py
- Implemented the Mem0MemoryBackend adapter, conforming to MemoryBackend, MemoryCapabilities, and SharedKnowledgeStore protocols.
- Added _validate_add_result helper for robust memory ID extraction.
- Integrated real health check probes and explicit ImportError handling.
- Ensured immutable metadata construction and publisher checks in shared memory operations.
- Improved variable names and defensive parsing for categories and datetimes.
src/ai_company/memory/backends/mem0/config.py
- Introduced Mem0EmbedderConfig with required provider and model fields.
- Defined Mem0BackendConfig including a path traversal validator for data_dir.
- Provided build_mem0_config_dict to format configuration for the Mem0 SDK.
- Added build_config_from_company_config to derive Mem0-specific configuration.
src/ai_company/memory/backends/mem0/mappers.py
- Created stateless mapping functions for converting between SynthOrg domain models and Mem0 dicts.
- Implemented build_mem0_metadata for serializing metadata.
- Added parse_mem0_datetime with robust error handling for malformed datetimes.
- Introduced normalize_relevance_score to clamp scores within a valid range.
- Developed parse_mem0_metadata for deserializing Mem0 metadata.
- Implemented mem0_result_to_entry with validation for missing IDs and empty content.
- Provided query_to_mem0_search_args and query_to_mem0_getall_args for query argument conversion.
- Added apply_post_filters for post-retrieval filtering by category, tags, time, and relevance.
src/ai_company/memory/factory.py
- Modified create_memory_backend to correctly instantiate Mem0MemoryBackend.
- Updated create_memory_backend to require an embedder configuration parameter for Mem0.
tests/integration/memory/test_mem0_backend.py
- Added new integration tests for the Mem0 backend, covering store, retrieve, shared knowledge flow, and counting.
tests/unit/memory/backends/mem0/test_adapter.py
- Added new unit tests for the Mem0 adapter, verifying properties, capabilities, lifecycle, connection guards, CRUD, and shared knowledge operations.
tests/unit/memory/backends/mem0/test_config.py
- Added new unit tests for Mem0 configuration, including validation of embedder and backend settings, and path traversal prevention.
tests/unit/memory/backends/mem0/test_mappers.py
- Added new unit tests for Mem0 mappers, covering metadata handling, datetime parsing, relevance scoring, and query argument generation.
tests/unit/memory/test_factory.py
- Updated unit tests for create_memory_backend to reflect the new Mem0 backend implementation and embedder requirements.
tests/unit/memory/test_init.py
- Updated __all__ exports to include Mem0EmbedderConfig and Mem0MemoryBackend.
uv.lock
- Updated locked dependencies to include mem0ai, backoff, greenlet, grpcio, h2, hpack, hyperframe, numpy, portalocker, posthog, protobuf, pytz, qdrant-client, and sqlalchemy.

Activity

The pull request was pre-reviewed by 9 agents (code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, docs-consistency).
33 findings were identified during the pre-review process.
32 of the identified findings have been implemented.
1 finding (_client: Any — Mem0 has no typed interface) was deferred.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

codecov · 2026-03-12T23:28:22Z

Codecov Report

❌ Patch coverage is 96.27660% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.81%. Comparing base (7a68d34) to head (28eba7a).
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/ai_company/memory/backends/mem0/adapter.py	96.55%	7 Missing ⚠️
src/ai_company/memory/backends/mem0/mappers.py	94.35%	5 Missing and 2 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #332      +/-   ##
==========================================
+ Coverage   93.76%   93.81%   +0.04%     
==========================================
  Files         434      439       +5     
  Lines       19850    20222     +372     
  Branches     1915     1945      +30     
==========================================
+ Hits        18613    18971     +358     
- Misses        956      968      +12     
- Partials      281      283       +2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

greptile-apps · 2026-03-12T23:29:29Z

Greptile Summary

This PR implements and hardens the Mem0 memory backend adapter, completing the backends/mem0/ package with an adapter, config, and mapper layer that plugs into the existing MemoryBackend / SharedKnowledgeStore protocols. The changes are well-scoped and the overall quality is high — required embedder config, immutable models, path-traversal validation, a real health probe, structured logging with error/error_type kwargs, and 94% test coverage.

Key changes:

Mem0EmbedderConfig provider/model are now required fields (no vendor-name defaults), enforcing the vendor-agnostic convention from CLAUDE.md
Adapter hardened with defensive parsing for missing IDs, blank content, malformed datetimes, invalid categories, and ImportError handling
_validate_add_result extracted to mappers.py as validate_add_result, keeping adapter.py under the 800-line limit
retract() ownership-check errors correctly bypass the re-logging handler
Factory updated to validate and require an Mem0EmbedderConfig for the mem0 backend

Issues found:

Logic: In retrieve() and search_shared(), a single malformed Mem0 result item (missing/blank ID or empty content) raises MemoryRetrievalError inside the tuple() call, aborting the entire retrieval and discarding all valid results — a particular concern for search_shared() where one corrupt shared-knowledge entry renders the store inaccessible to all agents.
Style: apply_post_filters silently skips the min_relevance check for entries with relevance_score = None (all get_all() results), which is undocumented and may surprise callers who set min_relevance > 0 on unfiltered queries.

Confidence Score: 3/5

Safe to merge with the caveat that a single malformed shared-knowledge entry can render search_shared inaccessible until manually remediated.
The implementation is thorough, well-tested, and architecturally sound. The one logic issue — batch retrieval failing entirely on a single malformed item — is a real operational risk in production (especially for the shared knowledge store), though it won't cause data loss. The style issue with min_relevance and unscored entries is low-severity. All previously flagged issues in the prior review threads have been addressed except for the ones explicitly deferred.
adapter.py lines 333–335 (retrieve) and 659–665 (search_shared) — single-item failure aborts entire batch

Important Files Changed

Filename	Overview
src/ai_company/memory/backends/mem0/adapter.py	New 785-line adapter implementing MemoryBackend, MemoryCapabilities, and SharedKnowledgeStore protocols. Well-structured with defensive error handling, PEP 758 multi-exception syntax, and clean logging. One logic issue: single malformed Mem0 result inside retrieve()/search_shared() aborts the entire batch instead of skipping the bad entry.
src/ai_company/memory/backends/mem0/mappers.py	New 386-line stateless mapper module with strong defensive parsing for IDs, content, datetime, and metadata. The min_relevance filter silently passes unscored (get_all) entries — undocumented behaviour that could surprise callers. extract_publisher returns blank strings without stripping (separate existing thread).
src/ai_company/memory/backends/mem0/config.py	Mem0EmbedderConfig and Mem0BackendConfig with required embedder fields (no vendor defaults) and a path-traversal validator using both PurePosixPath and PureWindowsPath to guard data_dir. Clean and safe.
src/ai_company/memory/factory.py	Factory updated to accept optional embedder kwarg, validates type and presence for the mem0 backend, and raises MemoryConfigError with descriptive messages. Dead placeholder code removed.
tests/unit/memory/backends/mem0/test_adapter.py	Comprehensive 1117-line unit test suite covering lifecycle, connection guard, CRUD, shared knowledge, and error paths. Uses vendor-agnostic embedder config in all fixtures.
tests/integration/memory/test_mem0_backend.py	Integration tests covering the full store→retrieve→rank→format pipeline and the publish→search_shared→retract flow, using a mocked Mem0 client for isolation.
src/ai_company/memory/init.py	Adds Mem0EmbedderConfig and Mem0MemoryBackend to the public memory package exports. Clean and consistent with existing export style.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant Adapter as Mem0MemoryBackend
    participant Mappers
    participant Mem0 as Mem0 Client (asyncio.to_thread)

    Note over Caller,Mem0: store() / publish()
    Caller->>Adapter: store(agent_id, request)
    Adapter->>Mappers: build_mem0_metadata(request)
    Mappers-->>Adapter: metadata dict
    Adapter->>Mem0: client.add(messages, user_id, metadata, infer=False)
    Mem0-->>Adapter: {results: [{id, memory, event}]}
    Adapter->>Mappers: validate_add_result(result, context)
    Mappers-->>Adapter: NotBlankStr(memory_id)
    Adapter-->>Caller: memory_id

    Note over Caller,Mem0: retrieve() — text search path
    Caller->>Adapter: retrieve(agent_id, query)
    Adapter->>Mem0: client.search(query, user_id, limit)
    Mem0-->>Adapter: {results: [...]}
    loop for each item in results
        Adapter->>Mappers: mem0_result_to_entry(item, agent_id)
        Mappers->>Mappers: parse_mem0_datetime, parse_mem0_metadata, normalize_relevance_score
        Mappers-->>Adapter: MemoryEntry
    end
    Adapter->>Mappers: apply_post_filters(entries, query)
    Mappers-->>Adapter: filtered tuple
    Adapter-->>Caller: tuple[MemoryEntry, ...]

    Note over Caller,Mem0: retract() — ownership-verified delete
    Caller->>Adapter: retract(agent_id, memory_id)
    Adapter->>Mem0: client.get(memory_id)
    Mem0-->>Adapter: raw dict or None
    Adapter->>Mappers: extract_publisher(raw)
    Mappers-->>Adapter: publisher str or None
    alt publisher is None
        Adapter-->>Caller: MemoryStoreError (not shared)
    else publisher != agent_id
        Adapter-->>Caller: MemoryStoreError (ownership mismatch)
    else ownership confirmed
        Adapter->>Mem0: client.delete(memory_id)
        Adapter-->>Caller: True
    end

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/ai_company/memory/backends/mem0/adapter.py
Line: 333-335

Comment:
**Single malformed item aborts entire batch retrieval**

Both `retrieve()` and `search_shared()` build the result tuple by eagerly evaluating a generator that calls `mem0_result_to_entry()` per item. `mem0_result_to_entry` raises `MemoryRetrievalError` for any item with a missing/blank ID or empty content. Because the exception is thrown inside the `tuple()` call, it propagates out of the `try` block and the entire retrieval fails — all valid results from the same batch are discarded.

This is particularly impactful for `search_shared()`: if even one stored shared-knowledge entry is malformed (due to a direct Mem0 write, data corruption, or a bug in a prior version), the whole shared knowledge store becomes **completely inaccessible** to all agents until the corrupt record is manually removed.

`mem0_result_to_entry` already logs a warning before raising, so a try/except skip-and-warn pattern at the call site would preserve that observability while returning partial results instead of failing completely:

```python
# retrieve() – lines 333–335
entries_list: list[MemoryEntry] = []
for item in raw_list:
    try:
        entries_list.append(mem0_result_to_entry(item, str(agent_id)))
    except MemoryRetrievalError:
        pass  # warning already logged inside mem0_result_to_entry
entries = tuple(entries_list)
```

The same pattern should be applied to `search_shared()` lines 659–665.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/ai_company/memory/backends/mem0/mappers.py
Line: 311-316

Comment:
**Unscored entries bypass `min_relevance` filter silently**

The short-circuit on line 313 (`and entry.relevance_score is not None`) means that any entry without a relevance score — which includes **all** results from `get_all()` since Mem0 only computes scores during vector search — will pass the `min_relevance` filter unconditionally, regardless of how high the caller sets it.

A caller who invokes `retrieve()` without `query.text` (which takes the `get_all` path) and also sets `query.min_relevance = 0.5` expecting to restrict results to high-confidence hits will silently receive all stored entries. The current behaviour is defensible ("can't filter what wasn't scored"), but since it diverges from what the parameter name implies and is not documented, it is likely to produce confusing results in practice.

At minimum the docstring for `apply_post_filters` should document that `min_relevance` is a no-op for entries that were not retrieved via vector search.

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: 28eba7a}

gemini-code-assist

Code Review

This is an excellent pull request that significantly hardens the Mem0 adapter. The changes are well-structured, introducing clear separation of concerns with new config and mappers modules. The code is robust, with thorough defensive programming, explicit error handling, and improved logging. The addition of comprehensive unit and integration tests is also a major plus. I have only a couple of minor suggestions to further improve robustness against potentially malformed API responses.

gemini-code-assist · 2026-03-12T23:29:45Z

+    results_list = result.get("results", [])
+    if not results_list:
+        msg = f"Mem0 add returned no results for {context}"
+        raise MemoryStoreError(msg)


The code assumes that result.get("results") will either be a list or None. If the mem0 API returns a non-list, non-None value for "results" (e.g., a string or a dictionary), this code will raise a TypeError on results_list[0] instead of the intended MemoryStoreError. It would be more robust to also check the type of results_list.

Suggested change

results_list = result.get("results", [])

if not results_list:

msg = f"Mem0 add returned no results for {context}"

raise MemoryStoreError(msg)

results_list = result.get("results")

if not isinstance(results_list, list) or not results_list:

msg = f"Mem0 add returned no or invalid 'results' for {context}"

raise MemoryStoreError(msg)

gemini-code-assist · 2026-03-12T23:29:45Z

+    raw_tags = raw_metadata.get(f"{_PREFIX}tags", ())
+    tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip())


The code assumes that the value for the tags key in the metadata will be an iterable. While the default is a tuple, if the key exists in raw_metadata with a non-iterable value (e.g., an integer), the list comprehension will raise a TypeError. Adding a type check would make this more robust against malformed data from the backend.

Suggested change

raw_tags = raw_metadata.get(f"{_PREFIX}tags", ())

tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip())

raw_tags = raw_metadata.get(f"{_PREFIX}tags")

tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip()) if isinstance(raw_tags, (list, tuple)) else ()

Copilot

Pull request overview

This PR finishes and hardens a concrete mem0 memory backend implementation, making the memory layer functional end-to-end (factory → backend adapter → retrieval pipeline) while enforcing vendor-agnostic configuration and adding broad test coverage.

Changes:

Adds a Mem0 backend package (adapter.py, config.py, mappers.py) and wires it into the memory factory and public exports.
Requires explicit embedder configuration (no vendor defaults) and adds path traversal validation for data_dir.
Adds comprehensive unit + integration tests for the Mem0 adapter, config, and mapping behavior.

Reviewed changes

Copilot reviewed 14 out of 17 changed files in this pull request and generated 13 comments.

Show a summary per file

File	Description
`pyproject.toml`	Adds `mem0ai` dependency and mypy ignore for `mem0.*`.
`uv.lock`	Locks new dependency graph for `mem0ai` and transitive deps.
`src/ai_company/memory/factory.py`	Creates a real Mem0 backend (requires `embedder=`) instead of placeholder errors.
`src/ai_company/memory/backends/mem0/adapter.py`	Implements Mem0-backed `MemoryBackend` + shared store operations with logging and guards.
`src/ai_company/memory/backends/mem0/mappers.py`	Introduces stateless mapping/parsing utilities between Mem0 dicts and domain models.
`src/ai_company/memory/backends/mem0/config.py`	Defines Mem0-specific config + builder for Mem0 SDK config dict.
`src/ai_company/memory/backends/mem0/__init__.py`	Exposes Mem0 backend/config types.
`src/ai_company/memory/backends/__init__.py`	Exposes concrete backends package-level.
`src/ai_company/memory/__init__.py`	Re-exports `Mem0MemoryBackend` and `Mem0EmbedderConfig`.
`tests/unit/memory/test_init.py`	Updates export expectations for new Mem0 symbols.
`tests/unit/memory/test_factory.py`	Updates factory tests for new required `embedder=` behavior and Mem0 backend creation.
`tests/unit/memory/backends/mem0/test_config.py`	Adds config validation tests (required fields, traversal checks, config dict builder).
`tests/unit/memory/backends/mem0/test_mappers.py`	Adds mapper/metadata/datetime parsing tests + filtering tests.
`tests/unit/memory/backends/mem0/test_adapter.py`	Adds extensive adapter unit tests using a mocked Mem0 client.
`tests/integration/memory/test_mem0_backend.py`	Adds retrieval-pipeline integration tests with mocked Mem0 client.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+        try:
+            config_dict = build_mem0_config_dict(self._mem0_config)
+            client = await asyncio.to_thread(Memory.from_config, config_dict)
+        except Exception as exc:
+            logger.warning(
+                MEMORY_BACKEND_CONNECTION_FAILED,
+                backend="mem0",
+                error=str(exc),
+                error_type=type(exc).__name__,
+            )
+            msg = f"Failed to connect to Mem0: {exc}"
+            raise MemoryConnectionError(msg) from exc


+        except Exception as exc:
+            logger.warning(
+                MEMORY_ENTRY_STORE_FAILED,
+                agent_id=agent_id,
+                error=str(exc),
+                error_type=type(exc).__name__,
+            )
+            msg = f"Failed to store memory: {exc}"
+            raise MemoryStoreError(msg) from exc


+        except Exception as exc:
+            logger.warning(
+                MEMORY_ENTRY_RETRIEVAL_FAILED,
+                agent_id=agent_id,
+                error=str(exc),
+                error_type=type(exc).__name__,
+            )
+            msg = f"Failed to retrieve memories: {exc}"
+            raise MemoryRetrievalError(msg) from exc


+        except Exception as exc:
+            logger.warning(
+                MEMORY_ENTRY_FETCH_FAILED,
+                agent_id=agent_id,
+                memory_id=memory_id,
+                error=str(exc),
+                error_type=type(exc).__name__,
+            )
+            msg = f"Failed to get memory {memory_id}: {exc}"
+            raise MemoryRetrievalError(msg) from exc


+        except Exception as exc:
+            logger.warning(
+                MEMORY_ENTRY_DELETE_FAILED,
+                agent_id=agent_id,
+                memory_id=memory_id,
+                error=str(exc),
+                error_type=type(exc).__name__,
+            )
+            msg = f"Failed to delete memory {memory_id}: {exc}"
+            raise MemoryStoreError(msg) from exc


+    confidence = raw_metadata.get(f"{_PREFIX}confidence", 1.0)
+    source = raw_metadata.get(f"{_PREFIX}source")
+    raw_tags = raw_metadata.get(f"{_PREFIX}tags", ())
+    tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip())


+        except Exception as exc:
+            logger.warning(
+                MEMORY_SHARED_PUBLISH_FAILED,
+                agent_id=agent_id,
+                error=str(exc),
+                error_type=type(exc).__name__,
+            )
+            msg = f"Failed to publish shared memory: {exc}"
+            raise MemoryStoreError(msg) from exc


+        except Exception as exc:
+            logger.warning(
+                MEMORY_SHARED_RETRACT_FAILED,
+                agent_id=agent_id,
+                memory_id=memory_id,
+                error=str(exc),
+                error_type=type(exc).__name__,
+            )
+            msg = f"Failed to retract shared memory {memory_id}: {exc}"
+            raise MemoryStoreError(msg) from exc


+        try:
+            await asyncio.to_thread(
+                self._client.get_all,
+                user_id=_SHARED_NAMESPACE,
+                limit=1,
+            )
+        except Exception:
+            logger.debug(
+                MEMORY_BACKEND_HEALTH_CHECK,
+                backend="mem0",
+                healthy=False,
+            )
+            return False


+    if "id" not in raw:
+        msg = f"Mem0 result missing required 'id' field: keys={list(raw.keys())}"
+        raise MemoryRetrievalError(msg)
+    memory_id = NotBlankStr(str(raw["id"]))


coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/memory/backends/mem0/adapter.py`:
- Around line 495-555: The count method currently uses self._client.get_all with
limit=self._max_memories_per_agent which can cap results and undercount when an
agent has more memories than _max_memories_per_agent; update MemoryAdapter.count
to fetch the full set (e.g. use the backend's pagination or a no-limit/count
endpoint) instead of passing the capped limit — specifically change the call
site using self._client.get_all (and any helpers around _max_memories_per_agent)
to iterate pages or request an unbounded count, then compute total (and preserve
the category filter logic using _extract_category) and keep existing error
handling and logging in count, MEMORY_ENTRY_COUNT_FAILED, and
MEMORY_ENTRY_COUNTED.

In `@src/ai_company/memory/backends/mem0/config.py`:
- Around line 110-128: The builder build_config_from_company_config currently
always constructs Mem0BackendConfig without setting collection_name, so add an
optional parameter (e.g., collection_name: Optional[str] = None) to
build_config_from_company_config and pass it into the returned
Mem0BackendConfig(...) (use the provided collection_name when non-empty,
otherwise allow Mem0BackendConfig to keep its default); update the function
signature and the call site(s) to wire through CompanyMemoryConfig or explicit
overrides as needed so deployments can customize collection_name.

In `@src/ai_company/memory/factory.py`:
- Around line 51-61: The embedder validation in the Mem0 branch currently raises
MemoryConfigError without logging; before each raise in the checks that test "if
embedder is None" and "if not isinstance(embedder, Mem0EmbedderConfig)" add a
log call (matching the style used for the unknown-backend path) that emits
WARNING or ERROR with context including the backend name, the embedder
value/type (use type(embedder).__name__ or repr(embedder) as appropriate) and
the intended requirement, then raise MemoryConfigError as before; reference the
embedder variable, Mem0EmbedderConfig, and MemoryConfigError in your changes so
the log provides the same contextual info as the other error path.

In `@tests/unit/memory/backends/mem0/test_config.py`:
- Around line 96-108: Combine the two tests test_rejects_parent_traversal_unix
and test_rejects_parent_traversal_windows into a single parametrized test using
pytest.mark.parametrize: create one test (e.g., test_rejects_parent_traversal)
that takes data_dir as a parameter, parametrize it with the two values
("/data/../escape" and "C:\\data\\..\\escape"), and keep the existing assertion
that constructing Mem0BackendConfig(data_dir=data_dir, embedder=_embedder())
raises ValidationError matching "parent-directory traversal"; this removes
duplicated setup/assertion while still testing both paths.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 61c4af4d-53a3-4b4b-ae55-fe78a42ca8ae

📥 Commits

Reviewing files that changed from the base of the PR and between fa8bf1d and b04e3e6.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (16)

pyproject.toml
src/ai_company/memory/__init__.py
src/ai_company/memory/backends/__init__.py
src/ai_company/memory/backends/mem0/__init__.py
src/ai_company/memory/backends/mem0/adapter.py
src/ai_company/memory/backends/mem0/config.py
src/ai_company/memory/backends/mem0/mappers.py
src/ai_company/memory/factory.py
tests/integration/memory/test_mem0_backend.py
tests/unit/memory/backends/__init__.py
tests/unit/memory/backends/mem0/__init__.py
tests/unit/memory/backends/mem0/test_adapter.py
tests/unit/memory/backends/mem0/test_config.py
tests/unit/memory/backends/mem0/test_mappers.py
tests/unit/memory/test_factory.py
tests/unit/memory/test_init.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Agent
GitHub Check: Greptile Review
GitHub Check: Test (Python 3.14)

🧰 Additional context used

📓 Path-based instructions (4)

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark tests with @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, and @pytest.mark.slow
Prefer @pytest.mark.parametrize for testing similar cases

Files:

tests/unit/memory/backends/mem0/test_mappers.py
tests/unit/memory/backends/mem0/test_config.py
tests/unit/memory/test_factory.py
tests/unit/memory/backends/mem0/test_adapter.py
tests/integration/memory/test_mem0_backend.py
tests/unit/memory/test_init.py

{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names like example-provider, example-large-001, test-provider, test-small-001

Files:

tests/unit/memory/backends/mem0/test_mappers.py
src/ai_company/memory/backends/__init__.py
src/ai_company/memory/backends/mem0/config.py
tests/unit/memory/backends/mem0/test_config.py
src/ai_company/memory/__init__.py
tests/unit/memory/test_factory.py
src/ai_company/memory/backends/mem0/__init__.py
tests/unit/memory/backends/mem0/test_adapter.py
tests/integration/memory/test_mem0_backend.py
src/ai_company/memory/backends/mem0/mappers.py
src/ai_company/memory/backends/mem0/adapter.py
src/ai_company/memory/factory.py
tests/unit/memory/test_init.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
Use except A, B: (no parentheses) for exception syntax on Python 3.14 per PEP 758 — ruff enforces this
Type hints required on all public functions; mypy strict mode enforced
Google-style docstrings required on public classes and functions, enforced by ruff D rules
Create new objects, never mutate existing ones. For non-Pydantic internal collections, use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; separate mutable-via-copy models for runtime state using model_copy(update=...)
Use Pydantic v2 with BaseModel, model_validator, computed_field, ConfigDict
Use @computed_field for derived values instead of storing redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls) — use structured concurrency over bare create_task
Line length must be 88 characters (ruff enforced)
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Every module with business logic must have: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code
Use only logger as the variable name (not _logger, not log)
Use domain-specific event name constants from ai_company.observability.events modules (e.g...

Files:

src/ai_company/memory/backends/__init__.py
src/ai_company/memory/backends/mem0/config.py
src/ai_company/memory/__init__.py
src/ai_company/memory/backends/mem0/__init__.py
src/ai_company/memory/backends/mem0/mappers.py
src/ai_company/memory/backends/mem0/adapter.py
src/ai_company/memory/factory.py

pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: All version pinning in pyproject.toml uses == (exact versions)
Dependency groups: test (pytest + plugins), dev (includes test + ruff, mypy, pre-commit, commitizen) — install via uv sync

Files:

pyproject.toml

🧬 Code graph analysis (11)

tests/unit/memory/backends/mem0/test_mappers.py (3)

src/ai_company/core/enums.py (1)

MemoryCategory (101-108)

src/ai_company/memory/errors.py (1)

MemoryRetrievalError (25-26)

src/ai_company/memory/models.py (4)

MemoryEntry (82-150)

MemoryMetadata (20-52)

MemoryQuery (153-230)

MemoryStoreRequest (55-79)

src/ai_company/memory/backends/__init__.py (1)

src/ai_company/memory/backends/mem0/adapter.py (1)

Mem0MemoryBackend (101-772)

src/ai_company/memory/backends/mem0/config.py (1)

src/ai_company/memory/config.py (1)

CompanyMemoryConfig (139-195)

tests/unit/memory/backends/mem0/test_config.py (2)

src/ai_company/memory/backends/mem0/config.py (4)

Mem0BackendConfig (46-78)

Mem0EmbedderConfig (17-43)

build_config_from_company_config (110-128)

build_mem0_config_dict (81-107)

src/ai_company/memory/config.py (2)

CompanyMemoryConfig (139-195)

MemoryStorageConfig (25-104)

src/ai_company/memory/__init__.py (2)

src/ai_company/memory/backends/mem0/config.py (1)

Mem0EmbedderConfig (17-43)

src/ai_company/memory/backends/mem0/adapter.py (1)

Mem0MemoryBackend (101-772)

tests/unit/memory/test_factory.py (5)

src/ai_company/memory/backends/mem0/adapter.py (4)

Mem0MemoryBackend (101-772)

is_connected (210-212)

backend_name (215-217)

max_memories_per_agent (247-249)

src/ai_company/memory/backends/mem0/config.py (1)

Mem0EmbedderConfig (17-43)

src/ai_company/memory/config.py (2)

CompanyMemoryConfig (139-195)

MemoryOptionsConfig (107-136)

src/ai_company/memory/errors.py (1)

MemoryConfigError (40-41)

src/ai_company/memory/factory.py (1)

create_memory_backend (22-83)

src/ai_company/memory/backends/mem0/__init__.py (2)

src/ai_company/memory/backends/mem0/adapter.py (1)

Mem0MemoryBackend (101-772)

src/ai_company/memory/backends/mem0/config.py (2)

Mem0BackendConfig (46-78)

Mem0EmbedderConfig (17-43)

tests/unit/memory/backends/mem0/test_adapter.py (5)

src/ai_company/core/enums.py (1)

MemoryCategory (101-108)

src/ai_company/memory/backends/mem0/config.py (2)

Mem0BackendConfig (46-78)

Mem0EmbedderConfig (17-43)

src/ai_company/memory/errors.py (3)

MemoryConnectionError (17-18)

MemoryRetrievalError (25-26)

MemoryStoreError (21-22)

src/ai_company/memory/models.py (2)

MemoryQuery (153-230)

MemoryStoreRequest (55-79)

tests/unit/memory/test_factory.py (1)

_test_embedder (15-20)

src/ai_company/memory/backends/mem0/mappers.py (3)

src/ai_company/core/enums.py (1)

MemoryCategory (101-108)

src/ai_company/memory/errors.py (1)

MemoryRetrievalError (25-26)

src/ai_company/memory/models.py (4)

MemoryEntry (82-150)

MemoryMetadata (20-52)

MemoryQuery (153-230)

MemoryStoreRequest (55-79)

src/ai_company/memory/backends/mem0/adapter.py (3)

src/ai_company/core/enums.py (1)

MemoryCategory (101-108)

src/ai_company/memory/backends/mem0/config.py (2)

Mem0BackendConfig (46-78)

build_mem0_config_dict (81-107)

src/ai_company/memory/models.py (1)

MemoryEntry (82-150)

src/ai_company/memory/factory.py (4)

src/ai_company/memory/config.py (1)

CompanyMemoryConfig (139-195)

src/ai_company/memory/errors.py (1)

MemoryConfigError (40-41)

src/ai_company/memory/protocol.py (1)

MemoryBackend (20-182)

src/ai_company/memory/backends/mem0/config.py (2)

Mem0EmbedderConfig (17-43)

build_config_from_company_config (110-128)

🔇 Additional comments (20)

tests/unit/memory/backends/mem0/test_mappers.py (2)

1-51: LGTM! Well-structured test helper and imports.

The _make_entry helper function is well-designed with sensible defaults. The pytestmark timeout is appropriate for unit tests.

53-387: Comprehensive test coverage for mapper functions.

The test suite covers:

Metadata serialization/deserialization with edge cases

Datetime parsing including timezone handling and malformed inputs

Relevance score normalization with boundary conditions

Result-to-entry conversion with validation errors

Query translation and post-filtering logic

Good use of descriptive test names and inline comments (e.g., Line 350) explaining expected behavior.

tests/unit/memory/test_factory.py (1)

15-54: LGTM! Factory tests cover key validation paths.

The tests properly validate:

Backend creation with embedder config

Config propagation (max_memories_per_agent)

Error cases for missing/wrong embedder type

The _test_embedder() helper follows the coding guideline to avoid vendor names.

src/ai_company/memory/backends/mem0/config.py (2)

69-78: Good path traversal validation.

The validator correctly handles both Windows (C:\..\) and POSIX (../) path traversal patterns by checking parts from both path types.

81-107: The hardcoded version "v1.1" is correct.

The Mem0 SDK documentation confirms that "v1.1" is the minimum supported version and is used as the default if omitted from the config.

src/ai_company/memory/backends/mem0/adapter.py (4)

76-98: LGTM! Validates add results defensively.

The helper correctly validates the Mem0 add result structure and provides clear error messages with context.

125-207: LGTM! Lifecycle methods are well-implemented.

Good practices:

Explicit ImportError handling with structured logging

asyncio.to_thread for blocking Mem0 calls

Health check performs real probe (not just flag check)

Safe disconnect when not connected

559-772: LGTM! Shared knowledge store has proper ownership checks.

The implementation correctly:

Tags published memories with publisher agent ID

Verifies publisher ownership before retraction

Excludes requesting agent from search results when specified

Handles missing publisher metadata gracefully

778-800: LGTM! Helper functions handle edge cases.

_extract_category and _extract_publisher defensively handle missing or invalid metadata with appropriate defaults and logging.

tests/unit/memory/backends/mem0/test_adapter.py (5)

1-119: LGTM! Well-organized test fixtures and helpers.

The test file has:

Clean fixture setup with mocked Mem0 client

Reusable helpers for building mock responses

Proper isolation by injecting mock client directly

125-209: LGTM! Properties and protocol conformance tests.

Good coverage of:

Backend properties (backend_name, is_connected)

All capability properties

Protocol method presence checks

214-350: LGTM! Lifecycle and connection guard tests.

Comprehensive coverage of:

Connect success/failure scenarios

Disconnect (connected and not connected)

Health check states (connected, disconnected, probe failure)

Connection guards for all operations

355-679: LGTM! CRUD operation tests are thorough.

Tests cover:

Success paths with proper mock validation

Error wrapping (empty results, missing ID, exceptions)

Edge cases (not found, category filtering, invalid data)

684-893: LGTM! Shared knowledge store tests cover all scenarios.

Tests validate:

Publish with publisher metadata

Search with/without text, agent exclusion

Retract ownership verification (success, not found, mismatch, no publisher)

src/ai_company/memory/backends/mem0/mappers.py (6)

32-51: LGTM! Metadata serialization is clean and defensive.

Creates a new dict without mutating the input request. Conditionally includes optional fields only when present.

54-94: LGTM! Datetime parsing and score normalization are robust.

Handles empty strings and None

Naive datetimes correctly default to UTC

Malformed values are logged and return None

Score clamping handles out-of-bounds values

97-144: LGTM! Metadata deserialization with defensive defaults.

Handles:

None/empty metadata → defaults to WORKING category

Invalid category → logs warning and defaults to WORKING

Empty/whitespace tags → filtered out

147-193: LGTM! Result-to-entry conversion validates required fields.

Properly:

Validates id presence (raises MemoryRetrievalError)

Validates non-empty content

Falls back to data key if memory is missing

Defaults created_at to now if missing

196-238: LGTM! Query translation functions are straightforward.

Clear separation between search (requires text) and get_all (no text) with appropriate validation.

241-273: LGTM! Post-filters implement correct semantics.

Category filter: entry must be in categories set

Tag filter: AND semantics (all query tags must be present)

Time range: since inclusive, until exclusive

Min relevance: skipped if relevance_score is None (permissive)

Order is preserved as documented.

coderabbitai · 2026-03-12T23:33:31Z

+    async def count(
+        self,
+        agent_id: NotBlankStr,
+        *,
+        category: MemoryCategory | None = None,
+    ) -> int:
+        """Count memory entries for an agent.
+
+        Uses ``get_all()`` internally — O(n) in the agent's memory
+        count.  Acceptable because ``count()`` is not on the hot path.
+
+        Args:
+            agent_id: Owning agent identifier.
+            category: Optional category filter.
+
+        Returns:
+            Number of matching entries.
+
+        Raises:
+            MemoryConnectionError: If the backend is not connected.
+            MemoryRetrievalError: If the count query fails.
+        """
+        self._require_connected()
+        try:
+            raw_result = await asyncio.to_thread(
+                self._client.get_all,
+                user_id=str(agent_id),
+                limit=self._max_memories_per_agent,
+            )
+            raw_list = raw_result.get("results", [])
+            if category is None:
+                total = len(raw_list)
+            else:
+                total = sum(
+                    1 for item in raw_list if _extract_category(item) == category
+                )
+        except MemoryRetrievalError:
+            logger.warning(
+                MEMORY_ENTRY_COUNT_FAILED,
+                agent_id=agent_id,
+                error="count query failed",
+                error_type="MemoryRetrievalError",
+            )
+            raise
+        except Exception as exc:
+            logger.warning(
+                MEMORY_ENTRY_COUNT_FAILED,
+                agent_id=agent_id,
+                error=str(exc),
+                error_type=type(exc).__name__,
+            )
+            msg = f"Failed to count memories: {exc}"
+            raise MemoryRetrievalError(msg) from exc
+        else:
+            logger.info(
+                MEMORY_ENTRY_COUNTED,
+                agent_id=agent_id,
+                count=total,
+                category=category.value if category else None,
+            )
+            return total


⚠️ Potential issue | 🟡 Minor

Potential undercounting if memories exceed max_memories_per_agent.

The count operation uses limit=self._max_memories_per_agent (Line 522), so if an agent has more memories than this limit (e.g., due to configuration changes), the count will be capped and potentially incorrect.

Consider using a higher limit or documenting this limitation explicitly.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 495 - 555, The count method currently uses self._client.get_all with limit=self._max_memories_per_agent which can cap results and undercount when an agent has more memories than _max_memories_per_agent; update MemoryAdapter.count to fetch the full set (e.g. use the backend's pagination or a no-limit/count endpoint) instead of passing the capped limit — specifically change the call site using self._client.get_all (and any helpers around _max_memories_per_agent) to iterate pages or request an unbounded count, then compute total (and preserve the category filter logic using _extract_category) and keep existing error handling and logging in count, MEMORY_ENTRY_COUNT_FAILED, and MEMORY_ENTRY_COUNTED.

coderabbitai · 2026-03-12T23:33:31Z

+def build_config_from_company_config(
+    config: CompanyMemoryConfig,
+    *,
+    embedder: Mem0EmbedderConfig,
+) -> Mem0BackendConfig:
+    """Derive a ``Mem0BackendConfig`` from the top-level memory config.
+
+    Args:
+        config: Company-wide memory configuration.
+        embedder: Embedder settings (provider and model must be
+            supplied explicitly to avoid vendor names in defaults).
+
+    Returns:
+        Mem0-specific backend configuration.
+    """
+    return Mem0BackendConfig(
+        data_dir=config.storage.data_dir,
+        embedder=embedder,
+    )


🧹 Nitpick | 🔵 Trivial

Consider exposing collection_name in the builder.

build_config_from_company_config always uses the default collection_name from Mem0BackendConfig. If collection names need to be customizable per deployment, this could be exposed as a parameter.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/config.py` around lines 110 - 128, The builder build_config_from_company_config currently always constructs Mem0BackendConfig without setting collection_name, so add an optional parameter (e.g., collection_name: Optional[str] = None) to build_config_from_company_config and pass it into the returned Mem0BackendConfig(...) (use the provided collection_name when non-empty, otherwise allow Mem0BackendConfig to keep its default); update the function signature and the call site(s) to wire through CompanyMemoryConfig or explicit overrides as needed so deployments can customize collection_name.

coderabbitai

Actionable comments posted: 8

♻️ Duplicate comments (1)

src/ai_company/memory/backends/mem0/adapter.py (1)
510-514: ⚠️ Potential issue | 🟠 Major

count() still caps the total at max_memories_per_agent.

Lines 510-514 reuse self._max_memories_per_agent as the get_all() limit, so the method undercounts as soon as stored memories exceed that cap. This needs pagination or a backend-native count path instead of the retention limit.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 510 - 514, The
count() implementation currently passes self._max_memories_per_agent into
self._client.get_all (in adapter.py), which caps results and undercounts when
stored memories exceed that limit; update count() to either call a
backend-native count endpoint on self._client if available (e.g., client.count
or similar) or implement pagination by repeatedly calling self._client.get_all
with offset/next_token until no more results are returned, aggregating the total
rather than using self._max_memories_per_agent as the final limit; ensure you
reference and adjust the calls to self._client.get_all and remove reliance on
self._max_memories_per_agent for the total count.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/memory/backends/mem0/adapter.py`:
- Around line 326-336: The adapter currently requests only query.limit raw hits
(via query_to_mem0_search_args / query_to_mem0_getall_args and
self._client.search / get_all) then applies apply_post_filters, which can leave
fewer than query.limit results if many are filtered out; change the logic to
over-fetch / paginate: call query_to_mem0_search_args or
query_to_mem0_getall_args with a larger limit or with offset pagination and loop
calling self._client.search / self._client.get_all, converting results with
mem0_result_to_entry and accumulating entries, applying apply_post_filters on
the accumulated set each iteration and stop once the filtered entries reach
query.limit or the backend is exhausted; ensure the same change is applied to
both places that use these calls (the blocks using
query_to_mem0_search_args/query_to_mem0_getall_args, mem0_result_to_entry, and
apply_post_filters).
- Around line 364-455: The get() and delete() methods currently fetch Mem0
records by ID without validating ownership and mem0_result_to_entry() relabels
the entry with the caller's agent_id; update get() and delete() to retrieve the
raw record, check the record's owner/user_id (as returned by Mem0) against the
provided agent_id and reject mismatches before returning or deleting (follow the
ownership-check pattern used in retract()); for get(), if owner mismatch log
MEMORY_ENTRY_FETCH_FAILED and raise MemoryRetrievalError, and for delete() log
MEMORY_ENTRY_DELETED with found=False and raise MemoryStoreError (or return
False per existing semantics) when ownership does not match; also adjust
mem0_result_to_entry() usage so it does not overwrite the original owner field
when validating ownership.

In `@src/ai_company/memory/backends/mem0/mappers.py`:
- Around line 114-164: The code currently assumes raw_metadata is a mapping and
only checks truthiness, causing AttributeError for non-object metadata like
lists or strings; update parse_mem0_metadata(), extract_category(), and
extract_publisher() to treat non-dict/raw-mapping metadata as absent by first
verifying raw_metadata is an instance of dict (or collections.abc.Mapping) and
if not, set raw_metadata = {} (or early-return the default tuple used when
metadata is absent), then proceed with the existing logic that reads keys like
f"{_PREFIX}category" and f"{_PREFIX}confidence" so malformed but truthy payloads
(e.g., [], "oops") gracefully fall back to default category, metadata, and
expires_at.
- Around line 332-342: The add-result handling in the mem0 mapper must
explicitly reject non-dict or blank 'id' values: in the block using
result.get("results") and the variable first, first check that results_list is a
non-empty list, then assert isinstance(first, dict) before accessing keys, and
verify that first.get("id") is a non-empty string (not "", None, or other types)
before returning NotBlankStr(str(first["id"])); if those checks fail, call
logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) and raise
MemoryStoreError(msg) with a clear message so you never call first.keys() on
non-dicts or wrap blank ids in NotBlankStr.
- Around line 205-207: The current code synthesizes created_at =
datetime.now(UTC) when parse_mem0_datetime(raw.get("created_at")) returns None,
which improperly makes malformed rows appear brand-new; change this to either
reject the row (raise a validation error/return None so the caller drops it) or
assign a deterministic floor timestamp (e.g., a constant EPOCH_FLOOR like
datetime(1970,1,1,tzinfo=UTC)) instead of datetime.now; update the handling in
the mappers.py logic around parse_mem0_datetime and created_at so callers get a
clear rejection or a stable floor value and add/declare a named constant (e.g.,
EPOCH_FLOOR or MIN_CREATED_AT) if choosing the deterministic floor option.

In `@src/ai_company/memory/factory.py`:
- Around line 54-75: The logger currently emits MEMORY_BACKEND_UNKNOWN for
embedder validation failures in the embedder checks inside the memory factory;
change those logger.warning calls (the ones guarding embedder is None and
isinstance(embedder, Mem0EmbedderConfig)) to use a more specific event constant
(e.g., MEMORY_BACKEND_CONFIG_INVALID) to clearly indicate a
configuration/validation error rather than an unknown backend, keeping the same
parameters (backend="mem0", error=msg, and embedder_type where present) and
still raising MemoryConfigError; update any imports or constant definitions if
needed to reference MEMORY_BACKEND_CONFIG_INVALID so the two embedder-check
branches (the embedder variable and Mem0EmbedderConfig check) log the new event.

In `@tests/integration/memory/test_mem0_backend.py`:
- Around line 156-160: The test only asserts one expected memory phrase; update
the assertions to verify both injected memories survive relevance filtering by
checking for the second expected substring in the combined text (use the
existing messages/combined variables). For example, after the existing assert
"concise responses" in combined, add an assertion that the second memory phrase
(e.g., "episodic memory" or the exact second memory content used in the test
setup) is present in combined so the test fails if the episodic memory is
dropped.

In `@tests/unit/memory/backends/mem0/test_adapter.py`:
- Around line 308-373: Replace the eight nearly-identical tests in
TestConnectionGuard with a single parametrized test using
`@pytest.mark.parametrize` that lists the target method names (store, retrieve,
get, delete, count, publish, search_shared, retract) and their respective
argument tuples (e.g., ("test-agent-001", _make_store_request()), (
"test-agent-001", MemoryQuery(text="test")), ("test-agent-001", "mem-001"),
("test-agent-001",), etc.), then inside the test instantiate
Mem0MemoryBackend(mem0_config=mem0_config), use getattr(b, method_name) to
obtain the coroutine, and assert it raises MemoryConnectionError with match "Not
connected" when awaited; keep references to Mem0MemoryBackend,
MemoryConnectionError, _make_store_request, and MemoryQuery to locate the code.

---

Duplicate comments:
In `@src/ai_company/memory/backends/mem0/adapter.py`:
- Around line 510-514: The count() implementation currently passes
self._max_memories_per_agent into self._client.get_all (in adapter.py), which
caps results and undercounts when stored memories exceed that limit; update
count() to either call a backend-native count endpoint on self._client if
available (e.g., client.count or similar) or implement pagination by repeatedly
calling self._client.get_all with offset/next_token until no more results are
returned, aggregating the total rather than using self._max_memories_per_agent
as the final limit; ensure you reference and adjust the calls to
self._client.get_all and remove reliance on self._max_memories_per_agent for the
total count.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2593886f-c386-46e3-a2c4-dfef93265075

📥 Commits

Reviewing files that changed from the base of the PR and between b04e3e6 and 8ea375f.

📒 Files selected for processing (13)

CLAUDE.md
README.md
docs/design/memory.md
docs/roadmap/index.md
src/ai_company/memory/backends/mem0/adapter.py
src/ai_company/memory/backends/mem0/config.py
src/ai_company/memory/backends/mem0/mappers.py
src/ai_company/memory/factory.py
tests/integration/memory/test_mem0_backend.py
tests/unit/memory/backends/mem0/test_adapter.py
tests/unit/memory/backends/mem0/test_config.py
tests/unit/memory/backends/mem0/test_mappers.py
tests/unit/memory/test_factory.py

💤 Files with no reviewable changes (1)

docs/roadmap/index.md

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Greptile Review
GitHub Check: Test (Python 3.14)

🧰 Additional context used

📓 Path-based instructions (4)

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark tests with @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, and @pytest.mark.slow
Prefer @pytest.mark.parametrize for testing similar cases

Files:

tests/unit/memory/backends/mem0/test_config.py
tests/unit/memory/backends/mem0/test_mappers.py
tests/unit/memory/backends/mem0/test_adapter.py
tests/unit/memory/test_factory.py
tests/integration/memory/test_mem0_backend.py

{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names like example-provider, example-large-001, test-provider, test-small-001

Files:

tests/unit/memory/backends/mem0/test_config.py
src/ai_company/memory/backends/mem0/config.py
tests/unit/memory/backends/mem0/test_mappers.py
tests/unit/memory/backends/mem0/test_adapter.py
src/ai_company/memory/factory.py
src/ai_company/memory/backends/mem0/adapter.py
tests/unit/memory/test_factory.py
src/ai_company/memory/backends/mem0/mappers.py
tests/integration/memory/test_mem0_backend.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649
Use except A, B: (no parentheses) for exception syntax on Python 3.14 per PEP 758 — ruff enforces this
Type hints required on all public functions; mypy strict mode enforced
Google-style docstrings required on public classes and functions, enforced by ruff D rules
Create new objects, never mutate existing ones. For non-Pydantic internal collections, use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; separate mutable-via-copy models for runtime state using model_copy(update=...)
Use Pydantic v2 with BaseModel, model_validator, computed_field, ConfigDict
Use @computed_field for derived values instead of storing redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls) — use structured concurrency over bare create_task
Line length must be 88 characters (ruff enforced)
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Every module with business logic must have: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code
Use only logger as the variable name (not _logger, not log)
Use domain-specific event name constants from ai_company.observability.events modules (e.g...

Files:

src/ai_company/memory/backends/mem0/config.py
src/ai_company/memory/factory.py
src/ai_company/memory/backends/mem0/adapter.py
src/ai_company/memory/backends/mem0/mappers.py

docs/design/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When approved deviations occur, update the relevant docs/design/ page to reflect the new reality

Files:

docs/design/memory.md

🧠 Learnings (6)

📚 Learning: 2026-03-12T22:12:29.867Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to tests/**/*.py : Prefer `pytest.mark.parametrize` for testing similar cases

Applied to files:

tests/unit/memory/backends/mem0/test_config.py

📚 Learning: 2026-03-12T22:12:29.867Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising an exception

Applied to files:

src/ai_company/memory/factory.py

📚 Learning: 2026-03-12T22:12:29.867Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to pyproject.toml : Dependency groups: `test` (pytest + plugins), `dev` (includes test + ruff, mypy, pre-commit, commitizen) — install via `uv sync`

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-12T22:12:29.867Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to pyproject.toml : All version pinning in `pyproject.toml` uses `==` (exact versions)

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-12T22:12:29.867Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to src/ai_company/providers/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError` — all others raise immediately

Applied to files:

src/ai_company/memory/backends/mem0/adapter.py

📚 Learning: 2026-03-12T22:12:29.867Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to src/ai_company/{providers,engine}/**/*.py : `RetryExhaustedError` signals all retries failed — the engine layer catches this to trigger fallback chains

Applied to files:

src/ai_company/memory/backends/mem0/adapter.py

🧬 Code graph analysis (5)

tests/unit/memory/backends/mem0/test_config.py (2)

src/ai_company/memory/backends/mem0/config.py (4)

Mem0BackendConfig (46-78)

Mem0EmbedderConfig (17-43)

build_config_from_company_config (111-129)

build_mem0_config_dict (81-108)

src/ai_company/memory/config.py (2)

CompanyMemoryConfig (139-195)

MemoryStorageConfig (25-104)

src/ai_company/memory/backends/mem0/config.py (1)

src/ai_company/memory/config.py (1)

CompanyMemoryConfig (139-195)

tests/unit/memory/backends/mem0/test_adapter.py (3)

src/ai_company/core/enums.py (1)

MemoryCategory (101-108)

src/ai_company/memory/backends/mem0/config.py (2)

Mem0BackendConfig (46-78)

Mem0EmbedderConfig (17-43)

src/ai_company/memory/errors.py (4)

MemoryConnectionError (17-18)

MemoryRetrievalError (25-26)

MemoryStoreError (21-22)

MemoryError (13-14)

src/ai_company/memory/backends/mem0/adapter.py (1)

src/ai_company/memory/backends/mem0/mappers.py (8)

apply_post_filters (280-312)

build_mem0_metadata (38-57)

extract_category (345-367)

extract_publisher (370-376)

mem0_result_to_entry (167-226)

query_to_mem0_getall_args (261-277)

query_to_mem0_search_args (229-258)

validate_add_result (318-342)

src/ai_company/memory/backends/mem0/mappers.py (4)

src/ai_company/core/enums.py (1)

MemoryCategory (101-108)

src/ai_company/memory/errors.py (2)

MemoryRetrievalError (25-26)

MemoryStoreError (21-22)

src/ai_company/memory/models.py (4)

MemoryEntry (82-150)

MemoryMetadata (20-52)

MemoryQuery (153-230)

MemoryStoreRequest (55-79)

src/ai_company/observability/_logger.py (1)

get_logger (8-28)

🔇 Additional comments (23)

README.md (1)

130-130: LGTM!

The status line accurately reflects the addition of the Mem0 backend adapter to the memory subsystem.

docs/design/memory.md (1)

33-33: LGTM!

Design documentation correctly updated to reflect that Mem0 backend is now implemented, aligning with the coding guideline to update design pages when implementation status changes.

CLAUDE.md (2)

97-97: LGTM!

Package structure documentation accurately reflects the new Mem0 adapter location at backends/mem0/.

208-208: LGTM!

Clear documentation of the optional mem0ai dependency requirement when using the Mem0 backend.

tests/unit/memory/backends/mem0/test_config.py (4)

1-14: LGTM!

Well-structured test file with proper module docstring, imports, and timeout marker. Uses vendor-agnostic test values (test-provider, test-embedding-001) per coding guidelines.

27-64: LGTM!

Comprehensive test coverage for Mem0EmbedderConfig: validates required fields, custom values, defaults, immutability, and rejection of invalid inputs (zero dims, blank strings).

66-117: LGTM!

Good test coverage for Mem0BackendConfig. The path traversal test is correctly parametrized with Unix, Windows, and relative path variants as requested in the past review.

120-191: LGTM!

Builder function tests validate the expected dict structure for build_mem0_config_dict and proper derivation in build_config_from_company_config. Coverage includes default and custom configurations.

src/ai_company/memory/factory.py (2)

1-22: LGTM!

Clean module structure with TYPE_CHECKING guard for Mem0EmbedderConfig import to avoid circular dependencies while maintaining type safety.

77-90: LGTM!

Backend creation correctly uses the config builder and logs MEMORY_BACKEND_CREATED with relevant context.

tests/unit/memory/test_factory.py (2)

15-20: LGTM!

Good helper function with vendor-agnostic defaults. Consistent with the helper in test_config.py.

23-54: LGTM!

Comprehensive test coverage for the factory: validates backend creation, configuration propagation (max_memories_per_agent), and both embedder validation error paths (missing and wrong type).

tests/unit/memory/backends/mem0/test_mappers.py (7)

1-28: LGTM!

Well-organized test module with proper imports, timeout marker, and a reusable _make_entry helper. Importing _PREFIX (a private constant) is acceptable for test verification of metadata key prefixes.

53-87: LGTM!

Good coverage for build_mem0_metadata with basic and full metadata scenarios, verifying correct key prefixing and value serialization.

89-137: LGTM!

Thorough datetime parsing tests covering None, empty string, aware/naive ISO formats, non-UTC timezones, and malformed inputs.

139-183: LGTM!

Good coverage for parse_mem0_metadata including fallback behavior for missing/invalid categories and empty tag filtering.

185-266: LGTM!

Comprehensive mem0_result_to_entry tests covering basic conversion, score handling, missing timestamps, no metadata, validation errors (missing ID, empty content), and data-key fallback.

268-292: LGTM!

Query translation tests validate correct argument mapping and error handling for missing query text.

294-387: LGTM!

Thorough apply_post_filters tests covering no filters, category, tag, time range (since/until), min relevance, and combined filter scenarios.

src/ai_company/memory/backends/mem0/config.py (4)

17-43: LGTM!

Well-designed embedder config with required provider and model fields to avoid vendor-specific defaults. Proper use of NotBlankStr and frozen model.

46-78: LGTM!

Good defensive validation against path traversal using both Windows and POSIX path parsing. The combined parts check catches .. segments regardless of platform path separators used.

111-129: LGTM!

Clean builder that derives Mem0BackendConfig from CompanyMemoryConfig. The explicit embedder parameter enforces vendor-agnostic configuration. Per a past review, collection_name could optionally be exposed as a parameter for per-deployment customization, but the current default-based approach is acceptable.

81-108: No action needed. The hardcoded version "v1.1" is correct for the pinned mem0ai dependency (v1.0.5), which requires this schema version per the Mem0 migration documentation.

coderabbitai · 2026-03-13T06:09:55Z

+            if query.text is not None:
+                kwargs = query_to_mem0_search_args(str(agent_id), query)
+                raw_result = await asyncio.to_thread(self._client.search, **kwargs)
+            else:
+                kwargs = query_to_mem0_getall_args(str(agent_id), query)
+                raw_result = await asyncio.to_thread(self._client.get_all, **kwargs)
+            raw_list = raw_result.get("results", [])
+            entries = tuple(
+                mem0_result_to_entry(item, str(agent_id)) for item in raw_list
+            )
+            entries = apply_post_filters(entries, query)


⚠️ Potential issue | 🟠 Major

Local filters run after an already-capped fetch.

Lines 327-331 and 638-649 ask Mem0 for only query.limit raw hits, then Lines 336 and 659-662 apply category/tag/time/relevance and exclude_agent filtering locally. If the first page is dominated by filtered-out items, these methods return too few results even when more eligible memories exist. Page or over-fetch until the final filtered set reaches query.limit or the backend is exhausted.

Also applies to: 637-663

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 326 - 336, The adapter currently requests only query.limit raw hits (via query_to_mem0_search_args / query_to_mem0_getall_args and self._client.search / get_all) then applies apply_post_filters, which can leave fewer than query.limit results if many are filtered out; change the logic to over-fetch / paginate: call query_to_mem0_search_args or query_to_mem0_getall_args with a larger limit or with offset pagination and loop calling self._client.search / self._client.get_all, converting results with mem0_result_to_entry and accumulating entries, applying apply_post_filters on the accumulated set each iteration and stop once the filtered entries reach query.limit or the backend is exhausted; ensure the same change is applied to both places that use these calls (the blocks using query_to_mem0_search_args/query_to_mem0_getall_args, mem0_result_to_entry, and apply_post_filters).

coderabbitai · 2026-03-13T06:09:55Z

+    async def get(
+        self,
+        agent_id: NotBlankStr,
+        memory_id: NotBlankStr,
+    ) -> MemoryEntry | None:
+        """Get a specific memory entry by ID.
+
+        Args:
+            agent_id: Owning agent identifier.
+            memory_id: Memory identifier.
+
+        Returns:
+            The memory entry, or ``None`` if not found.
+
+        Raises:
+            MemoryConnectionError: If the backend is not connected.
+            MemoryRetrievalError: If the backend query fails.
+        """
+        self._require_connected()
+        try:
+            raw = await asyncio.to_thread(self._client.get, str(memory_id))
+            if raw is None:
+                logger.debug(
+                    MEMORY_ENTRY_FETCHED,
+                    agent_id=agent_id,
+                    memory_id=memory_id,
+                    found=False,
+                )
+                return None
+            entry = mem0_result_to_entry(raw, str(agent_id))
+        except MemoryRetrievalError as exc:
+            logger.warning(
+                MEMORY_ENTRY_FETCH_FAILED,
+                agent_id=agent_id,
+                memory_id=memory_id,
+                error=str(exc),
+                error_type="MemoryRetrievalError",
+            )
+            raise
+        except MemoryError, RecursionError:
+            raise
+        except Exception as exc:
+            logger.warning(
+                MEMORY_ENTRY_FETCH_FAILED,
+                agent_id=agent_id,
+                memory_id=memory_id,
+                error=str(exc),
+                error_type=type(exc).__name__,
+            )
+            msg = f"Failed to get memory {memory_id}: {exc}"
+            raise MemoryRetrievalError(msg) from exc
+        else:
+            logger.debug(
+                MEMORY_ENTRY_FETCHED,
+                agent_id=agent_id,
+                memory_id=memory_id,
+                found=True,
+            )
+            return entry
+
+    async def delete(
+        self,
+        agent_id: NotBlankStr,
+        memory_id: NotBlankStr,
+    ) -> bool:
+        """Delete a specific memory entry.
+
+        Args:
+            agent_id: Owning agent identifier.
+            memory_id: Memory identifier.
+
+        Returns:
+            ``True`` if the entry was deleted, ``False`` if not found.
+
+        Raises:
+            MemoryConnectionError: If the backend is not connected.
+            MemoryStoreError: If the delete operation fails.
+        """
+        self._require_connected()
+        try:
+            # Check existence first — Mem0 delete doesn't indicate
+            # whether the entry existed.
+            existing = await asyncio.to_thread(self._client.get, str(memory_id))
+            if existing is None:
+                logger.debug(
+                    MEMORY_ENTRY_DELETED,
+                    agent_id=agent_id,
+                    memory_id=memory_id,
+                    found=False,
+                )
+                return False
+            await asyncio.to_thread(self._client.delete, str(memory_id))


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the complete get() and delete() methods cat -n src/ai_company/memory/backends/mem0/adapter.py | sed -n '364,455p'

Repository: Aureliolo/synthorg

Length of output: 3703

🏁 Script executed:

# Find the mem0_result_to_entry function to understand owner handling rg -n "mem0_result_to_entry" src/ai_company/memory/backends/mem0/ -A 15

Repository: Aureliolo/synthorg

Length of output: 6730

🏁 Script executed:

# Look for how agent_id is handled throughout this file rg -n "agent_id" src/ai_company/memory/backends/mem0/adapter.py -B 2 -A 2

Repository: Aureliolo/synthorg

Length of output: 9153

🏁 Script executed:

# Check the imports and what Mem0 client is being used head -50 src/ai_company/memory/backends/mem0/adapter.py

Repository: Aureliolo/synthorg

Length of output: 1416

🏁 Script executed:

# Get the complete mem0_result_to_entry implementation sed -n '167,200p' src/ai_company/memory/backends/mem0/mappers.py

Repository: Aureliolo/synthorg

Length of output: 1123

🏁 Script executed:

# Check if there's any validation of the returned user_id against the passed agent_id rg -n "user_id" src/ai_company/memory/backends/mem0/mappers.py -B 3 -A 3

Repository: Aureliolo/synthorg

Length of output: 394

🏁 Script executed:

# Look for the full mappers file to understand the conversion logic wc -l src/ai_company/memory/backends/mem0/mappers.py

Repository: Aureliolo/synthorg

Length of output: 112

🏁 Script executed:

# Get the broader context around mem0_result_to_entry sed -n '167,230p' src/ai_company/memory/backends/mem0/mappers.py

Repository: Aureliolo/synthorg

Length of output: 1966

🏁 Script executed:

# Check if Mem0 responses include a user_id field that could be validated rg -n "user_id" src/ai_company/memory/backends/mem0/mappers.py -B 5 -A 5 | head -60

Repository: Aureliolo/synthorg

Length of output: 531

🏁 Script executed:

# Look at retrieve() to see how it uses user_id filtering sed -n '303,362p' src/ai_company/memory/backends/mem0/adapter.py

Repository: Aureliolo/synthorg

Length of output: 2236

🏁 Script executed:

# Check the Mem0 client initialization and configuration to understand scoping sed -n '1,100p' src/ai_company/memory/backends/mem0/adapter.py | tail -60

Repository: Aureliolo/synthorg

Length of output: 1730

🏁 Script executed:

# Check if there's any configuration that might make the client scoped to a user sed -n '100,240p' src/ai_company/memory/backends/mem0/adapter.py

Repository: Aureliolo/synthorg

Length of output: 4630

🏁 Script executed:

# Look at the config to see if there's any user_id scoping at the client level cat src/ai_company/memory/backends/mem0/config.py

Repository: Aureliolo/synthorg

Length of output: 4083

🏁 Script executed:

# Double-check: look at what fields are in the Mem0 result dict by examining any tests or examples rg -n "user_id" src/ai_company/memory/backends/mem0/mappers.py -B 2 -A 2 | grep -A 10 "raw.get"

Repository: Aureliolo/synthorg

Length of output: 44

🏁 Script executed:

# Check if there's any documentation or tests showing what fields are in Mem0 results fd -e py mem0 -path "*/tests/*" | head -5

Repository: Aureliolo/synthorg

Length of output: 230

🏁 Script executed:

# Verify by looking at store() method to see if it validates the user_id in responses sed -n '245,300p' src/ai_company/memory/backends/mem0/adapter.py

Repository: Aureliolo/synthorg

Length of output: 1898

Add ownership validation to get() and delete() before returning or removing the record.

The get() and delete() methods call Mem0's ID-based operations without passing a user_id filter, unlike retrieve(), store(), and other methods that explicitly scope by agent. Additionally, mem0_result_to_entry() never validates that the fetched entry's owner matches the caller; it simply re-labels the entry with the caller's agent_id. This creates a potential cross-agent access path if Mem0 IDs are globally addressable. Retrieve the entry, validate that its owner (if returned by Mem0) matches the provided agent_id, and reject mismatches before returning or deleting the record—following the pattern already used in retract() (line 739).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 364 - 455, The get() and delete() methods currently fetch Mem0 records by ID without validating ownership and mem0_result_to_entry() relabels the entry with the caller's agent_id; update get() and delete() to retrieve the raw record, check the record's owner/user_id (as returned by Mem0) against the provided agent_id and reject mismatches before returning or deleting (follow the ownership-check pattern used in retract()); for get(), if owner mismatch log MEMORY_ENTRY_FETCH_FAILED and raise MemoryRetrievalError, and for delete() log MEMORY_ENTRY_DELETED with found=False and raise MemoryStoreError (or return False per existing semantics) when ownership does not match; also adjust mem0_result_to_entry() usage so it does not overwrite the original owner field when validating ownership.

coderabbitai · 2026-03-13T06:09:55Z

+    created_at = parse_mem0_datetime(raw.get("created_at"))
+    if created_at is None:
+        created_at = datetime.now(UTC)


⚠️ Potential issue | 🟠 Major

Don't synthesize created_at from now().

When parsing fails, Line 207 makes the row look brand-new. That skews recency ranking and since/until filters, so malformed timestamps can float to the top of retrieval results. Reject the row or use a deterministic floor timestamp instead.

One safe option

created_at = parse_mem0_datetime(raw.get("created_at")) if created_at is None: - created_at = datetime.now(UTC) + msg = f"Mem0 result {memory_id} has missing or malformed created_at" + logger.warning( + MEMORY_MODEL_INVALID, + field="created_at", + raw_value=raw.get("created_at"), + reason=msg, + ) + raise MemoryRetrievalError(msg)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 205 - 207, The current code synthesizes created_at = datetime.now(UTC) when parse_mem0_datetime(raw.get("created_at")) returns None, which improperly makes malformed rows appear brand-new; change this to either reject the row (raise a validation error/return None so the caller drops it) or assign a deterministic floor timestamp (e.g., a constant EPOCH_FLOOR like datetime(1970,1,1,tzinfo=UTC)) instead of datetime.now; update the handling in the mappers.py logic around parse_mem0_datetime and created_at so callers get a clear rejection or a stable floor value and add/declare a named constant (e.g., EPOCH_FLOOR or MIN_CREATED_AT) if choosing the deterministic floor option.

coderabbitai · 2026-03-13T06:09:55Z

+    results_list = result.get("results")
+    if not isinstance(results_list, list) or not results_list:
+        msg = f"Mem0 add returned no results for {context}"
+        logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg)
+        raise MemoryStoreError(msg)
+    first = results_list[0]
+    if "id" not in first:
+        msg = f"Mem0 add result missing 'id' for {context}: keys={list(first.keys())}"
+        logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg)
+        raise MemoryStoreError(msg)
+    return NotBlankStr(str(first["id"]))


⚠️ Potential issue | 🟠 Major

Reject blank or non-object add results explicitly.

"id" not in first only checks key presence. {"id": ""} or a non-dict first result currently falls through into NotBlankStr(...) / first.keys() and gets reported as a generic store failure instead of the intended validation error.

Suggested hardening

def validate_add_result(result: dict[str, Any], *, context: str) -> NotBlankStr: @@ - results_list = result.get("results") + results_list = result.get("results") if isinstance(result, dict) else None if not isinstance(results_list, list) or not results_list: msg = f"Mem0 add returned no results for {context}" logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) raise MemoryStoreError(msg) first = results_list[0] - if "id" not in first: - msg = f"Mem0 add result missing 'id' for {context}: keys={list(first.keys())}" + if not isinstance(first, dict): + msg = f"Mem0 add result is not an object for {context}: {type(first).__name__}" + logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) + raise MemoryStoreError(msg) + raw_id = first.get("id") + if raw_id is None or not str(raw_id).strip(): + msg = ( + f"Mem0 add result missing or blank 'id' for {context}: " + f"keys={list(first.keys())}" + ) logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) raise MemoryStoreError(msg) - return NotBlankStr(str(first["id"])) + return NotBlankStr(str(raw_id))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 332 - 342, The add-result handling in the mem0 mapper must explicitly reject non-dict or blank 'id' values: in the block using result.get("results") and the variable first, first check that results_list is a non-empty list, then assert isinstance(first, dict) before accessing keys, and verify that first.get("id") is a non-empty string (not "", None, or other types) before returning NotBlankStr(str(first["id"])); if those checks fail, call logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) and raise MemoryStoreError(msg) with a clear message so you never call first.keys() on non-dicts or wrap blank ids in NotBlankStr.

coderabbitai · 2026-03-13T06:09:55Z

+        # Should produce at least one message with memory context
+        assert len(messages) >= 1
+        # Content should include both memories (they pass min_relevance)
+        combined = " ".join(m.content for m in messages if m.content)
+        assert "concise responses" in combined


⚠️ Potential issue | 🟡 Minor

Assert the second injected memory too.

Line 158 says both memories should survive the relevance gate, but the test only checks the first one. A regression that drops the episodic memory would still pass.

Suggested assertion

combined = " ".join(m.content for m in messages if m.content) assert "concise responses" in combined + assert "last task was code review" in combined

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/integration/memory/test_mem0_backend.py` around lines 156 - 160, The test only asserts one expected memory phrase; update the assertions to verify both injected memories survive relevance filtering by checking for the second expected substring in the combined text (use the existing messages/combined variables). For example, after the existing assert "concise responses" in combined, add an assertion that the second memory phrase (e.g., "episodic memory" or the exact second memory content used in the test setup) is present in combined so the test fails if the episodic memory is dropped.

coderabbitai · 2026-03-13T06:09:55Z

+@pytest.mark.unit
+class TestConnectionGuard:
+    async def test_store_raises_when_disconnected(
+        self,
+        mem0_config: Mem0BackendConfig,
+    ) -> None:
+        b = Mem0MemoryBackend(mem0_config=mem0_config)
+        with pytest.raises(MemoryConnectionError, match="Not connected"):
+            await b.store("test-agent-001", _make_store_request())
+
+    async def test_retrieve_raises_when_disconnected(
+        self,
+        mem0_config: Mem0BackendConfig,
+    ) -> None:
+        b = Mem0MemoryBackend(mem0_config=mem0_config)
+        with pytest.raises(MemoryConnectionError, match="Not connected"):
+            await b.retrieve("test-agent-001", MemoryQuery(text="test"))
+
+    async def test_get_raises_when_disconnected(
+        self,
+        mem0_config: Mem0BackendConfig,
+    ) -> None:
+        b = Mem0MemoryBackend(mem0_config=mem0_config)
+        with pytest.raises(MemoryConnectionError, match="Not connected"):
+            await b.get("test-agent-001", "mem-001")
+
+    async def test_delete_raises_when_disconnected(
+        self,
+        mem0_config: Mem0BackendConfig,
+    ) -> None:
+        b = Mem0MemoryBackend(mem0_config=mem0_config)
+        with pytest.raises(MemoryConnectionError, match="Not connected"):
+            await b.delete("test-agent-001", "mem-001")
+
+    async def test_count_raises_when_disconnected(
+        self,
+        mem0_config: Mem0BackendConfig,
+    ) -> None:
+        b = Mem0MemoryBackend(mem0_config=mem0_config)
+        with pytest.raises(MemoryConnectionError, match="Not connected"):
+            await b.count("test-agent-001")
+
+    async def test_publish_raises_when_disconnected(
+        self,
+        mem0_config: Mem0BackendConfig,
+    ) -> None:
+        b = Mem0MemoryBackend(mem0_config=mem0_config)
+        with pytest.raises(MemoryConnectionError, match="Not connected"):
+            await b.publish("test-agent-001", _make_store_request())
+
+    async def test_search_shared_raises_when_disconnected(
+        self,
+        mem0_config: Mem0BackendConfig,
+    ) -> None:
+        b = Mem0MemoryBackend(mem0_config=mem0_config)
+        with pytest.raises(MemoryConnectionError, match="Not connected"):
+            await b.search_shared(MemoryQuery(text="test"))
+
+    async def test_retract_raises_when_disconnected(
+        self,
+        mem0_config: Mem0BackendConfig,
+    ) -> None:
+        b = Mem0MemoryBackend(mem0_config=mem0_config)
+        with pytest.raises(MemoryConnectionError, match="Not connected"):
+            await b.retract("test-agent-001", "mem-001")
+


🧹 Nitpick | 🔵 Trivial

Parametrize the disconnected-guard matrix.

These eight tests differ only by the invoked method and the expected MemoryConnectionError. A single parametrized table will keep this surface much easier to extend and audit. As per coding guidelines, "Prefer @pytest.mark.parametrize for testing similar cases".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/memory/backends/mem0/test_adapter.py` around lines 308 - 373, Replace the eight nearly-identical tests in TestConnectionGuard with a single parametrized test using `@pytest.mark.parametrize` that lists the target method names (store, retrieve, get, delete, count, publish, search_shared, retract) and their respective argument tuples (e.g., ("test-agent-001", _make_store_request()), ( "test-agent-001", MemoryQuery(text="test")), ("test-agent-001", "mem-001"), ("test-agent-001",), etc.), then inside the test instantiate Mem0MemoryBackend(mem0_config=mem0_config), use getattr(b, method_name) to obtain the coroutine, and assert it raises MemoryConnectionError with match "Not connected" when awaited; keep references to Mem0MemoryBackend, MemoryConnectionError, _make_store_request, and MemoryQuery to locate the code.

coderabbitai

♻️ Duplicate comments (10)

src/ai_company/memory/backends/mem0/mappers.py (5)

103-164: ⚠️ Potential issue | 🟠 Major

Treat non-dict metadata as absent.

The function checks if not raw_metadata: (line 114) but doesn't verify it's actually a dict. A truthy malformed payload like metadata=[] or metadata="oops" would raise AttributeError when calling .get() on line 121 and subsequent lines. Add an isinstance(raw_metadata, dict) check.

🛡️ Suggested fix

 def parse_mem0_metadata(
     raw_metadata: dict[str, Any] | None,
 ) -> tuple[MemoryCategory, MemoryMetadata, AwareDatetime | None]:
-    if not raw_metadata:
+    if not raw_metadata or not isinstance(raw_metadata, dict):
         return (
             MemoryCategory.WORKING,
             MemoryMetadata(),
             None,
         )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 103 - 164, The
function parse_mem0_metadata currently only checks truthiness of raw_metadata
and will raise when raw_metadata is a non-dict (e.g., list or string); update
the initial guard to treat any non-dict as absent by checking
isinstance(raw_metadata, dict) and returning the default
(MemoryCategory.WORKING, MemoryMetadata(), None) when it's not a dict or is
falsy, so all subsequent calls like raw_metadata.get(...) are safe.

337-346: ⚠️ Potential issue | 🟡 Minor

Validate that first is a dict before calling .get().

If results_list[0] is not a dict (e.g., a string or number), line 338 first.get("id") will raise AttributeError. Add a type check for robustness.

🛡️ Suggested fix

     first = results_list[0]
+    if not isinstance(first, dict):
+        msg = f"Mem0 add result is not an object for {context}: {type(first).__name__}"
+        logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg)
+        raise MemoryStoreError(msg)
     raw_id = first.get("id")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 337 - 346, The
code assumes results_list[0] is a dict and calls first.get("id") which can raise
AttributeError for non-dict items; update the block in the mapper where
results_list and first are used to first verify type: check isinstance(first,
dict) (or mapping) before accessing .get, and if not a dict log the same
MEMORY_ENTRY_STORE_FAILED via logger with context and an explanatory msg and
raise MemoryStoreError; keep the existing behavior of returning
NotBlankStr(str(raw_id)) only after successful type and non-blank id validation
so callers of the function (and symbols results_list, first, logger,
MEMORY_ENTRY_STORE_FAILED, MemoryStoreError, NotBlankStr) are protected from
type errors.

349-371: ⚠️ Potential issue | 🟠 Major

Same non-dict metadata issue in extract_category().

Line 355 uses raw.get("metadata", {}) which is safe for missing keys, but if metadata exists as a non-dict value (e.g., "oops"), line 358 metadata.get(...) will fail. Apply the same isinstance guard.

🛡️ Suggested fix

 def extract_category(raw: dict[str, Any]) -> MemoryCategory:
     metadata = raw.get("metadata", {})
-    if not metadata:
+    if not metadata or not isinstance(metadata, dict):
         return MemoryCategory.WORKING

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 349 - 371, The
extract_category function currently assumes metadata is a dict; guard metadata
with isinstance(metadata, dict) after metadata = raw.get("metadata", {}) and
treat non-dict values as missing (return MemoryCategory.WORKING), mirroring the
other mapper fixes; optionally log via MEMORY_MODEL_INVALID with
field="metadata" and raw_value=metadata before returning default; keep existing
handling for recognized/unrecognized category using _PREFIX and MemoryCategory.

374-380: ⚠️ Potential issue | 🟠 Major

Same non-dict metadata issue in extract_publisher().

Apply the isinstance(metadata, dict) guard here as well for consistency.

🛡️ Suggested fix

 def extract_publisher(raw: dict[str, Any]) -> str | None:
     metadata = raw.get("metadata", {})
-    if not metadata:
+    if not metadata or not isinstance(metadata, dict):
         return None

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 374 - 380, The
extract_publisher function assumes raw["metadata"] is a dict and can call .get;
add the same guard used elsewhere by checking isinstance(metadata, dict) after
obtaining metadata and return None if it's not a dict so you don't call .get on
non-dict types; update extract_publisher to only call
metadata.get(_PUBLISHER_KEY) when metadata is a dict and otherwise return None.

205-207: ⚠️ Potential issue | 🟠 Major

Consider rejecting rows with missing created_at instead of synthesizing now().

When parse_mem0_datetime returns None, the code defaults to datetime.now(UTC) (line 207). This makes malformed rows appear brand-new, which skews recency ranking and since/until filters. Consider raising MemoryRetrievalError or using a deterministic floor timestamp instead.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 205 - 207, The
code currently substitutes datetime.now(UTC) when
parse_mem0_datetime(raw.get("created_at")) returns None, which makes malformed
rows appear new; instead, modify the handling in the mapper (around created_at
assignment) to reject such rows deterministically: if parse_mem0_datetime(...)
returns None, raise MemoryRetrievalError with a clear message including the
offending raw row id/metadata (or, if a deterministic fallback is preferred, set
created_at to a defined constant floor timestamp like MEM0_MIN_TIMESTAMP), and
ensure callers of the mapper can surface or handle MemoryRetrievalError
appropriately; update any tests or error handling paths that expect the old
behavior.

tests/unit/memory/backends/mem0/test_adapter.py (1)

308-373: 🧹 Nitpick | 🔵 Trivial

Consider parametrizing the disconnected-guard tests.

These eight tests share identical structure and could be consolidated into a single parametrized test. This would make the test suite more maintainable and easier to extend when new methods are added.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unit/memory/backends/mem0/test_adapter.py` around lines 308 - 373,
Consolidate the eight identical disconnected-guard tests into one parametrized
pytest test: create a single async test that instantiates
Mem0MemoryBackend(mem0_config=mem0_config) and use pytest.mark.parametrize to
pass tuples of (method_name, args_tuple) for "store", "retrieve", "get",
"delete", "count", "publish", "search_shared", and "retract" (use
_make_store_request() or MemoryQuery(text="test") where appropriate); inside the
test use getattr(b, method_name)(*args) awaited and assert it raises
MemoryConnectionError with match "Not connected". Ensure test references
Mem0MemoryBackend, MemoryConnectionError, _make_store_request, and MemoryQuery
so readers can locate the original methods.

src/ai_company/memory/backends/mem0/adapter.py (4)

644-669: ⚠️ Potential issue | 🟠 Major

Same post-filter limitation applies to search_shared().

The search_shared() method also fetches query.limit items (lines 649, 655) and then applies filters plus exclude_agent (lines 666-669), which may result in fewer items than requested.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 644 - 669,
search_shared() currently requests exactly query.limit items from
self._client.search / get_all then applies apply_post_filters and exclude_agent,
which can produce fewer results than requested; fix by over-fetching and
trimming: call self._client.search / get_all with a larger limit (e.g.,
min(max_overfetch, query.limit * X) or loop/paginate requesting additional
pages) until you have at least query.limit post-filtered entries or no more raw
results, then convert via mem0_result_to_entry, apply apply_post_filters and the
exclude_agent filter and finally trim to query.limit before returning; update
the calls to self._client.search and self._client.get_all and the
post-processing around raw_result/raw_list to implement this
over-fetch/pagination strategy.
364-422: ⚠️ Potential issue | 🟠 Major

Add ownership validation in get() before returning the entry.

The get() method fetches by memory ID without validating that the record belongs to the provided agent_id. Unlike retrieve() which scopes by user_id, and retract() which validates publisher ownership, get() could return entries belonging to other agents if Mem0 IDs are globally addressable. Consider validating ownership before returning.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 364 - 422, get()
currently returns an entry fetched by memory_id without checking it belongs to
the given agent_id; after converting raw -> entry via mem0_result_to_entry (or
immediately after that call), validate the entry's owner (e.g.,
entry.publisher_id, entry.agent_id or the appropriate ownership field on
MemoryEntry) against the provided agent_id, and if it does not match log a
warning (similar to MEMORY_ENTRY_FETCH_FAILED/MEMORY_ENTRY_FETCHED pattern) and
return None (or raise a MemoryRetrievalError if your codebase prefers); ensure
the ownership check happens before the final debug log and before returning the
entry so entries owned by other agents are not exposed.
326-336: ⚠️ Potential issue | 🟠 Major

Post-filters run after an already-capped fetch.

The retrieve logic fetches only query.limit items from Mem0 (lines 327-331), then applies local filters (line 336). If many items are filtered out, the result set may be smaller than query.limit even when more eligible memories exist. Consider over-fetching or paginating until the filtered set reaches query.limit.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 326 - 336, The
current retrieve flow uses query_to_mem0_search_args / query_to_mem0_getall_args
to fetch only query.limit items from self._client.search / get_all, then runs
apply_post_filters (via mem0_result_to_entry) which can reduce results below
query.limit; change the logic to over-fetch or paginate: call the client in a
loop (using offset/next token or increasing page size in kwargs) accumulating
raw_results -> mem0_result_to_entry conversions and applying post-filters
incrementally until the filtered entries length >= query.limit or no more
results are returned, deduplicate entries as you accumulate, and then trim to
query.limit before returning. Ensure the loop uses the same kwargs shape
produced by query_to_mem0_search_args / query_to_mem0_getall_args and preserves
existing behavior when there are fewer overall matches.
424-484: ⚠️ Potential issue | 🟠 Major

Add ownership validation in delete() before deleting the entry.

Similar to get(), the delete() method fetches by memory ID without validating ownership. This could allow one agent to delete another agent's memories if IDs are globally addressable. The retract() method (line 746) demonstrates the ownership check pattern that should be applied here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 424 - 484, The
delete() method fetches the entry but doesn't verify that the fetched entry
belongs to the requesting agent; mirror the ownership check used in retract()
and get(): after retrieving existing = await asyncio.to_thread(self._client.get,
str(memory_id)) verify the entry's owner field (use the same attribute used in
retract(), e.g., existing.agent_id or existing.owner_id) matches the passed
agent_id, and if it does not, log the deletion as not-found/forbidden (reuse
MEMORY_ENTRY_DELETED with found=False or MEMORY_ENTRY_DELETE_FAILED as
appropriate) and return False without calling self._client.delete; keep existing
exception handling and asyncio.to_thread usage for delete when ownership
matches.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/ai_company/memory/backends/mem0/adapter.py`:
- Around line 644-669: search_shared() currently requests exactly query.limit
items from self._client.search / get_all then applies apply_post_filters and
exclude_agent, which can produce fewer results than requested; fix by
over-fetching and trimming: call self._client.search / get_all with a larger
limit (e.g., min(max_overfetch, query.limit * X) or loop/paginate requesting
additional pages) until you have at least query.limit post-filtered entries or
no more raw results, then convert via mem0_result_to_entry, apply
apply_post_filters and the exclude_agent filter and finally trim to query.limit
before returning; update the calls to self._client.search and
self._client.get_all and the post-processing around raw_result/raw_list to
implement this over-fetch/pagination strategy.
- Around line 364-422: get() currently returns an entry fetched by memory_id
without checking it belongs to the given agent_id; after converting raw -> entry
via mem0_result_to_entry (or immediately after that call), validate the entry's
owner (e.g., entry.publisher_id, entry.agent_id or the appropriate ownership
field on MemoryEntry) against the provided agent_id, and if it does not match
log a warning (similar to MEMORY_ENTRY_FETCH_FAILED/MEMORY_ENTRY_FETCHED
pattern) and return None (or raise a MemoryRetrievalError if your codebase
prefers); ensure the ownership check happens before the final debug log and
before returning the entry so entries owned by other agents are not exposed.
- Around line 326-336: The current retrieve flow uses query_to_mem0_search_args
/ query_to_mem0_getall_args to fetch only query.limit items from
self._client.search / get_all, then runs apply_post_filters (via
mem0_result_to_entry) which can reduce results below query.limit; change the
logic to over-fetch or paginate: call the client in a loop (using offset/next
token or increasing page size in kwargs) accumulating raw_results ->
mem0_result_to_entry conversions and applying post-filters incrementally until
the filtered entries length >= query.limit or no more results are returned,
deduplicate entries as you accumulate, and then trim to query.limit before
returning. Ensure the loop uses the same kwargs shape produced by
query_to_mem0_search_args / query_to_mem0_getall_args and preserves existing
behavior when there are fewer overall matches.
- Around line 424-484: The delete() method fetches the entry but doesn't verify
that the fetched entry belongs to the requesting agent; mirror the ownership
check used in retract() and get(): after retrieving existing = await
asyncio.to_thread(self._client.get, str(memory_id)) verify the entry's owner
field (use the same attribute used in retract(), e.g., existing.agent_id or
existing.owner_id) matches the passed agent_id, and if it does not, log the
deletion as not-found/forbidden (reuse MEMORY_ENTRY_DELETED with found=False or
MEMORY_ENTRY_DELETE_FAILED as appropriate) and return False without calling
self._client.delete; keep existing exception handling and asyncio.to_thread
usage for delete when ownership matches.

In `@src/ai_company/memory/backends/mem0/mappers.py`:
- Around line 103-164: The function parse_mem0_metadata currently only checks
truthiness of raw_metadata and will raise when raw_metadata is a non-dict (e.g.,
list or string); update the initial guard to treat any non-dict as absent by
checking isinstance(raw_metadata, dict) and returning the default
(MemoryCategory.WORKING, MemoryMetadata(), None) when it's not a dict or is
falsy, so all subsequent calls like raw_metadata.get(...) are safe.
- Around line 337-346: The code assumes results_list[0] is a dict and calls
first.get("id") which can raise AttributeError for non-dict items; update the
block in the mapper where results_list and first are used to first verify type:
check isinstance(first, dict) (or mapping) before accessing .get, and if not a
dict log the same MEMORY_ENTRY_STORE_FAILED via logger with context and an
explanatory msg and raise MemoryStoreError; keep the existing behavior of
returning NotBlankStr(str(raw_id)) only after successful type and non-blank id
validation so callers of the function (and symbols results_list, first, logger,
MEMORY_ENTRY_STORE_FAILED, MemoryStoreError, NotBlankStr) are protected from
type errors.
- Around line 349-371: The extract_category function currently assumes metadata
is a dict; guard metadata with isinstance(metadata, dict) after metadata =
raw.get("metadata", {}) and treat non-dict values as missing (return
MemoryCategory.WORKING), mirroring the other mapper fixes; optionally log via
MEMORY_MODEL_INVALID with field="metadata" and raw_value=metadata before
returning default; keep existing handling for recognized/unrecognized category
using _PREFIX and MemoryCategory.
- Around line 374-380: The extract_publisher function assumes raw["metadata"] is
a dict and can call .get; add the same guard used elsewhere by checking
isinstance(metadata, dict) after obtaining metadata and return None if it's not
a dict so you don't call .get on non-dict types; update extract_publisher to
only call metadata.get(_PUBLISHER_KEY) when metadata is a dict and otherwise
return None.
- Around line 205-207: The code currently substitutes datetime.now(UTC) when
parse_mem0_datetime(raw.get("created_at")) returns None, which makes malformed
rows appear new; instead, modify the handling in the mapper (around created_at
assignment) to reject such rows deterministically: if parse_mem0_datetime(...)
returns None, raise MemoryRetrievalError with a clear message including the
offending raw row id/metadata (or, if a deterministic fallback is preferred, set
created_at to a defined constant floor timestamp like MEM0_MIN_TIMESTAMP), and
ensure callers of the mapper can surface or handle MemoryRetrievalError
appropriately; update any tests or error handling paths that expect the old
behavior.

In `@tests/unit/memory/backends/mem0/test_adapter.py`:
- Around line 308-373: Consolidate the eight identical disconnected-guard tests
into one parametrized pytest test: create a single async test that instantiates
Mem0MemoryBackend(mem0_config=mem0_config) and use pytest.mark.parametrize to
pass tuples of (method_name, args_tuple) for "store", "retrieve", "get",
"delete", "count", "publish", "search_shared", and "retract" (use
_make_store_request() or MemoryQuery(text="test") where appropriate); inside the
test use getattr(b, method_name)(*args) awaited and assert it raises
MemoryConnectionError with match "Not connected". Ensure test references
Mem0MemoryBackend, MemoryConnectionError, _make_store_request, and MemoryQuery
so readers can locate the original methods.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 30dde13f-7c5f-4805-8160-22b084477d2d

📥 Commits

Reviewing files that changed from the base of the PR and between 8ea375f and dc77bcb.

📒 Files selected for processing (7)

.github/workflows/dependency-review.yml
src/ai_company/memory/backends/mem0/adapter.py
src/ai_company/memory/backends/mem0/mappers.py
src/ai_company/memory/factory.py
src/ai_company/observability/events/memory.py
tests/unit/memory/backends/mem0/test_adapter.py
tests/unit/memory/backends/mem0/test_mappers.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Agent
GitHub Check: Greptile Review
GitHub Check: Test (Python 3.14)

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations
Use except A, B: syntax (no parentheses) — ruff enforces PEP 758 except syntax on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use @computed_field for derived values instead of storing + validating redundant fields. Use NotBlankStr for all identifier/name fields — including optional and tuple variants — instead of manual whitespace validators.
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task. Existing code is being migrated incrementally.
Line length: 88 characters (ruff)
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)

Files:

src/ai_company/observability/events/memory.py
src/ai_company/memory/factory.py
src/ai_company/memory/backends/mem0/adapter.py
tests/unit/memory/backends/mem0/test_adapter.py
tests/unit/memory/backends/mem0/test_mappers.py
src/ai_company/memory/backends/mem0/mappers.py

src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code
Variable name: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under ai_company.observability.events (e.g. PROVIDER_CALL_START from events.provider). Import directly: from ai_company.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs in logging: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG for object creation, internal flow, entry/exit of key functions
Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Tests must use test-provider, test-small-001, etc.

Files:

src/ai_company/observability/events/memory.py
src/ai_company/memory/factory.py
src/ai_company/memory/backends/mem0/adapter.py
src/ai_company/memory/backends/mem0/mappers.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Tests: markers @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Coverage: 80% minimum (enforced in CI)
Async: asyncio_mode = "auto" — no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test
Parallelism: pytest-xdist via -n auto — ALWAYS include -n auto when running pytest, never run tests sequentially
Prefer @pytest.mark.parametrize for testing similar cases
Vendor-agnostic everywhere: Tests must use test-provider, test-small-001, etc. (Vendor names may only appear in operations design page, .claude/ skill/agent files, or third-party import paths)

Files:

tests/unit/memory/backends/mem0/test_adapter.py
tests/unit/memory/backends/mem0/test_mappers.py

🧠 Learnings (4)

📚 Learning: 2026-03-13T06:12:19.113Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T06:12:19.113Z
Learning: Applies to src/ai_company/**/*.py : Event names: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

src/ai_company/observability/events/memory.py

📚 Learning: 2026-03-13T06:12:19.113Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T06:12:19.113Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR with context before raising

Applied to files:

src/ai_company/memory/backends/mem0/adapter.py

📚 Learning: 2026-03-13T06:12:19.113Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T06:12:19.113Z
Learning: Applies to src/ai_company/providers/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry. `RetryExhaustedError` signals that all retries failed — the engine layer catches this to trigger fallback chains.

Applied to files:

src/ai_company/memory/backends/mem0/adapter.py

📚 Learning: 2026-03-13T06:12:19.113Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T06:12:19.113Z
Learning: Applies to tests/**/*.py : Prefer `pytest.mark.parametrize` for testing similar cases

Applied to files:

tests/unit/memory/backends/mem0/test_adapter.py

🧬 Code graph analysis (2)

src/ai_company/memory/backends/mem0/adapter.py (3)

src/ai_company/core/enums.py (1)

MemoryCategory (101-108)

src/ai_company/memory/errors.py (4)

MemoryConnectionError (17-18)

MemoryRetrievalError (25-26)

MemoryStoreError (21-22)

MemoryError (13-14)

tests/integration/memory/test_mem0_backend.py (1)

backend (42-51)

tests/unit/memory/backends/mem0/test_mappers.py (4)

src/ai_company/core/enums.py (1)

MemoryCategory (101-108)

src/ai_company/memory/backends/mem0/mappers.py (11)

apply_post_filters (280-312)

build_mem0_metadata (38-57)

extract_category (349-371)

extract_publisher (374-380)

mem0_result_to_entry (167-226)

normalize_relevance_score (89-100)

parse_mem0_datetime (60-86)

parse_mem0_metadata (103-164)

query_to_mem0_getall_args (261-277)

query_to_mem0_search_args (229-258)

validate_add_result (318-346)

src/ai_company/memory/errors.py (2)

MemoryRetrievalError (25-26)

MemoryStoreError (21-22)

src/ai_company/memory/models.py (4)

MemoryEntry (82-150)

MemoryMetadata (20-52)

MemoryQuery (153-230)

MemoryStoreRequest (55-79)

🔇 Additional comments (10)

src/ai_company/observability/events/memory.py (1)

22-22: LGTM!

The new event constant follows the established memory.<entity>.<action> naming convention and is properly placed in the Backend lifecycle section. This addresses the past review feedback about using a more specific event for embedder validation failures.

.github/workflows/dependency-review.yml (1)

32-51: LGTM!

License allow-list updates are well-documented with manual verification notes. The added packages (mem0ai, numpy, qdrant-client, posthog) and licenses (LicenseRef-scancode-protobuf, ZPL-2.1) are all permissive and properly justified in comments.

src/ai_company/memory/factory.py (1)

48-93: LGTM!

The Mem0 backend creation logic properly validates the embedder configuration with clear error messages and structured logging. The switch to MEMORY_BACKEND_CONFIG_INVALID event (addressing the past review feedback) provides better semantic clarity for configuration errors versus unknown backends.

tests/unit/memory/backends/mem0/test_mappers.py (1)

1-491: LGTM!

Comprehensive test suite covering all mapper functions with good edge case coverage. Tests follow vendor-agnostic naming conventions (test-agent-001, test-provider) and include proper markers. The test structure clearly maps to each public mapper function.

tests/unit/memory/backends/mem0/test_adapter.py (1)

1-1117: Comprehensive test coverage for the Mem0 adapter.

The test suite thoroughly exercises the adapter's lifecycle, CRUD operations, shared knowledge flows, and error handling paths. Good coverage of edge cases including MemoryError re-raising and validation failures.

src/ai_company/memory/backends/mem0/adapter.py (2)

497-501: Documentation of count limitation is helpful.

The docstring now explicitly documents that results are capped at max_memories_per_agent and may underestimate. This addresses the past review concern by making the limitation transparent to callers.

699-789: retract() correctly validates publisher ownership.

Good implementation of the ownership check pattern (lines 732-758) that should serve as the model for get() and delete().

src/ai_company/memory/backends/mem0/mappers.py (3)

38-57: build_mem0_metadata is well-structured.

Clean serialization logic with proper prefixing and conditional field inclusion.

60-86: parse_mem0_datetime handles edge cases properly.

Good defensive handling of None/empty input, malformed strings, and naive datetime conversion to UTC with appropriate logging.

280-312: apply_post_filters logic is correct.

The filtering logic properly handles all query parameters while preserving order.

Copilot

Pull request overview

Implements and hardens a concrete Mem0-backed MemoryBackend implementation, making Mem0 configuration vendor-agnostic (explicit provider/model) while adding robust adapter behavior and extensive test coverage. This fits into the memory subsystem by providing the first real backend wired through the factory and validated via unit + integration tests.

Changes:

Add Mem0 backend implementation (adapter.py, mappers.py, config.py) plus factory wiring and exports.
Harden parsing/validation paths (IDs/content/categories/datetimes), add health probes, and improve structured logging/events.
Add Mem0 dependency + license-review exceptions, update docs/README, and add comprehensive unit/integration tests.

Reviewed changes

Copilot reviewed 20 out of 23 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
uv.lock	Locks new transitive dependencies introduced by `mem0ai` and related packages.
pyproject.toml	Adds `mem0ai==1.0.5` and a mypy ignore override for `mem0.*`.
.github/workflows/dependency-review.yml	Updates dependency-review allowlists for new license metadata cases and missing SPDX metadata.
docs/roadmap/index.md	Removes “Mem0 adapter” from remaining roadmap items (now implemented).
docs/design/memory.md	Updates design doc to reflect Mem0 as implemented.
README.md	Updates status section to include Mem0 backend adapter as implemented.
CLAUDE.md	Updates repo structure notes and mentions `mem0ai` dependency.
src/ai_company/observability/events/memory.py	Adds `MEMORY_BACKEND_CONFIG_INVALID` event constant for factory validation logging.
src/ai_company/memory/factory.py	Implements backend creation for `mem0`, requires embedder config, adds structured logging.
src/ai_company/memory/backends/mem0/mappers.py	Adds Mem0 ↔ domain mapping helpers, defensive parsing, post-filters, and add-result validation.
src/ai_company/memory/backends/mem0/config.py	Introduces Mem0-specific config models and dict builder; adds traversal validation; embedder config now required.
src/ai_company/memory/backends/mem0/adapter.py	Adds Mem0 adapter implementing CRUD + shared knowledge operations with health probes and defensive error handling.
src/ai_company/memory/backends/mem0/init.py	Re-exports Mem0 backend/config types.
src/ai_company/memory/backends/init.py	Exposes concrete backends package exports.
src/ai_company/memory/init.py	Re-exports `Mem0MemoryBackend` and `Mem0EmbedderConfig` from `ai_company.memory`.
tests/unit/memory/test_init.py	Updates export expectations to include Mem0 symbols.
tests/unit/memory/test_factory.py	Updates factory tests for real Mem0 backend creation + embedder requirement validation.
tests/unit/memory/backends/mem0/test_mappers.py	Adds unit coverage for mapper functions and edge cases.
tests/unit/memory/backends/mem0/test_config.py	Adds unit coverage for Mem0 config validation and config dict building.
tests/unit/memory/backends/mem0/test_adapter.py	Adds unit coverage for adapter lifecycle/CRUD/shared operations and error paths.
tests/integration/memory/test_mem0_backend.py	Adds integration tests validating adapter behavior through retrieval pipeline using a mocked client.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    if not raw_metadata:
+        return (
+            MemoryCategory.WORKING,
+            MemoryMetadata(),
+            None,
+        )
+
+    category_str = raw_metadata.get(f"{_PREFIX}category")
+    if category_str:
+        try:


+    metadata = raw.get("metadata", {})
+    if not metadata:
+        return MemoryCategory.WORKING
+    cat_str = metadata.get(f"{_PREFIX}category")
+    if cat_str:
+        try:
+            return MemoryCategory(cat_str)


+def extract_publisher(raw: dict[str, Any]) -> str | None:
+    """Extract the publisher agent ID from a shared memory dict."""
+    metadata = raw.get("metadata", {})
+    if not metadata:
+        return None
+    value: str | None = metadata.get(_PUBLISHER_KEY)
+    return value


coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

src/ai_company/memory/backends/mem0/mappers.py (1)

343-352: ⚠️ Potential issue | 🟡 Minor

Add explicit type check for first before accessing .get().

If results_list[0] is not a dict (e.g., a string or number), first.get("id") raises AttributeError instead of the expected MemoryStoreError. This was flagged in a prior review.

🛡️ Proposed fix

     first = results_list[0]
+    if not isinstance(first, dict):
+        msg = f"Mem0 add result is not an object for {context}: {type(first).__name__}"
+        logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg)
+        raise MemoryStoreError(msg)
     raw_id = first.get("id")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 343 - 352, The
code assumes results_list[0] is a dict and calls first.get("id"), which raises
AttributeError for non-dict items; update the extraction logic in the block that
handles results_list (referencing results_list, first, and the return of
NotBlankStr) to first check the type of first (e.g., isinstance(first, dict))
before calling .get(), and if it is not a dict construct the same warning and
raise MemoryStoreError (using logger with MEMORY_ENTRY_STORE_FAILED, context,
error=...) with a clear message about unexpected result type and keys/contents;
keep the existing behavior for missing/blank id and return
NotBlankStr(str(raw_id)) when valid.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/memory/backends/mem0/test_mappers.py`:
- Around line 394-439: Add a unit test in the TestValidateAddResult class to
cover the case where results[0] is not a dict: create a method named
test_non_dict_first_result_raises that calls validate_add_result({"results":
["not-a-dict"]}, context="test") and asserts it raises MemoryStoreError with a
match like "not an object"; this will expose the AttributeError and ensure
validate_add_result properly validates the first result element.

---

Duplicate comments:
In `@src/ai_company/memory/backends/mem0/mappers.py`:
- Around line 343-352: The code assumes results_list[0] is a dict and calls
first.get("id"), which raises AttributeError for non-dict items; update the
extraction logic in the block that handles results_list (referencing
results_list, first, and the return of NotBlankStr) to first check the type of
first (e.g., isinstance(first, dict)) before calling .get(), and if it is not a
dict construct the same warning and raise MemoryStoreError (using logger with
MEMORY_ENTRY_STORE_FAILED, context, error=...) with a clear message about
unexpected result type and keys/contents; keep the existing behavior for
missing/blank id and return NotBlankStr(str(raw_id)) when valid.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 144746d0-6fd8-48b0-85ec-3d6e4c606065

📥 Commits

Reviewing files that changed from the base of the PR and between dc77bcb and 8f6638e.

📒 Files selected for processing (3)

src/ai_company/memory/backends/mem0/mappers.py
tests/integration/memory/test_mem0_backend.py
tests/unit/memory/backends/mem0/test_mappers.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Greptile Review
GitHub Check: Test (Python 3.14)

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations
Use except A, B: syntax (no parentheses) — ruff enforces PEP 758 except syntax on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use @computed_field for derived values instead of storing + validating redundant fields. Use NotBlankStr for all identifier/name fields — including optional and tuple variants — instead of manual whitespace validators.
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task. Existing code is being migrated incrementally.
Line length: 88 characters (ruff)
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)

Files:

tests/unit/memory/backends/mem0/test_mappers.py
src/ai_company/memory/backends/mem0/mappers.py
tests/integration/memory/test_mem0_backend.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Tests: markers @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Coverage: 80% minimum (enforced in CI)
Async: asyncio_mode = "auto" — no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test
Parallelism: pytest-xdist via -n auto — ALWAYS include -n auto when running pytest, never run tests sequentially
Prefer @pytest.mark.parametrize for testing similar cases
Vendor-agnostic everywhere: Tests must use test-provider, test-small-001, etc. (Vendor names may only appear in operations design page, .claude/ skill/agent files, or third-party import paths)

Files:

tests/unit/memory/backends/mem0/test_mappers.py
tests/integration/memory/test_mem0_backend.py

src/ai_company/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/ai_company/**/*.py: Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code
Variable name: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under ai_company.observability.events (e.g. PROVIDER_CALL_START from events.provider). Import directly: from ai_company.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs in logging: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG for object creation, internal flow, entry/exit of key functions
Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Tests must use test-provider, test-small-001, etc.

Files:

src/ai_company/memory/backends/mem0/mappers.py

🧬 Code graph analysis (3)

tests/unit/memory/backends/mem0/test_mappers.py (1)

src/ai_company/memory/backends/mem0/mappers.py (11)

apply_post_filters (286-318)

build_mem0_metadata (38-57)

extract_category (355-377)

extract_publisher (380-386)

mem0_result_to_entry (167-232)

normalize_relevance_score (89-100)

parse_mem0_datetime (60-86)

parse_mem0_metadata (103-164)

query_to_mem0_getall_args (267-283)

query_to_mem0_search_args (235-264)

validate_add_result (324-352)

src/ai_company/memory/backends/mem0/mappers.py (5)

src/ai_company/core/enums.py (1)

MemoryCategory (101-108)

src/ai_company/memory/errors.py (2)

MemoryRetrievalError (25-26)

MemoryStoreError (21-22)

src/ai_company/memory/models.py (4)

MemoryEntry (82-150)

MemoryMetadata (20-52)

MemoryQuery (153-230)

MemoryStoreRequest (55-79)

src/ai_company/observability/_logger.py (1)

get_logger (8-28)

src/ai_company/memory/backends/mem0/adapter.py (1)

get (364-422)

tests/integration/memory/test_mem0_backend.py (5)

src/ai_company/memory/backends/mem0/adapter.py (9)

Mem0MemoryBackend (77-789)

store (245-301)

retrieve (303-362)

publish (559-621)

search_shared (623-697)

get (364-422)

delete (424-484)

retract (699-789)

count (486-555)

src/ai_company/memory/backends/mem0/config.py (2)

Mem0BackendConfig (46-78)

Mem0EmbedderConfig (17-43)

src/ai_company/memory/models.py (2)

MemoryQuery (153-230)

MemoryStoreRequest (55-79)

src/ai_company/memory/retrieval_config.py (1)

MemoryRetrievalConfig (20-132)

src/ai_company/memory/retriever.py (1)

ContextInjectionStrategy (97-400)

🔇 Additional comments (28)

src/ai_company/memory/backends/mem0/mappers.py (11)

1-29: LGTM!

Module setup follows guidelines: proper docstring, no __future__ imports, correct logger initialization with get_logger(__name__), and appropriate imports.

38-57: LGTM!

Clean metadata serialization with proper type hints, docstring, and immutable construction pattern.

60-86: LGTM!

Robust datetime parsing with proper exception handling using PEP 758 syntax (except ValueError, TypeError:), appropriate fallback to None on malformed input, and UTC assignment for naive datetimes.

89-100: LGTM!

Simple, correct clamping implementation that preserves None values.

103-164: LGTM!

Robust metadata deserialization that properly handles non-dict metadata types (Line 114), invalid categories, non-numeric confidence values, and various tag formats. The past review concern about non-object metadata is addressed.

205-213: Defaulting created_at to now() may skew recency ranking.

When created_at is missing or unparseable, using datetime.now(UTC) makes the entry appear brand-new. This can affect since/until filters and recency-weighted ranking, potentially surfacing malformed records at the top of results.

Consider either rejecting entries with missing timestamps (raise MemoryRetrievalError) or using a deterministic floor timestamp (e.g., datetime(1970, 1, 1, tzinfo=UTC)) to make these entries rank lower rather than higher.

235-264: LGTM!

Proper validation that query.text is set before building search arguments, with appropriate logging and error raising.

267-283: LGTM!

Simple and correct implementation for metadata-only queries.

286-318: LGTM!

Post-filtering logic is correct: half-open time interval [since, until), AND semantics for tags, and proper handling of None relevance scores (they bypass the min_relevance check).

355-377: LGTM!

Properly handles non-dict metadata types (Line 362-363), returning WORKING as the safe default.

380-386: LGTM!

Consistent defensive handling of non-dict metadata, returning None when publisher cannot be extracted.

tests/unit/memory/backends/mem0/test_mappers.py (10)

1-32: LGTM!

Proper test module setup with correct imports, timeout marker, and unit test marker usage.

34-54: LGTM!

Clean test helper with sensible defaults for constructing MemoryEntry instances.

57-91: LGTM!

Good coverage of build_mem0_metadata with basic and full metadata scenarios.

93-121: LGTM!

Comprehensive datetime parsing tests covering None, empty string, aware/naive timezones, and malformed inputs.

124-141: LGTM!

Good boundary testing for relevance score normalization.

143-186: LGTM!

Thorough metadata parsing tests including edge cases for invalid categories and empty/whitespace tags.

189-269: LGTM!

Excellent coverage of mem0_result_to_entry including validation of missing/blank IDs, empty content, and the data key fallback path.

272-295: LGTM!

Query argument translation tests properly validate both happy path and error case.

298-391: LGTM!

Comprehensive post-filter tests covering category, tags, time ranges (including exclusive until), min relevance, and combined filters.

442-507: LGTM!

Thorough tests for extract_category and extract_publisher including non-dict metadata edge cases (list, string) which validates the defensive fixes in the production code.

tests/integration/memory/test_mem0_backend.py (7)

1-32: LGTM!

Proper integration test setup with vendor-agnostic embedder configuration helper.

35-51: LGTM!

Appropriate test fixture setup that injects a mock client and bypasses the connection flow. The noqa: S108 for the temp path is acceptable in test context.

54-106: LGTM!

Good integration test validating the store → retrieve flow with proper mock setup and assertions on memory ID, content, and relevance score.

107-161: LGTM!

Full pipeline integration test properly validates that both memories survive ranking and appear in the formatted output. The past review concern about asserting the second memory is now addressed (Line 161).

163-221: LGTM!

Comprehensive shared knowledge flow test covering publish → search_shared → retract with proper publisher metadata verification.

223-253: LGTM!

Good test for the exclude_agent filter in search_shared, verifying that the requesting agent's memories are excluded from results.

255-288: LGTM!

Proper test for count functionality including category filtering.

coderabbitai · 2026-03-13T06:23:14Z

+@pytest.mark.unit
+class TestValidateAddResult:
+    def test_valid_result(self) -> None:
+        result = {"results": [{"id": "mem-001", "event": "ADD"}]}
+        memory_id = validate_add_result(result, context="test")
+        assert memory_id == "mem-001"
+
+    def test_empty_results_raises(self) -> None:
+        result: dict[str, Any] = {"results": []}
+        with pytest.raises(MemoryStoreError, match="no results"):
+            validate_add_result(result, context="test")
+
+    def test_missing_results_key_raises(self) -> None:
+        result = {"data": "something"}
+        with pytest.raises(MemoryStoreError, match="no results"):
+            validate_add_result(result, context="test")
+
+    def test_non_list_results_raises(self) -> None:
+        result = {"results": "not-a-list"}
+        with pytest.raises(MemoryStoreError, match="no results"):
+            validate_add_result(result, context="test")
+
+    def test_missing_id_raises(self) -> None:
+        result = {"results": [{"memory": "no id"}]}
+        with pytest.raises(MemoryStoreError, match="missing or blank 'id'"):
+            validate_add_result(result, context="test")
+
+    def test_none_id_raises(self) -> None:
+        result = {"results": [{"id": None, "event": "ADD"}]}
+        with pytest.raises(MemoryStoreError, match="missing or blank 'id'"):
+            validate_add_result(result, context="test")
+
+    def test_blank_id_raises(self) -> None:
+        result = {"results": [{"id": "", "event": "ADD"}]}
+        with pytest.raises(MemoryStoreError, match="missing or blank 'id'"):
+            validate_add_result(result, context="test")
+
+    def test_whitespace_only_id_raises(self) -> None:
+        result = {"results": [{"id": "   ", "event": "ADD"}]}
+        with pytest.raises(MemoryStoreError, match="missing or blank 'id'"):
+            validate_add_result(result, context="test")
+
+    def test_numeric_id_coerced_to_string(self) -> None:
+        result = {"results": [{"id": 42, "event": "ADD"}]}
+        memory_id = validate_add_result(result, context="test")
+        assert memory_id == "42"


🧹 Nitpick | 🔵 Trivial

Add test for non-dict first result element.

The test suite validates empty, non-list, and missing results, but doesn't cover the case where results[0] is not a dict (e.g., {"results": ["just-a-string"]}). This would expose the AttributeError bug in the production code.

🧪 Suggested test case

def test_non_dict_first_result_raises(self) -> None: result = {"results": ["not-a-dict"]} with pytest.raises(MemoryStoreError, match="not an object"): validate_add_result(result, context="test")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/memory/backends/mem0/test_mappers.py` around lines 394 - 439, Add a unit test in the TestValidateAddResult class to cover the case where results[0] is not a dict: create a method named test_non_dict_first_result_raises that calls validate_add_result({"results": ["not-a-dict"]}, context="test") and asserts it raises MemoryStoreError with a match like "not an object"; this will expose the AttributeError and ensure validate_add_result properly validates the first result element.

Add concrete MemoryBackend implementation using Mem0 (embedded Qdrant + SQLite) as the storage layer, unblocking all downstream memory features. - Mem0MemoryBackend implements MemoryBackend, MemoryCapabilities, and SharedKnowledgeStore protocols with asyncio.to_thread for all sync Mem0 calls - Mapping layer (mappers.py) for bidirectional domain model <-> Mem0 dict conversion with _synthorg_ metadata prefix - Mem0BackendConfig + config builder deriving from CompanyMemoryConfig - Factory wired up with deferred import (no longer raises MemoryConfigError for mem0 backend) - Shared knowledge via reserved __synthorg_shared__ namespace with publisher ownership tracking - 95 unit tests (adapter, mappers, config) + 6 integration tests (retrieval pipeline, shared knowledge flow) Closes #206

Pre-reviewed by 9 agents, 33 findings addressed: - Remove vendor-specific defaults from Mem0EmbedderConfig (now required) - Fix metadata dict mutation in publish() (use spread instead) - Add path traversal validation on Mem0BackendConfig.data_dir - Extract _validate_add_result helper for store/publish result validation - Add explicit ImportError handling in connect() - Make health_check() probe backend with lightweight get_all call - Add defensive category parsing in _extract_category (handle invalid enums) - Add missing publisher check in retract() (not a shared memory entry) - Replace C-style loop in search_shared with generator expression - Use meaningful variable names (raw_entries/filtered vs triple result) - Change logger.exception to logger.warning for self-raised errors - Add structured error/error_type kwargs to all re-raise logs - Add MemoryConnectionError to Raises docstrings for shared methods - Update factory to require embedder config (no vendor defaults) - Remove dead store_request_to_mem0_args function and its tests - Add tests: path traversal, missing id, empty content, malformed datetime, invalid category, no-publisher retract, delete-after-get failure, protocol conformance, health_check probe failure

…iewers - Add MemoryError/RecursionError guards before all except Exception blocks - Capture exceptions in re-raise blocks for structured logging - Move helper functions (validate_add_result, extract_category, extract_publisher) from adapter.py to mappers.py to keep adapter under 800-line limit - Add type/tag/confidence validation in mappers (items 9, 11, 23) - Add logging before all raise statements (items 7, 20, 21) - Fix health_check exception log level from debug to warning - Add search_shared query context to error logs - Strengthen factory.py type annotation (embedder: Mem0EmbedderConfig | None) - Parametrize config traversal tests, add ImportError/MemoryError/publish tests - Update CLAUDE.md, README, roadmap, and design spec for Mem0 adapter status

- Fix dependency review CI: add LicenseRef-scancode-protobuf, ZPL-2.1 to license allow-list and allow-dependencies-licenses for packages with null SPDX metadata (mem0ai, numpy, qdrant-client, posthog) - Fix validate_add_result blank-ID gap: check for None and whitespace IDs, not just missing key (Greptile finding) - Fix factory.py wrong event constant: use MEMORY_BACKEND_CONFIG_INVALID instead of MEMORY_BACKEND_UNKNOWN for embedder config errors (Greptile) - Fix retract() bare except: capture as exc and log error detail - Document count() limitation: capped at max_memories_per_agent - Add 37 new tests: validate_add_result (blank/None/whitespace/numeric ID, non-list results), extract_category, extract_publisher, MemoryError re-raise for all operations, blank-ID through store/retrieve, shared namespace fallback, count empty results, retract delete failure

Copilot

Pull request overview

This PR finalizes and hardens a concrete Mem0-backed implementation of the MemoryBackend/SharedKnowledgeStore protocols, and updates configuration/factory wiring so Mem0 can be instantiated with explicitly-specified embedder settings (vendor-agnostic defaults removed).

Changes:

Introduces Mem0 backend implementation (adapter.py, mappers.py, config.py) with defensive parsing, health probes, shared-knowledge publish/search/retract, and structured logging.
Updates the memory factory + public exports to support creating a Mem0 backend with a required embedder config.
Adds unit + integration tests covering edge cases and protocol conformance; updates docs/status text and adds mem0ai dependency.

Reviewed changes

Copilot reviewed 20 out of 23 changed files in this pull request and generated 9 comments.

Show a summary per file

File	Description
uv.lock	Adds mem0ai and transitive deps to the lockfile.
pyproject.toml	Adds mem0ai dependency; adds mypy ignore for `mem0.*`.
src/ai_company/memory/factory.py	Creates Mem0 backend when configured; requires `embedder` param.
src/ai_company/memory/backends/mem0/adapter.py	Implements Mem0-backed `MemoryBackend` + shared knowledge operations.
src/ai_company/memory/backends/mem0/mappers.py	Adds pure mapping/validation helpers for Mem0 dicts ↔ domain models.
src/ai_company/memory/backends/mem0/config.py	Adds Mem0 backend config models and config dict builder.
src/ai_company/memory/backends/mem0/init.py	Exports Mem0 backend/config symbols.
src/ai_company/memory/backends/init.py	Exposes concrete backend(s).
src/ai_company/memory/init.py	Re-exports Mem0 backend/config at package level.
src/ai_company/observability/events/memory.py	Adds config-invalid event constant.
tests/unit/memory/test_init.py	Updates export expectations.
tests/unit/memory/test_factory.py	Updates factory tests for embedder requirement and backend creation.
tests/unit/memory/backends/mem0/test_config.py	New tests for Mem0 config validation/building.
tests/unit/memory/backends/mem0/test_mappers.py	New tests for mapping/parsing/validation helpers.
tests/unit/memory/backends/mem0/test_adapter.py	New comprehensive adapter tests (CRUD + shared knowledge + edge cases).
tests/integration/memory/test_mem0_backend.py	New integration-style tests plugging adapter into retrieval pipeline.
docs/roadmap/index.md	Removes “Mem0 adapter” from remaining roadmap items.
docs/design/memory.md	Updates diagram text to mark Mem0 implemented.
README.md	Updates status statement to include Mem0 backend adapter.
CLAUDE.md	Updates memory subsystem description; notes mem0ai as optional.
.github/workflows/dependency-review.yml	Allows additional licenses/metadata for new deps.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+        )
+        logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg)
+        raise MemoryStoreError(msg)
+    return NotBlankStr(str(raw_id))


+        try:
+            config_dict = build_mem0_config_dict(self._mem0_config)
+            client = await asyncio.to_thread(Memory.from_config, config_dict)
+        except MemoryError, RecursionError:


+                user_id=_SHARED_NAMESPACE,
+                limit=1,
+            )
+        except MemoryError, RecursionError:


+    @property
+    def backend_name(self) -> NotBlankStr:
+        """Human-readable backend identifier."""
+        return NotBlankStr("mem0")


 dependencies = [
    "aiodocker==0.26.0",
    "aiosqlite==0.22.1",
    "argon2-cffi==25.1.0",
    "jinja2==3.1.6",
    "jsonschema==4.26.0",
    "litellm==1.82.1",
    "litestar[standard,structlog,pydantic,brotli,prometheus]==2.21.1",
    "mcp==1.26.0",
+    "mem0ai==1.0.5",
    "pydantic==2.12.5",
    "pyjwt[crypto]==2.11.0",


+    return {
+        "vector_store": {
+            "provider": "qdrant",
+            "config": {
+                "collection_name": config.collection_name,
+                "embedding_model_dims": config.embedder.dims,
+                "path": f"{config.data_dir}/qdrant",
+            },
+        },
+        "embedder": {
+            "provider": config.embedder.provider,
+            "config": {
+                "model": config.embedder.model,
+            },
+        },
+        "history_db_path": f"{config.data_dir}/history.db",
+        # Mem0 config schema version — required by Memory.from_config().
+        "version": "v1.1",


+        raw_tags = [raw_tags]
+    elif not isinstance(raw_tags, (list, tuple)):
+        raw_tags = ()
+    tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip())


+    if not raw:
+        return None
+    try:
+        dt = datetime.fromisoformat(raw)
+    except ValueError, TypeError:
+        logger.warning(
+            MEMORY_MODEL_INVALID,
+            field="datetime",
+            raw_value=raw,
+            reason="malformed ISO 8601 datetime, returning None",
+        )
+        return None


+    raw_id = raw.get("id")
+    if raw_id is None or not str(raw_id).strip():
+        msg = f"Mem0 result has missing or blank 'id': keys={list(raw.keys())}"
+        logger.warning(
+            MEMORY_MODEL_INVALID,
+            field="id",
+            raw_value=raw_id,
+            reason=msg,
+        )
+        raise MemoryRetrievalError(msg)
+    memory_id = NotBlankStr(str(raw_id))
+
+    raw_content = raw.get("memory") or raw.get("data")
+    if not raw_content or not str(raw_content).strip():
+        msg = f"Mem0 result {raw.get('id', '?')} has empty content"
+        logger.warning(
+            MEMORY_MODEL_INVALID,
+            field="content",
+            raw_value=raw_content,
+            reason=msg,
+        )
+        raise MemoryRetrievalError(msg)
+    content = NotBlankStr(str(raw_content))
+
+    created_at = parse_mem0_datetime(raw.get("created_at"))
+    if created_at is None:
+        logger.debug(
+            MEMORY_MODEL_INVALID,
+            field="created_at",
+            memory_id=str(raw.get("id", "?")),
+            reason="missing or unparseable created_at, defaulting to now()",
+        )
+        created_at = datetime.now(UTC)
+    updated_at = parse_mem0_datetime(raw.get("updated_at"))
+
+    raw_metadata = raw.get("metadata")
+    category, metadata, expires_at = parse_mem0_metadata(raw_metadata)
+
+    raw_score = raw.get("score")
+    relevance_score = normalize_relevance_score(raw_score)
+
+    return MemoryEntry(
+        id=memory_id,
+        agent_id=NotBlankStr(agent_id),
+        category=category,


greptile-apps · 2026-03-13T06:31:38Z

+            entries = tuple(
+                mem0_result_to_entry(item, str(agent_id)) for item in raw_list
+            )


Single malformed item aborts entire batch retrieval

Both retrieve() and search_shared() build the result tuple by eagerly evaluating a generator that calls mem0_result_to_entry() per item. mem0_result_to_entry raises MemoryRetrievalError for any item with a missing/blank ID or empty content. Because the exception is thrown inside the tuple() call, it propagates out of the try block and the entire retrieval fails — all valid results from the same batch are discarded.

This is particularly impactful for search_shared(): if even one stored shared-knowledge entry is malformed (due to a direct Mem0 write, data corruption, or a bug in a prior version), the whole shared knowledge store becomes completely inaccessible to all agents until the corrupt record is manually removed.

mem0_result_to_entry already logs a warning before raising, so a try/except skip-and-warn pattern at the call site would preserve that observability while returning partial results instead of failing completely:

# retrieve() – lines 333–335 entries_list: list[MemoryEntry] = [] for item in raw_list: try: entries_list.append(mem0_result_to_entry(item, str(agent_id))) except MemoryRetrievalError: pass # warning already logged inside mem0_result_to_entry entries = tuple(entries_list)

The same pattern should be applied to search_shared() lines 659–665.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/ai_company/memory/backends/mem0/adapter.py Line: 333-335 Comment: **Single malformed item aborts entire batch retrieval** Both `retrieve()` and `search_shared()` build the result tuple by eagerly evaluating a generator that calls `mem0_result_to_entry()` per item. `mem0_result_to_entry` raises `MemoryRetrievalError` for any item with a missing/blank ID or empty content. Because the exception is thrown inside the `tuple()` call, it propagates out of the `try` block and the entire retrieval fails — all valid results from the same batch are discarded. This is particularly impactful for `search_shared()`: if even one stored shared-knowledge entry is malformed (due to a direct Mem0 write, data corruption, or a bug in a prior version), the whole shared knowledge store becomes **completely inaccessible** to all agents until the corrupt record is manually removed. `mem0_result_to_entry` already logs a warning before raising, so a try/except skip-and-warn pattern at the call site would preserve that observability while returning partial results instead of failing completely: ```python # retrieve() – lines 333–335 entries_list: list[MemoryEntry] = [] for item in raw_list: try: entries_list.append(mem0_result_to_entry(item, str(agent_id))) except MemoryRetrievalError: pass # warning already logged inside mem0_result_to_entry entries = tuple(entries_list) ``` The same pattern should be applied to `search_shared()` lines 659–665. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-03-13T06:31:39Z

+        if (
+            query.min_relevance > 0.0
+            and entry.relevance_score is not None
+            and entry.relevance_score < query.min_relevance
+        ):
+            continue


Unscored entries bypass min_relevance filter silently

The short-circuit on line 313 (and entry.relevance_score is not None) means that any entry without a relevance score — which includes all results from get_all() since Mem0 only computes scores during vector search — will pass the min_relevance filter unconditionally, regardless of how high the caller sets it.

A caller who invokes retrieve() without query.text (which takes the get_all path) and also sets query.min_relevance = 0.5 expecting to restrict results to high-confidence hits will silently receive all stored entries. The current behaviour is defensible ("can't filter what wasn't scored"), but since it diverges from what the parameter name implies and is not documented, it is likely to produce confusing results in practice.

At minimum the docstring for apply_post_filters should document that min_relevance is a no-op for entries that were not retrieved via vector search.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/ai_company/memory/backends/mem0/mappers.py Line: 311-316 Comment: **Unscored entries bypass `min_relevance` filter silently** The short-circuit on line 313 (`and entry.relevance_score is not None`) means that any entry without a relevance score — which includes **all** results from `get_all()` since Mem0 only computes scores during vector search — will pass the `min_relevance` filter unconditionally, regardless of how high the caller sets it. A caller who invokes `retrieve()` without `query.text` (which takes the `get_all` path) and also sets `query.min_relevance = 0.5` expecting to restrict results to high-confidence hits will silently receive all stored entries. The current behaviour is defensible ("can't filter what wasn't scored"), but since it diverges from what the parameter name implies and is not documented, it is likely to produce confusing results in practice. At minimum the docstring for `apply_post_filters` should document that `min_relevance` is a no-op for entries that were not retrieved via vector search. How can I resolve this? If you propose a fix, please make it concise.

…indings Root cause of round-2 lychee CI failure: lycheeverse/lychee-action v2.8.0's install step does not handle lychee v0.24.x's new nested archive layout. The action's #330 fix (merged 2026-04-24) has not shipped a v2.9 release yet (issue #332 open). Our local install_cli_tools.sh already handles the nested archive correctly, so the cleanest fix is to drop the action entirely and use the script in CI. Removes drift: the script (local pre-push hook) and CI now install the byte-identical binary from one Renovate-tracked version pin. Eliminates upstream-action release-cadence coupling. Changes: - scripts/install_cli_tools.sh: positional subcommand dispatcher (all | lychee | golangci-lint, default all). PATH-staleness check after both installs (CodeRabbit major actionable + AI-agent inline prompt): if the PATH-resolved binary differs from the install_dir binary AND its --version disagrees with the pin, fail fast with a 'fix your PATH or remove the stale binary' hint. SC2064 disabled on the EXIT trap with rationale comment (early expansion is intentional since the tmpdir is function-local). - .github/workflows/lychee.yml: replaced lycheeverse/lychee-action step with three steps: install via bash scripts/install_cli_tools.sh lychee (LYCHEE_INSTALL_DIR at step-level env per actionlint), echo to GITHUB_PATH, then run lychee directly. --format markdown teed into GITHUB_STEP_SUMMARY preserves the action's job-summary feature. Path-filter widened to include scripts/install_cli_tools.sh so the workflow re-runs when the install changes. - renovate.json: reverted the version|lycheeVersion regex alternation (no workflow uses lycheeVersion: now that the action is gone). - docs/guides/dynamic-scoring.md: fixed misleading cross-reference (CodeRabbit minor). scoring-hyperparameters.md is operator-tunable weights / thresholds, not an architecture overview; the sentence now says exactly that instead of pretending it documents composite/weighted/ranked/multi-objective strategies.

Copilot AI review requested due to automatic review settings March 12, 2026 23:25

Aureliolo temporarily deployed to ci March 12, 2026 23:25 — with GitHub Actions Inactive

Copilot started reviewing on behalf of Aureliolo March 12, 2026 23:26 View session

Aureliolo temporarily deployed to cloudflare-preview March 12, 2026 23:26 — with GitHub Actions Inactive

greptile-apps Bot reviewed Mar 12, 2026

View reviewed changes

Comment thread src/ai_company/memory/backends/mem0/adapter.py

gemini-code-assist Bot reviewed Mar 12, 2026

View reviewed changes

Copilot AI reviewed Mar 12, 2026

View reviewed changes

coderabbitai Bot reviewed Mar 12, 2026

View reviewed changes

Aureliolo temporarily deployed to ci March 13, 2026 05:55 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview March 13, 2026 05:56 — with GitHub Actions Inactive

greptile-apps Bot reviewed Mar 13, 2026

View reviewed changes

Comment thread src/ai_company/memory/backends/mem0/mappers.py Outdated

coderabbitai Bot reviewed Mar 13, 2026

View reviewed changes

Copilot AI review requested due to automatic review settings March 13, 2026 06:13

Aureliolo temporarily deployed to ci March 13, 2026 06:13 — with GitHub Actions Inactive

Copilot started reviewing on behalf of Aureliolo March 13, 2026 06:14 View session

Aureliolo temporarily deployed to cloudflare-preview March 13, 2026 06:14 — with GitHub Actions Inactive

coderabbitai Bot reviewed Mar 13, 2026

View reviewed changes

greptile-apps Bot reviewed Mar 13, 2026

View reviewed changes

Comment thread src/ai_company/memory/backends/mem0/adapter.py

Comment thread src/ai_company/memory/backends/mem0/adapter.py Outdated

Copilot AI reviewed Mar 13, 2026

View reviewed changes

Aureliolo temporarily deployed to ci March 13, 2026 06:19 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview March 13, 2026 06:20 — with GitHub Actions Inactive

coderabbitai Bot reviewed Mar 13, 2026

View reviewed changes

greptile-apps Bot reviewed Mar 13, 2026

View reviewed changes

Comment thread src/ai_company/memory/backends/mem0/mappers.py

Copilot AI review requested due to automatic review settings March 13, 2026 06:25

Aureliolo temporarily deployed to ci March 13, 2026 06:25 — with GitHub Actions Inactive

Copilot started reviewing on behalf of Aureliolo March 13, 2026 06:25 View session

Aureliolo added 6 commits March 13, 2026 07:25

fix: add non-dict metadata guards and created_at fallback logging

669975c

fix: remove double-logging in retract() ownership check path

28eba7a

Aureliolo force-pushed the feat/mem0-adapter branch from f7d017c to 28eba7a Compare March 13, 2026 06:26

Aureliolo closed this Mar 13, 2026

Aureliolo temporarily deployed to cloudflare-preview March 13, 2026 06:26 — with GitHub Actions Inactive

Aureliolo temporarily deployed to ci March 13, 2026 06:26 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview March 13, 2026 06:27 — with GitHub Actions Inactive

Copilot AI reviewed Mar 13, 2026

View reviewed changes

greptile-apps Bot reviewed Mar 13, 2026

View reviewed changes

		raw_tags = raw_metadata.get(f"{_PREFIX}tags", ())
		tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip())

Conversation

Aureliolo commented Mar 12, 2026

Summary

Changes

Test plan

Review coverage

Uh oh!

github-actions Bot commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Dependency Review

OpenSSF Scorecard

Scanned Files

Uh oh!

coderabbitai Bot commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot commented Mar 12, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

codecov Bot commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

greptile-apps Bot commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 13, 2026

Choose a reason for hiding this comment

github-actions Bot commented Mar 12, 2026 •

edited

Loading

coderabbitai Bot commented Mar 12, 2026 •

edited

Loading

codecov Bot commented Mar 12, 2026 •

edited

Loading

greptile-apps Bot commented Mar 12, 2026 •

edited

Loading