refactor: harden Mem0 adapter with pre-PR review fixes#332
Conversation
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.OpenSSF ScorecardScorecard details
Scanned Files
|
📝 WalkthroughWalkthroughAdds a concrete Mem0 memory backend: new Mem0 backend package (config, mappers, adapter), factory dispatch and exports, tests (unit + integration), docs/CI updates, and a pyproject dependency for mem0ai. Changes
Sequence Diagram(s)sequenceDiagram
actor Client
participant Backend as Mem0MemoryBackend
participant Mem0 as Mem0<br/>(Memory Instance)
participant Qdrant as Qdrant<br/>(Vector Store)
participant SQLite as SQLite<br/>(History DB)
rect rgba(52, 152, 219, 0.5)
Note over Client,SQLite: Store Memory Flow
Client->>Backend: store(agent_id, request)
Backend->>Mem0: add(namespace, name, description, metadata)
Mem0->>Qdrant: embed & index description/metadata
Mem0->>SQLite: persist history record
Mem0-->>Backend: result with memory_id
Backend-->>Client: memory_id
end
rect rgba(46, 204, 113, 0.5)
Note over Client,SQLite: Retrieve Memory Flow
Client->>Backend: retrieve(agent_id, query)
Backend->>Mem0: search(namespace, text, limit, filters)
Mem0->>Qdrant: vector search on embeddings
Qdrant-->>Mem0: ranked results
Mem0->>SQLite: fetch full records & metadata
Mem0-->>Backend: results with scores & metadata
Backend->>Backend: parse_mem0_datetime / parse_mem0_metadata
Backend->>Backend: apply_post_filters
Backend-->>Client: filtered MemoryEntry tuples
end
sequenceDiagram
actor PublisherAgent as Publisher<br/>(Agent A)
actor SearcherAgent as Searcher<br/>(Agent B)
participant Backend as Mem0MemoryBackend
participant Mem0 as Mem0<br/>(Memory Instance)
participant Qdrant as Qdrant<br/>(Vector Store)
rect rgba(231, 76, 60, 0.5)
Note over PublisherAgent,Qdrant: Publish Shared Memory
PublisherAgent->>Backend: publish(agent_id, request)
Backend->>Backend: build_mem0_metadata (publisher_key=agent_id)
Backend->>Mem0: add(SHARED_NAMESPACE, description, metadata)
Mem0->>Qdrant: embed & index with publisher metadata
Mem0-->>Backend: memory_id
Backend-->>PublisherAgent: memory_id
end
rect rgba(155, 89, 182, 0.5)
Note over SearcherAgent,Qdrant: Search Shared Memory (Exclude Self)
SearcherAgent->>Backend: search_shared(query, exclude_agent=agent_id)
Backend->>Mem0: search(SHARED_NAMESPACE, text, limit)
Mem0->>Qdrant: vector search
Qdrant-->>Mem0: results with scores
Mem0-->>Backend: results
Backend->>Backend: filter out publisher==exclude_agent
Backend-->>SearcherAgent: shared entries (excluding self)
end
rect rgba(241, 196, 15, 0.5)
Note over PublisherAgent,Qdrant: Retract Shared Memory
PublisherAgent->>Backend: retract(agent_id, memory_id)
Backend->>Mem0: get(SHARED_NAMESPACE, memory_id)
Mem0-->>Backend: result with publisher metadata
Backend->>Backend: validate publisher_key == agent_id
Backend->>Mem0: delete(SHARED_NAMESPACE, memory_id)
Mem0->>Qdrant: remove indexed entry
Mem0-->>Backend: success
Backend-->>PublisherAgent: true
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
✨ Simplify code
📝 Coding Plan
Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors and hardens the Mem0 memory adapter, enhancing its reliability and maintainability. The changes ensure a more robust and vendor-agnostic integration with Mem0, improving error handling, configuration clarity, and overall system resilience. This work addresses pre-PR review findings and includes comprehensive testing to validate the improvements. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #332 +/- ##
==========================================
+ Coverage 93.76% 93.81% +0.04%
==========================================
Files 434 439 +5
Lines 19850 20222 +372
Branches 1915 1945 +30
==========================================
+ Hits 18613 18971 +358
- Misses 956 968 +12
- Partials 281 283 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Greptile SummaryThis PR implements and hardens the Mem0 memory backend adapter, completing the Key changes:
Issues found:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Caller
participant Adapter as Mem0MemoryBackend
participant Mappers
participant Mem0 as Mem0 Client (asyncio.to_thread)
Note over Caller,Mem0: store() / publish()
Caller->>Adapter: store(agent_id, request)
Adapter->>Mappers: build_mem0_metadata(request)
Mappers-->>Adapter: metadata dict
Adapter->>Mem0: client.add(messages, user_id, metadata, infer=False)
Mem0-->>Adapter: {results: [{id, memory, event}]}
Adapter->>Mappers: validate_add_result(result, context)
Mappers-->>Adapter: NotBlankStr(memory_id)
Adapter-->>Caller: memory_id
Note over Caller,Mem0: retrieve() — text search path
Caller->>Adapter: retrieve(agent_id, query)
Adapter->>Mem0: client.search(query, user_id, limit)
Mem0-->>Adapter: {results: [...]}
loop for each item in results
Adapter->>Mappers: mem0_result_to_entry(item, agent_id)
Mappers->>Mappers: parse_mem0_datetime, parse_mem0_metadata, normalize_relevance_score
Mappers-->>Adapter: MemoryEntry
end
Adapter->>Mappers: apply_post_filters(entries, query)
Mappers-->>Adapter: filtered tuple
Adapter-->>Caller: tuple[MemoryEntry, ...]
Note over Caller,Mem0: retract() — ownership-verified delete
Caller->>Adapter: retract(agent_id, memory_id)
Adapter->>Mem0: client.get(memory_id)
Mem0-->>Adapter: raw dict or None
Adapter->>Mappers: extract_publisher(raw)
Mappers-->>Adapter: publisher str or None
alt publisher is None
Adapter-->>Caller: MemoryStoreError (not shared)
else publisher != agent_id
Adapter-->>Caller: MemoryStoreError (ownership mismatch)
else ownership confirmed
Adapter->>Mem0: client.delete(memory_id)
Adapter-->>Caller: True
end
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/ai_company/memory/backends/mem0/adapter.py
Line: 333-335
Comment:
**Single malformed item aborts entire batch retrieval**
Both `retrieve()` and `search_shared()` build the result tuple by eagerly evaluating a generator that calls `mem0_result_to_entry()` per item. `mem0_result_to_entry` raises `MemoryRetrievalError` for any item with a missing/blank ID or empty content. Because the exception is thrown inside the `tuple()` call, it propagates out of the `try` block and the entire retrieval fails — all valid results from the same batch are discarded.
This is particularly impactful for `search_shared()`: if even one stored shared-knowledge entry is malformed (due to a direct Mem0 write, data corruption, or a bug in a prior version), the whole shared knowledge store becomes **completely inaccessible** to all agents until the corrupt record is manually removed.
`mem0_result_to_entry` already logs a warning before raising, so a try/except skip-and-warn pattern at the call site would preserve that observability while returning partial results instead of failing completely:
```python
# retrieve() – lines 333–335
entries_list: list[MemoryEntry] = []
for item in raw_list:
try:
entries_list.append(mem0_result_to_entry(item, str(agent_id)))
except MemoryRetrievalError:
pass # warning already logged inside mem0_result_to_entry
entries = tuple(entries_list)
```
The same pattern should be applied to `search_shared()` lines 659–665.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/ai_company/memory/backends/mem0/mappers.py
Line: 311-316
Comment:
**Unscored entries bypass `min_relevance` filter silently**
The short-circuit on line 313 (`and entry.relevance_score is not None`) means that any entry without a relevance score — which includes **all** results from `get_all()` since Mem0 only computes scores during vector search — will pass the `min_relevance` filter unconditionally, regardless of how high the caller sets it.
A caller who invokes `retrieve()` without `query.text` (which takes the `get_all` path) and also sets `query.min_relevance = 0.5` expecting to restrict results to high-confidence hits will silently receive all stored entries. The current behaviour is defensible ("can't filter what wasn't scored"), but since it diverges from what the parameter name implies and is not documented, it is likely to produce confusing results in practice.
At minimum the docstring for `apply_post_filters` should document that `min_relevance` is a no-op for entries that were not retrieved via vector search.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 28eba7a |
There was a problem hiding this comment.
Code Review
This is an excellent pull request that significantly hardens the Mem0 adapter. The changes are well-structured, introducing clear separation of concerns with new config and mappers modules. The code is robust, with thorough defensive programming, explicit error handling, and improved logging. The addition of comprehensive unit and integration tests is also a major plus. I have only a couple of minor suggestions to further improve robustness against potentially malformed API responses.
| results_list = result.get("results", []) | ||
| if not results_list: | ||
| msg = f"Mem0 add returned no results for {context}" | ||
| raise MemoryStoreError(msg) |
There was a problem hiding this comment.
The code assumes that result.get("results") will either be a list or None. If the mem0 API returns a non-list, non-None value for "results" (e.g., a string or a dictionary), this code will raise a TypeError on results_list[0] instead of the intended MemoryStoreError. It would be more robust to also check the type of results_list.
| results_list = result.get("results", []) | |
| if not results_list: | |
| msg = f"Mem0 add returned no results for {context}" | |
| raise MemoryStoreError(msg) | |
| results_list = result.get("results") | |
| if not isinstance(results_list, list) or not results_list: | |
| msg = f"Mem0 add returned no or invalid 'results' for {context}" | |
| raise MemoryStoreError(msg) |
| raw_tags = raw_metadata.get(f"{_PREFIX}tags", ()) | ||
| tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip()) |
There was a problem hiding this comment.
The code assumes that the value for the tags key in the metadata will be an iterable. While the default is a tuple, if the key exists in raw_metadata with a non-iterable value (e.g., an integer), the list comprehension will raise a TypeError. Adding a type check would make this more robust against malformed data from the backend.
| raw_tags = raw_metadata.get(f"{_PREFIX}tags", ()) | |
| tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip()) | |
| raw_tags = raw_metadata.get(f"{_PREFIX}tags") | |
| tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip()) if isinstance(raw_tags, (list, tuple)) else () |
There was a problem hiding this comment.
Pull request overview
This PR finishes and hardens a concrete mem0 memory backend implementation, making the memory layer functional end-to-end (factory → backend adapter → retrieval pipeline) while enforcing vendor-agnostic configuration and adding broad test coverage.
Changes:
- Adds a Mem0 backend package (
adapter.py,config.py,mappers.py) and wires it into the memory factory and public exports. - Requires explicit embedder configuration (no vendor defaults) and adds path traversal validation for
data_dir. - Adds comprehensive unit + integration tests for the Mem0 adapter, config, and mapping behavior.
Reviewed changes
Copilot reviewed 14 out of 17 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
pyproject.toml |
Adds mem0ai dependency and mypy ignore for mem0.*. |
uv.lock |
Locks new dependency graph for mem0ai and transitive deps. |
src/ai_company/memory/factory.py |
Creates a real Mem0 backend (requires embedder=) instead of placeholder errors. |
src/ai_company/memory/backends/mem0/adapter.py |
Implements Mem0-backed MemoryBackend + shared store operations with logging and guards. |
src/ai_company/memory/backends/mem0/mappers.py |
Introduces stateless mapping/parsing utilities between Mem0 dicts and domain models. |
src/ai_company/memory/backends/mem0/config.py |
Defines Mem0-specific config + builder for Mem0 SDK config dict. |
src/ai_company/memory/backends/mem0/__init__.py |
Exposes Mem0 backend/config types. |
src/ai_company/memory/backends/__init__.py |
Exposes concrete backends package-level. |
src/ai_company/memory/__init__.py |
Re-exports Mem0MemoryBackend and Mem0EmbedderConfig. |
tests/unit/memory/test_init.py |
Updates export expectations for new Mem0 symbols. |
tests/unit/memory/test_factory.py |
Updates factory tests for new required embedder= behavior and Mem0 backend creation. |
tests/unit/memory/backends/mem0/test_config.py |
Adds config validation tests (required fields, traversal checks, config dict builder). |
tests/unit/memory/backends/mem0/test_mappers.py |
Adds mapper/metadata/datetime parsing tests + filtering tests. |
tests/unit/memory/backends/mem0/test_adapter.py |
Adds extensive adapter unit tests using a mocked Mem0 client. |
tests/integration/memory/test_mem0_backend.py |
Adds retrieval-pipeline integration tests with mocked Mem0 client. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| try: | ||
| config_dict = build_mem0_config_dict(self._mem0_config) | ||
| client = await asyncio.to_thread(Memory.from_config, config_dict) | ||
| except Exception as exc: | ||
| logger.warning( | ||
| MEMORY_BACKEND_CONNECTION_FAILED, | ||
| backend="mem0", | ||
| error=str(exc), | ||
| error_type=type(exc).__name__, | ||
| ) | ||
| msg = f"Failed to connect to Mem0: {exc}" | ||
| raise MemoryConnectionError(msg) from exc |
| except Exception as exc: | ||
| logger.warning( | ||
| MEMORY_ENTRY_STORE_FAILED, | ||
| agent_id=agent_id, | ||
| error=str(exc), | ||
| error_type=type(exc).__name__, | ||
| ) | ||
| msg = f"Failed to store memory: {exc}" | ||
| raise MemoryStoreError(msg) from exc |
| except Exception as exc: | ||
| logger.warning( | ||
| MEMORY_ENTRY_RETRIEVAL_FAILED, | ||
| agent_id=agent_id, | ||
| error=str(exc), | ||
| error_type=type(exc).__name__, | ||
| ) | ||
| msg = f"Failed to retrieve memories: {exc}" | ||
| raise MemoryRetrievalError(msg) from exc |
| except Exception as exc: | ||
| logger.warning( | ||
| MEMORY_ENTRY_FETCH_FAILED, | ||
| agent_id=agent_id, | ||
| memory_id=memory_id, | ||
| error=str(exc), | ||
| error_type=type(exc).__name__, | ||
| ) | ||
| msg = f"Failed to get memory {memory_id}: {exc}" | ||
| raise MemoryRetrievalError(msg) from exc |
| except Exception as exc: | ||
| logger.warning( | ||
| MEMORY_ENTRY_DELETE_FAILED, | ||
| agent_id=agent_id, | ||
| memory_id=memory_id, | ||
| error=str(exc), | ||
| error_type=type(exc).__name__, | ||
| ) | ||
| msg = f"Failed to delete memory {memory_id}: {exc}" | ||
| raise MemoryStoreError(msg) from exc |
| confidence = raw_metadata.get(f"{_PREFIX}confidence", 1.0) | ||
| source = raw_metadata.get(f"{_PREFIX}source") | ||
| raw_tags = raw_metadata.get(f"{_PREFIX}tags", ()) | ||
| tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip()) |
| except Exception as exc: | ||
| logger.warning( | ||
| MEMORY_SHARED_PUBLISH_FAILED, | ||
| agent_id=agent_id, | ||
| error=str(exc), | ||
| error_type=type(exc).__name__, | ||
| ) | ||
| msg = f"Failed to publish shared memory: {exc}" | ||
| raise MemoryStoreError(msg) from exc |
| except Exception as exc: | ||
| logger.warning( | ||
| MEMORY_SHARED_RETRACT_FAILED, | ||
| agent_id=agent_id, | ||
| memory_id=memory_id, | ||
| error=str(exc), | ||
| error_type=type(exc).__name__, | ||
| ) | ||
| msg = f"Failed to retract shared memory {memory_id}: {exc}" | ||
| raise MemoryStoreError(msg) from exc |
| try: | ||
| await asyncio.to_thread( | ||
| self._client.get_all, | ||
| user_id=_SHARED_NAMESPACE, | ||
| limit=1, | ||
| ) | ||
| except Exception: | ||
| logger.debug( | ||
| MEMORY_BACKEND_HEALTH_CHECK, | ||
| backend="mem0", | ||
| healthy=False, | ||
| ) | ||
| return False |
| if "id" not in raw: | ||
| msg = f"Mem0 result missing required 'id' field: keys={list(raw.keys())}" | ||
| raise MemoryRetrievalError(msg) | ||
| memory_id = NotBlankStr(str(raw["id"])) |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/memory/backends/mem0/adapter.py`:
- Around line 495-555: The count method currently uses self._client.get_all with
limit=self._max_memories_per_agent which can cap results and undercount when an
agent has more memories than _max_memories_per_agent; update MemoryAdapter.count
to fetch the full set (e.g. use the backend's pagination or a no-limit/count
endpoint) instead of passing the capped limit — specifically change the call
site using self._client.get_all (and any helpers around _max_memories_per_agent)
to iterate pages or request an unbounded count, then compute total (and preserve
the category filter logic using _extract_category) and keep existing error
handling and logging in count, MEMORY_ENTRY_COUNT_FAILED, and
MEMORY_ENTRY_COUNTED.
In `@src/ai_company/memory/backends/mem0/config.py`:
- Around line 110-128: The builder build_config_from_company_config currently
always constructs Mem0BackendConfig without setting collection_name, so add an
optional parameter (e.g., collection_name: Optional[str] = None) to
build_config_from_company_config and pass it into the returned
Mem0BackendConfig(...) (use the provided collection_name when non-empty,
otherwise allow Mem0BackendConfig to keep its default); update the function
signature and the call site(s) to wire through CompanyMemoryConfig or explicit
overrides as needed so deployments can customize collection_name.
In `@src/ai_company/memory/factory.py`:
- Around line 51-61: The embedder validation in the Mem0 branch currently raises
MemoryConfigError without logging; before each raise in the checks that test "if
embedder is None" and "if not isinstance(embedder, Mem0EmbedderConfig)" add a
log call (matching the style used for the unknown-backend path) that emits
WARNING or ERROR with context including the backend name, the embedder
value/type (use type(embedder).__name__ or repr(embedder) as appropriate) and
the intended requirement, then raise MemoryConfigError as before; reference the
embedder variable, Mem0EmbedderConfig, and MemoryConfigError in your changes so
the log provides the same contextual info as the other error path.
In `@tests/unit/memory/backends/mem0/test_config.py`:
- Around line 96-108: Combine the two tests test_rejects_parent_traversal_unix
and test_rejects_parent_traversal_windows into a single parametrized test using
pytest.mark.parametrize: create one test (e.g., test_rejects_parent_traversal)
that takes data_dir as a parameter, parametrize it with the two values
("/data/../escape" and "C:\\data\\..\\escape"), and keep the existing assertion
that constructing Mem0BackendConfig(data_dir=data_dir, embedder=_embedder())
raises ValidationError matching "parent-directory traversal"; this removes
duplicated setup/assertion while still testing both paths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 61c4af4d-53a3-4b4b-ae55-fe78a42ca8ae
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (16)
pyproject.tomlsrc/ai_company/memory/__init__.pysrc/ai_company/memory/backends/__init__.pysrc/ai_company/memory/backends/mem0/__init__.pysrc/ai_company/memory/backends/mem0/adapter.pysrc/ai_company/memory/backends/mem0/config.pysrc/ai_company/memory/backends/mem0/mappers.pysrc/ai_company/memory/factory.pytests/integration/memory/test_mem0_backend.pytests/unit/memory/backends/__init__.pytests/unit/memory/backends/mem0/__init__.pytests/unit/memory/backends/mem0/test_adapter.pytests/unit/memory/backends/mem0/test_config.pytests/unit/memory/backends/mem0/test_mappers.pytests/unit/memory/test_factory.pytests/unit/memory/test_init.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Agent
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e, and@pytest.mark.slow
Prefer@pytest.mark.parametrizefor testing similar cases
Files:
tests/unit/memory/backends/mem0/test_mappers.pytests/unit/memory/backends/mem0/test_config.pytests/unit/memory/test_factory.pytests/unit/memory/backends/mem0/test_adapter.pytests/integration/memory/test_mem0_backend.pytests/unit/memory/test_init.py
{src,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names like
example-provider,example-large-001,test-provider,test-small-001
Files:
tests/unit/memory/backends/mem0/test_mappers.pysrc/ai_company/memory/backends/__init__.pysrc/ai_company/memory/backends/mem0/config.pytests/unit/memory/backends/mem0/test_config.pysrc/ai_company/memory/__init__.pytests/unit/memory/test_factory.pysrc/ai_company/memory/backends/mem0/__init__.pytests/unit/memory/backends/mem0/test_adapter.pytests/integration/memory/test_mem0_backend.pysrc/ai_company/memory/backends/mem0/mappers.pysrc/ai_company/memory/backends/mem0/adapter.pysrc/ai_company/memory/factory.pytests/unit/memory/test_init.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Nofrom __future__ import annotations— Python 3.14 has PEP 649
Useexcept A, B:(no parentheses) for exception syntax on Python 3.14 per PEP 758 — ruff enforces this
Type hints required on all public functions; mypy strict mode enforced
Google-style docstrings required on public classes and functions, enforced by ruff D rules
Create new objects, never mutate existing ones. For non-Pydantic internal collections, usecopy.deepcopy()at construction andMappingProxyTypewrapping for read-only enforcement
Fordict/listfields in frozen Pydantic models, usecopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; separate mutable-via-copy models for runtime state usingmodel_copy(update=...)
Use Pydantic v2 withBaseModel,model_validator,computed_field,ConfigDict
Use@computed_fieldfor derived values instead of storing redundant fields (e.g.TokenUsage.total_tokens)
UseNotBlankStrfromcore.typesfor all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls) — use structured concurrency over barecreate_task
Line length must be 88 characters (ruff enforced)
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Every module with business logic must have:from ai_company.observability import get_loggerthenlogger = get_logger(__name__)
Never useimport logging/logging.getLogger()/print()in application code
Use onlyloggeras the variable name (not_logger, notlog)
Use domain-specific event name constants fromai_company.observability.eventsmodules (e.g...
Files:
src/ai_company/memory/backends/__init__.pysrc/ai_company/memory/backends/mem0/config.pysrc/ai_company/memory/__init__.pysrc/ai_company/memory/backends/mem0/__init__.pysrc/ai_company/memory/backends/mem0/mappers.pysrc/ai_company/memory/backends/mem0/adapter.pysrc/ai_company/memory/factory.py
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
pyproject.toml: All version pinning inpyproject.tomluses==(exact versions)
Dependency groups:test(pytest + plugins),dev(includes test + ruff, mypy, pre-commit, commitizen) — install viauv sync
Files:
pyproject.toml
🧬 Code graph analysis (11)
tests/unit/memory/backends/mem0/test_mappers.py (3)
src/ai_company/core/enums.py (1)
MemoryCategory(101-108)src/ai_company/memory/errors.py (1)
MemoryRetrievalError(25-26)src/ai_company/memory/models.py (4)
MemoryEntry(82-150)MemoryMetadata(20-52)MemoryQuery(153-230)MemoryStoreRequest(55-79)
src/ai_company/memory/backends/__init__.py (1)
src/ai_company/memory/backends/mem0/adapter.py (1)
Mem0MemoryBackend(101-772)
src/ai_company/memory/backends/mem0/config.py (1)
src/ai_company/memory/config.py (1)
CompanyMemoryConfig(139-195)
tests/unit/memory/backends/mem0/test_config.py (2)
src/ai_company/memory/backends/mem0/config.py (4)
Mem0BackendConfig(46-78)Mem0EmbedderConfig(17-43)build_config_from_company_config(110-128)build_mem0_config_dict(81-107)src/ai_company/memory/config.py (2)
CompanyMemoryConfig(139-195)MemoryStorageConfig(25-104)
src/ai_company/memory/__init__.py (2)
src/ai_company/memory/backends/mem0/config.py (1)
Mem0EmbedderConfig(17-43)src/ai_company/memory/backends/mem0/adapter.py (1)
Mem0MemoryBackend(101-772)
tests/unit/memory/test_factory.py (5)
src/ai_company/memory/backends/mem0/adapter.py (4)
Mem0MemoryBackend(101-772)is_connected(210-212)backend_name(215-217)max_memories_per_agent(247-249)src/ai_company/memory/backends/mem0/config.py (1)
Mem0EmbedderConfig(17-43)src/ai_company/memory/config.py (2)
CompanyMemoryConfig(139-195)MemoryOptionsConfig(107-136)src/ai_company/memory/errors.py (1)
MemoryConfigError(40-41)src/ai_company/memory/factory.py (1)
create_memory_backend(22-83)
src/ai_company/memory/backends/mem0/__init__.py (2)
src/ai_company/memory/backends/mem0/adapter.py (1)
Mem0MemoryBackend(101-772)src/ai_company/memory/backends/mem0/config.py (2)
Mem0BackendConfig(46-78)Mem0EmbedderConfig(17-43)
tests/unit/memory/backends/mem0/test_adapter.py (5)
src/ai_company/core/enums.py (1)
MemoryCategory(101-108)src/ai_company/memory/backends/mem0/config.py (2)
Mem0BackendConfig(46-78)Mem0EmbedderConfig(17-43)src/ai_company/memory/errors.py (3)
MemoryConnectionError(17-18)MemoryRetrievalError(25-26)MemoryStoreError(21-22)src/ai_company/memory/models.py (2)
MemoryQuery(153-230)MemoryStoreRequest(55-79)tests/unit/memory/test_factory.py (1)
_test_embedder(15-20)
src/ai_company/memory/backends/mem0/mappers.py (3)
src/ai_company/core/enums.py (1)
MemoryCategory(101-108)src/ai_company/memory/errors.py (1)
MemoryRetrievalError(25-26)src/ai_company/memory/models.py (4)
MemoryEntry(82-150)MemoryMetadata(20-52)MemoryQuery(153-230)MemoryStoreRequest(55-79)
src/ai_company/memory/backends/mem0/adapter.py (3)
src/ai_company/core/enums.py (1)
MemoryCategory(101-108)src/ai_company/memory/backends/mem0/config.py (2)
Mem0BackendConfig(46-78)build_mem0_config_dict(81-107)src/ai_company/memory/models.py (1)
MemoryEntry(82-150)
src/ai_company/memory/factory.py (4)
src/ai_company/memory/config.py (1)
CompanyMemoryConfig(139-195)src/ai_company/memory/errors.py (1)
MemoryConfigError(40-41)src/ai_company/memory/protocol.py (1)
MemoryBackend(20-182)src/ai_company/memory/backends/mem0/config.py (2)
Mem0EmbedderConfig(17-43)build_config_from_company_config(110-128)
🔇 Additional comments (20)
tests/unit/memory/backends/mem0/test_mappers.py (2)
1-51: LGTM! Well-structured test helper and imports.The
_make_entryhelper function is well-designed with sensible defaults. Thepytestmarktimeout is appropriate for unit tests.
53-387: Comprehensive test coverage for mapper functions.The test suite covers:
- Metadata serialization/deserialization with edge cases
- Datetime parsing including timezone handling and malformed inputs
- Relevance score normalization with boundary conditions
- Result-to-entry conversion with validation errors
- Query translation and post-filtering logic
Good use of descriptive test names and inline comments (e.g., Line 350) explaining expected behavior.
tests/unit/memory/test_factory.py (1)
15-54: LGTM! Factory tests cover key validation paths.The tests properly validate:
- Backend creation with embedder config
- Config propagation (
max_memories_per_agent)- Error cases for missing/wrong embedder type
The
_test_embedder()helper follows the coding guideline to avoid vendor names.src/ai_company/memory/backends/mem0/config.py (2)
69-78: Good path traversal validation.The validator correctly handles both Windows (
C:\..\) and POSIX (../) path traversal patterns by checking parts from both path types.
81-107: The hardcoded version"v1.1"is correct.The Mem0 SDK documentation confirms that
"v1.1"is the minimum supported version and is used as the default if omitted from the config.src/ai_company/memory/backends/mem0/adapter.py (4)
76-98: LGTM! Validates add results defensively.The helper correctly validates the Mem0
addresult structure and provides clear error messages with context.
125-207: LGTM! Lifecycle methods are well-implemented.Good practices:
- Explicit
ImportErrorhandling with structured loggingasyncio.to_threadfor blocking Mem0 calls- Health check performs real probe (not just flag check)
- Safe disconnect when not connected
559-772: LGTM! Shared knowledge store has proper ownership checks.The implementation correctly:
- Tags published memories with publisher agent ID
- Verifies publisher ownership before retraction
- Excludes requesting agent from search results when specified
- Handles missing publisher metadata gracefully
778-800: LGTM! Helper functions handle edge cases.
_extract_categoryand_extract_publisherdefensively handle missing or invalid metadata with appropriate defaults and logging.tests/unit/memory/backends/mem0/test_adapter.py (5)
1-119: LGTM! Well-organized test fixtures and helpers.The test file has:
- Clean fixture setup with mocked Mem0 client
- Reusable helpers for building mock responses
- Proper isolation by injecting mock client directly
125-209: LGTM! Properties and protocol conformance tests.Good coverage of:
- Backend properties (
backend_name,is_connected)- All capability properties
- Protocol method presence checks
214-350: LGTM! Lifecycle and connection guard tests.Comprehensive coverage of:
- Connect success/failure scenarios
- Disconnect (connected and not connected)
- Health check states (connected, disconnected, probe failure)
- Connection guards for all operations
355-679: LGTM! CRUD operation tests are thorough.Tests cover:
- Success paths with proper mock validation
- Error wrapping (empty results, missing ID, exceptions)
- Edge cases (not found, category filtering, invalid data)
684-893: LGTM! Shared knowledge store tests cover all scenarios.Tests validate:
- Publish with publisher metadata
- Search with/without text, agent exclusion
- Retract ownership verification (success, not found, mismatch, no publisher)
src/ai_company/memory/backends/mem0/mappers.py (6)
32-51: LGTM! Metadata serialization is clean and defensive.Creates a new dict without mutating the input request. Conditionally includes optional fields only when present.
54-94: LGTM! Datetime parsing and score normalization are robust.
- Handles empty strings and None
- Naive datetimes correctly default to UTC
- Malformed values are logged and return None
- Score clamping handles out-of-bounds values
97-144: LGTM! Metadata deserialization with defensive defaults.Handles:
- None/empty metadata → defaults to WORKING category
- Invalid category → logs warning and defaults to WORKING
- Empty/whitespace tags → filtered out
147-193: LGTM! Result-to-entry conversion validates required fields.Properly:
- Validates
idpresence (raises MemoryRetrievalError)- Validates non-empty content
- Falls back to
datakey ifmemoryis missing- Defaults
created_atto now if missing
196-238: LGTM! Query translation functions are straightforward.Clear separation between search (requires text) and get_all (no text) with appropriate validation.
241-273: LGTM! Post-filters implement correct semantics.
- Category filter: entry must be in categories set
- Tag filter: AND semantics (all query tags must be present)
- Time range:
sinceinclusive,untilexclusive- Min relevance: skipped if
relevance_scoreis None (permissive)Order is preserved as documented.
| async def count( | ||
| self, | ||
| agent_id: NotBlankStr, | ||
| *, | ||
| category: MemoryCategory | None = None, | ||
| ) -> int: | ||
| """Count memory entries for an agent. | ||
|
|
||
| Uses ``get_all()`` internally — O(n) in the agent's memory | ||
| count. Acceptable because ``count()`` is not on the hot path. | ||
|
|
||
| Args: | ||
| agent_id: Owning agent identifier. | ||
| category: Optional category filter. | ||
|
|
||
| Returns: | ||
| Number of matching entries. | ||
|
|
||
| Raises: | ||
| MemoryConnectionError: If the backend is not connected. | ||
| MemoryRetrievalError: If the count query fails. | ||
| """ | ||
| self._require_connected() | ||
| try: | ||
| raw_result = await asyncio.to_thread( | ||
| self._client.get_all, | ||
| user_id=str(agent_id), | ||
| limit=self._max_memories_per_agent, | ||
| ) | ||
| raw_list = raw_result.get("results", []) | ||
| if category is None: | ||
| total = len(raw_list) | ||
| else: | ||
| total = sum( | ||
| 1 for item in raw_list if _extract_category(item) == category | ||
| ) | ||
| except MemoryRetrievalError: | ||
| logger.warning( | ||
| MEMORY_ENTRY_COUNT_FAILED, | ||
| agent_id=agent_id, | ||
| error="count query failed", | ||
| error_type="MemoryRetrievalError", | ||
| ) | ||
| raise | ||
| except Exception as exc: | ||
| logger.warning( | ||
| MEMORY_ENTRY_COUNT_FAILED, | ||
| agent_id=agent_id, | ||
| error=str(exc), | ||
| error_type=type(exc).__name__, | ||
| ) | ||
| msg = f"Failed to count memories: {exc}" | ||
| raise MemoryRetrievalError(msg) from exc | ||
| else: | ||
| logger.info( | ||
| MEMORY_ENTRY_COUNTED, | ||
| agent_id=agent_id, | ||
| count=total, | ||
| category=category.value if category else None, | ||
| ) | ||
| return total |
There was a problem hiding this comment.
Potential undercounting if memories exceed max_memories_per_agent.
The count operation uses limit=self._max_memories_per_agent (Line 522), so if an agent has more memories than this limit (e.g., due to configuration changes), the count will be capped and potentially incorrect.
Consider using a higher limit or documenting this limitation explicitly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 495 - 555, The
count method currently uses self._client.get_all with
limit=self._max_memories_per_agent which can cap results and undercount when an
agent has more memories than _max_memories_per_agent; update MemoryAdapter.count
to fetch the full set (e.g. use the backend's pagination or a no-limit/count
endpoint) instead of passing the capped limit — specifically change the call
site using self._client.get_all (and any helpers around _max_memories_per_agent)
to iterate pages or request an unbounded count, then compute total (and preserve
the category filter logic using _extract_category) and keep existing error
handling and logging in count, MEMORY_ENTRY_COUNT_FAILED, and
MEMORY_ENTRY_COUNTED.
| def build_config_from_company_config( | ||
| config: CompanyMemoryConfig, | ||
| *, | ||
| embedder: Mem0EmbedderConfig, | ||
| ) -> Mem0BackendConfig: | ||
| """Derive a ``Mem0BackendConfig`` from the top-level memory config. | ||
|
|
||
| Args: | ||
| config: Company-wide memory configuration. | ||
| embedder: Embedder settings (provider and model must be | ||
| supplied explicitly to avoid vendor names in defaults). | ||
|
|
||
| Returns: | ||
| Mem0-specific backend configuration. | ||
| """ | ||
| return Mem0BackendConfig( | ||
| data_dir=config.storage.data_dir, | ||
| embedder=embedder, | ||
| ) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider exposing collection_name in the builder.
build_config_from_company_config always uses the default collection_name from Mem0BackendConfig. If collection names need to be customizable per deployment, this could be exposed as a parameter.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/memory/backends/mem0/config.py` around lines 110 - 128, The
builder build_config_from_company_config currently always constructs
Mem0BackendConfig without setting collection_name, so add an optional parameter
(e.g., collection_name: Optional[str] = None) to
build_config_from_company_config and pass it into the returned
Mem0BackendConfig(...) (use the provided collection_name when non-empty,
otherwise allow Mem0BackendConfig to keep its default); update the function
signature and the call site(s) to wire through CompanyMemoryConfig or explicit
overrides as needed so deployments can customize collection_name.
There was a problem hiding this comment.
Actionable comments posted: 8
♻️ Duplicate comments (1)
src/ai_company/memory/backends/mem0/adapter.py (1)
510-514:⚠️ Potential issue | 🟠 Major
count()still caps the total atmax_memories_per_agent.Lines 510-514 reuse
self._max_memories_per_agentas theget_all()limit, so the method undercounts as soon as stored memories exceed that cap. This needs pagination or a backend-native count path instead of the retention limit.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 510 - 514, The count() implementation currently passes self._max_memories_per_agent into self._client.get_all (in adapter.py), which caps results and undercounts when stored memories exceed that limit; update count() to either call a backend-native count endpoint on self._client if available (e.g., client.count or similar) or implement pagination by repeatedly calling self._client.get_all with offset/next_token until no more results are returned, aggregating the total rather than using self._max_memories_per_agent as the final limit; ensure you reference and adjust the calls to self._client.get_all and remove reliance on self._max_memories_per_agent for the total count.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/memory/backends/mem0/adapter.py`:
- Around line 326-336: The adapter currently requests only query.limit raw hits
(via query_to_mem0_search_args / query_to_mem0_getall_args and
self._client.search / get_all) then applies apply_post_filters, which can leave
fewer than query.limit results if many are filtered out; change the logic to
over-fetch / paginate: call query_to_mem0_search_args or
query_to_mem0_getall_args with a larger limit or with offset pagination and loop
calling self._client.search / self._client.get_all, converting results with
mem0_result_to_entry and accumulating entries, applying apply_post_filters on
the accumulated set each iteration and stop once the filtered entries reach
query.limit or the backend is exhausted; ensure the same change is applied to
both places that use these calls (the blocks using
query_to_mem0_search_args/query_to_mem0_getall_args, mem0_result_to_entry, and
apply_post_filters).
- Around line 364-455: The get() and delete() methods currently fetch Mem0
records by ID without validating ownership and mem0_result_to_entry() relabels
the entry with the caller's agent_id; update get() and delete() to retrieve the
raw record, check the record's owner/user_id (as returned by Mem0) against the
provided agent_id and reject mismatches before returning or deleting (follow the
ownership-check pattern used in retract()); for get(), if owner mismatch log
MEMORY_ENTRY_FETCH_FAILED and raise MemoryRetrievalError, and for delete() log
MEMORY_ENTRY_DELETED with found=False and raise MemoryStoreError (or return
False per existing semantics) when ownership does not match; also adjust
mem0_result_to_entry() usage so it does not overwrite the original owner field
when validating ownership.
In `@src/ai_company/memory/backends/mem0/mappers.py`:
- Around line 114-164: The code currently assumes raw_metadata is a mapping and
only checks truthiness, causing AttributeError for non-object metadata like
lists or strings; update parse_mem0_metadata(), extract_category(), and
extract_publisher() to treat non-dict/raw-mapping metadata as absent by first
verifying raw_metadata is an instance of dict (or collections.abc.Mapping) and
if not, set raw_metadata = {} (or early-return the default tuple used when
metadata is absent), then proceed with the existing logic that reads keys like
f"{_PREFIX}category" and f"{_PREFIX}confidence" so malformed but truthy payloads
(e.g., [], "oops") gracefully fall back to default category, metadata, and
expires_at.
- Around line 332-342: The add-result handling in the mem0 mapper must
explicitly reject non-dict or blank 'id' values: in the block using
result.get("results") and the variable first, first check that results_list is a
non-empty list, then assert isinstance(first, dict) before accessing keys, and
verify that first.get("id") is a non-empty string (not "", None, or other types)
before returning NotBlankStr(str(first["id"])); if those checks fail, call
logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) and raise
MemoryStoreError(msg) with a clear message so you never call first.keys() on
non-dicts or wrap blank ids in NotBlankStr.
- Around line 205-207: The current code synthesizes created_at =
datetime.now(UTC) when parse_mem0_datetime(raw.get("created_at")) returns None,
which improperly makes malformed rows appear brand-new; change this to either
reject the row (raise a validation error/return None so the caller drops it) or
assign a deterministic floor timestamp (e.g., a constant EPOCH_FLOOR like
datetime(1970,1,1,tzinfo=UTC)) instead of datetime.now; update the handling in
the mappers.py logic around parse_mem0_datetime and created_at so callers get a
clear rejection or a stable floor value and add/declare a named constant (e.g.,
EPOCH_FLOOR or MIN_CREATED_AT) if choosing the deterministic floor option.
In `@src/ai_company/memory/factory.py`:
- Around line 54-75: The logger currently emits MEMORY_BACKEND_UNKNOWN for
embedder validation failures in the embedder checks inside the memory factory;
change those logger.warning calls (the ones guarding embedder is None and
isinstance(embedder, Mem0EmbedderConfig)) to use a more specific event constant
(e.g., MEMORY_BACKEND_CONFIG_INVALID) to clearly indicate a
configuration/validation error rather than an unknown backend, keeping the same
parameters (backend="mem0", error=msg, and embedder_type where present) and
still raising MemoryConfigError; update any imports or constant definitions if
needed to reference MEMORY_BACKEND_CONFIG_INVALID so the two embedder-check
branches (the embedder variable and Mem0EmbedderConfig check) log the new event.
In `@tests/integration/memory/test_mem0_backend.py`:
- Around line 156-160: The test only asserts one expected memory phrase; update
the assertions to verify both injected memories survive relevance filtering by
checking for the second expected substring in the combined text (use the
existing messages/combined variables). For example, after the existing assert
"concise responses" in combined, add an assertion that the second memory phrase
(e.g., "episodic memory" or the exact second memory content used in the test
setup) is present in combined so the test fails if the episodic memory is
dropped.
In `@tests/unit/memory/backends/mem0/test_adapter.py`:
- Around line 308-373: Replace the eight nearly-identical tests in
TestConnectionGuard with a single parametrized test using
`@pytest.mark.parametrize` that lists the target method names (store, retrieve,
get, delete, count, publish, search_shared, retract) and their respective
argument tuples (e.g., ("test-agent-001", _make_store_request()), (
"test-agent-001", MemoryQuery(text="test")), ("test-agent-001", "mem-001"),
("test-agent-001",), etc.), then inside the test instantiate
Mem0MemoryBackend(mem0_config=mem0_config), use getattr(b, method_name) to
obtain the coroutine, and assert it raises MemoryConnectionError with match "Not
connected" when awaited; keep references to Mem0MemoryBackend,
MemoryConnectionError, _make_store_request, and MemoryQuery to locate the code.
---
Duplicate comments:
In `@src/ai_company/memory/backends/mem0/adapter.py`:
- Around line 510-514: The count() implementation currently passes
self._max_memories_per_agent into self._client.get_all (in adapter.py), which
caps results and undercounts when stored memories exceed that limit; update
count() to either call a backend-native count endpoint on self._client if
available (e.g., client.count or similar) or implement pagination by repeatedly
calling self._client.get_all with offset/next_token until no more results are
returned, aggregating the total rather than using self._max_memories_per_agent
as the final limit; ensure you reference and adjust the calls to
self._client.get_all and remove reliance on self._max_memories_per_agent for the
total count.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2593886f-c386-46e3-a2c4-dfef93265075
📒 Files selected for processing (13)
CLAUDE.mdREADME.mddocs/design/memory.mddocs/roadmap/index.mdsrc/ai_company/memory/backends/mem0/adapter.pysrc/ai_company/memory/backends/mem0/config.pysrc/ai_company/memory/backends/mem0/mappers.pysrc/ai_company/memory/factory.pytests/integration/memory/test_mem0_backend.pytests/unit/memory/backends/mem0/test_adapter.pytests/unit/memory/backends/mem0/test_config.pytests/unit/memory/backends/mem0/test_mappers.pytests/unit/memory/test_factory.py
💤 Files with no reviewable changes (1)
- docs/roadmap/index.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Mark tests with@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e, and@pytest.mark.slow
Prefer@pytest.mark.parametrizefor testing similar cases
Files:
tests/unit/memory/backends/mem0/test_config.pytests/unit/memory/backends/mem0/test_mappers.pytests/unit/memory/backends/mem0/test_adapter.pytests/unit/memory/test_factory.pytests/integration/memory/test_mem0_backend.py
{src,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names like
example-provider,example-large-001,test-provider,test-small-001
Files:
tests/unit/memory/backends/mem0/test_config.pysrc/ai_company/memory/backends/mem0/config.pytests/unit/memory/backends/mem0/test_mappers.pytests/unit/memory/backends/mem0/test_adapter.pysrc/ai_company/memory/factory.pysrc/ai_company/memory/backends/mem0/adapter.pytests/unit/memory/test_factory.pysrc/ai_company/memory/backends/mem0/mappers.pytests/integration/memory/test_mem0_backend.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Nofrom __future__ import annotations— Python 3.14 has PEP 649
Useexcept A, B:(no parentheses) for exception syntax on Python 3.14 per PEP 758 — ruff enforces this
Type hints required on all public functions; mypy strict mode enforced
Google-style docstrings required on public classes and functions, enforced by ruff D rules
Create new objects, never mutate existing ones. For non-Pydantic internal collections, usecopy.deepcopy()at construction andMappingProxyTypewrapping for read-only enforcement
Fordict/listfields in frozen Pydantic models, usecopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; separate mutable-via-copy models for runtime state usingmodel_copy(update=...)
Use Pydantic v2 withBaseModel,model_validator,computed_field,ConfigDict
Use@computed_fieldfor derived values instead of storing redundant fields (e.g.TokenUsage.total_tokens)
UseNotBlankStrfromcore.typesfor all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators
Preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls) — use structured concurrency over barecreate_task
Line length must be 88 characters (ruff enforced)
Functions must be less than 50 lines; files must be less than 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
Every module with business logic must have:from ai_company.observability import get_loggerthenlogger = get_logger(__name__)
Never useimport logging/logging.getLogger()/print()in application code
Use onlyloggeras the variable name (not_logger, notlog)
Use domain-specific event name constants fromai_company.observability.eventsmodules (e.g...
Files:
src/ai_company/memory/backends/mem0/config.pysrc/ai_company/memory/factory.pysrc/ai_company/memory/backends/mem0/adapter.pysrc/ai_company/memory/backends/mem0/mappers.py
docs/design/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
When approved deviations occur, update the relevant
docs/design/page to reflect the new reality
Files:
docs/design/memory.md
🧠 Learnings (6)
📚 Learning: 2026-03-12T22:12:29.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to tests/**/*.py : Prefer `pytest.mark.parametrize` for testing similar cases
Applied to files:
tests/unit/memory/backends/mem0/test_config.py
📚 Learning: 2026-03-12T22:12:29.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising an exception
Applied to files:
src/ai_company/memory/factory.py
📚 Learning: 2026-03-12T22:12:29.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to pyproject.toml : Dependency groups: `test` (pytest + plugins), `dev` (includes test + ruff, mypy, pre-commit, commitizen) — install via `uv sync`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-12T22:12:29.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to pyproject.toml : All version pinning in `pyproject.toml` uses `==` (exact versions)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-12T22:12:29.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to src/ai_company/providers/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError` — all others raise immediately
Applied to files:
src/ai_company/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-12T22:12:29.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T22:12:29.867Z
Learning: Applies to src/ai_company/{providers,engine}/**/*.py : `RetryExhaustedError` signals all retries failed — the engine layer catches this to trigger fallback chains
Applied to files:
src/ai_company/memory/backends/mem0/adapter.py
🧬 Code graph analysis (5)
tests/unit/memory/backends/mem0/test_config.py (2)
src/ai_company/memory/backends/mem0/config.py (4)
Mem0BackendConfig(46-78)Mem0EmbedderConfig(17-43)build_config_from_company_config(111-129)build_mem0_config_dict(81-108)src/ai_company/memory/config.py (2)
CompanyMemoryConfig(139-195)MemoryStorageConfig(25-104)
src/ai_company/memory/backends/mem0/config.py (1)
src/ai_company/memory/config.py (1)
CompanyMemoryConfig(139-195)
tests/unit/memory/backends/mem0/test_adapter.py (3)
src/ai_company/core/enums.py (1)
MemoryCategory(101-108)src/ai_company/memory/backends/mem0/config.py (2)
Mem0BackendConfig(46-78)Mem0EmbedderConfig(17-43)src/ai_company/memory/errors.py (4)
MemoryConnectionError(17-18)MemoryRetrievalError(25-26)MemoryStoreError(21-22)MemoryError(13-14)
src/ai_company/memory/backends/mem0/adapter.py (1)
src/ai_company/memory/backends/mem0/mappers.py (8)
apply_post_filters(280-312)build_mem0_metadata(38-57)extract_category(345-367)extract_publisher(370-376)mem0_result_to_entry(167-226)query_to_mem0_getall_args(261-277)query_to_mem0_search_args(229-258)validate_add_result(318-342)
src/ai_company/memory/backends/mem0/mappers.py (4)
src/ai_company/core/enums.py (1)
MemoryCategory(101-108)src/ai_company/memory/errors.py (2)
MemoryRetrievalError(25-26)MemoryStoreError(21-22)src/ai_company/memory/models.py (4)
MemoryEntry(82-150)MemoryMetadata(20-52)MemoryQuery(153-230)MemoryStoreRequest(55-79)src/ai_company/observability/_logger.py (1)
get_logger(8-28)
🔇 Additional comments (23)
README.md (1)
130-130: LGTM!The status line accurately reflects the addition of the Mem0 backend adapter to the memory subsystem.
docs/design/memory.md (1)
33-33: LGTM!Design documentation correctly updated to reflect that Mem0 backend is now implemented, aligning with the coding guideline to update design pages when implementation status changes.
CLAUDE.md (2)
97-97: LGTM!Package structure documentation accurately reflects the new Mem0 adapter location at
backends/mem0/.
208-208: LGTM!Clear documentation of the optional
mem0aidependency requirement when using the Mem0 backend.tests/unit/memory/backends/mem0/test_config.py (4)
1-14: LGTM!Well-structured test file with proper module docstring, imports, and timeout marker. Uses vendor-agnostic test values (
test-provider,test-embedding-001) per coding guidelines.
27-64: LGTM!Comprehensive test coverage for
Mem0EmbedderConfig: validates required fields, custom values, defaults, immutability, and rejection of invalid inputs (zero dims, blank strings).
66-117: LGTM!Good test coverage for
Mem0BackendConfig. The path traversal test is correctly parametrized with Unix, Windows, and relative path variants as requested in the past review.
120-191: LGTM!Builder function tests validate the expected dict structure for
build_mem0_config_dictand proper derivation inbuild_config_from_company_config. Coverage includes default and custom configurations.src/ai_company/memory/factory.py (2)
1-22: LGTM!Clean module structure with
TYPE_CHECKINGguard forMem0EmbedderConfigimport to avoid circular dependencies while maintaining type safety.
77-90: LGTM!Backend creation correctly uses the config builder and logs
MEMORY_BACKEND_CREATEDwith relevant context.tests/unit/memory/test_factory.py (2)
15-20: LGTM!Good helper function with vendor-agnostic defaults. Consistent with the helper in
test_config.py.
23-54: LGTM!Comprehensive test coverage for the factory: validates backend creation, configuration propagation (
max_memories_per_agent), and both embedder validation error paths (missing and wrong type).tests/unit/memory/backends/mem0/test_mappers.py (7)
1-28: LGTM!Well-organized test module with proper imports, timeout marker, and a reusable
_make_entryhelper. Importing_PREFIX(a private constant) is acceptable for test verification of metadata key prefixes.
53-87: LGTM!Good coverage for
build_mem0_metadatawith basic and full metadata scenarios, verifying correct key prefixing and value serialization.
89-137: LGTM!Thorough datetime parsing tests covering None, empty string, aware/naive ISO formats, non-UTC timezones, and malformed inputs.
139-183: LGTM!Good coverage for
parse_mem0_metadataincluding fallback behavior for missing/invalid categories and empty tag filtering.
185-266: LGTM!Comprehensive
mem0_result_to_entrytests covering basic conversion, score handling, missing timestamps, no metadata, validation errors (missing ID, empty content), and data-key fallback.
268-292: LGTM!Query translation tests validate correct argument mapping and error handling for missing query text.
294-387: LGTM!Thorough
apply_post_filterstests covering no filters, category, tag, time range (since/until), min relevance, and combined filter scenarios.src/ai_company/memory/backends/mem0/config.py (4)
17-43: LGTM!Well-designed embedder config with required
providerandmodelfields to avoid vendor-specific defaults. Proper use ofNotBlankStrand frozen model.
46-78: LGTM!Good defensive validation against path traversal using both Windows and POSIX path parsing. The combined parts check catches
..segments regardless of platform path separators used.
111-129: LGTM!Clean builder that derives
Mem0BackendConfigfromCompanyMemoryConfig. The explicitembedderparameter enforces vendor-agnostic configuration. Per a past review,collection_namecould optionally be exposed as a parameter for per-deployment customization, but the current default-based approach is acceptable.
81-108: No action needed. The hardcoded version"v1.1"is correct for the pinned mem0ai dependency (v1.0.5), which requires this schema version per the Mem0 migration documentation.
| if query.text is not None: | ||
| kwargs = query_to_mem0_search_args(str(agent_id), query) | ||
| raw_result = await asyncio.to_thread(self._client.search, **kwargs) | ||
| else: | ||
| kwargs = query_to_mem0_getall_args(str(agent_id), query) | ||
| raw_result = await asyncio.to_thread(self._client.get_all, **kwargs) | ||
| raw_list = raw_result.get("results", []) | ||
| entries = tuple( | ||
| mem0_result_to_entry(item, str(agent_id)) for item in raw_list | ||
| ) | ||
| entries = apply_post_filters(entries, query) |
There was a problem hiding this comment.
Local filters run after an already-capped fetch.
Lines 327-331 and 638-649 ask Mem0 for only query.limit raw hits, then Lines 336 and 659-662 apply category/tag/time/relevance and exclude_agent filtering locally. If the first page is dominated by filtered-out items, these methods return too few results even when more eligible memories exist. Page or over-fetch until the final filtered set reaches query.limit or the backend is exhausted.
Also applies to: 637-663
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 326 - 336, The
adapter currently requests only query.limit raw hits (via
query_to_mem0_search_args / query_to_mem0_getall_args and self._client.search /
get_all) then applies apply_post_filters, which can leave fewer than query.limit
results if many are filtered out; change the logic to over-fetch / paginate:
call query_to_mem0_search_args or query_to_mem0_getall_args with a larger limit
or with offset pagination and loop calling self._client.search /
self._client.get_all, converting results with mem0_result_to_entry and
accumulating entries, applying apply_post_filters on the accumulated set each
iteration and stop once the filtered entries reach query.limit or the backend is
exhausted; ensure the same change is applied to both places that use these calls
(the blocks using query_to_mem0_search_args/query_to_mem0_getall_args,
mem0_result_to_entry, and apply_post_filters).
| async def get( | ||
| self, | ||
| agent_id: NotBlankStr, | ||
| memory_id: NotBlankStr, | ||
| ) -> MemoryEntry | None: | ||
| """Get a specific memory entry by ID. | ||
|
|
||
| Args: | ||
| agent_id: Owning agent identifier. | ||
| memory_id: Memory identifier. | ||
|
|
||
| Returns: | ||
| The memory entry, or ``None`` if not found. | ||
|
|
||
| Raises: | ||
| MemoryConnectionError: If the backend is not connected. | ||
| MemoryRetrievalError: If the backend query fails. | ||
| """ | ||
| self._require_connected() | ||
| try: | ||
| raw = await asyncio.to_thread(self._client.get, str(memory_id)) | ||
| if raw is None: | ||
| logger.debug( | ||
| MEMORY_ENTRY_FETCHED, | ||
| agent_id=agent_id, | ||
| memory_id=memory_id, | ||
| found=False, | ||
| ) | ||
| return None | ||
| entry = mem0_result_to_entry(raw, str(agent_id)) | ||
| except MemoryRetrievalError as exc: | ||
| logger.warning( | ||
| MEMORY_ENTRY_FETCH_FAILED, | ||
| agent_id=agent_id, | ||
| memory_id=memory_id, | ||
| error=str(exc), | ||
| error_type="MemoryRetrievalError", | ||
| ) | ||
| raise | ||
| except MemoryError, RecursionError: | ||
| raise | ||
| except Exception as exc: | ||
| logger.warning( | ||
| MEMORY_ENTRY_FETCH_FAILED, | ||
| agent_id=agent_id, | ||
| memory_id=memory_id, | ||
| error=str(exc), | ||
| error_type=type(exc).__name__, | ||
| ) | ||
| msg = f"Failed to get memory {memory_id}: {exc}" | ||
| raise MemoryRetrievalError(msg) from exc | ||
| else: | ||
| logger.debug( | ||
| MEMORY_ENTRY_FETCHED, | ||
| agent_id=agent_id, | ||
| memory_id=memory_id, | ||
| found=True, | ||
| ) | ||
| return entry | ||
|
|
||
| async def delete( | ||
| self, | ||
| agent_id: NotBlankStr, | ||
| memory_id: NotBlankStr, | ||
| ) -> bool: | ||
| """Delete a specific memory entry. | ||
|
|
||
| Args: | ||
| agent_id: Owning agent identifier. | ||
| memory_id: Memory identifier. | ||
|
|
||
| Returns: | ||
| ``True`` if the entry was deleted, ``False`` if not found. | ||
|
|
||
| Raises: | ||
| MemoryConnectionError: If the backend is not connected. | ||
| MemoryStoreError: If the delete operation fails. | ||
| """ | ||
| self._require_connected() | ||
| try: | ||
| # Check existence first — Mem0 delete doesn't indicate | ||
| # whether the entry existed. | ||
| existing = await asyncio.to_thread(self._client.get, str(memory_id)) | ||
| if existing is None: | ||
| logger.debug( | ||
| MEMORY_ENTRY_DELETED, | ||
| agent_id=agent_id, | ||
| memory_id=memory_id, | ||
| found=False, | ||
| ) | ||
| return False | ||
| await asyncio.to_thread(self._client.delete, str(memory_id)) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the complete get() and delete() methods
cat -n src/ai_company/memory/backends/mem0/adapter.py | sed -n '364,455p'Repository: Aureliolo/synthorg
Length of output: 3703
🏁 Script executed:
# Find the mem0_result_to_entry function to understand owner handling
rg -n "mem0_result_to_entry" src/ai_company/memory/backends/mem0/ -A 15Repository: Aureliolo/synthorg
Length of output: 6730
🏁 Script executed:
# Look for how agent_id is handled throughout this file
rg -n "agent_id" src/ai_company/memory/backends/mem0/adapter.py -B 2 -A 2Repository: Aureliolo/synthorg
Length of output: 9153
🏁 Script executed:
# Check the imports and what Mem0 client is being used
head -50 src/ai_company/memory/backends/mem0/adapter.pyRepository: Aureliolo/synthorg
Length of output: 1416
🏁 Script executed:
# Get the complete mem0_result_to_entry implementation
sed -n '167,200p' src/ai_company/memory/backends/mem0/mappers.pyRepository: Aureliolo/synthorg
Length of output: 1123
🏁 Script executed:
# Check if there's any validation of the returned user_id against the passed agent_id
rg -n "user_id" src/ai_company/memory/backends/mem0/mappers.py -B 3 -A 3Repository: Aureliolo/synthorg
Length of output: 394
🏁 Script executed:
# Look for the full mappers file to understand the conversion logic
wc -l src/ai_company/memory/backends/mem0/mappers.pyRepository: Aureliolo/synthorg
Length of output: 112
🏁 Script executed:
# Get the broader context around mem0_result_to_entry
sed -n '167,230p' src/ai_company/memory/backends/mem0/mappers.pyRepository: Aureliolo/synthorg
Length of output: 1966
🏁 Script executed:
# Check if Mem0 responses include a user_id field that could be validated
rg -n "user_id" src/ai_company/memory/backends/mem0/mappers.py -B 5 -A 5 | head -60Repository: Aureliolo/synthorg
Length of output: 531
🏁 Script executed:
# Look at retrieve() to see how it uses user_id filtering
sed -n '303,362p' src/ai_company/memory/backends/mem0/adapter.pyRepository: Aureliolo/synthorg
Length of output: 2236
🏁 Script executed:
# Check the Mem0 client initialization and configuration to understand scoping
sed -n '1,100p' src/ai_company/memory/backends/mem0/adapter.py | tail -60Repository: Aureliolo/synthorg
Length of output: 1730
🏁 Script executed:
# Check if there's any configuration that might make the client scoped to a user
sed -n '100,240p' src/ai_company/memory/backends/mem0/adapter.pyRepository: Aureliolo/synthorg
Length of output: 4630
🏁 Script executed:
# Look at the config to see if there's any user_id scoping at the client level
cat src/ai_company/memory/backends/mem0/config.pyRepository: Aureliolo/synthorg
Length of output: 4083
🏁 Script executed:
# Double-check: look at what fields are in the Mem0 result dict by examining any tests or examples
rg -n "user_id" src/ai_company/memory/backends/mem0/mappers.py -B 2 -A 2 | grep -A 10 "raw.get"Repository: Aureliolo/synthorg
Length of output: 44
🏁 Script executed:
# Check if there's any documentation or tests showing what fields are in Mem0 results
fd -e py mem0 -path "*/tests/*" | head -5Repository: Aureliolo/synthorg
Length of output: 230
🏁 Script executed:
# Verify by looking at store() method to see if it validates the user_id in responses
sed -n '245,300p' src/ai_company/memory/backends/mem0/adapter.pyRepository: Aureliolo/synthorg
Length of output: 1898
Add ownership validation to get() and delete() before returning or removing the record.
The get() and delete() methods call Mem0's ID-based operations without passing a user_id filter, unlike retrieve(), store(), and other methods that explicitly scope by agent. Additionally, mem0_result_to_entry() never validates that the fetched entry's owner matches the caller; it simply re-labels the entry with the caller's agent_id. This creates a potential cross-agent access path if Mem0 IDs are globally addressable. Retrieve the entry, validate that its owner (if returned by Mem0) matches the provided agent_id, and reject mismatches before returning or deleting the record—following the pattern already used in retract() (line 739).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 364 - 455, The
get() and delete() methods currently fetch Mem0 records by ID without validating
ownership and mem0_result_to_entry() relabels the entry with the caller's
agent_id; update get() and delete() to retrieve the raw record, check the
record's owner/user_id (as returned by Mem0) against the provided agent_id and
reject mismatches before returning or deleting (follow the ownership-check
pattern used in retract()); for get(), if owner mismatch log
MEMORY_ENTRY_FETCH_FAILED and raise MemoryRetrievalError, and for delete() log
MEMORY_ENTRY_DELETED with found=False and raise MemoryStoreError (or return
False per existing semantics) when ownership does not match; also adjust
mem0_result_to_entry() usage so it does not overwrite the original owner field
when validating ownership.
| created_at = parse_mem0_datetime(raw.get("created_at")) | ||
| if created_at is None: | ||
| created_at = datetime.now(UTC) |
There was a problem hiding this comment.
Don't synthesize created_at from now().
When parsing fails, Line 207 makes the row look brand-new. That skews recency ranking and since/until filters, so malformed timestamps can float to the top of retrieval results. Reject the row or use a deterministic floor timestamp instead.
One safe option
created_at = parse_mem0_datetime(raw.get("created_at"))
if created_at is None:
- created_at = datetime.now(UTC)
+ msg = f"Mem0 result {memory_id} has missing or malformed created_at"
+ logger.warning(
+ MEMORY_MODEL_INVALID,
+ field="created_at",
+ raw_value=raw.get("created_at"),
+ reason=msg,
+ )
+ raise MemoryRetrievalError(msg)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 205 - 207, The
current code synthesizes created_at = datetime.now(UTC) when
parse_mem0_datetime(raw.get("created_at")) returns None, which improperly makes
malformed rows appear brand-new; change this to either reject the row (raise a
validation error/return None so the caller drops it) or assign a deterministic
floor timestamp (e.g., a constant EPOCH_FLOOR like
datetime(1970,1,1,tzinfo=UTC)) instead of datetime.now; update the handling in
the mappers.py logic around parse_mem0_datetime and created_at so callers get a
clear rejection or a stable floor value and add/declare a named constant (e.g.,
EPOCH_FLOOR or MIN_CREATED_AT) if choosing the deterministic floor option.
| results_list = result.get("results") | ||
| if not isinstance(results_list, list) or not results_list: | ||
| msg = f"Mem0 add returned no results for {context}" | ||
| logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) | ||
| raise MemoryStoreError(msg) | ||
| first = results_list[0] | ||
| if "id" not in first: | ||
| msg = f"Mem0 add result missing 'id' for {context}: keys={list(first.keys())}" | ||
| logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) | ||
| raise MemoryStoreError(msg) | ||
| return NotBlankStr(str(first["id"])) |
There was a problem hiding this comment.
Reject blank or non-object add results explicitly.
"id" not in first only checks key presence. {"id": ""} or a non-dict first result currently falls through into NotBlankStr(...) / first.keys() and gets reported as a generic store failure instead of the intended validation error.
Suggested hardening
def validate_add_result(result: dict[str, Any], *, context: str) -> NotBlankStr:
@@
- results_list = result.get("results")
+ results_list = result.get("results") if isinstance(result, dict) else None
if not isinstance(results_list, list) or not results_list:
msg = f"Mem0 add returned no results for {context}"
logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg)
raise MemoryStoreError(msg)
first = results_list[0]
- if "id" not in first:
- msg = f"Mem0 add result missing 'id' for {context}: keys={list(first.keys())}"
+ if not isinstance(first, dict):
+ msg = f"Mem0 add result is not an object for {context}: {type(first).__name__}"
+ logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg)
+ raise MemoryStoreError(msg)
+ raw_id = first.get("id")
+ if raw_id is None or not str(raw_id).strip():
+ msg = (
+ f"Mem0 add result missing or blank 'id' for {context}: "
+ f"keys={list(first.keys())}"
+ )
logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg)
raise MemoryStoreError(msg)
- return NotBlankStr(str(first["id"]))
+ return NotBlankStr(str(raw_id))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 332 - 342, The
add-result handling in the mem0 mapper must explicitly reject non-dict or blank
'id' values: in the block using result.get("results") and the variable first,
first check that results_list is a non-empty list, then assert isinstance(first,
dict) before accessing keys, and verify that first.get("id") is a non-empty
string (not "", None, or other types) before returning
NotBlankStr(str(first["id"])); if those checks fail, call
logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) and raise
MemoryStoreError(msg) with a clear message so you never call first.keys() on
non-dicts or wrap blank ids in NotBlankStr.
| # Should produce at least one message with memory context | ||
| assert len(messages) >= 1 | ||
| # Content should include both memories (they pass min_relevance) | ||
| combined = " ".join(m.content for m in messages if m.content) | ||
| assert "concise responses" in combined |
There was a problem hiding this comment.
Assert the second injected memory too.
Line 158 says both memories should survive the relevance gate, but the test only checks the first one. A regression that drops the episodic memory would still pass.
Suggested assertion
combined = " ".join(m.content for m in messages if m.content)
assert "concise responses" in combined
+ assert "last task was code review" in combined🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/integration/memory/test_mem0_backend.py` around lines 156 - 160, The
test only asserts one expected memory phrase; update the assertions to verify
both injected memories survive relevance filtering by checking for the second
expected substring in the combined text (use the existing messages/combined
variables). For example, after the existing assert "concise responses" in
combined, add an assertion that the second memory phrase (e.g., "episodic
memory" or the exact second memory content used in the test setup) is present in
combined so the test fails if the episodic memory is dropped.
| @pytest.mark.unit | ||
| class TestConnectionGuard: | ||
| async def test_store_raises_when_disconnected( | ||
| self, | ||
| mem0_config: Mem0BackendConfig, | ||
| ) -> None: | ||
| b = Mem0MemoryBackend(mem0_config=mem0_config) | ||
| with pytest.raises(MemoryConnectionError, match="Not connected"): | ||
| await b.store("test-agent-001", _make_store_request()) | ||
|
|
||
| async def test_retrieve_raises_when_disconnected( | ||
| self, | ||
| mem0_config: Mem0BackendConfig, | ||
| ) -> None: | ||
| b = Mem0MemoryBackend(mem0_config=mem0_config) | ||
| with pytest.raises(MemoryConnectionError, match="Not connected"): | ||
| await b.retrieve("test-agent-001", MemoryQuery(text="test")) | ||
|
|
||
| async def test_get_raises_when_disconnected( | ||
| self, | ||
| mem0_config: Mem0BackendConfig, | ||
| ) -> None: | ||
| b = Mem0MemoryBackend(mem0_config=mem0_config) | ||
| with pytest.raises(MemoryConnectionError, match="Not connected"): | ||
| await b.get("test-agent-001", "mem-001") | ||
|
|
||
| async def test_delete_raises_when_disconnected( | ||
| self, | ||
| mem0_config: Mem0BackendConfig, | ||
| ) -> None: | ||
| b = Mem0MemoryBackend(mem0_config=mem0_config) | ||
| with pytest.raises(MemoryConnectionError, match="Not connected"): | ||
| await b.delete("test-agent-001", "mem-001") | ||
|
|
||
| async def test_count_raises_when_disconnected( | ||
| self, | ||
| mem0_config: Mem0BackendConfig, | ||
| ) -> None: | ||
| b = Mem0MemoryBackend(mem0_config=mem0_config) | ||
| with pytest.raises(MemoryConnectionError, match="Not connected"): | ||
| await b.count("test-agent-001") | ||
|
|
||
| async def test_publish_raises_when_disconnected( | ||
| self, | ||
| mem0_config: Mem0BackendConfig, | ||
| ) -> None: | ||
| b = Mem0MemoryBackend(mem0_config=mem0_config) | ||
| with pytest.raises(MemoryConnectionError, match="Not connected"): | ||
| await b.publish("test-agent-001", _make_store_request()) | ||
|
|
||
| async def test_search_shared_raises_when_disconnected( | ||
| self, | ||
| mem0_config: Mem0BackendConfig, | ||
| ) -> None: | ||
| b = Mem0MemoryBackend(mem0_config=mem0_config) | ||
| with pytest.raises(MemoryConnectionError, match="Not connected"): | ||
| await b.search_shared(MemoryQuery(text="test")) | ||
|
|
||
| async def test_retract_raises_when_disconnected( | ||
| self, | ||
| mem0_config: Mem0BackendConfig, | ||
| ) -> None: | ||
| b = Mem0MemoryBackend(mem0_config=mem0_config) | ||
| with pytest.raises(MemoryConnectionError, match="Not connected"): | ||
| await b.retract("test-agent-001", "mem-001") | ||
|
|
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Parametrize the disconnected-guard matrix.
These eight tests differ only by the invoked method and the expected MemoryConnectionError. A single parametrized table will keep this surface much easier to extend and audit. As per coding guidelines, "Prefer @pytest.mark.parametrize for testing similar cases".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/memory/backends/mem0/test_adapter.py` around lines 308 - 373,
Replace the eight nearly-identical tests in TestConnectionGuard with a single
parametrized test using `@pytest.mark.parametrize` that lists the target method
names (store, retrieve, get, delete, count, publish, search_shared, retract) and
their respective argument tuples (e.g., ("test-agent-001",
_make_store_request()), ( "test-agent-001", MemoryQuery(text="test")),
("test-agent-001", "mem-001"), ("test-agent-001",), etc.), then inside the test
instantiate Mem0MemoryBackend(mem0_config=mem0_config), use getattr(b,
method_name) to obtain the coroutine, and assert it raises MemoryConnectionError
with match "Not connected" when awaited; keep references to Mem0MemoryBackend,
MemoryConnectionError, _make_store_request, and MemoryQuery to locate the code.
There was a problem hiding this comment.
♻️ Duplicate comments (10)
src/ai_company/memory/backends/mem0/mappers.py (5)
103-164:⚠️ Potential issue | 🟠 MajorTreat non-dict
metadataas absent.The function checks
if not raw_metadata:(line 114) but doesn't verify it's actually a dict. A truthy malformed payload likemetadata=[]ormetadata="oops"would raiseAttributeErrorwhen calling.get()on line 121 and subsequent lines. Add anisinstance(raw_metadata, dict)check.🛡️ Suggested fix
def parse_mem0_metadata( raw_metadata: dict[str, Any] | None, ) -> tuple[MemoryCategory, MemoryMetadata, AwareDatetime | None]: - if not raw_metadata: + if not raw_metadata or not isinstance(raw_metadata, dict): return ( MemoryCategory.WORKING, MemoryMetadata(), None, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 103 - 164, The function parse_mem0_metadata currently only checks truthiness of raw_metadata and will raise when raw_metadata is a non-dict (e.g., list or string); update the initial guard to treat any non-dict as absent by checking isinstance(raw_metadata, dict) and returning the default (MemoryCategory.WORKING, MemoryMetadata(), None) when it's not a dict or is falsy, so all subsequent calls like raw_metadata.get(...) are safe.
337-346:⚠️ Potential issue | 🟡 MinorValidate that
firstis a dict before calling.get().If
results_list[0]is not a dict (e.g., a string or number), line 338first.get("id")will raiseAttributeError. Add a type check for robustness.🛡️ Suggested fix
first = results_list[0] + if not isinstance(first, dict): + msg = f"Mem0 add result is not an object for {context}: {type(first).__name__}" + logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) + raise MemoryStoreError(msg) raw_id = first.get("id")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 337 - 346, The code assumes results_list[0] is a dict and calls first.get("id") which can raise AttributeError for non-dict items; update the block in the mapper where results_list and first are used to first verify type: check isinstance(first, dict) (or mapping) before accessing .get, and if not a dict log the same MEMORY_ENTRY_STORE_FAILED via logger with context and an explanatory msg and raise MemoryStoreError; keep the existing behavior of returning NotBlankStr(str(raw_id)) only after successful type and non-blank id validation so callers of the function (and symbols results_list, first, logger, MEMORY_ENTRY_STORE_FAILED, MemoryStoreError, NotBlankStr) are protected from type errors.
349-371:⚠️ Potential issue | 🟠 MajorSame non-dict metadata issue in
extract_category().Line 355 uses
raw.get("metadata", {})which is safe for missing keys, but ifmetadataexists as a non-dict value (e.g.,"oops"), line 358metadata.get(...)will fail. Apply the sameisinstanceguard.🛡️ Suggested fix
def extract_category(raw: dict[str, Any]) -> MemoryCategory: metadata = raw.get("metadata", {}) - if not metadata: + if not metadata or not isinstance(metadata, dict): return MemoryCategory.WORKING🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 349 - 371, The extract_category function currently assumes metadata is a dict; guard metadata with isinstance(metadata, dict) after metadata = raw.get("metadata", {}) and treat non-dict values as missing (return MemoryCategory.WORKING), mirroring the other mapper fixes; optionally log via MEMORY_MODEL_INVALID with field="metadata" and raw_value=metadata before returning default; keep existing handling for recognized/unrecognized category using _PREFIX and MemoryCategory.
374-380:⚠️ Potential issue | 🟠 MajorSame non-dict metadata issue in
extract_publisher().Apply the
isinstance(metadata, dict)guard here as well for consistency.🛡️ Suggested fix
def extract_publisher(raw: dict[str, Any]) -> str | None: metadata = raw.get("metadata", {}) - if not metadata: + if not metadata or not isinstance(metadata, dict): return None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 374 - 380, The extract_publisher function assumes raw["metadata"] is a dict and can call .get; add the same guard used elsewhere by checking isinstance(metadata, dict) after obtaining metadata and return None if it's not a dict so you don't call .get on non-dict types; update extract_publisher to only call metadata.get(_PUBLISHER_KEY) when metadata is a dict and otherwise return None.
205-207:⚠️ Potential issue | 🟠 MajorConsider rejecting rows with missing
created_atinstead of synthesizingnow().When
parse_mem0_datetimereturnsNone, the code defaults todatetime.now(UTC)(line 207). This makes malformed rows appear brand-new, which skews recency ranking andsince/untilfilters. Consider raisingMemoryRetrievalErroror using a deterministic floor timestamp instead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 205 - 207, The code currently substitutes datetime.now(UTC) when parse_mem0_datetime(raw.get("created_at")) returns None, which makes malformed rows appear new; instead, modify the handling in the mapper (around created_at assignment) to reject such rows deterministically: if parse_mem0_datetime(...) returns None, raise MemoryRetrievalError with a clear message including the offending raw row id/metadata (or, if a deterministic fallback is preferred, set created_at to a defined constant floor timestamp like MEM0_MIN_TIMESTAMP), and ensure callers of the mapper can surface or handle MemoryRetrievalError appropriately; update any tests or error handling paths that expect the old behavior.tests/unit/memory/backends/mem0/test_adapter.py (1)
308-373: 🧹 Nitpick | 🔵 TrivialConsider parametrizing the disconnected-guard tests.
These eight tests share identical structure and could be consolidated into a single parametrized test. This would make the test suite more maintainable and easier to extend when new methods are added.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/memory/backends/mem0/test_adapter.py` around lines 308 - 373, Consolidate the eight identical disconnected-guard tests into one parametrized pytest test: create a single async test that instantiates Mem0MemoryBackend(mem0_config=mem0_config) and use pytest.mark.parametrize to pass tuples of (method_name, args_tuple) for "store", "retrieve", "get", "delete", "count", "publish", "search_shared", and "retract" (use _make_store_request() or MemoryQuery(text="test") where appropriate); inside the test use getattr(b, method_name)(*args) awaited and assert it raises MemoryConnectionError with match "Not connected". Ensure test references Mem0MemoryBackend, MemoryConnectionError, _make_store_request, and MemoryQuery so readers can locate the original methods.src/ai_company/memory/backends/mem0/adapter.py (4)
644-669:⚠️ Potential issue | 🟠 MajorSame post-filter limitation applies to
search_shared().The
search_shared()method also fetchesquery.limititems (lines 649, 655) and then applies filters plusexclude_agent(lines 666-669), which may result in fewer items than requested.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 644 - 669, search_shared() currently requests exactly query.limit items from self._client.search / get_all then applies apply_post_filters and exclude_agent, which can produce fewer results than requested; fix by over-fetching and trimming: call self._client.search / get_all with a larger limit (e.g., min(max_overfetch, query.limit * X) or loop/paginate requesting additional pages) until you have at least query.limit post-filtered entries or no more raw results, then convert via mem0_result_to_entry, apply apply_post_filters and the exclude_agent filter and finally trim to query.limit before returning; update the calls to self._client.search and self._client.get_all and the post-processing around raw_result/raw_list to implement this over-fetch/pagination strategy.
364-422:⚠️ Potential issue | 🟠 MajorAdd ownership validation in
get()before returning the entry.The
get()method fetches by memory ID without validating that the record belongs to the providedagent_id. Unlikeretrieve()which scopes byuser_id, andretract()which validates publisher ownership,get()could return entries belonging to other agents if Mem0 IDs are globally addressable. Consider validating ownership before returning.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 364 - 422, get() currently returns an entry fetched by memory_id without checking it belongs to the given agent_id; after converting raw -> entry via mem0_result_to_entry (or immediately after that call), validate the entry's owner (e.g., entry.publisher_id, entry.agent_id or the appropriate ownership field on MemoryEntry) against the provided agent_id, and if it does not match log a warning (similar to MEMORY_ENTRY_FETCH_FAILED/MEMORY_ENTRY_FETCHED pattern) and return None (or raise a MemoryRetrievalError if your codebase prefers); ensure the ownership check happens before the final debug log and before returning the entry so entries owned by other agents are not exposed.
326-336:⚠️ Potential issue | 🟠 MajorPost-filters run after an already-capped fetch.
The retrieve logic fetches only
query.limititems from Mem0 (lines 327-331), then applies local filters (line 336). If many items are filtered out, the result set may be smaller thanquery.limiteven when more eligible memories exist. Consider over-fetching or paginating until the filtered set reachesquery.limit.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 326 - 336, The current retrieve flow uses query_to_mem0_search_args / query_to_mem0_getall_args to fetch only query.limit items from self._client.search / get_all, then runs apply_post_filters (via mem0_result_to_entry) which can reduce results below query.limit; change the logic to over-fetch or paginate: call the client in a loop (using offset/next token or increasing page size in kwargs) accumulating raw_results -> mem0_result_to_entry conversions and applying post-filters incrementally until the filtered entries length >= query.limit or no more results are returned, deduplicate entries as you accumulate, and then trim to query.limit before returning. Ensure the loop uses the same kwargs shape produced by query_to_mem0_search_args / query_to_mem0_getall_args and preserves existing behavior when there are fewer overall matches.
424-484:⚠️ Potential issue | 🟠 MajorAdd ownership validation in
delete()before deleting the entry.Similar to
get(), thedelete()method fetches by memory ID without validating ownership. This could allow one agent to delete another agent's memories if IDs are globally addressable. Theretract()method (line 746) demonstrates the ownership check pattern that should be applied here.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/adapter.py` around lines 424 - 484, The delete() method fetches the entry but doesn't verify that the fetched entry belongs to the requesting agent; mirror the ownership check used in retract() and get(): after retrieving existing = await asyncio.to_thread(self._client.get, str(memory_id)) verify the entry's owner field (use the same attribute used in retract(), e.g., existing.agent_id or existing.owner_id) matches the passed agent_id, and if it does not, log the deletion as not-found/forbidden (reuse MEMORY_ENTRY_DELETED with found=False or MEMORY_ENTRY_DELETE_FAILED as appropriate) and return False without calling self._client.delete; keep existing exception handling and asyncio.to_thread usage for delete when ownership matches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/ai_company/memory/backends/mem0/adapter.py`:
- Around line 644-669: search_shared() currently requests exactly query.limit
items from self._client.search / get_all then applies apply_post_filters and
exclude_agent, which can produce fewer results than requested; fix by
over-fetching and trimming: call self._client.search / get_all with a larger
limit (e.g., min(max_overfetch, query.limit * X) or loop/paginate requesting
additional pages) until you have at least query.limit post-filtered entries or
no more raw results, then convert via mem0_result_to_entry, apply
apply_post_filters and the exclude_agent filter and finally trim to query.limit
before returning; update the calls to self._client.search and
self._client.get_all and the post-processing around raw_result/raw_list to
implement this over-fetch/pagination strategy.
- Around line 364-422: get() currently returns an entry fetched by memory_id
without checking it belongs to the given agent_id; after converting raw -> entry
via mem0_result_to_entry (or immediately after that call), validate the entry's
owner (e.g., entry.publisher_id, entry.agent_id or the appropriate ownership
field on MemoryEntry) against the provided agent_id, and if it does not match
log a warning (similar to MEMORY_ENTRY_FETCH_FAILED/MEMORY_ENTRY_FETCHED
pattern) and return None (or raise a MemoryRetrievalError if your codebase
prefers); ensure the ownership check happens before the final debug log and
before returning the entry so entries owned by other agents are not exposed.
- Around line 326-336: The current retrieve flow uses query_to_mem0_search_args
/ query_to_mem0_getall_args to fetch only query.limit items from
self._client.search / get_all, then runs apply_post_filters (via
mem0_result_to_entry) which can reduce results below query.limit; change the
logic to over-fetch or paginate: call the client in a loop (using offset/next
token or increasing page size in kwargs) accumulating raw_results ->
mem0_result_to_entry conversions and applying post-filters incrementally until
the filtered entries length >= query.limit or no more results are returned,
deduplicate entries as you accumulate, and then trim to query.limit before
returning. Ensure the loop uses the same kwargs shape produced by
query_to_mem0_search_args / query_to_mem0_getall_args and preserves existing
behavior when there are fewer overall matches.
- Around line 424-484: The delete() method fetches the entry but doesn't verify
that the fetched entry belongs to the requesting agent; mirror the ownership
check used in retract() and get(): after retrieving existing = await
asyncio.to_thread(self._client.get, str(memory_id)) verify the entry's owner
field (use the same attribute used in retract(), e.g., existing.agent_id or
existing.owner_id) matches the passed agent_id, and if it does not, log the
deletion as not-found/forbidden (reuse MEMORY_ENTRY_DELETED with found=False or
MEMORY_ENTRY_DELETE_FAILED as appropriate) and return False without calling
self._client.delete; keep existing exception handling and asyncio.to_thread
usage for delete when ownership matches.
In `@src/ai_company/memory/backends/mem0/mappers.py`:
- Around line 103-164: The function parse_mem0_metadata currently only checks
truthiness of raw_metadata and will raise when raw_metadata is a non-dict (e.g.,
list or string); update the initial guard to treat any non-dict as absent by
checking isinstance(raw_metadata, dict) and returning the default
(MemoryCategory.WORKING, MemoryMetadata(), None) when it's not a dict or is
falsy, so all subsequent calls like raw_metadata.get(...) are safe.
- Around line 337-346: The code assumes results_list[0] is a dict and calls
first.get("id") which can raise AttributeError for non-dict items; update the
block in the mapper where results_list and first are used to first verify type:
check isinstance(first, dict) (or mapping) before accessing .get, and if not a
dict log the same MEMORY_ENTRY_STORE_FAILED via logger with context and an
explanatory msg and raise MemoryStoreError; keep the existing behavior of
returning NotBlankStr(str(raw_id)) only after successful type and non-blank id
validation so callers of the function (and symbols results_list, first, logger,
MEMORY_ENTRY_STORE_FAILED, MemoryStoreError, NotBlankStr) are protected from
type errors.
- Around line 349-371: The extract_category function currently assumes metadata
is a dict; guard metadata with isinstance(metadata, dict) after metadata =
raw.get("metadata", {}) and treat non-dict values as missing (return
MemoryCategory.WORKING), mirroring the other mapper fixes; optionally log via
MEMORY_MODEL_INVALID with field="metadata" and raw_value=metadata before
returning default; keep existing handling for recognized/unrecognized category
using _PREFIX and MemoryCategory.
- Around line 374-380: The extract_publisher function assumes raw["metadata"] is
a dict and can call .get; add the same guard used elsewhere by checking
isinstance(metadata, dict) after obtaining metadata and return None if it's not
a dict so you don't call .get on non-dict types; update extract_publisher to
only call metadata.get(_PUBLISHER_KEY) when metadata is a dict and otherwise
return None.
- Around line 205-207: The code currently substitutes datetime.now(UTC) when
parse_mem0_datetime(raw.get("created_at")) returns None, which makes malformed
rows appear new; instead, modify the handling in the mapper (around created_at
assignment) to reject such rows deterministically: if parse_mem0_datetime(...)
returns None, raise MemoryRetrievalError with a clear message including the
offending raw row id/metadata (or, if a deterministic fallback is preferred, set
created_at to a defined constant floor timestamp like MEM0_MIN_TIMESTAMP), and
ensure callers of the mapper can surface or handle MemoryRetrievalError
appropriately; update any tests or error handling paths that expect the old
behavior.
In `@tests/unit/memory/backends/mem0/test_adapter.py`:
- Around line 308-373: Consolidate the eight identical disconnected-guard tests
into one parametrized pytest test: create a single async test that instantiates
Mem0MemoryBackend(mem0_config=mem0_config) and use pytest.mark.parametrize to
pass tuples of (method_name, args_tuple) for "store", "retrieve", "get",
"delete", "count", "publish", "search_shared", and "retract" (use
_make_store_request() or MemoryQuery(text="test") where appropriate); inside the
test use getattr(b, method_name)(*args) awaited and assert it raises
MemoryConnectionError with match "Not connected". Ensure test references
Mem0MemoryBackend, MemoryConnectionError, _make_store_request, and MemoryQuery
so readers can locate the original methods.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 30dde13f-7c5f-4805-8160-22b084477d2d
📒 Files selected for processing (7)
.github/workflows/dependency-review.ymlsrc/ai_company/memory/backends/mem0/adapter.pysrc/ai_company/memory/backends/mem0/mappers.pysrc/ai_company/memory/factory.pysrc/ai_company/observability/events/memory.pytests/unit/memory/backends/mem0/test_adapter.pytests/unit/memory/backends/mem0/test_mappers.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Agent
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Nofrom __future__ import annotations— Python 3.14 has PEP 649 native lazy annotations
Useexcept A, B:syntax (no parentheses) — ruff enforces PEP 758 except syntax on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries,BaseTool), usecopy.deepcopy()at construction +MappingProxyTypewrapping for read-only enforcement.
Fordict/listfields in frozen Pydantic models, rely onfrozen=Truefor field reassignment prevention andcopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (usingmodel_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: Pydantic v2 (BaseModel,model_validator,computed_field,ConfigDict). Use@computed_fieldfor derived values instead of storing + validating redundant fields. UseNotBlankStrfor all identifier/name fields — including optional and tuple variants — instead of manual whitespace validators.
Async concurrency: preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over barecreate_task. Existing code is being migrated incrementally.
Line length: 88 characters (ruff)
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)
Files:
src/ai_company/observability/events/memory.pysrc/ai_company/memory/factory.pysrc/ai_company/memory/backends/mem0/adapter.pytests/unit/memory/backends/mem0/test_adapter.pytests/unit/memory/backends/mem0/test_mappers.pysrc/ai_company/memory/backends/mem0/mappers.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic MUST have:from ai_company.observability import get_loggerthenlogger = get_logger(__name__)
Never useimport logging/logging.getLogger()/print()in application code
Variable name: alwayslogger(not_logger, notlog)
Event names: always use constants from the domain-specific module underai_company.observability.events(e.g.PROVIDER_CALL_STARTfromevents.provider). Import directly:from ai_company.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs in logging: alwayslogger.info(EVENT, key=value)— neverlogger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG for object creation, internal flow, entry/exit of key functions
Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names:example-provider,example-large-001,example-medium-001,example-small-001,large/medium/smallas aliases. Tests must usetest-provider,test-small-001, etc.
Files:
src/ai_company/observability/events/memory.pysrc/ai_company/memory/factory.pysrc/ai_company/memory/backends/mem0/adapter.pysrc/ai_company/memory/backends/mem0/mappers.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Tests: markers@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e,@pytest.mark.slow
Coverage: 80% minimum (enforced in CI)
Async:asyncio_mode = "auto"— no manual@pytest.mark.asyncioneeded
Timeout: 30 seconds per test
Parallelism:pytest-xdistvia-n auto— ALWAYS include-n autowhen running pytest, never run tests sequentially
Prefer@pytest.mark.parametrizefor testing similar cases
Vendor-agnostic everywhere: Tests must usetest-provider,test-small-001, etc. (Vendor names may only appear in operations design page, .claude/ skill/agent files, or third-party import paths)
Files:
tests/unit/memory/backends/mem0/test_adapter.pytests/unit/memory/backends/mem0/test_mappers.py
🧠 Learnings (4)
📚 Learning: 2026-03-13T06:12:19.113Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T06:12:19.113Z
Learning: Applies to src/ai_company/**/*.py : Event names: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/ai_company/observability/events/memory.py
📚 Learning: 2026-03-13T06:12:19.113Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T06:12:19.113Z
Learning: Applies to src/ai_company/**/*.py : All error paths must log at WARNING or ERROR with context before raising
Applied to files:
src/ai_company/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-13T06:12:19.113Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T06:12:19.113Z
Learning: Applies to src/ai_company/providers/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry. `RetryExhaustedError` signals that all retries failed — the engine layer catches this to trigger fallback chains.
Applied to files:
src/ai_company/memory/backends/mem0/adapter.py
📚 Learning: 2026-03-13T06:12:19.113Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T06:12:19.113Z
Learning: Applies to tests/**/*.py : Prefer `pytest.mark.parametrize` for testing similar cases
Applied to files:
tests/unit/memory/backends/mem0/test_adapter.py
🧬 Code graph analysis (2)
src/ai_company/memory/backends/mem0/adapter.py (3)
src/ai_company/core/enums.py (1)
MemoryCategory(101-108)src/ai_company/memory/errors.py (4)
MemoryConnectionError(17-18)MemoryRetrievalError(25-26)MemoryStoreError(21-22)MemoryError(13-14)tests/integration/memory/test_mem0_backend.py (1)
backend(42-51)
tests/unit/memory/backends/mem0/test_mappers.py (4)
src/ai_company/core/enums.py (1)
MemoryCategory(101-108)src/ai_company/memory/backends/mem0/mappers.py (11)
apply_post_filters(280-312)build_mem0_metadata(38-57)extract_category(349-371)extract_publisher(374-380)mem0_result_to_entry(167-226)normalize_relevance_score(89-100)parse_mem0_datetime(60-86)parse_mem0_metadata(103-164)query_to_mem0_getall_args(261-277)query_to_mem0_search_args(229-258)validate_add_result(318-346)src/ai_company/memory/errors.py (2)
MemoryRetrievalError(25-26)MemoryStoreError(21-22)src/ai_company/memory/models.py (4)
MemoryEntry(82-150)MemoryMetadata(20-52)MemoryQuery(153-230)MemoryStoreRequest(55-79)
🔇 Additional comments (10)
src/ai_company/observability/events/memory.py (1)
22-22: LGTM!The new event constant follows the established
memory.<entity>.<action>naming convention and is properly placed in the Backend lifecycle section. This addresses the past review feedback about using a more specific event for embedder validation failures..github/workflows/dependency-review.yml (1)
32-51: LGTM!License allow-list updates are well-documented with manual verification notes. The added packages (
mem0ai,numpy,qdrant-client,posthog) and licenses (LicenseRef-scancode-protobuf,ZPL-2.1) are all permissive and properly justified in comments.src/ai_company/memory/factory.py (1)
48-93: LGTM!The Mem0 backend creation logic properly validates the embedder configuration with clear error messages and structured logging. The switch to
MEMORY_BACKEND_CONFIG_INVALIDevent (addressing the past review feedback) provides better semantic clarity for configuration errors versus unknown backends.tests/unit/memory/backends/mem0/test_mappers.py (1)
1-491: LGTM!Comprehensive test suite covering all mapper functions with good edge case coverage. Tests follow vendor-agnostic naming conventions (
test-agent-001,test-provider) and include proper markers. The test structure clearly maps to each public mapper function.tests/unit/memory/backends/mem0/test_adapter.py (1)
1-1117: Comprehensive test coverage for the Mem0 adapter.The test suite thoroughly exercises the adapter's lifecycle, CRUD operations, shared knowledge flows, and error handling paths. Good coverage of edge cases including
MemoryErrorre-raising and validation failures.src/ai_company/memory/backends/mem0/adapter.py (2)
497-501: Documentation of count limitation is helpful.The docstring now explicitly documents that results are capped at
max_memories_per_agentand may underestimate. This addresses the past review concern by making the limitation transparent to callers.
699-789:retract()correctly validates publisher ownership.Good implementation of the ownership check pattern (lines 732-758) that should serve as the model for
get()anddelete().src/ai_company/memory/backends/mem0/mappers.py (3)
38-57:build_mem0_metadatais well-structured.Clean serialization logic with proper prefixing and conditional field inclusion.
60-86:parse_mem0_datetimehandles edge cases properly.Good defensive handling of None/empty input, malformed strings, and naive datetime conversion to UTC with appropriate logging.
280-312:apply_post_filterslogic is correct.The filtering logic properly handles all query parameters while preserving order.
There was a problem hiding this comment.
Pull request overview
Implements and hardens a concrete Mem0-backed MemoryBackend implementation, making Mem0 configuration vendor-agnostic (explicit provider/model) while adding robust adapter behavior and extensive test coverage. This fits into the memory subsystem by providing the first real backend wired through the factory and validated via unit + integration tests.
Changes:
- Add Mem0 backend implementation (
adapter.py,mappers.py,config.py) plus factory wiring and exports. - Harden parsing/validation paths (IDs/content/categories/datetimes), add health probes, and improve structured logging/events.
- Add Mem0 dependency + license-review exceptions, update docs/README, and add comprehensive unit/integration tests.
Reviewed changes
Copilot reviewed 20 out of 23 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Locks new transitive dependencies introduced by mem0ai and related packages. |
| pyproject.toml | Adds mem0ai==1.0.5 and a mypy ignore override for mem0.*. |
| .github/workflows/dependency-review.yml | Updates dependency-review allowlists for new license metadata cases and missing SPDX metadata. |
| docs/roadmap/index.md | Removes “Mem0 adapter” from remaining roadmap items (now implemented). |
| docs/design/memory.md | Updates design doc to reflect Mem0 as implemented. |
| README.md | Updates status section to include Mem0 backend adapter as implemented. |
| CLAUDE.md | Updates repo structure notes and mentions mem0ai dependency. |
| src/ai_company/observability/events/memory.py | Adds MEMORY_BACKEND_CONFIG_INVALID event constant for factory validation logging. |
| src/ai_company/memory/factory.py | Implements backend creation for mem0, requires embedder config, adds structured logging. |
| src/ai_company/memory/backends/mem0/mappers.py | Adds Mem0 ↔ domain mapping helpers, defensive parsing, post-filters, and add-result validation. |
| src/ai_company/memory/backends/mem0/config.py | Introduces Mem0-specific config models and dict builder; adds traversal validation; embedder config now required. |
| src/ai_company/memory/backends/mem0/adapter.py | Adds Mem0 adapter implementing CRUD + shared knowledge operations with health probes and defensive error handling. |
| src/ai_company/memory/backends/mem0/init.py | Re-exports Mem0 backend/config types. |
| src/ai_company/memory/backends/init.py | Exposes concrete backends package exports. |
| src/ai_company/memory/init.py | Re-exports Mem0MemoryBackend and Mem0EmbedderConfig from ai_company.memory. |
| tests/unit/memory/test_init.py | Updates export expectations to include Mem0 symbols. |
| tests/unit/memory/test_factory.py | Updates factory tests for real Mem0 backend creation + embedder requirement validation. |
| tests/unit/memory/backends/mem0/test_mappers.py | Adds unit coverage for mapper functions and edge cases. |
| tests/unit/memory/backends/mem0/test_config.py | Adds unit coverage for Mem0 config validation and config dict building. |
| tests/unit/memory/backends/mem0/test_adapter.py | Adds unit coverage for adapter lifecycle/CRUD/shared operations and error paths. |
| tests/integration/memory/test_mem0_backend.py | Adds integration tests validating adapter behavior through retrieval pipeline using a mocked client. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if not raw_metadata: | ||
| return ( | ||
| MemoryCategory.WORKING, | ||
| MemoryMetadata(), | ||
| None, | ||
| ) | ||
|
|
||
| category_str = raw_metadata.get(f"{_PREFIX}category") | ||
| if category_str: | ||
| try: |
| metadata = raw.get("metadata", {}) | ||
| if not metadata: | ||
| return MemoryCategory.WORKING | ||
| cat_str = metadata.get(f"{_PREFIX}category") | ||
| if cat_str: | ||
| try: | ||
| return MemoryCategory(cat_str) |
| def extract_publisher(raw: dict[str, Any]) -> str | None: | ||
| """Extract the publisher agent ID from a shared memory dict.""" | ||
| metadata = raw.get("metadata", {}) | ||
| if not metadata: | ||
| return None | ||
| value: str | None = metadata.get(_PUBLISHER_KEY) | ||
| return value |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/ai_company/memory/backends/mem0/mappers.py (1)
343-352:⚠️ Potential issue | 🟡 MinorAdd explicit type check for
firstbefore accessing.get().If
results_list[0]is not a dict (e.g., a string or number),first.get("id")raisesAttributeErrorinstead of the expectedMemoryStoreError. This was flagged in a prior review.🛡️ Proposed fix
first = results_list[0] + if not isinstance(first, dict): + msg = f"Mem0 add result is not an object for {context}: {type(first).__name__}" + logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) + raise MemoryStoreError(msg) raw_id = first.get("id")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/memory/backends/mem0/mappers.py` around lines 343 - 352, The code assumes results_list[0] is a dict and calls first.get("id"), which raises AttributeError for non-dict items; update the extraction logic in the block that handles results_list (referencing results_list, first, and the return of NotBlankStr) to first check the type of first (e.g., isinstance(first, dict)) before calling .get(), and if it is not a dict construct the same warning and raise MemoryStoreError (using logger with MEMORY_ENTRY_STORE_FAILED, context, error=...) with a clear message about unexpected result type and keys/contents; keep the existing behavior for missing/blank id and return NotBlankStr(str(raw_id)) when valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/unit/memory/backends/mem0/test_mappers.py`:
- Around line 394-439: Add a unit test in the TestValidateAddResult class to
cover the case where results[0] is not a dict: create a method named
test_non_dict_first_result_raises that calls validate_add_result({"results":
["not-a-dict"]}, context="test") and asserts it raises MemoryStoreError with a
match like "not an object"; this will expose the AttributeError and ensure
validate_add_result properly validates the first result element.
---
Duplicate comments:
In `@src/ai_company/memory/backends/mem0/mappers.py`:
- Around line 343-352: The code assumes results_list[0] is a dict and calls
first.get("id"), which raises AttributeError for non-dict items; update the
extraction logic in the block that handles results_list (referencing
results_list, first, and the return of NotBlankStr) to first check the type of
first (e.g., isinstance(first, dict)) before calling .get(), and if it is not a
dict construct the same warning and raise MemoryStoreError (using logger with
MEMORY_ENTRY_STORE_FAILED, context, error=...) with a clear message about
unexpected result type and keys/contents; keep the existing behavior for
missing/blank id and return NotBlankStr(str(raw_id)) when valid.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 144746d0-6fd8-48b0-85ec-3d6e4c606065
📒 Files selected for processing (3)
src/ai_company/memory/backends/mem0/mappers.pytests/integration/memory/test_mem0_backend.pytests/unit/memory/backends/mem0/test_mappers.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Nofrom __future__ import annotations— Python 3.14 has PEP 649 native lazy annotations
Useexcept A, B:syntax (no parentheses) — ruff enforces PEP 758 except syntax on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries,BaseTool), usecopy.deepcopy()at construction +MappingProxyTypewrapping for read-only enforcement.
Fordict/listfields in frozen Pydantic models, rely onfrozen=Truefor field reassignment prevention andcopy.deepcopy()at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (usingmodel_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Models: Pydantic v2 (BaseModel,model_validator,computed_field,ConfigDict). Use@computed_fieldfor derived values instead of storing + validating redundant fields. UseNotBlankStrfor all identifier/name fields — including optional and tuple variants — instead of manual whitespace validators.
Async concurrency: preferasyncio.TaskGroupfor fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over barecreate_task. Existing code is being migrated incrementally.
Line length: 88 characters (ruff)
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)
Files:
tests/unit/memory/backends/mem0/test_mappers.pysrc/ai_company/memory/backends/mem0/mappers.pytests/integration/memory/test_mem0_backend.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Tests: markers@pytest.mark.unit,@pytest.mark.integration,@pytest.mark.e2e,@pytest.mark.slow
Coverage: 80% minimum (enforced in CI)
Async:asyncio_mode = "auto"— no manual@pytest.mark.asyncioneeded
Timeout: 30 seconds per test
Parallelism:pytest-xdistvia-n auto— ALWAYS include-n autowhen running pytest, never run tests sequentially
Prefer@pytest.mark.parametrizefor testing similar cases
Vendor-agnostic everywhere: Tests must usetest-provider,test-small-001, etc. (Vendor names may only appear in operations design page, .claude/ skill/agent files, or third-party import paths)
Files:
tests/unit/memory/backends/mem0/test_mappers.pytests/integration/memory/test_mem0_backend.py
src/ai_company/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/ai_company/**/*.py: Every module with business logic MUST have:from ai_company.observability import get_loggerthenlogger = get_logger(__name__)
Never useimport logging/logging.getLogger()/print()in application code
Variable name: alwayslogger(not_logger, notlog)
Event names: always use constants from the domain-specific module underai_company.observability.events(e.g.PROVIDER_CALL_STARTfromevents.provider). Import directly:from ai_company.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs in logging: alwayslogger.info(EVENT, key=value)— neverlogger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG for object creation, internal flow, entry/exit of key functions
Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names:example-provider,example-large-001,example-medium-001,example-small-001,large/medium/smallas aliases. Tests must usetest-provider,test-small-001, etc.
Files:
src/ai_company/memory/backends/mem0/mappers.py
🧬 Code graph analysis (3)
tests/unit/memory/backends/mem0/test_mappers.py (1)
src/ai_company/memory/backends/mem0/mappers.py (11)
apply_post_filters(286-318)build_mem0_metadata(38-57)extract_category(355-377)extract_publisher(380-386)mem0_result_to_entry(167-232)normalize_relevance_score(89-100)parse_mem0_datetime(60-86)parse_mem0_metadata(103-164)query_to_mem0_getall_args(267-283)query_to_mem0_search_args(235-264)validate_add_result(324-352)
src/ai_company/memory/backends/mem0/mappers.py (5)
src/ai_company/core/enums.py (1)
MemoryCategory(101-108)src/ai_company/memory/errors.py (2)
MemoryRetrievalError(25-26)MemoryStoreError(21-22)src/ai_company/memory/models.py (4)
MemoryEntry(82-150)MemoryMetadata(20-52)MemoryQuery(153-230)MemoryStoreRequest(55-79)src/ai_company/observability/_logger.py (1)
get_logger(8-28)src/ai_company/memory/backends/mem0/adapter.py (1)
get(364-422)
tests/integration/memory/test_mem0_backend.py (5)
src/ai_company/memory/backends/mem0/adapter.py (9)
Mem0MemoryBackend(77-789)store(245-301)retrieve(303-362)publish(559-621)search_shared(623-697)get(364-422)delete(424-484)retract(699-789)count(486-555)src/ai_company/memory/backends/mem0/config.py (2)
Mem0BackendConfig(46-78)Mem0EmbedderConfig(17-43)src/ai_company/memory/models.py (2)
MemoryQuery(153-230)MemoryStoreRequest(55-79)src/ai_company/memory/retrieval_config.py (1)
MemoryRetrievalConfig(20-132)src/ai_company/memory/retriever.py (1)
ContextInjectionStrategy(97-400)
🔇 Additional comments (28)
src/ai_company/memory/backends/mem0/mappers.py (11)
1-29: LGTM!Module setup follows guidelines: proper docstring, no
__future__imports, correct logger initialization withget_logger(__name__), and appropriate imports.
38-57: LGTM!Clean metadata serialization with proper type hints, docstring, and immutable construction pattern.
60-86: LGTM!Robust datetime parsing with proper exception handling using PEP 758 syntax (
except ValueError, TypeError:), appropriate fallback toNoneon malformed input, and UTC assignment for naive datetimes.
89-100: LGTM!Simple, correct clamping implementation that preserves
Nonevalues.
103-164: LGTM!Robust metadata deserialization that properly handles non-dict metadata types (Line 114), invalid categories, non-numeric confidence values, and various tag formats. The past review concern about non-object metadata is addressed.
205-213: Defaultingcreated_attonow()may skew recency ranking.When
created_atis missing or unparseable, usingdatetime.now(UTC)makes the entry appear brand-new. This can affectsince/untilfilters and recency-weighted ranking, potentially surfacing malformed records at the top of results.Consider either rejecting entries with missing timestamps (raise
MemoryRetrievalError) or using a deterministic floor timestamp (e.g.,datetime(1970, 1, 1, tzinfo=UTC)) to make these entries rank lower rather than higher.
235-264: LGTM!Proper validation that
query.textis set before building search arguments, with appropriate logging and error raising.
267-283: LGTM!Simple and correct implementation for metadata-only queries.
286-318: LGTM!Post-filtering logic is correct: half-open time interval
[since, until), AND semantics for tags, and proper handling ofNonerelevance scores (they bypass the min_relevance check).
355-377: LGTM!Properly handles non-dict metadata types (Line 362-363), returning
WORKINGas the safe default.
380-386: LGTM!Consistent defensive handling of non-dict metadata, returning
Nonewhen publisher cannot be extracted.tests/unit/memory/backends/mem0/test_mappers.py (10)
1-32: LGTM!Proper test module setup with correct imports, timeout marker, and unit test marker usage.
34-54: LGTM!Clean test helper with sensible defaults for constructing
MemoryEntryinstances.
57-91: LGTM!Good coverage of
build_mem0_metadatawith basic and full metadata scenarios.
93-121: LGTM!Comprehensive datetime parsing tests covering None, empty string, aware/naive timezones, and malformed inputs.
124-141: LGTM!Good boundary testing for relevance score normalization.
143-186: LGTM!Thorough metadata parsing tests including edge cases for invalid categories and empty/whitespace tags.
189-269: LGTM!Excellent coverage of
mem0_result_to_entryincluding validation of missing/blank IDs, empty content, and thedatakey fallback path.
272-295: LGTM!Query argument translation tests properly validate both happy path and error case.
298-391: LGTM!Comprehensive post-filter tests covering category, tags, time ranges (including exclusive
until), min relevance, and combined filters.
442-507: LGTM!Thorough tests for
extract_categoryandextract_publisherincluding non-dict metadata edge cases (list, string) which validates the defensive fixes in the production code.tests/integration/memory/test_mem0_backend.py (7)
1-32: LGTM!Proper integration test setup with vendor-agnostic embedder configuration helper.
35-51: LGTM!Appropriate test fixture setup that injects a mock client and bypasses the connection flow. The
noqa: S108for the temp path is acceptable in test context.
54-106: LGTM!Good integration test validating the store → retrieve flow with proper mock setup and assertions on memory ID, content, and relevance score.
107-161: LGTM!Full pipeline integration test properly validates that both memories survive ranking and appear in the formatted output. The past review concern about asserting the second memory is now addressed (Line 161).
163-221: LGTM!Comprehensive shared knowledge flow test covering publish → search_shared → retract with proper publisher metadata verification.
223-253: LGTM!Good test for the
exclude_agentfilter insearch_shared, verifying that the requesting agent's memories are excluded from results.
255-288: LGTM!Proper test for count functionality including category filtering.
| @pytest.mark.unit | ||
| class TestValidateAddResult: | ||
| def test_valid_result(self) -> None: | ||
| result = {"results": [{"id": "mem-001", "event": "ADD"}]} | ||
| memory_id = validate_add_result(result, context="test") | ||
| assert memory_id == "mem-001" | ||
|
|
||
| def test_empty_results_raises(self) -> None: | ||
| result: dict[str, Any] = {"results": []} | ||
| with pytest.raises(MemoryStoreError, match="no results"): | ||
| validate_add_result(result, context="test") | ||
|
|
||
| def test_missing_results_key_raises(self) -> None: | ||
| result = {"data": "something"} | ||
| with pytest.raises(MemoryStoreError, match="no results"): | ||
| validate_add_result(result, context="test") | ||
|
|
||
| def test_non_list_results_raises(self) -> None: | ||
| result = {"results": "not-a-list"} | ||
| with pytest.raises(MemoryStoreError, match="no results"): | ||
| validate_add_result(result, context="test") | ||
|
|
||
| def test_missing_id_raises(self) -> None: | ||
| result = {"results": [{"memory": "no id"}]} | ||
| with pytest.raises(MemoryStoreError, match="missing or blank 'id'"): | ||
| validate_add_result(result, context="test") | ||
|
|
||
| def test_none_id_raises(self) -> None: | ||
| result = {"results": [{"id": None, "event": "ADD"}]} | ||
| with pytest.raises(MemoryStoreError, match="missing or blank 'id'"): | ||
| validate_add_result(result, context="test") | ||
|
|
||
| def test_blank_id_raises(self) -> None: | ||
| result = {"results": [{"id": "", "event": "ADD"}]} | ||
| with pytest.raises(MemoryStoreError, match="missing or blank 'id'"): | ||
| validate_add_result(result, context="test") | ||
|
|
||
| def test_whitespace_only_id_raises(self) -> None: | ||
| result = {"results": [{"id": " ", "event": "ADD"}]} | ||
| with pytest.raises(MemoryStoreError, match="missing or blank 'id'"): | ||
| validate_add_result(result, context="test") | ||
|
|
||
| def test_numeric_id_coerced_to_string(self) -> None: | ||
| result = {"results": [{"id": 42, "event": "ADD"}]} | ||
| memory_id = validate_add_result(result, context="test") | ||
| assert memory_id == "42" |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Add test for non-dict first result element.
The test suite validates empty, non-list, and missing results, but doesn't cover the case where results[0] is not a dict (e.g., {"results": ["just-a-string"]}). This would expose the AttributeError bug in the production code.
🧪 Suggested test case
def test_non_dict_first_result_raises(self) -> None:
result = {"results": ["not-a-dict"]}
with pytest.raises(MemoryStoreError, match="not an object"):
validate_add_result(result, context="test")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/memory/backends/mem0/test_mappers.py` around lines 394 - 439, Add
a unit test in the TestValidateAddResult class to cover the case where
results[0] is not a dict: create a method named
test_non_dict_first_result_raises that calls validate_add_result({"results":
["not-a-dict"]}, context="test") and asserts it raises MemoryStoreError with a
match like "not an object"; this will expose the AttributeError and ensure
validate_add_result properly validates the first result element.
Add concrete MemoryBackend implementation using Mem0 (embedded Qdrant + SQLite) as the storage layer, unblocking all downstream memory features. - Mem0MemoryBackend implements MemoryBackend, MemoryCapabilities, and SharedKnowledgeStore protocols with asyncio.to_thread for all sync Mem0 calls - Mapping layer (mappers.py) for bidirectional domain model <-> Mem0 dict conversion with _synthorg_ metadata prefix - Mem0BackendConfig + config builder deriving from CompanyMemoryConfig - Factory wired up with deferred import (no longer raises MemoryConfigError for mem0 backend) - Shared knowledge via reserved __synthorg_shared__ namespace with publisher ownership tracking - 95 unit tests (adapter, mappers, config) + 6 integration tests (retrieval pipeline, shared knowledge flow) Closes #206
Pre-reviewed by 9 agents, 33 findings addressed: - Remove vendor-specific defaults from Mem0EmbedderConfig (now required) - Fix metadata dict mutation in publish() (use spread instead) - Add path traversal validation on Mem0BackendConfig.data_dir - Extract _validate_add_result helper for store/publish result validation - Add explicit ImportError handling in connect() - Make health_check() probe backend with lightweight get_all call - Add defensive category parsing in _extract_category (handle invalid enums) - Add missing publisher check in retract() (not a shared memory entry) - Replace C-style loop in search_shared with generator expression - Use meaningful variable names (raw_entries/filtered vs triple result) - Change logger.exception to logger.warning for self-raised errors - Add structured error/error_type kwargs to all re-raise logs - Add MemoryConnectionError to Raises docstrings for shared methods - Update factory to require embedder config (no vendor defaults) - Remove dead store_request_to_mem0_args function and its tests - Add tests: path traversal, missing id, empty content, malformed datetime, invalid category, no-publisher retract, delete-after-get failure, protocol conformance, health_check probe failure
…iewers - Add MemoryError/RecursionError guards before all except Exception blocks - Capture exceptions in re-raise blocks for structured logging - Move helper functions (validate_add_result, extract_category, extract_publisher) from adapter.py to mappers.py to keep adapter under 800-line limit - Add type/tag/confidence validation in mappers (items 9, 11, 23) - Add logging before all raise statements (items 7, 20, 21) - Fix health_check exception log level from debug to warning - Add search_shared query context to error logs - Strengthen factory.py type annotation (embedder: Mem0EmbedderConfig | None) - Parametrize config traversal tests, add ImportError/MemoryError/publish tests - Update CLAUDE.md, README, roadmap, and design spec for Mem0 adapter status
- Fix dependency review CI: add LicenseRef-scancode-protobuf, ZPL-2.1 to license allow-list and allow-dependencies-licenses for packages with null SPDX metadata (mem0ai, numpy, qdrant-client, posthog) - Fix validate_add_result blank-ID gap: check for None and whitespace IDs, not just missing key (Greptile finding) - Fix factory.py wrong event constant: use MEMORY_BACKEND_CONFIG_INVALID instead of MEMORY_BACKEND_UNKNOWN for embedder config errors (Greptile) - Fix retract() bare except: capture as exc and log error detail - Document count() limitation: capped at max_memories_per_agent - Add 37 new tests: validate_add_result (blank/None/whitespace/numeric ID, non-list results), extract_category, extract_publisher, MemoryError re-raise for all operations, blank-ID through store/retrieve, shared namespace fallback, count empty results, retract delete failure
f7d017c to
28eba7a
Compare
There was a problem hiding this comment.
Pull request overview
This PR finalizes and hardens a concrete Mem0-backed implementation of the MemoryBackend/SharedKnowledgeStore protocols, and updates configuration/factory wiring so Mem0 can be instantiated with explicitly-specified embedder settings (vendor-agnostic defaults removed).
Changes:
- Introduces Mem0 backend implementation (
adapter.py,mappers.py,config.py) with defensive parsing, health probes, shared-knowledge publish/search/retract, and structured logging. - Updates the memory factory + public exports to support creating a Mem0 backend with a required
embedderconfig. - Adds unit + integration tests covering edge cases and protocol conformance; updates docs/status text and adds
mem0aidependency.
Reviewed changes
Copilot reviewed 20 out of 23 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Adds mem0ai and transitive deps to the lockfile. |
| pyproject.toml | Adds mem0ai dependency; adds mypy ignore for mem0.*. |
| src/ai_company/memory/factory.py | Creates Mem0 backend when configured; requires embedder param. |
| src/ai_company/memory/backends/mem0/adapter.py | Implements Mem0-backed MemoryBackend + shared knowledge operations. |
| src/ai_company/memory/backends/mem0/mappers.py | Adds pure mapping/validation helpers for Mem0 dicts ↔ domain models. |
| src/ai_company/memory/backends/mem0/config.py | Adds Mem0 backend config models and config dict builder. |
| src/ai_company/memory/backends/mem0/init.py | Exports Mem0 backend/config symbols. |
| src/ai_company/memory/backends/init.py | Exposes concrete backend(s). |
| src/ai_company/memory/init.py | Re-exports Mem0 backend/config at package level. |
| src/ai_company/observability/events/memory.py | Adds config-invalid event constant. |
| tests/unit/memory/test_init.py | Updates export expectations. |
| tests/unit/memory/test_factory.py | Updates factory tests for embedder requirement and backend creation. |
| tests/unit/memory/backends/mem0/test_config.py | New tests for Mem0 config validation/building. |
| tests/unit/memory/backends/mem0/test_mappers.py | New tests for mapping/parsing/validation helpers. |
| tests/unit/memory/backends/mem0/test_adapter.py | New comprehensive adapter tests (CRUD + shared knowledge + edge cases). |
| tests/integration/memory/test_mem0_backend.py | New integration-style tests plugging adapter into retrieval pipeline. |
| docs/roadmap/index.md | Removes “Mem0 adapter” from remaining roadmap items. |
| docs/design/memory.md | Updates diagram text to mark Mem0 implemented. |
| README.md | Updates status statement to include Mem0 backend adapter. |
| CLAUDE.md | Updates memory subsystem description; notes mem0ai as optional. |
| .github/workflows/dependency-review.yml | Allows additional licenses/metadata for new deps. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ) | ||
| logger.warning(MEMORY_ENTRY_STORE_FAILED, context=context, error=msg) | ||
| raise MemoryStoreError(msg) | ||
| return NotBlankStr(str(raw_id)) |
| try: | ||
| config_dict = build_mem0_config_dict(self._mem0_config) | ||
| client = await asyncio.to_thread(Memory.from_config, config_dict) | ||
| except MemoryError, RecursionError: |
| user_id=_SHARED_NAMESPACE, | ||
| limit=1, | ||
| ) | ||
| except MemoryError, RecursionError: |
| @property | ||
| def backend_name(self) -> NotBlankStr: | ||
| """Human-readable backend identifier.""" | ||
| return NotBlankStr("mem0") |
| dependencies = [ | ||
| "aiodocker==0.26.0", | ||
| "aiosqlite==0.22.1", | ||
| "argon2-cffi==25.1.0", | ||
| "jinja2==3.1.6", | ||
| "jsonschema==4.26.0", | ||
| "litellm==1.82.1", | ||
| "litestar[standard,structlog,pydantic,brotli,prometheus]==2.21.1", | ||
| "mcp==1.26.0", | ||
| "mem0ai==1.0.5", | ||
| "pydantic==2.12.5", | ||
| "pyjwt[crypto]==2.11.0", |
| return { | ||
| "vector_store": { | ||
| "provider": "qdrant", | ||
| "config": { | ||
| "collection_name": config.collection_name, | ||
| "embedding_model_dims": config.embedder.dims, | ||
| "path": f"{config.data_dir}/qdrant", | ||
| }, | ||
| }, | ||
| "embedder": { | ||
| "provider": config.embedder.provider, | ||
| "config": { | ||
| "model": config.embedder.model, | ||
| }, | ||
| }, | ||
| "history_db_path": f"{config.data_dir}/history.db", | ||
| # Mem0 config schema version — required by Memory.from_config(). | ||
| "version": "v1.1", |
| raw_tags = [raw_tags] | ||
| elif not isinstance(raw_tags, (list, tuple)): | ||
| raw_tags = () | ||
| tags = tuple(NotBlankStr(t) for t in raw_tags if t and str(t).strip()) |
| if not raw: | ||
| return None | ||
| try: | ||
| dt = datetime.fromisoformat(raw) | ||
| except ValueError, TypeError: | ||
| logger.warning( | ||
| MEMORY_MODEL_INVALID, | ||
| field="datetime", | ||
| raw_value=raw, | ||
| reason="malformed ISO 8601 datetime, returning None", | ||
| ) | ||
| return None |
| raw_id = raw.get("id") | ||
| if raw_id is None or not str(raw_id).strip(): | ||
| msg = f"Mem0 result has missing or blank 'id': keys={list(raw.keys())}" | ||
| logger.warning( | ||
| MEMORY_MODEL_INVALID, | ||
| field="id", | ||
| raw_value=raw_id, | ||
| reason=msg, | ||
| ) | ||
| raise MemoryRetrievalError(msg) | ||
| memory_id = NotBlankStr(str(raw_id)) | ||
|
|
||
| raw_content = raw.get("memory") or raw.get("data") | ||
| if not raw_content or not str(raw_content).strip(): | ||
| msg = f"Mem0 result {raw.get('id', '?')} has empty content" | ||
| logger.warning( | ||
| MEMORY_MODEL_INVALID, | ||
| field="content", | ||
| raw_value=raw_content, | ||
| reason=msg, | ||
| ) | ||
| raise MemoryRetrievalError(msg) | ||
| content = NotBlankStr(str(raw_content)) | ||
|
|
||
| created_at = parse_mem0_datetime(raw.get("created_at")) | ||
| if created_at is None: | ||
| logger.debug( | ||
| MEMORY_MODEL_INVALID, | ||
| field="created_at", | ||
| memory_id=str(raw.get("id", "?")), | ||
| reason="missing or unparseable created_at, defaulting to now()", | ||
| ) | ||
| created_at = datetime.now(UTC) | ||
| updated_at = parse_mem0_datetime(raw.get("updated_at")) | ||
|
|
||
| raw_metadata = raw.get("metadata") | ||
| category, metadata, expires_at = parse_mem0_metadata(raw_metadata) | ||
|
|
||
| raw_score = raw.get("score") | ||
| relevance_score = normalize_relevance_score(raw_score) | ||
|
|
||
| return MemoryEntry( | ||
| id=memory_id, | ||
| agent_id=NotBlankStr(agent_id), | ||
| category=category, |
| entries = tuple( | ||
| mem0_result_to_entry(item, str(agent_id)) for item in raw_list | ||
| ) |
There was a problem hiding this comment.
Single malformed item aborts entire batch retrieval
Both retrieve() and search_shared() build the result tuple by eagerly evaluating a generator that calls mem0_result_to_entry() per item. mem0_result_to_entry raises MemoryRetrievalError for any item with a missing/blank ID or empty content. Because the exception is thrown inside the tuple() call, it propagates out of the try block and the entire retrieval fails — all valid results from the same batch are discarded.
This is particularly impactful for search_shared(): if even one stored shared-knowledge entry is malformed (due to a direct Mem0 write, data corruption, or a bug in a prior version), the whole shared knowledge store becomes completely inaccessible to all agents until the corrupt record is manually removed.
mem0_result_to_entry already logs a warning before raising, so a try/except skip-and-warn pattern at the call site would preserve that observability while returning partial results instead of failing completely:
# retrieve() – lines 333–335
entries_list: list[MemoryEntry] = []
for item in raw_list:
try:
entries_list.append(mem0_result_to_entry(item, str(agent_id)))
except MemoryRetrievalError:
pass # warning already logged inside mem0_result_to_entry
entries = tuple(entries_list)The same pattern should be applied to search_shared() lines 659–665.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/memory/backends/mem0/adapter.py
Line: 333-335
Comment:
**Single malformed item aborts entire batch retrieval**
Both `retrieve()` and `search_shared()` build the result tuple by eagerly evaluating a generator that calls `mem0_result_to_entry()` per item. `mem0_result_to_entry` raises `MemoryRetrievalError` for any item with a missing/blank ID or empty content. Because the exception is thrown inside the `tuple()` call, it propagates out of the `try` block and the entire retrieval fails — all valid results from the same batch are discarded.
This is particularly impactful for `search_shared()`: if even one stored shared-knowledge entry is malformed (due to a direct Mem0 write, data corruption, or a bug in a prior version), the whole shared knowledge store becomes **completely inaccessible** to all agents until the corrupt record is manually removed.
`mem0_result_to_entry` already logs a warning before raising, so a try/except skip-and-warn pattern at the call site would preserve that observability while returning partial results instead of failing completely:
```python
# retrieve() – lines 333–335
entries_list: list[MemoryEntry] = []
for item in raw_list:
try:
entries_list.append(mem0_result_to_entry(item, str(agent_id)))
except MemoryRetrievalError:
pass # warning already logged inside mem0_result_to_entry
entries = tuple(entries_list)
```
The same pattern should be applied to `search_shared()` lines 659–665.
How can I resolve this? If you propose a fix, please make it concise.| if ( | ||
| query.min_relevance > 0.0 | ||
| and entry.relevance_score is not None | ||
| and entry.relevance_score < query.min_relevance | ||
| ): | ||
| continue |
There was a problem hiding this comment.
Unscored entries bypass min_relevance filter silently
The short-circuit on line 313 (and entry.relevance_score is not None) means that any entry without a relevance score — which includes all results from get_all() since Mem0 only computes scores during vector search — will pass the min_relevance filter unconditionally, regardless of how high the caller sets it.
A caller who invokes retrieve() without query.text (which takes the get_all path) and also sets query.min_relevance = 0.5 expecting to restrict results to high-confidence hits will silently receive all stored entries. The current behaviour is defensible ("can't filter what wasn't scored"), but since it diverges from what the parameter name implies and is not documented, it is likely to produce confusing results in practice.
At minimum the docstring for apply_post_filters should document that min_relevance is a no-op for entries that were not retrieved via vector search.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/memory/backends/mem0/mappers.py
Line: 311-316
Comment:
**Unscored entries bypass `min_relevance` filter silently**
The short-circuit on line 313 (`and entry.relevance_score is not None`) means that any entry without a relevance score — which includes **all** results from `get_all()` since Mem0 only computes scores during vector search — will pass the `min_relevance` filter unconditionally, regardless of how high the caller sets it.
A caller who invokes `retrieve()` without `query.text` (which takes the `get_all` path) and also sets `query.min_relevance = 0.5` expecting to restrict results to high-confidence hits will silently receive all stored entries. The current behaviour is defensible ("can't filter what wasn't scored"), but since it diverges from what the parameter name implies and is not documented, it is likely to produce confusing results in practice.
At minimum the docstring for `apply_post_filters` should document that `min_relevance` is a no-op for entries that were not retrieved via vector search.
How can I resolve this? If you propose a fix, please make it concise.…indings Root cause of round-2 lychee CI failure: lycheeverse/lychee-action v2.8.0's install step does not handle lychee v0.24.x's new nested archive layout. The action's #330 fix (merged 2026-04-24) has not shipped a v2.9 release yet (issue #332 open). Our local install_cli_tools.sh already handles the nested archive correctly, so the cleanest fix is to drop the action entirely and use the script in CI. Removes drift: the script (local pre-push hook) and CI now install the byte-identical binary from one Renovate-tracked version pin. Eliminates upstream-action release-cadence coupling. Changes: - scripts/install_cli_tools.sh: positional subcommand dispatcher (all | lychee | golangci-lint, default all). PATH-staleness check after both installs (CodeRabbit major actionable + AI-agent inline prompt): if the PATH-resolved binary differs from the install_dir binary AND its --version disagrees with the pin, fail fast with a 'fix your PATH or remove the stale binary' hint. SC2064 disabled on the EXIT trap with rationale comment (early expansion is intentional since the tmpdir is function-local). - .github/workflows/lychee.yml: replaced lycheeverse/lychee-action step with three steps: install via bash scripts/install_cli_tools.sh lychee (LYCHEE_INSTALL_DIR at step-level env per actionlint), echo to GITHUB_PATH, then run lychee directly. --format markdown teed into GITHUB_STEP_SUMMARY preserves the action's job-summary feature. Path-filter widened to include scripts/install_cli_tools.sh so the workflow re-runs when the install changes. - renovate.json: reverted the version|lycheeVersion regex alternation (no workflow uses lycheeVersion: now that the action is gone). - docs/guides/dynamic-scoring.md: fixed misleading cross-reference (CodeRabbit minor). scoring-hyperparameters.md is operator-tunable weights / thresholds, not an architecture overview; the sentence now says exactly that instead of pretending it documents composite/weighted/ranked/multi-objective strategies.
…indings Root cause of round-2 lychee CI failure: lycheeverse/lychee-action v2.8.0's install step does not handle lychee v0.24.x's new nested archive layout. The action's #330 fix (merged 2026-04-24) has not shipped a v2.9 release yet (issue #332 open). Our local install_cli_tools.sh already handles the nested archive correctly, so the cleanest fix is to drop the action entirely and use the script in CI. Removes drift: the script (local pre-push hook) and CI now install the byte-identical binary from one Renovate-tracked version pin. Eliminates upstream-action release-cadence coupling. Changes: - scripts/install_cli_tools.sh: positional subcommand dispatcher (all | lychee | golangci-lint, default all). PATH-staleness check after both installs (CodeRabbit major actionable + AI-agent inline prompt): if the PATH-resolved binary differs from the install_dir binary AND its --version disagrees with the pin, fail fast with a 'fix your PATH or remove the stale binary' hint. SC2064 disabled on the EXIT trap with rationale comment (early expansion is intentional since the tmpdir is function-local). - .github/workflows/lychee.yml: replaced lycheeverse/lychee-action step with three steps: install via bash scripts/install_cli_tools.sh lychee (LYCHEE_INSTALL_DIR at step-level env per actionlint), echo to GITHUB_PATH, then run lychee directly. --format markdown teed into GITHUB_STEP_SUMMARY preserves the action's job-summary feature. Path-filter widened to include scripts/install_cli_tools.sh so the workflow re-runs when the install changes. - renovate.json: reverted the version|lycheeVersion regex alternation (no workflow uses lycheeVersion: now that the action is gone). - docs/guides/dynamic-scoring.md: fixed misleading cross-reference (CodeRabbit minor). scoring-hyperparameters.md is operator-tunable weights / thresholds, not an architecture overview; the sentence now says exactly that instead of pretending it documents composite/weighted/ranked/multi-objective strategies.
Summary
Mem0EmbedderConfig—providerandmodelare now required fields (no"openai"/"text-embedding-3-small"defaults), enforcing the vendor-agnostic ruleImportErrorhandling, real health check probes, and immutable metadata constructionlogger.exceptionwithlogger.warningfor self-raised errors, add structurederror/error_typekwargs to all re-raise logsembedderconfig parameter (no vendor defaults baked in)store_request_to_mem0_args) and extract_validate_add_resulthelperChanges
backends/mem0/config.pyMem0EmbedderConfigfields now required;Mem0BackendConfigpath traversal validator;build_config_from_company_configtakesembedderkwargbackends/mem0/adapter.pyImportErrorbranch, health probe, publisher check in retract, generator expression insearch_shared, meaningful variable names, defensive_extract_category,_validate_add_resulthelperbackends/mem0/mappers.pytry/exceptinparse_mem0_datetimeandparse_mem0_metadata; ID/content validation inmem0_result_to_entry; remove deadstore_request_to_mem0_argsmemory/factory.pyembedderparam, validate type, raiseMemoryConfigErrorif missing for mem0 backendmemory/__init__.pyMem0EmbedderConfigTest plan
uv run ruff check src/ tests/— passesuv run ruff format src/ tests/— cleanuv run mypy src/ tests/— no errorsuv run pytest tests/ -n auto --cov=ai_company --cov-fail-under=80— 7213 passed, 94.59% coverageReview coverage
Pre-reviewed by 9 agents (code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, docs-consistency). 33 findings identified, 32 implemented, 1 deferred (
_client: Any— Mem0 has no typed interface).Closes #206