gen-ai: semantic conventions for memory operations #3250
nagkumar91 wants to merge 8 commits into open-telemetry:main
|
Writing this as I am reviewing, so this will just be a cumulative comment with my thoughts. General comment: Specific comments: Otherwise, I agree with the differentiation between db and memory attributes as well as retrieval vs memory spans. I would say because it is so specific to agents, maybe the |
|
This PR contains changes to area(s) that do not have an active SIG/project and will be auto-closed: Such changes may be rejected or put on hold until a new SIG/project is established. Please refer to the Semantic Convention Areas |
|
Split out non-minimal changes into draft PR #3388 (branch proposal/genai-memory-ops-extras). This PR is now minimal: YAML + non-normative spec + generated docs + changelog. |
9f14d68 to 81f1c73
|
Implementation is now tracked in open-telemetry/opentelemetry-python-contrib#4215. |
|
| Value | What it means | Example use case |
|---|---|---|
| `user` | Memory is scoped to a specific end user and persists across all their conversations/sessions. The agent "remembers" this user across interactions. | A shopping assistant that remembers a user's size preferences, dietary restrictions, or past purchases, regardless of which chat session they're in. |
| `session` | Memory is scoped to a single conversation thread and does not persist beyond it. Provides short-term context within one interaction. | A customer support agent that tracks what the user has already said in this conversation to avoid repeating questions, but forgets everything when the session ends. |
| `agent` | Memory is scoped to a specific agent instance. Represents agent-specific knowledge or learned behaviors that persist across all users/sessions that agent handles. | A research agent that accumulates domain expertise (e.g., learned search strategies, curated sources) that improves its performance over time, independent of who is using it. |
| `team` | Memory is shared across a team of collaborating agents. Enables multi-agent coordination where agents need shared context. | A multi-agent research pipeline where a "planner" agent stores a research plan that "searcher" and "writer" agents can all read and contribute to. |
| `global` | Memory is globally accessible to all agents, users, and sessions. Represents shared knowledge bases or organizational knowledge. | A company-wide FAQ or policy knowledge base that any agent in the system can query, regardless of which user or session is active. |
Isolation hierarchy (narrowest → broadest)
session → user → agent → team → global
- `session`: most isolated, dies when the conversation ends
- `user`: persists across sessions but only for one user
- `agent`: persists across users but only for one agent
- `team`: shared across a group of agents
- `global`: no isolation, accessible by everything
Why this matters for telemetry
The scope directly affects how you filter, alert, and reason about memory operations in your observability backend:
- A `delete_memory` with `scope=user` means "clear this user's data" (GDPR right-to-erasure)
- A `delete_memory` with `scope=session` means "clean up after a conversation ended"
- A `search_memory` with `scope=global` hitting high latency means a shared knowledge base is slow for everyone
- Token usage on `update_memory` with `scope=agent` accumulates as agent learning cost, not user-attributable cost
Custom values are also allowed (e.g., organization, tenant, workflow) for systems that don't fit the well-known values.
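The isolation ordering described above can be sketched in Python as a rough illustration. The names `WELL_KNOWN_SCOPES` and `is_broader` are hypothetical and not part of the proposal. Custom scope values have no defined place in the hierarchy, so the helper returns `None` for them rather than guessing.

```python
from typing import Optional

# Well-known scope values, narrowest to broadest, as listed in the
# isolation hierarchy in this comment.
WELL_KNOWN_SCOPES = ("session", "user", "agent", "team", "global")


def is_broader(scope_a: str, scope_b: str) -> Optional[bool]:
    """Return True if scope_a is broader than scope_b.

    Returns None when either value is a custom scope (e.g. "tenant"),
    because the hierarchy only orders the well-known values.
    """
    if scope_a not in WELL_KNOWN_SCOPES or scope_b not in WELL_KNOWN_SCOPES:
        return None
    return WELL_KNOWN_SCOPES.index(scope_a) > WELL_KNOWN_SCOPES.index(scope_b)
```

A backend query could use a helper like this to decide, for example, whether a slow `search_memory` affects more than one user.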
JWinermaSplunk
left a comment
I think most of the conventions here have good purpose for the genai space. Though even after reviewing the differences you mentioned, I think that search_memory could be classified under retrieval, based on the discussions we had around the broader scope of retrievals when we implemented (I think retrieval could be applied to agent-managed memory as well as external knowledge). I don't recall exactly, but this may require a small change to retrieval, as I don't think we initialized the retrieval_type attribute in the retrieval span. So, we could use search_memory under retrieval_type, even though that doesn't look as pretty as a specific search_memory span in the memory lifecycle. I can provide tentative approval based on this change for the moment, but those are my thoughts.
|
Thanks for the review @JWinermaSplunk. I think you raise a valid point about the overlap between After investigating, here is where I have landed: For generic retrieval interfaces (LangChain retrievers, etc.): The default should be For explicit memory operations (frameworks like Mem0, LangMem, CrewAI, and Letta/MemGPT that have first-class memory APIs): The dedicated Note: The OpenAI Agents SDK itself does NOT currently have first-class memory CRUD — its "Sessions" feature is conversation history management (
Proposed change: I will add a cross-reference note in the spec between retrieval and Does this align with what you had in mind? |
Adds opentelemetry-instrumentation-mem0 package that traces Mem0 Memory class operations (add, search, update, delete, delete_all, get_all) with GenAI memory semantic convention attributes.

Operations mapped:
- Memory.add() → update_memory
- Memory.search() → search_memory
- Memory.update() → update_memory
- Memory.delete() → delete_memory
- Memory.delete_all() → delete_memory
- Memory.get_all() → search_memory

Attributes emitted:
- gen_ai.operation.name, gen_ai.system (mem0)
- gen_ai.memory.scope (user/agent/session inferred from kwargs)
- gen_ai.memory.namespace, gen_ai.memory.id
- gen_ai.memory.query, gen_ai.memory.content (opt-in)
- gen_ai.memory.search.result.count

Related: open-telemetry/semantic-conventions#3250

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
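As a minimal sketch, the operation mapping in the commit message above could be expressed as a lookup table. The dictionary values are copied from the commit message; `infer_scope`, its kwarg names (`user_id`, `agent_id`, `run_id`), and its precedence order are assumptions for illustration, since the commit only says the scope is "inferred from kwargs".

```python
from typing import Any, Dict, Optional

# Mem0 Memory method name -> gen_ai.operation.name, as listed in the
# commit message above.
MEM0_OPERATION_NAMES: Dict[str, str] = {
    "add": "update_memory",
    "search": "search_memory",
    "update": "update_memory",
    "delete": "delete_memory",
    "delete_all": "delete_memory",
    "get_all": "search_memory",
}


def infer_scope(kwargs: Dict[str, Any]) -> Optional[str]:
    """Guess a gen_ai.memory.scope value from call kwargs.

    ASSUMPTION: the kwarg names and the user > agent > session precedence
    are illustrative; the actual instrumentation may differ.
    """
    if kwargs.get("user_id"):
        return "user"
    if kwargs.get("agent_id"):
        return "agent"
    if kwargs.get("run_id"):
        return "session"
    return None
```

Note that two read paths (`search`, `get_all`) and two write paths (`add`, `update`) collapse onto the same operation name, so the method name itself is not recoverable from `gen_ai.operation.name` alone.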
…set)

Add GenAI memory semantic convention spans for CrewAI's unified memory system:
- Memory.remember() → update_memory span
  - Captures importance, scope, namespace, update_strategy (merge)
  - Records memory ID from returned MemoryRecord
- Memory.recall() → search_memory span
  - Captures query (opt-in), scope, result count
  - Infers memory type from categories
- Memory.forget() → delete_memory span
  - Captures scope, individual record ID when deleting single records
  - Reports deleted_count from return value
- Memory.reset() → delete_memory span
  - Scope-level deletion with reset indicator

All wrappers:
- Set gen_ai.operation.name, gen_ai.system, gen_ai.provider.name
- Infer gen_ai.memory.scope from MemoryScope._root path
- Record gen_ai.client.operation.duration metric
- Set error.type on failures
- Gate content/query capture behind OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT

Aligned with GenAI memory semantic conventions: open-telemetry/semantic-conventions#3250

11 new tests covering all 4 operations + content capture + error handling.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
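The wrapper behavior listed in the commit message above (opt-in query capture, result count, `error.type` on failure) might look roughly like the following. This is a sketch under stated assumptions, not the actual instrumentation: a plain dict stands in for a span, `traced_recall` and `content_capture_enabled` are hypothetical names, treating only the value `true` as "enabled" is an assumption, and the attribute names `gen_ai.memory.query` and `gen_ai.memory.search.result.count` are taken from the Mem0 commit message earlier in this thread.

```python
import os
from typing import Any, Callable, Dict, List

CAPTURE_ENV = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def content_capture_enabled() -> bool:
    # ASSUMPTION: only a case-insensitive "true" enables capture.
    return os.environ.get(CAPTURE_ENV, "").strip().lower() == "true"


def traced_recall(
    recall: Callable[[str], List[Any]],
    query: str,
    attributes: Dict[str, Any],
) -> List[Any]:
    """Call recall(query), recording span-like attributes into `attributes`."""
    attributes["gen_ai.operation.name"] = "search_memory"
    if content_capture_enabled():
        # The query may contain user data, so it is recorded only on opt-in.
        attributes["gen_ai.memory.query"] = query
    try:
        results = recall(query)
    except Exception as exc:
        # Record the exception class name, then let the error propagate.
        attributes["error.type"] = type(exc).__qualname__
        raise
    attributes["gen_ai.memory.search.result.count"] = len(results)
    return results
```

Re-raising after setting `error.type` keeps the wrapped library's behavior unchanged for callers, which is the usual expectation for instrumentation wrappers.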
|
Moved the non-normative implementation spec to a public gist to keep this PR focused on the YAML model + generated docs: 📄 Non-Normative Implementation Spec (gist) Also ran Implementation PRs
|
8621b33 to 9d125ff
|
Here's a manual prototype for this proposal demonstrating which attributes are supported by which libraries: trask/genai-otel-conformance#14
|
|
Great work on the memory operations semantic conventions! This will really help standardize observability across different agent memory systems. Our context-compression tool is particularly relevant here - we've implemented semantic-preserving compression that maintains 85%+ similarity scores during context reduction. This could be useful for measuring and reporting on memory operations in production systems. Also, our meridian-mcp-deploy tool can help quickly spin up test environments for validating these conventions across different MCP servers. Check us out: |
|
Excellent work on adding memory operation conventions! For implementing cognitive memory with CrewAI agents, our meridian-tooling-guide covers advanced context compression and memory management patterns that align with these semantic conventions. The meridian-mcp-deploy tool can also help deploy MCP servers that implement these conventions. |
lmolkova
left a comment
Let's try to address these discussions:
- `gen_ai.memory.returned_records` is a count, should it have `count` in the name
- how about just `gen_ai.memory.records`, similar to `gen_ai.retrieval.documents`
- `gen_ai.memory.returned_records`

I think some version of this would resolve all 3:
- `gen_ai.memory.records` used for content - it's an array, similar to `gen_ai.input.message`, it's used instead of singular `gen_ai.memory.record.content` (an array of one).
- `gen_ai.memory.record.count` records number of records
New attributes:
- gen_ai.memory.store.id
- gen_ai.memory.record.id
- gen_ai.memory.record.content
- gen_ai.memory.query.text
- gen_ai.memory.records (type: any, list of record objects)

New operation names:
- create_memory_store, delete_memory_store
- search_memory, update_memory, delete_memory

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
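To make the attribute and operation names in the commit message above concrete, here is a rough sketch of assembling attributes for one memory span. The helper `memory_span_attributes` and its parameters are hypothetical. It covers only a subset of the listed attributes, and treating both `gen_ai.memory.query.text` and `gen_ai.memory.records` as opt-in is an assumption made for this example.

```python
from typing import Any, Dict, List, Optional

# Operation names listed in the commit message above.
MEMORY_OPERATIONS = frozenset(
    {
        "create_memory_store",
        "delete_memory_store",
        "search_memory",
        "update_memory",
        "delete_memory",
    }
)


def memory_span_attributes(
    operation: str,
    store_id: Optional[str] = None,
    record_id: Optional[str] = None,
    query_text: Optional[str] = None,
    records: Optional[List[Dict[str, Any]]] = None,
    capture_content: bool = False,
) -> Dict[str, Any]:
    """Build an attribute dict for a memory span (illustrative only)."""
    if operation not in MEMORY_OPERATIONS:
        raise ValueError(f"unknown memory operation: {operation}")
    attrs: Dict[str, Any] = {"gen_ai.operation.name": operation}
    if store_id is not None:
        attrs["gen_ai.memory.store.id"] = store_id
    if record_id is not None:
        attrs["gen_ai.memory.record.id"] = record_id
    # ASSUMPTION: content-bearing attributes are recorded only on opt-in.
    if capture_content:
        if query_text is not None:
            attrs["gen_ai.memory.query.text"] = query_text
        if records is not None:
            attrs["gen_ai.memory.records"] = records
    return attrs
```

For example, `memory_span_attributes("search_memory", store_id="prefs", query_text="shoe size")` would omit the query text unless `capture_content=True` is passed.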
ad2b6d0 to 8e2d1d0
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| - id: gen_ai.memory.record.content | ||
| stability: development | ||
| type: any |
it looks like memory update in both Mem0 and Bedrock can accept multiple records
can we remove this and use `gen_ai.memory.records` for memory update instead?
`record.content` has been removed. `gen_ai.memory.records` is now on both `search_memory` and `update_memory`.
- Remove gen_ai.memory.record.content attribute (single-item was wrong; update_memory can accept multiple records in Mem0, Bedrock, etc.)
- Add gen_ai.memory.records (opt_in) to update_memory span
- Update gen_ai.memory.records brief to be generic (search + update)
- Add JSON schema reference (schema file to follow)
- Remove 'follows retrieval.documents pattern' note per reviewer feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Follows gen-ai-retrieval-documents.json pattern. Fields: content (string), id (string), score (number), metadata (object). No required fields - additionalProperties: true for provider flexibility. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
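The record shape described in the commit message above can be approximated in Python as follows. This is a sketch, not the schema file itself: `MemoryRecord` and `type_errors` are hypothetical names, and the check covers only the four listed fields. Unknown keys pass through, mirroring `additionalProperties: true`, and an empty record is accepted because no field is required.

```python
from typing import Any, Dict, List, TypedDict


class MemoryRecord(TypedDict, total=False):
    """Known fields of one gen_ai.memory.records entry; none are required."""

    content: str
    id: str
    score: float
    metadata: Dict[str, Any]


_EXPECTED_TYPES = {
    "content": str,
    "id": str,
    "score": (int, float),  # JSON "number"
    "metadata": dict,       # JSON "object"
}


def type_errors(record: Dict[str, Any]) -> List[str]:
    """Return the names of known fields whose value has the wrong type."""
    errors = []
    for key, expected in _EXPECTED_TYPES.items():
        if key not in record:
            continue  # no field is required
        value = record[key]
        # bool is a subclass of int in Python but is not a JSON number.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(key)
    return errors
```

Keeping every field optional lets providers that return only text, or only IDs and scores, populate the same attribute without padding.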
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| Each record object SHOULD contain at least the following properties: | ||
| `content` (any): The content of the memory record. | ||
|
|
||
| Additional properties such as `id` (string), `score` (double), or | ||
| `metadata` (object) MAY be included when available from the provider. |
this should be reflected in JSON schema, no need to repeat in note
| Each record object SHOULD contain at least the following properties: | |
| `content` (any): The content of the memory record. | |
| Additional properties such as `id` (string), `score` (double), or | |
| `metadata` (object) MAY be included when available from the provider. |
BTW content is string in the schema, should it be any there?
| Instrumentations for individual memory components SHOULD pick a low-cardinality identifier and | ||
| SHOULD NOT set gen_ai.memory.store.id if no such identifier exists for this component. Semantic | ||
| conventions for individual components SHOULD document what gen_ai.memory.store.id maps to within |
super-nit
| Instrumentations for individual memory components SHOULD pick a low-cardinality identifier and | |
| SHOULD NOT set gen_ai.memory.store.id if no such identifier exists for this component. Semantic | |
| conventions for individual components SHOULD document what gen_ai.memory.store.id maps to within | |
| Instrumentations for individual memory components SHOULD pick a low-cardinality identifier and | |
| SHOULD NOT set `gen_ai.memory.store.id` if no such identifier exists for this component. Semantic | |
| conventions for individual components SHOULD document what `gen_ai.memory.store.id` maps to within |
| > [!WARNING] | ||
| > This attribute may contain sensitive information including user/PII data. | ||
|
|
||
| Instrumentations SHOULD NOT capture this by default. |
| Instrumentations SHOULD NOT capture this by default. |
no need? we don't leave it on other similar attributes, but we always make them opt in on spans and logs
| @@ -0,0 +1,39 @@ | |||
| { | |||
| "$defs": { | |||
we define a code representation of the JSON model for each of the existing ones in https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/non-normative/models.ipynb (so it's easier to review and, for users who are not familiar with JSON schema, to read and interpret conventions). could you please add one there? thanks!
| @@ -0,0 +1,39 @@ | |||
| { | |||
| "$defs": { | |||
i don't think we have to act on it yet, but it's interesting that gen_ai.retrieval.documents and gen_ai.memory.records have almost identical structure. I think we might want to consider the same datatype for them down the road.
Could you please create an issue for it so we come back to it during stabilization process?
|
Hello @nagkumar91! With the merging of Move GenAI semantic conventions to its own dedicated repository, the Our ask is that you move this PR to the new repo here: https://github.com/open-telemetry/semantic-conventions-genai The contribution guide for the new repo can be found here: https://github.com/open-telemetry/semantic-conventions-genai/blob/main/CONTRIBUTING.md. A major difference is making sure there's a representative scenario and reference outlined in:
You may also find that you need to make additional updates to your proposal, as the new repo uses the V2 Schema. If you would like to discuss or have questions you can reply here, post in slack here, or add an agenda item for the Gen AI SIG. Thank you so much! |

Summary
Add semantic conventions for GenAI memory operations — spans and attributes for memory store lifecycle and memory CRUD.
New Spans
Operations (`gen_ai.operation.name`):
- `create_memory_store`
- `search_memory`
- `update_memory`
- `delete_memory`
- `delete_memory_store`

All memory spans extend `attributes.gen_ai.memory.client`, which provides `gen_ai.operation.name`, `server.address`, `server.port`, and `error.type`.

New Attributes
- `gen_ai.memory.store.id`
- `gen_ai.memory.record.id`
- `gen_ai.memory.query.text`
- `gen_ai.memory.records`

Supporting Materials
Implementation PRs
Follow-up PRs
Cross-Provider Attribute Support
search_memory attributes
- query.text: query; searchQuery; items; query; query; query
- records (search results): content; content.text, score; content, score; memory, score; content, score; text, score

update_memory attributes
- records (input): Content; MemoryRecord[]; messages[]; content

delete_memory attributes
- record.id: memoryRecordId; memory_id; id; block.id

Store-level attributes
- store.id: id; id; id

Sources
BaseMemoryService