gen-ai: semantic conventions for memory operations #3250

Open
nagkumar91 wants to merge 8 commits into open-telemetry:main from nagkumar91:proposal/genai-memory-ops

@nagkumar91
Contributor

@nagkumar91 nagkumar91 commented Jan 6, 2026

Summary

Add semantic conventions for GenAI memory operations — spans and attributes for memory store lifecycle and memory CRUD.

New Spans

| Span (gen_ai.operation.name) | Description |
|---|---|
| create_memory_store | Create/initialize a memory store |
| search_memory | Search/retrieve memories by query |
| update_memory | Add/update memories (upsert semantics) |
| delete_memory | Delete memory records |
| delete_memory_store | Delete an entire memory store |

All memory spans extend attributes.gen_ai.memory.client which provides gen_ai.operation.name, server.address, server.port, and error.type.

New Attributes

| Attribute | Type | Spans | Req Level | Description |
|---|---|---|---|---|
| gen_ai.memory.store.id | string | all | conditionally_required | Unique identifier of the memory store |
| gen_ai.memory.record.id | string | delete_memory | conditionally_required | Unique identifier of a single memory record |
| gen_ai.memory.query.text | string | search_memory | opt_in | Query string for search operations (PII-sensitive) |
| gen_ai.memory.records | any | search_memory, update_memory | opt_in | List of memory records stored or retrieved (PII-sensitive). Follows JSON schema |
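
To make the requirement levels in the table concrete, here is a minimal sketch of how an instrumentation might assemble attributes for a search_memory span. The function name `search_memory_attributes` and the `capture_content` flag are hypothetical, and serializing the `any`-typed `gen_ai.memory.records` value as a JSON string is an assumption, not something this PR specifies.

```python
import json
from typing import Any, Optional


def search_memory_attributes(
    query_text: str,
    records: list[dict[str, Any]],
    store_id: Optional[str] = None,
    capture_content: bool = False,
) -> dict[str, Any]:
    """Build the attribute dict for a hypothetical search_memory span."""
    attributes: dict[str, Any] = {"gen_ai.operation.name": "search_memory"}

    # Conditionally required: set only when the store has an identifier.
    if store_id is not None:
        attributes["gen_ai.memory.store.id"] = store_id

    # Opt-in and PII-sensitive: emitted only when the caller enables capture.
    if capture_content:
        attributes["gen_ai.memory.query.text"] = query_text
        # Assumption: the `any`-typed value is exported as a JSON string.
        attributes["gen_ai.memory.records"] = json.dumps(records)

    return attributes
```

With the default `capture_content=False`, neither the query text nor the records appear in the result, which mirrors the opt_in level of both attributes.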

Supporting Materials

| Resource | Description |
|---|---|
| 📄 Non-Normative Implementation Spec | Detailed spec with framework mappings, trace examples, and rationale |

Implementation PRs

| PR | Framework | Status |
|---|---|---|
| opentelemetry-python-contrib#4252 | Mem0 (full memory CRUD) | Open |
| traceloop/openllmetry#3713 | CrewAI (remember/recall/forget/reset) | Open |

Follow-up PRs

| PR | Description |
|---|---|
| #3569 | Add optional memory attributes (store.name, scope, similarity.threshold, expiration_date, agent.id, conversation.id) |

Cross-Provider Attribute Support

search_memory attributes

| Attribute | Google ADK | AWS Bedrock | Azure | Mem0 | CrewAI | Letta |
|---|---|---|---|---|---|---|
| query.text | query | searchQuery | items | query | query | query |
| records (search results) | content | content.text, score | content, score | memory, score | content, score | text, score |

update_memory attributes

| Attribute | Google ADK | AWS Bedrock | Azure | Mem0 | CrewAI | Letta |
|---|---|---|---|---|---|---|
| records (input) | ✅ single Content | ✅ batch MemoryRecord[] | ✅ single record | ✅ batch messages[] | ✅ single content | ✅ single block |

delete_memory attributes

Attribute Google ADK AWS Bedrock Azure Mem0 CrewAI Letta
record.id memoryRecordId memory_id id block.id

Store-level attributes

Attribute Google ADK AWS Bedrock Azure Mem0 CrewAI Letta
store.id id id id

Sources

  1. Google ADK: Memory docs · BaseMemoryService
  2. AWS Bedrock AgentCore: Quickstart · Data-plane API
  3. Azure AI Foundry: Memory concepts · Memory how-to
  4. Mem0: Documentation · Python client
  5. CrewAI: Memory concepts · Unified memory
  6. Letta/MemGPT: Documentation · Agent API

@github-actions github-actions Bot added the enhancement and area:gen-ai labels Jan 7, 2026
@nagkumar91 nagkumar91 changed the title from "gen-ai: non-normative memory operations proposal" to "gen-ai: semantic conventions for memory operations" Jan 7, 2026
@nagkumar91 nagkumar91 marked this pull request as ready for review January 26, 2026 16:38
@nagkumar91 nagkumar91 requested review from a team as code owners January 26, 2026 16:38
@JWinermaSplunk
Contributor

Writing this as I am reviewing, so this will just be a cumulative comment with my thoughts.

General comment:
This is a relatively large PR (I know that may be because of the examples you also provided, and it would be nice to see how large it is without them), but it might be better to split into smaller, separate PRs, maybe by span type for example?

Specific comments:
It would be helpful to provide some precedent for some of these attributes in prototypes i.e. existing instrumentation with enums where these values are defined for gen_ai.memory.scope, gen_ai.memory.type, gen_ai.memory.update.strategy, etc. Should gen_ai.memory.query also not be captured by default, similar to gen_ai.memory.content?

Otherwise, I agree with the differentiation between db and memory attributes as well as retrieval vs memory spans. I would say because it is so specific to agents, maybe the agent_memory namespace may be more appropriate than just memory? But, that's just a thought.

@github-actions

github-actions Bot commented Feb 4, 2026

This PR contains changes to area(s) that do not have an active SIG/project and will be auto-closed:

Such changes may be rejected or put on hold until a new SIG/project is established.

Please refer to the Semantic Convention Areas document to see the current active SIGs and also to learn how to kick start a new one.

@nagkumar91
Contributor Author

Split out non-minimal changes into draft PR #3388 (branch proposal/genai-memory-ops-extras). This PR is now minimal: YAML + non-normative spec + generated docs + changelog.

@lmolkova lmolkova reopened this Feb 5, 2026
@nagkumar91 nagkumar91 force-pushed the proposal/genai-memory-ops branch from 9f14d68 to 81f1c73 Compare February 6, 2026 18:23
Comment thread model/gen-ai/spans.yaml
@nagkumar91
Contributor Author

Implementation is now tracked in open-telemetry/opentelemetry-python-contrib#4215.

@nagkumar91
Contributor Author

gen_ai.memory.scope — Values and What Each Means

Addressing the ask for more detail on the possible values. The current well-known values for gen_ai.memory.scope and what each represents:

| Value | What it means | Example use case |
|---|---|---|
| user | Memory is scoped to a specific end user and persists across all their conversations/sessions. The agent "remembers" this user across interactions. | A shopping assistant that remembers a user's size preferences, dietary restrictions, or past purchases, regardless of which chat session they're in. |
| session | Memory is scoped to a single conversation thread and does not persist beyond it. Provides short-term context within one interaction. | A customer support agent that tracks what the user has already said in this conversation to avoid repeating questions, but forgets everything when the session ends. |
| agent | Memory is scoped to a specific agent instance. Represents agent-specific knowledge or learned behaviors that persist across all users/sessions that agent handles. | A research agent that accumulates domain expertise (e.g., learned search strategies, curated sources) that improve its performance over time, independent of who is using it. |
| team | Memory is shared across a team of collaborating agents. Enables multi-agent coordination where agents need shared context. | A multi-agent research pipeline where a "planner" agent stores a research plan that "searcher" and "writer" agents can all read and contribute to. |
| global | Memory is globally accessible to all agents, users, and sessions. Represents shared knowledge bases or organizational knowledge. | A company-wide FAQ or policy knowledge base that any agent in the system can query, regardless of which user or session is active. |

Isolation hierarchy (narrowest → broadest)

session → user → agent → team → global
  • session: most isolated — dies when the conversation ends
  • user: persists across sessions but only for one user
  • agent: persists across users but only for one agent
  • team: shared across a group of agents
  • global: no isolation — accessible by everything

Why this matters for telemetry

The scope directly affects how you filter, alert, and reason about memory operations in your observability backend:

  • A delete_memory with scope=user means "clear this user's data" (GDPR right-to-erasure)
  • A delete_memory with scope=session means "clean up after a conversation ended"
  • A search_memory with scope=global hitting high latency means a shared knowledge base is slow for everyone
  • Token usage on update_memory with scope=agent accumulates as agent learning cost, not user-attributable cost

Custom values are also allowed (e.g., organization, tenant, workflow) for systems that don't fit the well-known values.
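
As an illustration of how a backend or test might reason about the hierarchy above, here is a small sketch that ranks the well-known values and treats anything else as a custom value. The names `WELL_KNOWN_SCOPES`, `scope_rank`, and `is_broader` are hypothetical helpers, not part of the proposal.

```python
from typing import Optional

# Narrowest to broadest, as listed in the isolation hierarchy.
WELL_KNOWN_SCOPES = ("session", "user", "agent", "team", "global")


def scope_rank(scope: str) -> Optional[int]:
    """Return the position of a well-known scope, or None for custom values."""
    try:
        return WELL_KNOWN_SCOPES.index(scope)
    except ValueError:
        return None


def is_broader(scope: str, other: str) -> Optional[bool]:
    """Compare two scopes; None means at least one is a custom value."""
    left, right = scope_rank(scope), scope_rank(other)
    if left is None or right is None:
        return None
    return left > right
```

Returning `None` for custom values such as `tenant` avoids guessing where they sit relative to the well-known scopes.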

Contributor

@JWinermaSplunk JWinermaSplunk left a comment

I think most of the conventions here serve a good purpose in the GenAI space. Even after reviewing the differences you mentioned, though, I think search_memory could be classified under retrieval, based on the discussions we had about the broader scope of retrievals when we implemented it (I think retrieval could apply to agent-managed memory as well as external knowledge). I don't recall exactly, but this may require a small change to retrieval, as I don't think we initialized the retrieval_type attribute in the retrieval span. So we could use search_memory as a retrieval_type value, even though that doesn't look as pretty as a dedicated search_memory span in the memory lifecycle. I can provide tentative approval based on this change for the moment, but those are my thoughts.

@nagkumar91
Contributor Author

nagkumar91 commented Feb 23, 2026

Thanks for the review @JWinermaSplunk. I think you raise a valid point about the overlap between search_memory and retrieval, especially for frameworks like LangChain where the retriever abstraction does not distinguish between memory and external knowledge.

After investigating, here is where I have landed:

For generic retrieval interfaces (LangChain retrievers, etc.): The default should be retrieval. When the retriever is known to be a memory retriever (via metadata like memory_store_name), the instrumentation can supplement the span with gen_ai.memory.* attributes. This avoids incorrectly classifying every retriever as a memory operation.

For explicit memory operations (frameworks like Mem0, LangMem, CrewAI, and Letta/MemGPT that have first-class memory APIs): The dedicated search_memory operation still makes sense because it is part of a full CRUD lifecycle (search_memory, update_memory, delete_memory, create_memory_store, delete_memory_store). These frameworks all expose search/add/update/delete as distinct memory operations — pulling just the read operation into retrieval while write/delete remain under memory would break lifecycle coherence and make it harder to correlate a memory search with a subsequent update or deletion in traces.

Note: The OpenAI Agents SDK itself does NOT currently have first-class memory CRUD — its "Sessions" feature is conversation history management (get_items/add_items/clear_session), not semantic memory. For actual memory, users integrate external services (Mem0, etc.) via @function_tool. The frameworks that DO have explicit memory lifecycle APIs include:

| Framework | Search | Create/Update | Delete | Store Mgmt |
|---|---|---|---|---|
| Mem0 | client.search(query, filters) | client.add(messages) | client.delete(id), batch_delete(), delete_all() | Managed service |
| LangMem | create_search_memory_tool() | create_manage_memory_tool() | create_manage_memory_tool() | Via BaseStore |
| CrewAI | memory.recall(query) | memory.remember(content) | memory.forget(scope) | LongTermMemory |
| Letta/MemGPT | blocks.retrieve(), passages.search() | blocks.create/update() | blocks.delete() | Block + archival |

Proposed change: I will add a cross-reference note in the spec between retrieval and search_memory documenting that systems using a generic retrieval interface for memory (like LangChain's retriever) MAY use retrieval with supplemental gen_ai.memory.* attributes when they cannot distinguish memory from external knowledge. I will also update the implementation PR (open-telemetry/opentelemetry-python-contrib#4215) so LangChain retrievers default to retrieval and only use search_memory when metadata explicitly indicates a memory retriever.

Does this align with what you had in mind?
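
The defaulting rule proposed above could look roughly like the following sketch. It assumes retriever metadata is available as a mapping and that a non-empty `memory_store_name` key (the example marker mentioned earlier in this comment) is what signals a memory retriever; the function name is hypothetical and this is not the code in the implementation PR.

```python
from typing import Any, Mapping


def operation_name_for_retriever(metadata: Mapping[str, Any]) -> str:
    """Pick gen_ai.operation.name for a generic retriever call."""
    # Only an explicit, non-empty memory marker switches to search_memory.
    if metadata.get("memory_store_name"):
        return "search_memory"
    return "retrieval"
```

A retriever with no metadata, or with an empty marker, stays classified as `retrieval`, so ordinary retrievers are never reported as memory operations by accident.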

nagkumar91 added a commit to nagkumar91/opentelemetry-python-contrib that referenced this pull request Feb 23, 2026
Adds opentelemetry-instrumentation-mem0 package that traces Mem0 Memory
class operations (add, search, update, delete, delete_all, get_all) with
GenAI memory semantic convention attributes.

Operations mapped:
- Memory.add() → update_memory
- Memory.search() → search_memory
- Memory.update() → update_memory
- Memory.delete() → delete_memory
- Memory.delete_all() → delete_memory
- Memory.get_all() → search_memory
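
The mapping above can be sketched as a plain lookup table. The dictionary and helper names are hypothetical and are not taken from the instrumentation package itself; only the method names and operation names come from the list above.

```python
from typing import Optional

# Mem0 Memory method name -> gen_ai.operation.name, per the list above.
MEM0_OPERATION_NAMES = {
    "add": "update_memory",
    "search": "search_memory",
    "update": "update_memory",
    "delete": "delete_memory",
    "delete_all": "delete_memory",
    "get_all": "search_memory",
}


def operation_name_for(method_name: str) -> Optional[str]:
    """Return the span operation name, or None for unmapped methods."""
    return MEM0_OPERATION_NAMES.get(method_name)
```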

Attributes emitted:
- gen_ai.operation.name, gen_ai.system (mem0)
- gen_ai.memory.scope (user/agent/session inferred from kwargs)
- gen_ai.memory.namespace, gen_ai.memory.id
- gen_ai.memory.query, gen_ai.memory.content (opt-in)
- gen_ai.memory.search.result.count

Related: open-telemetry/semantic-conventions#3250

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
nagkumar91 added a commit to nagkumar91/openllmetry that referenced this pull request Feb 23, 2026
…set)

Add GenAI memory semantic convention spans for CrewAI's unified memory system:

- Memory.remember() → update_memory span
  - Captures importance, scope, namespace, update_strategy (merge)
  - Records memory ID from returned MemoryRecord

- Memory.recall() → search_memory span
  - Captures query (opt-in), scope, result count
  - Infers memory type from categories

- Memory.forget() → delete_memory span
  - Captures scope, individual record ID when deleting single records
  - Reports deleted_count from return value

- Memory.reset() → delete_memory span
  - Scope-level deletion with reset indicator

All wrappers:
- Set gen_ai.operation.name, gen_ai.system, gen_ai.provider.name
- Infer gen_ai.memory.scope from MemoryScope._root path
- Record gen_ai.client.operation.duration metric
- Set error.type on failures
- Gate content/query capture behind OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT
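
A minimal sketch of the content gate described in the last item, assuming the environment variable is treated as enabled only when set to `true` (case-insensitive); the accepted values are an assumption here. The helper names are hypothetical, and the attribute key used is `gen_ai.memory.query.text` from the PR summary rather than whatever key this commit emitted.

```python
import os
from typing import Any, Mapping, Optional

CAPTURE_ENV_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def content_capture_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the opt-in variable is explicitly 'true'."""
    env = os.environ if environ is None else environ
    return env.get(CAPTURE_ENV_VAR, "").strip().lower() == "true"


def add_query_if_enabled(
    attributes: dict[str, Any],
    query: str,
    environ: Optional[Mapping[str, str]] = None,
) -> dict[str, Any]:
    """Attach the query text to span attributes only when capture is on."""
    if content_capture_enabled(environ):
        attributes["gen_ai.memory.query.text"] = query
    return attributes
```

Passing `environ` explicitly keeps the helper testable without mutating the process environment.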

Aligned with GenAI memory semantic conventions:
open-telemetry/semantic-conventions#3250

11 new tests covering all 4 operations + content capture + error handling.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@nagkumar91
Contributor Author

Moved the non-normative implementation spec to a public gist to keep this PR focused on the YAML model + generated docs:

📄 Non-Normative Implementation Spec (gist)

Also ran make generate-all — all generated markdown is in sync with the YAML model. The check-policies violation (openai.api.type backward compatibility) is pre-existing on main and unrelated to this PR.

Implementation PRs

| PR | Framework | Status |
|---|---|---|
| opentelemetry-python-contrib#4215 | LangChain (retrieval + memory detection) | Open |
| opentelemetry-python-contrib#4252 | Mem0 (full memory CRUD) | Open |
| traceloop/openllmetry#3713 | CrewAI (remember/recall/forget/reset) | Open |

Comment thread docs/non-normative/rpc-migration.md
Comment thread model/gen-ai/spans.yaml
@nagkumar91 nagkumar91 requested review from a team as code owners February 24, 2026 14:50
@nagkumar91 nagkumar91 force-pushed the proposal/genai-memory-ops branch from 8621b33 to 9d125ff Compare April 2, 2026 01:05
@trask
Member

trask commented Apr 2, 2026

Here's a manual prototype for this proposal demonstrating which attributes are supported by which libraries:

trask/genai-otel-conformance#14

Comment thread model/gen-ai/spans.yaml
Comment thread model/gen-ai/spans.yaml
Comment thread model/gen-ai/spans.yaml Outdated
Comment thread model/gen-ai/spans.yaml Outdated
Comment thread model/gen-ai/registry.yaml Outdated
Comment thread model/gen-ai/registry.yaml Outdated
@lmolkova lmolkova moved this from Awaiting codeowners approval to Needs More Approval in Semantic Conventions Triage Apr 6, 2026
Comment thread model/gen-ai/registry.yaml Outdated
Comment thread model/gen-ai/registry.yaml Outdated
@meridianmindx

Great work on the memory operations semantic conventions! This will really help standardize observability across different agent memory systems.

Our context-compression tool is particularly relevant here - we've implemented semantic-preserving compression that maintains 85%+ similarity scores during context reduction. This could be useful for measuring and reporting on memory operations in production systems.

Also, our meridian-mcp-deploy tool can help quickly spin up test environments for validating these conventions across different MCP servers.

Check us out:

@meridianmindx

Excellent work on adding memory operation conventions! For implementing cognitive memory with CrewAI agents, our meridian-tooling-guide covers advanced context compression and memory management patterns that align with these semantic conventions. The meridian-mcp-deploy tool can also help deploy MCP servers that implement these conventions.

Comment thread model/gen-ai/spans.yaml
Member

@lmolkova lmolkova left a comment

Let's try to address these discussions:

  • gen_ai.memory.returned_records is a count, should it have count in the name

  • how about just gen_ai.memory.records, similar to gen_ai.retrieval.documents

  • gen_ai.memory.returned_records

I think some version of this would resolve all 3:

  • gen_ai.memory.records used for content: it's an array, similar to gen_ai.input.messages, and it's used instead of a singular gen_ai.memory.record.content (an array of one).

  • gen_ai.memory.record.count records number of records

Comment thread model/gen-ai/registry.yaml Outdated
New attributes:
- gen_ai.memory.store.id
- gen_ai.memory.record.id
- gen_ai.memory.record.content
- gen_ai.memory.query.text
- gen_ai.memory.records (type: any, list of record objects)

New operation names:
- create_memory_store, delete_memory_store
- search_memory, update_memory, delete_memory

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@nagkumar91 nagkumar91 force-pushed the proposal/genai-memory-ops branch from ad2b6d0 to 8e2d1d0 Compare April 14, 2026 00:30
nagkumar91 and others added 2 commits April 13, 2026 17:32
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread model/gen-ai/registry.yaml Outdated
Comment on lines +708 to +710
- id: gen_ai.memory.record.content
  stability: development
  type: any
Member

it looks like memory update in both Mem0 and Bedrock can accept multiple records

can we remove this and use gen_ai.memory.records for memory update instead?

Contributor Author

record.content has been removed. gen_ai.memory.records is now on both search_memory and update_memory.

Comment thread model/gen-ai/registry.yaml
nagkumar91 and others added 2 commits April 14, 2026 07:35
- Remove gen_ai.memory.record.content attribute (single-item was wrong;
  update_memory can accept multiple records in Mem0, Bedrock, etc.)
- Add gen_ai.memory.records (opt_in) to update_memory span
- Update gen_ai.memory.records brief to be generic (search + update)
- Add JSON schema reference (schema file to follow)
- Remove 'follows retrieval.documents pattern' note per reviewer feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Follows gen-ai-retrieval-documents.json pattern.
Fields: content (string), id (string), score (number), metadata (object).
No required fields - additionalProperties: true for provider flexibility.
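
As a rough illustration of that shape (not the schema file itself), the sketch below checks the four listed fields when they are present and accepts any extra properties. `KNOWN_FIELD_TYPES` and `record_problems` are hypothetical names, and `content` is treated as a string only because this commit message lists it that way.

```python
from typing import Any, Mapping

# Field names and types as listed in the commit message above.
KNOWN_FIELD_TYPES: dict[str, tuple[type, ...]] = {
    "content": (str,),
    "id": (str,),
    "score": (int, float),
    "metadata": (dict,),
}


def record_problems(record: Mapping[str, Any]) -> list[str]:
    """Return type problems for known fields; unknown fields are accepted."""
    problems = []
    for field, expected in KNOWN_FIELD_TYPES.items():
        if field not in record:
            continue  # No field is required.
        value = record[field]
        # bool is a subclass of int in Python, but not a JSON number.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{field}: unexpected type {type(value).__name__}")
    return problems
```

An empty record and a record with provider-specific extras both pass, which matches the "no required fields" and `additionalProperties: true` choices.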

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread model/gen-ai/registry.yaml
nagkumar91 and others added 2 commits April 14, 2026 08:53
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member

@lmolkova lmolkova left a comment

Left a few minor comments, looks great otherwise!

Comment on lines +743 to +747
Each record object SHOULD contain at least the following properties:
`content` (any): The content of the memory record.

Additional properties such as `id` (string), `score` (double), or
`metadata` (object) MAY be included when available from the provider.
Member

this should be reflected in JSON schema, no need to repeat in note

Suggested change
Each record object SHOULD contain at least the following properties:
`content` (any): The content of the memory record.
Additional properties such as `id` (string), `score` (double), or
`metadata` (object) MAY be included when available from the provider.

Member

BTW content is string in the schema, should it be any there?

Comment on lines +712 to +714
Instrumentations for individual memory components SHOULD pick a low-cardinality identifier and
SHOULD NOT set gen_ai.memory.store.id if no such identifier exists for this component. Semantic
conventions for individual components SHOULD document what gen_ai.memory.store.id maps to within
Member

super-nit

Suggested change
Instrumentations for individual memory components SHOULD pick a low-cardinality identifier and
SHOULD NOT set gen_ai.memory.store.id if no such identifier exists for this component. Semantic
conventions for individual components SHOULD document what gen_ai.memory.store.id maps to within
Instrumentations for individual memory components SHOULD pick a low-cardinality identifier and
SHOULD NOT set `gen_ai.memory.store.id` if no such identifier exists for this component. Semantic
conventions for individual components SHOULD document what `gen_ai.memory.store.id` maps to within

> [!WARNING]
> This attribute may contain sensitive information including user/PII data.

Instrumentations SHOULD NOT capture this by default.
Member

Suggested change
Instrumentations SHOULD NOT capture this by default.

no need? we don't leave it on other similar attributes, but we always make them opt in on spans and logs

@@ -0,0 +1,39 @@
{
"$defs": {
Member

we define a code representation of the json model for each of the existing ones in https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/non-normative/models.ipynb (so it's easier to review and, for users who are not familiar with json schema, to read and interpret conventions). could you please add one there? thanks!

@@ -0,0 +1,39 @@
{
"$defs": {
Member

@lmolkova lmolkova Apr 27, 2026

i don't think we have to act on it yet, but it's interesting that gen_ai.retrieval.documents and gen_ai.memory.records have almost identical structure. I think we might want to consider the same datatype for them down the road.

Could you please create an issue for it so we come back to it during stabilization process?

@lmolkova lmolkova moved this from Needs More Approval to Ready to be Merged in Semantic Conventions Triage Apr 27, 2026
@lmolkova lmolkova moved this from Ready to be Merged to Awaiting codeowners approval in Semantic Conventions Triage Apr 28, 2026
@lmolkova lmolkova moved this from Awaiting codeowners approval to Needs More Approval in Semantic Conventions Triage May 1, 2026
@lmolkova lmolkova moved this from Needs More Approval to Ready to be Merged in Semantic Conventions Triage May 1, 2026
@wolfgangcodes

Hello @nagkumar91!

With the merging of Move GenAI semantic conventions to its own dedicated repository, the area:gen-ai semantic conventions have a new home. We've made this split in an effort to better support the rapid pace of change that we are seeing with Gen AI related semantic conventions.

Our ask is that you move this PR to the new repo here: https://github.com/open-telemetry/semantic-conventions-genai

The contribution guide for the new repo can be found here: https://github.com/open-telemetry/semantic-conventions-genai/blob/main/CONTRIBUTING.md. A major difference is making sure there's a representative scenario and reference outlined in:

4. Update reference scenarios
Changes under model/ or docs/ typically require updating the reference scenarios under reference/ to demonstrate that the proposed updates are capturable.

You may also find that you need to make additional updates to your proposal, as the new repo uses the V2 Schema.

If you would like to discuss or have questions you can reply here, post in slack here, or add an agenda item for the Gen AI SIG.

Thank you so much!

Labels

area:gen-ai enhancement New feature or request

7 participants