Blog for AI Agent Observability #6390
base: main
Conversation
Force-pushed from 07f18f7 to 213c221.
Force-pushed from e9d1487 to 7d42ebf.
Quick pass through, looks fairly good so far
> features.
> - Risk of version lock-in if the framework’s OpenTelemetry dependencies lag
>   behind upstream updates.
> - Less flexibility for advanced users who prefer custom instrumentation.
Is this actually true? I've not heard of baking in instrumentation making it difficult for users who also wish to augment autoinstrumentation with custom instrumentation, nor those who wish to just turn off autoinstrumentation and do it all themselves.
For baked-in, the customer will probably always need to wait for a full release of the agent framework to get customized instrumentation; with OTel it is relatively easier, since only the instrumentation package needs an update and the customer can pick that up independently. Does this sound reasonable?
Thanks @cartermp and @TaoChenOSU, all comments are now addressed.
> behind upstream updates.
> - Less flexibility for advanced users who prefer custom instrumentation.
> - Some best practices to follow if you consider this approach:
>   - Provide a configuration setting that lets users easily enable or disable
Is it just users that need to be able to easily enable/disable the telemetry?
Isn't it other instrumentation packages, also, that need to be able to suppress the instrumentation (in order to prevent duplicative instrumentation)?
Do we want to recommend a uniform approach to this setting?
For example, it looks like one approach used is to have a per-instrumentation/per-library key in the OTel context (e.g. "suppress_instrumentation_${library_name}") and to check for the presence of that in the context to avoid instrumentation. (This way, calling libraries that generate duplicate information can set the key).
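The per-library suppression-key pattern described in that comment could be sketched like this. This is a minimal illustration using a plain `contextvars.ContextVar`; real instrumentations would store the flag in the OpenTelemetry Context instead, and the key name `suppress_instrumentation_mylib` is hypothetical:

```python
import contextvars

# Hypothetical per-library suppression flag. In a real instrumentation this
# key would live in the OpenTelemetry Context so that *other* instrumentation
# packages can set it too, not just end users.
_suppress = contextvars.ContextVar("suppress_instrumentation_mylib", default=False)

spans_started = []  # stands in for real span creation, so the effect is visible


def instrumented_call(name, work):
    if _suppress.get():
        # A caller already instrumented this operation: stay silent to
        # avoid emitting duplicate telemetry.
        return work()
    spans_started.append(name)      # "start a span" for this call
    token = _suppress.set(True)     # silence any nested instrumented calls
    try:
        return work()
    finally:
        _suppress.reset(token)


# The outer call is instrumented; the nested one sees the key and is skipped.
result = instrumented_call("outer", lambda: instrumented_call("inner", lambda: 42))
```

Because the flag is context-scoped, a calling library that would otherwise produce duplicate spans can set the same key before invoking the instrumented code.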
@michaelsafyan this is the baked-in option. If you take a look at CrewAI at https://docs.crewai.com/telemetry, you will see it has a configuration parameter named OTEL_SDK_DISABLED.
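`OTEL_SDK_DISABLED` mirrors the standard OpenTelemetry environment variable of the same name. A framework adopting that convention might gate its baked-in telemetry roughly like this (a sketch; `setup_telemetry` is a hypothetical framework hook, not a real CrewAI API):

```python
import os


def telemetry_enabled() -> bool:
    # OTEL_SDK_DISABLED=true is the standard OpenTelemetry switch for
    # turning the SDK off entirely; treat any other value as "enabled".
    return os.environ.get("OTEL_SDK_DISABLED", "false").strip().lower() != "true"


def setup_telemetry():
    # Hypothetical framework hook: only wire up tracers/exporters when
    # the user has not opted out.
    if not telemetry_enabled():
        return None                    # skip all telemetry setup
    return "telemetry-initialized"     # placeholder for real SDK wiring


os.environ["OTEL_SDK_DISABLED"] = "true"
disabled_result = setup_telemetry()    # user opted out

os.environ["OTEL_SDK_DISABLED"] = "false"
enabled_result = setup_telemetry()     # default path
```

Using the standard variable name (rather than a framework-specific one) lets operators disable telemetry the same way across every OpenTelemetry-aware component.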
> - Pros
>   - You can take on the maintenance overhead of keeping the instrumentation for
>     telemetry up-to-date.
>   - Simplifies adoption for users unfamiliar with OpenTelemetry configuration.
I'm not sure that this is actually true given the zero-configuration mechanisms in OTel for setup.
Customers still need to install OTel instrumentation packages for different LLM providers, and sometimes they may need to configure an OTel Collector to integrate with third-party observability platforms.
With the baked-in option, the agent framework may ship a built-in UI for observability that does not require any configuration. Hope this helps.
> - Adds bloat to the framework for users who do not need observability
>   features.
> - Risk of version lock-in if the framework’s OpenTelemetry dependencies lag
>   behind upstream updates.
A related con/risk to call out:
- You may not get feedback/review from OTel contributors familiar with current Semantic Conventions
- Your instrumentation may lag with respect to best practices/conventions (not just the version of the OTel library dependencies).
Done
> [OpenTelemetry registry](/ecosystem/registry/) if you choose this path.
> - As a developer of an agent application, you may want to choose an agent
>   framework with baked-in instrumentation if you prefer…
>   - Minimal dependencies on external packages in your agent app code
Is the number of dependencies in the long run truly that different?
Yes, I think so. With OTel, we need to install instrumentation packages and an OTel Collector, but with baked-in there is no need to install those dependencies manually.
> - As a developer of an agent application, you may want to choose an agent
>   framework with baked-in instrumentation if you prefer…
>   - Minimal dependencies on external packages in your agent app code
>   - Out-of-the-box observability without manual setup.
Isn't it still likely that manual setup may still be required? For example, even if the library includes instrumentation, it is not going to wire up OpenTelemetry to the appropriate backends.
Here the manual setup is mainly for the agent framework itself, like CrewAI or others. After the agent framework is installed, there is no need to set up observability manually, as it should be built into CrewAI or the agent framework itself. Hope this helps.
> - Minimal dependencies on external packages in your agent app code
> - Out-of-the-box observability without manual setup.
>
> #### Option 2: Instrumentation via OpenTelemetry contrib
I think Option 2 actually should be split up into:
"Option 2: External instrumentation"
- "Option 2a: External instrumentation in your own repository/package"
- "Option 2b: External instrumentation in an Open Telemetry-owned repository/package"
(Or maybe you just make "2a" and "2b" into 2 and 3).
Done
> As a developer of an agent framework, here are some pros and cons of this
> baked-in instrumentation:
>
> - Pros
Some additional benefits:
- More likely to leverage best practices around Semantic Conventions
- More likely to leverage best practices around zero-code instrumentation
Done
> - Allows users to mix and match contrib libraries for their specific needs
>   (e.g., cloud providers, LLM vendors).
> - Cons
>   - Users must manually install and configure contrib libraries, increasing
This is untrue. With zero-code setup, the instrumentation libraries can be auto-discovered.
See:
It is actually easier to leverage this with this approach, because `opentelemetry-bootstrap -a` will auto-install instrumentation libraries from this repo, but it won't auto-install other instrumentation packages.
removed
> - Users must manually install and configure contrib libraries, increasing
>   setup complexity.
> - Risk of fragmentation if users rely on incompatible or outdated contrib
>   packages.
Can you explain? How would they leverage "incompatible" ones? Won't that result in an error when attempting to install the dependencies?
Yes, but sometimes the install can succeed and there may still be runtime inconsistencies, incompatibilities, and maintenance issues when different users or frameworks depend on different versions of contributed (contrib) packages.
> setup complexity.
> - Risk of fragmentation if users rely on incompatible or outdated contrib
>   packages.
> - Less control over telemetry quality and coverage compared to baked-in
True, though:
- The quality is likely to be higher given the higher bar held in that repo for telemetry quality.
- There is still the ability to contribute to it in order to improve it.
I think the "less control" is more relevant in relation to velocity rather than in relation to quality.
good point, updated to below
Development velocity slows down when there are too many PRs in the OpenTelemetry review queue.
Thanks all for the comments, really appreciated!
Signed-off-by: Guangya Liu <[email protected]>
Co-authored-by: Sujay Solomon <[email protected]>
Here's a first pass at some copy edits to bring this in line with our style guide. I'll try to finish up tomorrow.
Also, there is a typo in the agent-agent-framework.png
file. "Framwork" should be "Framework". Thanks!
> cSpell:ignore: genai Guangya PydanticAI Sujay
> ---
>
> ## 2025: The Year of AI Agents
Suggested change:

```diff
- ## 2025: The Year of AI Agents
+ ## 2025: The year of AI agents
```
> ## 2025: The Year of AI Agents
>
> AI Agents are becoming the next big leap in artificial intelligence in 2025.
> From autonomous workflows to intelligent decision-making, AI Agents will power
Suggested change:

```diff
- From autonomous workflows to intelligent decision-making, AI Agents will power
+ From autonomous workflows to intelligent decision making, AI Agents will power
```
> AI Agents are becoming the next big leap in artificial intelligence in 2025.
> From autonomous workflows to intelligent decision-making, AI Agents will power
> numerous applications across industries. However, with this evolution comes the
> critical need for AI Agent Observability - especially when scaling these agents
Suggested change:

```diff
- critical need for AI Agent Observability - especially when scaling these agents
+ critical need for AI agent observability, especially when scaling these agents
```
> critical need for AI Agent Observability - especially when scaling these agents
> to meet enterprise needs. Without proper monitoring, tracing, and logging
> mechanisms, diagnosing issues, improving efficiency, and ensuring reliability in
> AI Agent-driven applications will be challenging.
Suggested change:

```diff
- AI Agent-driven applications will be challenging.
+ AI agent-driven applications will be challenging.
```
> mechanisms, diagnosing issues, improving efficiency, and ensuring reliability in
> AI Agent-driven applications will be challenging.
>
> ### What is an AI Agent
Suggested change:

```diff
- ### What is an AI Agent
+ ### What is an AI agent?
```
> It is crucial to distinguish between **AI Agent Application** and **AI Agent
> Frameworks**:
Suggested change:

```diff
- It is crucial to distinguish between **AI Agent Application** and **AI Agent
- Frameworks**:
+ It is crucial to distinguish between **AI agent applications** and **AI agent
+ frameworks**:
```
> 
>
> - **AI Agent application** refer to individual AI-driven entities that perform
Suggested change:

```diff
- - **AI Agent application** refer to individual AI-driven entities that perform
+ - **AI agent applications** refer to individual AI-driven entities that perform
```
> - **AI Agent Framework** provide the necessary infrastructure to develop,
>   manage, and deploy AI Agents often in a more streamlined way than building an
>   agent from scratch. Examples include
Suggested change:

```diff
- - **AI Agent Framework** provide the necessary infrastructure to develop,
-   manage, and deploy AI Agents often in a more streamlined way than building an
-   agent from scratch. Examples include
+ - **AI agent frameworks** provide the necessary infrastructure to develop,
+   manage, and deploy AI agents often in a more streamlined way than building an
+   agent from scratch. Examples include the following:
```
> [LangGraph](https://www.langchain.com/langgraph),
> [PydanticAI](https://ai.pydantic.dev/) and more.
>
> ### Establishing a Standardized Semantic Convention
Suggested change:

```diff
- ### Establishing a Standardized Semantic Convention
+ ### Establishing a standardized semantic convention
```
> Today, the
> [GenAI observability project](https://github.com/open-telemetry/community/blob/main/projects/gen-ai.md)
> within OpenTelemetry is actively working on defining semantic conventions to
> standardize AI Agent observability. This effort is primarily driven by:
Suggested change:

```diff
- standardize AI Agent observability. This effort is primarily driven by:
+ standardize AI agent observability. This effort is primarily driven by:
```
Fixed #6389
@lmolkova @solsu01 ^^