Original file line number Diff line number Diff line change
@@ -1,29 +1,39 @@
### Tracing LangChain Retrievers, LLMs, Chains, and Tools using Elastic APM and LangSmith
# LangChain Tracers

This document describes how to trace LangChain retrievers, LLMs, chains, and tools using Elastic APM and LangSmith.
This package provides tracers for monitoring and debugging LangChain retrievers, LLMs, chains, and tools within Kibana.

If the `assistantModelEvaluation` experimental feature flag is enabled, and an APM server is configured, messages that have a corresponding trace will have an additional `View APM trace` action in the message title bar:
## Available Tracers

<p align="center">
<img width="500" src="https://github.com/elastic/kibana/assets/2946766/e0b372ee-139a-4eed-8b09-f01dd88c72b0" />
</p>
| Tracer | Purpose | Configuration |
|--------|---------|---------------|
| `APMTracer` | Elastic APM integration for distributed tracing | Kibana APM settings |
| `TelemetryTracer` | Event-based telemetry for usage analytics | Analytics service setup |
| `LangChainTracer` (LangSmith) | LangSmith integration for LLM observability | Environment variables or session storage |

Viewing the trace you can see a breakdown of the time spent in each retriever, llm, chain, and tool:
<p align="center">
<img width="500" src="https://github.com/elastic/kibana/assets/2946766/f7cbd4bc-207c-4c88-a032-70a8de4f9b9a" />
</p>
## APMTracer

The Evaluation interface has been updated to support adding additional metadata like `Project Name`, `Run Name`, and pulling test datasets from LangSmith. Predictions can now also be run without having to run an Evaluation, so datasets can quickly be run for manual analysis.
The `APMTracer` integrates with Elastic APM to provide distributed tracing of LangChain operations. It creates spans for retrievers, LLMs, chains, and tools, allowing you to visualize the execution flow in APM.

<p align="center">
<img width="500" src="https://github.com/elastic/kibana/assets/2946766/acebf719-29fd-4fcc-aef1-99fd00ca800a" />
</p>
### Usage

```typescript
import { APMTracer } from '@kbn/langchain/server/tracers';

<p align="center">
<img width="500" src="https://github.com/elastic/kibana/assets/2946766/7081d993-cbe0-4465-a734-ff9be14d7d0d" />
</p>
const tracer = new APMTracer(
  { projectName: 'my-project', exampleId: 'optional-example-id' },
  logger
);

// Pass to LangChain callbacks
const result = await chain.invoke(input, { callbacks: [tracer] });
```

### Traced Operations

- `onRetrieverStart/End/Error` - Document retrieval operations
- `onLLMStart/End/Error` - LLM invocations
- `onChainStart/End/Error` - Chain executions
- `onToolStart/End/Error` - Tool calls
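
The start/end/error pairing above is what makes a per-operation time breakdown possible. The following is a minimal sketch of that idea only, not the `APMTracer` implementation: `TimingTracer`, `OperationKind`, and `SpanRecord` are hypothetical names, and the hooks are simplified compared with the real LangChain callback signatures.

```typescript
// Hypothetical stand-in for a tracer; none of these names come from @kbn/langchain.
type OperationKind = 'retriever' | 'llm' | 'chain' | 'tool';
type Outcome = 'success' | 'error';

interface SpanRecord {
  kind: OperationKind;
  runId: string;
  durationMs: number;
  outcome: Outcome;
}

class TimingTracer {
  // Spans that have started but not yet ended, keyed by run ID.
  private readonly open = new Map<string, { kind: OperationKind; startedAt: number }>();
  // Completed spans, in the order they were closed.
  public readonly finished: SpanRecord[] = [];
  private readonly now: () => number;

  // The clock is injectable so the example can be exercised deterministically.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  onStart(kind: OperationKind, runId: string): void {
    this.open.set(runId, { kind, startedAt: this.now() });
  }

  onEnd(runId: string): void {
    this.close(runId, 'success');
  }

  onError(runId: string): void {
    this.close(runId, 'error');
  }

  private close(runId: string, outcome: Outcome): void {
    const span = this.open.get(runId);
    if (span === undefined) return; // Unknown run ID: nothing to close.
    this.open.delete(runId);
    this.finished.push({
      kind: span.kind,
      runId,
      durationMs: this.now() - span.startedAt,
      outcome,
    });
  }
}
```

Keying open spans by run ID lets nested operations (a tool call inside a chain, for example) be timed independently of one another.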

### Configuring APM

@@ -33,7 +43,7 @@ First, enable the `assistantModelEvaluation` experimental feature flag by adding
xpack.securitySolution.enableExperimental: [ 'assistantModelEvaluation' ]
```

Next, you'll need an APM server to collect the traces. You can either [follow the documentation for installing](https://www.elastic.co/guide/en/apm/guide/current/installing.html) the released artifact, or [run from source](https://github.com/elastic/apm-server#apm-server-development) and set it up using the [quickstart guide](https://www.elastic.co/guide/en/apm/guide/current/apm-quick-start.html). Be sure to install the APM Server integration so that the necessary indices are created; in dev environments, you must click `Display beta integrations` on the main Integrations page to ensure the latest package is installed. Once your APM server is running, add your APM server configuration to your `kibana.dev.yml` as well, using the following:

```
# APM
@@ -54,7 +64,95 @@ If using a remote APM Server/Kibana instance for viewing traces, you can set the
> If you are connecting to a cloud APM server (like our [ai-assistant apm deployment](https://ai-assistant-apm-do-not-delete.kb.us-central1.gcp.cloud.es.io/)), follow [these steps](https://www.elastic.co/guide/en/apm/guide/current/api-key.html#create-an-api-key) to create an API key, then set it via `apiKey` and set your `serverUrl` as shown in the APM Integration details within Fleet.

> [!NOTE]
> If you're an Elastic developer running Kibana from source, you can just enable APM as above, and _not_ include a `serverUrl`, and your traces will be sent to the https://kibana-cloud-apm.elastic.dev cluster.

### Viewing Traces

If the `assistantModelEvaluation` experimental feature flag is enabled, and an APM server is configured, messages that have a corresponding trace will have an additional `View APM trace` action in the message title bar:

<p align="center">
<img width="500" src="https://github.com/elastic/kibana/assets/2946766/e0b372ee-139a-4eed-8b09-f01dd88c72b0" />
</p>

When viewing the trace, you can see a breakdown of the time spent in each retriever, LLM, chain, and tool:

<p align="center">
<img width="500" src="https://github.com/elastic/kibana/assets/2946766/f7cbd4bc-207c-4c88-a032-70a8de4f9b9a" />
</p>

## TelemetryTracer

The `TelemetryTracer` provides event-based telemetry for tracking LangChain usage analytics. It reports events to Kibana's analytics service for monitoring assistant interactions.

### Usage

```typescript
import { TelemetryTracer } from '@kbn/langchain/server/tracers';

const tracer = new TelemetryTracer(
  {
    elasticTools: ['tool1', 'tool2'], // List of known Elastic tool names
    telemetry: analyticsService,
    telemetryParams: {
      assistantStreamingEnabled: true,
      actionTypeId: '.gen-ai',
      isEnabledKnowledgeBase: true,
      eventType: 'invoke_assistant',
      model: 'gpt-4',
    },
  },
  logger
);

// Pass to LangChain callbacks
const result = await chain.invoke(input, { callbacks: [tracer] });
```

### Tracked Events

- **`invoke_assistant`** - Emitted on chain completion with:
  - Duration in milliseconds
  - Tools invoked (with counts)
  - Model and configuration details
  - Knowledge base status

- **`invoke_assistant_error`** - Emitted on tool errors with:
  - Error message and location
  - Action type and model info
  - Configuration state

### Telemetry Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `assistantStreamingEnabled` | `boolean` | Whether streaming is enabled |
| `actionTypeId` | `string` | The connector action type ID |
| `isEnabledKnowledgeBase` | `boolean` | Whether knowledge base is active |
| `eventType` | `string` | The telemetry event type to report |
| `model` | `string` (optional) | The LLM model being used |
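
To make the event contents concrete, here is a rough sketch of how an `invoke_assistant` payload could be assembled from these parameters. It is an illustration under stated assumptions, not the `TelemetryTracer` implementation: `TelemetryParamsSketch` only mirrors the table above (the exported `TelemetryParams` type may differ), and `InvokeEventSketch`, `countToolInvocations`, `buildInvokeEvent`, the `durationMs` and `toolsInvoked` field names, and the `CustomTool` bucket for tools outside `elasticTools` are all hypothetical.

```typescript
// Shape implied by the parameters table; the real `TelemetryParams` type may differ.
interface TelemetryParamsSketch {
  assistantStreamingEnabled: boolean;
  actionTypeId: string;
  isEnabledKnowledgeBase: boolean;
  eventType: string;
  model?: string;
}

// Hypothetical payload shape, for illustration only.
interface InvokeEventSketch extends TelemetryParamsSketch {
  durationMs: number;
  toolsInvoked: Record<string, number>;
}

// Counts invocations per tool name. Assumption: tools that are not in the
// known Elastic list are grouped under a single 'CustomTool' key.
function countToolInvocations(
  invokedTools: string[],
  elasticTools: string[]
): Record<string, number> {
  const known = new Set(elasticTools);
  const counts: Record<string, number> = {};
  for (const name of invokedTools) {
    const key = known.has(name) ? name : 'CustomTool';
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

// Combines the static parameters with the measured duration and tool counts.
function buildInvokeEvent(
  params: TelemetryParamsSketch,
  startedAtMs: number,
  finishedAtMs: number,
  invokedTools: string[],
  elasticTools: string[]
): InvokeEventSketch {
  return {
    ...params,
    durationMs: finishedAtMs - startedAtMs,
    toolsInvoked: countToolInvocations(invokedTools, elasticTools),
  };
}
```

Grouping unknown tool names under one key is one way to keep arbitrary user-defined names out of analytics; whether the actual tracer does this is not stated in this document.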

## LangSmith Tracer

The LangSmith tracer integrates with [LangSmith](https://docs.smith.langchain.com/) for LLM observability and testing.

### Usage

```typescript
import { getLangSmithTracer, isLangSmithEnabled } from '@kbn/langchain/server/tracers';

// Check if LangSmith is enabled
if (isLangSmithEnabled()) {
  const tracers = getLangSmithTracer({
    apiKey: 'your-api-key', // Optional, reads from env if not provided
    projectName: 'my-project',
    exampleId: 'optional-dataset-example-id',
    logger,
  });

  // Pass to LangChain callbacks
  const result = await chain.invoke(input, { callbacks: tracers });
}
```

### Configuring LangSmith

@@ -70,3 +168,30 @@ export LANGCHAIN_PROJECT="8.12 ESQL Query Generation"

To configure LangSmith in cloud or other environments where you cannot set environment variables, set the `LangSmith Project` and `LangSmith API Key` values in session storage, as outlined in https://github.com/elastic/kibana/pull/180227.

### Dataset Integration

The Evaluation interface supports additional metadata such as `Project Name` and `Run Name`, and can pull test datasets from LangSmith. Predictions can also be run without running an Evaluation, so datasets can be run quickly for manual analysis.

<p align="center">
<img width="500" src="https://github.com/elastic/kibana/assets/2946766/acebf719-29fd-4fcc-aef1-99fd00ca800a" />
</p>

<p align="center">
<img width="500" src="https://github.com/elastic/kibana/assets/2946766/7081d993-cbe0-4465-a734-ff9be14d7d0d" />
</p>

## Combining Multiple Tracers

You can use multiple tracers simultaneously by passing them all to the callbacks array:

```typescript
import { APMTracer, TelemetryTracer, getLangSmithTracer } from '@kbn/langchain/server/tracers';

const tracers = [
  new APMTracer({ projectName: 'my-project' }, logger),
  new TelemetryTracer({ elasticTools, telemetry, telemetryParams }, logger),
  ...getLangSmithTracer({ apiKey, projectName, logger }),
];

const result = await chain.invoke(input, { callbacks: tracers });
```
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

export { APMTracer } from './apm';
export { TelemetryTracer } from './telemetry';
export type { TelemetryParams, LangChainTracerFields } from './telemetry';
export { getLangSmithTracer, isLangSmithEnabled } from './langsmith';
Original file line number Diff line number Diff line change
@@ -6,3 +6,4 @@
*/

export { TelemetryTracer } from './telemetry_tracer';
export type { TelemetryParams, LangChainTracerFields } from './telemetry_tracer';
Original file line number Diff line number Diff line change
@@ -152,21 +152,26 @@ If plausible organizational or product-specific knowledge is involved, default t

Precedence sequence (stop at first applicable):
1. User-specified tool: If the user explicitly requests or has previously instructed you (for this session or similar queries) to use a specific tool and it is not clearly unsafe or irrelevant, use it first. If unsuitable or unavailable, skip and continue.
2. Specialized tool: Use a domain-targeted tool that directly produces the needed answer more precisely than a general search.
${
experimentalFeatures.skills
? ` 2. Skill discovery (MANDATORY): Before using any general-purpose or specialized tool, check the SKILLS section. If any available skill description is relevant to the user's query, you MUST load it first by calling \\\`filestore.read\\\` with the skill's path. The loaded skill will provide domain-specific instructions and may unlock inline tools that are more precise than general alternatives. Only proceed to the next steps after loading the relevant skill.
3. Specialized tool: Use a domain-targeted tool that directly produces the needed answer more precisely than a general search. Prefer inline tools loaded from a skill over general-purpose tools.`
: ` 2. Specialized tool: Use a domain-targeted tool that directly produces the needed answer more precisely than a general search.`
}
Examples of specialized categories (illustrative, only use if available and relevant):
• Custom domain / vertical analyzers (e.g., detection engineering, incident triage, attack pattern classifiers).
• External system connectors (e.g., SaaS platform search) or federated knowledge base connectors (e.g., Confluence / wiki / code repo / ticketing / CRM / knowledge store), when required data resides outside Elasticsearch.
• Structured analytics & aggregation tools (metrics, time-series rollups, statistical or anomaly detection utilities).
• Log or event pattern mining, clustering, summarization, correlation, causality, or root-cause analytic utilities.
3. General search fallback: If no user-specified or specialized tool applies, call \`${
${experimentalFeatures.skills ? '4' : '3'}. General search fallback: If no user-specified${experimentalFeatures.skills ? ', skill,' : ''} or specialized tool applies, call \`${
tools.search
}\` (if available). **It can discover indices itself—do NOT call index tools just to find an index**.
4. Index inspection fallback: Use \`${tools.indexExplorer}\` or \`${
${experimentalFeatures.skills ? '5' : '4'}. Index inspection fallback: Use \`${tools.indexExplorer}\` or \`${
tools.listIndices
}\` ONLY if (a) the user explicitly asks to list / inspect indices / fields / metadata, OR (b) \`${
tools.search
}\` is unavailable and structural discovery is necessary.
5. Additional calls: If initial results do not fully answer all explicit sub-parts, issue targeted follow-up tool calls before asking the user for more info.
${experimentalFeatures.skills ? '6' : '5'}. Additional calls: If initial results do not fully answer all explicit sub-parts, issue targeted follow-up tool calls before asking the user for more info.
Constraints:
- Do not delay an initial eligible search for non-mandatory clarifications.
- **Ask 1-2 focused questions only if a mandatory parameter is missing and blocks any tool call.**
@@ -180,9 +185,13 @@ Constraints:
- If the query matches a category for bypassing research, your decision is made. Your only task is to respond in plain text to initiate the handover. Do not proceed to the next steps.
Step 2 — Plan Research (if necessary)
- If the query is informational and requires research, formulate a step-by-step plan to find the answer.
- Parse user intent, sub-questions, entities, constraints, etc.
- Parse user intent, sub-questions, entities, constraints, etc.${
experimentalFeatures.skills
? `\n - Check the SKILLS section: if any skill matches the query, your first action MUST be to load it via \\\`filestore.read\\\`.`
: ''
}
Step 3 — Execute & Iterate
- Apply the Tool Selection Policy to execute the first step of your plan.
- Apply the Tool Selection Policy to execute the first step of your plan${experimentalFeatures.skills ? ' (skill loading takes priority)' : ''}.
- After each tool call, review the gathered information.
- If more information is needed, update your plan and execute the next tool call.
Step 4 — Conclude Research
Original file line number Diff line number Diff line change
@@ -28,7 +28,7 @@ import type { ProcessedConversation } from './prepare_conversation';

export const selectTools = async ({
conversation,
previousDynamicToolIds,
previousDynamicToolIds = [],
skills,
request,
toolProvider,
@@ -37,19 +37,19 @@ export const selectTools = async ({
filestore,
spaceId,
runner,
experimentalFeatures,
experimentalFeatures = { filestore: false, skills: false },
}: {
conversation: ProcessedConversation;
previousDynamicToolIds: string[];
skills: SkillsService;
previousDynamicToolIds?: string[];
skills?: SkillsService;
request: KibanaRequest;
toolProvider: ToolProvider;
attachmentsService: AttachmentsService;
filestore: IFileStore;
filestore?: IFileStore;
agentConfiguration: AgentConfiguration;
spaceId: string;
runner: ScopedRunner;
experimentalFeatures: ExperimentalFeatures;
experimentalFeatures?: ExperimentalFeatures;
}) => {
const formatContext: AttachmentFormatContext = { request, spaceId };

@@ -74,7 +74,7 @@ export const selectTools = async ({
});

// create tools for filesystem (only if feature is enabled)
const filestoreTools = experimentalFeatures.filestore
const filestoreTools = experimentalFeatures.filestore && filestore
? getStoreTools({ filestore }).map((tool) => builtinToolToExecutable({ tool, runner }))
: [];

@@ -105,17 +105,19 @@ export const selectTools = async ({
request,
});

const dynamicInlineTools = (
await Promise.all(
skills
.list()
.filter((skill) => skill.getInlineTools !== undefined)
.map((skill) => skill.getInlineTools!())
const dynamicInlineTools = skills
? (
await Promise.all(
skills
.list()
.filter((skill) => skill.getInlineTools !== undefined)
.map((skill) => skill.getInlineTools!())
)
)
)
.flat()
.filter((tool) => previousDynamicToolIds.includes(tool.id))
.map((tool) => skills.convertSkillTool(tool));
.flat()
.filter((tool) => previousDynamicToolIds.includes(tool.id))
.map((tool) => skills.convertSkillTool(tool))
: [];

return {
staticTools: [...dedupedStaticTools.values()],
Original file line number Diff line number Diff line change
@@ -28,9 +28,10 @@ export const getSkillsInstructions = async ({
: [
'## SKILLS',
[
'Load a skill using filestore tools to get detailed instructions for a specific task.',
'Skills provide specialized knowledge and best practices for specific tasks.',
"Use them when a task matches a skill's description or the skill is useful for the task.",
'Before using any general-purpose tool or model knowledge, you MUST first check the available skills below.',
'If ANY skill description matches or is relevant to the user query, you MUST load it by calling `filestore.read` with the skill path BEFORE calling any other tool.',
'Skills provide specialized knowledge, domain-specific instructions, and access to inline tools that produce more accurate results than general-purpose alternatives.',
'Skipping a relevant skill and going directly to general tools (e.g., search, execute_esql) is a protocol violation.',
'Only the skills listed here are available:',
].join(' '),
generateXmlTree({