[Obs AI] Supports for Logs AI Insight in ES|QL mode in Discover #258595
neptunian merged 6 commits into elastic:main
💛 Build succeeded, but was flaky
viduni94 left a comment
LGTM (CR only)
Left one comment
@@ -154,6 +166,7 @@ export function getObservabilityAgentBuilderAiInsightsRouteRepository(): ServerR
  plugins,
  index,
  id,
  fields,
Should we add a size/limit on fields or accept only a list of fields (LOG_DOCUMENT_FIELDS) to avoid sending a massive list of fields to the LLM?
I'm leaning towards no: I don't think more fields will hurt, and they can only help. I'm not sure why we limit the log fetch to specific fields to begin with; maybe we should remove that filter. CC @arturoliduena. I don't think size is going to be an issue, since a single doc is small enough to be trivial for an HTTP request.
I think the main concern here is controlling token usage rather than request size.
I agree that more fields can help the LLM context, it usually doesn’t hurt.
Maybe we don’t need a strict limit here, but at least some maxFields or something.
It's one filtered document so I don't think we should be too concerned about it. I think we should be more concerned about the level of trace data we are fetching and injecting into the prompt:
const { traces } = await getTraces({
  core,
  plugins,
  logger,
  esClient,
  index,
  start: windowStart,
  end: windowEnd,
  kqlFilter: `_id: ${id}`,
  maxTraces: 10,
  maxDocsPerTrace: 100,
});
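For illustration, the `maxFields` idea raised in this thread could look roughly like the sketch below. It is not code from the PR: `capFields`, `PRIORITY_FIELDS`, and the default cap of 50 are hypothetical names and values. The priority list reuses the eight fields that the PR description says `getLogDocumentById` retrieves, so those survive the cap first.

```typescript
// Hypothetical helper, not part of the PR. Names and the default cap are assumptions.
type LogFields = Record<string, unknown>;

// The eight fields the PR description lists for getLogDocumentById.
const PRIORITY_FIELDS: string[] = [
  '@timestamp',
  'message',
  'log.level',
  'service.name',
  'trace.id',
  'span.id',
  'http.response.status_code',
  'error.exception.message',
];

function capFields(fields: LogFields, maxFields = 50): LogFields {
  // Priority fields that are actually present go first so the cap never drops them
  // in favour of less relevant fields.
  const priority = PRIORITY_FIELDS.filter((name) => name in fields);
  const rest = Object.keys(fields).filter((name) => !PRIORITY_FIELDS.includes(name));
  const kept = [...priority, ...rest].slice(0, maxFields);
  return Object.fromEntries(kept.map((name) => [name, fields[name]]));
}
```

A cap like this bounds the number of fields injected into the prompt without restricting the request to a fixed allowlist.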
logEntry = Object.fromEntries(
  Object.entries(fields ?? {}).filter(([, v]) => v != null)
Could we filter out null on the client side?
It is. Maybe I don't need the server-side one, but I put it in both places. https://github.com/elastic/kibana/pull/258595/changes#diff-55daf8fc97ac45e642ebaa371c2f8b29f77836e3aa41a74b681e3ff7eea9e881R47
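As a standalone sketch of what the filter in the diff does (the helper name `dropNullish` and the sample values are assumptions, not code from the PR): the loose comparison `v != null` removes both `null` and `undefined`, while falsy but meaningful values such as `0`, `''`, and `false` are kept.

```typescript
// Illustrative only: a standalone version of the filter shown in the diff above.
function dropNullish(fields?: Record<string, unknown>): Record<string, unknown> {
  // `v != null` is false for both null and undefined, so those entries are dropped.
  return Object.fromEntries(Object.entries(fields ?? {}).filter(([, v]) => v != null));
}

const cleaned = dropNullish({
  message: 'timeout',
  'trace.id': null,
  'span.id': undefined,
  'http.response.status_code': 0,
});
// cleaned is { message: 'timeout', 'http.response.status_code': 0 }
```

Because the function is idempotent, running it on both the client and the server is redundant but harmless.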
if (hasDocIdentity) {
  attachments.push({
    type: OBSERVABILITY_LOG_ATTACHMENT_TYPE_ID,
    data: { index, id },
  });
}
When hasDocIdentity is false and hasFields is true, should we add an attachment:
{
  type: OBSERVABILITY_LOG_ATTACHMENT_TYPE_ID,
  data: { fields },
}
I guess I don't see the point since the whole document is already in the conversation. What would it give us?
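The outcome of this thread can be sketched as follows. This is an illustration, not the PR's code: `buildAttachments` and the `Attachment` type are hypothetical, and the string assigned to `OBSERVABILITY_LOG_ATTACHMENT_TYPE_ID` is a placeholder because the real value is not shown here.

```typescript
// Placeholder value: the real constant's value is not shown in this thread.
const OBSERVABILITY_LOG_ATTACHMENT_TYPE_ID = 'placeholder.log.attachment';

// Hypothetical attachment shape, for illustration only.
interface Attachment {
  type: string;
  data: Record<string, unknown>;
}

function buildAttachments(index?: string, id?: string): Attachment[] {
  const attachments: Attachment[] = [];
  // Only a document that can be identified by index and id gets a log attachment.
  if (index && id) {
    attachments.push({
      type: OBSERVABILITY_LOG_ATTACHMENT_TYPE_ID,
      data: { index, id },
    });
  }
  // In fields-only (ES|QL) mode nothing is attached; the fields are already
  // embedded in the conversation context.
  return attachments;
}
```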
Thanks, @neptunian, for fixing this.
Starting backport for target branches: 9.3 https://github.com/elastic/kibana/actions/runs/23506730912
💔 All backports failed
Manual backport
To create the backport manually run:
Questions? Please refer to the Backport tool documentation.
Friendly reminder: Looks like this PR hasn’t been backported yet.
…tic#258595)
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI.
Questions? Please refer to the Backport tool documentation.
…tic#258595)
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
(cherry picked from commit 6589017)
# Conflicts:
# x-pack/solutions/observability/plugins/observability_agent_builder/public/components/insights/log_ai_insight.tsx
# x-pack/solutions/observability/plugins/observability_agent_builder/server/routes/ai_insights/get_log_ai_insights.ts
# x-pack/solutions/observability/plugins/observability_agent_builder/server/routes/ai_insights/route.ts
# x-pack/solutions/observability/test/api_integration_deployment_agnostic/apis/observability_agent_builder/ai_insights/log.spec.ts
Looks like this PR has a backport PR but it still hasn't been merged. Please merge it ASAP to keep the branches relatively in sync.
…#258595) (#259780)
# Backport
This will backport the following commits from `main` to `9.3`:
- [[Obs AI] Supports for Logs AI Insight in ES|QL mode in Discover (#258595)](#258595)
### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Summary
Log AI Insight in Discover was unavailable in ES|QL mode because ES|QL doesn't return `_id` and `_index` metadata by default, which the component relied on to re-fetch the document from Elasticsearch. The ideal solution would be an ES-level setting to include metadata fields by default (under discussion with the ES team but not yet prioritized). As a workaround, this adds a fallback path that passes the log entry fields directly from the client when `_id`/`_index` aren't available.

This is feasible because the existing server-side fetch (`getLogDocumentById`) only retrieves 8 specific fields (`@timestamp`, `message`, `log.level`, `service.name`, `trace.id`, `span.id`, `http.response.status_code`, `error.exception.message`), all of which are already present in the default ES|QL query results (`FROM logs-*`). So for the majority of queries we send the same fields to the LLM in either mode and no context is lost.

**Limitations:** If a user writes an ES|QL query that drops fields (e.g. `FROM logs-* | KEEP message`), the LLM will have less context to work with compared to the KQL path, which always re-fetches the full set of fields by document ID. The insight will still render but may produce a less detailed analysis. In ES|QL mode we can also no longer pass the log document attachment, because there is no id. It's usually not needed, since the available fields (`logEntry`) are already embedded in the context.

The existing KQL/Lucene path (fetch by `_id`/`_index`) is unchanged.
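
The two paths described in the summary can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the PR's implementation: `chooseLogSource`, `LogInsightParams`, and `LogSource` are hypothetical names, and the actual fetch (`getLogDocumentById`) is left out because its signature is not shown here.

```typescript
type LogEntry = Record<string, unknown>;

// Hypothetical request shape: either a doc identity, client-provided fields, or both.
interface LogInsightParams {
  index?: string;
  id?: string;
  fields?: LogEntry;
}

type LogSource =
  | { kind: 'fetchById'; index: string; id: string }
  | { kind: 'clientFields'; logEntry: LogEntry }
  | { kind: 'none' };

function chooseLogSource({ index, id, fields }: LogInsightParams): LogSource {
  // KQL/Lucene path: the document identity is known, so re-fetch by id.
  if (index && id) {
    return { kind: 'fetchById', index, id };
  }
  // ES|QL fallback: use whatever fields the client sent, minus null/undefined values.
  const logEntry = Object.fromEntries(
    Object.entries(fields ?? {}).filter(([, v]) => v != null)
  );
  if (Object.keys(logEntry).length > 0) {
    return { kind: 'clientFields', logEntry };
  }
  return { kind: 'none' };
}
```

With a query such as `FROM logs-* | KEEP message`, the `clientFields` branch would carry only `message`, which is the reduced-context case the Limitations paragraph describes.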