Added AI Insight evals #263561
Conversation
SrdjanLL
left a comment
Great work!
I just left some minor comments (mainly questions and a suggestion for avoiding a bespoke polling implementation).
```ts
const deadline = Date.now() + ALERT_POLL_TIMEOUT_MS;

await esClient.indices.refresh({ index: scenario.alertRule.alertsIndex });
while (Date.now() < deadline) {
```
I suggest using pRetry here for polling with exponential backoff, similar to how it's done here.
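A minimal sketch of the suggested pattern, using a hand-rolled stand-in for the `p-retry` library so the example is self-contained (the option names `retries`, `minTimeoutMs`, and `factor` mirror p-retry's `retries`/`minTimeout`/`factor`, but `pollWithBackoff` itself is a hypothetical helper, not the real implementation):

```typescript
// Poll by retrying a throwing function with exponential backoff.
// This mimics what p-retry provides out of the box; in the PR the
// callback would query the alerts index instead of this stub.
async function pollWithBackoff<T>(
  fn: () => Promise<T>,
  { retries = 5, minTimeoutMs = 100, factor = 2 } = {}
): Promise<T> {
  let delay = minTimeoutMs;
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn(); // success: stop polling
    } catch (err) {
      if (attempt >= retries) throw err; // out of attempts
      await new Promise((resolve) => setTimeout(resolve, delay));
      delay *= factor; // back off exponentially between attempts
    }
  }
}
```

With the real library, the same shape would be `pRetry(() => fetchAlert(), { retries: 10 })`, replacing the deadline-based `while` loop entirely.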
```ts
/**
 * The AI insight endpoints return SSE (Server-Sent Events) streams.
 * This parses the raw SSE text into the summary and context fields.
 */
function parseSseResponse(raw: unknown): AiInsightResponse {
```
So I assume this was the root cause of us not having AI Insights responses in the task-under-evaluation payloads?
Yes, that was exactly it :)
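For context, SSE parsing along these lines is what the helper above would do. This is a hedged sketch: the event names (`summary`, `context`) and the `AiInsightResponse` shape are assumptions for illustration, not the endpoint's confirmed wire format:

```typescript
// Assumed response shape (hypothetical; the real interface may differ).
interface AiInsightResponse {
  summary: string;
  context: string;
}

// Split the raw SSE text into lines and accumulate `data:` payloads
// under the most recent `event:` label.
function parseSseResponse(raw: string): AiInsightResponse {
  const result: AiInsightResponse = { summary: '', context: '' };
  let event = '';
  for (const line of raw.split('\n')) {
    if (line.startsWith('event:')) {
      event = line.slice('event:'.length).trim();
    } else if (line.startsWith('data:')) {
      const data = line.slice('data:'.length).trim();
      if (event === 'summary') result.summary += data;
      else if (event === 'context') result.context += data;
    }
  }
  return result;
}
```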
```ts
- Validate that the error is properly handled and does not impact payment processing for valid tokens.
- If no further errors occur, monitor for recurrence but no urgent action is required. If errors increase, investigate token validation logic and upstream authentication flows.`;

const PAYMENT_UNREACHABLE_ALERT_EXPECTED = `- Summary: An APM error count alert fired for the frontend service because the payment service is unreachable. The checkout flow fails with a gRPC Unavailable error ("name resolver error: produced zero addresses") when attempting to charge a card via the payment service. This is a connectivity or infrastructure failure, not an application code defect.
```
[Question] Just curious: did you tweak the expected responses for all insights based on our preferences/expectations, or are these actual responses from the AI Insight API?
Good question, @SrdjanLL! I took an answer from Claude Opus and tweaked the wording a bit.
Catch flakiness early (recommended): run the Flaky Test Runner against this PR before merging. This PR unskips a previously-flaky Scout test. Trigger a run with the Flaky Test Runner UI or post the trigger comment on the PR. Share feedback in the #appex-qa channel. (Posted via Macroscope — Flaky Test Runner nudge)
```ts
await kbnClient.request<void>({
  method: 'POST',
  path: `/internal/alerting/rule/${ruleId}/_run_soon`,
});
```
_run_soon sits inside the pRetry callback, so it fires on every poll iteration.
IIRC this will queue up rule runs, while we only need the rule to trigger once; the polling should just wait for the alert to appear.
I think it's worth moving this outside of the pRetry block.
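The suggested ordering could look like the sketch below. This is illustrative only: `runSoon` and `fetchAlertId` are hypothetical stand-ins for the `_run_soon` request and the alerts-index query, and the wait/attempt numbers are placeholders:

```typescript
// Fire the rule once, then poll for the alert without re-triggering it.
async function triggerOnceThenPoll(
  runSoon: () => Promise<void>,
  fetchAlertId: () => Promise<string | undefined>,
  maxAttempts = 5
): Promise<string> {
  await runSoon(); // single rule run, outside the retry loop
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const id = await fetchAlertId(); // only the query is retried
    if (id) return id;
    await new Promise((resolve) => setTimeout(resolve, 10)); // brief pause between polls
  }
  throw new Error('Alert not yet available');
}
```

The key point is that only the read side sits inside the retry loop, so repeated attempts never enqueue additional rule runs.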
Also, I noticed the CI is failing with:
```
Error: Alert not yet available
--
  76 | const alertDoc = alertsResponse.hits.hits[0];
  77 | if (!alertDoc) {
> 78 |   throw new Error('Alert not yet available');
     |   ^
  79 | }
  80 | return alertDoc._id as string;
  81 | },
```
Do you think that's just a polling error? When you run the snapshot replay manually (using the CLI), are you able to see the alert?
Thanks, @SrdjanLL, for the review and comments. I'm looking into why this can happen; locally it worked fine.
@yuliia-fryshko, it looks like you're updating the parameters for a rule type! Please review the guidelines for making additive changes to rule type parameters and determine if your changes require an intermediate release.
💛 Build succeeded, but was flaky
Failed CI Steps: Metrics [docs]
|
SrdjanLL
left a comment
The new scenarios LGTM (as long as they are 🟢 on CI 🙂)!
For visibility, c9dc50d removes a failing scenario where the alert wasn't triggering on CI, so we can land some of this work before @yuliia-fryshko's PTO. The removed scenario is tracked separately via https://github.com/elastic/obs-ai-team/issues/537, and I've added it to the current iteration.
Closes https://github.com/elastic/obs-ai-team/issues/533
Closes https://github.com/elastic/obs-ai-team/issues/536
Closes https://github.com/elastic/obs-ai-team/issues/534
Closes https://github.com/elastic/obs-ai-team/issues/535
This PR introduces an evaluation dataset along with corresponding tests for AI Insights across different scenarios.
Added:
These tests aim to improve coverage and ensure consistent evaluation across key AI Insights use cases.