

[Research] Possible performance improvements in Lens expressions #182151

Closed
markov00 opened this issue Apr 30, 2024 · 2 comments
Labels
research · Team:Visualizations (Visualization editors, elastic-charts and infrastructure) · technical debt (Improvement of the software architecture and operational architecture)

Comments

@markov00
Member

markov00 commented Apr 30, 2024

I will add here some of the findings related to performance bottlenecks I've found in the current Lens/SearchStrategy architecture, along with possible improvements.

Unnecessary multiple requests with the same esagg query

When a search is sent to ES and a response is received, the search service checks whether the request needs a post-flight request.

if (!this.hasPostFlightRequests()) {
  obs.next(this.postFlightTransform(response));
  obs.complete();
} else {
  // Treat the complete response as partial, then run the postFlightRequests.
  obs.next({
    ...this.postFlightTransform(response),
    isPartial: true,
    isRunning: true,
  });

If needed, it transforms the response into a partial response and updates the body with the post-flight request.
This works correctly when the post-flight request is actually necessary, but due to the current implementation the post-flight request is always "applied" even when not needed, causing a subsequent request to be sent to ES.
This results in:

  • more time spent unnecessarily before returning the results to the client
  • 1 more unnecessary search-strategy cache check
  • 1 more unnecessary run of tabify

Analysis
The current method that decides whether a request needs a subsequent post-flight request relies on a loose check in the function hasPostFlightRequests. This function checks whether the agg property type.postFlightRequest is a function.

private hasPostFlightRequests() {
  const aggs = this.getField('aggs');
  if (aggs instanceof AggConfigs) {
    return aggs.aggs.some(
      (agg) => agg.enabled && typeof agg.type.postFlightRequest === 'function'
    );
  } else {
    return false;
  }
}

This function is assigned even when it is not required. For example, in a terms aggregation without the other bucket, the function is still assigned but effectively returns the response unchanged:

postFlightRequest: createOtherBucketPostFlightRequest(constructSingleTermOtherFilter),

In all other cases postFlightRequest is defaulted to an identity function, so the hasPostFlightRequests function will always return true.
this.postFlightRequest = config.postFlightRequest || identity;
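A minimal sketch of the problem and one possible stricter alternative (the interfaces and helper names below are simplified stand-ins, not the actual Kibana types): because the default is an identity function, the `typeof` test can never distinguish "no post-flight work" from "real post-flight work". Leaving the property undefined by default, or comparing against the shared identity reference, would let the check return false.

```typescript
// Hypothetical simplified types, not the real AggConfig/AggType interfaces.
const identity = <T>(x: T): T => x;

interface AggType {
  // undefined when no post-flight work is needed, instead of defaulting to identity
  postFlightRequest?: (resp: unknown) => unknown;
}

interface AggConfig {
  enabled: boolean;
  type: AggType;
}

// Loose check from the issue: true for every enabled agg once the
// identity default has been applied, since typeof identity === 'function'.
const hasPostFlightLoose = (aggs: AggConfig[]): boolean =>
  aggs.some((agg) => agg.enabled && typeof agg.type.postFlightRequest === 'function');

// Stricter alternative: treat the identity default (or an undefined
// property) as "no post-flight request needed".
const hasPostFlightStrict = (aggs: AggConfig[]): boolean =>
  aggs.some(
    (agg) =>
      agg.enabled &&
      agg.type.postFlightRequest !== undefined &&
      agg.type.postFlightRequest !== identity
  );
```

With the strict variant, a request whose aggs all fall back to the identity default would skip the partial-response branch entirely, avoiding the extra ES round trip, cache check, and tabify run listed above.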

wait_for_completion_timeout value is too low and can't deliver a full response without delays

This parameter, used in async search, describes the timeout before the async search returns with a partial result.
It is currently set to 200ms.

waitForCompletion: schema.duration({ defaultValue: '200ms' }),

After this 200ms interval the polling mechanism kicks in, and the results are then delayed every time by at least ~300ms:

const getPollInterval = (elapsedTime: number): number => {
  if (typeof pollInterval === 'number') return pollInterval;
  else {
    // if static pollInterval is not provided, then use default back-off logic
    switch (true) {
      case elapsedTime < 1500:
        return 300;
      case elapsedTime < 5000:
        return 1000;
      case elapsedTime < 20000:
        return 2500;
      default:
        return 5000;
    }
  }
};

I may not have enough knowledge in this area, but I don't see any major drawback to increasing this value to at least 1s, as proposed in #157837 (comment), or even more.
The main drawback is an open connection between ES and Kibana that lasts for ~1 second, instead of opening and closing a new one 5 times in the same time interval.
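To make the cost concrete, here is a small sketch (not Kibana code; the back-off schedule is inlined from the snippet above, and `observedAt` is a hypothetical helper) that estimates when Kibana first observes a completed search, assuming Kibana only learns the result at the initial wait or at the next poll:

```typescript
// Default back-off schedule from the snippet above (static pollInterval omitted).
const getPollInterval = (elapsedTime: number): number => {
  if (elapsedTime < 1500) return 300;
  if (elapsedTime < 5000) return 1000;
  if (elapsedTime < 20000) return 2500;
  return 5000;
};

// Time at which Kibana first sees a search that ES completes in `esTimeMs`:
// the initial wait_for_completion_timeout, then successive polls.
const observedAt = (esTimeMs: number, waitForCompletion = 200): number => {
  let elapsed = waitForCompletion; // initial synchronous wait
  while (elapsed < esTimeMs) {
    elapsed += getPollInterval(elapsed); // next poll after the back-off interval
  }
  return elapsed;
};
```

Under these assumptions, a query that ES finishes in 250ms is only observed at ~500ms (200ms wait + one 300ms poll), while raising waitForCompletion to 1s would return it in a single round trip.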

getXDomain can be sped up

When using cartesian charts, we compute the x domain. If that domain is big, the computation time is significant: for a 50k data point dataset it takes ~40ms. This can probably be reduced by half with a better data-processing strategy: instead of multiple array scans to sort, filter, and map values, we could loop once with a reduce.
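As an illustration of the single-pass idea (the data shape and function name below are hypothetical, not the actual getXDomain implementation), the min/max extent of the x values can be found in one loop, skipping invalid values inline rather than filtering and mapping into intermediate arrays:

```typescript
// Hypothetical datum shape: x may be missing for some points.
interface Datum {
  x: number | null;
}

// Single-pass extent: one O(n) scan, no intermediate arrays, no sort.
const getXExtent = (data: Datum[]): [number, number] | null => {
  let min = Infinity;
  let max = -Infinity;
  for (const d of data) {
    if (d.x == null || Number.isNaN(d.x)) continue; // skip invalid values inline
    if (d.x < min) min = d.x;
    if (d.x > max) max = d.x;
  }
  // If no valid value was seen, min stays Infinity and there is no extent.
  return min <= max ? [min, max] : null;
};
```

For an ordinal or time-based x axis more bookkeeping is needed than a plain min/max, but the same single-scan structure applies.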

@markov00 markov00 added technical debt Improvement of the software architecture and operational architecture Team:Visualizations Visualization editors, elastic-charts and infrastructure research labels Apr 30, 2024
@elasticmachine
Contributor

Pinging @elastic/kibana-visualizations (Team:Visualizations)

@dej611
Contributor

dej611 commented May 2, 2024

I may not have enough knowledge in this area, but I don't see any major drawback to increasing this value to at least 1s, as proposed in #157837 (comment), or even more.

I think it makes sense to increase it.
In past experiments we saw that "quick" responses from ES came back within 150ms. The test didn't have any specific statistical significance, but it was enough to increase the value from 100ms to 200ms. I think pushing it to 400/500ms might be worth it. wdyt @ppisljar?

@elastic elastic locked and limited conversation to collaborators May 3, 2024
@markov00 markov00 converted this issue into discussion #182551 May 3, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
