ESQL: Limit memory usage of fold #118602
Conversation
Pinging @elastic/es-analytical-engine (Team:Analytics)

Hi @nik9000, I've created a changelog YAML for you.

This is sort of the next step in #118101. Or, I guess, #118101 was really an attempt to unblock as many of the

I shall defeat you, merge conflicts!

run elasticsearch-ci/part-1
ivancea
left a comment
LGTM 😰 The bulk of the trivial changes are here at least; let's hope future PRs get smaller, haha
```java
/**
 * {@link Expression#fold} using a small amount of memory.
 */
public static FoldContext small() {
```
Are we using this? It's unbounded

I'll double check before merging. We really should only have one unbounded one.

At some point I was playing around with the idea that you could know up front that a fold would always be "small" so you didn't have to limit it. But that's:
- Impossible to know
- Weird to want to do. If you know it's small then limit it.
I've since made it bounded and, well, as big as the default.

I can't read it from the user request because the whole point is to have one I can use statically, without plumbing changes. But it has the default size!
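For illustration, a minimal sketch of what a bounded `small()` might look like after that change; `DEFAULT_LIMIT_BYTES` is an assumed constant, not the PR's actual code:

```java
/**
 * Hypothetical sketch: a statically reachable context that is still
 * bounded, at the same default limit as a regular per-query context.
 */
public static FoldContext small() {
    return new FoldContext(DEFAULT_LIMIT_BYTES); // assumed constant
}
```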
Those BigArrays aren't being used; are we sure that the BlockFactory methods are enough? Some arrays in there are accounted for by the BigArrays, right? So we're missing part of them.

OOOH. Probably not. Let me have a look at wiring those in too.
I'd love to get another set of eyes on this one before I merge it. I'll resolve conflicts now though.
luigidellaquila
left a comment
LGTM, thanks Nik.
I just left a comment
Since we are still using it in a few places, wouldn't it make sense to default it to the same 5% of memory as the default in the query pragmas (or a slightly higher value), rather than leaving it truly unbounded? Maybe it's a bit paranoid, but I guess the final goal here is safety.
I'm hopeful we can remove the usages in a follow-up. Let's get this in and have a conversation about whether or not we should replace unbounded with small.
Did we want to give this the 5% limit before merging this?
astefan
left a comment
I've looked at the ComputeService mostly; it's an intricate core piece of ES|QL, and every time I look at it I need to be extra careful in how I read the code, considering all the execution paths the code takes.
So, please take my comments there with some patience :-). My comment related to LocalExecutionPlanner might be wrong due to how the code is called from many places and how the folding context is passed around, but the one related to query pragmas for data nodes might be correct (?), or a learning opportunity for me regarding when the %-of-memory size value is computed.
```diff
  * turn the given plan into a list of drivers to execute
  */
- public LocalExecutionPlan plan(PhysicalPlan localPhysicalPlan) {
+ public LocalExecutionPlan plan(FoldContext foldCtx, PhysicalPlan localPhysicalPlan) {
```
Why do you need to pass a FoldContext as a parameter here?
From what I can tell, the constructor of LocalExecutionPlanner already has the needed pragma to build the folding context.
Also, it's unfortunate that we must build the PhysicalOperationProviders (which also takes a folding context) outside the LocalExecutionPlanner (for testing purposes, that is); otherwise PhysicalOperationProviders could have been built as part of the LocalExecutionPlanner build and would have gotten the same folding context as the surrounding class.
I had wanted to use a single context so we'd share the same limit across the whole process. I think at some point we'll no longer need FoldContext at the physical level at all, so maybe it's moot?
I'm going to double check exactly how much sharing I get here.

Right. We share with the LocalLogicalPlanOptimizer and LocalPhysicalPlanOptimizer.
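Roughly the sharing in question, as a hedged sketch; the context-class constructor shapes here are assumptions, not the exact signatures:

```java
// Hypothetical wiring: one FoldContext per query on this node, handed to
// both local optimizers so all folding draws down the same remaining budget.
FoldContext foldCtx = new FoldContext(foldLimit.getBytes());
var logicalOptimizer = new LocalLogicalPlanOptimizer(new LocalLogicalOptimizerContext(configuration, foldCtx));
var physicalOptimizer = new LocalPhysicalPlanOptimizer(new LocalPhysicalOptimizerContext(configuration, foldCtx));
```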
From what I can tell, this foldLimit here is the one coming from the coordinator. And if it's 5% of the heap size, is this the heap of the coordinator or the heap of the data node?

Oh, that's a very good point. ++

Oh fun! I'll resolve the rest of this and then dig into this.
This is wrong. Copy-pasta mistake probably.

After merging main into this, this should become a non-issue because Range doesn't fold there anymore.
Fun! It's interesting that tests didn't catch this. Let me make a test that fails....

Indeed. There are many edge cases; we can create tests for as many as we can think of. There will always be something left behind; I've come to believe it's impossible to cover everything.

Er, Alex's test catches it already.

Yeah. I'm not accusing at all. We try to cover it all. I try not to make mistakes too. But I figure if I've made a mistake once, it's good to have a case for it.

To be fair, I think we just never fold ranges in actual queries, at all; so it makes sense that no test caught this in the past. It doesn't help that Range was inherited from ql.
```diff
      && aggFunction.hasFilter()
      && aggFunction.filter() instanceof Literal literal
-     && Boolean.FALSE.equals(literal.fold())) {
+     && Boolean.FALSE.equals(literal.value())) {
```
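The change above reads the `Literal`'s stored constant directly instead of re-folding it, so no memory accounting is needed. A sketch of the distinction, with the constructor shape assumed:

```java
// A Literal already holds its constant value.
Literal literal = new Literal(Source.EMPTY, Boolean.FALSE, DataType.BOOLEAN);
Object direct = literal.value();       // plain field read, no FoldContext needed
Object folded = literal.fold(foldCtx); // same result, but charged against the fold budget
```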
alex-spies
left a comment
Heya, I'm not sure if you already had a chance to address my last batch of comments (before this one). I resolved all remarks that I think are done now, but there are still a couple that may be good to check.
The most important thing from my review, unit tests for the fold context, has been addressed, so this nearly LGTM.
However, two major points remain:
- I think @astefan raised an important question about whether the memory limit on the data nodes is correctly determined, or if it maybe wrongly takes 5% of the coordinator node's memory. I'm also interested in this.
- @luigidellaquila suggested to default the unbounded context to 5% of memory as well; do we want to address this in this PR?
> Did we want to give this the 5% limit before merging this?

> After merging main into this, this should become a non-issue because Range doesn't fold there anymore.

OK! I'd sort of assumed it was the data node without checking, and that was bad. It is, indeed, the data node. Well, it's 5% of the node that reads the

Yeah! I'll do it now.
I'm going to spend some time this weekend thinking about how we can test it.

Er, "it" here being the memory being 5% of each node's memory. That seems super important.
```diff
  if (obj == null || obj.getClass() != this.getClass()) return false;
  var that = (LogicalOptimizerContext) obj;
- return Objects.equals(this.configuration, that.configuration);
+ return this.configuration.equals(that.configuration) && this.foldCtx.equals(that.foldCtx);
```
Should we update the hashCode implementation, too?

Yes. Super important. You should block the merge for a mistake like that. Sorry.
It's not that it's likely to break anything, but it's sort of a bomb set up for someone years in the future.
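The fix follows the standard pairing rule: any field added to `equals` must also feed `hashCode`, or equal contexts could hash differently. A minimal sketch:

```java
@Override
public int hashCode() {
    // Must include every field compared in equals(), including the new foldCtx.
    return Objects.hash(configuration, foldCtx);
}
```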
```diff
  private static PushableGeoDistance from(FoldContext ctx, Attribute attr, Expression foldable, Order order) {
      if (attr instanceof FieldAttribute fieldAttribute) {
-         Geometry geometry = SpatialRelatesUtils.makeGeometryFromLiteral(foldable);
+         Geometry geometry = SpatialRelatesUtils.makeGeometryFromLiteral(ctx, foldable);
```
Oh yes, absolutely. Sorry, I didn't mean this to be a suggestion for this PR; just wanted to share an observation and see if you agree :)
alex-spies
left a comment
Thanks @nik9000 ! I think this PR is in a great state and is a great addition!
I think it makes sense to add a test to ensure the 5% of memory is relative to the current node.
Replacing the unbounded context with a 5% context would be better, but if that blows up the PR's scope, I think it's fine to do in a follow-up. (And I think you wanted to use this 5% limit in csv tests, too, although I think it's sufficient to use a limit in ITs.)
In any case, this could already be merged as-is IMHO. Nice!
OK! Status. I think the right way to test the 5%-is-relative-memory thing is.... oh boy, that's going to take some time. For a few days we'll have to rely on my brain.
Follow ups:
`fold` can be surprisingly heavy! The maximally efficient/paranoid thing would be to fold each expression one time, in the constant folding rule, and then store the result as a `Literal`. But this PR doesn't do that because it's a big change. Instead, it creates the infrastructure for tracking memory usage while folding and plugs it into as many places as possible. That's not perfect, but it's better.

This infrastructure limits the allocations of fold similarly to the `CircuitBreaker` infrastructure we use for values, but it's different in a critical way: you don't manually free any of the values. This is important because the plan itself isn't `Releasable`, which is required when using a real CircuitBreaker. We could have tried to make the plan releasable, but that'd be a huge change.

Right now there's a single limit of 5% of heap per query. We create the limit at the start of query planning and use it throughout planning. There are about 40 places that don't yet use it. We should get them plugged in as quickly as we can manage. After that, we should look at the maximally efficient/paranoid thing mentioned above: waiting for constant folding. That's an even bigger change, one I'm not equipped to make on my own.
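To make the shape of this concrete, here's a hedged, minimal sketch of a breaker-like budget with no free step; the names and error type are assumptions, not the PR's actual code:

```java
/**
 * Hypothetical sketch of a fold-time memory budget. Unlike a real
 * CircuitBreaker there is no release step: the budget only counts down,
 * because folded values live on inside the (non-Releasable) plan.
 */
public class FoldContext {
    private long remainingBytes;

    public FoldContext(long limitBytes) {
        this.remainingBytes = limitBytes;
    }

    /** Charge bytes for a folded value; there is intentionally no matching free. */
    public void trackAllocation(String description, long bytes) {
        remainingBytes -= bytes;
        if (remainingBytes < 0) {
            throw new IllegalStateException("folding [" + description + "] exceeded the memory budget");
        }
    }
}
```

A single such context would be created at the start of planning and threaded through every fold call, so all folding for one query draws from one budget.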
Backport: #120100