-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory account not adding up in SortExec #10073
Comments
It sounds like the issue, at a (really) high level is "additional buffer space is required to actually implement the spill" And since during spill the plan is under memory pressure, getting this additional memory can and does fail Some strategies I can think of are:
Maybe we can do 1 in the sort term while figuring out a more sophisticated strategy for 2 or 3 |
I'm investigating this as part of #9359 |
I find this also related. #9528 (comment) |
Another point of code worth noticing is inside the current datafusion/datafusion/physical-plan/src/sorts/sort.rs Lines 601 to 607 in 79fa6f9
Performance-wise, I think it's beneficial to apply the row format comparison to all multi-column cases, however, while considering BTW, I think we should report memory usage inside datafusion/datafusion/physical-plan/src/sorts/sort.rs Lines 650 to 652 in 79fa6f9
|
Describe the bug
This is related / tangential to #9359
My primary problem is that I am trying to sort 100 million strings and always getting errors like:
After reading through the sort impl a bit I have noticed a few concerns (recorded in additional context)
To Reproduce
I don't have a df reproduction but this reproduces it for me in lance:
Expected behavior
I can sort any number of strings, as long as I don't overflow the disk
Additional context
Here is how I understand memory accounting in the sort today:
The first problem (and the one causing my error) is that a sorted batch of strings (the output of
sort_batch
) is occupying 25% more memory than the unsorted batch of strings. I'm not sure if this buffer alignment, padding, or some kind of 2x allocation strategy used by the sort, but it seems reasonable something like this could happen. Unfortunately, this is a problem. We are spilling because we have used up the entire memory pool. We now take X bytes from the memory pool, convert it into 1.25 * X bytes, and try to put it back in the memory pool. This fails with the error listed above.The second problem is that we are not accounting for the output of the sort perserving merge stream. Each output batch from the sort preserving merge stream is made up of rows from the various input batches. In the degenerate case, where the input data is fully random, this means we will probably require 2 * X bytes. This is because each output batch is made up of 1 batch from each input stream. We can't release any of the input batches until we emit the final output batch.
The solution to this second problem is that we should be streaming into the spill file. We should not collect from the sort preserving merge stream and then write the collected batches into the spill file. This problem is a bit less concerning for me at the moment because it is "datafusion uses more memory than it should" and not "datafusion is failing the plan with an error". We don't do a lot of sorting in lance and so we can work around it reasonably well by halving the size of the spill pool.
The text was updated successfully, but these errors were encountered: