Partially address sorting memory issues #86

Merged
coracuity merged 10 commits into main from cs--sort-merge-mem-reserve on Mar 4, 2026

@coracuity
Collaborator

@coracuity coracuity commented Mar 4, 2026

WTF

Calculate spill settings more accurately so that the unspillable portions of larger-than-memory dataset sorts have more room. This will still be fairly limited until something like apache/datafusion#20642 lands.


DataFusion 52.1.0 has a TOCTOU race in ExternalSorter where merge
reservations are freed and re-created empty, letting other partitions
steal the memory (apache/datafusion#20642). Until the upstream fix
lands, compute a data-aware sort_spill_reservation_bytes by sampling
actual Arrow row sizes from the input, estimating spill file count,
and reserving enough for the merge phase.
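
The estimate described in the commit message could look roughly like the sketch below. It is not the code from this PR: every function name is hypothetical, and two rules in it are assumptions made only for illustration (each spilled run holds about half the memory limit, and the merge phase needs about one batch per spill file, capped at half the limit).

```rust
// Hypothetical sketch; none of these names come from the PR or from DataFusion.

/// Mean of the sampled row sizes in bytes, rounded up. Returns 0 for an empty sample.
pub fn mean_row_bytes(sampled_row_bytes: &[u64]) -> u64 {
    if sampled_row_bytes.is_empty() {
        return 0;
    }
    let n = sampled_row_bytes.len() as u64;
    let sum: u64 = sampled_row_bytes.iter().sum();
    sum / n + u64::from(sum % n != 0)
}

/// Estimated number of spill files.
/// Assumption (illustrative only): each sorted run written to disk holds
/// about half of the memory limit.
pub fn estimate_spill_files(total_bytes: u64, memory_limit_bytes: u64) -> u64 {
    if total_bytes <= memory_limit_bytes {
        return 0; // The input fits in memory, so no spill is expected.
    }
    let run_bytes = (memory_limit_bytes / 2).max(1);
    total_bytes / run_bytes + u64::from(total_bytes % run_bytes != 0)
}

/// Bytes to set aside for the merge phase.
/// Assumption (illustrative only): merging keeps roughly one batch per spill
/// file in memory, and the reservation never exceeds half of the limit.
pub fn merge_reservation_bytes(
    sampled_row_bytes: &[u64],
    total_rows: u64,
    memory_limit_bytes: u64,
    batch_size_rows: u64,
) -> u64 {
    let row_bytes = mean_row_bytes(sampled_row_bytes);
    let total_bytes = row_bytes.saturating_mul(total_rows);
    let spill_files = estimate_spill_files(total_bytes, memory_limit_bytes);
    let batch_bytes = row_bytes.saturating_mul(batch_size_rows);
    spill_files
        .saturating_mul(batch_bytes)
        .min(memory_limit_bytes / 2)
}
```

Under these assumptions, the value returned by `merge_reservation_bytes` is what would be supplied as `sort_spill_reservation_bytes`. The cap keeps the reservation from starving the in-memory sort buffers; the actual heuristics in `src/utils/memory.rs` may differ.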
@coracuity coracuity requested a review from a team as a code owner March 4, 2026 07:58
@coracuity coracuity changed the title Cs sort merge mem reserve Partially address sorting memory issues Mar 4, 2026
Comment thread src/utils/memory.rs
Comment thread src/pipeline/mod.rs

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment thread src/utils/memory.rs Outdated
@coracuity coracuity merged commit b48c20f into main Mar 4, 2026
4 checks passed
@coracuity coracuity deleted the cs--sort-merge-mem-reserve branch March 4, 2026 19:46
