Proposed edits to section 2.2 #4
Merged
hhhizzz merged 1 commit into hhhizzz:lm-pipeline-blog from Dec 7, 2025
Preview URL: https://alamb.github.io/arrow-site If the preview URL doesn't work, you may have forgotten to configure your fork repository for preview.
alamb
commented
Dec 5, 2025
  ### 2.2 Combining row selectors (`RowSelection::and_then`)

- `RowSelection`—defined in `selection.rs`—is the token that every stage passes around. It mostly uses RLE (`RowSelector::select/skip(len)`) to describe sparse ranges. `and_then` is the core operator for "apply one selection to another": left-hand side is "rows already allowed," right-hand side further filters those rows, and the output is their boolean AND.
+ [`RowSelection`] represents the set of rows that will eventually be produced. It currently uses RLE (`RowSelector::select/skip(len)`) to describe sparse ranges. [`RowSelection::and_then`] is the core operator for "apply one selection to another": the left-hand argument is "rows already passed" and the right-hand argument is "which of the passed rows also pass the second filter." The output is their boolean AND.
Author
I tried to make it clearer what this code was referring to.
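For anyone following the thread, the "boolean AND" wording in the proposed text can be made concrete with a small reference model. This is only a sketch under stated assumptions: `Run`, `to_mask`, and `and_then_mask` are hypothetical names invented for illustration and are not the `parquet` crate's types. It expands each selection into one bool per row, which the real RLE representation deliberately avoids.

```rust
/// Hypothetical stand-in for a run-length selector: `len` consecutive rows
/// that are either all selected or all skipped. Not the parquet crate's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Run {
    pub len: usize,
    pub skip: bool,
}

impl Run {
    pub fn select(len: usize) -> Self {
        Run { len, skip: false }
    }
    pub fn skip(len: usize) -> Self {
        Run { len, skip: true }
    }
}

/// Expand runs into one bool per row (`true` means the row is selected).
pub fn to_mask(runs: &[Run]) -> Vec<bool> {
    runs.iter()
        .flat_map(|r| std::iter::repeat(!r.skip).take(r.len))
        .collect()
}

/// Reference semantics of "apply one selection to another".
/// `second` carries one entry per row that `first` selected, so an entry of
/// `second` is consumed only when `first` kept the row. The output has the
/// same length as `first`.
pub fn and_then_mask(first: &[bool], second: &[bool]) -> Vec<bool> {
    let mut next = second.iter().copied();
    first
        .iter()
        // `&&` short-circuits, so skipped rows do not advance `next`.
        .map(|&keep| keep && next.next().expect("second covers too few rows"))
        .collect()
}
```

With `first = [skip 2, select 3, skip 1, select 2]` and `second = [select 1, skip 2, select 2]`, the five entries of `second` line up with the five rows `first` selected, and only rows 2, 6, and 7 survive.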
alamb
commented
Dec 5, 2025
  </figure>

+ This keeps narrowing the filter while touching only lightweight metadata—no data copies. The current implementation of `and_then` is a two-pointer linear scan; complexity is linear in selector segments. The sooner predicates shrink the selection, the cheaper later scans become.
Author
I propose moving this paragraph below the diagram so the text that describes the diagram is immediately above it.
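To illustrate the "two-pointer linear scan" the moved paragraph mentions, here is a minimal sketch that works directly on run-length segments. It is not the `parquet` crate's implementation: `Run`, `push`, and `and_then` below are hypothetical, simplified names, and the sketch assumes `second` covers exactly as many rows as `first` selects.

```rust
/// Hypothetical run-length segment: `len` rows, all selected or all skipped.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Run {
    pub len: usize,
    pub skip: bool,
}

/// Append a run, merging it into the previous one when the kind matches.
fn push(out: &mut Vec<Run>, len: usize, skip: bool) {
    if len == 0 {
        return;
    }
    match out.last_mut() {
        Some(last) if last.skip == skip => last.len += len,
        _ => out.push(Run { len, skip }),
    }
}

/// Two-pointer scan: one cursor walks `first`, the other walks `second`.
/// Skipped runs in `first` pass straight through; selected runs in `first`
/// are split according to the runs of `second` that overlap them.
/// Each segment is visited once, so the work is linear in segment count.
pub fn and_then(first: &[Run], second: &[Run]) -> Vec<Run> {
    let mut out = Vec::new();
    let mut j = 0; // index of the current run in `second`
    let mut left = second.first().map_or(0, |r| r.len); // rows left in second[j]

    for a in first {
        if a.skip {
            push(&mut out, a.len, true);
            continue;
        }
        let mut need = a.len; // selected rows of `first` still to classify
        while need > 0 {
            while left == 0 {
                j += 1;
                left = second.get(j).expect("second covers too few rows").len;
            }
            let n = need.min(left);
            push(&mut out, n, second[j].skip);
            need -= n;
            left -= n;
        }
    }
    out
}
```

No per-row data is materialized. For `first = [skip 2, select 3, skip 1, select 2]` and `second = [select 1, skip 2, select 2]`, the result is `[skip 2, select 1, skip 3, select 2]`, where the adjacent skips from both inputs have been merged into one run.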
Here are some proposed "wordsmithing" changes for
I'll comment inline with the rationale.