Determine causal window frames to produce early results. #8842
Thank you for this PR @mustafasrepo -- I feel I lack the context to review it properly. Could you explain the use case (ideally in doc comments somewhere) with an example or two?
Also, the PR description talks about emitting batches earlier, but I didn't see any test coverage of that. Was that intended?
At first, I thought it would be hard to test this behaviour exactly. However, after some thought I found that it isn't that hard. I have added a test for this case in a new commit. Thanks for pointing this out.
Looks good to me -- thank you @mustafasrepo
I did another round of review after @alamb and it LGTM
I will go ahead and merge this since it is a fairly simple change that is an incremental improvement to existing functionality without any major change to code structure. In case of any issues related to this PR, let us know and we will address them with a quick follow-on PR.
/// Flag indicates whether window frame is causal
/// See documentation for [is_frame_causal] for what causal means in this context and how it is calculated.
is_causal: bool,
/// Flag indicating whether the frame is causal (i.e. computing the result
👍
This slightly changed the API to create
Which issue does this PR close?
Closes #.
Rationale for this change
In streaming execution, window functions generate a result only when the frame end is smaller than the batch size (this guarantees that future rows cannot extend the current window frame). However, for a causal window frame (i.e. a window frame that does not include any rows after the current row), we can generate results even if the frame end is equal to the batch size.
With this support we can decrease latency for causal queries.
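The rule described above can be sketched in a few lines of Rust. This is only an illustration under simplifying assumptions, not the code in this PR: the `Bound` enum, `is_causal`, and `can_emit` are hypothetical names, only `ROWS`-mode frames are modelled, and `frame_end` is taken to be the exclusive end index of the frame within the rows buffered so far.

```rust
/// Hypothetical, simplified end bound of a ROWS-mode window frame.
/// This is an illustrative type, not one of DataFusion's own types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Bound {
    /// `n PRECEDING`
    Preceding(u64),
    /// `CURRENT ROW`
    CurrentRow,
    /// `n FOLLOWING`
    Following(u64),
    /// `UNBOUNDED FOLLOWING`
    UnboundedFollowing,
}

/// A frame is treated as causal when its end bound never reaches
/// past the current row, so no future row can change the result.
fn is_causal(frame_end_bound: Bound) -> bool {
    match frame_end_bound {
        Bound::Preceding(_) | Bound::CurrentRow => true,
        // In ROWS mode, `0 FOLLOWING` ends at the current row.
        Bound::Following(0) => true,
        Bound::Following(_) | Bound::UnboundedFollowing => false,
    }
}

/// Decide whether the result for a row can be emitted.
/// `frame_end` is the exclusive end index of the row's frame and
/// `n_rows` is the number of rows buffered so far (assumed semantics).
fn can_emit(frame_end: usize, n_rows: usize, causal: bool) -> bool {
    if causal {
        // Future rows cannot enter the frame, so reaching the end
        // of the buffered rows is already safe.
        frame_end <= n_rows
    } else {
        // A later row could still extend the frame, so wait until
        // the frame end falls strictly inside the buffered rows.
        frame_end < n_rows
    }
}
```

With four rows buffered, `can_emit(4, 4, true)` holds while `can_emit(4, 4, false)` does not, which is where the latency saving for causal frames comes from in this sketch.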
What changes are included in this PR?
This PR adds causality analysis for window frames, which reduces latency and memory usage when generating window function results.
Are these changes tested?
Yes
Are there any user-facing changes?