-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add map_windows support to Go SDK #24307
Conversation
R: @lostluck |
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control |
A note about testing: I was unable to write a batch pipeline that resulted in a window mapping operation. Here is a link to the tests I wrote which did not result in a map_windows operation. It also contains the changes I made to the streaming wordcap example to make it happen on dataflow. I have not written a pipeline yet that works on Flink and uses map windows, but I wanted to go ahead get this change out for review. |
This is a very good start!
I looked into the Dataflow's graph analyzer and it doesn't use map windows in batch, so that's not amazing for batch testing. We should at least manually validate in streaming mode. I can do this if you like. We should still have unit tests for the MapWindows Node in the exec package. We should test to the spec (with the KV<nonce, Window> inputs and outputs), per the proto. eg.That we preserve the "nonce", and similar. https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/org/apache/beam/model/pipeline/v1/beam_runner_api.proto#L285 I've got a few more specific comments in a moment or two. |
Codecov Report
@@ Coverage Diff @@
## master #24307 +/- ##
==========================================
- Coverage 73.37% 73.37% -0.01%
==========================================
Files 718 718
Lines 97101 97164 +63
==========================================
+ Hits 71252 71293 +41
- Misses 24505 24523 +18
- Partials 1344 1348 +4
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And just a heads up, in order to unblock RunInference, I've got a change that makes slices use the beam Iterable coder by default, for better cross language support. It affects some overlapping code, so be aware of merge conflicts.
@@ -177,6 +177,7 @@ const ( | |||
Iterable Kind = "I" | |||
KV Kind = "KV" | |||
LP Kind = "LP" // Explicitly length prefixed, likely at the runner's direction. | |||
IWCValue Kind = "IWCvalue" | |||
|
|||
Window Kind = "window" // A debug wrapper around a window coder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It bother's me we can't re-use this, but I agree we shouldn't mix these up just now. It's an internal implementation detail at least, so it can be cleaned up at some future point.
@lostluck Anything left to do here other than resolve conflicts? |
I think just resolving the conflicts? I'll take another look after i finish lunch. |
Adds support to the Go SDK for the [map_windows][1] urn. The existing type model does not allow `IntervalWindow` as a FullValue and the existing coders will throw away the window values when decoding the KV type [here][2] in `elideSingleElmFV`. This change updates coder construction to allow an IntervalWindow as a value coder. [1]: https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/org/apache/beam/model/pipeline/v1/beam_runner_api.proto#L296 [2]: https://github.com/camphillips22/beam/blob/8be4cdcaf65a1e53e3041ac3354e2e99c845e915/sdks/go/pkg/beam/core/runtime/exec/coder.go#L580-L580
f28028a
to
2f24f56
Compare
8a08253
to
38931b6
Compare
Run Go PreCommit |
PRecommit failure was a flake, so just rerunning to be sure. Otherwise this LGTM. |
Run Go PostCommit |
Run Go Flink ValidatesRunner |
Thank you very much for working through this! Hopefully now it's possible for someone to cobble together a PeriodicImpulse or similar and have functioning Slowly Changing Side Inputs. |
Thanks for helping me get this merged! It will help tremendously with our pipelines. |
Adds support to the Go SDK for the map_windows urn.
The existing type model does not allow
IntervalWindow
as a FullValue and the existing coders will throw away the window values when decoding the KV type here inelideSingleElmFV
. This change updates coder construction to allow an IntervalWindow as a value coder.Addresses #23106.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).addresses #123
), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>
instead.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.