-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate BINARY, VARBINARY, CHAR, VARCHAR jdbc logical types to portable #23548
Conversation
f2f86f2
to
91307b1
Compare
Run Python 3.8 PostCommit |
jdbc xlang test succeeded: https://ci-beam.apache.org/job/beam_PostCommit_Python38_PR/632/testReport/ Will clean up the changes and mark as open |
* Move jdbc logical type to portable logical types in Java * Create portable logical types in Python * Support value_from_runner_api and value_to_runner_api in Python SchemaTransform (currently only support atomic type values) Fix nullable/test/leftovers
Codecov Report
@@ Coverage Diff @@
## master #23548 +/- ##
==========================================
- Coverage 73.45% 73.06% -0.40%
==========================================
Files 718 728 +10
Lines 95884 96413 +529
==========================================
+ Hits 70435 70443 +8
- Misses 24138 24659 +521
Partials 1311 1311
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
* Fix RowCoderImpl cannot encode bytes column in cython compiled * Set coder_impl.is_compiled=True when running on compiled stream module
Assigning reviewers. If you would like to opt out of this review, comment R: @AnandInguva for label python. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control |
Apologize for a large PR. Most changes are necessary refactors had to do, hopefully they looks trivial and not hard to review
Also included some fix found that not catched before
|
Run XVR_Flink PostCommit |
Run XVR_Direct PostCommit |
"Gradle build daemon disappeared unexpectedly" for Direct CrossLanguageValidatesRunner Tests Update: second time failure is
seems like another unrelated grpc flake |
Run XVR_Direct PostCommit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Abacn this is very thorough and LGTM overall. I just have a few minor suggestions on the Python side.
# TODO(https://github.com/apache/beam/issues/23373): Complete support | ||
# for logical types that require arguments beyond atomic type. | ||
# For now, skip arguments. | ||
return logical_type() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to fail loudly here. It's dangerous to move forward with a type we don't understand.
Are there cases where we need to let this through?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this was the behavior before (arguments omitted) and the fixed_decimal logical type has a Row type argument. It is special because the scale
argument is encoded both in argument and in its representation, so that I do not need the argument to recover the data from bytes. The argument's usage is to enable adding a guard for overflow check (like for fixed_char and fixed_bytes) here. Since the same check is done in Java side so it is safe so far.
Added a warning here.
@classmethod | ||
def _from_typing(cls, typ): | ||
# type: (type) -> LogicalType | ||
# TODO: enable argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we link an issue here (it could be the same one for following up on non-atomic types)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, linked to #23373
return np.int32 | ||
|
||
def argument(self): | ||
return self.max_length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you document the type mappings for these in the docstring for this module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in schemas module docstring
Run Python 3.9 PostCommit |
Run Java PreCommit |
…le (apache#23548) * Migrate BINARY, VARBINARY, CHAR, VARCHAR jdbc logical types to portable * Move jdbc logical type to portable logical types in Java * Create portable logical types in Python * Support value_from_runner_api and value_to_runner_api in Python SchemaTransform (currently only support atomic type values) Fix nullable/test/leftovers * Fix typos * Add standard coder test * Fix RowCoderImpl cannot encode bytes column in cython compiled * Set coder_impl.is_compiled=True when running on compiled stream module * Add docstring, add todo and warnings for unsupported
Thanks @TheNeuralBit ! |
…le (apache#23548) * Migrate BINARY, VARBINARY, CHAR, VARCHAR jdbc logical types to portable * Move jdbc logical type to portable logical types in Java * Create portable logical types in Python * Support value_from_runner_api and value_to_runner_api in Python SchemaTransform (currently only support atomic type values) Fix nullable/test/leftovers * Fix typos * Add standard coder test * Fix RowCoderImpl cannot encode bytes column in cython compiled * Set coder_impl.is_compiled=True when running on compiled stream module * Add docstring, add todo and warnings for unsupported
Fixes #23526
Please add a meaningful description for your change here
This enables cross-language jdbc ptransform write and read fixed/variable-length strings and bytes
Move to portable logical types in Java
Create portable logical types in Python
Python cross-language JdbcIO can read/write rows with CHAR, VARCHAR, BINARY, VARBINARY types now.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).addresses #123
), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>
instead.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.