-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-34432][SQL][TESTS] Add JavaSimpleWritableDataSource #31560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…e writable data source in `DataSourceV2Suite` ### What changes were proposed in this pull request? This is a followup of #19269 In #19269 , there is only a scala implementation of simple writable data source in `DataSourceV2Suite`. This PR adds a java implementation of it. ### Why are the changes needed? To improve test coverage. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? existing testsuites
sql/core/src/test/java/test/org/apache/spark/sql/connector/JavaSimpleWritableDataSource.java
Outdated
Show resolved
Hide resolved
sql/core/src/test/java/test/org/apache/spark/sql/connector/JavaSimpleWritableDataSource.java
Show resolved
Hide resolved
sql/core/src/test/java/test/org/apache/spark/sql/connector/JavaSimpleWritableDataSource.java
Outdated
Show resolved
Hide resolved
sql/core/src/test/java/test/org/apache/spark/sql/connector/JavaSimpleWritableDataSource.java
Outdated
Show resolved
Hide resolved
sql/core/src/test/java/test/org/apache/spark/sql/connector/JavaSimpleWritableDataSource.java
Show resolved
Hide resolved
sql/core/src/test/java/test/org/apache/spark/sql/connector/JavaSimpleWritableDataSource.java
Show resolved
Hide resolved
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for making your first contribution. I left a few general comments.
sql/core/src/test/java/test/org/apache/spark/sql/connector/JavaSimpleWritableDataSource.java
Outdated
Show resolved
Hide resolved
|
ok to test |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #135199 has finished for PR 31560 at commit
|
delete duplicated blank Lines in `JavaSimpleWritableDataSource`
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #135230 has finished for PR 31560 at commit
|
|
Test build #135231 has finished for PR 31560 at commit
|
| return new InputPartition[0]; | ||
| } | ||
| } catch (IOException e) { | ||
| throw new RuntimeException(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems not java friendly, I think we should update Batch#planInputPartitions to add throws IOException, the same as PartitionReader#next.
The throws clause is a compile-time check and won't break binary compatibility. Also cc @rdblue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some different opinions recently.
- for
Batch#planInputPartitions, If this method fails (by throwing an exception), the underlying data source scan may require manual cleanup . - for
BatchWrite#commit, If this method fails (by throwing an exception), this writing job is considered to have been failed, andBatchWrite#abortwould be called,butBatchWrite#abortmay not be able to deal with it. more details, pls see the comments ofBatchWrite#abort
so its better to deal with these exceptions in Batch#planInputPartitions and BatchWrite#commit;
sql/core/src/test/java/test/org/apache/spark/sql/connector/JavaSimpleWritableDataSource.java
Outdated
Show resolved
Hide resolved
|
Kubernetes integration test starting |
…rom JavaSimpleBatchTable and override the function capabilities in MyTable
|
Kubernetes integration test status failure |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #135307 has finished for PR 31560 at commit
|
|
Test build #135310 has finished for PR 31560 at commit
|
|
Test build #135313 has finished for PR 31560 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #135320 has finished for PR 31560 at commit
|
|
Test build #135322 has finished for PR 31560 at commit
|
|
|
||
| @Override | ||
| public String keyPrefix() { | ||
| return "javaSimpleWritableDataSource"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we don't need to implement SessionConfigSupport. Can you open a follow-up PR to remove it from both the Scala and Java versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all right
| Object[] objects = | ||
| Arrays.stream(currentLine.split(",")) | ||
| .map(String::trim) | ||
| .map(Integer::parseInt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we change the Scala version as well in a follow-up PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok,i'll change the scala version
|
thanks, merging to master! |
|
@kevincmchen can you open a new PR to add |
ok, all right. |
### What changes were proposed in this pull request? This is a followup of #31560, In #31560, we added `JavaSimpleWritableDataSource ` and left some little problems like unused interface `SessionConfigSupport` 、 inconsistent schema between `JavaSimpleWritableDataSource ` and `SimpleWritableDataSource`. This PR fixes the remaining problems in #31560. ### Why are the changes needed? 1. `SessionConfigSupport` in `JavaSimpleWritableDataSource ` and `SimpleWritableDataSource` is never used, so we don't need to implement it. 2. change the schema of `SimpleWritableDataSource`, to match `TestingV2Source` ### Does this PR introduce _any_ user-facing change? NO ### How was this patch tested? existing testsuites Closes #31621 from kevincmchen/SPARK-34498. Authored-by: kevincmchen <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
This is a followup of #19269
In #19269 , there is only a scala implementation of simple writable data source in
DataSourceV2Suite.This PR adds a java implementation of it.
Why are the changes needed?
To improve test coverage.
Does this PR introduce any user-facing change?
No
How was this patch tested?
existing testsuites