-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-33659][SS] Document the current behavior for DataStreamWriter.toTable API #30885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| def start(): StreamingQuery = startInternal(None) | ||
|
|
||
| /** | ||
| * :: Experimental :: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Experimental -> Evolving?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the comment
spark/common/tags/src/main/java/org/apache/spark/annotation/Experimental.java
Lines 28 to 31 in cc23581
| * NOTE: If there exists a Scaladoc comment that immediately precedes this annotation, the first | |
| * line of the comment must be ":: Experimental ::" with no trailing blank line. This is because | |
| * of the known issue that Scaladoc displays only either the annotation or the comment, whichever | |
| * comes first. |
:: Experimental :: is the tag for scaladoc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both annotations (experimental and evolving) provide different semantics, right? Adding the different tag would give more confusion as it's not clear whether this is experimental vs evolving. You're getting it from Experimental, not Evolving.
I'll reopen this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, let me delete this experimental one, only keep evolving.
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala
Outdated
Show resolved
Hide resolved
|
Test build #133205 has finished for PR 30885 at commit
|
| * :: Experimental :: | ||
| * | ||
| * Starts the execution of the streaming query, which will continually output results to the given | ||
| * table as new data arrives. A new table will be created if the table not exists. The returned |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here it documented a new table will be created if not existing, but later it also documents "Please create a table manually before the execution". It looks confusing, I think. Could we rephrase them together and give a more concrete description about table creation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we have two small paragraphs for v1 and v2 table separately? E.g.
For v1 table, partitioning columns provided by `partitionBy` will be respected
no matter the table exists or not. A new table will be created if the table not exists.
For v2 table, `partitionBy` will be ignored if the table already exists. `partitionBy`
will be respected only if the v2 table does not exist. Besides, the v2 table created
by this API lacks some functionalities (e.g., customized properties, options, and serde info).
If you need them, please create the v2 table manually before the execution to avoid
creating a table with incomplete information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to @viirya suggestion. My request was to describe the impact of options (mostly partitionBy) for matrix of v1 vs v2 and existing vs non-existing.
Separating the case of v1 vs v2 is more important, so the suggestion looks better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the rephrase, done in c158775
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #133311 has finished for PR 30885 at commit
|
HeartSaVioR
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks OK except the scaladoc tag.
|
Kubernetes integration test starting |
|
The last commit is just removing two line comment. Let me just merge this in. |
|
Merged to master and branch-3.1. |
…toTable API ### What changes were proposed in this pull request? Follow up work for #30521, document the following behaviors in the API doc: - Figure out the effects when configurations are (provider/partitionBy) conflicting with the existing table. - Document the lack of functionality on creating a v2 table, and guide that the users should ensure a table is created in prior to avoid the behavior unintended/insufficient table is being created. ### Why are the changes needed? We didn't have full support for the V2 table created in the API now. (TODO SPARK-33638) ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Document only. Closes #30885 from xuanyuanking/SPARK-33659. Authored-by: Yuanjian Li <[email protected]> Signed-off-by: HyukjinKwon <[email protected]> (cherry picked from commit 86c1cfc) Signed-off-by: HyukjinKwon <[email protected]>
|
Late LGTM. Thanks for addressing this! |
|
Kubernetes integration test status success |
|
Test build #133334 has finished for PR 30885 at commit
|
What changes were proposed in this pull request?
Follow up work for #30521, document the following behaviors in the API doc:
Why are the changes needed?
We didn't have full support for the V2 table created in the API now. (TODO SPARK-33638)
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Document only.