[SPARK-28612][SQL] Add DataFrameWriterV2 API#25681
[SPARK-28612][SQL] Add DataFrameWriterV2 API#25681rdblue wants to merge 2 commits intoapache:masterfrom
Conversation
f7ecd17 to
8d5b7fe
Compare
|
Test build #110143 has finished for PR 25681 at commit
|
|
Test build #110144 has finished for PR 25681 at commit
|
8d5b7fe to
b36c156
Compare
|
Test build #110147 has finished for PR 25681 at commit
|
b36c156 to
9472474
Compare
|
Test build #110262 has finished for PR 25681 at commit
|
|
@brkyvz. this is passing tests. Anything else we need to change? |
| * @group partition_transforms | ||
| * @since 3.0.0 | ||
| */ | ||
| def bucket(numBuckets: Column, e: Column): Column = withExpr { |
There was a problem hiding this comment.
nit: if we're only going to support Literals, I feel it may be better to leave this out initially
| /** | ||
| * Expression for the v2 partition transform years. | ||
| */ | ||
| case class Years(child: Expression) extends PartitionTransformExpression { |
There was a problem hiding this comment.
do you want to implement ExpectsInputTypes for these classes for auto analysis checks?
|
This LGTM like last time, do we have JIRAs for Python API support as well as Java API tests? |
|
@rdblue Can you please rebase this? |
9472474 to
b45593d
Compare
|
@brkyvz, I rebased, added Java API tests, and opened SPARK-29157 to track the PySpark API. |
|
Test build #110933 has finished for PR 25681 at commit
|
|
Test build #110936 has finished for PR 25681 at commit
|
|
+1. Merging to master! |
|
Thanks, @brkyvz! |
| this | ||
| } | ||
|
|
||
| override def tableProperty(property: String, value: String): DataFrameWriterV2[T] = { |
There was a problem hiding this comment.
This should return CreateTableWriter. It doesn't make sense to specify table properties when inserting to an existing table.
There was a problem hiding this comment.
Opened SPARK-29249 for this. Should have a PR posted soon.
What changes were proposed in this pull request?
This adds a new write API as proposed in the SPIP to standardize logical plans. This new API:
append,overwrite,create, andreplacethat correspond to the new logical plans.partitionedBycan only be called when the writer executescreateorreplace.Here are a few example uses of the new API:
How was this patch tested?
Added
DataFrameWriterV2Suitethat tests the new write API. Existing tests for v2 plans.