[SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests #30377
Conversation
@cloud-fan @HyukjinKwon May I ask you to take a look at this PR, please?

Kubernetes integration test starting

Kubernetes integration test status success
`test("show partitions of not partitioned table") {`
"non-partitioned" sounds a bit more natural.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I will address your comment together with others. @HyukjinKwon @cloud-fan Do you have any comments for this PR?
There are a few places with "not partitioned":
$ find . -name '*.scala' -print0|xargs -0 grep -i -n 'not partitioned'
./core/src/test/scala/org/apache/spark/rdd/SortingSuite.scala:138: test("get a range of elements in an array not partitioned by a range partitioner") {
./mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala:925: * Note that the term "rating block" is a bit of a misnomer, as the ratings are not partitioned by
./streaming/src/main/scala/org/apache/spark/streaming/dstream/MapWithStateDStream.scala:134: // If the RDD is not partitioned the right way, let us repartition it using the
./sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:556: // One side of join is not partitioned in the desired way. Need to shuffle one side.
./sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:590: // One side of join is not partitioned in the desired way. Since the number of partitions of
./sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:591: // the side that has already partitioned is smaller than the side that is not partitioned,
./sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala:308: test("OverwritePartitions: overwrite all rows if not partitioned") {
./sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowPartitionsSuite.scala:130: test("show partitions of not partitioned table") {
./sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowPartitionsSuite.scala:137: assert(errMsg.contains("not allowed on a table that is not partitioned"))
./sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala:1982: // not supported since the table is not partitioned
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileIndex.scala:73: /** Schema of the partitioning columns, or the empty schema if the table is not partitioned. */
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DescribeTableExec.scala:79: rows += toCatalystRow("Not partitioned", "", "")
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala:533: failAnalysis(s"Insert into a partition is not allowed because $l is not partitioned.")
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala:149: // This dataset is not partitioned.
./sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala:462: s"for tables that are not partitioned: $tableIdentWithDB")
./sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala:989: * 1. If the table is not partitioned.
./sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala:999: s"SHOW PARTITIONS is not allowed on a table that is not partitioned: $tableIdentWithDB")
./sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala:1755: // not supported since the table is not partitioned
I will change the title of this test, but I am not sure about the other places, so I will leave them as-is for now. @janekdb If you want, you can open a PR and fix them where it makes sense.
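As an aside, the `find ... -print0 | xargs -0 grep` pattern used above is null-delimited, so it survives filenames containing spaces. A minimal self-contained demo (scratch directory and file names are invented for illustration, not taken from the Spark tree):

```shell
# Search *.scala files for a phrase, safely handling odd filenames.
tmp=$(mktemp -d)
mkdir -p "$tmp/src"
printf 'val x = 1 // not partitioned\n' > "$tmp/src/A.scala"
printf 'val y = 2 // partitioned\n'     > "$tmp/src/B file.scala"  # name with a space
find "$tmp" -name '*.scala' -print0 | xargs -0 grep -i -n 'not partitioned'
rm -rf "$tmp"
```

Only `A.scala` matches; the space in `B file.scala` does not break the pipeline because `-print0`/`-0` delimit paths with NUL bytes instead of whitespace.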
Test build #131095 has finished for PR 30377 at commit
```
.partitionBy("a")
.format("parquet")
.mode(SaveMode.Overwrite)
.saveAsTable("part_datasrc")
```
This seems to be testing the DataFrameWriter API, not the SHOW PARTITIONS command.
ah the test was already there. Let's keep it then.
```
override protected def createDateTable(table: String): Unit = {
  sql(s"""
    |CREATE TABLE $table (price int, qty int)
```
`CREATE TABLE ... USING hive PARTITIONED BY (...)` doesn't work?
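For reference, the suggested DDL shape would look roughly like this. This is only a sketch: the table name and partition columns below are hypothetical, since the original test snippet is truncated in this thread.

```sql
-- Hypothetical unified-syntax replacement for the Hive-specific helper;
-- the partition columns (year, month) are illustrative, not from the PR.
CREATE TABLE part_table (price int, qty int)
USING hive
PARTITIONED BY (year int, month int)
```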
Let me check that. I just didn't want to change the original test.
I removed the functions from Hive's suite.
Test build #131148 has finished for PR 30377 at commit

Kubernetes integration test starting

Kubernetes integration test status failure

Test build #131152 has finished for PR 30377 at commit

thanks, merging to master!
What changes were proposed in this pull request?
- Move SHOW PARTITIONS parsing tests to ShowPartitionsParserSuite.
- Port v1 tests for SHOW PARTITIONS from HiveCommandSuite to the base test suite v1.ShowPartitionsSuiteBase. This will allow to run the tests w/ and w/o Hive.

The changes follow the approach of #30287.
Why are the changes needed?
The unification allows running common SHOW PARTITIONS tests for DSv1, Hive DSv1, and DSv2.

Does this PR introduce any user-facing change?
No
How was this patch tested?
By running:
- build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *ShowPartitionsSuite"
- build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly org.apache.spark.sql.hive.execution.HiveCommandSuite"