[SPARK-49200][SQL] Fix null type non-codegen ordering exception #47707

ulysses-you · 2024-08-12T03:18:31Z

What changes were proposed in this pull request?

Spark marks NullType as orderable, and the generated comparison code for NullType returns 0.

object OrderUtils {
  def isOrderable(dataType: DataType): Boolean = dataType match {
    case NullType => true
    // ... (remaining cases elided)
  }
}

This PR makes NullType ordering work on the non-codegen path as well, so that it no longer throws an exception.
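To illustrate the idea, here is a minimal, Spark-free sketch of an interpreted ordering in which the null type compares as always equal, mirroring the "return 0" behavior described for the codegen path. The names `NullOrderingSketch`, `nullOrdering`, and `arrayOrdering` are hypothetical and are not Spark APIs; the actual change in this PR may be implemented differently.

```scala
// Hypothetical model of the interpreted (non-codegen) ordering path.
// None of these names exist in Spark; they only illustrate the idea.
object NullOrderingSketch {

  // Stand-in for the ordering of a null-typed value: the only possible
  // value is null, so every comparison reports "equal".
  val nullOrdering: Ordering[Any] = new Ordering[Any] {
    override def compare(x: Any, y: Any): Int = 0
  }

  // Stand-in for an array ordering that delegates to an element ordering.
  // If the element ordering could not be constructed (as in the stack trace
  // for PhysicalNullType), building this array ordering would fail too.
  def arrayOrdering(elem: Ordering[Any]): Ordering[Seq[Any]] =
    new Ordering[Seq[Any]] {
      override def compare(x: Seq[Any], y: Seq[Any]): Int = {
        val minLen = math.min(x.length, y.length)
        var i = 0
        while (i < minLen) {
          val l = x(i)
          val r = y(i)
          // Assumption for this sketch: nulls sort before non-null values.
          val c =
            if (l == null && r == null) 0
            else if (l == null) -1
            else if (r == null) 1
            else elem.compare(l, r)
          if (c != 0) return c
          i += 1
        }
        // All shared positions are equal: the shorter array sorts first.
        Integer.compare(x.length, y.length)
      }
    }
}
```

With an element ordering that always returns 0, comparing two `array(null)` values yields 0 instead of failing, so a sort over such a key can proceed.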

Why are the changes needed?

With codegen disabled and the EliminateSorts rule excluded, the following query throws an exception:

set spark.sql.codegen.factoryMode=NO_CODEGEN;
set spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.EliminateSorts;

select * from range(10) order by array(null);

org.apache.spark.SparkIllegalArgumentException: Type PhysicalNullType does not support ordered operations.
    at org.apache.spark.sql.errors.QueryExecutionErrors$.orderedOperationUnsupportedByDataTypeError(QueryExecutionErrors.scala:352)
    at org.apache.spark.sql.catalyst.types.PhysicalNullType.ordering(PhysicalDataType.scala:246)
    at org.apache.spark.sql.catalyst.types.PhysicalNullType.ordering(PhysicalDataType.scala:243)
    at org.apache.spark.sql.catalyst.types.PhysicalArrayType$$anon$1.<init>(PhysicalDataType.scala:283)
    at org.apache.spark.sql.catalyst.types.PhysicalArrayType.interpretedOrdering$lzycompute(PhysicalDataType.scala:281)
    at org.apache.spark.sql.catalyst.types.PhysicalArrayType.interpretedOrdering(PhysicalDataType.scala:281)
    at org.apache.spark.sql.catalyst.types.PhysicalArrayType.ordering(PhysicalDataType.scala:277)
    at org.apache.spark.sql.catalyst.expressions.InterpretedOrdering.compare(ordering.scala:67)
    at org.apache.spark.sql.catalyst.expressions.InterpretedOrdering.compare(ordering.scala:40)
    at org.apache.spark.sql.execution.UnsafeExternalRowSorter$RowComparator.compare(UnsafeExternalRowSorter.java:254)
    at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter$SortComparator.compare(UnsafeInMemorySorter.java:70)
    at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter$SortComparator.compare(UnsafeInMemorySorter.java:44)

Does this PR introduce any user-facing change?

Yes, this is a bug fix.

How was this patch tested?

Added a test.

Was this patch authored or co-authored using generative AI tooling?

no

ulysses-you · 2024-08-12T03:19:19Z

cc @yaooqinn @cloud-fan thank you

cloud-fan

good catch!

Closes #47707 from ulysses-you/null-ordering.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: youxiduo <[email protected]>
(cherry picked from commit e3ba74b)
Signed-off-by: youxiduo <[email protected]>

ulysses-you · 2024-08-12T10:12:06Z

thanks, merged to master/branch-3.5

yaooqinn · 2024-08-12T10:15:37Z

Late LGTM

LuciferYang · 2024-08-12T12:20:08Z

late LGTM

…he#543)

Closes apache#47707 from ulysses-you/null-ordering.

Authored-by: ulysses-you <[email protected]>
(cherry picked from commit e3ba74b)
Signed-off-by: youxiduo <[email protected]>
Co-authored-by: ulysses-you <[email protected]>

Fix null type non-codegen ordering exception

6594cf5

github-actions bot added the SQL label Aug 12, 2024

cloud-fan approved these changes Aug 12, 2024

ulysses-you closed this in e3ba74b Aug 12, 2024

ulysses-you deleted the null-ordering branch August 12, 2024 10:12
