@ulysses-you
Contributor

What changes were proposed in this pull request?

Spark marks NullType as orderable, and the generated comparison code returns 0 for NullType.

object OrderUtils {
  def isOrderable(dataType: DataType): Boolean = dataType match {
    case NullType => true

This PR makes NullType ordering work on the non-codegen (interpreted) path as well, so that it no longer throws an exception.
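To illustrate the idea, here is a minimal, self-contained sketch of an interpreted ordering in which the null type compares as always equal. It is not Spark's implementation: `SimpleType`, `SimpleNullType`, `SimpleIntType`, `SimpleArrayType`, and `orderingFor` are hypothetical names invented for this example. It mirrors the situation in this PR, where the array ordering asks its element type for an ordering when it is constructed, so the element type must be able to supply one.

```scala
object NullOrderingSketch {
  // Hypothetical stand-ins for physical data types; not Spark classes.
  sealed trait SimpleType
  case object SimpleNullType extends SimpleType
  case object SimpleIntType extends SimpleType
  final case class SimpleArrayType(elementType: SimpleType) extends SimpleType

  // Every value of the null type is null, so any two values compare as equal.
  val nullOrdering: Ordering[Any] = new Ordering[Any] {
    override def compare(x: Any, y: Any): Int = 0
  }

  val intOrdering: Ordering[Any] = new Ordering[Any] {
    override def compare(x: Any, y: Any): Int =
      Integer.compare(x.asInstanceOf[Int], y.asInstanceOf[Int])
  }

  // Resolve an interpreted ordering for a type. The array case resolves the
  // element ordering eagerly, so a missing null ordering would fail here.
  def orderingFor(dataType: SimpleType): Ordering[Any] = dataType match {
    case SimpleNullType => nullOrdering
    case SimpleIntType => intOrdering
    case SimpleArrayType(elementType) =>
      val elementOrdering = orderingFor(elementType)
      new Ordering[Any] {
        override def compare(x: Any, y: Any): Int = {
          val left = x.asInstanceOf[Seq[Any]]
          val right = y.asInstanceOf[Seq[Any]]
          val minLength = math.min(left.length, right.length)
          var i = 0
          while (i < minLength) {
            val l = left(i)
            val r = right(i)
            // Assumption for this sketch: null elements sort before non-null ones.
            val cmp =
              if (l == null && r == null) 0
              else if (l == null) -1
              else if (r == null) 1
              else elementOrdering.compare(l, r)
            if (cmp != 0) return cmp
            i += 1
          }
          // All shared positions are equal, so the shorter array sorts first.
          Integer.compare(left.length, right.length)
        }
      }
  }
}
```

With this shape, `NullOrderingSketch.orderingFor(SimpleArrayType(SimpleNullType))` can be constructed without error, which is the property the non-codegen path was missing for `array(null)`.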

Why are the changes needed?

Fix the following exception, which is thrown on the non-codegen path:

set spark.sql.codegen.factoryMode=NO_CODEGEN;
set spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.EliminateSorts;

select * from range(10) order by array(null);
org.apache.spark.SparkIllegalArgumentException: Type PhysicalNullType does not support ordered operations.
    at org.apache.spark.sql.errors.QueryExecutionErrors$.orderedOperationUnsupportedByDataTypeError(QueryExecutionErrors.scala:352)
    at org.apache.spark.sql.catalyst.types.PhysicalNullType.ordering(PhysicalDataType.scala:246)
    at org.apache.spark.sql.catalyst.types.PhysicalNullType.ordering(PhysicalDataType.scala:243)
    at org.apache.spark.sql.catalyst.types.PhysicalArrayType$$anon$1.<init>(PhysicalDataType.scala:283)
    at org.apache.spark.sql.catalyst.types.PhysicalArrayType.interpretedOrdering$lzycompute(PhysicalDataType.scala:281)
    at org.apache.spark.sql.catalyst.types.PhysicalArrayType.interpretedOrdering(PhysicalDataType.scala:281)
    at org.apache.spark.sql.catalyst.types.PhysicalArrayType.ordering(PhysicalDataType.scala:277)
    at org.apache.spark.sql.catalyst.expressions.InterpretedOrdering.compare(ordering.scala:67)
    at org.apache.spark.sql.catalyst.expressions.InterpretedOrdering.compare(ordering.scala:40)
    at org.apache.spark.sql.execution.UnsafeExternalRowSorter$RowComparator.compare(UnsafeExternalRowSorter.java:254)
    at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter$SortComparator.compare(UnsafeInMemorySorter.java:70)
    at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter$SortComparator.compare(UnsafeInMemorySorter.java:44)

Does this PR introduce any user-facing change?

Yes, this is a bug fix.

How was this patch tested?

Added a test.

Was this patch authored or co-authored using generative AI tooling?

no

@github-actions github-actions bot added the SQL label Aug 12, 2024
@ulysses-you
Contributor Author

cc @yaooqinn @cloud-fan thank you

Contributor

@cloud-fan cloud-fan left a comment

good catch!

ulysses-you added a commit that referenced this pull request Aug 12, 2024
Closes #47707 from ulysses-you/null-ordering.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: youxiduo <[email protected]>
(cherry picked from commit e3ba74b)
Signed-off-by: youxiduo <[email protected]>
@ulysses-you
Contributor Author

thanks, merged to master/branch-3.5

@ulysses-you ulysses-you deleted the null-ordering branch August 12, 2024 10:12
@yaooqinn
Member

Late LGTM

@LuciferYang
Contributor

late LGTM

turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…he#543)

Closes apache#47707 from ulysses-you/null-ordering.

Authored-by: ulysses-you <[email protected]>

(cherry picked from commit e3ba74b)

Signed-off-by: youxiduo <[email protected]>
Co-authored-by: ulysses-you <[email protected]>