
Conversation

@ulysses-you
Contributor

What changes were proposed in this pull request?

  • Add LogicalQueryStage(_, agg: BaseAggregateExec) check in AQEPropagateEmptyRelation
  • Add LeafNode check in PropagateEmptyRelationBase, so we can replace a LogicalQueryStage with a LocalRelation
  • Unify the applyFunc and commonApplyFunc in PropagateEmptyRelationBase

Why are the changes needed?

The Aggregate in AQE is different from the others: the LogicalQueryStage looks like LogicalQueryStage(Aggregate, BaseAggregate), so we need to handle this case specially.

Logically, if the Aggregate grouping expression is not empty, we can eliminate it safely.
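
For illustration only (this snippet is not part of the PR, and the object name is made up), a minimal sketch using the public DataFrame API shows why the grouping condition matters: a grouped aggregate over an empty input produces zero rows and can safely be replaced by an empty relation, while a global aggregate still produces one row and must not be eliminated.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.count

object EmptyAggregateDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("empty-agg-demo").getOrCreate()
    import spark.implicits._

    // an empty input relation shaped like the testData table used in the new tests
    val empty = Seq.empty[(Int, String)].toDF("key", "value")

    // grouped aggregate over an empty input: 0 rows, so propagating the empty relation is safe
    assert(empty.groupBy($"key").agg(count("*")).count() == 0)

    // global aggregate (no grouping expressions): still 1 row, so it must NOT be eliminated
    assert(empty.agg(count("*")).count() == 1)

    spark.stop()
  }
}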

Does this PR introduce any user-facing change?

no

How was this patch tested?

add new test in AdaptiveQueryExecSuite

  • Support propagate empty relation through aggregate
  • Support propagate empty relation through union

val outputAliases = outputs.map { case (newAttr, oldAttr) =>
  val newExplicitMetadata =
    if (oldAttr.metadata != newAttr.metadata) Some(oldAttr.metadata) else None
  Alias(newAttr, oldAttr.name)(explicitMetadata = newExplicitMetadata)
Contributor Author

A small bug fix; the previous code was
Alias(newAttr, oldAttr.name)(oldAttr.exprId, explicitMetadata = newExplicitMetadata)

The alias should not reuse the same expr id as the attribute it references, because the integrity check LogicalPlanIntegrity.checkIfExprIdsAreGloballyUnique requires expr ids to be globally unique. Dropping oldAttr.exprId lets Alias generate a fresh expr id by default.

A negative case:

SELECT key, count(*) FROM testData WHERE value = 'no_match' GROUP BY key
UNION ALL
SELECT key, 1 FROM testData

override protected def nonEmpty(plan: LogicalPlan): Boolean =
  super.nonEmpty(plan) || getRowCount(plan).exists(_ > 0)

private def getRowCount(plan: LogicalPlan): Option[BigInt] = plan match {
Contributor

I'd suggest renaming this to getEstimatedRowCount, with a comment saying that it can be over-estimated when the row count is not 0, to match the new change.

withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true") {
  val (plan1, adaptivePlan1) = runAdaptiveAndVerifyResult(
    "SELECT key, count(*) FROM testData WHERE value = 'no_match' GROUP BY key")
  assert(findTopLevelBaseAggregate(plan1).size == 2)
Contributor

nit: I think it's clearer to check assert(!plan1.isInstanceOf[LocalTableScanExec])

Contributor

And we can remove this now.

super.nonEmpty(plan) || getEstimatedRowCount(plan).exists(_ > 0)

private def getRowCount(plan: LogicalPlan): Option[BigInt] = plan match {
private def getEstimatedRowCount(plan: LogicalPlan): Option[BigInt] = plan match {
Contributor

let's document the assumptions: 0 means the plan must produce 0 rows; a positive value means an estimated row count, which can be over-estimated.
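
A hedged sketch of what the documented helper might look like after the rename; the doc comment states the assumptions above, and the body follows the shape of the getRowCount snippet quoted in this thread rather than the exact merged code:

/**
 * Returns an estimated row count derived from runtime stage statistics:
 *   - 0 means the plan must produce 0 rows;
 *   - a positive value is an estimate that can be over-estimated, so it can prove a plan
 *     is non-empty but must not be used as an exact cardinality;
 *   - None means the stage has not materialized or the count cannot be derived.
 */
private def getEstimatedRowCount(plan: LogicalPlan): Option[BigInt] = plan match {
  case LogicalQueryStage(_, stage: QueryStageExec) if stage.isMaterialized =>
    stage.getRuntimeStatistics.rowCount
  case _ => None
}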

@ulysses-you
Contributor Author

Thank you @cloud-fan, updated with:

  • added some comments to getEstimatedRowCount
  • removed the aggregate count check in the test

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan closed this in ee4c4e5 on Jan 10, 2022
@ulysses-you deleted the SPARK-35442-GA-SPARK branch on January 10, 2022 12:27
} else {
  val newExplicitMetadata =
    if (oldAttr.metadata != newAttr.metadata) Some(oldAttr.metadata) else None
  Alias(newAttr, oldAttr.name)(oldAttr.exprId, explicitMetadata = newExplicitMetadata)
Contributor

How do we bypass LogicalPlanIntegrity.checkIfExprIdsAreGloballyUnique, given that we introduce a conflicting attr id here?

Contributor Author

I check whether the exprId is the same: reuse the original attr if it is, and create a new alias if it is different.

if (newAttr.exprId == oldAttr.exprId) {
  newAttr
} else {
  val newExplicitMetadata =
    if (oldAttr.metadata != newAttr.metadata) Some(oldAttr.metadata) else None
  Alias(newAttr, oldAttr.name)(oldAttr.exprId, explicitMetadata = newExplicitMetadata)
}

Contributor Author

Actually the conflicting attr id was introduced by #29053; the original code is:

val outputAliases = outputs.map { case (newAttr, oldAttr) =>
  val newExplicitMetadata =
    if (oldAttr.metadata != newAttr.metadata) Some(oldAttr.metadata) else None
  Alias(newAttr, oldAttr.name)(oldAttr.exprId, explicitMetadata = newExplicitMetadata)

Contributor

hmm, why doesn't LogicalPlanIntegrity.checkIfExprIdsAreGloballyUnique capture this...

Contributor Author

In the current branch we only create an Alias if the old attr id is different from the new attr id, so it won't introduce a conflicting attr id.

For #29053, it seems no test covers this case, so it passed.

Contributor

I think it's easy to trigger this bug: Union(A, B) where A is empty and we return B instead. @ulysses-you can you help fix this bug? We have a common solution for it: QueryPlan.transformWithNewOutput
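
A hedged sketch of the suggested direction, assuming the rewrite lives inside a rule such as PropagateEmptyRelationBase where the isEmpty/nonEmpty helpers are in scope; it uses QueryPlan.transformUpWithNewOutput and is only an illustration, not the actual fix. The idea is to return the surviving child together with an old-to-new attribute mapping, so the framework rewrites references in parent operators instead of introducing duplicate expr ids through aliases:

plan.transformUpWithNewOutput {
  // hypothetical case: Union(A, B) where exactly one child survives empty-relation pruning
  case u @ Union(children, _, _) if children.exists(isEmpty) && children.count(nonEmpty) == 1 =>
    val survivor = children.find(nonEmpty).get
    // map the union's output attributes to the surviving child's attributes
    survivor -> u.output.zip(survivor.output)
}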

dchvn pushed a commit to dchvn/spark that referenced this pull request Jan 19, 2022
Closes apache#35149 from ulysses-you/SPARK-35442-GA-SPARK.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@LuciferYang
Contributor

LuciferYang commented Aug 30, 2022

@cloud-fan @ulysses-you Sorry to disturb you on a PR that has already been merged, but I found that this PR may have some negative effects.

I used Databricks spark-sql-perf + Spark 3.3 in spark-shell to run 3TB TPC-DS q24a/q24b; the test code is as follows:

val rootDir = "hdfs://${clusterName}/tpcds-data/POCGenData3T"
val databaseName = "tpcds_database"
val scaleFactor = "3072"
val format = "parquet"
import com.databricks.spark.sql.perf.tpcds.TPCDSTables
// just used to load the tables
val tables = new TPCDSTables(
  spark.sqlContext, dsdgenDir = "./tpcds-kit/tools",
  scaleFactor = scaleFactor,
  useDoubleForDecimal = false, useStringForDate = false)
spark.sql(s"create database $databaseName")
tables.createTemporaryTables(rootDir, format)
spark.sql(s"use $databaseName")
// TPCDS 24a
val result = spark.sql(""" with ssales as
 (select c_last_name, c_first_name, s_store_name, ca_state, s_state, i_color,
        i_current_price, i_manager_id, i_units, i_size, sum(ss_net_paid) netpaid
 from store_sales, store_returns, store, item, customer, customer_address
 where ss_ticket_number = sr_ticket_number
   and ss_item_sk = sr_item_sk
   and ss_customer_sk = c_customer_sk
   and ss_item_sk = i_item_sk
   and ss_store_sk = s_store_sk
   and c_birth_country = upper(ca_country)
   and s_zip = ca_zip
 and s_market_id = 8
 group by c_last_name, c_first_name, s_store_name, ca_state, s_state, i_color,
          i_current_price, i_manager_id, i_units, i_size)
 select c_last_name, c_first_name, s_store_name, sum(netpaid) paid
 from ssales
 where i_color = 'pale'
 group by c_last_name, c_first_name, s_store_name
 having sum(netpaid) > (select 0.05*avg(netpaid) from ssales)""").collect()
 sc.stop() 

The above test may fail with "Stage cancelled because SparkContext was shut down" on stage 31 and stage 36 when AQE is enabled (the stage screenshots and the DAG corresponding to the SQL are omitted here).

The plan details are as follows:

== Physical Plan ==
AdaptiveSparkPlan (42)
+- == Final Plan ==
   LocalTableScan (1)
+- == Initial Plan ==
   Filter (41)
   +- HashAggregate (40)
      +- Exchange (39)
         +- HashAggregate (38)
            +- HashAggregate (37)
               +- Exchange (36)
                  +- HashAggregate (35)
                     +- Project (34)
                        +- BroadcastHashJoin Inner BuildRight (33)
                           :- Project (29)
                           :  +- BroadcastHashJoin Inner BuildRight (28)
                           :     :- Project (24)
                           :     :  +- BroadcastHashJoin Inner BuildRight (23)
                           :     :     :- Project (19)
                           :     :     :  +- BroadcastHashJoin Inner BuildRight (18)
                           :     :     :     :- Project (13)
                           :     :     :     :  +- SortMergeJoin Inner (12)
                           :     :     :     :     :- Sort (6)
                           :     :     :     :     :  +- Exchange (5)
                           :     :     :     :     :     +- Project (4)
                           :     :     :     :     :        +- Filter (3)
                           :     :     :     :     :           +- Scan parquet  (2)
                           :     :     :     :     +- Sort (11)
                           :     :     :     :        +- Exchange (10)
                           :     :     :     :           +- Project (9)
                           :     :     :     :              +- Filter (8)
                           :     :     :     :                 +- Scan parquet  (7)
                           :     :     :     +- BroadcastExchange (17)
                           :     :     :        +- Project (16)
                           :     :     :           +- Filter (15)
                           :     :     :              +- Scan parquet  (14)
                           :     :     +- BroadcastExchange (22)
                           :     :        +- Filter (21)
                           :     :           +- Scan parquet  (20)
                           :     +- BroadcastExchange (27)
                           :        +- Filter (26)
                           :           +- Scan parquet  (25)
                           +- BroadcastExchange (32)
                              +- Filter (31)
                                 +- Scan parquet  (30)


(1) LocalTableScan
Output [4]: [c_last_name#421, c_first_name#420, s_store_name#669, paid#850]
Arguments: <empty>, [c_last_name#421, c_first_name#420, s_store_name#669, paid#850]

(2) Scan parquet 
Output [6]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149, ss_sold_date_sk#152]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/store_sales]
PushedFilters: [IsNotNull(ss_ticket_number), IsNotNull(ss_item_sk), IsNotNull(ss_store_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_item_sk:int,ss_customer_sk:int,ss_store_sk:int,ss_ticket_number:bigint,ss_net_paid:decimal(7,2)>

(3) Filter
Input [6]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149, ss_sold_date_sk#152]
Condition : (((isnotnull(ss_ticket_number#138L) AND isnotnull(ss_item_sk#131)) AND isnotnull(ss_store_sk#136)) AND isnotnull(ss_customer_sk#132))

(4) Project
Output [5]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149]
Input [6]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149, ss_sold_date_sk#152]

(5) Exchange
Input [5]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149]
Arguments: hashpartitioning(ss_ticket_number#138L, ss_item_sk#131, 300), ENSURE_REQUIREMENTS, [id=#309]

(6) Sort
Input [5]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149]
Arguments: [ss_ticket_number#138L ASC NULLS FIRST, ss_item_sk#131 ASC NULLS FIRST], false, 0

(7) Scan parquet 
Output [3]: [sr_item_sk#177, sr_ticket_number#184L, sr_returned_date_sk#195]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/store_returns]
PushedFilters: [IsNotNull(sr_ticket_number), IsNotNull(sr_item_sk)]
ReadSchema: struct<sr_item_sk:int,sr_ticket_number:bigint>

(8) Filter
Input [3]: [sr_item_sk#177, sr_ticket_number#184L, sr_returned_date_sk#195]
Condition : (isnotnull(sr_ticket_number#184L) AND isnotnull(sr_item_sk#177))

(9) Project
Output [2]: [sr_item_sk#177, sr_ticket_number#184L]
Input [3]: [sr_item_sk#177, sr_ticket_number#184L, sr_returned_date_sk#195]

(10) Exchange
Input [2]: [sr_item_sk#177, sr_ticket_number#184L]
Arguments: hashpartitioning(sr_ticket_number#184L, sr_item_sk#177, 300), ENSURE_REQUIREMENTS, [id=#310]

(11) Sort
Input [2]: [sr_item_sk#177, sr_ticket_number#184L]
Arguments: [sr_ticket_number#184L ASC NULLS FIRST, sr_item_sk#177 ASC NULLS FIRST], false, 0

(12) SortMergeJoin
Left keys [2]: [ss_ticket_number#138L, ss_item_sk#131]
Right keys [2]: [sr_ticket_number#184L, sr_item_sk#177]
Join condition: None

(13) Project
Output [4]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_net_paid#149]
Input [7]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149, sr_item_sk#177, sr_ticket_number#184L]

(14) Scan parquet 
Output [5]: [s_store_sk#664, s_store_name#669, s_market_id#674, s_state#688, s_zip#689]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/store]
PushedFilters: [IsNotNull(s_market_id), EqualTo(s_market_id,8), IsNotNull(s_store_sk), IsNotNull(s_zip)]
ReadSchema: struct<s_store_sk:int,s_store_name:string,s_market_id:int,s_state:string,s_zip:string>

(15) Filter
Input [5]: [s_store_sk#664, s_store_name#669, s_market_id#674, s_state#688, s_zip#689]
Condition : (((isnotnull(s_market_id#674) AND (s_market_id#674 = 8)) AND isnotnull(s_store_sk#664)) AND isnotnull(s_zip#689))

(16) Project
Output [4]: [s_store_sk#664, s_store_name#669, s_state#688, s_zip#689]
Input [5]: [s_store_sk#664, s_store_name#669, s_market_id#674, s_state#688, s_zip#689]

(17) BroadcastExchange
Input [4]: [s_store_sk#664, s_store_name#669, s_state#688, s_zip#689]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#316]

(18) BroadcastHashJoin
Left keys [1]: [ss_store_sk#136]
Right keys [1]: [s_store_sk#664]
Join condition: None

(19) Project
Output [6]: [ss_item_sk#131, ss_customer_sk#132, ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689]
Input [8]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_net_paid#149, s_store_sk#664, s_store_name#669, s_state#688, s_zip#689]

(20) Scan parquet 
Output [6]: [i_item_sk#564, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/item]
PushedFilters: [IsNotNull(i_color), EqualTo(i_color,pale), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_current_price:decimal(7,2),i_size:string,i_color:string,i_units:string,i_manager_id:int>

(21) Filter
Input [6]: [i_item_sk#564, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584]
Condition : ((isnotnull(i_color#581) AND (i_color#581 = pale)) AND isnotnull(i_item_sk#564))

(22) BroadcastExchange
Input [6]: [i_item_sk#564, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#320]

(23) BroadcastHashJoin
Left keys [1]: [ss_item_sk#131]
Right keys [1]: [i_item_sk#564]
Join condition: None

(24) Project
Output [10]: [ss_customer_sk#132, ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584]
Input [12]: [ss_item_sk#131, ss_customer_sk#132, ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689, i_item_sk#564, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584]

(25) Scan parquet 
Output [4]: [c_customer_sk#412, c_first_name#420, c_last_name#421, c_birth_country#426]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_birth_country)]
ReadSchema: struct<c_customer_sk:int,c_first_name:string,c_last_name:string,c_birth_country:string>

(26) Filter
Input [4]: [c_customer_sk#412, c_first_name#420, c_last_name#421, c_birth_country#426]
Condition : (isnotnull(c_customer_sk#412) AND isnotnull(c_birth_country#426))

(27) BroadcastExchange
Input [4]: [c_customer_sk#412, c_first_name#420, c_last_name#421, c_birth_country#426]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#324]

(28) BroadcastHashJoin
Left keys [1]: [ss_customer_sk#132]
Right keys [1]: [c_customer_sk#412]
Join condition: None

(29) Project
Output [12]: [ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584, c_first_name#420, c_last_name#421, c_birth_country#426]
Input [14]: [ss_customer_sk#132, ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584, c_customer_sk#412, c_first_name#420, c_last_name#421, c_birth_country#426]

(30) Scan parquet 
Output [3]: [ca_state#456, ca_zip#457, ca_country#458]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/customer_address]
PushedFilters: [IsNotNull(ca_country), IsNotNull(ca_zip)]
ReadSchema: struct<ca_state:string,ca_zip:string,ca_country:string>

(31) Filter
Input [3]: [ca_state#456, ca_zip#457, ca_country#458]
Condition : (isnotnull(ca_country#458) AND isnotnull(ca_zip#457))

(32) BroadcastExchange
Input [3]: [ca_state#456, ca_zip#457, ca_country#458]
Arguments: HashedRelationBroadcastMode(List(upper(input[2, string, false]), input[1, string, false]),false), [id=#328]

(33) BroadcastHashJoin
Left keys [2]: [c_birth_country#426, s_zip#689]
Right keys [2]: [upper(ca_country#458), ca_zip#457]
Join condition: None

(34) Project
Output [11]: [ss_net_paid#149, s_store_name#669, s_state#688, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584, c_first_name#420, c_last_name#421, ca_state#456]
Input [15]: [ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584, c_first_name#420, c_last_name#421, c_birth_country#426, ca_state#456, ca_zip#457, ca_country#458]

(35) HashAggregate
Input [11]: [ss_net_paid#149, s_store_name#669, s_state#688, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584, c_first_name#420, c_last_name#421, ca_state#456]
Keys [10]: [c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579]
Functions [1]: [partial_sum(UnscaledValue(ss_net_paid#149))]
Aggregate Attributes [1]: [sum#870L]
Results [11]: [c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579, sum#871L]

(36) Exchange
Input [11]: [c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579, sum#871L]
Arguments: hashpartitioning(c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579, 300), ENSURE_REQUIREMENTS, [id=#333]

(37) HashAggregate
Input [11]: [c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579, sum#871L]
Keys [10]: [c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579]
Functions [1]: [sum(UnscaledValue(ss_net_paid#149))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_net_paid#149))#853L]
Results [4]: [c_last_name#421, c_first_name#420, s_store_name#669, MakeDecimal(sum(UnscaledValue(ss_net_paid#149))#853L,17,2) AS netpaid#852]

(38) HashAggregate
Input [4]: [c_last_name#421, c_first_name#420, s_store_name#669, netpaid#852]
Keys [3]: [c_last_name#421, c_first_name#420, s_store_name#669]
Functions [1]: [partial_sum(netpaid#852)]
Aggregate Attributes [2]: [sum#866, isEmpty#867]
Results [5]: [c_last_name#421, c_first_name#420, s_store_name#669, sum#868, isEmpty#869]

(39) Exchange
Input [5]: [c_last_name#421, c_first_name#420, s_store_name#669, sum#868, isEmpty#869]
Arguments: hashpartitioning(c_last_name#421, c_first_name#420, s_store_name#669, 300), ENSURE_REQUIREMENTS, [id=#337]

(40) HashAggregate
Input [5]: [c_last_name#421, c_first_name#420, s_store_name#669, sum#868, isEmpty#869]
Keys [3]: [c_last_name#421, c_first_name#420, s_store_name#669]
Functions [1]: [sum(netpaid#852)]
Aggregate Attributes [1]: [sum(netpaid#852)#854]
Results [4]: [c_last_name#421, c_first_name#420, s_store_name#669, sum(netpaid#852)#854 AS paid#850]

(41) Filter
Input [4]: [c_last_name#421, c_first_name#420, s_store_name#669, paid#850]
Condition : (isnotnull(paid#850) AND (cast(paid#850 as decimal(33,8)) > cast(Subquery subquery#851, [id=#294] as decimal(33,8))))

(42) AdaptiveSparkPlan
Output [4]: [c_last_name#421, c_first_name#420, s_store_name#669, paid#850]
Arguments: isFinalPlan=true

Since I saw PlanChangeLogger entries for Applying Rule org.apache.spark.sql.execution.adaptive.AQEPropagateEmptyRelation and Result of Batch Propagate Empty Relations in the driver log, I tried reverting this PR and testing again, and found that the problem no longer exists.

I have already filed a JIRA: https://issues.apache.org/jira/browse/SPARK-40278

PS: I'm not sure whether master has already fixed this issue.

@ulysses-you
Contributor Author

Hi @LuciferYang, thank you for the test. This is a known issue: when a plan is converted to an empty relation in AQE once one side of a shuffle stage is materialized, some other stages may still be running. The plan is actually finished and the driver is unblocked, so if the driver exits, the running stages are cancelled, as you see. This does not affect the result; they are just unused running stages.

I have a related PR #34316 that tries to fix this issue, but it is not easy.

@LuciferYang
Contributor

@ulysses-you Do we have plans to continue #34316? Should we temporarily add a switch or some documentation for this behavior? It may leave users puzzled.

@cloud-fan
Contributor

So the problem is that we have confusing error messages in the log? The query itself is fine, right?

@LuciferYang
Contributor

LuciferYang commented Aug 30, 2022

No, the problem is that the SQL is classified as a failed query in the UI, and end users will question whether the query succeeded or failed.

@cloud-fan
Contributor

Hi @ulysses-you, do you have time to investigate why the query is marked as failed in the UI? This seems like something that is fixable.

@LuciferYang
Contributor

So the problem is that we have confusing error messages in the log? The query itself is fine, right?

@cloud-fan
I thought more about this; there may be another possible problem:

For the above demo program, if there are other jobs to be executed after spark.sql(sql).collect(), the new jobs cannot run with sufficient concurrency because the resources are not released by stages 31 and 36 in time.

That is to say, a completed SQL query can affect the running efficiency of the next one. This shows up when using Databricks spark-sql-perf to run the TPC-DS power test with all queries.

@cloud-fan
Contributor

the new jobs cannot run with sufficient concurrency because the resources are not released by stages 31 and 36 in time.

This is not worse than before. Without this optimization, the first query will run longer and wait for the jobs to complete. At least we allow the second query to kick off earlier now. It's a further improvement to cancel the unused running jobs.

@ulysses-you
Contributor Author

do you have time to investigate why the query is marked as failed in the UI? This seems like something that is fixable.

@cloud-fan In general, the unused stages will run until they finish, but if the driver stops after the plan has finished, the running stages will be cancelled. That causes the failed stages in the UI.

@LuciferYang
Contributor

LuciferYang commented Aug 31, 2022

So when the last statement of the app hits such a case, it is likely to show as failed in the UI?

@LuciferYang
Contributor

the new jobs cannot run with sufficient concurrency because the resources are not released by stages 31 and 36 in time.

This is not worse than before. Without this optimization, the first query will run longer and wait for the jobs to complete. At least we allow the second query to kick off earlier now. It's a further improvement to cancel the unused running jobs.

Makes sense.

@cloud-fan
Contributor

but if the driver stops after the plan has finished, the running stages will be cancelled.

do you mean "before" the plan finishes? Anyway, can we detect this case and not mark the query as failed? e.g. if the query is already finished, ignore canceled stage events that arrive later.
