
Conversation

@ulysses-you
Contributor

What changes were proposed in this pull request?

  • Add LogicalQueryStage(_, agg: BaseAggregateExec) check in AQEPropagateEmptyRelation
  • Add LeafNode check in PropagateEmptyRelationBase, so we can replace a LogicalQueryStage with a LocalRelation
  • Unify the applyFunc and commonApplyFunc in PropagateEmptyRelationBase

Why are the changes needed?

The Aggregate in AQE is different from the others: the LogicalQueryStage looks like LogicalQueryStage(Aggregate, BaseAggregate), so we need to handle this case specially.

Logically, if the Aggregate grouping expression is not empty, we can eliminate it safely.
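
For illustration only (this snippet is not part of the PR, and the object name is made up), a minimal sketch using the public DataFrame API shows why the grouping condition matters: a grouped aggregate over an empty input produces zero rows and can safely be replaced by an empty relation, while a global aggregate still produces one row and must not be eliminated.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.count

object EmptyAggregateDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("empty-agg-demo").getOrCreate()
    import spark.implicits._

    // an empty input relation shaped like the testData table used in the new tests
    val empty = Seq.empty[(Int, String)].toDF("key", "value")

    // grouped aggregate over an empty input: 0 rows, so propagating the empty relation is safe
    assert(empty.groupBy($"key").agg(count("*")).count() == 0)

    // global aggregate (no grouping expressions): still 1 row, so it must NOT be eliminated
    assert(empty.agg(count("*")).count() == 1)

    spark.stop()
  }
}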

Does this PR introduce any user-facing change?

no

How was this patch tested?

add new test in AdaptiveQueryExecSuite

  • Support propagate empty relation through aggregate
  • Support propagate empty relation through union

val outputAliases = outputs.map { case (newAttr, oldAttr) =>
  val newExplicitMetadata =
    if (oldAttr.metadata != newAttr.metadata) Some(oldAttr.metadata) else None
  Alias(newAttr, oldAttr.name)(explicitMetadata = newExplicitMetadata)
Contributor Author

A small bug fix; the previous code was
Alias(newAttr, oldAttr.name)(oldAttr.exprId, explicitMetadata = newExplicitMetadata)

The alias should not reuse the same expr id as the attribute it references, because the integrity check LogicalPlanIntegrity.checkIfExprIdsAreGloballyUnique requires expr ids to be globally unique. Dropping oldAttr.exprId lets Alias generate a fresh expr id by default.

A negative case:

SELECT key, count(*) FROM testData WHERE value = 'no_match' GROUP BY key
UNION ALL
SELECT key, 1 FROM testData

override protected def nonEmpty(plan: LogicalPlan): Boolean =
  super.nonEmpty(plan) || getRowCount(plan).exists(_ > 0)

private def getRowCount(plan: LogicalPlan): Option[BigInt] = plan match {
Contributor

I'd suggest renaming this to getEstimatedRowCount, with a comment saying that it can be over-estimated when the row count is not 0, to match the new change.

withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true") {
  val (plan1, adaptivePlan1) = runAdaptiveAndVerifyResult(
    "SELECT key, count(*) FROM testData WHERE value = 'no_match' GROUP BY key")
  assert(findTopLevelBaseAggregate(plan1).size == 2)
Contributor

nit: I think it's clearer to check assert(!plan1.isInstanceOf[LocalTableScanExec])

Contributor

And we can remove this now.

super.nonEmpty(plan) || getEstimatedRowCount(plan).exists(_ > 0)

private def getRowCount(plan: LogicalPlan): Option[BigInt] = plan match {
private def getEstimatedRowCount(plan: LogicalPlan): Option[BigInt] = plan match {
Contributor

let's document the assumptions: 0 means the plan must produce 0 rows; a positive value means an estimated row count, which can be over-estimated.
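
A hedged sketch of what the documented helper might look like after the rename; the doc comment states the assumptions above, and the body follows the shape of the getRowCount snippet quoted in this thread rather than the exact merged code:

/**
 * Returns an estimated row count derived from runtime stage statistics:
 *   - 0 means the plan must produce 0 rows;
 *   - a positive value is an estimate that can be over-estimated, so it can prove a plan
 *     is non-empty but must not be used as an exact cardinality;
 *   - None means the stage has not materialized or the count cannot be derived.
 */
private def getEstimatedRowCount(plan: LogicalPlan): Option[BigInt] = plan match {
  case LogicalQueryStage(_, stage: QueryStageExec) if stage.isMaterialized =>
    stage.getRuntimeStatistics.rowCount
  case _ => None
}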

@ulysses-you
Contributor Author

Thank you @cloud-fan, updated with:

  • added some comments to getEstimatedRowCount
  • removed the aggregate count check in the test

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan closed this in ee4c4e5 on Jan 10, 2022
@ulysses-you deleted the SPARK-35442-GA-SPARK branch on January 10, 2022 12:27
} else {
  val newExplicitMetadata =
    if (oldAttr.metadata != newAttr.metadata) Some(oldAttr.metadata) else None
  Alias(newAttr, oldAttr.name)(oldAttr.exprId, explicitMetadata = newExplicitMetadata)
Contributor

How do we bypass LogicalPlanIntegrity.checkIfExprIdsAreGloballyUnique, given that we introduce a conflicting attr id here?

Contributor Author

I check whether the exprId is the same: reuse the original attr if it is, and create a new alias if it is different.

if (newAttr.exprId == oldAttr.exprId) {
  newAttr
} else {
  val newExplicitMetadata =
    if (oldAttr.metadata != newAttr.metadata) Some(oldAttr.metadata) else None
  Alias(newAttr, oldAttr.name)(oldAttr.exprId, explicitMetadata = newExplicitMetadata)
}

Contributor Author

Actually the conflicting attr id was introduced by #29053; the original code is:

val outputAliases = outputs.map { case (newAttr, oldAttr) =>
  val newExplicitMetadata =
    if (oldAttr.metadata != newAttr.metadata) Some(oldAttr.metadata) else None
  Alias(newAttr, oldAttr.name)(oldAttr.exprId, explicitMetadata = newExplicitMetadata)

Contributor

hmm, why doesn't LogicalPlanIntegrity.checkIfExprIdsAreGloballyUnique capture this...

Contributor Author

In the current branch we only create an Alias if the old attr id is different from the new attr id, so it won't introduce a conflicting attr id.

For #29053, it seems no test covers this case, so it passed.

Contributor

I think it's easy to trigger this bug: Union(A, B) where A is empty and we return B instead. @ulysses-you can you help fix this bug? We have a common solution for it: QueryPlan.transformWithNewOutput
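
A hedged sketch of the suggested direction, assuming the rewrite lives inside a rule such as PropagateEmptyRelationBase where the isEmpty/nonEmpty helpers are in scope; it uses QueryPlan.transformUpWithNewOutput and is only an illustration, not the actual fix. The idea is to return the surviving child together with an old-to-new attribute mapping, so the framework rewrites references in parent operators instead of introducing duplicate expr ids through aliases:

plan.transformUpWithNewOutput {
  // hypothetical case: Union(A, B) where exactly one child survives empty-relation pruning
  case u @ Union(children, _, _) if children.exists(isEmpty) && children.count(nonEmpty) == 1 =>
    val survivor = children.find(nonEmpty).get
    // map the union's output attributes to the surviving child's attributes
    survivor -> u.output.zip(survivor.output)
}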

dchvn pushed a commit to dchvn/spark that referenced this pull request Jan 19, 2022
Closes apache#35149 from ulysses-you/SPARK-35442-GA-SPARK.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@LuciferYang
Contributor

LuciferYang commented Aug 30, 2022

@cloud-fan @ulysses-you Sorry to disturb you on a PR that has already been merged, but I found that this PR may have some negative effects.

I used Databricks spark-sql-perf + Spark 3.3 in spark-shell to run 3TB TPC-DS q24a/q24b; the test code is as follows:

val rootDir = "hdfs://${clusterName}/tpcds-data/POCGenData3T"
val databaseName = "tpcds_database"
val scaleFactor = "3072"
val format = "parquet"
import com.databricks.spark.sql.perf.tpcds.TPCDSTables
// just used to load the tables
val tables = new TPCDSTables(
  spark.sqlContext, dsdgenDir = "./tpcds-kit/tools",
  scaleFactor = scaleFactor,
  useDoubleForDecimal = false, useStringForDate = false)
spark.sql(s"create database $databaseName")
tables.createTemporaryTables(rootDir, format)
spark.sql(s"use $databaseName")
// TPCDS 24a
val result = spark.sql(""" with ssales as
 (select c_last_name, c_first_name, s_store_name, ca_state, s_state, i_color,
        i_current_price, i_manager_id, i_units, i_size, sum(ss_net_paid) netpaid
 from store_sales, store_returns, store, item, customer, customer_address
 where ss_ticket_number = sr_ticket_number
   and ss_item_sk = sr_item_sk
   and ss_customer_sk = c_customer_sk
   and ss_item_sk = i_item_sk
   and ss_store_sk = s_store_sk
   and c_birth_country = upper(ca_country)
   and s_zip = ca_zip
 and s_market_id = 8
 group by c_last_name, c_first_name, s_store_name, ca_state, s_state, i_color,
          i_current_price, i_manager_id, i_units, i_size)
 select c_last_name, c_first_name, s_store_name, sum(netpaid) paid
 from ssales
 where i_color = 'pale'
 group by c_last_name, c_first_name, s_store_name
 having sum(netpaid) > (select 0.05*avg(netpaid) from ssales)""").collect()
 sc.stop() 

The above test may fail with "Stage cancelled because SparkContext was shut down" on stage 31 and stage 36 when AQE is enabled (the stage screenshots and the DAG corresponding to the SQL are omitted here).

The plan details are as follows:

== Physical Plan ==
AdaptiveSparkPlan (42)
+- == Final Plan ==
   LocalTableScan (1)
+- == Initial Plan ==
   Filter (41)
   +- HashAggregate (40)
      +- Exchange (39)
         +- HashAggregate (38)
            +- HashAggregate (37)
               +- Exchange (36)
                  +- HashAggregate (35)
                     +- Project (34)
                        +- BroadcastHashJoin Inner BuildRight (33)
                           :- Project (29)
                           :  +- BroadcastHashJoin Inner BuildRight (28)
                           :     :- Project (24)
                           :     :  +- BroadcastHashJoin Inner BuildRight (23)
                           :     :     :- Project (19)
                           :     :     :  +- BroadcastHashJoin Inner BuildRight (18)
                           :     :     :     :- Project (13)
                           :     :     :     :  +- SortMergeJoin Inner (12)
                           :     :     :     :     :- Sort (6)
                           :     :     :     :     :  +- Exchange (5)
                           :     :     :     :     :     +- Project (4)
                           :     :     :     :     :        +- Filter (3)
                           :     :     :     :     :           +- Scan parquet  (2)
                           :     :     :     :     +- Sort (11)
                           :     :     :     :        +- Exchange (10)
                           :     :     :     :           +- Project (9)
                           :     :     :     :              +- Filter (8)
                           :     :     :     :                 +- Scan parquet  (7)
                           :     :     :     +- BroadcastExchange (17)
                           :     :     :        +- Project (16)
                           :     :     :           +- Filter (15)
                           :     :     :              +- Scan parquet  (14)
                           :     :     +- BroadcastExchange (22)
                           :     :        +- Filter (21)
                           :     :           +- Scan parquet  (20)
                           :     +- BroadcastExchange (27)
                           :        +- Filter (26)
                           :           +- Scan parquet  (25)
                           +- BroadcastExchange (32)
                              +- Filter (31)
                                 +- Scan parquet  (30)


(1) LocalTableScan
Output [4]: [c_last_name#421, c_first_name#420, s_store_name#669, paid#850]
Arguments: <empty>, [c_last_name#421, c_first_name#420, s_store_name#669, paid#850]

(2) Scan parquet 
Output [6]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149, ss_sold_date_sk#152]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/store_sales]
PushedFilters: [IsNotNull(ss_ticket_number), IsNotNull(ss_item_sk), IsNotNull(ss_store_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_item_sk:int,ss_customer_sk:int,ss_store_sk:int,ss_ticket_number:bigint,ss_net_paid:decimal(7,2)>

(3) Filter
Input [6]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149, ss_sold_date_sk#152]
Condition : (((isnotnull(ss_ticket_number#138L) AND isnotnull(ss_item_sk#131)) AND isnotnull(ss_store_sk#136)) AND isnotnull(ss_customer_sk#132))

(4) Project
Output [5]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149]
Input [6]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149, ss_sold_date_sk#152]

(5) Exchange
Input [5]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149]
Arguments: hashpartitioning(ss_ticket_number#138L, ss_item_sk#131, 300), ENSURE_REQUIREMENTS, [id=#309]

(6) Sort
Input [5]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149]
Arguments: [ss_ticket_number#138L ASC NULLS FIRST, ss_item_sk#131 ASC NULLS FIRST], false, 0

(7) Scan parquet 
Output [3]: [sr_item_sk#177, sr_ticket_number#184L, sr_returned_date_sk#195]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/store_returns]
PushedFilters: [IsNotNull(sr_ticket_number), IsNotNull(sr_item_sk)]
ReadSchema: struct<sr_item_sk:int,sr_ticket_number:bigint>

(8) Filter
Input [3]: [sr_item_sk#177, sr_ticket_number#184L, sr_returned_date_sk#195]
Condition : (isnotnull(sr_ticket_number#184L) AND isnotnull(sr_item_sk#177))

(9) Project
Output [2]: [sr_item_sk#177, sr_ticket_number#184L]
Input [3]: [sr_item_sk#177, sr_ticket_number#184L, sr_returned_date_sk#195]

(10) Exchange
Input [2]: [sr_item_sk#177, sr_ticket_number#184L]
Arguments: hashpartitioning(sr_ticket_number#184L, sr_item_sk#177, 300), ENSURE_REQUIREMENTS, [id=#310]

(11) Sort
Input [2]: [sr_item_sk#177, sr_ticket_number#184L]
Arguments: [sr_ticket_number#184L ASC NULLS FIRST, sr_item_sk#177 ASC NULLS FIRST], false, 0

(12) SortMergeJoin
Left keys [2]: [ss_ticket_number#138L, ss_item_sk#131]
Right keys [2]: [sr_ticket_number#184L, sr_item_sk#177]
Join condition: None

(13) Project
Output [4]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_net_paid#149]
Input [7]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_ticket_number#138L, ss_net_paid#149, sr_item_sk#177, sr_ticket_number#184L]

(14) Scan parquet 
Output [5]: [s_store_sk#664, s_store_name#669, s_market_id#674, s_state#688, s_zip#689]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/store]
PushedFilters: [IsNotNull(s_market_id), EqualTo(s_market_id,8), IsNotNull(s_store_sk), IsNotNull(s_zip)]
ReadSchema: struct<s_store_sk:int,s_store_name:string,s_market_id:int,s_state:string,s_zip:string>

(15) Filter
Input [5]: [s_store_sk#664, s_store_name#669, s_market_id#674, s_state#688, s_zip#689]
Condition : (((isnotnull(s_market_id#674) AND (s_market_id#674 = 8)) AND isnotnull(s_store_sk#664)) AND isnotnull(s_zip#689))

(16) Project
Output [4]: [s_store_sk#664, s_store_name#669, s_state#688, s_zip#689]
Input [5]: [s_store_sk#664, s_store_name#669, s_market_id#674, s_state#688, s_zip#689]

(17) BroadcastExchange
Input [4]: [s_store_sk#664, s_store_name#669, s_state#688, s_zip#689]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#316]

(18) BroadcastHashJoin
Left keys [1]: [ss_store_sk#136]
Right keys [1]: [s_store_sk#664]
Join condition: None

(19) Project
Output [6]: [ss_item_sk#131, ss_customer_sk#132, ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689]
Input [8]: [ss_item_sk#131, ss_customer_sk#132, ss_store_sk#136, ss_net_paid#149, s_store_sk#664, s_store_name#669, s_state#688, s_zip#689]

(20) Scan parquet 
Output [6]: [i_item_sk#564, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/item]
PushedFilters: [IsNotNull(i_color), EqualTo(i_color,pale), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_current_price:decimal(7,2),i_size:string,i_color:string,i_units:string,i_manager_id:int>

(21) Filter
Input [6]: [i_item_sk#564, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584]
Condition : ((isnotnull(i_color#581) AND (i_color#581 = pale)) AND isnotnull(i_item_sk#564))

(22) BroadcastExchange
Input [6]: [i_item_sk#564, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#320]

(23) BroadcastHashJoin
Left keys [1]: [ss_item_sk#131]
Right keys [1]: [i_item_sk#564]
Join condition: None

(24) Project
Output [10]: [ss_customer_sk#132, ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584]
Input [12]: [ss_item_sk#131, ss_customer_sk#132, ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689, i_item_sk#564, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584]

(25) Scan parquet 
Output [4]: [c_customer_sk#412, c_first_name#420, c_last_name#421, c_birth_country#426]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_birth_country)]
ReadSchema: struct<c_customer_sk:int,c_first_name:string,c_last_name:string,c_birth_country:string>

(26) Filter
Input [4]: [c_customer_sk#412, c_first_name#420, c_last_name#421, c_birth_country#426]
Condition : (isnotnull(c_customer_sk#412) AND isnotnull(c_birth_country#426))

(27) BroadcastExchange
Input [4]: [c_customer_sk#412, c_first_name#420, c_last_name#421, c_birth_country#426]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#324]

(28) BroadcastHashJoin
Left keys [1]: [ss_customer_sk#132]
Right keys [1]: [c_customer_sk#412]
Join condition: None

(29) Project
Output [12]: [ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584, c_first_name#420, c_last_name#421, c_birth_country#426]
Input [14]: [ss_customer_sk#132, ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584, c_customer_sk#412, c_first_name#420, c_last_name#421, c_birth_country#426]

(30) Scan parquet 
Output [3]: [ca_state#456, ca_zip#457, ca_country#458]
Batched: true
Location: InMemoryFileIndex [afs://tianqi.afs.baidu.com:9902/user/emr_spark_history/tpcds-data/POCGenData3T/customer_address]
PushedFilters: [IsNotNull(ca_country), IsNotNull(ca_zip)]
ReadSchema: struct<ca_state:string,ca_zip:string,ca_country:string>

(31) Filter
Input [3]: [ca_state#456, ca_zip#457, ca_country#458]
Condition : (isnotnull(ca_country#458) AND isnotnull(ca_zip#457))

(32) BroadcastExchange
Input [3]: [ca_state#456, ca_zip#457, ca_country#458]
Arguments: HashedRelationBroadcastMode(List(upper(input[2, string, false]), input[1, string, false]),false), [id=#328]

(33) BroadcastHashJoin
Left keys [2]: [c_birth_country#426, s_zip#689]
Right keys [2]: [upper(ca_country#458), ca_zip#457]
Join condition: None

(34) Project
Output [11]: [ss_net_paid#149, s_store_name#669, s_state#688, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584, c_first_name#420, c_last_name#421, ca_state#456]
Input [15]: [ss_net_paid#149, s_store_name#669, s_state#688, s_zip#689, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584, c_first_name#420, c_last_name#421, c_birth_country#426, ca_state#456, ca_zip#457, ca_country#458]

(35) HashAggregate
Input [11]: [ss_net_paid#149, s_store_name#669, s_state#688, i_current_price#569, i_size#579, i_color#581, i_units#582, i_manager_id#584, c_first_name#420, c_last_name#421, ca_state#456]
Keys [10]: [c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579]
Functions [1]: [partial_sum(UnscaledValue(ss_net_paid#149))]
Aggregate Attributes [1]: [sum#870L]
Results [11]: [c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579, sum#871L]

(36) Exchange
Input [11]: [c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579, sum#871L]
Arguments: hashpartitioning(c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579, 300), ENSURE_REQUIREMENTS, [id=#333]

(37) HashAggregate
Input [11]: [c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579, sum#871L]
Keys [10]: [c_last_name#421, c_first_name#420, s_store_name#669, ca_state#456, s_state#688, i_color#581, i_current_price#569, i_manager_id#584, i_units#582, i_size#579]
Functions [1]: [sum(UnscaledValue(ss_net_paid#149))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_net_paid#149))#853L]
Results [4]: [c_last_name#421, c_first_name#420, s_store_name#669, MakeDecimal(sum(UnscaledValue(ss_net_paid#149))#853L,17,2) AS netpaid#852]

(38) HashAggregate
Input [4]: [c_last_name#421, c_first_name#420, s_store_name#669, netpaid#852]
Keys [3]: [c_last_name#421, c_first_name#420, s_store_name#669]
Functions [1]: [partial_sum(netpaid#852)]
Aggregate Attributes [2]: [sum#866, isEmpty#867]
Results [5]: [c_last_name#421, c_first_name#420, s_store_name#669, sum#868, isEmpty#869]

(39) Exchange
Input [5]: [c_last_name#421, c_first_name#420, s_store_name#669, sum#868, isEmpty#869]
Arguments: hashpartitioning(c_last_name#421, c_first_name#420, s_store_name#669, 300), ENSURE_REQUIREMENTS, [id=#337]

(40) HashAggregate
Input [5]: [c_last_name#421, c_first_name#420, s_store_name#669, sum#868, isEmpty#869]
Keys [3]: [c_last_name#421, c_first_name#420, s_store_name#669]
Functions [1]: [sum(netpaid#852)]
Aggregate Attributes [1]: [sum(netpaid#852)#854]
Results [4]: [c_last_name#421, c_first_name#420, s_store_name#669, sum(netpaid#852)#854 AS paid#850]

(41) Filter
Input [4]: [c_last_name#421, c_first_name#420, s_store_name#669, paid#850]
Condition : (isnotnull(paid#850) AND (cast(paid#850 as decimal(33,8)) > cast(Subquery subquery#851, [id=#294] as decimal(33,8))))

(42) AdaptiveSparkPlan
Output [4]: [c_last_name#421, c_first_name#420, s_store_name#669, paid#850]
Arguments: isFinalPlan=true

Since I saw PlanChangeLogger entries for Applying Rule org.apache.spark.sql.execution.adaptive.AQEPropagateEmptyRelation and Result of Batch Propagate Empty Relations in the driver log, I tried reverting this PR and testing again, and found that the problem no longer exists.

I have already filed a JIRA: https://issues.apache.org/jira/browse/SPARK-40278

PS: I'm not sure whether master has already fixed this issue.

@ulysses-you
Contributor Author

Hi @LuciferYang, thank you for the test. This is a known issue: when a plan is converted to an empty relation in AQE once one side of a shuffle stage is materialized, some other stages may still be running. The plan is actually finished and the driver is unblocked, so if the driver exits, the running stages are cancelled, as you see. This does not affect the result; they are just unused running stages.

I have a related PR #34316 that tries to fix this issue, but it is not easy.

@LuciferYang
Contributor

@ulysses-you Do we have plans to continue #34316? Should we temporarily add a switch or some documentation for this behavior? It may leave users puzzled.

@cloud-fan
Contributor

So the problem is that we have confusing error messages in the log? The query itself is fine, right?

@LuciferYang
Contributor

LuciferYang commented Aug 30, 2022

No, the problem is that the SQL is classified as a failed query in the UI, and end users will question whether the query succeeded or failed.

@cloud-fan
Contributor

Hi @ulysses-you, do you have time to investigate why the query is marked as failed in the UI? This seems like something that is fixable.

@LuciferYang
Contributor

So the problem is that we have confusing error messages in the log? The query itself is fine, right?

@cloud-fan
I thought more about this; there may be another possible problem:

For the above demo program, if there are other jobs to be executed after spark.sql(sql).collect(), the new jobs cannot run with sufficient concurrency because the resources are not released by stages 31 and 36 in time.

That is to say, a completed SQL query can affect the running efficiency of the next one. This shows up when using Databricks spark-sql-perf to run the TPC-DS power test with all queries.

@cloud-fan
Contributor

the new jobs cannot run with sufficient concurrency because the resources are not released by stages 31 and 36 in time.

This is not worse than before. Without this optimization, the first query will run longer and wait for the jobs to complete. At least we allow the second query to kick off earlier now. It's a further improvement to cancel the unused running jobs.

@ulysses-you
Contributor Author

do you have time to investigate why the query is marked as failed in the UI? This seems like something that is fixable.

@cloud-fan In general, the unused stages will run until they finish, but if the driver stops after the plan has finished, the running stages will be cancelled. That causes the failed stages in the UI.

@LuciferYang
Contributor

LuciferYang commented Aug 31, 2022

So when the last statement of the app hits such a case, it is likely to show as failed in the UI?

@LuciferYang
Contributor

the new jobs cannot run with sufficient concurrency because the resources are not released by stages 31 and 36 in time.

This is not worse than before. Without this optimization, the first query will run longer and wait for the jobs to complete. At least we allow the second query to kick off earlier now. It's a further improvement to cancel the unused running jobs.

Makes sense.

@cloud-fan
Contributor

but if the driver stops after the plan has finished, the running stages will be cancelled.

do you mean "before" the plan finishes? Anyway, can we detect this case and not mark the query as failed? e.g. if the query is already finished, ignore canceled stage events that arrive later.
