Remove allFragments from PlanPrinter#formatFragment #14924

Merged
wenleix merged 1 commit into prestodb:master from wenleix:test_plan_printer on Jul 30, 2020

@wenleix wenleix commented Jul 29, 2020

#11268 added stats and cost to the plan output and, to do so, introduced allFragments to PlanPrinter#formatFragment, because FragmentedPlanStatsCalculator and FragmentedPlanCostCalculator need all fragments. It also changed the TypeProvider to use the variables of all fragments, which looks accidental; see the discussion in https://github.com/prestodb/presto/pull/11268/files#r462651639

Since #11511, fragment stats and cost are precomputed and stored in PlanFragment. As a result, PlanPrinter#formatFragment no longer needs allFragments.
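
To illustrate the before/after shape, here is a toy sketch. It is not the actual Presto code: Fragment, estimateRows, precompute and both formatFragment variants are hypothetical stand-ins, and the row arithmetic is invented only so that one fragment's estimate depends on other fragments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class FormatFragmentSketch {
    // Hypothetical, heavily simplified stand-in for PlanFragment.
    public static final class Fragment {
        public final int id;
        public final double localRows;            // rows produced by this fragment alone (made-up number)
        public final List<Integer> remoteSources; // ids of the fragments this one reads from
        public final Double storedRows;           // null until an estimate is stored on the fragment

        public Fragment(int id, double localRows, List<Integer> remoteSources, Double storedRows) {
            this.id = id;
            this.localRows = localRows;
            this.remoteSources = remoteSources;
            this.storedRows = storedRows;
        }
    }

    // Toy estimate: local rows plus the estimate of every fragment read from.
    // The point is only that it has to look at other fragments.
    public static double estimateRows(Fragment fragment, List<Fragment> allFragments) {
        double rows = fragment.localRows;
        for (Fragment other : allFragments) {
            if (fragment.remoteSources.contains(other.id)) {
                rows += estimateRows(other, allFragments);
            }
        }
        return rows;
    }

    // Before: estimates are computed while printing, so the printer needs every fragment.
    public static String formatFragmentBefore(Fragment fragment, List<Fragment> allFragments) {
        return String.format(Locale.ROOT, "Fragment %d {rows: %.0f}",
                fragment.id, estimateRows(fragment, allFragments));
    }

    // Planning-time step: compute each estimate once and store it on the fragment.
    public static List<Fragment> precompute(List<Fragment> allFragments) {
        List<Fragment> result = new ArrayList<>();
        for (Fragment f : allFragments) {
            result.add(new Fragment(f.id, f.localRows, f.remoteSources, estimateRows(f, allFragments)));
        }
        return result;
    }

    // After: the printer reads the stored estimate and needs only the fragment itself.
    public static String formatFragmentAfter(Fragment fragment) {
        return String.format(Locale.ROOT, "Fragment %d {rows: %.0f}", fragment.id, fragment.storedRows);
    }

    // Made-up chain: fragment 0 reads from 1, which reads from 2.
    public static List<Fragment> sampleFragments() {
        return Arrays.asList(
                new Fragment(0, 0, Collections.singletonList(1), null),
                new Fragment(1, 2000, Collections.singletonList(2), null),
                new Fragment(2, 1000, Collections.<Integer>emptyList(), null));
    }

    public static void main(String[] args) {
        for (Fragment fragment : precompute(sampleFragments())) {
            System.out.println(formatFragmentAfter(fragment));
        }
    }
}
```

In this sketch both variants print the same text for a given fragment; the only difference is that the "after" signature no longer takes the full fragment list.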

cc @rschlussel

Test plan - Travis passed. Also spot-checked a plan for EXPLAIN (TYPE DISTRIBUTED) with multiple fragments to confirm that the stats look correct.

== NO RELEASE NOTE ==

@wenleix wenleix force-pushed the test_plan_printer branch from ca3955a to f975f3f Compare July 30, 2020 00:22

wenleix commented Jul 30, 2020

Travis is green! @rschlussel : does that mean we can remove allFragments now? 😃

@wenleix wenleix changed the title from "[Test Only] Remove allFragments from PlanPrinter#formatFragment" to "Remove allFragments from PlanPrinter#formatFragment" Jul 30, 2020
rschlussel commented

Can you spot check a plan for EXPLAIN (TYPE DISTRIBUTED) with multiple fragments to see that the stats look correct? Otherwise looks good, thanks!


wenleix commented Jul 30, 2020

@rschlussel : Yeah, EXPLAIN (TYPE DISTRIBUTED) looks good:

-- HiveQueryRunner, hive.tpch

SELECT
    l.shipmode,
    sum(case
        when o.orderpriority = '1-URGENT'
            OR o.orderpriority = '2-HIGH'
            then 1
        else 0
    end) as high_line_count,
    sum(case
        when o.orderpriority <> '1-URGENT'
            AND o.orderpriority <> '2-HIGH'
            then 1
        else 0
    end) AS low_line_count
FROM
    orders o,
    lineitem l
WHERE
    o.orderkey = l.orderkey
    AND l.shipmode in ('MAIL', 'SHIP')
    AND l.commitdate < l.receiptdate
    AND l.shipdate < l.commitdate
    AND l.receiptdate >= date '1994-01-01'
    AND l.receiptdate < date '1994-01-01' + interval '1' year
GROUP BY
    l.shipmode
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Fragment 0 [SINGLE]
     Output layout: [shipmode, sum, sum_5]
     Output partitioning: SINGLE []
     Stage Execution Strategy: UNGROUPED_EXECUTION
     - Output[shipmode, high_line_count, low_line_count] => [shipmode:varchar(10), sum:bigint, sum_5:bigint]
             high_line_count := sum
             low_line_count := sum_5
         - RemoteSource[1] => [shipmode:varchar(10), sum:bigint, sum_5:bigint]

 Fragment 1 [HASH]
     Output layout: [shipmode, sum, sum_5]
     Output partitioning: SINGLE []
     Stage Execution Strategy: UNGROUPED_EXECUTION
     - Project[projectLocality = LOCAL] => [shipmode:varchar(10), sum:bigint, sum_5:bigint]
         - Aggregate(FINAL)[shipmode][$hashvalue] => [shipmode:varchar(10), $hashvalue:bigint, sum:bigint, sum_5:bigint]
                 sum := "presto.default.sum"((sum_18))
                 sum_5 := "presto.default.sum"((sum_19))
             - LocalExchange[HASH][$hashvalue] (shipmode) => [shipmode:varchar(10), sum_18:bigint, sum_19:bigint, $hashvalue:bigint]
                 - RemoteSource[2] => [shipmode:varchar(10), sum_18:bigint, sum_19:bigint, $hashvalue_20:bigint]

 Fragment 2 [HASH]
     Output layout: [shipmode, sum_18, sum_19, $hashvalue_26]
     Output partitioning: HASH [shipmode][$hashvalue_26]
     Stage Execution Strategy: UNGROUPED_EXECUTION
     - Aggregate(PARTIAL)[shipmode][$hashvalue_26] => [shipmode:varchar(10), $hashvalue_26:bigint, sum_18:bigint, sum_19:bigint]
             sum_18 := "presto.default.sum"((expr_3))
             sum_19 := "presto.default.sum"((expr_4))
         - Project[projectLocality = LOCAL] => [shipmode:varchar(10), expr_3:bigint, expr_4:bigint, $hashvalue_26:bigint]
                 Estimates: {rows: 2000 (70.86kB), cpu: 6092392.49, memory: 54569.17, network: 525757.17}
                 expr_3 := CAST(SWITCH(BOOLEAN true, WHEN(((orderpriority) = (VARCHAR(15) 1-URGENT)) OR ((orderpriority) = (VARCHAR(15) 2-HIGH)), INTEGER 1), INTEGER 0) AS bigint)
                 expr_4 := CAST(SWITCH(BOOLEAN true, WHEN(((orderpriority) <> (VARCHAR(15) 1-URGENT)) AND ((orderpriority) <> (VARCHAR(15) 2-HIGH)), INTEGER 1), INTEGER 0) AS bigint)
                 $hashvalue_26 := combine_hash(BIGINT 0, COALESCE($operator$hash_code(shipmode), BIGINT 0))
             - InnerJoin[("orderkey" = "orderkey_0")][$hashvalue_21, $hashvalue_23] => [orderpriority:varchar(15), shipmode:varchar(10)]
                     Estimates: {rows: 2000 (44.33kB), cpu: 6019826.62, memory: 54569.17, network: 525757.17}
                     Distribution: PARTITIONED
                 - RemoteSource[3] => [orderkey:bigint, orderpriority:varchar(15), $hashvalue_21:bigint]
                 - LocalExchange[HASH][$hashvalue_23] (orderkey_0) => [orderkey_0:bigint, shipmode:varchar(10), $hashvalue_23:bigint]
                         Estimates: {rows: 2000 (53.29kB), cpu: 4170109.52, memory: 0.00, network: 54569.17}
                     - RemoteSource[4] => [orderkey_0:bigint, shipmode:varchar(10), $hashvalue_24:bigint]

 Fragment 3 [SOURCE]
     Output layout: [orderkey, orderpriority, $hashvalue_22]
     Output partitioning: HASH [orderkey][$hashvalue_22]
     Stage Execution Strategy: UNGROUPED_EXECUTION
     - ScanProject[table = TableHandle {connectorId='hive', connectorHandle='HiveTableHandle{schemaName=tpch, tableName=orders, analyzePartitionValues=Optional.empty}', layout='Optional[tpch.
             Estimates: {rows: 15000 (460.14kB), cpu: 336188.00, memory: 0.00, network: 0.00}/{rows: 15000 (460.14kB), cpu: 807376.00, memory: 0.00, network: 0.00}
             $hashvalue_22 := combine_hash(BIGINT 0, COALESCE($operator$hash_code(orderkey), BIGINT 0))
             LAYOUT: tpch.orders{}
             orderpriority := orderpriority:varchar(15):5:REGULAR
             orderkey := orderkey:bigint:0:REGULAR
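
A spot check like this could also be done mechanically. The following is a minimal sketch under the assumption that fragment headers look like "Fragment N [...]" and estimate lines contain "Estimates:", as in the output above; PlanEstimateCounter and countEstimates are hypothetical names, not part of Presto. It only counts estimate lines per fragment and does not judge whether the numbers are right.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlanEstimateCounter {
    // Matches fragment header lines such as " Fragment 2 [HASH]".
    private static final Pattern FRAGMENT_HEADER = Pattern.compile("^\\s*Fragment (\\d+) \\[");

    // Returns fragment id -> number of "Estimates:" lines printed under that fragment.
    public static Map<Integer, Integer> countEstimates(String planText) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        Integer current = null;
        for (String line : planText.split("\n")) {
            Matcher header = FRAGMENT_HEADER.matcher(line);
            if (header.find()) {
                current = Integer.parseInt(header.group(1));
                counts.put(current, 0);
            } else if (current != null && line.contains("Estimates:")) {
                counts.merge(current, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Abbreviated excerpt of the plan text above: headers plus a few estimate lines.
        String excerpt = String.join("\n",
                " Fragment 1 [HASH]",
                "     Output layout: [shipmode, sum, sum_5]",
                " Fragment 2 [HASH]",
                "                 Estimates: {rows: 2000 (70.86kB), cpu: 6092392.49, memory: 54569.17, network: 525757.17}",
                "                     Estimates: {rows: 2000 (44.33kB), cpu: 6019826.62, memory: 54569.17, network: 525757.17}",
                " Fragment 3 [SOURCE]",
                "     Output layout: [orderkey, orderpriority, $hashvalue_22]");
        System.out.println(countEstimates(excerpt)); // prints {1=0, 2=2, 3=0}
    }
}
```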

@wenleix wenleix merged commit c53732e into prestodb:master Jul 30, 2020
@wenleix wenleix deleted the test_plan_printer branch July 30, 2020 15:36

wenleix commented Jul 30, 2020

My bad, I forgot to amend the commit message before merging (I only updated the PR title) :(
