Use precomputed stats/cost estimates in distributed explain plans #11511

Merged
rschlussel merged 6 commits into prestodb:master from rschlussel:explain-stats on Oct 3, 2018
@rschlussel
Contributor

No description provided.

Member

@arhimondr left a comment

Made a first pass. Let me see it again once you address the comments.

Member

this.symbolStatistics = HashTreePMap.from(requireNonNull(symbolStatistics, "symbolStatistics is null"));

Member

inline

Contributor Author

I'm not going to inline because I want to follow the style of the rest of the class.

Member

This style is more verbose and more error-prone. You can fix it in a separate commit, and then have your fields inlined. But that's up to you.
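To illustrate the reviewer's point, here is a minimal sketch of the inlined style. `PlanHolderSketch` and its fields are hypothetical and not taken from the PR; only `Objects.requireNonNull` is a real JDK call.

```java
import static java.util.Objects.requireNonNull;

// Hypothetical class, for illustration only.
final class PlanHolderSketch
{
    private final String root;
    private final String types;

    PlanHolderSketch(String root, String types)
    {
        // Inlined style: the null check and the assignment are a single
        // statement, so a field cannot be assigned without being checked.
        // The non-inlined style would call requireNonNull on its own line
        // and assign the field separately further down.
        this.root = requireNonNull(root, "root is null");
        this.types = requireNonNull(types, "types is null");
    }

    String getRoot()
    {
        return root;
    }

    String getTypes()
    {
        return types;
    }
}
```

With the checks on separate lines, adding a new field means remembering two edits; inlined, forgetting the check is harder because the assignment line itself carries it.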

Member

inline

Contributor Author

ditto

Member

Inline createPlan, or move it here.

Member

Inline all the requireNonNulls in a separate commit

Member

Remove functionRegistry and statsCalculator

Contributor Author

function registry ultimately gets used in PlanPrinter.formatFragment

Member

Why this change? Maybe extract it to a separate commit with an explanation?

Contributor Author

unintentional. I'll remove

Member

Pass the stats and costs maps instead

Member

Ultimately we don't want to have any references to StatsProvider or CostProvider in this class.

Member

All the IS_NULL_OVERHEAD multiplications seem to deserve a separate commit

@arhimondr
Member

I love this change

Contributor

or AstUtils.preOrder(node).forEach

Contributor

the (Plan's) constructor is not the best place to do this, especially since calculating stats may involve a lot, including network communication

Contributor

we use just stats very often (predominantly? i don't know)

Contributor

I don't quite like this being part of the plan -- would it be possible to keep it separate and pass it further down separately?

Member

Yeah, we can have a separate class StatsAndCosts that contains both stats and costs maps, and we can pass it all around as a single object. I think it will be even nicer.
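A holder along these lines might look roughly like the sketch below. It is not the class from the PR: the name `StatsAndCostsSketch`, the `String` node ids, and the `Double` estimates are simplifications assumed for illustration, and the JDK's unmodifiable map wrappers stand in for whatever immutable collections the real code uses.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import static java.util.Objects.requireNonNull;

// Hypothetical simplified holder; ids and estimates are placeholders.
final class StatsAndCostsSketch
{
    private static final StatsAndCostsSketch EMPTY =
            new StatsAndCostsSketch(Collections.emptyMap(), Collections.emptyMap());

    private final Map<String, Double> stats; // node id -> estimated output rows
    private final Map<String, Double> costs; // node id -> estimated CPU cost

    StatsAndCostsSketch(Map<String, Double> stats, Map<String, Double> costs)
    {
        // Defensive copies so later changes to the caller's maps are not visible here.
        this.stats = Collections.unmodifiableMap(new HashMap<>(requireNonNull(stats, "stats is null")));
        this.costs = Collections.unmodifiableMap(new HashMap<>(requireNonNull(costs, "costs is null")));
    }

    // Shared instance for call sites that have no estimates to pass.
    static StatsAndCostsSketch empty()
    {
        return EMPTY;
    }

    Map<String, Double> getStats()
    {
        return stats;
    }

    Map<String, Double> getCosts()
    {
        return costs;
    }
}
```

Bundling both maps in one object means a single parameter travels through the planner and printer instead of two, and `empty()` gives tests and call sites without estimates a ready-made value.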

@rschlussel force-pushed the explain-stats branch 2 times, most recently from 4e94be0 to e5f1b46, on September 21, 2018 17:32
Contributor Author

@rschlussel left a comment

updated

Contributor Author

function registry ultimately gets used in PlanPrinter.formatFragment

Contributor Author

true. it's there for future support of groupid nodes, but seeing as there aren't any immediate plans to expand that out, I think it's fair to add it later when it's needed. I'll remove.

Contributor Author

good catch, thanks

Contributor Author

I see both, and I thought we prefer things that are in Java rather than guava when they're available. but happy to switch it.

Contributor Author

we use them interchangeably throughout the code. I think they are pretty equivalent as names.

Contributor Author

moved to StatsAndCosts.Builder

Member

@arhimondr left a comment

Make stats/cost estimates serializable

LGTM % nit

Member

nit: Just do this(outputRowCount, HashTreePMap.from(requireNonNull(symbolStatistics, "symbolStatistics is null"))) to avoid copy-pasting the argument checks
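The delegation pattern in this nit can be sketched with JDK types only. `EstimateSketch` is a hypothetical stand-in, `TreeMap` takes the place of the persistent map used in the PR, and the negative-count check is an invented example of an argument check worth keeping in one place.

```java
import java.util.Map;
import java.util.TreeMap;

import static java.util.Objects.requireNonNull;

// Hypothetical stand-in for a stats estimate class; not the PR's actual code.
final class EstimateSketch
{
    private final double outputRowCount;
    private final TreeMap<String, Double> symbolStatistics;

    // Convenience constructor: converts the map, then delegates with this(...),
    // so the remaining argument checks live only in the constructor below.
    EstimateSketch(double outputRowCount, Map<String, Double> symbolStatistics)
    {
        this(outputRowCount, new TreeMap<String, Double>(requireNonNull(symbolStatistics, "symbolStatistics is null")));
    }

    // Canonical constructor: the single place that validates and assigns fields.
    private EstimateSketch(double outputRowCount, TreeMap<String, Double> symbolStatistics)
    {
        if (outputRowCount < 0) {
            throw new IllegalArgumentException("outputRowCount is negative");
        }
        this.outputRowCount = outputRowCount;
        this.symbolStatistics = requireNonNull(symbolStatistics, "symbolStatistics is null");
    }

    double getOutputRowCount()
    {
        return outputRowCount;
    }

    Map<String, Double> getSymbolStatistics()
    {
        return symbolStatistics;
    }
}
```

If a new check is added later, it goes into the canonical constructor once and every other constructor picks it up through the `this(...)` call.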

Member

@arhimondr left a comment

Pass pre-computed stats/costs through to plan fragments

Some comments

Member

ImmutableMap.copyOf()

Member

ImmutableMap.copyOf()

Member

No need to store UNKNOWN stats.

Member

No need to store UNKNOWN costs

Member

nit: Move getters after the constructor

Member

Replace it with StatsAndCosts.empty()

Member

Replace it with StatsAndCosts.empty()

Member

Replace it with StatsAndCosts.empty()

Member

Once you remove the statsAndCosts field from the Plan you can simply pass StatsAndCosts.empty() here. Also don't forget to remove CostProvider costProvider = node -> UNKNOWN_COST;, as it will no longer be needed.

Contributor Author

I'll leave stats and cost with the plan, but the costProvider is already not needed. I took it out, but somehow it got added back while I was resolving conflicts during rebase. I'll remove :).

Member

StatsAndCosts.empty()

Member

ping

Member

@arhimondr left a comment

Use precomputed stats/costs in distributed explains

Some comments

Member

It is just unused. And it might be unused for a very long time.

Member

statsAndCosts?

Member

If you decide to change the name, don't forget to fix the message

Contributor Author

I'm trying to distinguish it from the execution stats.

Member

StatsAndCosts.empty()

Member

ping

Member

I would refactor this part to be like this

private void printPlanNodesStatsAndCost(int indent, PlanNode... nodes)
{
    List<String> statsAndCosts = Arrays.stream(nodes)
            .map(this::formatPlanNodeStatsAndCost)
            .filter(Optional::isPresent)
            .map(Optional::get)
            .collect(toImmutableList());

    if (!statsAndCosts.isEmpty()) {
        print(indent, "Cost: %s", Joiner.on("/").join(statsAndCosts));
    }
}

private Optional<String> formatPlanNodeStatsAndCost(PlanNode node)
{
    PlanNodeStatsEstimate stats = estimatedStatsAndCosts.getStats().get(node.getId());
    PlanNodeCostEstimate cost = estimatedStatsAndCosts.getCosts().get(node.getId());
    if (stats == null || cost == null) {
        return Optional.empty();
    }
    return Optional.of(String.format("{rows: %s (%s), cpu: %s, memory: %s, network: %s}",
            formatAsLong(stats.getOutputRowCount()),
            formatEstimateAsDataSize(stats.getOutputSizeInBytes(node.getOutputSymbols(), types)),
            formatDouble(cost.getCpuCost()),
            formatDouble(cost.getMemoryCost()),
            formatDouble(cost.getNetworkCost())));
}

Also consider not printing stats at all if the stats for some node are missing. If, for example, for ScanFilterAndProject, the stats are missing for Filter, the output may be confusing.

Contributor Author

I like that. I'm doing that with a slight modification that I'll still check for unknown because it's still possible for the Stats/Cost maps to contain UNKNOWN_STATS and I don't think we gain anything from enforcing that in StatsAndCosts itself.
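One way to combine the two ideas in this thread (skip the whole line when any node lacks an estimate, and treat an unknown estimate like a missing one) is sketched below. This is an assumption about how the pieces could fit together, not the code that was merged: `CostLineSketch`, the `String` node ids, and the use of `NaN` as the "unknown" marker are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; NaN stands in for an UNKNOWN estimate.
final class CostLineSketch
{
    private CostLineSketch() {}

    // Formats one node, or returns empty if its estimate is missing or unknown.
    static Optional<String> formatNode(Map<String, Double> rows, Map<String, Double> cpu, String nodeId)
    {
        Double r = rows.get(nodeId);
        Double c = cpu.get(nodeId);
        if (r == null || c == null || r.isNaN() || c.isNaN()) {
            return Optional.empty();
        }
        return Optional.of(String.format(Locale.ROOT, "{rows: %.0f, cpu: %.1f}", r, c));
    }

    // All-or-nothing: if any node cannot be formatted, no cost line is produced.
    static Optional<String> formatCostLine(Map<String, Double> rows, Map<String, Double> cpu, List<String> nodeIds)
    {
        List<String> parts = new ArrayList<>();
        for (String nodeId : nodeIds) {
            Optional<String> part = formatNode(rows, cpu, nodeId);
            if (!part.isPresent()) {
                return Optional.empty();
            }
            parts.add(part.get());
        }
        if (parts.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of("Cost: " + String.join("/", parts));
    }
}
```

For a combined node such as scan plus filter, this prints either both estimates joined by `/` or nothing, which avoids a line where the reader cannot tell which estimate belongs to which node.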

Member

Inline

Member

ImmutableMap.copyOf()

Member

StatsAndCosts.empty()

Member

@arhimondr left a comment

Inline requireNonNull in Plan and LogicalPlanner

LGTM. Thanks for doing this.

Member

@arhimondr left a comment

Remove lookup from CostCalculator interface

LGTM, thanks for doing that

Contributor

Why is this segmented per fragment? The costs are per node, so they should be associated with the overall query plan.

Member

@arhimondr, Sep 21, 2018

We had to attach the stats to the Fragment, so it is easier to pass it to the QueryMonitor and PlanPrinter.

Otherwise you need to toss the stats around across many classes: SqlQueryExecution#analyzeQuery -> PlanRoot -> SqlQueryExecution#planDistribution -> SqlQueryScheduler -> StageInfo -> QueryStateMachine -> QueryInfo -> QueryMonitor

It can be simplified somewhat by taking the stats in SqlQueryExecution#buildQueryInfo from SqlQueryExecution#queryPlan and storing them directly in the QueryInfo. But then the stats object has to be tossed around in the PlanPrinter.

Contributor Author

they are associated with the overall plan, but once you have a list of fragments, you need to still be able to get the stats and costs

Member

I talked to @martint offline, and we agreed that storing StatsAndCosts in PlanFragment is good for now.

Contributor

Why do these need to be json serializable? At what point do they leave the coordinator?

Member

It was done just so it can be included into the PlanFragment.

Contributor

Ok, so see my other comment about why these are being attached to the fragments vs the overall plan (since they are not a per-fragment entity)

Member

Even if we decided not to attach this to PlanFragment, it still has to be serializable, since it has to be passed to the QueryMonitor as part of the QueryInfo object, which is serializable.

Contributor Author

they don't, but fragments get serialized and deserialized somehow before explain. Honestly, I didn't look into why that happens, I just saw the serialization errors when I tried running EXPLAIN (TYPE DISTRIBUTED) and fixed them.

Member

nit: add a space before to separate statistic constants from the object fields

Member

Move it before constructor

Member

ping

Member

ping

Member

Remove this

Member

inline this

@rschlussel
Contributor Author

Thanks @arhimondr for the review. Addressed comments, and I'll merge after the release.

@arhimondr
Member

@rschlussel Could you please also change the MetricComparator.getEstimatedValuesInternal to get the precomputed stats from the plan?

@dain is working on a patch that auto-closes the transaction when a query gets canceled. He is getting failures in TestTpchDistributedStats. Here is the stack trace: https://gist.github.com/dain/0935079644eef95045e395cf7137332a

@dain mentioned this pull request on Sep 26, 2018
@rschlussel
Contributor Author

> @rschlussel Could you please also change the MetricComparator.getEstimatedValuesInternal to get the precomputed stats from the plan?

Updated, thanks.

@mbasmanova
Contributor

@rschlussel Rebecca, would you update the PR description to show the new output of the EXPLAIN query?

@sopel39
Contributor

sopel39 commented Oct 2, 2018

@rschlussel
What benefit does this PR bring over FragmentedPlanCostCalculator and FragmentedPlanStatsCalculator? Is this just an optimization?

@rschlussel
Contributor Author

rschlussel commented Oct 3, 2018

@sopel39

  1. (the main motivation) FragmentedPlanCostCalculator can't be used in the QueryMonitor because the QueryMonitor runs after the transaction has finished.
  2. (added bonus) it's simpler

@mbasmanova explain plans look basically the same. The difference is that plans from the QueryMonitor now also have stats.
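The main motivation can be pictured with a toy model: an estimate that needs a live transaction fails once the transaction is closed, while a snapshot taken during planning stays usable afterwards. Everything in this sketch (`TransactionSketch`, `PrecomputedStatsSketch`, the placeholder estimate) is hypothetical and only mimics the shape of the problem.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for transactional metadata access.
final class TransactionSketch
{
    private boolean open = true;

    void close()
    {
        open = false;
    }

    // Pretend metadata lookup: only valid while the transaction is open.
    double estimateRows(String nodeId)
    {
        if (!open) {
            throw new IllegalStateException("transaction is closed");
        }
        return nodeId.length() * 100.0; // arbitrary placeholder estimate
    }
}

final class PrecomputedStatsSketch
{
    private PrecomputedStatsSketch() {}

    // Computes all estimates up front, while the transaction is still open.
    static Map<String, Double> precompute(TransactionSketch transaction, List<String> nodeIds)
    {
        Map<String, Double> stats = new LinkedHashMap<>();
        for (String nodeId : nodeIds) {
            stats.put(nodeId, transaction.estimateRows(nodeId));
        }
        return stats;
    }
}
```

A consumer that runs after `close()` can still read the precomputed map, whereas calling `estimateRows` at that point throws.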

@sopel39
Contributor

sopel39 commented Oct 3, 2018

@rschlussel Event listeners would also have complete stats, correct?

In order to pass stats/cost estimates to plan fragments, they need to be
serializable.

This will be used to print estimated stats/costs in explains

Computing stats on the fly couldn't be used from the QueryMonitor
because that gets called after the transaction completes.

It doesn't get used, and there aren't immediate plans to add support.
Simplify things for now and when it becomes useful we can add it back.

This way stats computation remains within the transaction.
@rschlussel merged commit 2399f1c into prestodb:master on Oct 3, 2018
@dain
Contributor

dain commented Oct 3, 2018

@rschlussel, were you able to update TestTpchDistributedStats also?

@rschlussel
Contributor Author

@dain Yes, those are the metric comparator changes.

arhimondr added a commit to arhimondr/presto that referenced this pull request Oct 23, 2018
arhimondr added a commit that referenced this pull request Oct 23, 2018
The stats calculator fails on assertions for complex queries, thus it is not production ready yet.

#11511 changes the place where the stats displayed in the final plan are computed.
Before this patch, the stats were computed in QueryMonitor, when printing the final plan.
If the stats couldn't be computed for whatever reason, only the text plan generation would fail.

Now that the stats calculator is invoked for every query during the initial planning, queries may
fail even when the CBO is not used.

This change disables the stats calculator by default. It can be re-enabled with a session property on a per-query basis.
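The "off by default, opt in per query" behaviour described in this commit message can be sketched as a simple guard. The property name `sketch_enable_stats_calculator`, the `Map`-based session, and the placeholder estimate are all invented for illustration; the commit message does not name the actual session property.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch of a default-off toggle read from session properties.
final class StatsToggleSketch
{
    // Invented property name, for illustration only.
    static final String ENABLE_STATS = "sketch_enable_stats_calculator";

    private StatsToggleSketch() {}

    static boolean isStatsEnabled(Map<String, String> sessionProperties)
    {
        // An absent property means "disabled": the default is off.
        return Boolean.parseBoolean(sessionProperties.getOrDefault(ENABLE_STATS, "false"));
    }

    static Map<String, Double> computeStats(Map<String, String> sessionProperties)
    {
        if (!isStatsEnabled(sessionProperties)) {
            // Planning proceeds without estimates, so a calculator failure
            // cannot fail the query.
            return Collections.emptyMap();
        }
        return Collections.singletonMap("scan", 100.0); // placeholder estimate
    }
}
```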
8 participants