Skip to content

Add log for canonicalized plans in HBO#22306

Merged
pranjalssh merged 1 commit intoprestodb:masterfrom
feilong-liu:log_hbo_canonical_plan
Mar 29, 2024
Merged

Add log for canonicalized plans in HBO#22306
pranjalssh merged 1 commit intoprestodb:masterfrom
feilong-liu:log_hbo_canonical_plan

Conversation

@feilong-liu
Copy link
Contributor

@feilong-liu feilong-liu commented Mar 22, 2024

Description

Log the stats equivalent query plan and the canonicalized plan from HBO.

Motivation and Context

This is to help debug plan hash of HBO.

In HBO, a query will get a snapshot of the plan which we call stats equivalent plan. This plan is then canonicalized to get a new canonicalized plan, which later gets hashed to get a string. This hash string is used as map key to store and retrieve history.

However, one problem observed is that, for the same query, it can have different hash string. This is hard to debug, as it cannot be deterministically reproduced. In this PR, I log the stats equivalent and canonicalized plan, so as to help compare and debug such cases.

Impact

Will help debug HBO hash generation.

Test Plan

Tested locally end to end.

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Add log of stats equivalent plan and canonicalized plan for HBO. This feature is controlled by session property `log_query_plans_used_in_history_based_optimizer`

@feilong-liu feilong-liu requested a review from presto-oss March 22, 2024 20:03
@feilong-liu feilong-liu marked this pull request as draft March 22, 2024 20:03
@feilong-liu feilong-liu force-pushed the log_hbo_canonical_plan branch 3 times, most recently from c7fae5f to efd39b0 Compare March 22, 2024 23:24
profileStartTime = System.nanoTime();
}
planNodesWithHash.addAll(getPlanNodeHashes(plan, session, false).values());
planNodesWithHash.addAll(getPlanNodeHashes(plan, session, false, plan == root).values());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Record the stats equivalent plan and canonicalized plan only 1) when registering the plan, i.e. hash generation was first calculated as following calls will hit cache 2) only for root node, the serialized plan will cover the whole plan tree


PlanNode plan = resolveGroupReferences(planNode, lookup);
Map<PlanCanonicalizationStrategy, PlanNodeWithHash> allHashes = getPlanNodeHashes(plan, session, true);
Map<PlanCanonicalizationStrategy, PlanNodeWithHash> allHashes = getPlanNodeHashes(plan, session, true, false);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not record plan when getting statistics

if (recordPlan) {
Optional<String> statsEquivalentPlanString = serializeStatsEquivalentPlan(statsEquivalentPlanNode);
if (statsEquivalentPlanString.isPresent()) {
historyBasedStatisticsCacheManager.setStatsEquivalentPlan(session.getQueryId(), statsEquivalentPlanString.get());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Record stats equivalent plan

@feilong-liu feilong-liu marked this pull request as ready for review March 22, 2024 23:27
@steveburnett
Copy link
Contributor

"This feature is controlled by session property log_query_plans_used_in_history_based_optimizer.

Suggest adding doc for the session property. Would http://prestodb.io/docs/current/plugin/redis-hbo-provider.html or http://prestodb.io/docs/current/admin/properties.html#optimizer-properties be the correct place for it?

jaystarshot
jaystarshot previously approved these changes Mar 25, 2024
@github-actions
Copy link

github-actions bot commented Mar 25, 2024

Codenotify: Notifying subscribers in CODENOTIFY files for diff 8b8c9e4...89ed79c.

Notify File(s)
@steveburnett presto-docs/src/main/sphinx/admin/properties.rst

Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the doc! One tiny nit suggestion for readability.

* **Type:** ``boolean``
* **Default value:** ``false``

Log stats equivalent plan and canonicalized plans used in history based optimization.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Log stats equivalent plan and canonicalized plans used in history based optimization.
Log the stats equivalent plan and canonicalized plans used in history based optimization.

"Log stats equivalent plan" seems a little hard to parse. Suggestion of "the" taken from the Description of the PR makes it easier to understand.

Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick response! Two small nits.

@feilong-liu feilong-liu force-pushed the log_hbo_canonical_plan branch from b71ef34 to f7f89c4 Compare March 25, 2024 22:34
steveburnett
steveburnett previously approved these changes Mar 25, 2024
Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull updated branch, new local build,looks good. Thanks for the rapid response!

@feilong-liu feilong-liu force-pushed the log_hbo_canonical_plan branch from f7f89c4 to 280e466 Compare March 26, 2024 00:25
jaystarshot
jaystarshot previously approved these changes Mar 26, 2024
import static com.facebook.presto.SystemSessionProperties.getHistoryBasedOptimizerTimeoutLimit;
import static com.facebook.presto.SystemSessionProperties.getHistoryInputTableStatisticsMatchingThreshold;
import static com.facebook.presto.SystemSessionProperties.isVerboseRuntimeStatsEnabled;
import static com.facebook.presto.SystemSessionProperties.logQueryPlansUsedInHBO;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some places we use HBO, but some places we exapnd to historybasedoptimizer in session props/configs/func names. Lets keep it consistent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to logQueryPlansUsedInHistoryBasedOptimizer


@Override
public Optional<String> hash(Session session, PlanNode planNode, PlanCanonicalizationStrategy strategy, boolean cacheOnly)
public Optional<String> hash(Session session, PlanNode planNode, PlanCanonicalizationStrategy strategy, boolean cacheOnly, boolean recordPlan)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this boolean here. Putting it in hash api seems weird. Can we always record it, and choose to log it based on configs above this API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the record argument.

It will help to debug the root cause when the same query produces different
plan hash.
@feilong-liu feilong-liu force-pushed the log_hbo_canonical_plan branch from 3fa5d45 to 89ed79c Compare March 26, 2024 21:54
@feilong-liu feilong-liu requested a review from pranjalssh March 26, 2024 22:11
Copy link
Contributor

@pranjalssh pranjalssh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pranjalssh pranjalssh merged commit 2c226fc into prestodb:master Mar 29, 2024
@feilong-liu feilong-liu deleted the log_hbo_canonical_plan branch March 29, 2024 05:57
@feilong-liu feilong-liu restored the log_hbo_canonical_plan branch April 4, 2024 18:07
@wanglinsong wanglinsong mentioned this pull request May 1, 2024
48 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants