Skip to content

Support format arguments in EXPLAIN ANALYZE#22733

Merged
ZacBlanco merged 1 commit intoprestodb:masterfrom
ZacBlanco:upstream-stats-explain-analyze
Jun 7, 2024
Merged

Support format arguments in EXPLAIN ANALYZE#22733
ZacBlanco merged 1 commit intoprestodb:masterfrom
ZacBlanco:upstream-stats-explain-analyze

Conversation

@ZacBlanco
Copy link
Contributor

@ZacBlanco ZacBlanco commented May 13, 2024

Description

This change adds two features:

  1. Support for format <X> in EXPLAIN ANALYZE
  2. Adds support for a stats field in each plan node representation

This allows the user to get a computer-friendly representation of the plan with the statistics. This can be useful for visualization or as input to other systems for plan node-level performance analysis.

Motivation and Context

  • Additional execution analysis capabilities through query plan JSON

Impact

  • The JSON representation of a plan from now includes a "stats" field representation statistics at runtime. Its contents map to PlanNodeStats.java
    • This effects the plans sent from the event listener framework
    • Backwards compatibility should be preserved as the change only introduces a new field in the JSON representation without modifying previously-present fields.

Test Plan

  • New tests in the parser and statement analyzer to verify user-supplied arguments.

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

General Changes
* Improve :doc:`/sql/explain-analyze` statement to support a ``format`` argument with values of ``<TEXT|JSON>`` :pr:`22733`

@ZacBlanco ZacBlanco force-pushed the upstream-stats-explain-analyze branch 6 times, most recently from b760bb9 to 78d8af9 Compare May 20, 2024 21:37
@ZacBlanco ZacBlanco force-pushed the upstream-stats-explain-analyze branch from 78d8af9 to 4774032 Compare May 21, 2024 23:56
@ZacBlanco ZacBlanco marked this pull request as ready for review May 22, 2024 19:40
@ZacBlanco ZacBlanco requested a review from presto-oss May 22, 2024 19:40
steveburnett
steveburnett previously approved these changes May 22, 2024
Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull branch, local docs build, looks good. Thanks!

@steveburnett
Copy link
Contributor

Suggest revising the release note entry following the Order of changes in the Release Notes Guidelines, also add a link to the existing documentation (I tested this link in a local build):

== RELEASE NOTES ==

General Changes
* Improve :doc:`/sql/explain-analyze` statement to support a ``format`` argument with values of ``<TEXT|JSON|GRAPHVIZ>`` :pr:`22733`

@vivek-bharathan vivek-bharathan self-requested a review May 23, 2024 17:09
@aaneja aaneja self-requested a review May 28, 2024 16:16
Comment on lines 166 to 167
case GRAPHVIZ:
plan = graphvizDistributedPlan(queryInfo.getOutputStage().get().getSubStages().get(0), functionAndTypeManager, operatorContext.getSession());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are not changing the Graphviz output, should we skip supporting EXPLAIN ANALYZE (FORMAT GRAPHIZ) ? (the output is same as EXPLAIN (FORMAT GRAPHIZ))
IMO, it will only add to confusion for users

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your opinion on this one. I had a similar thought but decided to just include it anyways. However, if you think it is confusing I will remove support for it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, lets skip it for now. We can enhance Graphviz output in a different PR (if there's interest) and add it back

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naive question - why is it hard to add execution stats to graphviz?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's particularly difficult, but I'm not familiar with the graphviz format. It would just require some more time+effort to make it work properly

Comment on lines 166 to 167
case GRAPHVIZ:
plan = graphvizDistributedPlan(queryInfo.getOutputStage().get().getSubStages().get(0), functionAndTypeManager, operatorContext.getSession());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naive question - why is it hard to add execution stats to graphviz?

@ZacBlanco ZacBlanco force-pushed the upstream-stats-explain-analyze branch 2 times, most recently from ddf57b1 to 741acd4 Compare May 29, 2024 15:55
aaneja
aaneja previously approved these changes May 29, 2024
@ZacBlanco
Copy link
Contributor Author

ZacBlanco commented Jun 3, 2024

@feilong-liu @jaystarshot would you like to take a look at this?

jaystarshot
jaystarshot previously approved these changes Jun 3, 2024
@feilong-liu
Copy link
Contributor

@feilong-liu @jaystarshot would you like to take a look at this?

Will take a look and reply later today

@ZacBlanco ZacBlanco force-pushed the upstream-stats-explain-analyze branch from 741acd4 to 382afa2 Compare June 6, 2024 15:30
@Test
public void testExplainAnalyzeFormatJson()
{
analyze("EXPLAIN ANALYZE (format JSON) SELECT * FROM t1");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these just check that we don't fail. is there a way to test/see what the new output looks like?

Copy link
Contributor Author

@ZacBlanco ZacBlanco Jun 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a deserialize method in the JsonRenderer class and an associated test inside of AbstractTestDistributedQueries


private final Duration planNodeScheduledTime;
private final Duration planNodeCpuTime;
private final Duration planNodeBlockedWallTime;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these added here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main driver behind this change is that we have an external tool that helps visualize some plans from the JSON output of the event listener. The event listener output is the actually the same output as what you get from an EXPLAIN ANALYZE. However, you only get execution statistics in the text output from the event listener, but not the JSON one. So, to augment the event listener's JSON plan output to be closer to the text version, I added some more statistics to this class.

This change adds two features:

1. Support for `format JSON` in EXPLAIN ANALYZE
2. Adds supoprt for a `stats` field in each plan node representation

This allows the user to get a computer-friendly representation of the
plan with the statistics. This can be useful for visualization
@ZacBlanco ZacBlanco force-pushed the upstream-stats-explain-analyze branch from 382afa2 to 72a7570 Compare June 7, 2024 16:59
@ZacBlanco ZacBlanco merged commit f661a1c into prestodb:master Jun 7, 2024
@wanglinsong wanglinsong mentioned this pull request Jun 25, 2024
36 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants