Report physical input data size in EXPLAIN ANALYZE #14907

Merged
raunaqmorarka merged 1 commit into trinodb:master from Dith3r:feature/physical-size on Nov 10, 2022

@Dith3r Dith3r commented Nov 4, 2022

Description

Adds the physical size of the data read by the connector to the EXPLAIN ANALYZE output, in the connector metrics section.

Example output:

     └─ TableScan[table = iceberg:part.ztest$data@1581575691215353074]
            Layout: [id:bigint, small:varchar, medium:varchar, big:varchar]
            Estimates: {rows: 483 (125.00kB), cpu: 125.00k, memory: 0B, network: 0B}
            CPU: 1.32s (62.71%), Scheduled: 3.59s (76.56%), Blocked: 0.00ns (0.00%), Output: 87 rows (1.36GB)
            Input avg.: 0.97 rows, Input std.dev.: 18.57%
            small := 2:small:varchar
            big := 4:big:varchar
            id := 1:id:bigint
            medium := 3:medium:varchar
            Physical Input: 932.06kB
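
For anyone who wants to check the new line programmatically, here is a minimal sketch that pulls every `Physical Input` value out of plan text and converts it to bytes. The function name `physical_input_sizes`, the regular expression, and the binary unit multipliers (1 kB = 1024 bytes) are assumptions made for illustration; they are not taken from this PR's code.

```python
import re

# Assumption: unit suffixes use binary multipliers (1 kB = 1024 bytes).
UNIT_BYTES = {"B": 1, "kB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

# Matches e.g. "Physical Input: 932.06kB" as printed in the example above.
PHYSICAL_INPUT = re.compile(r"Physical Input: (\d+(?:\.\d+)?)(B|kB|MB|GB|TB)")


def physical_input_sizes(plan_text):
    """Return every 'Physical Input' value found in plan_text, in bytes."""
    return [float(value) * UNIT_BYTES[unit]
            for value, unit in PHYSICAL_INPUT.findall(plan_text)]


# Lines copied from the example output above.
plan = """
     └─ TableScan[table = iceberg:part.ztest$data@1581575691215353074]
            Input avg.: 0.97 rows, Input std.dev.: 18.57%
            Physical Input: 932.06kB
"""
sizes = physical_input_sizes(plan)
print(sizes)
```

The function returns a list rather than a single number because a plan may contain several scan nodes; it returns an empty list when the metric is absent.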

Non-technical explanation

EXPLAIN ANALYZE now reports how much physical data the connector read.

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# General
* Add the amount of data read from an external source during table scan to EXPLAIN ANALYZE. ({issue}`14907`)

@cla-bot cla-bot bot added the cla-signed label Nov 4, 2022
@Dith3r Dith3r requested review from lukasz-stec and sopel39 November 4, 2022 14:03
@Dith3r Dith3r force-pushed the feature/physical-size branch from 9937be9 to cdd2c4b Compare November 7, 2022 12:25
@Dith3r Dith3r requested a review from lukasz-stec November 7, 2022 12:25
@Dith3r Dith3r force-pushed the feature/physical-size branch from cdd2c4b to 4cbdb82 Compare November 7, 2022 12:36

@lukasz-stec lukasz-stec left a comment

lgtm

@raunaqmorarka

I think the approach here should be similar to #10472
Also, please add an example of how the EXPLAIN output changed to the description.

@lukasz-stec

> I think the approach here should be similar to #10472

I think we want "physical input data size" displayed even if verbose=false, whereas connector metrics are only displayed if verbose=true.

@Dith3r Dith3r commented Nov 8, 2022

@raunaqmorarka As @lukasz-stec wrote, connector metrics are displayed only in verbose output, whereas we want the physical input data metric to appear in plain EXPLAIN ANALYZE output, without VERBOSE. I have updated the description.

@Dith3r Dith3r force-pushed the feature/physical-size branch from 4cbdb82 to 3fc3ca9 Compare November 8, 2022 12:02

@sopel39 sopel39 left a comment

Please add a test (see BaseJdbcConnectorTest#testExplainAnalyzePhysicalReadWallTime)

@Dith3r Dith3r force-pushed the feature/physical-size branch 3 times, most recently from 78d3fc4 to 8802812 Compare November 9, 2022 09:01
@Dith3r Dith3r force-pushed the feature/physical-size branch from 8802812 to 084b0c4 Compare November 9, 2022 09:24
@Dith3r Dith3r force-pushed the feature/physical-size branch from 084b0c4 to 81814fe Compare November 9, 2022 09:45
@Dith3r Dith3r requested a review from raunaqmorarka November 9, 2022 11:10
Example output:
- ScanFilterProject[table = hive:sf1:orders, filterPredicate = ("orderdate" > DATE '1995-01-01')]
     Layout: [clerk:varchar(15), $hashvalue_2:bigint]
     Estimates: {rows: 1500000 (41.48MB), cpu: 35.76M, memory: 0B, network: 0B}/{rows: 816424 (22.58MB), cpu: 35.76M, memory: 0B, network: 0B}/{rows: 816424 (22.58MB), cpu: 22.58M, memory: 0B, network: 0B}
     CPU: 180.00ms (78.95%), Scheduled: 298.00ms (71.46%), Blocked: 0.00ns (0.00%), Output: 818058 rows (12.98MB)
     Input avg.: 1500000.00 rows, Input std.dev.: 0.00%
     $hashvalue_2 := combine_hash(bigint '0', COALESCE("$operator$hash_code"("clerk"), 0))
     clerk := clerk:varchar(15):REGULAR
     orderdate := orderdate:date:REGULAR
     Input: 1500000 rows (18.17MB), Filtered: 45.46%, Physical Input: 4.51MB
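
As a quick sanity check on the last line of this output, the arithmetic below uses only numbers copied from the example. It assumes, as the figures suggest but the PR does not state, that `Filtered` is the share of input rows that did not reach the output.

```python
# Values copied from the ScanFilterProject example output above.
input_rows = 1_500_000      # "Input: 1500000 rows"
output_rows = 818_058       # "Output: 818058 rows"
input_mb = 18.17            # logical input size, "(18.17MB)"
physical_input_mb = 4.51    # "Physical Input: 4.51MB"

# Assumption: Filtered = share of input rows dropped before output.
filtered_pct = 100 * (1 - output_rows / input_rows)

# Both sizes are in MB, so the ratio needs no unit conversion.
physical_ratio = physical_input_mb / input_mb

print(f"Filtered: {filtered_pct:.2f}%")  # prints "Filtered: 45.46%"
print(f"Physical / logical input: {physical_ratio:.1%}")
```

Under that assumption the computed percentage matches the reported `Filtered: 45.46%`, and the physical input in this example is roughly a quarter of the logical input size.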
@Dith3r Dith3r force-pushed the feature/physical-size branch from 81814fe to f276cde Compare November 9, 2022 13:38
@raunaqmorarka raunaqmorarka merged commit f9654eb into trinodb:master Nov 10, 2022
@github-actions github-actions bot added this to the 403 milestone Nov 10, 2022
@Dith3r Dith3r deleted the feature/physical-size branch November 14, 2022 09:46