feat(small): Set 'summary' level metrics for `DataSourceExec` with parquet source by 2010YOUY01 · Pull Request #18196 · apache/datafusion

2010YOUY01 · 2025-10-21T10:46:33Z

Which issue does this PR close?

Part of #18116

Rationale for this change

The following configuration makes `EXPLAIN ANALYZE` show only the important, high-level insights.

set datafusion.explain.analyze_level = summary;

This PR sets summary level metrics for the parquet data source:

`summary` level metrics for `DataSourceExec` with `Parquet` source

- File-level pruning metrics
- Row-group-level pruning metrics
- Bytes scanned
- Metadata load time

In datafusion/datafusion/datasource-parquet/src/metrics.rs (line 29 in 155b56e):

pub struct ParquetFileMetrics {

The remaining metrics are kept at the `dev` level. I'm not sure whether the page-level pruning metrics should also be included at the `summary` level. I'm open to suggestions on this, or on any other metrics that should be included.

While implementing this, I came up with a few ideas to further improve metrics tracking in the Parquet scanner. I’ve documented them in #18195

What changes are included in this PR?

Set the above metrics to summary analyze level
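
To illustrate what tagging a metric with an analyze level means, here is a minimal, self-contained sketch. It does not use DataFusion's actual types: `Level`, `Metric`, `visible_at`, and `sample_metrics` are hypothetical names, and the metric names and values are made up for illustration. The assumption is simply that a metric tagged `summary` is shown at both levels, while a metric tagged `dev` is shown only at the `dev` level.

```rust
/// Hypothetical analyze level attached to each metric (not DataFusion's real type).
/// Declaration order matters: `Summary < Dev` under the derived `Ord`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Summary, // shown at both `summary` and `dev`
    Dev,     // shown only at `dev`
}

/// Hypothetical metric record: a name, a value, and the level it is tagged with.
#[derive(Debug, Clone, PartialEq)]
struct Metric {
    name: &'static str,
    value: u64,
    level: Level,
}

/// Keep only the metrics that are visible at the requested analyze level.
fn visible_at(metrics: &[Metric], analyze_level: Level) -> Vec<&Metric> {
    metrics
        .iter()
        .filter(|m| m.level <= analyze_level)
        .collect()
}

/// Illustrative data only; names and values are assumptions, not real output.
fn sample_metrics() -> Vec<Metric> {
    vec![
        Metric { name: "bytes_scanned", value: 4096, level: Level::Summary },
        Metric { name: "metadata_load_time", value: 12, level: Level::Summary },
        Metric { name: "some_dev_only_metric", value: 7, level: Level::Dev },
    ]
}

fn main() {
    let metrics = sample_metrics();
    for level in [Level::Summary, Level::Dev] {
        println!("analyze_level = {:?}", level);
        for m in visible_at(&metrics, level) {
            println!("  {}={}", m.name, m.value);
        }
    }
}
```

With this model, moving a metric from `dev` to `summary` only changes its tag; the filtering logic stays the same.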

Are these changes tested?

Unit tests

Are there any user-facing changes?

No

xudong963 · 2025-10-22T11:19:32Z

> page level pruning metrics

I'd like to see it under any context.

2010YOUY01 · 2025-10-23T11:23:47Z

> page level pruning metrics
>
> I'd like to see it under any context.

I see. I have added page pruning metrics to summary analyze level in 7a3fd2d

xudong963

thank you @2010YOUY01

…rquet source (apache#18196)

Part of apache#18116

set 'summar' level metrics for DataSourceExec with parquet source

f6a68ec

github-actions Bot added the core (Core DataFusion crate) and datasource (Changes to the datasource crate) labels Oct 21, 2025

add page pruning stats to 'summary' analyze level

7a3fd2d

xudong963 approved these changes Oct 24, 2025


2010YOUY01 added this pull request to the merge queue Oct 25, 2025

Merged via the queue into apache:main with commit f4a49b5 Oct 25, 2025
28 checks passed

2010YOUY01 merged 2 commits into apache:main from 2010YOUY01:parquet-metrics-level