perf: Add runtime stats for uncaptured metastore and planning operations#27438
Merged
feilong-liu merged 1 commit intoprestodb:masterfrom Mar 26, 2026
Merged
perf: Add runtime stats for uncaptured metastore and planning operations#27438feilong-liu merged 1 commit intoprestodb:masterfrom
feilong-liu merged 1 commit intoprestodb:masterfrom
Conversation
Contributor
Reviewer's GuideAdds runtime planning and metastore timing metrics by wrapping specific metadata and access control operations with RuntimeStats recording, and introduces a new CHECK_ACCESS_PERMISSIONS_TIME_NANOS runtime metric. Sequence diagram for recording access permission check timingsequenceDiagram
participant SqlQueryExecution
participant Session
participant RuntimeStats
participant AccessControl
SqlQueryExecution->>Session: getSession()
Session-->>SqlQueryExecution: Session
SqlQueryExecution->>Session: getRuntimeStats()
Session-->>SqlQueryExecution: RuntimeStats
SqlQueryExecution->>RuntimeStats: recordWallAndCpuTime(CHECK_ACCESS_PERMISSIONS_TIME_NANOS, callback)
activate RuntimeStats
RuntimeStats-->>SqlQueryExecution: invoke callback
activate SqlQueryExecution
SqlQueryExecution->>SqlQueryExecution: checkAccessPermissions(accessControlReferences, viewDefinitionReferences, query, preparedStatements, identity, accessControl, accessControlContext)
SqlQueryExecution->>AccessControl: check permissions for referenced tables and views
AccessControl-->>SqlQueryExecution: permission check result
deactivate SqlQueryExecution
SqlQueryExecution-->>RuntimeStats: callback returns null
RuntimeStats-->>SqlQueryExecution: recorded wall and CPU time for CHECK_ACCESS_PERMISSIONS_TIME_NANOS
deactivate RuntimeStats
Sequence diagram for metadata and metastore timing in StatementAnalyzersequenceDiagram
participant StatementAnalyzer
participant Session
participant RuntimeStats
participant MetadataResolver
rect rgb(230,230,250)
note over StatementAnalyzer,MetadataResolver: Timing getTableHandle in visitCreateVectorIndex
StatementAnalyzer->>Session: getRuntimeStats()
Session-->>StatementAnalyzer: RuntimeStats
StatementAnalyzer->>RuntimeStats: recordWallTime(GET_TABLE_HANDLE_TIME_NANOS, callback)
activate RuntimeStats
RuntimeStats-->>StatementAnalyzer: invoke callback
activate StatementAnalyzer
StatementAnalyzer->>MetadataResolver: getTableHandle(sourceTableName)
MetadataResolver-->>StatementAnalyzer: Optional<TableHandle>
StatementAnalyzer-->>RuntimeStats: Optional<TableHandle>
deactivate StatementAnalyzer
RuntimeStats-->>StatementAnalyzer: TableHandle (via Optional.get())
deactivate RuntimeStats
end
rect rgb(230,250,230)
note over StatementAnalyzer,MetadataResolver: Timing getColumnHandles in analyze and create vector index
StatementAnalyzer->>Session: getRuntimeStats()
Session-->>StatementAnalyzer: RuntimeStats
StatementAnalyzer->>RuntimeStats: recordWallTime(GET_COLUMN_HANDLE_TIME_NANOS, callback)
activate RuntimeStats
RuntimeStats-->>StatementAnalyzer: invoke callback
activate StatementAnalyzer
StatementAnalyzer->>MetadataResolver: getColumnHandles(tableHandle)
MetadataResolver-->>StatementAnalyzer: Map<ColumnName, ColumnHandle>
StatementAnalyzer-->>RuntimeStats: Map<ColumnName, ColumnHandle>
deactivate StatementAnalyzer
RuntimeStats-->>StatementAnalyzer: Map<ColumnName, ColumnHandle> for further processing
deactivate RuntimeStats
end
rect rgb(250,230,230)
note over StatementAnalyzer,MetadataResolver: Timing getMaterializedViewStatus in analyzeAutoRefreshMaterializedView
StatementAnalyzer->>Session: getRuntimeStats()
Session-->>StatementAnalyzer: RuntimeStats
StatementAnalyzer->>RuntimeStats: recordWallTime(GET_MATERIALIZED_VIEW_STATUS_TIME_NANOS, callback)
activate RuntimeStats
RuntimeStats-->>StatementAnalyzer: invoke callback
activate StatementAnalyzer
StatementAnalyzer->>MetadataResolver: getMaterializedViewStatus(materializedViewName, baseQueryDomain)
MetadataResolver-->>StatementAnalyzer: MaterializedViewStatus
StatementAnalyzer-->>RuntimeStats: MaterializedViewStatus
deactivate StatementAnalyzer
RuntimeStats-->>StatementAnalyzer: MaterializedViewStatus for further analysis
deactivate RuntimeStats
end
Updated class diagram for runtime stats integrationclassDiagram
class SqlQueryExecution {
- QueryStateMachine stateMachine
- QuerySessionSupplier sessionSupplier
- AccessControl accessControl
- QueryAnalysis queryAnalysis
+ SqlQueryExecution(...)
- checkAccessPermissions(accessControlReferences, viewDefinitionReferences, query, preparedStatements, identity, accessControl, accessControlContext)
}
class StatementAnalyzer {
- Session session
- MetadataResolver metadataResolver
+ visitAnalyze(node, scope) Scope
+ analyzeAutoRefreshMaterializedView(node, viewName) Map
+ visitCreateVectorIndex(node, scope) Scope
- getMaterializedViewStatus(materializedViewName, baseQueryDomain) MaterializedViewStatus
}
class Session {
+ getRuntimeStats() RuntimeStats
+ getPreparedStatements() Map
+ getIdentity() Identity
+ getAccessControlContext() AccessControlContext
}
class RuntimeStats {
+ recordWallTime(metricName, supplier) Object
+ recordWallAndCpuTime(metricName, supplier) Object
}
class RuntimeMetricName {
<<final>>
+ DIRECTORY_LISTING_CACHE_MISS String
+ DIRECTORY_LISTING_TIME_NANOS String
+ FILES_READ_COUNT String
+ CHECK_ACCESS_PERMISSIONS_TIME_NANOS String
}
class MetadataResolver {
+ getColumnHandles(tableHandle) Map
+ getTableHandle(tableName) Optional
+ getMaterializedViewStatus(materializedViewName, baseQueryDomain) MaterializedViewStatus
}
class AccessControl {
+ checkCanSelectFromColumns(...)
+ checkCanInsertIntoTable(...)
+ checkCanReferenceView(...)
}
class MaterializedViewStatus {
+ getPartitionsFromBaseTables() Map
}
class TupleDomain {
+ all() TupleDomain
}
SqlQueryExecution --> Session : uses
SqlQueryExecution --> AccessControl : uses
SqlQueryExecution --> RuntimeStats : uses via Session
SqlQueryExecution --> RuntimeMetricName : uses CHECK_ACCESS_PERMISSIONS_TIME_NANOS
StatementAnalyzer --> Session : uses
StatementAnalyzer --> MetadataResolver : uses
StatementAnalyzer --> RuntimeStats : uses via Session
StatementAnalyzer --> RuntimeMetricName : uses GET_TABLE_HANDLE_TIME_NANOS
StatementAnalyzer --> RuntimeMetricName : uses GET_COLUMN_HANDLE_TIME_NANOS
StatementAnalyzer --> RuntimeMetricName : uses GET_MATERIALIZED_VIEW_STATUS_TIME_NANOS
Session --> RuntimeStats : provides
MetadataResolver --> MaterializedViewStatus : returns
StatementAnalyzer --> MaterializedViewStatus : uses
StatementAnalyzer --> TupleDomain : uses
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Contributor
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
getMaterializedViewStatus, the warning branch previously usedbaseQueryDomainbut now usesTupleDomain.all()inside therecordWallTimewrapper; if this wasn’t intentional, it changes behavior and may widen the domain unexpectedly when the view definition is missing. - The new
recordWallTimecalls for table/column handle and materialized view status lookups are repeated in several places inStatementAnalyzer; consider introducing small helper methods (e.g.,getTableHandleWithTiming,getMaterializedViewStatusWithTiming) to centralize the metric wrapping and avoid divergence between call sites.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `getMaterializedViewStatus`, the warning branch previously used `baseQueryDomain` but now uses `TupleDomain.all()` inside the `recordWallTime` wrapper; if this wasn’t intentional, it changes behavior and may widen the domain unexpectedly when the view definition is missing.
- The new `recordWallTime` calls for table/column handle and materialized view status lookups are repeated in several places in `StatementAnalyzer`; consider introducing small helper methods (e.g., `getTableHandleWithTiming`, `getMaterializedViewStatusWithTiming`) to centralize the metric wrapping and avoid divergence between call sites.
## Individual Comments
### Comment 1
<location path="presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java" line_range="2722-2724" />
<code_context>
if (!materializedViewDefinition.isPresent()) {
log.warn("Materialized view definition not present as expected when fetching materialized view status");
- return metadataResolver.getMaterializedViewStatus(materializedViewName, baseQueryDomain);
+ return session.getRuntimeStats().recordWallTime(
+ RuntimeMetricName.GET_MATERIALIZED_VIEW_STATUS_TIME_NANOS,
+ () -> metadataResolver.getMaterializedViewStatus(materializedViewName, TupleDomain.all()));
}
</code_context>
<issue_to_address>
**issue (bug_risk):** Using TupleDomain.all() here drops the baseQueryDomain filtering that was present before.
This branch used to call `getMaterializedViewStatus(materializedViewName, baseQueryDomain)`, preserving the predicate. Passing `TupleDomain.all()` now ignores that filter and can change behavior (e.g., requesting status for all partitions). I think this should still use `baseQueryDomain`, consistent with the return at the end of the method.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java
Show resolved
Hide resolved
a02cd9f to
4c1fdf2
Compare
…ations Adds runtime metrics to capture previously untracked wall time during query planning, helping decompose the gap between ANALYZE_TIME_NANOS wall time and CPU time. - Add verbose_planner_runtime_stats_enabled session property - Add GET_TABLE_STATISTICS_TIME_NANOS timing in MetadataManager - Add CHECK_ACCESS_PERMISSIONS_TIME_NANOS timing in SqlQueryExecution - Add GET_MATERIALIZED_VIEW_STATUS_TIME_NANOS timing in StatementAnalyzer - Add GET_COLUMN_HANDLE_TIME_NANOS and GET_TABLE_HANDLE_TIME_NANOS timing in StatementAnalyzer - Wire isVerbosePlannerRuntimeStatsEnabled into Optimizer, IterativeOptimizer, ApplyConnectorOptimization, HistoryBasedPlanStatisticsCalculator, and CachingPlanCanonicalInfoProvider Differential Revision: D98045070
4c1fdf2 to
6185861
Compare
arhimondr
approved these changes
Mar 26, 2026
feilong-liu
added a commit
that referenced
this pull request
Mar 30, 2026
…ons (#27438) Adds runtime metrics to capture previously untracked wall time during query planning, helping decompose the gap between ANALYZE_TIME_NANOS wall time and CPU time. **Verbose Planner Runtime Stats Session Property** - Added `verbose_planner_runtime_stats_enabled` session property to enable verbose runtime stats for analyzer, logical planner, and optimizer phases only (without enabling all verbose runtime stats) - Wired into `Optimizer`, `IterativeOptimizer`, `ApplyConnectorOptimization`, `HistoryBasedPlanStatisticsCalculator`, and `CachingPlanCanonicalInfoProvider` **New Timing Metrics** - `GET_TABLE_STATISTICS_TIME_NANOS` — wall time for `getTableStatistics()` in `MetadataManager` - `CHECK_ACCESS_PERMISSIONS_TIME_NANOS` — wall and CPU time for `checkAccessPermissions()` in `SqlQueryExecution` - `GET_MATERIALIZED_VIEW_STATUS_TIME_NANOS` — wall time for `getMaterializedViewStatus()` calls in `StatementAnalyzer` - `GET_COLUMN_HANDLE_TIME_NANOS` — wall time for `getColumnHandles()` calls during ANALYZE and vector index creation in `StatementAnalyzer` - `GET_TABLE_HANDLE_TIME_NANOS` — wall time for `getTableHandle()` during vector index creation in `StatementAnalyzer` Differential Revision: D98045070 == NO RELEASE NOTE ==
This was referenced Mar 31, 2026
15 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds runtime metrics to capture previously untracked wall time during query planning, helping decompose the gap between ANALYZE_TIME_NANOS wall time and CPU time.
Verbose Planner Runtime Stats Session Property
verbose_planner_runtime_stats_enabledsession property to enable verbose runtime stats for analyzer, logical planner, and optimizer phases only (without enabling all verbose runtime stats)Optimizer,IterativeOptimizer,ApplyConnectorOptimization,HistoryBasedPlanStatisticsCalculator, andCachingPlanCanonicalInfoProviderNew Timing Metrics
GET_TABLE_STATISTICS_TIME_NANOS— wall time forgetTableStatistics()inMetadataManagerCHECK_ACCESS_PERMISSIONS_TIME_NANOS— wall and CPU time forcheckAccessPermissions()inSqlQueryExecutionGET_MATERIALIZED_VIEW_STATUS_TIME_NANOS— wall time forgetMaterializedViewStatus()calls inStatementAnalyzerGET_COLUMN_HANDLE_TIME_NANOS— wall time forgetColumnHandles()calls during ANALYZE and vector index creation inStatementAnalyzerGET_TABLE_HANDLE_TIME_NANOS— wall time forgetTableHandle()during vector index creation inStatementAnalyzerDifferential Revision: D98045070
Release Notes
== NO RELEASE NOTE ==