fix(planner): Set the size estimate for a ConstantExpression/Literal #27188
Merged
tdcmeehan merged 1 commit into prestodb:master on Feb 24, 2026
Contributor
Reviewer's Guide
Sets average row size estimates for constant scalar expressions backed by Slice values and updates scalar stats tests accordingly, improving join size and cost estimation for constant variables.
Class diagram for updated ScalarStatsCalculator scalar estimation
classDiagram
class ScalarStatsCalculator {
+VariableStatsEstimate visitConstant(ConstantExpression literal, Void context)
+VariableStatsEstimate visitLiteral(Literal node, Void context)
}
class VariableStatsEstimate {
+setLowValue(double lowValue) VariableStatsEstimateBuilder
+setHighValue(double highValue) VariableStatsEstimateBuilder
+setAverageRowSize(double averageRowSize) VariableStatsEstimateBuilder
+build() VariableStatsEstimate
}
class VariableStatsEstimateBuilder {
+setLowValue(double lowValue) VariableStatsEstimateBuilder
+setHighValue(double highValue) VariableStatsEstimateBuilder
+setAverageRowSize(double averageRowSize) VariableStatsEstimateBuilder
+build() VariableStatsEstimate
}
class ConstantExpression {
+Object getValue()
}
class Literal {
+Object getValue()
}
class Slice {
+int length()
}
ScalarStatsCalculator --> VariableStatsEstimateBuilder : uses
ScalarStatsCalculator --> ConstantExpression : parameter
ScalarStatsCalculator --> Literal : parameter
ConstantExpression --> Slice : value_may_be
Literal --> Slice : value_may_be
VariableStatsEstimateBuilder --> VariableStatsEstimate : builds
Slice <.. ScalarStatsCalculator : averageRowSize_from_length
Contributor
Hey - I've left some high level feedback:
- The logic to set `averageRowSize` for `Slice` values is duplicated in both `visitConstant` and `visitLiteral`; consider extracting a small helper to keep this behavior consistent and easier to evolve.
- Right now only `Slice`-backed literals get an `averageRowSize`; if there are other variable-width types represented differently (e.g., non-Slice-backed), it may be worth clarifying or centralizing how average row size should be derived across all relevant types.
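The helper extraction suggested in the first point could look roughly like the sketch below. It is self-contained and does not use the Presto classes: `SketchEstimate`, `ConstantSizeSketch`, and `averageRowSizeOf` are hypothetical names, and a `byte[]` stands in for a `Slice`-backed value, with its array length playing the role of `Slice.length()`.

```java
// Hypothetical stand-in for a stats estimate; not Presto's VariableStatsEstimate.
final class SketchEstimate {
    final double averageRowSize;

    SketchEstimate(double averageRowSize) {
        this.averageRowSize = averageRowSize;
    }
}

final class ConstantSizeSketch {
    private ConstantSizeSketch() {}

    // Shared helper: byte length for variable-width values, NaN ("unknown") otherwise.
    // Assumption: byte[] stands in for a Slice-backed constant value.
    static double averageRowSizeOf(Object value) {
        if (value instanceof byte[]) {
            return ((byte[]) value).length;
        }
        return Double.NaN;
    }

    // Both visit paths delegate to the same helper, so they cannot drift apart.
    static SketchEstimate visitConstant(Object value) {
        return new SketchEstimate(averageRowSizeOf(value));
    }

    static SketchEstimate visitLiteral(Object value) {
        return new SketchEstimate(averageRowSizeOf(value));
    }
}
```

With a single helper, support for another variable-width representation would only need one extra branch in `averageRowSizeOf`, which also speaks to the second point.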
2f32172 to 829dea4
829dea4 to d02433a
tdcmeehan approved these changes on Feb 24, 2026
This was referenced Mar 31, 2026
Description, Motivation and Context
The planner generates constant variables that were missing a size estimate. These estimates are needed to correctly estimate join sizes and join costs (among other cost-based comparisons).
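To illustrate why a missing size matters, here is a deliberately simplified model, not the planner's actual formula: the estimated output size is taken as the row count times the sum of per-column average row sizes. `OutputSizeSketch` and `outputSizeInBytes` are hypothetical names, and letting an unknown size (NaN) propagate is an assumption of this sketch; the PR does not describe how the planner treats an unknown size.

```java
final class OutputSizeSketch {
    private OutputSizeSketch() {}

    // Simplified model: bytes = rowCount * sum(per-column average row size).
    // Assumption: an unknown column size is represented as NaN and simply propagates.
    static double outputSizeInBytes(double rowCount, double... averageRowSizes) {
        double bytesPerRow = 0;
        for (double size : averageRowSizes) {
            bytesPerRow += size;
        }
        return rowCount * bytesPerRow;
    }
}
```

Under this model, 1000 rows with an 8-byte column and a 3-byte string constant give `outputSizeInBytes(1000, 8, 3)`, which is 11000 bytes. If the constant's size is unknown, `outputSizeInBytes(1000, 8, Double.NaN)` is NaN, and any comparison between the two sides of a join based on that number stops being meaningful.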
Impact
Improves build/probe side estimation during `DetermineJoinDistributionType`.
Test Plan
CI
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.
Summary by Sourcery
Set size estimates for constant and literal string expressions to improve planner cost estimation for joins and other operations.
Bug Fixes:
Tests: