-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-22529] [SQL] Relation stats should be consistent with other plans based on cbo config #19757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #83897 has finished for PR 19757 at commit
|
|
Test build #83928 has finished for PR 19757 at commit
|
|
retest this please |
|
Test build #83935 has finished for PR 19757 at commit
|
|
Test build #84123 has finished for PR 19757 at commit
|
|
retest this please |
|
Test build #84130 has finished for PR 19757 at commit
|
| Statistics(sizeInBytes = sizeInBytes, rowCount = rowCount, | ||
| attributeStats = AttributeMap(matched)) | ||
| def toPlanStats(planOutput: Seq[Attribute], cboEnabled: Boolean): Statistics = { | ||
| val attrStats = planOutput.flatMap(a => colStats.get(a.name).map(a -> _)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move into if.
| withSQLConf(SQLConf.CBO_ENABLED.key -> "false") { | ||
| // Don't show rowCount if cbo is disabled | ||
| checkKeywordsExist(sql(explainCostCommand), "sizeInBytes") | ||
| checkKeywordsNotExist(sql(explainCostCommand), "rowCount") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you assume there is no table relation cache in this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, should I refresh it for robustness?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes please.
|
Test build #84213 has finished for PR 19757 at commit
|
|
Test build #84218 has finished for PR 19757 at commit
|
|
thanks, merging to master! |
| } else { | ||
| // When CBO is disabled, we apply the size-only estimation strategy, so there's no need to | ||
| // propagate other statistics from catalog to the plan. | ||
| Statistics(sizeInBytes = sizeInBytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If rowCount is available, why we ignore them?
What changes were proposed in this pull request?
Currently, relation stats is the same whether cbo is enabled or not. While relation (
LogicalRelationorHiveTableRelation) is aLogicalPlan, its behavior is inconsistent with other plans. This can cause confusion when user runs EXPLAIN COST commands. Besides, when CBO is disabled, we apply the size-only estimation strategy, so there's no need to propagate other catalog statistics to relation.How was this patch tested?
Enhanced existing tests case and added a test case.