[SPARK-37199][SQL] Add deterministic field to QueryPlan #34470
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -84,6 +84,13 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]] | |
| AttributeSet.fromAttributeSets(expressions.map(_.references)) -- producedAttributes | ||
| } | ||
| /** | ||
| * Returns true when all the expressions in the current node as well as all of its children | ||
| * are deterministic | ||
| */ | ||
| lazy val deterministic: Boolean = expressions.forall(_.deterministic) && | ||
Member
Wait, why is this in QueryPlan? What about physical plans vs. logical plans? Should both be marked?
Member
I think we should move this to the logical plan only, since it doesn't make sense for physical plans to have different determinism.
Contributor
A physical plan can override this.
Member
Can a physical plan have different determinism from the corresponding logical plan? E.g., Sample is non-deterministic, so I think the physical plans of Sample should always be non-deterministic too. Otherwise, the output will be inconsistent depending on which physical plan is used. The same applies in the opposite direction.
Contributor
Yea, if we override this lazy val in a logical plan, we should do it in the corresponding physical plan as well. Moving this to the logical plan is also OK, if we don't need it in the physical plan at all. cc @maryannxue
Member
So if we optimize something, that should always happen in the optimizer with logical plans, right? If we can do something with physical plans, we will have to add another argument for every non-deterministic plan, e.g.:
case class Sample(
    lowerBound: Double,
    upperBound: Double,
    withReplacement: Boolean,
    seed: Long,
+   deterministic: Boolean,
    child: LogicalPlan) extends UnaryNode {
case class SampleExec(
    lowerBound: Double,
    upperBound: Double,
    withReplacement: Boolean,
    seed: Long,
+   deterministic: Boolean,
    child: SparkPlan) extends UnaryExecNode with CodegenSupport {
which is pretty much different from how we do it in … Otherwise, we will have to recalculate it for each plan, etc.
| children.forall(_.deterministic) | ||
| /** | ||
| * Attributes that are referenced by expressions but not provided by this node's children. | ||
| */ | ||
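To make the semantics of the added field concrete, here is a minimal sketch of how `deterministic` would propagate through a plan tree. It uses toy stand-ins (`Expr`, `Plan`, `Relation`, `Project`, `Filter`, `Literal`, `Rand` are all hypothetical names defined here, not the real Catalyst classes); only the body of the `lazy val` mirrors the line in the diff.

```scala
// Toy stand-ins for Catalyst's Expression and QueryPlan (assumption: simplified,
// not the real Spark classes).
trait Expr { def deterministic: Boolean }
case class Literal(value: Any) extends Expr { val deterministic: Boolean = true }
case class Rand(seed: Long) extends Expr { val deterministic: Boolean = false }

abstract class Plan {
  def expressions: Seq[Expr]
  def children: Seq[Plan]

  // Same shape as the definition in the diff: a node is deterministic only if
  // all of its own expressions and all of its children are deterministic.
  lazy val deterministic: Boolean =
    expressions.forall(_.deterministic) && children.forall(_.deterministic)
}

case class Relation(name: String) extends Plan {
  def expressions: Seq[Expr] = Nil
  def children: Seq[Plan] = Nil
}

case class Project(projectList: Seq[Expr], child: Plan) extends Plan {
  def expressions: Seq[Expr] = projectList
  def children: Seq[Plan] = Seq(child)
}

case class Filter(condition: Expr, child: Plan) extends Plan {
  def expressions: Seq[Expr] = Seq(condition)
  def children: Seq[Plan] = Seq(child)
}
```

In this sketch a single non-deterministic expression anywhere below a node makes that node non-deterministic, because the check recurses through `children`. Since the value is a `lazy val`, it is computed once per node on first access.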
qq: should we mark all non-deterministic plans as such, e.g. Sample?
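One option raised in the thread is to override the `lazy val` in nodes such as Sample instead of adding a constructor argument. The following is a rough sketch of that idea only, using toy classes (`ToyPlan`, `ToyRelation`, `ToySample`, `ToyLimit` are hypothetical and omit expressions for brevity); it is not how Spark's `Sample` is actually defined.

```scala
// Toy model (assumption: simplified, expression checks omitted).
abstract class ToyPlan {
  def children: Seq[ToyPlan]
  // Default: a node is deterministic if all of its children are.
  lazy val deterministic: Boolean = children.forall(_.deterministic)
}

case class ToyRelation(name: String) extends ToyPlan {
  def children: Seq[ToyPlan] = Nil
}

// The node marks itself non-deterministic by overriding the lazy val,
// so no extra `deterministic: Boolean` constructor parameter is needed.
case class ToySample(
    lowerBound: Double,
    upperBound: Double,
    withReplacement: Boolean,
    seed: Long,
    child: ToyPlan) extends ToyPlan {
  def children: Seq[ToyPlan] = Seq(child)
  override lazy val deterministic: Boolean = false
}

case class ToyLimit(n: Int, child: ToyPlan) extends ToyPlan {
  def children: Seq[ToyPlan] = Seq(child)
}
```

With this approach the override stays local to the node, and parents pick it up through the default `children.forall` check. If a physical counterpart existed, it would need the same override to stay consistent with the logical node, which is the concern raised in the discussion above.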