Skip to content

Conversation

@ueshin
Copy link
Member

@ueshin ueshin commented Jul 16, 2014

Substring including null literal cases could be added to NullPropagation.

@SparkQA
Copy link

SparkQA commented Jul 16, 2014

QA tests have started for PR 1428. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16705/consoleFull

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not doable with constant folding?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I think about it more, this is probably ok to leave it here since in constant folding we don't really special case for null literals. However, I looked into other rules in NullPropagation and a lot of them don't really belong there... (has nothing to do with your PR though)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rxin, thank you for your comment.
The rule ConstantFolding can be applied only if the expression is foldable, i.e. ALL of the child expressions are foldable, and the NullPropagation covers the case there exists at least one null literal regardless of foldabilities of the others.
So I think the cases are needed.

BTW, Substring can be foldable, right?

@SparkQA
Copy link

SparkQA commented Jul 16, 2014

QA results for PR 1428:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16705/consoleFull

@rxin
Copy link
Contributor

rxin commented Jul 16, 2014

Thanks. Merging this in master & branch-1.0.

asfgit pushed a commit that referenced this pull request Jul 16, 2014
`Substring` including `null` literal cases could be added to `NullPropagation`.

Author: Takuya UESHIN <[email protected]>

Closes #1428 from ueshin/issues/SPARK-2509 and squashes the following commits:

d9eb85f [Takuya UESHIN] Add Substring cases to NullPropagation.

(cherry picked from commit 9b38b7c)
Signed-off-by: Reynold Xin <[email protected]>
@asfgit asfgit closed this in 9b38b7c Jul 16, 2014
@ueshin
Copy link
Member Author

ueshin commented Jul 16, 2014

@rxin Ah, wait for a moment.
What do you think about foldability of the Substring I mentioned above?

@ueshin
Copy link
Member Author

ueshin commented Jul 16, 2014

I'll open another issue about foldability of Substring.

@rxin
Copy link
Contributor

rxin commented Jul 16, 2014

I've already merged this. I think we should re-evaluate in the future whether we would want to push some stuff into expression foldability. I think that might also lead to higher planner performance (although that's probably less important). More importantly, it does feel more natural to me that this stuff goes into each expression, rather than having a rule for each NullPropagation.

cc @marmbrus

@ueshin
Copy link
Member Author

ueshin commented Jul 16, 2014

@rxin I agree that we need the better way for NullPropagation instead of too much pattern matching.

@marmbrus
Copy link
Contributor

You are right that this rule does more than null propagation now. I'm not sure what a better name would be. DegenerateExpressionSimplification?

Regarding moving null propagation into the expressions, you could do it... but what would it look like? You specify which of the children make the entire expression null if they are null? Seems like a lot of refactoring for little benefit...

asfgit pushed a commit that referenced this pull request Jul 16, 2014
This is a follow-up of #1428.

Author: Takuya UESHIN <[email protected]>

Closes #1432 from ueshin/issues/SPARK-2518 and squashes the following commits:

37d1ace [Takuya UESHIN] Fix foldability of Substring expression.
asfgit pushed a commit that referenced this pull request Jul 16, 2014
This is a follow-up of #1428.

Author: Takuya UESHIN <[email protected]>

Closes #1432 from ueshin/issues/SPARK-2518 and squashes the following commits:

37d1ace [Takuya UESHIN] Fix foldability of Substring expression.

(cherry picked from commit cc965ee)
Signed-off-by: Reynold Xin <[email protected]>
@rxin
Copy link
Contributor

rxin commented Jul 16, 2014

It doesn't bring much benefit right now, but what we are doing here is creating patterns in NullPropagation to specify the semantics of each individual expression ... not very scalable in maintaining this code base.

@marmbrus
Copy link
Contributor

I'm not sure I agree with that. This is a pretty niche optimization not something fundamental about the expressions that is required for correct evaluation (and the first version of this code had a bunch of mistakes). Having all of these rules in one place made it easier for me to realize that when doing the review.

@rxin
Copy link
Contributor

rxin commented Jul 16, 2014

Well with that argument we shouldn't have any of the constant folding logic in expressions themselves either since they don't affect correctness.

This is a situation that it is not obvious where to put the stuff. Anyway I think it's ok for them to be here for now, since it is not too long yet. As the list gets longer, we can revisit.

@marmbrus
Copy link
Contributor

That is exactly the argument I made when the folding logic was added. :) I suggested that we add deterministic instead and then have a rule that folds things that are deterministic and have no references. deterministic I would argue is something fundamental about the expression itself, unlike these optimizations we are making.

@marmbrus
Copy link
Contributor

I would be supportive of changing it to match my original proposal...

xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
`Substring` including `null` literal cases could be added to `NullPropagation`.

Author: Takuya UESHIN <[email protected]>

Closes apache#1428 from ueshin/issues/SPARK-2509 and squashes the following commits:

d9eb85f [Takuya UESHIN] Add Substring cases to NullPropagation.
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
This is a follow-up of apache#1428.

Author: Takuya UESHIN <[email protected]>

Closes apache#1432 from ueshin/issues/SPARK-2518 and squashes the following commits:

37d1ace [Takuya UESHIN] Fix foldability of Substring expression.
sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
…nsitivity (apache#1428)

### What changes were proposed in this pull request?
Previously alignment was checked by comparing the exact attribute references between
Spark and the underlying table, which failed with case insensitive SQL configurations. To
fix this we use the configuration's resolver to compare references.

### Why are the changes needed?
This was breaking some migrations from Spark 3.1 where the alignment check was not present. A query which attempted
to do a MergeInto with column names which matched in a case-insensitive way would fail to trigger our plan rewrite
rules leading to an opaque MERGE INTO is temporarily not supported exception.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
This patch is a cherry pick of the code fixed and released in Apache Iceberg. apache/iceberg#4848 

The test for this specific case is in the Iceberg codebase and we will back-port this to ADT following the merge of the fix in Spark.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants