remove type coercion in the binary physical expr #3396
after rebase #3380, the test case |
I ran into many issues while removing the binary type coercion, caused by PRs #3301 and #3246. @andygrove, why not generate the logical binary op when creating the |
cc @alamb: I created issue #3509 to support other logical exprs in the type coercion.
I think the only coercion it might need is that its argument is |
Codecov Report
@@            Coverage Diff             @@
##           master    #3396      +/-   ##
==========================================
+ Coverage   86.03%   86.08%   +0.05%
==========================================
  Files         300      300
  Lines       56253    56313      +60
==========================================
+ Hits        48395    48477      +82
+ Misses       7858     7836      -22
I've started reviewing this, but there are many changes and referenced issues I need to go read. I will continue with the review tomorrow.
I also plan to review this PR, but I may not get to it today.
I tried to explain what I did, and why, in the description. I hope this helps with your review. cc @andygrove @alamb
This was a great PR to read @liukun4515 -- really nice.
Thank you for sticking with it.
I think we should merge it asap to minimize conflicts.
@@ -1452,7 +1452,11 @@ impl SessionState {
rules.push(Arc::new(FilterNullJoinKeys::default()));
}
rules.push(Arc::new(ReduceOuterJoin::new()));
// TODO: https://github.com/apache/arrow-datafusion/issues/3557
👍
It makes sense to me that we need to simplify expressions after coercion.
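To illustrate why a second simplification pass can pay off after coercion, here is a minimal sketch on a toy expression tree. None of this is DataFusion's actual API: `DataType`, `Expr`, `coerce`, and `simplify` are hypothetical stand-ins, and the only coercion rule assumed is widening Int32 to Int64.

```rust
// Toy model only: these types and functions are hypothetical, not DataFusion's.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(&'static str, DataType),
    Literal(i64, DataType),
    Cast(Box<Expr>, DataType),
    Add(Box<Expr>, Box<Expr>),
}

fn data_type(e: &Expr) -> DataType {
    match e {
        Expr::Column(_, t) | Expr::Literal(_, t) | Expr::Cast(_, t) => *t,
        // Toy rule: an addition has the type of its left operand.
        Expr::Add(l, _) => data_type(l),
    }
}

fn cast_to(e: Expr, to: DataType) -> Expr {
    if data_type(&e) == to {
        e
    } else {
        Expr::Cast(Box::new(e), to)
    }
}

/// Coercion pass: make both sides of `Add` share one type by inserting casts.
fn coerce(e: Expr) -> Expr {
    match e {
        Expr::Add(l, r) => {
            let (l, r) = (coerce(*l), coerce(*r));
            let target = if data_type(&l) == DataType::Int64
                || data_type(&r) == DataType::Int64
            {
                DataType::Int64
            } else {
                DataType::Int32
            };
            Expr::Add(Box::new(cast_to(l, target)), Box::new(cast_to(r, target)))
        }
        other => other,
    }
}

/// Simplification pass: fold a widening cast of a literal into a literal.
fn simplify(e: Expr) -> Expr {
    match e {
        Expr::Cast(inner, to) => match simplify(*inner) {
            // Only fold widening casts, which cannot change the value.
            Expr::Literal(v, _) if to == DataType::Int64 => Expr::Literal(v, to),
            other => Expr::Cast(Box::new(other), to),
        },
        Expr::Add(l, r) => Expr::Add(Box::new(simplify(*l)), Box::new(simplify(*r))),
        other => other,
    }
}
```

In this sketch, simplifying `c + 1` (with `c` as Int64 and `1` as Int32) before coercion changes nothing. Coercion then wraps the literal in a cast, and only a simplification pass that runs afterwards can fold that cast back into a plain Int64 literal.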
"+--------------+---------------------------+---------------------------+---------------------------+", | ||
"| 1.5 | 2.5 | 3.5 | 2.5 |", | ||
"+--------------+---------------------------+---------------------------+---------------------------+", | ||
"+--------------+-------------------------+-------------------------+-------------------------+", |
👍
Operator::Modulo,
DataType::Decimal128(10, 2)
);
// TODO add other data type
I wonder if we should file another ticket to track this gap? Specifically, a "help wanted" ticket that explained what was needed might encourage some additional contributions
Operator::GtEq,
DataType::Decimal128(15, 3)
);
I reviewed the tests cases in this file carefully. Very nice 👌
DataType::Boolean,
DataType::Boolean,
Operator::Or,
DataType::Boolean
Are there any other types that are coerced to boolean for logical operations? Or are the tests for boolean just showing that boolean types are not changed when coerced?
@@ -69,14 +67,8 @@ impl OptimizerRule for TypeCoercion {
},
);

let mut execution_props = ExecutionProps::new();
Very nice
@@ -892,25 +891,6 @@ impl BinaryExpr {
}
}

/// return two physical expressions that are optionally coerced to a
/// common type that the binary operator supports.
fn binary_cast(
🎉
@@ -1080,7 +1071,7 @@ mod tests {
Operator::Plus,
Int32Array,
DataType::Int32,
vec![2i32, 4i32]
vec![2i32, 4i32],
);
test_coercion!(
I wonder if these tests are doing anything useful anymore now that we are coercing in the logical layer? 🤔 It seems like they are now testing the test code 😆
I plan to get this PR ready to merge by adding my suggested comments, fixing clippy, and merging to master
(since I already have it checked out, this will be a simple thing for me).
Benchmark runs are scheduled for baseline = b625277 and contender = d7c0e42. d7c0e42 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
Closes #3388
Depends on #3421 #3289 #3459 #3472
fixed the row filter predication for the data type of NULL value #3470
Rationale for this change
What changes are included in this PR?
To make this PR easier to review, here is what I did:
- Remove the type coercion in the creation of the binary physical expr and fix the test cases.
- Redo the simplify expression pass again after the type coercion in the optimizer. This is just a minor fix. Once the issue "move the type coercion out of the optimizer and refactor the optimizer" #3582 is done, we can remove the duplicated simplify expression pass in the optimizer.
- Try to do the type coercion before evaluating the logical expr in the file according to the comments in the issue "simplify_expressions don't support different data type for binary" #3556 from @alamb. This is just a temporary fix, and it will be removed once #3582 is done.
- Remove the between optimization in the simplify expression pass because of the issue "simplify between expr should consider the data type" #3587; it will be restored once #3582 is done.
- Refresh the test cases.
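As a rough illustration of the first item, the sketch below shows the division of labour this PR aims for: a logical-layer rule inserts casts so that both operands share a type, and the physical-layer constructor only checks that they match instead of casting. This is a toy model with hypothetical names (`common_type`, `coerce_binary`, `create_physical_binary`), not the actual DataFusion code, and it only models arithmetic operators whose result type equals the operand type.

```rust
// Toy model only: all names here are hypothetical, not DataFusion's API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Float64,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(&'static str, DataType),
    Cast(Box<Expr>, DataType),
    Binary(Box<Expr>, &'static str, Box<Expr>),
}

impl Expr {
    fn data_type(&self) -> DataType {
        match self {
            Expr::Column(_, t) | Expr::Cast(_, t) => *t,
            // Assumes the operands were already coerced to one type.
            Expr::Binary(l, _, _) => l.data_type(),
        }
    }
}

/// Pick the wider of two numeric types (a stand-in for real coercion rules).
fn common_type(l: DataType, r: DataType) -> DataType {
    fn rank(t: DataType) -> u8 {
        match t {
            DataType::Int32 => 0,
            DataType::Int64 => 1,
            DataType::Float64 => 2,
        }
    }
    if rank(l) >= rank(r) {
        l
    } else {
        r
    }
}

fn cast_if_needed(e: Expr, to: DataType) -> Expr {
    if e.data_type() == to {
        e
    } else {
        Expr::Cast(Box::new(e), to)
    }
}

/// Logical-layer rule: rewrite `l op r` so both sides share one type.
fn coerce_binary(l: Expr, op: &'static str, r: Expr) -> Expr {
    let t = common_type(l.data_type(), r.data_type());
    Expr::Binary(
        Box::new(cast_if_needed(l, t)),
        op,
        Box::new(cast_if_needed(r, t)),
    )
}

/// Physical-layer constructor: no casting, only a type check.
fn create_physical_binary(e: &Expr) -> Result<(), String> {
    match e {
        Expr::Binary(l, op, r) if l.data_type() != r.data_type() => Err(format!(
            "operand types differ: {:?} {} {:?}",
            l.data_type(),
            op,
            r.data_type()
        )),
        _ => Ok(()),
    }
}
```

Under these assumptions, building the physical expression for an uncoerced `a + b` with Int32 and Int64 operands fails the check, while the output of `coerce_binary` passes it because the Int32 side has been wrapped in a cast to Int64.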
cc @alamb @andygrove
Are there any user-facing changes?