Skip to content

Conversation

@wangyum
Copy link
Member

@wangyum wangyum commented Jul 25, 2018

What changes were proposed in this pull request?

The below SQL will throw AnalysisException. but it can success on Spark 2.1.x. This pr fix this issue.

CREATE TEMPORARY VIEW t4 AS SELECT * FROM VALUES
  (CAST(1 AS DOUBLE), CAST(2 AS STRING), CAST(3 AS STRING))
AS t1(t4a, t4b, t4c);

CREATE TEMPORARY VIEW t5 AS SELECT * FROM VALUES
  (CAST(1 AS DECIMAL(18, 0)), CAST(2 AS STRING), CAST(3 AS BIGINT))
AS t1(t5a, t5b, t5c);

SELECT * FROM t4
WHERE
(t4a, t4b, t4c) IN (SELECT t5a, t5b, t5c FROM t5);

I tested the different combinations of getting commonTypes, it looks like findCommonTypeForBinaryComparison + findWiderTypeWithoutStringPromotionForTwo is the best way.

No. commonTypes desc
1 findCommonTypeForBinaryComparison + findTightestCommonType Can't compare double with decimal
2 findCommonTypeForBinaryComparison + findWiderTypeForTwo Seems good
3 findWiderTypeForTwo Can't compare string with double
4 findCommonTypeForBinaryComparison + findWiderTypeWithoutStringPromotionForTwo Same to No.2
5 findWiderTypeWithoutStringPromotionForTwo Can't compare string with int
6 findCommonTypeForBinaryComparison + findTightestCommonType + findWiderTypeForDecimal Same to No.2

The details can be found in the commit log.

How was this patch tested?

unit tests

@mgaido91
Copy link
Contributor

I think this is basically the same of what I proposed in #19635. Unfortunately, that PR got a bit stuck...

SELECT cast('2017-12-12 09:30:00' as date) in (cast('2017-12-12 09:30:00' as date), cast('2017-12-11 09:30:00.0' as timestamp)) FROM t;
SELECT cast('2017-12-12 09:30:00' as date) in (cast('2017-12-12 09:30:00' as date), cast('2017-12-11 09:30:00' as date)) FROM t;

SELECT * FROM t WHERE (cast(1 as tinyint)) IN (SELECT cast(1 as tinyint) FROM t);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to test all the combinations? We need most of such logics should be tested in findWiderTypeWithoutStringPromotionForTwo and we could have just few end to end tests.

@wangyum
Copy link
Member Author

wangyum commented Jul 25, 2018

Oh. It turns out that @dilipbiswal is talking about that PR. I didn't find it in your recent PR. Let’s wait if the test can pass.

@SparkQA
Copy link

SparkQA commented Jul 25, 2018

Test build #93538 has finished for PR 21871 at commit 8ef142f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum wangyum closed this Jul 26, 2018
@wangyum
Copy link
Member Author

wangyum commented Oct 18, 2018

workaround:

SELECT * FROM t4
WHERE
(t4a, t4b, t4c) IN (SELECT t5a, t5b, t5c FROM t5);

->

SELECT * FROM t4
WHERE
(t4a, t4b, t4c) IN (SELECT CAST(t5a as DOUBLE), CAST(t5b AS STRING), CAST(t5c AS STRING) FROM t5);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants