@yaooqinn
Member

What changes were proposed in this pull request?

Here is a test case; let's see how it behaves in different SQL engines.

select cast('1 ' as int) as v1, '1 ' = 1 as v2

spark 1.6

NULL true

spark 2.1

NULL true

spark 2.2

NULL NULL

spark 2.3

NULL NULL

spark 2.4

NULL NULL

hive

NULL true

PostgreSQL

postgres=# select cast('1 ' as int) as v1, '1 ' = 1 as v2;
 v1 | v2
----+----
  1 | t
(1 row)

presto

presto> select cast('1 ' as int) as v1, '1 ' = 1 as v2;
Query 20191120_060530_00002_f5kcs failed: line 1:38: '=' cannot be applied to varchar(2), integer
select cast('1 ' as int) as v1, '1 ' = 1 as v2

presto> select cast('1 ' as int) as v1, '1 ' = '1 ' as v2;
Query 20191120_060545_00003_f5kcs failed: Cannot cast '1 ' to INT

Spark's behavior is inconsistent across versions because type coercion has changed since 2.2.
Personally, I think what PostgreSQL and Presto do here is more reasonable and consistent.

Currently, this pull request follows PostgreSQL's behavior. Since this is a behavior change, it may need further discussion.
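
To make the difference concrete, the two cast semantics compared above can be modeled roughly as follows. This is only an illustrative Python sketch, not Spark's or PostgreSQL's actual implementation: the function names `strict_cast_to_int` and `lenient_cast_to_int` are hypothetical, `None` stands in for SQL `NULL`, and trimming with `str.strip()` is an assumption about which characters count as whitespace.

```python
import re
from typing import Optional

# Optional sign followed by ASCII digits, with nothing else allowed.
_INT_PATTERN = re.compile(r"[+-]?[0-9]+")


def strict_cast_to_int(s: str) -> Optional[int]:
    """Model of the NULL results above: any stray whitespace fails the cast."""
    return int(s) if _INT_PATTERN.fullmatch(s) else None


def lenient_cast_to_int(s: str) -> Optional[int]:
    """Model of the PostgreSQL result above: trim whitespace, then cast."""
    return strict_cast_to_int(s.strip())


if __name__ == "__main__":
    print(strict_cast_to_int("1 "))   # None, standing in for NULL
    print(lenient_cast_to_int("1 "))  # 1
```

Under this model the only difference between the two behaviors is whether the input is trimmed before parsing; input that is still not a number after trimming yields `None` in both cases.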

Why are the changes needed?

To handle dirty data more gracefully and to stay consistent with older Spark versions.

Does this PR introduce any user-facing change?

add UT

@yaooqinn
Member Author

cc @cloud-fan @maropu @dongjoon-hyun @gatorsmile @HyukjinKwon, thanks for reviewing in advance.

@wangyum
Member

wangyum commented Nov 21, 2019

We had a discussion before: #24872

@yaooqinn
Member Author

> We had a discussion before: #24872

oops, will close this.

@yaooqinn yaooqinn closed this Nov 21, 2019
@SparkQA

SparkQA commented Nov 21, 2019

Test build #114191 has finished for PR 26618 at commit dc95213.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
