@AngersZhuuuu
Contributor

What changes were proposed in this pull request?

For query

select array_intersect(array(cast('nan' as double), 1d), array(cast('nan' as double)))

This returns [NaN], but it should return [].
The issue is that OpenHashSet cannot handle Double.NaN or Float.NaN.
This PR fixes that, building on #33955.
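
For readers unfamiliar with why NaN is awkward for set membership, here is a minimal sketch. Under IEEE 754, NaN compares unequal to itself, so any lookup that relies on primitive `==` will never find it. `NaNEqualityDemo`, `naiveContains`, and `nanAwareContains` are hypothetical names for illustration only. This is not Spark's OpenHashSet code, and it does not claim to reproduce its internals.

```scala
object NaNEqualityDemo {
  // Hypothetical stand-in for an equality-based lookup on primitive doubles.
  // Assumption: this is NOT Spark's OpenHashSet, only an illustration.
  def naiveContains(values: Array[Double], target: Double): Boolean =
    values.exists(_ == target)

  // Lookup that treats every NaN as equal to every other NaN.
  def nanAwareContains(values: Array[Double], target: Double): Boolean =
    if (target.isNaN) values.exists(_.isNaN) else values.exists(_ == target)

  def main(args: Array[String]): Unit = {
    val nan = Double.NaN
    val xs = Array(nan, 1.0)
    println(nan == nan)                  // false: NaN is never equal to itself
    println(naiveContains(xs, nan))      // false: the NaN in xs is not found
    println(nanAwareContains(xs, nan))   // true: NaN is matched via isNaN
  }
}
```

The same reasoning applies to Float.NaN, since Float follows the same comparison rule.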

Why are the changes needed?

Fix bug

Does this PR introduce any user-facing change?

Yes. ArrayIntersect now treats NaN values as equal, so a NaN present in both input arrays is returned in the result.

How was this patch tested?

Added UT

@github-actions github-actions bot added the SQL label Sep 14, 2021
@SparkQA

SparkQA commented Sep 14, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47768/

hsResult.add(elem)
if (isNaN(elem)) {
  if (hs.containsNaN() && !hsResult.containsNaN()) {
    arrayBuffer += elem
Member

For this, let's wait a little bit for the decision at the first PR.

@SparkQA

SparkQA commented Sep 14, 2021

Test build #143265 has finished for PR 33995 at commit 4189d71.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 15, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47811/

@SparkQA

SparkQA commented Sep 15, 2021

Test build #143308 has finished for PR 33995 at commit 64afef9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu
Contributor Author

ping @cloud-fan

@AngersZhuuuu AngersZhuuuu marked this pull request as draft September 17, 2021 05:43
@SparkQA

SparkQA commented Sep 17, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47904/

@SparkQA

SparkQA commented Sep 17, 2021

Test build #143396 has finished for PR 33995 at commit a9e6205.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu AngersZhuuuu marked this pull request as ready for review September 17, 2021 12:54
@AngersZhuuuu
Contributor Author

ping @cloud-fan

@SparkQA

SparkQA commented Sep 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47919/

@SparkQA

SparkQA commented Sep 17, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47919/

@SparkQA

SparkQA commented Sep 17, 2021

Test build #143412 has finished for PR 33995 at commit 85f9d9d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class WriterBucketSpec(

@cloud-fan cloud-fan closed this in 2fc7f2f Sep 20, 2021
cloud-fan pushed a commit that referenced this pull request Sep 20, 2021
…oat.NaN

### What changes were proposed in this pull request?
For query
```
select array_intersect(array(cast('nan' as double), 1d), array(cast('nan' as double)))
```
This returns [NaN], but it should return [].
The issue is that `OpenHashSet` cannot handle `Double.NaN` or `Float.NaN`.
This PR fixes that, building on #33955.

### Why are the changes needed?
Fix bug

### Does this PR introduce _any_ user-facing change?
Yes. ArrayIntersect now treats `NaN` values as equal, so a `NaN` present in both input arrays is returned in the result.

### How was this patch tested?
Added UT

Closes #33995 from AngersZhuuuu/SPARK-36754.

Authored-by: Angerszhuuuu <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 2fc7f2f)
Signed-off-by: Wenchen Fan <[email protected]>
cloud-fan pushed a commit that referenced this pull request Sep 20, 2021
cloud-fan pushed a commit that referenced this pull request Sep 20, 2021
@cloud-fan
Contributor

thanks, merging to master/3.2/3.1/3.0!

@karenfeng
Contributor

For clarification @AngersZhuuuu: the PR description says:

For query

select array_intersect(array(cast('nan' as double), 1d), array(cast('nan' as double)))

This returns [NaN], but it should return [].

Is this the right way around? It seems like we now correctly return [NaN], but previously incorrectly returned [].

@AngersZhuuuu
Contributor Author

Oh, sorry for the mistake. The correct behavior is to return [NaN].
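
To make the corrected expectation concrete, here is a minimal sketch of a NaN-aware intersection over plain Scala arrays. It tracks NaN with separate flags, loosely in the spirit of the `containsNaN` checks in the reviewed snippet. `NaNAwareIntersectSketch` and `intersect` are hypothetical names, the code uses ordinary Scala collections rather than Spark's OpenHashSet, and it should not be read as the merged implementation.

```scala
import scala.collection.mutable

object NaNAwareIntersectSketch {
  // Hypothetical helper, not Spark's ArrayIntersect: intersects two arrays of
  // doubles, treating NaN as equal to NaN and emitting each value at most once.
  def intersect(left: Array[Double], right: Array[Double]): Array[Double] = {
    // NaN is tracked with flags instead of being stored in the sets.
    val rightHasNaN = right.exists(_.isNaN)
    val rightSet = right.filterNot(_.isNaN).toSet
    val seen = mutable.HashSet.empty[Double]
    var emittedNaN = false
    val out = mutable.ArrayBuffer.empty[Double]

    for (elem <- left) {
      if (elem.isNaN) {
        // Emit NaN once, and only if the right-hand array also has a NaN.
        if (rightHasNaN && !emittedNaN) {
          out += elem
          emittedNaN = true
        }
      } else if (rightSet.contains(elem) && !seen.contains(elem)) {
        out += elem
        seen += elem
      }
    }
    out.toArray
  }
}
```

With this sketch, `NaNAwareIntersectSketch.intersect(Array(Double.NaN, 1.0), Array(Double.NaN))` yields a one-element array holding NaN, which matches the [NaN] result agreed on above.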

fishcus pushed a commit to fishcus/spark that referenced this pull request Jan 12, 2022