Skip to content

[SPARK-30682][R][SQL][FOLLOW-UP] Keep the name similar with Scala side in higher order functions#31226

Closed
HyukjinKwon wants to merge 1 commit intoapache:masterfrom
HyukjinKwon:SPARK-30682
Closed

[SPARK-30682][R][SQL][FOLLOW-UP] Keep the name similar with Scala side in higher order functions#31226
HyukjinKwon wants to merge 1 commit intoapache:masterfrom
HyukjinKwon:SPARK-30682

Conversation

@HyukjinKwon
Copy link
Member

@HyukjinKwon HyukjinKwon commented Jan 18, 2021

What changes were proposed in this pull request?

This PR is a followup of #27433. It fixes the naming to match with Scala side, and this is similar with #31062.

Note that:

  • there are a bit of inconsistency already e.g.) x, y in SparkR and they are documented together for doc deduplication. This part I did not change but the name zero vs initialValue looks unnecessary.
  • such naming matching seems already pretty common in SparkR.

Why are the changes needed?

To make the usage similar with Scala side, and for consistency.

Does this PR introduce any user-facing change?

No, this is not released yet.

How was this patch tested?

GitHub Actions and Jenkins build will test it out.

Also, I manually tested:

> df <- select(createDataFrame(data.frame(id = 1)),expr("CAST(array(1.0, 2.0, -3.0, -4.0) AS array<double>) xs"))
> collect(select(df, array_aggregate("xs", initialValue = lit(0.0), merge = function(x, y) otherwise(when(x > y, x), y))))
  aggregate(xs, 0.0, lambdafunction(CASE WHEN (x > y) THEN x ELSE y END, x, y), lambdafunction(id, id))
1                                                                                                     2

@HyukjinKwon
Copy link
Member Author

@dongjoon-hyun and @viirya, would you mind taking a quick look please?
@zero323 and @mengxr FYI

@viirya
Copy link
Member

viirya commented Jan 18, 2021

lgtm

@SparkQA
Copy link

SparkQA commented Jan 18, 2021

Test build #134176 has finished for PR 31226 at commit 535bc85.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 18, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38760/

@SparkQA
Copy link

SparkQA commented Jan 18, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38760/

@HyukjinKwon
Copy link
Member Author

Merged to master and branch-3.1.

HyukjinKwon added a commit that referenced this pull request Jan 18, 2021
…e in higher order functions

### What changes were proposed in this pull request?

This PR is a followup of #27433. It fixes the naming to match with Scala side, and this is similar with #31062.

Note that:

- there are a bit of inconsistency already e.g.) `x`, `y` in SparkR and they are documented together for doc deduplication. This part I did not change but the name `zero` vs `initialValue` looks unnecessary.
- such naming matching seems already pretty common in SparkR.

### Why are the changes needed?

To make the usage similar with Scala side, and for consistency.

### Does this PR introduce _any_ user-facing change?

No, this is not released yet.

### How was this patch tested?

GitHub Actions and Jenkins build will test it out.

Also, I manually tested:

```r
> df <- select(createDataFrame(data.frame(id = 1)),expr("CAST(array(1.0, 2.0, -3.0, -4.0) AS array<double>) xs"))
> collect(select(df, array_aggregate("xs", initialValue = lit(0.0), merge = function(x, y) otherwise(when(x > y, x), y))))
  aggregate(xs, 0.0, lambdafunction(CASE WHEN (x > y) THEN x ELSE y END, x, y), lambdafunction(id, id))
1                                                                                                     2
```

Closes #31226 from HyukjinKwon/SPARK-30682.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit b5bdbf2)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@dongjoon-hyun
Copy link
Member

+1, late LGTM.

skestle pushed a commit to skestle/spark that referenced this pull request Feb 3, 2021
…e in higher order functions

### What changes were proposed in this pull request?

This PR is a followup of apache#27433. It fixes the naming to match with Scala side, and this is similar with apache#31062.

Note that:

- there are a bit of inconsistency already e.g.) `x`, `y` in SparkR and they are documented together for doc deduplication. This part I did not change but the name `zero` vs `initialValue` looks unnecessary.
- such naming matching seems already pretty common in SparkR.

### Why are the changes needed?

To make the usage similar with Scala side, and for consistency.

### Does this PR introduce _any_ user-facing change?

No, this is not released yet.

### How was this patch tested?

GitHub Actions and Jenkins build will test it out.

Also, I manually tested:

```r
> df <- select(createDataFrame(data.frame(id = 1)),expr("CAST(array(1.0, 2.0, -3.0, -4.0) AS array<double>) xs"))
> collect(select(df, array_aggregate("xs", initialValue = lit(0.0), merge = function(x, y) otherwise(when(x > y, x), y))))
  aggregate(xs, 0.0, lambdafunction(CASE WHEN (x > y) THEN x ELSE y END, x, y), lambdafunction(id, id))
1                                                                                                     2
```

Closes apache#31226 from HyukjinKwon/SPARK-30682.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@HyukjinKwon HyukjinKwon deleted the SPARK-30682 branch January 4, 2022 00:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Comments