Skip to content

[SPARK-30681][PYTHON][FOLLOW-UP] Keep the name similar with Scala side in higher order functions#31062

Closed
HyukjinKwon wants to merge 1 commit intoapache:masterfrom
HyukjinKwon:SPARK-30681
Closed

[SPARK-30681][PYTHON][FOLLOW-UP] Keep the name similar with Scala side in higher order functions#31062
HyukjinKwon wants to merge 1 commit intoapache:masterfrom
HyukjinKwon:SPARK-30681

Conversation

@HyukjinKwon
Copy link
Member

What changes were proposed in this pull request?

This PR is a followup of #27406. It fixes the naming to match with Scala side.

Note that there are a bit of inconsistency already e.g.) col, e, expr and column. This part I did not change but other names like zero vs initialValue or col1/col2 vs left/right looks unnecessary.

Why are the changes needed?

To make the usage similar with Scala side, and for consistency.

Does this PR introduce any user-facing change?

No, this is not released yet.

How was this patch tested?

GitHub Actions and Jenkins build will test it out.

@HyukjinKwon
Copy link
Member Author

Thanks for post-reviewing, @mengxr. Can you take a look please? cc @viirya, @zero323 too FYI.

Copy link
Contributor

@mengxr mengxr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending tests.

@SparkQA
Copy link

SparkQA commented Jan 6, 2021

Test build #133728 has finished for PR 31062 at commit e1a2cda.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member Author

Thanks @mengxr and @viirya. Merged to master and branch-3.1.

HyukjinKwon added a commit that referenced this pull request Jan 6, 2021
…e in higher order functions

### What changes were proposed in this pull request?

This PR is a followup of #27406. It fixes the naming to match with Scala side.

Note that there are a bit of inconsistency already e.g.) `col`, `e`, `expr` and `column`. This part I did not change but other names like `zero` vs `initialValue` or `col1`/`col2` vs `left`/`right` looks unnecessary.

### Why are the changes needed?

To make the usage similar with Scala side, and for consistency.

### Does this PR introduce _any_ user-facing change?

No, this is not released yet.

### How was this patch tested?

GitHub Actions and Jenkins build will test it out.

Closes #31062 from HyukjinKwon/SPARK-30681.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit ff284fb)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@SparkQA
Copy link

SparkQA commented Jan 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38316/

@SparkQA
Copy link

SparkQA commented Jan 6, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38316/

Copy link
Member

@zero323 zero323 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for working on this @HyukjinKwon

HyukjinKwon added a commit that referenced this pull request Jan 18, 2021
…e in higher order functions

### What changes were proposed in this pull request?

This PR is a followup of #27433. It fixes the naming to match with Scala side, and this is similar with #31062.

Note that:

- there are a bit of inconsistency already e.g.) `x`, `y` in SparkR and they are documented together for doc deduplication. This part I did not change but the name `zero` vs `initialValue` looks unnecessary.
- such naming matching seems already pretty common in SparkR.

### Why are the changes needed?

To make the usage similar with Scala side, and for consistency.

### Does this PR introduce _any_ user-facing change?

No, this is not released yet.

### How was this patch tested?

GitHub Actions and Jenkins build will test it out.

Also, I manually tested:

```r
> df <- select(createDataFrame(data.frame(id = 1)),expr("CAST(array(1.0, 2.0, -3.0, -4.0) AS array<double>) xs"))
> collect(select(df, array_aggregate("xs", initialValue = lit(0.0), merge = function(x, y) otherwise(when(x > y, x), y))))
  aggregate(xs, 0.0, lambdafunction(CASE WHEN (x > y) THEN x ELSE y END, x, y), lambdafunction(id, id))
1                                                                                                     2
```

Closes #31226 from HyukjinKwon/SPARK-30682.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
HyukjinKwon added a commit that referenced this pull request Jan 18, 2021
…e in higher order functions

### What changes were proposed in this pull request?

This PR is a followup of #27433. It fixes the naming to match with Scala side, and this is similar with #31062.

Note that:

- there are a bit of inconsistency already e.g.) `x`, `y` in SparkR and they are documented together for doc deduplication. This part I did not change but the name `zero` vs `initialValue` looks unnecessary.
- such naming matching seems already pretty common in SparkR.

### Why are the changes needed?

To make the usage similar with Scala side, and for consistency.

### Does this PR introduce _any_ user-facing change?

No, this is not released yet.

### How was this patch tested?

GitHub Actions and Jenkins build will test it out.

Also, I manually tested:

```r
> df <- select(createDataFrame(data.frame(id = 1)),expr("CAST(array(1.0, 2.0, -3.0, -4.0) AS array<double>) xs"))
> collect(select(df, array_aggregate("xs", initialValue = lit(0.0), merge = function(x, y) otherwise(when(x > y, x), y))))
  aggregate(xs, 0.0, lambdafunction(CASE WHEN (x > y) THEN x ELSE y END, x, y), lambdafunction(id, id))
1                                                                                                     2
```

Closes #31226 from HyukjinKwon/SPARK-30682.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit b5bdbf2)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
skestle pushed a commit to skestle/spark that referenced this pull request Feb 3, 2021
…e in higher order functions

### What changes were proposed in this pull request?

This PR is a followup of apache#27433. It fixes the naming to match with Scala side, and this is similar with apache#31062.

Note that:

- there are a bit of inconsistency already e.g.) `x`, `y` in SparkR and they are documented together for doc deduplication. This part I did not change but the name `zero` vs `initialValue` looks unnecessary.
- such naming matching seems already pretty common in SparkR.

### Why are the changes needed?

To make the usage similar with Scala side, and for consistency.

### Does this PR introduce _any_ user-facing change?

No, this is not released yet.

### How was this patch tested?

GitHub Actions and Jenkins build will test it out.

Also, I manually tested:

```r
> df <- select(createDataFrame(data.frame(id = 1)),expr("CAST(array(1.0, 2.0, -3.0, -4.0) AS array<double>) xs"))
> collect(select(df, array_aggregate("xs", initialValue = lit(0.0), merge = function(x, y) otherwise(when(x > y, x), y))))
  aggregate(xs, 0.0, lambdafunction(CASE WHEN (x > y) THEN x ELSE y END, x, y), lambdafunction(id, id))
1                                                                                                     2
```

Closes apache#31226 from HyukjinKwon/SPARK-30682.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@HyukjinKwon HyukjinKwon deleted the SPARK-30681 branch January 4, 2022 00:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

Comments