
Conversation

@ZiyueHuang (Member) commented Jun 12, 2017

What changes were proposed in this pull request?

`df.groupBy.count()` should be `df.groupBy().count()`, otherwise there is a compile error:

```
ambiguous reference to overloaded definition,
both method groupBy in class Dataset of type (col1: String, cols: String*)
and  method groupBy in class Dataset of type (cols: org.apache.spark.sql.Column*)
```

How was this patch tested?

```scala
val df = spark.readStream.schema(...).json(...)
val dfCounts = df.groupBy().count()
```

@srowen (Member) commented Jun 12, 2017

Hm, how do both match? The first one must have at least one String arg.

@ZiyueHuang (Member, Author) commented:

@srowen The ambiguity is due to the lack of parentheses after `groupBy`, so it should be `df.groupBy().count()`, not `df.groupBy.count()`; otherwise there is an ambiguity error. I have tested this.

@srowen (Member) commented Jun 12, 2017

OK yeah, I tried it, and the error is a little more specific: you can't invoke a varargs method this way.
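The resolution behavior can be sketched with a small toy class (hypothetical, not Spark's actual `Dataset`, but mirroring the two `groupBy` overload shapes quoted in the error):

```scala
// Toy class with two overloads shaped like Dataset.groupBy:
// one requiring at least one String argument, one pure varargs.
class Toy {
  def groupBy(col1: String, cols: String*): String = "by strings"
  def groupBy(cols: Int*): String = "by columns"
}

val t = new Toy
// t.groupBy          // does not compile: ambiguous reference to overloaded definition
t.groupBy()           // OK: only the varargs-only overload accepts an empty argument list
```

With explicit parentheses the compiler has an argument list to match against, and only the varargs-only overload is applicable to zero arguments; without parentheses, the bare reference `t.groupBy` cannot be resolved between the two overloads.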

@SparkQA commented Jun 12, 2017

Test build #3791 has finished for PR 18272 at commit 67dd3c7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) commented Jun 12, 2017

Merged to master/2.2

asfgit pushed a commit that referenced this pull request Jun 12, 2017
## What changes were proposed in this pull request?

`df.groupBy.count()` should be `df.groupBy().count()`, otherwise there is a compile error:

```
ambiguous reference to overloaded definition,
both method groupBy in class Dataset of type (col1: String, cols: String*)
and  method groupBy in class Dataset of type (cols: org.apache.spark.sql.Column*)
```

## How was this patch tested?

```scala
val df = spark.readStream.schema(...).json(...)
val dfCounts = df.groupBy().count()
```

Author: Ziyue Huang <[email protected]>

Closes #18272 from ZiyueHuang/master.

(cherry picked from commit e6eb02d)
Signed-off-by: Sean Owen <[email protected]>
@asfgit asfgit closed this in e6eb02d Jun 12, 2017
dataknocker pushed a commit to dataknocker/spark that referenced this pull request Jun 16, 2017
Closes apache#18272 from ZiyueHuang/master.