Skip to content

Conversation

@actuaryzhang
Copy link
Contributor

What changes were proposed in this pull request?

The [[ method is supposed to take a single index and return a column. This is different from base R which takes a vector index. We should check for this and issue warning or error when vector index is supplied (which is very likely given the behavior in base R).

Currently I'm issuing a warning message and just take the first element of the vector index. We could change this to an error it that's better.

How was this patch tested?

new tests

@actuaryzhang
Copy link
Contributor Author

actuaryzhang commented Feb 21, 2017

@felixcheung
Simple example to illustrate this

df <- suppressWarnings(createDataFrame(iris))
df[[1:2]]
Error in cols[[i]] : 
  attempt to select more than one element in vectorIndex

Instead of issuing warning and taking the first element, we can throw a more meaningful error message. Minor issue, but many R users may be accustomed to use vector index with [[.

Another possibility is to return a dataframe with the selected columns, e.g., df[[1:2]] will return a data frame with the first two columns.

@SparkQA
Copy link

SparkQA commented Feb 21, 2017

Test build #73227 has finished for PR 17017 at commit 66888b6.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 21, 2017

Test build #73228 has finished for PR 17017 at commit f614d00.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 21, 2017

Test build #73231 has finished for PR 17017 at commit 91204aa.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Copy link
Member

Jenkins, retest this please

@felixcheung
Copy link
Member

That's one weird test failure.
Looks good

@SparkQA
Copy link

SparkQA commented Feb 22, 2017

Test build #73264 has finished for PR 17017 at commit 91204aa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

if (length(i) > 1) {
warning("Subset index has length > 1. Only the first index is used.")
i <- i[1]
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, sorry I missed this - could you add a test for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felixcheung Test added. Thanks for catching this.

@SparkQA
Copy link

SparkQA commented Feb 23, 2017

Test build #73329 has finished for PR 17017 at commit 46f6ca8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Feb 23, 2017
…" takes vector index

## What changes were proposed in this pull request?
The `[[` method is supposed to take a single index and return a column. This is different from base R which takes a vector index.  We should check for this and issue warning or error when vector index is supplied (which is very likely given the behavior in base R).

Currently I'm issuing a warning message and just take the first element of the vector index. We could change this to an error it that's better.

## How was this patch tested?
new tests

Author: actuaryzhang <[email protected]>

Closes #17017 from actuaryzhang/sparkRSubsetter.

(cherry picked from commit 7bf0943)
Signed-off-by: Felix Cheung <[email protected]>
@felixcheung
Copy link
Member

thanks - merged to master and branch-2.1

@asfgit asfgit closed this in 7bf0943 Feb 23, 2017
Yunni pushed a commit to Yunni/spark that referenced this pull request Feb 27, 2017
…" takes vector index

## What changes were proposed in this pull request?
The `[[` method is supposed to take a single index and return a column. This is different from base R which takes a vector index.  We should check for this and issue warning or error when vector index is supplied (which is very likely given the behavior in base R).

Currently I'm issuing a warning message and just take the first element of the vector index. We could change this to an error it that's better.

## How was this patch tested?
new tests

Author: actuaryzhang <[email protected]>

Closes apache#17017 from actuaryzhang/sparkRSubsetter.
@actuaryzhang actuaryzhang deleted the sparkRSubsetter branch July 1, 2017 00:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants