all_of() or c_across() function error when tidyselect, since update 1.0.6 of dplyr #5883

Closed
HYoung1698 opened this issue May 13, 2021 · 0 comments · Fixed by #5885

HYoung1698 commented May 13, 2021

all_of() or c_across() fails to tidy-select a set of variables for one particular set of column-name strings. This is a really weird error; please see my demo below. Thank you!

library(tidyverse)

### The behavior of tidy-select changed with the 1.0.6 release of dplyr.
### If all_of() was modified in that release, it is likely the cause of the problem below.

# Generate a random dataset with column names A, B, C, D, E
FOOBAR.df <- data.frame(matrix(rnorm(50), nrow = 10)) %>% `colnames<-`(LETTERS[1:5])

# To get row sums, c_across() with all_of() and a character vector always worked in dplyr <= 1.0.5

FOOBAR.df %>% rowwise() %>% transmute(sum(c_across(all_of(c("A", "B", "C", "D", "E")))))
#> # A tibble: 10 x 1
#> # Rowwise: 
#>    `sum(c_across(all_of(c("A", "B", "C", "D", "E"))))`
#>                                                  <dbl>
#>  1                                             -0.967 
#>  2                                             -0.168 
#>  3                                              0.0993
#>  4                                             -0.959 
#>  5                                             -2.26  
#>  6                                             -4.41  
#>  7                                              1.48  
#>  8                                             -0.261 
#>  9                                              2.16  
#> 10                                             -3.38

# But I get an error specific to the column names in my own dataset, so I am using exactly the names that triggered it
colnames(FOOBAR.df) <- c("Prevotella", "Streptococcus", "Gemella", "Rothia", "Haemophilus")

# Trying to get the row sum of all the variables named in the character vector results in an error.
FOOBAR.df %>% rowwise() %>% transmute(sum(c_across(all_of(c("Prevotella", "Streptococcus", "Gemella", "Rothia", "Haemophilus"))))) #%>% tryCatch(error = function(e) "no such index at level 1") 
#> Error: Problem with `mutate()` input `..1`.
#> ℹ `..1 = sum(...)`.
#> x no such index at level 1
#> 
#> ℹ The error occurred in row 1.

# Why do I say this error seems specific to this vector of column names? I tried the following tests:

#  a) Change a spelling: for example, spelling "Prevotella" (the first column name) as "Prevotela", with only one "l", works fine

colnames(FOOBAR.df) <- c("Prevotela", "Streptococcus", "Gemella", "Rothia", "Haemophilus")
FOOBAR.df %>% rowwise() %>% transmute(sum(c_across(all_of(c("Prevotela", "Streptococcus", "Gemella", "Rothia", "Haemophilus")))))
#> # A tibble: 10 x 1
#> # Rowwise: 
#>    `sum(...)`
#>         <dbl>
#>  1    -0.967 
#>  2    -0.168 
#>  3     0.0993
#>  4    -0.959 
#>  5    -2.26  
#>  6    -4.41  
#>  7     1.48  
#>  8    -0.261 
#>  9     2.16  
#> 10    -3.38

#  b) Removing one or more elements from the vector also works fine

FOOBAR.df %>% rowwise() %>% transmute(sum(c_across(all_of(c("Streptococcus", "Gemella", "Rothia", "Haemophilus")))))
#> # A tibble: 10 x 1
#> # Rowwise: 
#>    `sum(c_across(all_of(c("Streptococcus", "Gemella", "Rothia", "Haemophilus"))…
#>                                                                            <dbl>
#>  1                                                                       -1.60  
#>  2                                                                       -1.10  
#>  3                                                                       -0.0986
#>  4                                                                       -0.576 
#>  5                                                                       -1.25  
#>  6                                                                       -4.27  
#>  7                                                                        1.96  
#>  8                                                                        0.491 
#>  9                                                                        1.44  
#> 10                                                                       -2.45

#  c) Store the strings in a named vector and pass that variable to all_of(); this also works fine

test_colname_vec <- c("Prevotella", "Streptococcus", "Gemella", "Rothia", "Haemophilus")
colnames(FOOBAR.df) <- test_colname_vec
FOOBAR.df %>% rowwise() %>% transmute(sum(c_across(all_of(test_colname_vec))))
#> # A tibble: 10 x 1
#> # Rowwise: 
#>    `sum(c_across(all_of(test_colname_vec)))`
#>                                        <dbl>
#>  1                                   -0.967 
#>  2                                   -0.168 
#>  3                                    0.0993
#>  4                                   -0.959 
#>  5                                   -2.26  
#>  6                                   -4.41  
#>  7                                    1.48  
#>  8                                   -0.261 
#>  9                                    2.16  
#> 10                                   -3.38

#  In summary, the error seems related to how all_of() or c_across() handles a character vector passed inline,
#  and it appears sensitive to the content and length of the strings. This is really confusing; the same code had no problems with dplyr 1.0.5.
#  FYI, the version of R I was using is listed below:
version[c("system", "version.string")]
#>                _                           
#> system         x86_64, darwin15.6.0        
#> version.string R version 3.6.2 (2019-12-12)

Created on 2021-05-12 by the reprex package (v2.0.0)

romainfrancois changed the title from "all_of or c_across` function error when tidyselect, since update 1.0.6 of dplyr" to "all_of() or c_across() function error when tidyselect, since update 1.0.6 of dplyr" on May 17, 2021
romainfrancois self-assigned this on May 17, 2021
romainfrancois added the "bug" label (an unexpected problem or unintended behavior) on May 17, 2021
romainfrancois added this to the 1.0.7 milestone on May 17, 2021
romainfrancois added a commit that referenced this issue on May 18, 2021:
* making sure key_deparse() returns a single string

closes #5883
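
The commit message above mentions making key_deparse() return a single string. One plausible reading (an assumption, since the issue thread does not spell it out) is that base R's deparse() can return more than one string when an expression is long, which would fit the observation that the failure depends on the length of the inline vector. The sketch below only illustrates that base R behavior; deparse_one() is a hypothetical helper written for this example and is not dplyr's implementation.

```r
# Illustration only: the calls are quoted, never evaluated, so dplyr is not needed.
long_call  <- quote(all_of(c("Prevotella", "Streptococcus", "Gemella", "Rothia", "Haemophilus")))
short_call <- quote(all_of(c("A", "B", "C", "D", "E")))

# deparse() may split a long expression across several strings
# (its default width.cutoff is 60); compare the two lengths.
length(deparse(long_call))
length(deparse(short_call))

# Hypothetical helper (not dplyr's code): always collapse to one string,
# so the result can safely be used as a single lookup key.
deparse_one <- function(expr) {
  paste(deparse(expr, width.cutoff = 500L), collapse = "")
}

key <- deparse_one(long_call)
length(key)
```

Collapsing with paste() is one simple way to guarantee a length-one result regardless of how deparse() chooses to break lines; the actual change is in the commit referenced above.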