Error in colMeans(as.matrix(imp[[j]]), na.rm = TRUE) : 'x' must be numeric #601

Open
stefvanbuuren opened this issue Nov 20, 2023 · 4 comments

@stefvanbuuren
Member

stefvanbuuren commented Nov 20, 2023

Describe the bug
mice() crashes when the data contain an incomplete character variable.

To Reproduce

library(mice)
#> 
#> Attaching package: 'mice'
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following objects are masked from 'package:base':
#> 
#>     cbind, rbind
nh3 <- nhanes2
nh3$chl <- as.character(nh3$chl)
mice(nh3)
#> 
#>  iter imp variable
#>   1   1  bmi  hyp
#>   1   2  bmi  hyp
#>   1   3  bmi  hyp
#>   1   4  bmi  hyp
#>   1   5  bmi  hyp
#> Error in colMeans(as.matrix(imp[[j]]), na.rm = TRUE): 'x' must be numeric

Created on 2023-11-20 with reprex v2.0.2

Expected behavior
mice() should not touch or impute character variables.
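
Until that behaviour is guaranteed, one way to get it by hand is to set character columns aside before calling mice(). The snippet below is only a sketch of that idea: `split_character_columns()` is a hypothetical helper written for this example, not part of mice, and the small `dat` data frame is made up for illustration.

```r
# Hypothetical helper (not part of mice): split a data frame into
# its character columns and everything else.
split_character_columns <- function(data) {
  is_chr <- vapply(data, is.character, logical(1))
  list(
    to_impute = data[, !is_chr, drop = FALSE],
    set_aside = data[, is_chr, drop = FALSE]
  )
}

# Made-up data with one incomplete character column.
dat <- data.frame(
  age = c(1, 2, NA, 3),
  bmi = c(22.1, NA, 27.5, 24.9),
  chl = c("187", NA, "229", "204"),
  stringsAsFactors = FALSE
)

parts <- split_character_columns(dat)
names(parts$to_impute)  # "age" "bmi"
names(parts$set_aside)  # "chl"

# If mice is installed, impute only the non-character part of the
# data from the reprex above.
if (requireNamespace("mice", quietly = TRUE)) {
  nh3 <- mice::nhanes2
  nh3$chl <- as.character(nh3$chl)
  imp <- mice::mice(split_character_columns(nh3)$to_impute,
                    printFlag = FALSE)
}
```

The character columns in `set_aside` can be bound back onto the completed data afterwards, since the row order is unchanged.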

@hanneoberman
Member

Cannot reproduce with mice 3.16.8

nh3 <- mice::nhanes2
nh3$chl <- as.character(nh3$chl)
imp <- mice::mice(nh3)
#> 
#>  iter imp variable
#>   1   1  bmi  hyp
#>   1   2  bmi  hyp
#>   1   3  bmi  hyp
#>   1   4  bmi  hyp
#>   1   5  bmi  hyp
#>   2   1  bmi  hyp
#>   2   2  bmi  hyp
#>   2   3  bmi  hyp
#>   2   4  bmi  hyp
#>   2   5  bmi  hyp
#>   3   1  bmi  hyp
#>   3   2  bmi  hyp
#>   3   3  bmi  hyp
#>   3   4  bmi  hyp
#>   3   5  bmi  hyp
#>   4   1  bmi  hyp
#>   4   2  bmi  hyp
#>   4   3  bmi  hyp
#>   4   4  bmi  hyp
#>   4   5  bmi  hyp
#>   5   1  bmi  hyp
#>   5   2  bmi  hyp
#>   5   3  bmi  hyp
#>   5   4  bmi  hyp
#>   5   5  bmi  hyp
#> Warning: Number of logged events: 1
imp$loggedEvents
#>   it im dep     meth out
#> 1  0  0     constant chl

Created on 2023-11-20 with reprex v2.0.2

@stefvanbuuren
Member Author

Ah, thanks. I forgot to mention that I ran my test on the support_blocks branch.

I will add a test to that branch to ban this baby from appearing in master.

@stefvanbuuren
Member Author

Test added to mice4 branch

@stefvanbuuren
Member Author

I got a report that the error may also appear in the CRAN version, mice 3.16.0. Here's an example and work-around.

library(mice)
library(dplyr)
packageVersion('mice') # 3.16.0

nh3 <- mice::nhanes2
# add column with a character variable
rin <- c("123456789", "123456788", "123456778", "123456678", "123455678", 
         "123456799", "123445689", "123445679", "123345689", "122345678",
         "223456789", "223456788", "223456778", "223456678", "223455678", 
         "223456799", "223445689", "223445679", "223345689", "222345678",
         "323456799", "323445689", "323445679", "323345689", "322345678")
nh3_data <- nh3 %>% cbind(rin)

# impute train data
imp <- mice(nh3_data, m = 3, seed = 22112)
# use mice.mids and the mids object imp on test data (I used the same data set, but suppose it is new test data)
imp_test <- mice.mids(imp, newdata = nh3_data, maxit = 1)

# If you're unlucky (BUT WHY??) you'll get: Error in colMeans(as.matrix(imp[[j]]), na.rm = TRUE) : 'x' must be numeric

# ad-hoc solution 
nh3_data <- nh3_data %>% mutate(rin = as.numeric(rin),
                                chl = as.numeric(chl))
# the error seems to be caused by character variables, even complete ones that are not imputed
imp_test <- mice.mids(imp, newdata = nh3_data, maxit = 1) 

When I run this on my system, everything is fine. However, some users report a crash with Error in colMeans(as.matrix(imp[[j]]), na.rm = TRUE) : 'x' must be numeric. It is not yet clear why the behaviour is inconsistent across systems.
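
To compare a system that crashes with one that does not, it may help to collect the same facts on both sides. The following is a minimal sketch, assuming that the R version, the mice version and the column classes of the data are the relevant things to compare. `describe_setup()` is a hypothetical helper written for this example, not a mice function, and `example` is made-up data.

```r
# Hypothetical helper: summarise settings that may differ between systems.
describe_setup <- function(data) {
  is_chr <- vapply(data, is.character, logical(1))
  list(
    r_version = R.version.string,
    # NA if mice is not installed on this system
    mice_version = if (requireNamespace("mice", quietly = TRUE)) {
      as.character(utils::packageVersion("mice"))
    } else {
      NA_character_
    },
    # first class of every column, e.g. "numeric", "factor", "character"
    column_classes = vapply(data, function(x) class(x)[1], character(1)),
    character_columns = names(data)[is_chr]
  )
}

# Made-up data with one character column, similar to rin above.
example <- data.frame(
  bmi = c(22.1, NA),
  rin = c("123456789", "123456788"),
  stringsAsFactors = FALSE
)

info <- describe_setup(example)
str(info)
```

Running `describe_setup(nh3_data)` just before the `mice.mids()` call on each system would show whether the data or the package versions differ.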

stefvanbuuren reopened this Apr 26, 2024
stefvanbuuren changed the title from "mice() crashes on incomplete character variable" to "Error in colMeans(as.matrix(imp[[j]]), na.rm = TRUE) : 'x' must be numeric" Apr 26, 2024