-
Notifications
You must be signed in to change notification settings - Fork 9
/
README.Rmd
151 lines (94 loc) · 6.42 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
---
output: github_document
bibliography: bib.bib
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
```
# r2glmm
This package computes model and semi partial $R^2$ with confidence limits for the linear and generalized linear mixed model (LMM and GLMM). The $R^2$ measure from @edwards2008r2 is extended to the GLMM using penalized quasi-likelihood (PQL) estimation (see @r2glmmJaeger).
* Changes: Version 0.1.3
1. Updated semi-partial computation. Categorical variables and polynomial regression variables are now grouped.
* Changes: Version 0.1.2
1. Optimized computation of matrix inverses and cross-products in order to decrease computation time.
* Changes: Version 0.1.1
1. Updated the r2beta function with an optional data argument. Users who wish to use a function (i.e. log(x)) in their model formula should specify the original data frame used by the model when using the r2beta function.
2. Included support for GLMMs fitted using the glmer function from the lme4 package.
4. Added generic plot and print functions for R2 objects.
* Why use this package?
The $R^2$ statistic is a well known tool that describes goodness-of-fit for a statistical model. In the linear model, $R^2$ is interpreted as the proportion of variance in the data explained by the fixed predictors and semi-partial $R^2$ provide standardized measures of effect size for subsets of fixed predictors. In the linear mixed model, numerous definitions of $R^2$ exist and interpretations vary by definition. The r2glmm package computes $R^2$ using three definitions:
1. $R_\beta^2$, a standardized measure of multivariate association between the fixed predictors and the observed outcome. This method was introduced by @edwards2008r2.
2. $R_\Sigma^2$, the proportion of generalized variance explained by the fixed predictors. This method was introduced by @r2glmmJaeger
3. $R_{(m)}^2$, the proportion of variance explained by the fixed predictors. This method was introduced by @nakagawa2013general and later modified by @johnson2014extension.
Each interpretation can be used for model selection and is helpful for summarizing model goodness-of-fit. While the information criteria are useful tools for model selection, they do not quantify goodness-of-fit, making the $R^2$ statistic an excellent tool to accompany values of AIC and BIC. Additionally, in the context of mixed models, semi-partial $R^2$ and confidence limits are two useful and exclusive features of the r2glmm package.
* Instructions for installation:
The most up-to-date version of the r2glmm package is available on Github. To download the package from Github, after installing and loading the devtools package, run the following code from the R console:
```{r, eval=FALSE}
devtools::install_github('bcjaeger/r2glmm')
```
Alternatively, There is a version of the package available on CRAN. To download the package from CRAN, run the following code from the R console:
```{r, eval=FALSE}
install.packages('r2glmm')
```
* How to use this package
The main function in this package is called r2beta. The r2beta function summarizes a mixed model by computing the model $R^2$ statistic and semi-partial $R^2$ statistics for each fixed predictor in the model. The r2glmm package computes $R^2$ using three definitions. Below we list the methods, their interpretation, and an example of their application:
1. $R_\beta^2$, a standardized measure of multivariate association between the fixed predictors and the observed outcome. This statistic is primarily used to select fixed effects in the linear and generalized linear mixed model.
```{r}
library(lme4)
library(nlme)
library(r2glmm)
library(splines)
data(Orthodont)
# Compute mean models with the r2beta statistic
# using the Kenward-Roger approach.
mer1 = lmer(distance ~ bs(age)*Sex + (1|Subject), data = Orthodont)
mer2 = lmer(distance ~ age + (1|Subject), data = Orthodont)
(r2.mer1 = r2beta(mer1, method = 'kr', partial = T, data = Orthodont))
(r2.mer2 = r2beta(mer2, method = 'kr', partial = T, data = Orthodont))
```
2. $R_\Sigma^2$, the proportion of generalized variance explained by the fixed predictors. This statistic is primarily used to select a covariance structure in the linear and generalized linear mixed model.
```{r}
# m1 has a compound symmetric (CS) covariance structure.
lme1 = lme(distance ~ age*Sex, ~1|Subject, data = Orthodont)
# m2 is an order 1 autoregressive (AR1) model with
# gender-specific residual variance estimates.
lme2 = lme(distance ~ age*Sex, ~1|Subject, data=Orthodont,
correlation = corAR1(form=~1|Subject),
weights = varIdent(form=~1|Sex))
# Compare the models
(r2m1 = r2beta(model=lme1,method='sgv',partial=FALSE))
(r2m2 = r2beta(model=lme2,method='sgv',partial=FALSE))
```
3. $R_{(m)}^2$, the proportion of variance explained by the fixed predictors. This statistic is a simplified version of $R_\beta^2$ that can be used as a substitute for models fitted to very large datasets.
```{r}
# Compute the R2 statistic using Nakagawa and Schielzeth's approach.
(r2nsj = r2beta(mer1, method = 'nsj', partial = TRUE))
# Check the result with MuMIn's r.squaredGLMM
r2nsj_mum = MuMIn::r.squaredGLMM(mer1)
all.equal(r2nsj[1,'Rsq'],as.numeric(r2nsj_mum[1]), tolerance = 1e-3)
```
* $R^2$ for the Generalized Linear Mixed Model (GLMM)
The r2glmm package can compute $R_\beta^2$ for models fitted using the glmer function from the lme4 package. Note that this method is experimental in R and values of $R_\beta^2$ sometimes exceed 1. We recommend using the SAS macro available at https://github.com/bcjaeger/R2FixedEffectsGLMM/blob/master/Glimmix_R2_V3.sas. $R_\Sigma^2$ is more stable and can be computed for models fitted using either the glmer function or the glmmPQL function from the MASS package; however, minor differences in model estimation can lead to slight variation in the values of $R_\Sigma^2$.
```{r}
library(lattice)
library(MASS)
cbpp$period = as.numeric(cbpp$period)
# using glmer (based in lme4)
gm1 <- glmer(
formula=cbind(incidence, size-incidence) ~ bs(period) + (1|herd),
data = cbpp, family = binomial)
# using glmmPQL (based on nlme)
pql1 <- glmmPQL(
cbind(incidence, size-incidence) ~ bs(period),
random = ~ 1|herd, family = binomial, data = cbpp
)
# Note minor differences in R^2_Sigma
r2beta(model = gm1, method = 'sgv', data = cbpp)
r2beta(model = pql1, method = 'sgv', data = cbpp)
```
# References