Commit

added documentation on model
topepo committed Oct 10, 2024
1 parent 76d4ff6 commit 4db8ca6
Showing 11 changed files with 481 additions and 7 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -74,8 +74,8 @@ Remotes:
tidymodels/hardhat
ByteCompile: true
Config/Needs/website: C50, dbarts, earth, glmnet, keras, kernlab, kknn,
-LiblineaR, mgcv, nnet, parsnip, randomForest, ranger, rpart, rstanarm,
-tidymodels/tidymodels, tidyverse/tidytemplate, rstudio/reticulate,
+LiblineaR, mgcv, nnet, parsnip, quantreg, randomForest, ranger, rpart,
+rstanarm, tidymodels/tidymodels, tidyverse/tidytemplate, rstudio/reticulate,
xgboost
Config/rcmdcheck/ignore-inconsequential-notes: true
Config/testthat/edition: 3
4 changes: 2 additions & 2 deletions R/install_packages.R
@@ -26,8 +26,8 @@ install_engine_packages <- function(extension = TRUE, extras = TRUE,
}

if (extras) {
-rmd_pkgs <- c("tidymodels", "broom.mixed", "glmnet", "Cubist", "xrf", "ape",
-"rmarkdown")
+rmd_pkgs <- c("ape", "broom.mixed", "Cubist", "glmnet", "quantreg",
+"rmarkdown", "tidymodels", "xrf")
engine_packages <- unique(c(engine_packages, rmd_pkgs))
}

11 changes: 11 additions & 0 deletions R/linear_reg_quantreg.R
@@ -0,0 +1,11 @@
#' Linear quantile regression via the quantreg package
#'
#' [quantreg::rq()] optimizes quantile loss to fit models with numeric outcomes.
#'
#' @includeRmd man/rmd/linear_reg_quantreg.md details
#'
#' @name details_linear_reg_quantreg
#' @keywords internal
NULL

# See inst/README-DOCS.md for a description of how these files are processed
1 change: 1 addition & 0 deletions inst/models.tsv
@@ -55,6 +55,7 @@
"linear_reg" "regression" "lm" NA
"linear_reg" "regression" "lme" "multilevelmod"
"linear_reg" "regression" "lmer" "multilevelmod"
"linear_reg" "quantile regression" "quantreg" NA
"linear_reg" "regression" "spark" NA
"linear_reg" "regression" "stan" NA
"linear_reg" "regression" "stan_glmer" "multilevelmod"
173 changes: 173 additions & 0 deletions man/details_linear_reg_quantreg.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions man/other_predict.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/rmd/linear_reg_lm.Rmd
@@ -27,7 +27,7 @@ linear_reg() %>%

_However_, the documentation in [stats::lm()] assumes that a specific type of case weights is being used: "Non-NULL weights can be used to indicate that different observations have different variances (with the values in weights being inversely proportional to the variances); or equivalently, when the elements of weights are positive integers `w_i`, that each response `y_i` is the mean of `w_i` unit-weight observations (including the case that there are w_i observations equal to `y_i` and the data have been summarized). However, in the latter case, notice that within-group variation is not used. Therefore, the sigma estimate and residual degrees of freedom may be suboptimal; in the case of replication weights, **even wrong**. Hence, standard errors and analysis of variance tables should be treated with care" (emphasis added).

-Depending on your application, the degrees of freedown for the model (and other statistics) might be incorrect.
+Depending on your application, the degrees of freedom for the model (and other statistics) might be incorrect.

## Saving fitted model objects

79 changes: 79 additions & 0 deletions man/rmd/linear_reg_quantreg.Rmd
@@ -0,0 +1,79 @@
```{r, child = "aaa.Rmd", include = FALSE}
```

`r descr_models("linear_reg", "quantreg")`

This model has the same structure as the model fit by `lm()`, but instead of optimizing the sum of squared errors, it optimizes "quantile loss" in order to produce better estimates of the predictive distribution.
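
As a rough illustration of what "quantile loss" means, the sketch below implements the usual pinball loss in base R. It is not part of parsnip or quantreg: the function `pinball_loss()` and the toy vector `y` are hypothetical names chosen for this example. For a quantile level `tau`, a residual `r = y - pred` is penalized by `tau * r` when it is non-negative and by `(tau - 1) * r` otherwise.

```r
# Hypothetical helper (not from parsnip or quantreg): the average pinball
# (quantile) loss for a single quantile level `tau` between 0 and 1.
pinball_loss <- function(y, pred, tau) {
  resid <- y - pred
  # Under-predictions are weighted by tau, over-predictions by (1 - tau).
  mean(ifelse(resid >= 0, tau * resid, (tau - 1) * resid))
}

# Toy data, made up for illustration only.
y <- c(1, 2, 3, 4, 10)

# Evaluate the loss for a grid of constant predictions at tau = 0.5.
candidates <- seq(0, 10, by = 0.5)
losses <- vapply(
  candidates,
  function(q) pinball_loss(y, q, tau = 0.5),
  numeric(1)
)

# The constant prediction with the smallest loss on the grid.
best <- candidates[which.min(losses)]
best
```

With `tau = 0.5` the loss is half the mean absolute error, so the grid search lands on the sample median of `y` rather than its mean, which the outlying value 10 pulls upward. Other values of `tau` target other quantiles in the same way.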

## Tuning Parameters

This engine has no tuning parameters.

## Translation from parsnip to the original package

This engine only works with the `"quantile regression"` mode and requires users to specify which areas of the distribution to predict via the `quantile_levels` argument. For example:

```{r quantreg-reg}
linear_reg() %>%
set_engine("quantreg") %>%
set_mode("quantile regression", quantile_levels = (1:3) / 4) %>%
translate()
```

## Output format

When multiple quantile levels are predicted, there are multiple predicted values for each row of new data. The `predict()` method for this mode produces a column named `.pred_quantile` that has a special class of `"quantile_pred"`, and it contains the predictions for each row.

For example:

```{r example}
library(modeldata)
rlang::check_installed("quantreg")
n <- nrow(Chicago)
Chicago <- Chicago %>% select(ridership, Clark_Lake)
Chicago_train <- Chicago[1:(n - 7), ]
Chicago_test <- Chicago[(n - 6):n, ]
qr_fit <-
linear_reg() %>%
set_engine("quantreg") %>%
set_mode("quantile regression", quantile_levels = (1:3) / 4) %>%
fit(ridership ~ Clark_Lake, data = Chicago_train)
qr_fit
qr_pred <- predict(qr_fit, Chicago_test)
qr_pred
```

We can unnest these values and/or convert them to a rectangular format:

```{r example-format}
as_tibble(qr_pred$.pred_quantile)
as.matrix(qr_pred$.pred_quantile)
```

## Preprocessing requirements

```{r child = "template-makes-dummies.Rmd"}
```

## Case weights

```{r child = "template-uses-case-weights.Rmd"}
```

## Saving fitted model objects

```{r child = "template-butcher.Rmd"}
```

## Examples

The "Fitting and Predicting with parsnip" article contains [examples](https://parsnip.tidymodels.org/articles/articles/Examples.html#linear-reg-quantreg) for `linear_reg()` with the `"quantreg"` engine.

## References

- Waldmann, E. (2018). Quantile regression: a short story on how and why. _Statistical Modelling_, 18(3-4), 203-218.