tidying up refs and organizing things

ropensci · Oct 8, 2023 · e883301
1 parent de2b9f0

Showing 19 changed files with 256 additions and 249 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -25,7 +25,7 @@ Authors@R: c(
            family = "Burk",
            role = "rev")
     )
-Description: Fit, interpret, and make predictions with oblique random survival forests. Oblique decision trees are notoriously slow compared to their axis based counterparts, but 'aorsf' runs as fast or faster than axis-based decision tree algorithms for right-censored time-to-event outcomes. Methods to accelerate and interpret the oblique random survival forest are described in Jaeger et al., (2022) <arXiv:2208.01129>.
+Description: Fit, interpret, and make predictions with oblique random survival forests. Oblique decision trees are notoriously slow compared to their axis based counterparts, but 'aorsf' runs as fast or faster than axis-based decision tree algorithms for right-censored time-to-event outcomes. Methods to accelerate and interpret the oblique random survival forest are described in Jaeger et al., (2023) <DOI: 10.1080/10618600.2023.2231048>.
 License: MIT + file LICENSE
 Encoding: UTF-8
 LazyData: true

diff --git a/R/orsf.R b/R/orsf.R
@@ -325,7 +325,7 @@
 #'
 #' `r roxy_cite_jaeger_2019()`
 #'
-#' `r roxy_cite_jaeger_2022()`
+#' `r roxy_cite_jaeger_2023()`
 #'
 #' @export
 #'

diff --git a/R/orsf_vi.R b/R/orsf_vi.R
@@ -76,7 +76,7 @@
 #'
 #' `r roxy_cite_menze_2011()`
 #'
-#' `r roxy_cite_jaeger_2022()`
+#' `r roxy_cite_jaeger_2023()`
 #'
 #'
 orsf_vi <- function(object,

diff --git a/R/roxy.R b/R/roxy.R
@@ -191,15 +191,16 @@ roxy_cite_jaeger_2019 <- function(){
 
 }
 
-roxy_cite_jaeger_2022 <- function(){
+roxy_cite_jaeger_2023 <- function(){
 
  roxy_cite(
   authors = "Jaeger BC, Welden S, Lenoir K, Speiser JL, Segar MW, Pandey A, Pajewski NM",
   title = "Accelerated and interpretable oblique random survival forests",
-  journal = "arXiv e-prints",
-  date = "2022 Aug",
-  number = 'arXiv-2208',
-  url = "https://arxiv.org/abs/2208.01129"
+  journal = "Journal of Computational and Graphical Statistics",
+  date = "Published online 08 Aug 2023",
+  number = NULL,
+  # doi = "10.1080/10618600.2023.2231048",
+  url = "https://doi.org/10.1080/10618600.2023.2231048"
  )
 
 }
@@ -270,7 +271,7 @@ roxy_dots <- function(){
 roxy_vi_describe <- function(type){
 
  switch(type,
-        'negate' = "Each variable is assessed separately by multiplying the variable's coefficients by -1 and then determining how much the model's performance changes. The worse the model's performance after negating coefficients for a given variable, the more important the variable. This technique is promising b/c it does not require permutation and it emphasizes variables with larger coefficients in linear combinations, but it is also relatively new and hasn't been studied as much as permutation importance. See [Jaeger, 2022](https://arxiv.org/abs/2208.01129) for more details on this technique.",
+        'negate' = "Each variable is assessed separately by multiplying the variable's coefficients by -1 and then determining how much the model's performance changes. The worse the model's performance after negating coefficients for a given variable, the more important the variable. This technique is promising b/c it does not require permutation and it emphasizes variables with larger coefficients in linear combinations, but it is also relatively new and hasn't been studied as much as permutation importance. See [Jaeger, 2023](https://doi.org/10.1080/10618600.2023.2231048) for more details on this technique.",
         'permute' = "Each variable is assessed separately by randomly permuting the variable's values and then determining how much the model's performance changes. The worse the model's performance after permuting the values of a given variable, the more important the variable. This technique is flexible, intuitive, and frequently used. It also has several [known limitations](https://christophm.github.io/interpretable-ml-book/feature-importance.html#disadvantages-9)",
         'anova' = "A p-value is computed for each coefficient in each linear combination of variables in each decision tree. Importance for an individual predictor variable is the proportion of times a p-value for its coefficient is < 0.01. This technique is very efficient computationally, but may not be as effective as permutation or negation in terms of selecting signal over noise variables. See [Menze, 2011](https://link.springer.com/chapter/10.1007/978-3-642-23783-6_29) for more details on this technique.")
 

diff --git a/README.Rmd b/README.Rmd
@@ -78,7 +78,7 @@ knitr::include_graphics('man/figures/tree_axis_v_oblique.png')
 
 ## Examples
 
-The `orsf()` function can fit several types of ORSF ensembles. My personal favorite is the accelerated ORSF because it has a great combination of prediction accuracy and computational efficiency (see [arXiv paper](https://arxiv.org/abs/2208.01129)).^2^
+The `orsf()` function can fit several types of ORSF ensembles. My personal favorite is the accelerated ORSF because it has a great combination of prediction accuracy and computational efficiency (see [JCGS paper](https://doi.org/10.1080/10618600.2023.2231048)).^2^
 
 ```{r, child='Rmd/orsf-fit-accelerated.Rmd'}
 
@@ -152,7 +152,7 @@ For more on ICE, see the [vignette](https://docs.ropensci.org/aorsf/articles/pd.
 
 ## Comparison to existing software
 
-Comparisons between `aorsf` and existing software are presented in our [arXiv paper](https://arxiv.org/abs/2208.01129). The paper
+Comparisons between `aorsf` and existing software are presented in our [JCGS paper](https://doi.org/10.1080/10618600.2023.2231048). The paper:
 
 - describes `aorsf` in detail with a summary of the procedures used in the tree fitting algorithm 
 
@@ -173,7 +173,7 @@ A more hands-on comparison of `aorsf` and other R packages is provided in [orsf
 
 cat("1. ", aorsf:::roxy_cite_jaeger_2019(), '\n\n')
 
-cat("2. ", aorsf:::roxy_cite_jaeger_2022(), '\n\n')
+cat("2. ", aorsf:::roxy_cite_jaeger_2023(), '\n\n')
 
 cat("3. ", aorsf:::roxy_cite_menze_2011())
 

diff --git a/README.md b/README.md
@@ -78,7 +78,8 @@ separating the two classes.
 The `orsf()` function can fit several types of ORSF ensembles. My
 personal favorite is the accelerated ORSF because it has a great
 combination of prediction accuracy and computational efficiency (see
-[arXiv paper](https://arxiv.org/abs/2208.01129)).<sup>2</sup>
+[JCGS
+paper](https://doi.org/10.1080/10618600.2023.2231048)).<sup>2</sup>
 
 ``` r
 
@@ -144,20 +145,20 @@ using `aorsf`:
   require permutation and it emphasizes variables with larger
   coefficients in linear combinations, but it is also relatively new and
   hasn’t been studied as much as permutation importance. See [Jaeger,
-  2022](https://arxiv.org/abs/2208.01129) for more details on this
-  technique.
+  2023](https://doi.org/10.1080/10618600.2023.2231048) for more details
+  on this technique.
 
   ``` r
 
   orsf_vi_negate(fit)
   #>          bili           sex        copper           ast           age 
-  #>  0.1190578208  0.0619364315  0.0290605798  0.0260108174  0.0251162396 
+  #>  0.1190290560  0.0619448918  0.0290622719  0.0260108174  0.0251263919 
   #>         stage       protime         edema       ascites        hepato 
-  #>  0.0237810058  0.0158443269  0.0117270641  0.0105685230  0.0092028195 
+  #>  0.0237725455  0.0158527871  0.0117258458  0.0105685230  0.0092045115 
   #>       albumin          chol           trt      alk.phos       spiders 
-  #>  0.0082647861  0.0041510636  0.0036548364  0.0010239241 -0.0003298163 
+  #>  0.0082732463  0.0041510636  0.0036632967  0.0010256161 -0.0003298163 
   #>          trig      platelet 
-  #> -0.0011111508 -0.0045314656
+  #> -0.0011060747 -0.0045517701
   ```
 
 - **permutation**: Each variable is assessed separately by randomly
@@ -172,13 +173,13 @@ using `aorsf`:
 
   orsf_vi_permute(fit)
   #>          bili        copper           ast           age           sex 
-  #>  0.0514084384  0.0170611427  0.0142227933  0.0140274813  0.0131527430 
+  #>  0.0514033622  0.0170611427  0.0142515581  0.0140224052  0.0131459748 
   #>         stage       protime       ascites         edema       albumin 
-  #>  0.0119752045  0.0102865556  0.0098067817  0.0081730899  0.0080568255 
+  #>  0.0119768965  0.0102950158  0.0098067817  0.0081730899  0.0080652857 
   #>        hepato          chol      alk.phos          trig       spiders 
-  #>  0.0069734562  0.0032811220  0.0015862128  0.0014909643  0.0007811902 
+  #>  0.0069734562  0.0032811220  0.0015862128  0.0014943484  0.0007825752 
   #>           trt      platelet 
-  #> -0.0007067631 -0.0022135241
+  #> -0.0007067631 -0.0022338286
   ```
 
 - **analysis of variance (ANOVA)**<sup>3</sup>: A p-value is computed
@@ -223,18 +224,18 @@ orsf_summarize_uni(fit, n_variables = 2)
 #> 
 #> -- bili (VI Rank: 1) ----------------------------
 #> 
-#>        |----------------- risk -----------------|
+#>        |----------------- Risk -----------------|
 #>  Value      Mean     Median     25th %    75th %
-#>   0.70 0.2074286 0.09039332 0.03827337 0.3146957
-#>    1.3 0.2261739 0.10784929 0.04915971 0.3425934
-#>    3.2 0.3071951 0.21242141 0.11889617 0.4358309
+#>   0.70 0.2094827 0.09046313 0.03827429 0.3184979
+#>    1.3 0.2283358 0.11078307 0.05347112 0.3492104
+#>    3.2 0.3090977 0.21368937 0.11889617 0.4412656
 #> 
 #> -- sex (VI Rank: 2) -----------------------------
 #> 
-#>        |----------------- risk -----------------|
+#>        |----------------- Risk -----------------|
 #>  Value      Mean    Median     25th %    75th %
-#>      m 0.3648659 0.2572239 0.15554270 0.5735661
-#>      f 0.2479179 0.1021787 0.04161796 0.3591612
+#>      m 0.3667488 0.2614335 0.15611841 0.5836574
+#>      f 0.2507675 0.1051310 0.04355687 0.3596206
 #> 
 #>  Predicted risk at time t = 1826.25 for top 2 predictors
 ```
@@ -255,7 +256,7 @@ For more on ICE, see the
 ## Comparison to existing software
 
 Comparisons between `aorsf` and existing software are presented in our
-[arXiv paper](https://arxiv.org/abs/2208.01129). The paper
+[JCGS paper](https://doi.org/10.1080/10618600.2023.2231048). The paper:
 
 - describes `aorsf` in detail with a summary of the procedures used in
   the tree fitting algorithm
@@ -286,8 +287,9 @@ examples](https://docs.ropensci.org/aorsf/reference/orsf.html#tidymodels)
 
 2.  Jaeger BC, Welden S, Lenoir K, Speiser JL, Segar MW, Pandey A,
     Pajewski NM. Accelerated and interpretable oblique random survival
-    forests. *arXiv e-prints* 2022 Aug; arXiv-2208. URL:
-    <https://arxiv.org/abs/2208.01129>
+    forests. *Journal of Computational and Graphical Statistics*
+    Published online 08 Aug 2023. URL:
+    <https://doi.org/10.1080/10618600.2023.2231048>
 
 3.  Menze BH, Kelm BM, Splitthoff DN, Koethe U, Hamprecht FA. On oblique
     random forests. *Joint European Conference on Machine Learning and

diff --git a/cran-comments.md b/cran-comments.md
@@ -1,11 +1,15 @@
+## Version 0.1.0
 
 ## R CMD check results
 
-Duration: 4m 3.8s
+Duration: 3m 53.1s
 
-0 errors v | 0 warnings v | 0 notes v
+❯ checking C++ specification ... NOTE
+    Specified C++14: please drop specification unless essential
 
-R CMD check succeeded
+0 errors ✔ | 0 warnings ✔ | 1 note ✖
+
+I have specified C++14 for this release. C++14 is essential, as this release uses `std::make_unique`.
 
 ## Downstream dependencies
 

diff --git a/man/aorsf-package.Rd b/man/aorsf-package.Rd