fixed bug in runCrossValidate for merging of MCMC configurations (#1068)

* temp comment out one of numericTypes tests * temp comment out another numericTypes test * more monkeying with numericTypes list * more monkeying * more monkeying * monkeying with test-numericTypes * more monkeying * more monkeying * more monkeying * more monkeying * more monkeying * more monkeying * more monkeying * more monkeying * more monkeying * more monkeying * try bionic again * fixed bug in runCrossValidate for merging of MCMC configurations Co-authored-by: Christopher Paciorek <[email protected]> Co-authored-by: perrydv <[email protected]>
nimble-dev · Sep 20, 2020 · 493f791 · 493f791
1 parent cf55502
commit 493f791
Show file tree

Hide file tree

Showing 6 changed files with 81 additions and 11 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -3,6 +3,7 @@ r:
     - 4.0
 
 dist: bionic  ## removed Aug 2020 due to travis errors, see: https://travis-ci.community/t/r-install-broken-travis-ci-rstudio-org-returns-403/9640; added back Sep 2020 as seemed fine on further testing
+
 sudo: false
 
 cache:

diff --git a/UserManual/src/chapter_BNP.Rmd b/UserManual/src/chapter_BNP.Rmd
@@ -111,13 +111,11 @@ One of the advantages of BNP mixture models is that the number of clusters is tr
 
 ### Extensions {#sec:extensionscrp}
 
-The BNP functionality has been recently extended to more general models. These extensions enable users, for instance, to use a DP or DPM prior for the distribution of the random effects in a generalized linear mixed effects model with mutiple measurements over time or multiple trials per participant. 
-
-#### Example {#sec:exextensionscrp}
+The BNP functionality in NIMBLE was extended in version 0.10.0 to more general models. These extensions enable users, for instance, to use a DP or DPM prior for the distribution of the random effects in a generalized linear mixed effects model with mutiple measurements over time or multiple trials per participant. 
 
 The following example illustrates how to use NIMBLE in a random effects model with repeated measurements per subject using a DP prior for the distribution of the subject's random effects. The model is given by
 
-$$ y_{i,j} \mid \tilde{\boldsymbol{\theta}}, z_i, \sigma^2 \ {\sim}\ N(\tilde{\theta}_{z_i}, {\sigma}^2,)\hspace{0.5cm} i = 1, \ldots, N,\hspace{0.5cm} j = 1, \ldots, J, $$
+$$ y_{i,j} \mid \tilde{\boldsymbol{\theta}}, z_i, \sigma^2 \ {\sim}\ N(\tilde{\boldsymbol{\theta}}_{z_i}, {\sigma}^2,)\hspace{0.5cm} i = 1, \ldots, N,\hspace{0.5cm} j = 1, \ldots, J, $$
 
 $$ \boldsymbol{z} \sim \mbox{CRP}(\alpha), \hspace{0.5cm} \alpha \sim \mbox{Gamma}(1, 1), $$
 
@@ -150,6 +148,8 @@ inits <- list(thetatilde = rnorm(constants$M, 0, 10),
 modelRandEff <- nimbleModel(code, constants, data, inits)
 ```
 
+Alternatively, each subject could have a vector of parameters being clustered. For example in the model above one could instead specify a vector of means, such as `thetaTilde[z[i], j]`, instead of a single mean. 
+
 As before, the model can be fitted through  MCMC sampling. NIMBLE will assign a specialized sampler to update `z` and `alpha`. See Chapter  \@ref(cha:mcmc) for information about NIMBLE's MCMC engine, and Section \@ref(sec:mcmcdcrp) for details on  MCMC sampling of the CRP.
 
 ## Stick-breaking model {#sec:sb}
@@ -238,7 +238,7 @@ NIMBLE's MCMC engine provides specialized samplers for the `dCRP` distribution,
   1. A non-conjugate sampler in the case of the baseline distribution  not being conjugate for the mixture kernel.
 
 
-Note that both samplers are specialized versions that operate on a vector having a CRP distribution. Details of these assignments are strictly internal to the CRP samplers. The current release of NIMBLE supports conjugate sampling for the `dCRP` distribution for the relationships listed in Table \@ref(tab:BNPconjugacy). Additional relationships are provided in Table \@ref(tab:BNPconjugacy2) for normal mixture kernels when both,  mean and  variance, are unknown.
+Note that both samplers are specialized versions that operate on a vector having a CRP distribution. Details of these assignments are strictly internal to the CRP samplers. The current release of NIMBLE supports conjugate sampling for the `dCRP` distribution for the relationships listed in Table \@ref(tab:BNPconjugacy). Additional relationships are provided in Table \@ref(tab:BNPconjugacy2) for normal mixture kernels when both mean and  variance are unknown.
 
 
 ```{r, child = 'tables/BNPconjugacyTable.md'}

diff --git a/packages/nimble/R/crossValidation.R b/packages/nimble/R/crossValidation.R
@@ -54,9 +54,14 @@ calcCrossVal <- function(i,
     leaveOutNames <- paramNames
     predLoss <- TRUE
   }
-  if(!silent) modelMCMCConf <- configureMCMC(newModel, nodes = leaveOutNames, monitors = leaveOutNames)
-  else modelMCMCConf <- suppressMessages(configureMCMC(newModel, nodes = leaveOutNames, monitors = leaveOutNames))
-  if(!predLoss) modelMCMCConf$samplerConfs <- c(conf$samplerConfs,modelMCMCConf$samplerConfs)
+  if(!silent) modelMCMCConf <- configureMCMC(newModel, nodes = leaveOutNames, monitors = leaveOutNames, print = silent)
+  else modelMCMCConf <- suppressMessages(configureMCMC(newModel, nodes = leaveOutNames, monitors = leaveOutNames, print = silent))
+  if(!predLoss) {
+      for(i in seq_along(modelMCMCConf$samplerConfs)) {
+          sConf <- modelMCMCConf$samplerConfs[[i]]
+          conf$addSampler(target=sConf$target, type=sConf$samplerFunction, control=sConf$control, silent=TRUE)
+      }
+  }
   if(!silent){
     modelMCMC <- buildMCMC(modelMCMCConf)
     C.modelMCMC <- compileNimble(modelMCMC,

diff --git a/packages/nimble/inst/NEWS b/packages/nimble/inst/NEWS
@@ -10,6 +10,7 @@ USER LEVEL CHANGES
 
 BUG FIXES
 
+-- Fixed a bug in k-fold cross-validation routine (`runCrossValidate`), where the merging of MCMC sampler configurations was done incorrectly.
 -- Updated all MCMC sampler functions to use a new syntax for control list element extraction, which prevents a possible caused by R's partial matching of list names.
 
                             CHANGES IN VERSION 0.9.1 (May 2020)

diff --git a/packages/nimble/inst/tests/test-crossVal.R b/packages/nimble/inst/tests/test-crossVal.R
@@ -139,6 +139,65 @@ test_that("Radon model cross-validation runs using MSE loss: ", {
                               nBootReps = 2)
 })
 
+test_that("dyes model cross-validation exact results", {
+    dyesCode <- nimbleCode({
+        for (i in 1:BATCHES) {
+            for (j in 1:SAMPLES) {
+                y[i,j] ~ dnorm(mu[i], tau.within);
+            }
+            mu[i] ~ dnorm(theta, tau.between);
+        }
+        theta ~ dnorm(0.0, 1.0E-10);
+        tau.within ~ dgamma(0.001, 0.001);  sigma2.within <- 1/tau.within;
+        tau.between ~ dgamma(0.001, 0.001);  sigma2.between <- 1/tau.between;
+    })
+    ##
+    dyesData <- list(y = matrix(c(1545, 1540, 1595, 1445, 1595,
+                         1520, 1440, 1555, 1550, 1440,
+                         1630, 1455, 1440, 1490, 1605,
+                         1595, 1515, 1450, 1520, 1560,
+                         1510, 1465, 1635, 1480, 1580,
+                         1495, 1560, 1545, 1625, 1445),
+                         nrow = 6, ncol = 5))
+    ##
+    dyesConsts <- list(BATCHES = 6, SAMPLES = 5)
+    ##
+    dyesInits <- list(theta = 1500, tau.within = 1, tau.between =  1)
+    ##
+    dyesModel <- nimbleModel(dyesCode, dyesConsts, dyesData, dyesInits)
+    ##
+    dyesFoldFunction <- function(i){
+        foldNodes_i <- paste0('y[', i, ', ]')  # will return 'y[1,]' for i = 1 e.g.
+        return(foldNodes_i)
+    }
+    ##
+    RMSElossFunction <- function(simulatedDataValues, actualDataValues){
+        dataLength <- length(simulatedDataValues) # simulatedDataValues is a vector
+        SSE <- 0
+        for(i in 1:dataLength){
+            SSE <- SSE + (simulatedDataValues[i] - actualDataValues[i])^2
+        }
+        MSE <- SSE / dataLength
+        RMSE <- sqrt(MSE)
+        return(RMSE)
+    }
+    ##
+    dyesMCMCconfiguration <- configureMCMC(dyesModel)
+    ##
+    set.seed(0)
+    ##
+    crossValOutput <- runCrossValidate(MCMCconfiguration = dyesMCMCconfiguration, k = 6,
+                                       foldFunction = dyesFoldFunction,
+                                       lossFunction = RMSElossFunction,
+                                       MCMCcontrol = list(niter = 5000, nburnin = 500))
+    ##
+    expect_equal(crossValOutput$CVvalue, 63.872534)
+    expect_equal(crossValOutput$CVstandardError, 0.0023289839)
+    expect_equal(unname(sapply(crossValOutput$foldCVinfo, function(x) x['foldCVvalue'])), c(56.525316, 41.326957, 73.621281, 61.466634, 109.806489, 40.488530))
+    expect_equal(unname(sapply(crossValOutput$foldCVinfo, function(x) x['foldCVstandardError'])), c(0.0052012467, 0.0058034916, 0.0060221136, 0.0059338851, 0.0045952414, 0.0064763733))
+})
+
+
 
 options(warn = RwarnLevel)
 nimbleOptions(verbose = nimbleVerboseSetting)

diff --git a/packages/nimble/inst/tests/test_utils.R b/packages/nimble/inst/tests/test_utils.R
@@ -208,8 +208,12 @@ test_coreRfeature_batch_internal <- function(input_batch, verbose = nimbleOption
             if(is.null(checkEqual)) checkEqual <- FALSE
             if(is.null(input[['return']])) { ## use default 'out' object
                 if(!checkEqual) {
+                    expect_identical(class(out), class(out_nfC), info = paste('iden tmp test of class', class(out), class(out_nfC)))
+                    expect_identical(dim(out), dim(out_nfC), info = 'iden test of dim')
+                    expect_identical(round(out, 10), round(out_nfC, 10), info='iden test of round')
                     expect_identical(out, out_nfR, info = "Identical test of coreRfeature (direct R vs. R nimbleFunction)")
-                    expect_identical(out, out_nfC, info = "Identical test of coreRfeature (direct R vs. C++ nimbleFunction)")
+                    wh <- which.max(abs(out - out_nfC))
+                    expect_identical(out, out_nfC, info = paste("Identical test of coreRfeature (direct R vs. C++ nimbleFunction)", system('gcc --version', intern = T)[1], ' ', sprintf("%0.20f", out[wh]), " ", sprintf("%0.20f", out_nfC[wh])))
                 } else {
                     expect_equal(out, out_nfR, info = "Equal test of coreRfeature (direct R vs. R nimbleFunction)")
                     expect_equal(out, out_nfC, info = "Equal test of coreRfeature (direct R vs. C++ nimbleFunction)")
@@ -316,8 +320,8 @@ test_coreRfeature_internal <- function(input, verbose = nimbleOptions('verbose')
   if(is.null(checkEqual)) checkEqual <- FALSE
   if(is.null(input[['return']])) { ## use default 'out' object
       if(!checkEqual) {
-          expect_identical(out, out_nfR, info = paste0("Identical test of coreRfeature (direct R vs. R nimbleFunction): ", input$name))
-          expect_identical(out, out_nfC, info = paste0("Identical test of coreRfeature (direct R vs. C++ nimbleFunction): ", input$name))
+          expect_identical(out, out_nfR, info = paste0("FOO Identical test of coreRfeature (direct R vs. R nimbleFunction): ", input$name))
+          expect_identical(out, out_nfC, info = paste0("FOO Identical test of coreRfeature (direct R vs. C++ nimbleFunction): ", input$name))
       } else {
           expect_equal(out, out_nfR, info = paste0("Equal test of coreRfeature (direct R vs. R nimbleFunction): ", input$name) )
           expect_equal(out, out_nfC, info = paste0("Equal test of coreRfeature (direct R vs. C++ nimbleFunction): ", input$name))
-Original file line number
+Diff line change
@@ Expand Up / @@ -3,6 +3,7 @@ r: @@
         - 4.0
     dist: bionic  ## removed Aug 2020 due to travis errors, see: https://travis-ci.community/t/r-install-broken-travis-ci-rstudio-org-returns-403/9640; added back Sep 2020 as seemed fine on further testing
     sudo: false
     cache:
@@ Expand Down @@