Skip to content

Commit

Permalink
updating hw in marss cov chapter
Browse files Browse the repository at this point in the history
  • Loading branch information
eeholmes committed Apr 20, 2019
1 parent 9711be6 commit 38ea25d
Show file tree
Hide file tree
Showing 16 changed files with 44 additions and 17 deletions.
50 changes: 34 additions & 16 deletions Lab-fitting-multi-ss-models/multivariate-ss-cov.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,12 @@ dat = t(fulldat[years,c("Greens", "Bluegreens")])
covariates = t(fulldat[years,c("Temp", "TP")])
```

Packages:
```{r msscov-packages}
library(MARSS)
library(ggplot2)
```

## Overview {#sec-msscov-overview}

A multivariate autoregressive state-space (MARSS) model with covariate effects in both the process and observation components is written as:
Expand Down Expand Up @@ -451,7 +457,7 @@ Here are some guidelines to help you answer the questions:
* Assume that the observation errors are independent and identically distributed with known variance of 0.10.
* Assume that the process errors are independent from one another, but the variances differ by taxon.
* Assume that each group is an observation of its own process. This means `Z="identity"`.
* Use `B="unconstrained"`. This implies that each of the taxa are operating under varying degrees of density-dependence, and they are allowed to interact.
* Use `B="diagonal and unequal"`. This implies that each of the taxa are operating under varying degrees of density-dependence, and they are not allowed to interact.
* All the data have been de-meaned and $\mathbf{Z}$ is identity, therefore use `U="zero"` and `A="zero"`. Make sure to check that the means of the data are 0 and the variance is 1.
* Use `tinitx=1`. It makes $\mathbf{B}$ estimation more stable. It goes in your model list.
* Include a plot of residuals versus time and acf of residuals for each question. You only need to show these for the top (best) model if the question involves comparing multiple models.
Expand All @@ -466,52 +472,64 @@ Read Section \@ref(sec-msscov-hw-discussion) for the data and tips on answering
$$\mathbf{x}_t=\mathbf{B}\mathbf{x}_{t-1}+\mathbf{C}\mathbf{c}_t+\mathbf{w}_t$$
The $\mathbf{C}\mathbf{c}_t+\mathbf{w}_t$ are the process errors and represent the growth rates (growth above or below what you would expect given $\mathbf{x}_{t-1}$). Use your raw data in the MARSS model. You do not need to difference the data to get at the growth rates since the process model is modeling that.

1. How does month affect the mean phytoplankton population growth rates? Show a plot of the estimated mean growth rate versus month for each taxon using 3 approaches to estimate the month effect (factor, polynomial, Fourier series). Estimate seasonal effects without any covariate (Temp, TP) effects.
1. How does month affect the mean phytoplankton population growth rates? Show a plot of the estimated mean growth rate versus month for each taxon using three approaches to estimate the month effect (factor, polynomial, Fourier series). Estimate seasonal effects without any covariate (Temp, TP) effects.

2. It is likely that both temperature and total phosphorus (TP) affect phytoplankton population growth rates. Using MARSS models, estimate the effect of Temp and TP on growth rates of each taxon.
Leave out the seasonal covariates from question 1, i.e. only use Temp and TP as covariates. Make a plot of the point estimates of the Temp and TP effects with the 95% CIs added to the plot. `tidy()` is an easy way to get the parameters CIs.

3. Estimate the Temp and TP effects using `B="diagonal and unequal"`.
a. Compare the $\mathbf{B}$ matrix for the fit from question 2 and from question 3. Describe the species interactions modeled by the $\mathbf{B}$ matrix from question 2. How is it different than the $\mathbf{B}$ matrix from question 3? Note, you can retrieve the matrix using `coef(fit, type="matrix")$B`.
b. Do the Temp and TP effects change when you use `B="diagonal and unequal"`? Make sure to look at the CIs also.
3. Estimate the Temp and TP effects using `B="unconstrained"`.
a. Compare the $\mathbf{B}$ matrix for the fit from question 2 and from question 3. Describe the species interactions modeled by the $\mathbf{B}$ matrix when `B="unconstrained"`. How is it different than the $\mathbf{B}$ matrix from question 2? Note, you can retrieve the matrix using `coef(fit, type="matrix")$B`.
b. Do the Temp and TP effects change when you use `B="unconstrained"`? Make sure to look at the CIs also.

4. Using MARSS models, evaluate which (Temp or TP) is the more important driver or if both are important. Again, leave out the seasonal covariates from question 1, i.e. only use Temp and TP as covariates. Compare two approaches: comparison of effect sizes in a model with both Temp and TP and model selection using a set of models.

5. Evaluate whether the effect of temperature on the taxa manifests itself via their underlying physiology (by affecting growth rates and thus abundance) or because physical changes in the water stratification makes them easier/harder to sample in some months. Leave out the seasonal covariates from question 1, i.e. only use Temp and TP as the covariates. Assume the TP always affects growth rates, never the observation errors.
5. Evaluate whether the effect of temperature (Temp) on the taxa manifests itself via their underlying physiology (by affecting growth rates and thus abundance) or because physical changes in the water stratification makes them easier/harder to sample in some months. Leave out the seasonal covariates from question 1, i.e. only use Temp and TP as the covariates. For TP, assume it always affects growth rates, never the observation errors.

6. Is there support for temperature or TP affecting all functional groups' growth rates the same, or are the effects on one taxon different from another? Make sure to test both Temp and TP effects the same, or one the same while the others are unique.
6. Is there support for temperature or TP affecting all functional groups' growth rates the same, or are the effects on one taxon different from another? Make sure to test all possibilities: the Temp and TP effects are the same for all taxa, and one covariate effect is the same across taxa while the other's effects are unique across taxa.

7. Compare your results for question 2 using linear regression, by using the `lm()` function. You'll need to look at the response of each taxon separately, i.e. one response variable. You can have a multivariate response variable with `lm()` but the functions will be doing 6 independent linear regressions. In your `lm()` model, use only Temp and TP (and intercept) as covariates. Compare the estimated effects to those from question 2. How are they different? How is this model different from the model you fit in question 2?

6. Using Greens only, fit a linear regression with autocorrelated residuals using the `Arima()` function with the `xreg` argument. Use Temp and TP as your covariates. Compare to the results from questions 5 and question 2. How is this model different from the models you fit in questions 2 and 5?
<!--
8. Using Greens only, fit a linear regression with autocorrelated residuals using the `Arima()` function with the `xreg` argument. Use Temp and TP as your covariates. Compare your results to the results from questions 7. How is this model different from the models you fit in questions 7?
-->

7. Temp and TP are negatively correlated (cor = `r format(cor(covars[1,],covars[2,]), digits=2)`). A common threshold for collinearity in regression models is 0.7. Temp and TP fall below that but are close. One approach to collinearity is sequential regression [@Dormannetal2013]. The first (most influential) covariate is included 'as is' and the second covariate appears as the residuals of a regression of the second against the first. The covariates are now orthogonal however the second covariate is conditioned on the first. If we see an effect of the residuals covariate, it is the effect of TP additional to the contribution it already made through its relationship with temperature. Rerun question 2 using the sequential regression covariates (below). Do your conclusions about the effects of Temperature and TP change?
8. Temp and TP are negatively correlated (cor = `r format(cor(covars[1,],covars[2,]), digits=2)`). A common threshold for collinearity in regression models is 0.7. Temp and TP fall below that but are close. One approach to collinearity is sequential regression [@Dormannetal2013]. The first (most influential) covariate is included 'as is' and the second covariate appears as the residuals of a regression of the second against the first. The covariates are now orthogonal however the second covariate is conditioned on the first. If we see an effect of the residuals covariate, it is the effect of TP additional to the contribution it already made through its relationship with temperature. Rerun question 2 using sequential regression (see code below).

Make your Temp and TP covariates orthogonal using sequential regression. Do your conclusions about the effects of Temperature and TP change?

Below is code to construct your orthogonal covariates for sequential regression.
```{r msscov-problems-orthocov}
fit <- lm(covars[1,]~covars[2,])
seqcovs=rbind(covars[1,], residuals(fit))
seqcovs <- rbind(covars[1,], residuals(fit))
avg <- apply(seqcovs, 1, mean)
sd <- sqrt(apply(seqcovs, 1, var, na.rm=TRUE))
seqcovs <- (seqcovs-avg)/sd
rownames(seqcovs) <- c("Temp","TPresids")
```

8. Another approach to looking at effects of covariates is to examine if the seasonal anomalies of the independent variable can be explained by the seasonal anomalies of the dependent variables. In other words, can an unusually high February abundance (higher than expected) be explained by an unusually high or low February temperature? The `stl` function can be used to decompose a time series using LOESS. We'll use `stl` since it can handle missing values.
a. Decompose the Diatom time series using `stl` and plot. Modify the following code to decompose.
9. Compare the AICc's of the 3 seasonal models from question 1 and the 4 Temp/TP models from question 5. What does this tell you about the Temp and TP only models?

10. We cannot just fit a model with season and Temp plus TP since Temp and TP are highly seasonal. That will cause problems if we have something that explain season (a polynomial) and a covariate that has seasonality. Instead, use sequential regression to fit a model with seasonality, Temp and TP. Use a 3rd order polynomial with the `poly()` function to create orthogonal season covariates and then use sequential regression (code in problem 8) to create Temp and TP covariates that are orthogonal to your season covariates. Fit the model and compare a model with only season to a model with season and Temp plus TP.

11. Another approach to looking at effects of covariates which have season cycles is to examine if the seasonal anomalies of the independent variable can be explained by the seasonal anomalies of the dependent variables. In other words, can an unusually high February abundance (higher than expected) be explained by an unusually high or low February temperature? In this approach, you remove season so you do not need to model it (with factor, polynomial, etc). The `stl()` function can be used to decompose a time series using LOESS. We'll use `stl()` since it can handle missing values.

a. Decompose the Diatom time series using `stl()` and plot. Use `na.action=zoo::na.approx` to deal with the NAs. Use `s.window="periodic"`. Other than that you can use the defaults.
```{r, eval=FALSE}
dati <- ts(dat["nameoftaxon",], frequency=12)
i <- "Diatoms"
dati <- ts(dat[i,], frequency=12)
a <- stl(dati, "periodic", na.action=zoo::na.approx)
```

b. Create a new anomaly independent variables and dependent variables from `dat` and `covars` by modifying the following code. For the anomaly, you will use the remainder plus the trend.
b. Create dependent variables and covariates that are anomolies by modifying the following code. For the anomaly, you will use the remainder plus the trend. You will need to adapt this code to create the anomalies for Temp and TP and for `dat` (your data).
```{r}
i <- 1
i <- "Diatoms"
a <- stl(ts(dat[i,],frequency=12), "periodic", na.action=zoo::na.approx)
anom <- a$time.series[,"remainder"]+a$time.series[,"trend"]
```

c. Notice that you have simply removed the seasonal cycle from the data. Using the seasonal anomalies (from part b), estimate the effect of Temp and TP on each taxon's growth rate. You will use the same model as in question 2, but use the seasonal anomalies as data and covariates.
c. Notice that you have simply removed the seasonal cycle from the data. Using the seasonal anomalies (from part b), estimate the effect of Temp and TP on each taxon's growth rate. You will use the same model as in question 2, but use the seasonal anomalies as data and covariates.



Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
base
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
10 changes: 9 additions & 1 deletion marss-cov-lab-apr-5-update.R
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# April 18 update

[done] Need to change cov to use Temp and TP at t-1. Does not make sense
to have temp at time t affect x(t) if we are modeling growth.

# April 5th update

The Temp and TP models are much worse than any seasonal model.
Expand All @@ -9,6 +14,7 @@ Note, need to switch Key to use poly() not i,i^2 etc.

ord=3
c.m.poly=t(poly(rep(1:12,15),ord))
rownames(c.m.poly) <- paste0("Mon", 1:ord)
C = "unconstrained"; c=c.m.poly
model.list = c(common, list(C=C,c=c,D=D,d=d))
test2 = MARSS(dat, model=model.list, control=ctl, silent=TRUE)
Expand All @@ -21,12 +27,14 @@ test2$AIC
# test2 = MARSS(dat, model=model.list, control=ctl, silent=TRUE)
# test2$AICc

Instead use anomalies. Temp & TP part not explained by the polynomials. This has lower AICc.
Instead use anomalies. Temp & TP part not explained by the polynomials.
This has lower AICc.

fit <- lm(covars[1,]~poly(rep(1:12,15),ord))
anoms = residuals(fit)
fit <- lm(covars[2,]~poly(rep(1:12,15),ord))
anoms = rbind(anoms,residuals(fit))
rownames(anoms) <- paste0(rownames(covars)[1:2], "_anom")
C = "unconstrained"; c=rbind(c.m.poly,anoms)
model.list = c(common, list(C=C,c=c,D=D,d=d))
test2 = MARSS(dat, model=model.list, control=ctl, silent=TRUE)
Expand Down

0 comments on commit 38ea25d

Please sign in to comment.