-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path06-unmarked.Rmd
326 lines (241 loc) · 15.7 KB
/
06-unmarked.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
# Occupancy analysis, ML method {#unmarked}
Now that we have understood how the two processes work and interact; the ecological and the observational to produce the occupation data. After generating several data sets, now we only have to analyze them. The most direct and intuitive way is to use the occu function from the unmarked [@Fiske2011] package. Later we can use a Bayesian type model in the BUGS language to analyze the same data and in the end compare which of the two estimators, Maximum Likelihood or Bayesian, is closer to the true parameters.
### Clearing R memory
Before we continue, and since we have already generated a large amount of data and models in the previous steps, we are going to clear the memory of R before we begin. We do this with the command:
```{r clear_WS, eval=FALSE, warning=FALSE}
rm(list = ls())
```
Once we have done this we must re-run the code of the function that generates the data that we have created in Chapter \@ref(function1).
After having loaded the function again we must return to
## Generating the data
This time we will use a TEAM-type design (https://www.wildlifeinsights.org/team-network) with 60 sampling sites and 30 repeated visits, which is equivalent to the 30 days that the cameras remain active in the field. Again our species is the white-tailed tapir. For this example we will assume that detection is 0.6, occupancy is 0.8, and the interactions are much simpler with elevation as the only covariate explaining occupancy. However, for detection there is a more complex relationship, assuming that there is a slight interaction between the observation covariates. For observation, elevation and temperature interact with each other. Also, note how elevation influences in opposite directions with a positive sign at elevation for detection and negative for occupancy.
```{r fun_ch2, warning=FALSE, echo=FALSE, message=FALSE}
###############################
## The function starts here ###
###############################
set.seed(24) # Can choose seed of your choice
# Function definition with set of default values
data.fn <- function(M = 60, J = 30, mean.occupancy = 0.8,
beta1 = -1.5, beta2 = 0, beta3 = 0, mean.detection = 0.6,
alpha1 = 2, alpha2 = 1, alpha3 = 1.5, show.plot = FALSE){
# Function to simulate occupancy measurements replicated at M sites during J occasions.
# Population closure is assumed for each site.
# Expected occurrence may be affected by elevation (elev),
# forest cover (forest) and their interaction.
# Expected detection probability may be affected by elevation,
# temperature (temp) and their interaction.
# Function arguments:
# M: Number of spatial replicates (sites)
# J: Number of temporal replicates (occasions)
# mean.occupancy: Mean occurrence at value 0 of occurrence covariates
# beta1: Main effect of elevation on occurrence
# beta2: Main effect of forest cover on occurrence
# beta3: Interaction effect on occurrence of elevation and forest cover
# mean.detection: Mean detection prob. at value 0 of detection covariates
# alpha1: Main effect of elevation on detection probability
# alpha2: Main effect of temperature on detection probability
# alpha3: Interaction effect on detection of elevation and temperature
# show.plot: if TRUE, plots of the data will be displayed;
# set to FALSE if you are running simulations.
# Create covariates
elev <- runif(n = M, -1, 1) # Scaled elevation
forest <- runif(n = M, -1, 1) # Scaled forest cover
temp <- array(runif(n = M*J, -1, 1), dim = c(M, J)) # Scaled temperature
# Model for occurrence
beta0 <- qlogis(mean.occupancy) # Mean occurrence on link scale
psi <- plogis(beta0 + beta1*elev + beta2*forest + beta3*elev*forest)
z <- rbinom(n = M, size = 1, prob = psi) # Realised occurrence
# Plots
if(show.plot){
par(mfrow = c(2, 2), cex.main = 1)
devAskNewPage(ask = TRUE)
curve(plogis(beta0 + beta1*x), -1, 1, col = "red", frame.plot = FALSE,
ylim = c(0, 1), xlab = "Elevation", ylab = "psi", lwd = 2)
plot(elev, psi, frame.plot = FALSE, ylim = c(0, 1), xlab = "Elevation",
ylab = "")
curve(plogis(beta0 + beta2*x), -1, 1, col = "red", frame.plot = FALSE,
ylim = c(0, 1), xlab = "Forest cover", ylab = "psi", lwd = 2)
plot(forest, psi, frame.plot = FALSE, ylim = c(0, 1), xlab = "Forest cover",
ylab = "")
}
# Model for observations
y <- p <- matrix(NA, nrow = M, ncol = J)# Prepare matrix for y and p
alpha0 <- qlogis(mean.detection) # mean detection on link scale
for (j in 1:J){ # Generate counts by survey
p[,j] <- plogis(alpha0 + alpha1*elev + alpha2*temp[,j] + alpha3*elev*temp[,j])
y[,j] <- rbinom(n = M, size = 1, prob = z * p[,j])
}
# True and observed measures of 'distribution'
sumZ <- sum(z) # Total occurrence (all sites)
sumZ.obs <- sum(apply(y,1,max)) # Observed number of occ sites
psi.fs.true <- sum(z) / M # True proportion of occ. sites in sample
psi.fs.obs <- mean(apply(y,1,max)) # Observed proportion of occ. sites in sample
# More plots
if(show.plot){
par(mfrow = c(2, 2))
curve(plogis(alpha0 + alpha1*x), -1, 1, col = "red",
main = "Relationship p-elevation \nat average temperature",
xlab = "Scaled elevation", frame.plot = F)
matplot(elev, p, xlab = "Scaled elevation",
main = "Relationship p-elevation\n at observed temperature",
pch = "*", frame.plot = F)
curve(plogis(alpha0 + alpha2*x), -1, 1, col = "red",
main = "Relationship p-temperature \n at average elevation",
xlab = "Scaled temperature", frame.plot = F)
matplot(temp, p, xlab = "Scaled temperature",
main = "Relationship p-temperature \nat observed elevation",
pch = "*", frame.plot = F)
}
# Output
return(list(M = M, J = J, mean.occupancy = mean.occupancy,
beta0 = beta0, beta1 = beta1, beta2 = beta2, beta3 = beta3,
mean.detection = mean.detection,
alpha0 = alpha0, alpha1 = alpha1, alpha2 = alpha2, alpha3 = alpha3,
elev = elev, forest = forest, temp = temp,
psi = psi, z = z, p = p, y = y, sumZ = sumZ, sumZ.obs = sumZ.obs,
psi.fs.true = psi.fs.true, psi.fs.obs = psi.fs.obs))
}
###############################
## The function ends here ###
###############################
datos2<-data.fn(M = 60, J = 30, show.plot = FALSE,
mean.occupancy = 0.8, beta1 = -1.5, beta2 = 0, beta3 = 0,
mean.detection = 0.6, alpha1 = 2, alpha2 = 1, alpha3 = 1.5
)
attach(datos2)
```
```{r unmark_data, cache=TRUE, message=FALSE, eval=FALSE, echo=TRUE}
# Data generation
# Lets build a model were elevation explain occupancy and p has interactions
datos2<-data.fn(M = 60, J = 30, show.plot = FALSE,
mean.occupancy = 0.8, beta1 = -1.5, beta2 = 0, beta3 = 0,
mean.detection = 0.6, alpha1 = 2, alpha2 = 1, alpha3 = 1.5
)
# Function to simulate occupancy measurements replicated at M sites during J occasions.
# Population closure is assumed for each site.
# Expected occurrence may be affected by elevation (elev),
# forest cover (forest) and their interaction.
# Expected detection probability may be affected by elevation,
# temperature (temp) and their interaction.
# Function arguments:
# M: Number of spatial replicates (sites)
# J: Number of temporal replicates (occasions)
# mean.occupancy: Mean occurrence at value 0 of occurrence covariates
# beta1: Main effect of elevation on occurrence
# beta2: Main effect of forest cover on occurrence
# beta3: Interaction effect on occurrence of elevation and forest cover
# mean.detection: Mean detection prob. at value 0 of detection covariates
# alpha1: Main effect of elevation on detection probability
# alpha2: Main effect of temperature on detection probability
# alpha3: Interaction effect on detection of elevation and temperature
# show.plot: if TRUE, plots of the data will be displayed;
# set to FALSE if you are running simulations.
#To make the objects inside the list directly accessible to R, without having to address
#them as data$C for instance, you can attach datos2 to the search path.
attach(datos2) # Make objects inside of 'datos2' accessible directly
#Remember to detach the data after use, and in particular before attaching a new data
#object, because more than one data set attached in the search path will cause confusion.
# detach(datos2) # Make clean up
```
## Putting the data in unmarked
[Unmarked](http://cran.r-project.org/web/packages/unmarked/index.html) is the R package we use to analyze the [@Fiske2011] occupancy data. To achieve this we must first prepare the data and collect it in an object of type unmarkedFrame. In this case we use the unmarkedFrameOccu function, which is specific for occupancy analysis of a single season or season. More about unmarked at: https://sites.google.com/site/unmarkedinfo/home
```{r unmark_umf1, cache=TRUE, warning=FALSE, message=FALSE, echo=FALSE}
datos2<-data.fn(M = 60, J = 30, show.plot = FALSE,
mean.occupancy = 0.8, beta1 = -1.5, beta2 = 0, beta3 = 0,
mean.detection = 0.6, alpha1 = 2, alpha2 = 1, alpha3 = 1.5
)
attach(datos2)
```
```{r unmark_umf2, dependson=-1, cache=TRUE, warning=FALSE, message=FALSE}
library(unmarked)
siteCovs <- as.data.frame(cbind(forest,elev)) # occupancy covariates
obselev<-matrix(rep(elev,J),ncol = J) # make elevation per observation
obsCovs <- list(temp= temp,elev=obselev) # detection covariates
# umf is the object joining observations (y), occupancy covariates (siteCovs)
# and detection covariates (obsCovs). Note that obsCovs should be a list of
# matrices or dataframes.
umf <- unmarkedFrameOccu(y = y, siteCovs = siteCovs, obsCovs = obsCovs)
```
The unmarked package allows us to graphically see how the data is arranged at the sampling sites with the plot function.
```{r umfplot, cache=TRUE, fig.cap="Inspección grafica del objeto umf.", fig.scap="fig"}
plot (umf)
```
## Fitting the models
The next step is to fit the models that were required by varying the covariates. This is accomplished with the [occu() function](http://www.rdocumentation.org/packages/unmarked/versions/0.11-0/topics/occu).
Keep in mind that in the model building process your models must have a biological meaning.
```{r occu1, cache=TRUE, dependson=c(-1,-2), cache=TRUE, warning=FALSE, message=FALSE}
# detection first, occupancy next
fm0 <- occu(~1 ~1, umf) # Null model
fm1 <- occu(~ elev ~ 1, umf) # elev explaining detection
fm2 <- occu(~ elev ~ elev, umf) # elev explaining detection and occupancy
fm3 <- occu(~ temp ~ elev, umf)
fm4 <- occu(~ temp ~ forest, umf)
fm5 <- occu(~ elev + temp ~ 1, umf)
fm6 <- occu(~ elev + temp + elev:temp ~ 1, umf)
fm7 <- occu(~ elev + temp + elev:temp ~ elev, umf)
fm8 <- occu(~ elev + temp + elev:temp ~ forest, umf)
```
## Model Selection
Unmarked allows you to perform a model selection procedure to select models based on the Akaike information Criterium (AIC) of each one. So the lowest AIC is the most parsimonious model according to our data [@Burnham2004].
```{r modelsel, cache=TRUE}
models <- fitList( # here e put names to the models
'p(.)psi(.)' = fm0,
'p(elev)psi(.)' = fm1,
'p(elev)psi(elev)' = fm2,
'p(temp)psi(elev)' = fm3,
'p(temp)psi(forest)' = fm4,
'p(temp+elev)psi(.)' = fm5,
'p(temp+elev+elev:temp)psi(.)' = fm6,
'p(temp+elev+elev:temp)psi(elev)' = fm7,
'p(temp+elev+elev:temp)psi(forest)' = fm8)
modSel(models) # model selection procedure
```
## Prediction in graphs
The model with the lowest AIC can be used to predict expected results according to a new data set. For example, one might ask the expected abundance of deer at a higher elevation site. Predictions are also another way of presenting the results of an analysis. Here we will illustrate what the prediction of $\psi$ and _p_ looks like over the range of covariates studied. Note that we are using standardized covariates. If we were using covariates at their real scale, we would have to take into account that they have to be transformed using the mean and standard deviation.
Before using the model to predict it is a good idea to check the model parameters and their errors, then check graphically that the model fits well with the parboot function, which does a resampling of the model. This plot is interpreted as the model having a good fit, when the mean (dotted line) is between the intervals of the histogram. If the line is too far from the histogram the model might not be good at predicting.
```{r fit1, cache=TRUE, fig.cap="Graphical evaluation of the model fit fm7.", fig.scap="fig", warning=FALSE, message=FALSE, fig.show='hold', fig.height=4}
summary(fm7) # see the model parameters
pb <- parboot(fm7, nsim=250, report=10) # goodness of fit
plot (pb) # plot goodness of fit
```
Now that we know that our best model has a good fit, we can use it to predict occupancy in the elevation range to see how it behaves on a graph.
```{r gpsi, fig.scap="fig.", cache=TRUE, warning=FALSE, message=FALSE, fig.show='asis', fig.cap="Plot of occupancy versus elevation."}
elevrange<-data.frame(elev=seq(min(datos2$elev),max(datos2$elev),length=100)) # newdata
pred_psi <-predict(fm7,type="state",newdata=elevrange,appendData=TRUE)
plot(Predicted~elev, pred_psi,type="l",col="blue",
xlab="elev",
ylab="psi")
lines(lower~elev, pred_psi,type="l",col=gray(0.5))
lines(upper~elev, pred_psi,type="l",col=gray(0.5))
```
## Spatially explicit prediction
We can also use the best model to predict spatially explicitly if we have the maps. As an illustration we will construct simulated maps for each of our covariates. The maps arise from a random pattern of points with a Poisson distribution. We then convert these points to an interpolated surface.
```{r spat, cache=TRUE, warning=FALSE, message=FALSE, fig.show='asis', fig.cap="Simulated map of elevation, forest and temperature.", fig.scap="fig"}
# lets make random maps for the three covariates
library(raster)
library(spatstat)
set.seed(24) # Remove for random simulations
# CONSTRUCT ANALYSIS WINDOW USING THE FOLLOWING:
xrange=c(-2.5, 1002.5)
yrange=c(-2.5, 502.5)
window<-owin(xrange, yrange)
# Build maps from random points and interpole in same line
elev <- density(rpoispp(lambda=0.6, win=window)) #
forest <- density(rpoispp(lambda=0.2, win=window)) #
temp <- density(rpoispp(lambda=0.5, win=window)) #
# Convert covs to raster and Put in the same stack
mapdata.m<-stack(raster(elev),raster(forest), raster(temp))
names(mapdata.m)<- c("elev", "forest", "temp") # put names to raster
# lets plot the covs maps
plot(mapdata.m)
```
Once we have our covariate maps, we use them to predict with the best model. In this way we can have a map with predictions of the occupancy and the probability of detection.
```{r spatr, cache=TRUE, warning=FALSE, message=FALSE, fig.height=2.5, fig.show='asis', fig.cap="Spatially explicit detection and occupancy models.", fig.scap="fig."}
# make the predictions
predictions_psi <- predict(fm7, type="state", newdata=mapdata.m) # predict psi
predictions_p <- predict(fm7, type="det", newdata=mapdata.m) # predict p
# put in the same stack
predmaps<-stack(predictions_psi$Predicted,predictions_p$Predicted)
names(predmaps)<-c("psi_predicted", "p_predicted") # put names
plot(predmaps)
```