Chapter 5 Spatial non-stationarity
In this chapter we focus on non-stationary models. In particular, these models will show how to model the covariance using covariates and how to deal with physical barriers in the study region.
5.1 Explanatory variables in the covariance
In this section we present an example of the model proposed in Ingebrigtsen, Lindgren, and Steinsland (2014). This example describes a way to include explanatory variables (i.e., covariates) in both of the SPDE model parameters. We will consider the parametrization used along this book and detailed in Lindgren, Rue, and Lindström (2011).
5.1.1 Introduction
First of all, we remind ourselves of the definition for the precision matrix. Restricting to \(\alpha=1\) and \(\alpha=2\), the precision matrix is given as follows:
\(\alpha=1\): \(\mathbf{Q}_{1,\tau,\kappa} = \tau(\kappa^2\mathbf{C} + \mathbf{G})\)
\(\alpha=2\): \(\mathbf{Q}_{2,\tau,\kappa} = \tau(\kappa^4\mathbf{C} + \kappa^2\mathbf{G} + \kappa^2\mathbf{G} + \mathbf{G}\mathbf{C}^{-1}\mathbf{G})\)
The default stationary SPDE model implemented in inla.spde2.matern()
function considers the \(\theta_1\) as the log of the local precision parameter \(\tau\)
and \(\theta_2\) the log of the scale \(\kappa\) parameter.
The marginal variance \(\sigma^2\) and the range \(\rho\) are function of these parameters as
\(\sigma^2=\Gamma(\nu) / (\Gamma(\alpha)(4\pi)^{d/2}\kappa^{2\nu}\tau^2)\) and \(\rho = \sqrt{8\nu}/\kappa\). These relations give
\[\begin{align} \begin{array}{rcl} \log(\tau) & = & \frac{1}{2}\log\left(\frac{\Gamma(\nu)}{\Gamma(\alpha)(4\pi)^{d/2}}\right) - \log(\sigma) -\nu\log(\kappa) \\ \log(\kappa) & = & \frac{\log(8\nu)}{2} - \log(\rho). \end{array} \end{align}\]
The approach to non-stationarity proposed in Ingebrigtsen, Lindgren, and Steinsland (2014) is to consider a regression like model for \(\log\tau\) and \(\log\kappa\). It is done considering basis functions as
\[\begin{align}\begin{array}{rcl} \log(\tau(\mathbf{s})) & = & b_0^{(\tau)}(\mathbf{s}) + \sum_{k=1}^p b_k^{(\tau)}(\mathbf{s})\theta_k \\ \log(\kappa(\mathbf{s})) & = & b_0^{(\kappa)}(\mathbf{s}) + \sum_{k=1}^p b_k^{(\kappa)}(\mathbf{s})\theta_k \end{array}\end{align}\]
where we have the \(\theta_k\) parameters as regression on the basis functions. Setting \(\mathbf{T}\) as a diagonal matrix with elements as \(\tau(\mathbf{s})\) and \(\mathbf{K}\) a diagonal matrix with elements as \(\kappa(\mathbf{s})\) we now have the precision matrix as
\(\alpha=1\): \(\mathbf{Q}_{1,\theta,\mathbf{B}^{(\tau)},\mathbf{B}^{(\kappa)}}\) = \(\mathbf{T}(\mathbf{K}\mathbf{C}\mathbf{K} + \mathbf{G})\mathbf{T}\)
\(\alpha=2\): \(\mathbf{Q}_{2,\theta,\mathbf{B}^{(\tau)},\mathbf{B}^{(\kappa)}}\) = \(\mathbf{T}(\mathbf{K}^2\mathbf{C}\mathbf{K}^2 + \mathbf{K}\mathbf{G} + \mathbf{G}^T\mathbf{K} + \mathbf{G}\mathbf{C}^{-1}\mathbf{G})\mathbf{T}\)
These basis functions are passed in the argument B.tau
and B.kappa
arguments of the inla.spde2.matern()
function.
\(b_0^{(\tau)}\) goes in the first column of B.tau
and \(b_k^{(\tau)}\) in the next columns.
\(b_0^{(\kappa)}\) goes in the first column of B.kappa
and \(b_k^{(\tau)}\) in the next columns.
When these basis matrices are supplied as just one line matrix,
the actual basis matrix will be formed having all lines equal to this
unique line matrix and the model is stationary.
The default uses \([0\; 1\; 0]\) (one by three) matrix
for the local precision parameter \(\tau\) and \([0\; 0\; 1]\)
(one by three) matrix for the scaling parameter \(\kappa\).
Thus \(\theta_1\) controls \(\log(\tau)\) and \(\theta_2\) controls \(\log(\kappa)\).
We can use B.tau
and B.kappa
in such a way to build a
model for the marginal standard deviation and range, see Lindgren and Rue (2015).
This is the case of the function inla.spde2.pcmatern()
, where
B.tau
is \([\log(\tau_0) \; \nu \; -1]\) and
B.kappa
is \([\log(\kappa_0) \; -1 \; 0]\).
We are using this along this book having the
range as the first parameter and \(\sigma\) the second.
Both the marginal standard deviation and the range can be modeled considering a regression model as detailed in Lindgren and Rue (2015). This is the case of considering that the log of \(\sigma\) and log of \(\rho\) are modeled by a regression on basis functions as
\[\begin{align} \log(\sigma(\mathbf{s})) = b_0^\sigma(\mathbf{s}) + \sum_{k=1}^p b_k^\sigma(\mathbf{s}) \theta_k \\ \log(\rho(\mathbf{s})) = b_0^{\rho}(\mathbf{s}) + \sum_{k=1}^p b_k^\rho(\mathbf{s}) \theta_k \end{align}\]
where \(b_0^\sigma(\mathbf{s})\) and \(b_0^{\rho(\mathbf{s})}\) are offsets,
\(b_k^\sigma()\) and \(b_k^{\rho()}\) are basis functions that can be defined on spatial
locations or covariates, each one with an associated \(\theta_k\) parameter.
Notice that B.tau
and B.kappa
are basis functions evaluated at each mesh
node. Therefore, to include a covariate in \(\sigma^2\) or in the range we do
need to have it available at each mesh node.
In our example we consider the domain area as the rectangle \((0,10)\times(0,5)\) and
the range as a function of the first coordinate of the location defined by
\(\rho(\mathbf{s}) = \exp(\theta_2 + \theta_3 (s_{i,1} - 5) / 10)\). This gives
\[\begin{align}\begin{array}{rcl} \log(\sigma) & = & \log(\sigma_0) + \theta_1 \\ \log(\rho(\mathbf{s})) & = & \log(\rho_0) + \theta_2 + \theta_3 b(\mathbf{s}) \end{array}\end{align}\]
where \(b(\mathbf{s}) = (s_{i,1}-5)/10\).
We now define the values for \(\theta_1\), \(\theta_2\) and \(\theta_3\). Because we choose \(\theta_3\) positive it implies an increasing range along the first location coordinate. The range as function of the first coordinate is shown in Figure 5.1.
We have to define the basis functions for \(\log(\tau)\) and \(\log(\kappa)\). Following Lindgren and Rue (2015) we need to write \(\log(\kappa(\mathbf{s}))\) and \(\log(\tau(\mathbf{s}))\) as functions of \(\log(\rho(\mathbf{s}))\) and \(\log(\sigma)\). Substituting \(\log(\sigma)\) and \(\log(\rho(\mathbf{s}))\) into the equations for \(\log(\kappa(\mathbf{s}))\) and \(\log(\tau(\mathbf{s}))\) we have
\[\begin{align}\begin{array}{rcl} \log(\tau(\mathbf{s})) & = & \log(\tau_0) - \theta_1 + \nu\theta_2 + \nu\theta_3 b(\mathbf{s}) \\ \log(\kappa(\mathbf{s})) & = & \log(\kappa_0) - \theta_2 - \theta_3 b(\mathbf{s}) \;. \end{array}\end{align}\]
Therefore we have to include \((s_{i,1}-5)/10\) as a fourth column in B.tau
and minus it in B.kappa
as well.
5.1.2 Implementation of the model
First, we define the \((0,10)\times (0,5)\) rectangle:
cbind(c(0, 1, 1, 0, 0) * 10, c(0, 0, 1, 1, 0) * 5) pl01 <-
The mesh will be created using this polygon, as follows:
inla.mesh.2d(loc.domain = pl01, cutoff = 0.1,
mesh <-max.edge = c(0.3, 1), offset = c(0.5, 1.5))
This mesh is made of 2198 nodes.
We supply \((s_{i,1} - 5) / 10\) as a fourth column of in the B.tau
argument of
the inla.spde2.matern()
function, and minus it in the fourth column of
argument B.kappa
. In addition, it is also necessary to set a prior
distribution on the vector of hyperparameters \(\theta\) according to its new
dimension, a three length vector. The default is a Gaussian distribution, for
which mean and precision diagonal need to be specified in two vectors, as
follows:
1
nu <- nu + 2 / 2
alpha <-# log(kappa)
log(8 * nu) / 2
logkappa0 <-# log(tau); in two lines to keep code width within range
(lgamma(nu) - lgamma(alpha) -1 * log(4 * pi)) / 2
logtau0 <- logtau0 - logkappa0
logtau0 <-# SPDE model
inla.spde2.matern(mesh,
spde <-B.tau = cbind(logtau0, -1, nu, nu * (mesh$loc[,1] - 5) / 10),
B.kappa = cbind(logkappa0, 0, -1, -1 * (mesh$loc[,1] - 5) / 10),
theta.prior.mean = rep(0, 3),
theta.prior.prec = rep(1, 3))
The precision matrices are built with
inla.spde2.precision(spde, theta = theta) Q <-
5.1.3 Simulation at the mesh nodes
We can draw one realization of the process using the inla.qsample()
function as
as.vector(inla.qsample(1, Q, seed = 1)) sample <-
The top plot in Figure 5.2 shows the simulated values
projected to a grid Here, the projector matrix has been used to project the
simulated values in the grid limited in the unit square with limits (0, 10) and
(0, 5). The bottom plot in Figure 5.2 shows the spatial
correlation with respect to two points, computed using function
book.spatial.correlation()
. The interesting issue to observe here is how
the range at which spatial autocorrelation disappears (i.e., correlation is
zero) is smaller around the point on the left due to the increasing range with
the x-coordinate simulated in the model.
The results and plots in Figure 5.2 have been computed with this code:
# Plot parameters
par(mfrow = c(2, 1), mar = c(0, 0, 0, 0))
#Plot Field
inla.mesh.projector(mesh, xlim = 0:1 * 10, ylim = 0:1 * 5,
proj <-dims = c(200, 100))
book.plot.field(sample, projector = proj)
# Compute spatial autocorrelation
.5 <- book.spatial.correlation(Q, c(1, 2.5), mesh)
cx1y2.5 <- book.spatial.correlation(Q, c(7, 2.5), mesh)
cx7y2# Plot spatial autocorrelation
book.plot.field(cx1y2.5, projector = proj, zlim = c(0.1, 1))
book.plot.field(cx7y2.5, projector = proj, zlim = c(0.1, 1),
add = TRUE)
5.1.4 Estimation with data simulated at the mesh nodes
A model can be easily fitted with the data simulated at mesh nodes. Considering that there are observations exactly at each mesh node, there is no need to use any predictor matrix and the stack functionality. Because data are just realizations from the random field, there is no noise and then the precision of the Gaussian likelihood will be fixed to a very large value. For example, to value \(\exp(20)\):
list(hyper = list(prec = list(initial = 20,
clik <-fixed = TRUE)))
Because the random field has a zero mean, there are not fixed parameters to fit. Hence, model fitting can be done as follows:
y ~ 0 + f(i, model = spde)
formula <-
inla(formula, control.family = clik,
res1 <-data = data.frame(y = sample, i = 1:mesh$n))
Now, the summary of the posterior marginal distribution for \(\theta\) (joined with the true values) is provided in Table 5.1. The results obtained are quite good and the point estimates are very close to the actual values.
Parameter | True | Mean | St. Dev. | 2.5% quant. | 97.5% quant. |
---|---|---|---|---|---|
\(\theta_1\) | -1 | -0.9800 | 0.0328 | -1.0413 | -0.9125 |
\(\theta_2\) | 0 | 0.0006 | 0.0408 | -0.0749 | 0.0851 |
\(\theta_3\) | 1 | 1.1679 | 0.0591 | 1.0530 | 1.2846 |
5.1.5 Estimation with locations not at the mesh nodes
The more general case is when observations are not at the mesh nodes.
For this reason, some in the next example data will be at the locations
simulated with the following R
code:
set.seed(2)
200
n <- cbind(runif(n) * 10, runif(n) * 5) loc <-
Next, we take the simulated random field at the mesh vertices and project it into these locations. For this, a projector matrix is needed:
inla.mesh.projector(mesh, loc) projloc <-
Hence, the projection is:
inla.mesh.project(projloc, sample) x <-
This provides the sample data at the simulated locations.
Now, because these locations are not vertices of the mesh, the stack functionality is required to put all the data together. First, the predictor matrix is needed, but this is the same used to sample the data.
Then, a stack is defined for this sample:
inla.stack(
stk <-data = list(y = x),
A = list(projloc$proj$A),
effects = list(data.frame(i = 1:mesh$n)),
tag = 'd')
Finally, the model is fitted:
inla(formula, data = inla.stack.data(stk),
res2 <-control.family = clik,
control.predictor = list(compute = TRUE, A = inla.stack.A(stk)))
The true values and a summary of marginal posterior distributions for \(\theta\) are given in Table 5.2. As in the previous example, estimates are quite close to the actual values of the parameters used in the simulation.
Parameter | True | Mean | St. Dev. | 2.5% quant. | 97.5% quant. |
---|---|---|---|---|---|
\(\theta_1\) | -1 | -0.9115 | 0.0708 | -1.0507 | -0.7723 |
\(\theta_2\) | 0 | -0.0063 | 0.1248 | -0.2539 | 0.2364 |
\(\theta_3\) | 1 | 1.0367 | 0.2539 | 0.5547 | 1.5582 |
Figure 5.3 shows the simulated values, the predicted (posterior mean) and the projected posterior standard deviation. Here, it can be seen how the predicted values are similar to the simulated true values.
5.2 The Barrier model
The most common spatial models are the stationary isotropic ones. For these, moving or rotating the map does not change the model. Even though this is a reasonable assumption in many cases, it is not a reasonable assumption if there are physical barriers in the study area. For example, when we model aquatic species near the coast, a stationary model cannot be aware of the coastline, and a new model is needed (one which is aware of the coastline).
In the paper by Bakka et al. (2016) a new non-stationary model was constructed for
use in INLA
, with a syntax very similar to the stationary model (to make it
easy to use). In this section we give an example using this model on a part of
the Canadian coastline.
5.2.1 Canadian coastline example
The example developed here is based on simulated data using the Canadian coastline as a physical barrier to spatial correlation. This example requires some manipulation of maps and polygons in order to create an appropriate mesh for the model, and we have loaded additional packages for this.
5.2.1.1 Polygon
First, we construct some spatial polygon covering our study area. Here we do
this via the mapdata
package (Becker, Wilks, and Brownrigg 2016):
# Select region
map("world", "Canada", fill = TRUE,
map <-col = "transparent", plot = FALSE)
sapply(strsplit(map$names, ":"), function(x) x[1])
IDs <- map2SpatialPolygons(
map.sp <-IDs = IDs,
map, proj4string = CRS("+proj=longlat +datum=WGS84"))
Next, we define our study area pl.sel
, as a manually constructed polygon, and
intersect this with the coastal area. Since we have a polygon for land, we
take the difference instead of an intersection, as follows:
SpatialPolygons(list(Polygons(list(Polygon(
pl.sel <-cbind(c(-69, -62.2, -57, -57, -69, -69),
c(47.8, 45.2, 49.2, 52, 52, 48)),
FALSE)), '0')), proj4string = CRS(proj4string(map.sp)))
gDifference(pl.sel, map.sp) poly.water <-
The Canadian coastline and the manually constructed polygon can be seen on the left plot in Figure 5.4.
5.2.1.2 Transforming to UTM coordinates
So far we have used longitude and latitude in this example, but this is not a valid coordinate system for spatial modeling, as distances in longitude and distances in latitude are in different scales. Modeling should be done in a CRS (Coordinate Reference System) where units along each axis is measured in e.g. kilometers. We do not want to use meters as this would result in very large values along the axes, and could cause unstable numerical results.
The projection is done with function spTransform()
as follows:
# Define UTM projection
CRS("+proj=utm +zone=20 ellps=WGS84 +units=km")
kmproj <-# Project data
spTransform(poly.water, kmproj)
poly.water = spTransform(pl.sel, kmproj)
pl.sel = spTransform(map.sp, kmproj) map.sp =
5.2.1.3 Simple mesh
Before we construct the mesh we are going to use, we first show how to construct a mesh only in water. We then discuss the difference between the two approaches.
inla.mesh.2d(boundary = poly.water, max.edge = 30,
mesh.not <-cutoff = 2)
The created mesh has 1115 nodes. This mesh has been plotted in Figure 5.4 (right plot).
Before developing the Barrier model, we used to construct this simple mesh and use the SPDE model here, as this avoids smoothing across land. The problem with this approach, however, is that we introduce other hidden assumptions, called Neumann boundary conditions. As shown by Bakka et al. (2016), this assumption can lead to even worse behavior than the stationary model (which smooths over islands).
5.2.1.4 Mesh over land and sea
Next, we construct a mesh over both water and land. This is the mesh we are going to use. We include the coastline polygon in the function call, to make the edges of the triangles follow the coastline.
30
max.edge = 150
bound.outer = inla.mesh.2d(boundary = poly.water,
mesh <-max.edge = c(1,5) * max.edge,
cutoff = 2,
offset = c(max.edge, bound.outer))
Next, we select the triangles of the mesh that are inside poly.water
. We
define the triangles in the barrier area (i.e. on land) to be all the triangles
that are not on water. From this we construct the poly.barrier
. This should
match the original land polygon closely, but will deviate a little. The new
polygon poly.barrier
is where our model assumes there to be land; hence we
use this polygon also for plotting the results.
inla.over_sp_mesh(poly.water, y = mesh,
water.tri =type = "centroid", ignore.CRS = TRUE)
length(mesh$graph$tv[, 1])
num.tri = setdiff(1:num.tri, water.tri)
barrier.tri = inla.barrier.polygon(mesh,
poly.barrier =barrier.triangles = barrier.tri)
The mesh in Figure 5.5 is different from the simple mesh in Figure 5.4, as we have also defined the mesh over land. We can use this mesh to construct both a stationary model and a Barrier model.
5.2.1.5 The Barrier model vs. the stationary model
Next we define the precision matrix for the Barrier model and the stationary model
\[ \mathbf{u} \sim \mathcal N(0, Q^{-1}) \] considering unit marginal variance:
200
range <- inla.barrier.pcmatern(mesh,
barrier.model <-barrier.triangles = barrier.tri)
inla.rgeneric.q(barrier.model, "Q", theta = c(0, log(range))) Q <-
This code spits out a warning because the priors are not defined. However, as we are simulating from the GF, the priors for the hyperparameters are irrelevant.
inla.spde2.pcmatern(mesh,
stationary.model <-prior.range = c(1, 0.1), prior.sigma = c(1, 0.1))
inla.spde2.precision(stationary.model,
Q.stat <-theta = c(log(range), 0))
The priors we chose here are arbitrary as they are not used. Note that the
ordering of theta
is reversed between the two models.
We use the functions from spde-book-functions.R
to compute spatial
correlation surfaces; see Figure 5.6.
# The location we find the correlation with respect to
c(500, 5420)
loc.corr <- book.spatial.correlation(Q, loc = loc.corr, mesh)
corr <- book.spatial.correlation(Q.stat, loc = loc.corr,
corr.stat <- mesh)
5.2.1.6 Sample from the Barrier model
Next, we sample locations from the water polygon using spsample()
. And we
simplify the data structure loc.data
from SpatialPoints
to just a matrix of
values.
set.seed(201805)
spsample(poly.water, n = 1000, type = "random")
loc.data <- loc.data@coords loc.data <-
We sample the continuous field \(\mathbf{u}\) and project it to data locations:
# Seed is the month the code was first written times some number
inla.qsample(n = 1, Q = Q, seed = 201805 * 3)[, 1]
u <- inla.spde.make.A(mesh, loc.data)
A.data <- A.data %*% u
u.data <-
# df is the dataframe used for modeling
data.frame(loc.data)
df <-names(df) <- c('locx', 'locy')
# Size of the spatial signal
1
sigma.u <-# Size of the measurement noise
0.1
sigma.epsilon <-$y <- drop(sigma.u * u.data + sigma.epsilon * rnorm(nrow(df))) df
5.2.1.7 Inference with the Barrier model
Similarly as with previous models, we construct the typical stack object to prepare the data for model fitting:
inla.stack(
stk <-data = list(y = df$y),
A = list(A.data, 1),
effects =list(s = 1:mesh$n, intercept = rep(1, nrow(df))),
tag = 'est')
The formula for the model only takes an intercept plus the spatial effect:
y ~ 0 + intercept + f(s, model = barrier.model) form.barrier <-
Finally, we run INLA
with a Gaussian likelihood:
inla(form.barrier, data = inla.stack.data(stk),
res.barrier <-control.predictor = list(A = inla.stack.A(stk)),
family = 'gaussian',
control.inla = list(int.strategy = "eb"))
5.2.1.8 Posterior spatial field
In Figure 5.7 we compare the results from the Barrier model to the true (simulated) spatial field. These have a close match. We observe quick changes across islands or small peninsulas, which may be smoothed out by a stationary model. See Bakka et al. (2016) for a more detailed discussion.
5.2.1.9 Posterior of hyperparameters
The summary of the posterior marginals of the model hyperparameters is:
$summary.hyperpar
res.barrier## mean sd
## Precision for the Gaussian observations 94.4892 6.5776
## Theta1 for s -0.1348 0.1182
## Theta2 for s 5.1357 0.1379
## 0.025quant 0.5quant
## Precision for the Gaussian observations 82.0188 94.3386
## Theta1 for s -0.3381 -0.1457
## Theta2 for s 4.8992 5.1229
## 0.975quant mode
## Precision for the Gaussian observations 107.8866 94.1400
## Theta1 for s 0.1218 -0.1861
## Theta2 for s 5.4352 5.0749
To summarize or plot the hyperparameters in the Barrier model, we must know the order they come in and the transformations:
\[\begin{align} \sigma &= e^ {\theta_1} \quad \text{is the marginal standard deviation} \\ r &= e^ {\theta_2} \quad \text{is the spatial range} \end{align}\]
From this we compute Table 5.3. We see that our estimates and intervals recover the true values.
Parameter | True | 50% quant. | 2.5% quant. | 97.5% quant. |
---|---|---|---|---|
\(\sigma\) | 1 | 0.8644 | 0.7131 | 1.129 |
\(range\) | 200 | 167.8143 | 134.1823 | 229.344 |
5.3 Barrier model for noise data in Albacete (Spain)
Barrier models can be appropriate tools to model different types of environmental phenomena that have a clear anisotropic behavior. A clear example is the propagation of noise in cities, where buildings act as barriers in the propagation of noise. In this section, spatial and spatio-temporal models are developed using noise data from the city of Albacete (Castilla-La Mancha, Spain).
Data has been extracted from a report commissioned by the local authorities due to increasing concern about noise in the city and, in particular, in certain areas of the city center. The analysis will focus on a busy area in the city center (popularly known as “La zona”, i.e., “The zone”) with plenty of bars and restaurants. The original reports are available (as PDF documents) from website http://www.pioneraconsultores.com/es/mapa-de-ruidos.zhtm?corp=medioambiente, which contains the original data.
Measurements were taken in March 2010 for a period of 24 hours at different points throughout the city. In this analysis, hourly measurements of sound pressure levels (in A-weighted decibels, or dbA) at 7 different points within the study region will be considered. Local regulations require noise levels to be below 65 db during the day and below 60 db at night in residential areas. Hence, the aim of this example is to study how noise levels vary within the city center to get an idea of whether local regulations are fulfilled.
The data used in this example is provided as a
SpatialPointsDataFrame
in file noise.RData
. Table 5.4
summarizes the main variables in the dataset.
Variable | Description |
---|---|
X |
x-coordinate (in UTM). |
Y |
y-coordinate (in UTM). |
LAeqZZh |
Hourly sound pressure level (in dbA) between ZZ and ZZ + 1 hours. ZZ is an integer from 1 to 24. |
5.3.0.1 Data loading
Data about the locations of the buildings, which will act as barriers, will be
obtained from OpenStreetMap (OSM, OpenStreetMap contributors 2018) using package osmar
(Eugster and Schlesinger 2013). This package provides a simple way to download OSM data and
extract the relevant information required to build the desired models.
In the next lines of R
code, function osmsource_api()
sets up access to
the OSM API. Next, function center_bbox
is used to set the squared region
where the data is extracted from. The first two arguments are the coordinates
of the center of the rectangle, and the next two its width and height.
Finally, function get_osm()
gets the data from the defined bounding box using
the OSM API.
#Get building data from Albacete using OpenStreetMap
library(osmar)
osmsource_api()
src <-
center_bbox(-1.853152, 38.993318, 400, 400)
bb <- get_osm(bb, source = src) ua <-
OSM data provides a wealth of information about street data in variable ua
.
Since our focus is on modeling noise using buildings as barriers, the
information about buildings will be extracted. Functions find()
and
find_down()
are used below to obtain an index of all features in the OSM data
in ua
tagged as building
. Then subset()
is used to select the elements in
ua
that are in the index of buildings. Finally, function as_sp
converts the
boundaries of the buildings (as associated information) into a
SpatialPolygonsDataFrame
.
find(ua, way(tags(k == "building")))
idx <- find_down(ua, way(idx))
idx <- subset(ua, ids = idx)
bg <-
as_sp(bg, "polygons") bg_poly <-
The newly created SpatialPolygonsDataFrame
object stores different buildings
as separate entities. Hence, we will use function unionSpatialPolygons()
to
create a single entity, which will be used later to obtain the boundaries of
the streets.
library(maptools)
unionSpatialPolygons(bg_poly,
bg_poly <-rep("1", length(bg_poly)))
Next, the actual study area is defined by using function center_bbox()
again, this time with a smaller width and height, to make sure that all
buildings in the study area have been retrieved before. Finally,
a SpatialPolygons
object is created and the intersection between
the boundary of the study area and the polygons of the buildings is
created using function gIntersection
from the rgeos
package (R. Bivand and Rundel 2017).
#Outer boundary to overlay with buildings
center_bbox(-1.853152, 38.993318, 350, 350)
bb.outer <- matrix(c(bb.outer[1:2], bb.outer[c(3, 2)], bb.outer[3:4],
pl <-c(1, 4)]), ncol = 2, byrow = TRUE)
bb.outer[
SpatialPolygons(
pl_sp <-list(Polygons(list(Polygon(pl)), ID = 1)),
proj4string = CRS(proj4string(bg_poly)))
library(rgeos)
gIntersection(bg_poly, pl_sp, byid = TRUE) bg_poly2 <-
Coordinates of the buildings are in longitude and latitude, which is not
an adequate coordinate reference system to deal with distances. For this
reason, coordinates will be converted into UTM, so that distances
are measured in meters. For this, function spTransform()
will be
used:
## Transform data
library(rgdal)
spTransform(pl_sp,
pl_sp.utm30 <-CRS("+proj=utm +zone=30 +ellps=GRS80 +units=m +no_defs"))
spTransform(bg_poly2,
bg_poly2.utm30 <-CRS("+proj=utm +zone=30 +ellps=GRS80 +units=m +no_defs"))
The last part of the building data processing involves obtaining not the polygons of the buildings, but the polygons of the streets, which is where the spatial process takes place. This will be done by taking the bounding box of the study region and removing the buildings, as follows:
## Compute difference to get streets
gDifference(pl_sp.utm30, bg_poly2.utm30) bg_poly2.utm30 <-
Figure 5.8 shows the buildings (left plot) and the streets (right plot) that will be used in the analysis.
Noise data is provided in a data.frame
in file noise.RData
, and it
can be loaded as:
load("data/noise.RData")
This dataset contains data from 7 locations where noise has been measured. This includes 5 outdoors locations plus 2 indoors locations. Although noise may be attenuated at these 2 locations they have been kept in the dataset to have more observations in the analysis.
5.3.0.2 Mesh building
The mesh will be built similarly to that in Section 5.2.1. In this case, distances are measured in meters. The maximum edges of the triangles will be 5 meters and the outer boundary will be 10 meters:
5
max.edge = 10
bound.outer = inla.mesh.2d(boundary = pl_sp.utm30,
mesh <-max.edge = c(1, 1.5) * max.edge,
cutoff = 10,
offset = c(max.edge, bound.outer))
The next lines of R
code will use function inla.over_sp_mesh()
(to
check which mesh triangles are inside a polygon) and call function
inla.barrier.polygon()
to obtain the polygon around the barrier. Figure
5.9 shows the mesh created, together with the streets and the
polygon around the barrier obtained with inla.barrier.polygon()
(in red).
inla.over_sp_mesh(bg_poly2.utm30, y = mesh,
city.tri =type = "centroid", ignore.CRS = TRUE)
length(mesh$graph$tv[, 1])
num.tri <- setdiff(1:num.tri, city.tri)
barrier.tri <- inla.barrier.polygon(mesh,
poly.barrier <-barrier.triangles = barrier.tri)
5.3.0.3 Barrier model for noise data
Before fitting the actual barrier model to the noise data, we will explore how
the spatial correlation in the barrier model is defined. First of all, a model
with range 100 and precision 1 is simulated. Here, function
inla.barrier.pcmatern()
is used to define a barrier model using the
rgeneric
approach in INLA
to define latent models; see
inla.doc("rgeneric")
for details. Furthermore, inla.rgeneric.q()
returns
the precision matrix of the model for some values of the hyperparameters, i.e.,
precision and range (in the log scale).
100
range <- 1
prec <- inla.barrier.pcmatern(mesh,
barrier.model <-barrier.triangles = barrier.tri)
inla.rgeneric.q(barrier.model, "Q",
Q <-theta = c(log(prec), log(range)))
In order to compare the Barrier model, a stationary model is defined below.
Function inla.spde2.pcmatern()
defines a stationary model using a PC prior
for the standard deviation and range, and function inla.spde2.precision()
returns the precision matrix of the model.
inla.spde2.pcmatern(mesh,
stationary.model <-prior.range = c(100, 0.9), prior.sigma = c(1, 0.1))
inla.spde2.precision(stationary.model,
Q.stat <-theta = c(log(range), 0))
Then, the correlation field at point (599318.3, 4316661)
is computed
for the Barrier and stationary models:
# The location we find the correlation with respect to
c(599318.3, 4316661)
loc.corr <-
book.spatial.correlation(Q, loc = loc.corr, mesh)
corr <- book.spatial.correlation(Q.stat, loc = loc.corr, mesh) corr.stat =
These are shown in Figure 5.10. Note how the spatial correlation for the stationary model (right plot in Figure 5.10) seems to ignore buildings and only considers Euclidean distance between two points, while the spatial autocorrelation for the Barrier model (left plot in Figure 5.10) does take buildings into account.
The first Barrier model will consider the spatial variation of noise
at 1 am, which is measured by variable LAeq1h
. The projector matrix
and stack required to fit the model can be obtained as follows:
inla.spde.make.A(mesh, coordinates(noise))
A.data <-
inla.stack(
stk <-data = list(y = noise$LAeq1h),
A = list(A.data, 1),
effects =list(s = 1:mesh$n, intercept = rep(1, nrow(noise))),
tag = 'est')
Similarly, a new stack will be created for prediction at the mesh nodes:
# Projector matrix at prediction points
inla.spde.make.A(mesh, mesh$loc[, 1:2])
A.pred <-
#Stack for prediction at mesh nodes
inla.stack(
stk.pred <-data = list(y = NA),
A = list(A.pred, 1),
effects =list(s = 1:mesh$n, intercept = rep(1, nrow(A.pred))),
tag = 'pred')
# Joint stack for model fitting and prediction
inla.stack(stk, stk.pred) joint.stk <-
Next, the formula to fit the model is:
y ~ 0 + intercept + f(s, model = barrier.model) form.barrier <-
Then, the model is fitted using a PC prior on the standard deviation parameter of the likelihood:
# PC-prior for st. dev.
list(prior = "pc.prec", param = c(2, 0.01))
stdev.pcprior <-# Model fitting
inla(form.barrier,
res.barrier <-data = inla.stack.data(joint.stk),
control.predictor = list(A = inla.stack.A(joint.stk),
compute = TRUE),
family = 'gaussian',
control.inla = list(int.strategy = "eb"),
control.family = list(hyper = list(prec = stdev.pcprior)),
control.mode = list(theta = c(1.647, 1.193, 3.975)))
The summary of the fitted model is shown below. Figure 5.11 shows estimates of the posterior mean and 97.5% quantile of the noise. As can be seen, it is clear that noise level at 1 am seems to be very close to or higher than the cut off of 65 dB.
summary(res.barrier)
##
## Call:
## c("inla(formula = form.barrier, family = \"gaussian\",
## data = inla.stack.data(joint.stk), ", "
## control.predictor = list(A = inla.stack.A(joint.stk),
## compute = TRUE), ", " control.family = list(hyper =
## list(prec = stdev.pcprior)), ", " control.inla =
## list(int.strategy = \"eb\"), control.mode = list(theta
## = c(1.647, ", " 1.193, 3.975)))")
## Time used:
## Pre = 0.592, Running = 111, Post = 1.26, Total = 113
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## intercept 64.89 1.389 62.17 64.89 67.62 64.89 0
##
## Random effects:
## Name Model
## s RGeneric2
##
## Model hyperparameters:
## mean sd 0.025quant
## Precision for the Gaussian observations 5.46 0.701 4.211
## Theta1 for s 1.04 0.145 0.731
## Theta2 for s 4.25 0.861 2.712
## 0.5quant 0.975quant
## Precision for the Gaussian observations 5.42 6.96
## Theta1 for s 1.05 1.29
## Theta2 for s 4.18 6.03
## mode
## Precision for the Gaussian observations 5.33
## Theta1 for s 1.09
## Theta2 for s 3.87
##
## Expected number of effective parameters(stdev): 6.92(0.00)
## Number of equivalent replicates : 1.01
##
## Marginal log-Likelihood: -32.65
## Posterior marginals for the linear predictor and
## the fitted values are computed
5.3.0.4 Space-time model
Space-time models are fully described in Chapter 7 and Chapter 8, but we have included here a simple example using the noise data. Given that information is available for 24 hours, it is possible to fit a space-time model to the noise data. First, a projector matrix is defined for the measurement points for the 24 hours:
inla.spde.make.A(mesh = mesh,
A.st <-loc = coordinates(noise)[rep(1:7, 24), ])
The INLA
stack is now created by adding variables LAeq1h
, … LAeq24h
together with a time
identifier (from 1 to 24 hours):
inla.stack(
stk.st <-data = list(y = unlist(noise@data[, 2 + 1:24])),
A = list(A.st, 1, 1),
effects = list(s = 1:mesh$n,
intercept = rep(1, 24 * nrow(noise)),
time = rep(1:24, each = nrow(noise) )))
The model to be fitted is similar to the previous one, but now a cyclic random walk of order 1 is added as a separable time effect:
# Model formula
y ~ 0 + intercept +
form.barrier.st <- f(s, model = barrier.model) +
f(time, model = "rw1", cyclic = TRUE, scale.model = TRUE,
hyper = list(prec = stdev.pcprior))
Note that a PC prior has been used on the standard deviation parameter of the random walk effect. Then the model is fitted similarly as before:
inla(form.barrier.st,
res.barrier.st <-data = inla.stack.data(stk.st),
control.predictor = list(A = inla.stack.A(stk.st)),
family = 'gaussian',
control.inla = list(int.strategy = "eb"),
control.family = list(hyper = list(prec = stdev.pcprior)),
control.mode = list(theta = c(-1.883, 1.123, 3.995, -0.837)))
A summary of the fitted model can be seen below. Figure 5.12 shows noise estimates at 6, 12, 18 and 24 hours.
summary(res.barrier.st)
##
## Call:
## c("inla(formula = form.barrier.st, family =
## \"gaussian\", data = inla.stack.data(stk.st), ", "
## control.predictor = list(A = inla.stack.A(stk.st)),
## control.family = list(hyper = list(prec =
## stdev.pcprior)), ", " control.inla = list(int.strategy
## = \"eb\"), control.mode = list(theta = c(-1.883, ", "
## 1.123, 3.995, -0.837)))")
## Time used:
## Pre = 0.459, Running = 7.2, Post = 0.134, Total = 7.79
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## intercept 64.86 1.468 61.98 64.86 67.74 64.86 0
##
## Random effects:
## Name Model
## s RGeneric2
## time RW1 model
##
## Model hyperparameters:
## mean sd 0.025quant
## Precision for the Gaussian observations 0.153 0.017 0.121
## Theta1 for s 1.118 0.232 0.660
## Theta2 for s 3.999 0.307 3.395
## Precision for time 0.454 0.139 0.242
## 0.5quant 0.975quant
## Precision for the Gaussian observations 0.152 0.189
## Theta1 for s 1.119 1.571
## Theta2 for s 3.998 4.606
## Precision for time 0.434 0.784
## mode
## Precision for the Gaussian observations 0.150
## Theta1 for s 1.123
## Theta2 for s 3.995
## Precision for time 0.397
##
## Expected number of effective parameters(stdev): 17.54(0.00)
## Number of equivalent replicates : 9.58
##
## Marginal log-Likelihood: -452.70
References
Bakka, H., J. Vanhatalo, J. Illian, D. Simpson, and H. Rue. 2016. “Non-stationary Gaussian models with physical barriers.” ArXiv E-Prints, August. http://arxiv.org/abs/1608.03787.
Becker, R. A., A. R. Wilks, and R. Brownrigg. 2016. Mapdata: Extra Map Databases. https://CRAN.R-project.org/package=mapdata.
Bivand, R., and C. Rundel. 2017. Rgeos: Interface to Geometry Engine - Open Source (’Geos’). https://CRAN.R-project.org/package=rgeos.
Eugster, Manuel J. A., and Thomas Schlesinger. 2013. “Osmar: OpenStreetMap and R.” The R Journal 5 (1): 53–63. http://journal.r-project.org/archive/2013/RJ-2013-005/index.html.
Ingebrigtsen, R., F. Lindgren, and I. Steinsland. 2014. “Spatial Models with Explanatory Variables in the Dependence Structure.” Spatial Statistics 8: 20–38.
Lindgren, F., and H. Rue. 2015. “Bayesian Spatial and Spatio-Temporal Modelling with R-INLA.” Journal of Statistical Software 63 (19).
Lindgren, F., H. Rue, and J. Lindström. 2011. “An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach (with Discussion).” J. R. Statist. Soc. B 73 (4): 423–98.
OpenStreetMap contributors. 2018. “Planet dump retrieved from https://planet.osm.org.” https://www.openstreetmap.org.