# Chapter 5 Spatial non-stationarity

In this chapter we focus on non-stationary models. In particular, these models will show how to model the covariance using covariates and how to deal with physical barriers in the study region.

## 5.1 Explanatory variables in the covariance

In this section we present an example of the model proposed in Ingebrigtsen, Lindgren, and Steinsland (2014). This example describes a way to include explanatory variables (i.e., covariates) in both of the SPDE model parameters. We will consider the parametrization used along this book and detailed in Lindgren, Rue, and Lindström (2011).

### 5.1.1 Introduction

First of all, we remind ourselves of the definition for the precision matrix. Restricting to \(\alpha=1\) and \(\alpha=2\), the precision matrix is given as follows:

\(\alpha=1\): \(\mathbf{Q}_{1,\tau,\kappa} = \tau(\kappa^2\mathbf{C} + \mathbf{G})\)

\(\alpha=2\): \(\mathbf{Q}_{2,\tau,\kappa} = \tau(\kappa^4\mathbf{C} + \kappa^2\mathbf{G} + \kappa^2\mathbf{G} + \mathbf{G}\mathbf{C}^{-1}\mathbf{G})\)

The default stationary SPDE model implemented in `inla.spde2.matern()`

function considers the \(\theta_1\) as the log of the local precision parameter \(\tau\)
and \(\theta_2\) the log of the scale \(\kappa\) parameter.
The marginal variance \(\sigma^2\) and the range \(\rho\) are function of these parameters as

\(\sigma^2=\Gamma(\nu) / (\Gamma(\alpha)(4\pi)^{d/2}\kappa^{2\nu}\tau^2)\) and \(\rho = \sqrt{8\nu}/\kappa\). These relations give

\[\begin{align} \begin{array}{rcl} \log(\tau) & = & \frac{1}{2}\log\left(\frac{\Gamma(\nu)}{\Gamma(\alpha)(4\pi)^{d/2}}\right) - \log(\sigma) -\nu\log(\kappa) \\ \log(\kappa) & = & \frac{\log(8\nu)}{2} - \log(\rho). \end{array} \end{align}\]

The approach to non-stationarity proposed in Ingebrigtsen, Lindgren, and Steinsland (2014) is to consider a regression like model for \(\log\tau\) and \(\log\kappa\). It is done considering basis functions as

\[\begin{align}\begin{array}{rcl} \log(\tau(\mathbf{s})) & = & b_0^{(\tau)}(\mathbf{s}) + \sum_{k=1}^p b_k^{(\tau)}(\mathbf{s})\theta_k \\ \log(\kappa(\mathbf{s})) & = & b_0^{(\kappa)}(\mathbf{s}) + \sum_{k=1}^p b_k^{(\kappa)}(\mathbf{s})\theta_k \end{array}\end{align}\]

where we have the \(\theta_k\) parameters as regression on the basis functions. Setting \(\mathbf{T}\) as a diagonal matrix with elements as \(\tau(\mathbf{s})\) and \(\mathbf{K}\) a diagonal matrix with elements as \(\kappa(\mathbf{s})\) we now have the precision matrix as

\(\alpha=1\): \(\mathbf{Q}_{1,\theta,\mathbf{B}^{(\tau)},\mathbf{B}^{(\kappa)}}\) = \(\mathbf{T}(\mathbf{K}\mathbf{C}\mathbf{K} + \mathbf{G})\mathbf{T}\)

\(\alpha=2\): \(\mathbf{Q}_{2,\theta,\mathbf{B}^{(\tau)},\mathbf{B}^{(\kappa)}}\) = \(\mathbf{T}(\mathbf{K}^2\mathbf{C}\mathbf{K}^2 + \mathbf{K}\mathbf{G} + \mathbf{G}^T\mathbf{K} + \mathbf{G}\mathbf{C}^{-1}\mathbf{G})\mathbf{T}\)

These basis functions are passed in the argument `B.tau`

and `B.kappa`

arguments of the `inla.spde2.matern()`

function.
\(b_0^{(\tau)}\) goes in the first column of `B.tau`

and \(b_k^{(\tau)}\) in the next columns.
\(b_0^{(\kappa)}\) goes in the first column of `B.kappa`

and \(b_k^{(\tau)}\) in the next columns.
When these basis matrices are supplied as just one line matrix,
the actual basis matrix will be formed having all lines equal to this
unique line matrix and the model is stationary.
The default uses \([0\; 1\; 0]\) (one by three) matrix
for the local precision parameter \(\tau\) and \([0\; 0\; 1]\)
(one by three) matrix for the scaling parameter \(\kappa\).
Thus \(\theta_1\) controls \(\log(\tau)\) and \(\theta_2\) controls \(\log(\kappa)\).

We can use `B.tau`

and `B.kappa`

in such a way to build a
model for the marginal standard deviation and range, see Lindgren and Rue (2015).
This is the case of the function `inla.spde2.pcmatern()`

, where
`B.tau`

is \([\log(\tau_0) \; \nu \; -1]\) and
`B.kappa`

is \([\log(\kappa_0) \; -1 \; 0]\).
We are using this along this book having the
range as the first parameter and \(\sigma\) the second.

Both the marginal standard deviation and the range can be modeled considering a regression model as detailed in Lindgren and Rue (2015). This is the case of considering that the log of \(\sigma\) and log of \(\rho\) are modeled by a regression on basis functions as

\[\begin{align} \log(\sigma(\mathbf{s})) = b_0^\sigma(\mathbf{s}) + \sum_{k=1}^p b_k^\sigma(\mathbf{s}) \theta_k \\ \log(\rho(\mathbf{s})) = b_0^{\rho}(\mathbf{s}) + \sum_{k=1}^p b_k^\rho(\mathbf{s}) \theta_k \end{align}\]

where \(b_0^\sigma(\mathbf{s})\) and \(b_0^{\rho(\mathbf{s})}\) are offsets,
\(b_k^\sigma()\) and \(b_k^{\rho()}\) are basis functions that can be defined on spatial
locations or covariates, each one with an associated \(\theta_k\) parameter.
Notice that `B.tau`

and `B.kappa`

are basis functions evaluated at each mesh
node. Therefore, to include a covariate in \(\sigma^2\) or in the range we do
need to have it available at each mesh node.

In our example we consider the domain area as the rectangle \((0,10)\times(0,5)\) and
the range as a function of the first coordinate of the location defined by

\(\rho(\mathbf{s}) = \exp(\theta_2 + \theta_3 (s_{i,1} - 5) / 10)\). This gives

\[\begin{align}\begin{array}{rcl} \log(\sigma) & = & \log(\sigma_0) + \theta_1 \\ \log(\rho(\mathbf{s})) & = & \log(\rho_0) + \theta_2 + \theta_3 b(\mathbf{s}) \end{array}\end{align}\]

where \(b(\mathbf{s}) = (s_{i,1}-5)/10\).

We now define the values for \(\theta_1\), \(\theta_2\) and \(\theta_3\). Because we choose \(\theta_3\) positive it implies an increasing range along the first location coordinate. The range as function of the first coordinate is shown in Figure 5.1.

We have to define the basis functions for \(\log(\tau)\) and \(\log(\kappa)\). Following Lindgren and Rue (2015) we need to write \(\log(\kappa(\mathbf{s}))\) and \(\log(\tau(\mathbf{s}))\) as functions of \(\log(\rho(\mathbf{s}))\) and \(\log(\sigma)\). Substituting \(\log(\sigma)\) and \(\log(\rho(\mathbf{s}))\) into the equations for \(\log(\kappa(\mathbf{s}))\) and \(\log(\tau(\mathbf{s}))\) we have

\[\begin{align}\begin{array}{rcl} \log(\tau(\mathbf{s})) & = & \log(\tau_0) - \theta_1 + \nu\theta_2 + \nu\theta_3 b(\mathbf{s}) \\ \log(\kappa(\mathbf{s})) & = & \log(\kappa_0) - \theta_2 - \theta_3 b(\mathbf{s}) \;. \end{array}\end{align}\]

Therefore we have to include \((s_{i,1}-5)/10\) as a fourth column in `B.tau`

and minus it in `B.kappa`

as well.

### 5.1.2 Implementation of the model

First, we define the \((0,10)\times (0,5)\) rectangle:

The mesh will be created using this polygon, as follows:

This mesh is made of 2196 nodes.

We supply \((s_{i,1} - 5) / 10\) as a fourth column of in the `B.tau`

argument of
the `inla.spde2.matern()`

function, and minus it in the fourth column of
argument `B.kappa`

. In addition, it is also necessary to set a prior
distribution on the vector of hyperparameters \(\theta\) according to its new
dimension, a three length vector. The default is a Gaussian distribution, for
which mean and precision diagonal need to be specified in two vectors, as
follows:

```
nu <- 1
alpha <- nu + 2 / 2
# log(kappa)
logkappa0 <- log(8 * nu) / 2
# log(tau); in two lines to keep code width within range
logtau0 <- (lgamma(nu) - lgamma(alpha) -1 * log(4 * pi)) / 2
logtau0 <- logtau0 - logkappa0
# SPDE model
spde <- inla.spde2.matern(mesh,
B.tau = cbind(logtau0, -1, nu, nu * (mesh$loc[,1] - 5) / 10),
B.kappa = cbind(logkappa0, 0, -1, -1 * (mesh$loc[,1] - 5) / 10),
theta.prior.mean = rep(0, 3),
theta.prior.prec = rep(1, 3))
```

The precision matrices are built with

### 5.1.3 Simulation at the mesh nodes

We can draw one realization of the process using the `inla.qsample()`

function as

The top plot in Figure 5.2 shows the simulated values
projected to a grid Here, the projector matrix has been used to project the
simulated values in the grid limited in the unit square with limits (0, 10) and
(0, 5). The bottom plot in Figure 5.2 shows the spatial
correlation with respect to two points, computed using function
`book.spatial.correlation()`

. The interesting issue to observe here is how
the range at which spatial autocorrelation disappears (i.e., correlation is
zero) is smaller around the point on the left due to the increasing range with
the x-coordinate simulated in the model.

The results and plots in Figure 5.2 have been computed with this code:

```
# Plot parameters
par(mfrow = c(2, 1), mar = c(0, 0, 0, 0))
#Plot Field
proj <- inla.mesh.projector(mesh, xlim = 0:1 * 10, ylim = 0:1 * 5,
dims = c(200, 100))
book.plot.field(sample, projector = proj)
# Compute spatial autocorrelation
cx1y2.5 <- book.spatial.correlation(Q, c(1, 2.5), mesh)
cx7y2.5 <- book.spatial.correlation(Q, c(7, 2.5), mesh)
# Plot spatial autocorrelation
book.plot.field(cx1y2.5, projector = proj, zlim = c(0.1, 1))
book.plot.field(cx7y2.5, projector = proj, zlim = c(0.1, 1),
add = TRUE)
```

### 5.1.4 Estimation with data simulated at the mesh nodes

A model can be easily fitted with the data simulated at mesh nodes. Considering that there are observations exactly at each mesh node, there is no need to use any predictor matrix and the stack functionality. Because data are just realizations from the random field, there is no noise and then the precision of the Gaussian likelihood will be fixed to a very large value. For example, to value \(\exp(20)\):

Because the random field has a zero mean, there are not fixed parameters to fit. Hence, model fitting can be done as follows:

```
formula <- y ~ 0 + f(i, model = spde)
res1 <- inla(formula, control.family = clik,
data = data.frame(y = sample, i = 1:mesh$n))
```

Now, the summary of the posterior marginal distribution for \(\theta\) (joined with the true values) is provided in Table 5.1. The results obtained are quite good and the point estimates are very close to the actual values.

Parameter | True | Mean | St. Dev. | 2.5% quant. | 97.5% quant. |
---|---|---|---|---|---|

\(\theta_1\) | -1 | -0.9745 | 0.0339 | -1.0409 | -0.9072 |

\(\theta_2\) | 0 | 0.0102 | 0.0435 | -0.0752 | 0.0960 |

\(\theta_3\) | 1 | 1.0936 | 0.0607 | 0.9813 | 1.2196 |

### 5.1.5 Estimation with locations not at the mesh nodes

The more general case is when observations are not at the mesh nodes.
For this reason, some in the next example data will be at the locations
simulated with the following `R`

code:

Next, we take the simulated random field at the mesh vertices and project it into these locations. For this, a projector matrix is needed:

Hence, the projection is:

This provides the sample data at the simulated locations.

Now, because these locations are not vertices of the mesh, the stack functionality is required to put all the data together. First, the predictor matrix is needed, but this is the same used to sample the data.

Then, a stack is defined for this sample:

```
stk <- inla.stack(
data = list(y = x),
A = list(projloc$proj$A),
effects = list(data.frame(i = 1:mesh$n)),
tag = 'd')
```

Finally, the model is fitted:

```
res2 <- inla(formula, data = inla.stack.data(stk),
control.family = clik,
control.predictor = list(compute = TRUE, A = inla.stack.A(stk)))
```

The true values and a summary of marginal posterior distributions for \(\theta\) are given in Table 5.2. As in the previous example, estimates are quite close to the actual values of the parameters used in the simulation.

Parameter | True | Mean | St. Dev. | 2.5% quant. | 97.5% quant. |
---|---|---|---|---|---|

\(\theta_1\) | -1 | -0.9440 | 0.0653 | -1.0702 | -0.8134 |

\(\theta_2\) | 0 | -0.1906 | 0.1204 | -0.4219 | 0.0517 |

\(\theta_3\) | 1 | 0.9571 | 0.3182 | 0.3293 | 1.5810 |

Figure 5.3 shows the simulated values, the predicted (posterior mean) and the projected posterior standard deviation. Here, it can be seen how the predicted values are similar to the simulated true values.

## 5.2 The Barrier model

The most common spatial models are the stationary isotropic ones. For these, moving or rotating the map does not change the model. Even though this is a reasonable assumption in many cases, it is not a reasonable assumption if there are physical barriers in the study area. For example, when we model aquatic species near the coast, a stationary model cannot be aware of the coastline, and a new model is needed (one which is aware of the coastline).

In the paper by Bakka et al. (2016) a new non-stationary model was constructed for
use in `INLA`

, with a syntax very similar to the stationary model (to make it
easy to use). In this section we give an example using this model on a part of
the Canadian coastline.

### 5.2.1 Canadian coastline example

The example developed here is based on simulated data using the Canadian coastline as a physical barrier to spatial correlation. This example requires some manipulation of maps and polygons in order to create an appropriate mesh for the model, and we have loaded additional packages for this.

#### 5.2.1.1 Polygon

First, we construct some spatial polygon covering our study area. Here we do
this via the `mapdata`

package (Becker, Wilks, and Brownrigg 2016):

```
# Select region
map <- map("world", "Canada", fill = TRUE,
col = "transparent", plot = FALSE)
IDs <- sapply(strsplit(map$names, ":"), function(x) x[1])
map.sp <- map2SpatialPolygons(
map, IDs = IDs,
proj4string = CRS("+proj=longlat +datum=WGS84"))
```

Next, we define our study area `pl.sel`

, as a manually constructed polygon, and
intersect this with the coastal area. Since we have a polygon for land, we
take the difference instead of an intersection, as follows:

```
pl.sel <- SpatialPolygons(list(Polygons(list(Polygon(
cbind(c(-69, -62.2, -57, -57, -69, -69),
c(47.8, 45.2, 49.2, 52, 52, 48)),
FALSE)), '0')), proj4string = CRS(proj4string(map.sp)))
poly.water <- gDifference(pl.sel, map.sp)
```

The Canadian coastline and the manually constructed polygon can be seen on the left plot in Figure 5.4.

#### 5.2.1.2 Transforming to UTM coordinates

So far we have used longitude and latitude in this example, but this is not a valid coordinate system for spatial modeling, as distances in longitude and distances in latitude are in different scales. Modeling should be done in a CRS (Coordinate Reference System) where units along each axis is measured in e.g. kilometers. We do not want to use meters as this would result in very large values along the axes, and could cause unstable numerical results.

The projection is done with function `spTransform()`

as follows:

#### 5.2.1.3 Simple mesh

Before we construct the mesh we are going to use, we first show how to construct a mesh only in water. We then discuss the difference between the two approaches.

The created mesh has 1106 nodes. This mesh has been plotted in Figure 5.4 (right plot).

Before developing the Barrier model, we used to construct this simple mesh and use the SPDE model here, as this avoids smoothing across land. The problem with this approach, however, is that we introduce other hidden assumptions, called Neumann boundary conditions. As shown by Bakka et al. (2016), this assumption can lead to even worse behavior than the stationary model (which smooths over islands).

#### 5.2.1.4 Mesh over land and sea

Next, we construct a mesh over both water and land. This is the mesh we are going to use. We include the coastline polygon in the function call, to make the edges of the triangles follow the coastline.

```
max.edge = 30
bound.outer = 150
mesh <- inla.mesh.2d(boundary = poly.water,
max.edge = c(1,5) * max.edge,
cutoff = 2,
offset = c(max.edge, bound.outer))
```

Next, we select the triangles of the mesh that are inside `poly.water`

. We
define the triangles in the barrier area (i.e. on land) to be all the triangles
that are not on water. From this we construct the `poly.barrier`

. This should
match the original land polygon closely, but will deviate a little. The new
polygon `poly.barrier`

is where our model assumes there to be land; hence we
use this polygon also for plotting the results.

```
water.tri = inla.over_sp_mesh(poly.water, y = mesh,
type = "centroid", ignore.CRS = TRUE)
num.tri = length(mesh$graph$tv[, 1])
barrier.tri = setdiff(1:num.tri, water.tri)
poly.barrier = inla.barrier.polygon(mesh,
barrier.triangles = barrier.tri)
```

The mesh in Figure 5.5 is different from the simple mesh in Figure 5.4, as we have also defined the mesh over land. We can use this mesh to construct both a stationary model and a Barrier model.

#### 5.2.1.5 The Barrier model vs. the stationary model

Next we define the precision matrix for the Barrier model and the stationary model

\[ \mathbf{u} \sim \mathcal N(0, Q^{-1}) \] considering unit marginal variance:

```
range <- 200
barrier.model <- inla.barrier.pcmatern(mesh,
barrier.triangles = barrier.tri)
Q <- inla.rgeneric.q(barrier.model, "Q", theta = c(0, log(range)))
```

This code spits out a warning because the priors are not defined. However, as we are simulating from the GF, the priors for the hyperparameters are irrelevant.

```
stationary.model <- inla.spde2.pcmatern(mesh,
prior.range = c(1, 0.1), prior.sigma = c(1, 0.1))
Q.stat <- inla.spde2.precision(stationary.model,
theta = c(log(range), 0))
```

The priors we chose here are arbitrary as they are not used. Note that the
ordering of `theta`

is reversed between the two models.

We use the functions from `spde-book-functions.R`

to compute spatial
correlation surfaces; see Figure 5.6.

```
# The location we find the correlation with respect to
loc.corr <- c(500, 5420)
corr <- book.spatial.correlation(Q, loc = loc.corr, mesh)
corr.stat <- book.spatial.correlation(Q.stat, loc = loc.corr,
mesh)
```

#### 5.2.1.6 Sample from the Barrier model

Next, we sample locations from the water polygon using `spsample()`

. And we
simplify the data structure `loc.data`

from `SpatialPoints`

to just a matrix of
values.

```
set.seed(201805)
loc.data <- spsample(poly.water, n = 1000, type = "random")
loc.data <- loc.data@coords
```

We sample the continuous field \(\mathbf{u}\) and project it to data locations:

```
# Seed is the month the code was first written times some number
u <- inla.qsample(n = 1, Q = Q, seed = 201805 * 3)[, 1]
A.data <- inla.spde.make.A(mesh, loc.data)
u.data <- A.data %*% u
# df is the dataframe used for modeling
df <- data.frame(loc.data)
names(df) <- c('locx', 'locy')
# Size of the spatial signal
sigma.u <- 1
# Size of the measurement noise
sigma.epsilon <- 0.1
df$y <- drop(sigma.u * u.data + sigma.epsilon * rnorm(nrow(df)))
```

#### 5.2.1.7 Inference with the Barrier model

Similarly as with previous models, we construct the typical stack object to prepare the data for model fitting:

```
stk <- inla.stack(
data = list(y = df$y),
A = list(A.data, 1),
effects =list(s = 1:mesh$n, intercept = rep(1, nrow(df))),
tag = 'est')
```

The formula for the model only takes an intercept plus the spatial effect:

Finally, we run `INLA`

with a Gaussian likelihood:

#### 5.2.1.8 Posterior spatial field

In Figure 5.7 we compare the results from the Barrier model to the true (simulated) spatial field. These have a close match. We observe quick changes across islands or small peninsulas, which may be smoothed out by a stationary model. See Bakka et al. (2016) for a more detailed discussion.

#### 5.2.1.9 Posterior of hyperparameters

The summary of the posterior marginals of the model hyperparameters is:

```
res.barrier$summary.hyperpar
## mean sd
## Precision for the Gaussian observations 95.016 6.2467
## Theta1 for s 0.169 0.1583
## Theta2 for s 5.516 0.1723
## 0.025quant 0.5quant
## Precision for the Gaussian observations 83.2403 94.8421
## Theta1 for s -0.1059 0.1552
## Theta2 for s 5.2172 5.5010
## 0.975quant mode
## Precision for the Gaussian observations 107.8314 94.538
## Theta1 for s 0.5109 0.104
## Theta2 for s 5.8884 5.445
```

To summarize or plot the hyperparameters in the Barrier model, we must know the order they come in and the transformations:

\[\begin{align} \sigma &= e^ {\theta_1} \quad \text{is the marginal standard deviation} \\ r &= e^ {\theta_2} \quad \text{is the spatial range} \end{align}\]

From this we compute Table 5.3. We see that our estimates and intervals recover the true values.

Parameter | True | 50% quant. | 2.5% quant. | 97.5% quant. |
---|---|---|---|---|

\(\sigma\) | 1 | 1.168 | 0.8995 | 1.667 |

\(range\) | 200 | 244.941 | 184.4189 | 360.811 |

## 5.3 Barrier model for noise data in Albacete (Spain)

Barrier models can be appropriate tools to model different types of environmental phenomena that have a clear anisotropic behavior. A clear example is the propagation of noise in cities, where buildings act as barriers in the propagation of noise. In this section, spatial and spatio-temporal models are developed using noise data from the city of Albacete (Castilla-La Mancha, Spain).

Data has been extracted from a report commissioned by the local authorities due to increasing concern about noise in the city and, in particular, in certain areas of the city center. The analysis will focus on a busy area in the city center (popularly know as “La zona”, i.e., “The zone”) with plenty of bars and restaurants. The original reports are available (as PDF documents) from website http://www.pioneraconsultores.com/es/mapa-de-ruidos.zhtm?corp=medioambiente, which contains the original data.

Measurements were taken in March 2010 for a period of 24 hours at different points throughout the city. In this analysis, hourly measurements of sound pressure levels (in A-weighted decibels, or dbA) at 7 different points within the study region will be considered. Local regulations require noise levels to be below 65 db during the day and below 60 db at night in residential areas. Hence, the aim of this example is to study how noise levels vary within the city center to get an idea of whether local regulations are fulfilled.

The data used in this example is provided as a
`SpatialPointsDataFrame`

in file `noise.RData`

. Table 5.4
summarizes the main variables in the dataset.

Variable | Description |
---|---|

`X` |
x-coordinate (in UTM). |

`Y` |
y-coordinate (in UTM). |

`LAeqZZh` |
Hourly sound pressure level (in dbA) between `ZZ` and `ZZ + 1` hours. `ZZ` is an integer from 1 to 24. |

#### 5.3.0.1 Data loading

Data about the locations of the buildings, which will act as barriers, will be
obtained from OpenStreetMap (OSM, OpenStreetMap contributors 2018) using package `osmar`

(Eugster and Schlesinger 2013). This package provides a simple way to download OSM data and
extract the relevant information required to build the desired models.

In the next lines of `R`

code, function `osmsource_api()`

sets up access to
the OSM API. Next, function `center_bbox`

is used to set the squared region
where the data is extracted from. The first two arguments are the coordinates
of the center of the rectangle, and the next two its width and height.
Finally, function `get_osm()`

gets the data from the defined bounding box using
the OSM API.

```
#Get building data from Albacete using OpenStreetMap
library(osmar)
src <- osmsource_api()
bb <- center_bbox(-1.853152, 38.993318, 400, 400)
ua <- get_osm(bb, source = src)
```

OSM data provides a wealth of information about street data in variable `ua`

.
Since our focus is on modeling noise using buildings as barriers, the
information about buildings will be extracted. Functions `find()`

and
`find_down()`

are used below to obtain an index of all features in the OSM data
in `ua`

tagged as `building`

. Then `subset()`

is used to select the elements in
`ua`

that are in the index of buildings. Finally, function `as_sp`

converts the
boundaries of the buildings (as associated information) into a
`SpatialPolygonsDataFrame`

.

```
idx <- find(ua, way(tags(k == "building")))
idx <- find_down(ua, way(idx))
bg <- subset(ua, ids = idx)
bg_poly <- as_sp(bg, "polygons")
```

The newly created `SpatialPolygonsDataFrame`

object stores different buildings
as separate entities. Hence, we will use function `unionSpatialPolygons()`

to
create a single entity, which will be used later to obtain the boundaries of
the streets.

Next, the actual study area is defined by using function `center_bbox()`

again, this time with a smaller width and height, to make sure that all
buildings in the study area have been retrieved before. Finally,
a `SpatialPolygons`

object is created and the intersection between
the boundary of the study area and the polygons of the buildings is
created using function `gIntersection`

from the `rgeos`

package (R. Bivand and Rundel 2017).

```
#Outer boundary to overlay with buildings
bb.outer <- center_bbox(-1.853152, 38.993318, 350, 350)
pl <- matrix(c(bb.outer[1:2], bb.outer[c(3, 2)], bb.outer[3:4],
bb.outer[c(1, 4)]), ncol = 2, byrow = TRUE)
pl_sp <- SpatialPolygons(
list(Polygons(list(Polygon(pl)), ID = 1)),
proj4string = CRS(proj4string(bg_poly)))
library(rgeos)
bg_poly2 <- gIntersection(bg_poly, pl_sp, byid = TRUE)
```

Coordinates of the buildings are in longitude and latitude, which is not
an adequate coordinate reference system to deal with distances. For this
reason, coordinates will be converted into UTM, so that distances
are measured in meters. For this, function `spTransform()`

will be
used:

```
## Transform data
library(rgdal)
pl_sp.utm30 <- spTransform(pl_sp,
CRS("+proj=utm +zone=30 +ellps=GRS80 +units=m +no_defs"))
bg_poly2.utm30 <- spTransform(bg_poly2,
CRS("+proj=utm +zone=30 +ellps=GRS80 +units=m +no_defs"))
```

The last part of the building data processing involves obtaining not the polygons of the buildings, but the polygons of the streets, which is where the spatial process takes place. This will be done by taking the bounding box of the study region and removing the buildings, as follows:

Figure 5.8 shows the buildings (left plot) and the streets (right plot) that will be used in the analysis.

Noise data is provided in a `data.frame`

in file `noise.RData`

, and it
can be loaded as:

This dataset contains data from 7 locations where noise has been measured. This includes 5 outdoors locations plus 2 indoors locations. Although noise may be attenuated at these 2 locations they have been kept in the dataset to have more observations in the analysis.

#### 5.3.0.2 Mesh building

The mesh will be built similarly to that in Section 5.2.1. In this case, distances are measured in meters. The maximum edges of the triangles will be 5 meters and the outer boundary will be 10 meters:

```
max.edge = 5
bound.outer = 10
mesh <- inla.mesh.2d(boundary = pl_sp.utm30,
max.edge = c(1, 1.5) * max.edge,
cutoff = 10,
offset = c(max.edge, bound.outer))
```

The next lines of `R`

code will use function `inla.over_sp_mesh()`

(to
check which mesh triangles are inside a polygon) and call function
`inla.barrier.polygon()`

to obtain the polygon around the barrier. Figure
5.9 shows the mesh created, together with the streets and the
polygon around the barrier obtained with `inla.barrier.polygon()`

(in red).

```
city.tri = inla.over_sp_mesh(bg_poly2.utm30, y = mesh,
type = "centroid", ignore.CRS = TRUE)
num.tri <- length(mesh$graph$tv[, 1])
barrier.tri <- setdiff(1:num.tri, city.tri)
poly.barrier <- inla.barrier.polygon(mesh,
barrier.triangles = barrier.tri)
```

#### 5.3.0.3 Barrier model for noise data

Before fitting the actual barrier model to the noise data, we will explore how
the spatial correlation in the barrier model is defined. First of all, a model
with range 100 and precision 1 is simulated. Here, function
`inla.barrier.pcmatern()`

is used to define a barrier model using the
`rgeneric`

approach in `INLA`

to define latent models; see
`inla.doc("rgeneric")`

for details. Furthermore, `inla.rgeneric.q()`

returns
the precision matrix of the model for some values of the hyperparameters, i.e.,
precision and range (in the log scale).

```
range <- 100
prec <- 1
barrier.model <- inla.barrier.pcmatern(mesh,
barrier.triangles = barrier.tri)
Q <- inla.rgeneric.q(barrier.model, "Q",
theta = c(log(prec), log(range)))
```

In order to compare the Barrier model, a stationary model is defined below.
Function `inla.spde2.pcmatern()`

defines a stationary model using a PC prior
for the standard deviation and range, and function `inla.spde2.precision()`

returns the precision matrix of the model.

```
stationary.model <- inla.spde2.pcmatern(mesh,
prior.range = c(100, 0.9), prior.sigma = c(1, 0.1))
Q.stat <- inla.spde2.precision(stationary.model,
theta = c(log(range), 0))
```

Then, the correlation field at point `(599318.3, 4316661)`

is computed
for the Barrier and stationary models:

```
# The location we find the correlation with respect to
loc.corr <- c(599318.3, 4316661)
corr <- book.spatial.correlation(Q, loc = loc.corr, mesh)
corr.stat = book.spatial.correlation(Q.stat, loc = loc.corr, mesh)
```

These are shown in Figure 5.10. Note how the spatial correlation for the stationary model (right plot in Figure 5.10) seems to ignore buildings and only considers Euclidean distance between two points, while the spatial autocorrelation for the Barrier model (left plot in Figure 5.10) does take buildings into account.

The first Barrier model will consider the spatial variation of noise
at 1 am, which is measured by variable `LAeq1h`

. The projector matrix
and stack required to fit the model can be obtained as follows:

```
A.data <- inla.spde.make.A(mesh, coordinates(noise))
stk <- inla.stack(
data = list(y = noise$LAeq1h),
A = list(A.data, 1),
effects =list(s = 1:mesh$n, intercept = rep(1, nrow(noise))),
tag = 'est')
```

Similarly, a new stack will be created for prediction at the mesh nodes:

```
# Projector matrix at prediction points
A.pred <- inla.spde.make.A(mesh, mesh$loc[, 1:2])
#Stack for prediction at mesh nodes
stk.pred <- inla.stack(
data = list(y = NA),
A = list(A.pred, 1),
effects =list(s = 1:mesh$n, intercept = rep(1, nrow(A.pred))),
tag = 'pred')
# Joint stack for model fitting and prediction
joint.stk <- inla.stack(stk, stk.pred)
```

Next, the formula to fit the model is:

Then, the model is fitted using a PC prior on the standard deviation parameter of the likelihood:

```
# PC-prior for st. dev.
stdev.pcprior <- list(prior = "pc.prec", param = c(2, 0.01))
# Model fitting
res.barrier <- inla(form.barrier,
data = inla.stack.data(joint.stk),
control.predictor = list(A = inla.stack.A(joint.stk),
compute = TRUE),
family = 'gaussian',
control.inla = list(int.strategy = "eb"),
control.family = list(hyper = list(prec = stdev.pcprior)),
control.mode = list(theta = c(1.647, 1.193, 3.975)))
```

The summary of the fitted model is shown below. Figure 5.11 shows estimates of the posterior mean and 97.5% quantile of the noise. As can be seen, it is clear that noise level at 1 am seems to be very close to or higher than the cut off of 65 dB.

```
summary(res.barrier)
##
## Call:
## c("inla(formula = form.barrier, family = \"gaussian\",
## data = inla.stack.data(joint.stk), ", "
## control.predictor = list(A = inla.stack.A(joint.stk),
## compute = TRUE), ", " control.family = list(hyper =
## list(prec = stdev.pcprior)), ", " control.inla =
## list(int.strategy = \"eb\"), control.mode = list(theta
## = c(1.647, ", " 1.193, 3.975)))")
## Time used:
## Pre = 2.61, Running = 313, Post = 2.56, Total = 318
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## intercept 64.98 1.505 62.03 64.98 67.93 64.98 0
##
## Random effects:
## Name Model
## s RGeneric2
##
## Model hyperparameters:
## mean sd 0.025quant
## Precision for the Gaussian observations 8.08 8.735 1.076
## Theta1 for s 1.21 0.210 0.802
## Theta2 for s 3.98 0.396 3.212
## 0.5quant 0.975quant
## Precision for the Gaussian observations 5.48 30.82
## Theta1 for s 1.20 1.63
## Theta2 for s 3.98 4.77
## mode
## Precision for the Gaussian observations 2.69
## Theta1 for s 1.19
## Theta2 for s 3.98
##
## Expected number of effective parameters(stdev): 6.93(0.00)
## Number of equivalent replicates : 1.01
##
## Marginal log-Likelihood: -30.16
## Posterior marginals for the linear predictor and
## the fitted values are computed
```

#### 5.3.0.4 Space-time model

Space-time models are fully described in Chapter 7 and Chapter 8, but we have included here a simple example using the noise data. Given that information is available for 24 hours, it is possible to fit a space-time model to the noise data. First, a projector matrix is defined for the measurement points for the 24 hours:

The `INLA`

stack is now created by adding variables `LAeq1h`

, … `LAeq24h`

together with a `time`

identifier (from 1 to 24 hours):

```
stk.st <- inla.stack(
data = list(y = unlist(noise@data[, 2 + 1:24])),
A = list(A.st, 1, 1),
effects = list(s = 1:mesh$n,
intercept = rep(1, 24 * nrow(noise)),
time = rep(1:24, each = nrow(noise) )))
```

The model to be fitted is similar to the previous one, but now a cyclic random walk of order 1 is added as a separable time effect:

```
# Model formula
form.barrier.st <- y ~ 0 + intercept +
f(s, model = barrier.model) +
f(time, model = "rw1", cyclic = TRUE, scale.model = TRUE,
hyper = list(theta = stdev.pcprior))
```

Note that a PC prior has been used on the standard deviation parameter of the random walk effect. Then the model is fitted similarly as before:

```
res.barrier.st <- inla(form.barrier.st,
data = inla.stack.data(stk.st),
control.predictor = list(A = inla.stack.A(stk.st)),
family = 'gaussian',
control.inla = list(int.strategy = "eb"),
control.family = list(hyper = list(prec = stdev.pcprior)),
control.mode = list(theta = c(-1.883, 1.123, 3.995, -0.837)))
```

A summary of the fitted model can be seen below. Figure 5.12 shows noise estimates at 6, 12, 18 and 24 hours.

```
summary(res.barrier.st)
##
## Call:
## c("inla(formula = form.barrier.st, family =
## \"gaussian\", data = inla.stack.data(stk.st), ", "
## control.predictor = list(A = inla.stack.A(stk.st)),
## control.family = list(hyper = list(prec =
## stdev.pcprior)), ", " control.inla = list(int.strategy
## = \"eb\"), control.mode = list(theta = c(-1.883, ", "
## 1.123, 3.995, -0.837)))")
## Time used:
## Pre = 3.37, Running = 460, Post = 0.317, Total = 464
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode
## intercept -0.001 31.62 -62.08 -0.001 62.03 -0.001
## kld
## intercept 0
##
## Random effects:
## Name Model
## s RGeneric2
## time RW1 model
##
## Model hyperparameters:
## mean sd 0.025quant
## Precision for the Gaussian observations 0.152 0.017 0.120
## Theta1 for s 1.137 0.253 0.647
## Theta2 for s 4.198 0.740 2.853
## Precision for time 0.536 0.326 0.163
## 0.5quant 0.975quant
## Precision for the Gaussian observations 0.151 0.188
## Theta1 for s 1.134 1.641
## Theta2 for s 4.156 5.747
## Precision for time 0.455 1.386
## mode
## Precision for the Gaussian observations 0.150
## Theta1 for s 1.123
## Theta2 for s 4.002
## Precision for time 0.336
##
## Expected number of effective parameters(stdev): 17.53(0.00)
## Number of equivalent replicates : 9.59
##
## Marginal log-Likelihood: -443.29
```

### References

Ingebrigtsen, R., F. Lindgren, and I. Steinsland. 2014. “Spatial Models with Explanatory Variables in the Dependence Structure.” *Spatial Statistics* 8: 20–38.

Lindgren, F., H. Rue, and J. Lindström. 2011. “An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach (with Discussion).” *J. R. Statist. Soc. B* 73 (4): 423–98.

Lindgren, F., and H. Rue. 2015. “Bayesian Spatial and Spatio-Temporal Modelling with R-INLA.” *Journal of Statistical Software* 63 (19).

Bakka, H., J. Vanhatalo, J. Illian, D. Simpson, and H. Rue. 2016. “Non-stationary Gaussian models with physical barriers.” *ArXiv E-Prints*, August.

Becker, R. A., A. R. Wilks, and R. Brownrigg. 2016. *Mapdata: Extra Map Databases*. https://CRAN.R-project.org/package=mapdata.

OpenStreetMap contributors. 2018. “Planet dump retrieved from https://planet.osm.org.” https://www.openstreetmap.org.

Eugster, Manuel J. A., and Thomas Schlesinger. 2013. “Osmar: OpenStreetMap and R.” *The R Journal* 5 (1): 53–63. http://journal.r-project.org/archive/2013/RJ-2013-005/index.html.

Bivand, R., and C. Rundel. 2017. *Rgeos: Interface to Geometry Engine - Open Source (’Geos’)*. https://CRAN.R-project.org/package=rgeos.