In this tutorial, we show how to use the API SPDE. This is a free transcription in R of the python note book tuto_SPDE.
Lindgren et al. (2011) defined an explicit link between Gaussian fields and Gaussian markov random fields. This approach is implemented in the SPDE module of \(gstlearn\). The random field is a weak solution of some stochastic partial differential equation solved using the finite element method. It follows that the standard tools of Geostatistics, kriging and stochastic simulation, can be rewritten using sparse linear algebra.
The SPDE model represents the domain by a mesh and random field by its values for each mesh cell \(\mathbf{Z} \sim \mathcal{N}(\mathbf{0,Q^{-1}})\) where the precision matrix \(\mathbf{Q = \Sigma^{-1}}\) is sparse.
The latent vector \(\mathbf{Z}\) can be linearly interpolated to:
the target locations \(\mathbf{Y_T = A_g \, Z}\), or
the observation locations \(\mathbf{Y = A_d \, Z + \tau W}\).
In the latter case, \(\tau\) is the standard deviation of the noise modelling the error of observation and the modelling error. The number of observations is \(p\).
In this case, the precision matrix of the vector \(\mathbf{(Z, Y_D)}\) is: \[ \mathbf{ \tilde{Q} = \tilde{\Sigma}^{-1}= \begin{bmatrix}\mathbf{Q+\tau^{-2}A_d^T A_d} & \mathbf{-\tau^{-2}A_d^T} \\ \mathbf{-\tau^{-2}A_d} & \mathbf{\tau^{-2}I_p} \end{bmatrix} } \] From this expression the kriging and the conditional simulation can be derived:
the Kriging of \(\mathbf{Z}\) is \(E\{\mathbf{Z|Y = y}\} = \tau^{-2}\mathbf{(Q + \tau^{-2}A_d^TA_d)^{-1}A_d^T y}\)
the conditional variance is \(Cov\{\mathbf{Z|Y = y}\} = \mathbf{(Q + \tau^{-2}A_d^TA_d)^{-1}}\)
the non conditional simulation is \(\mathbf{(S, S_D)\sim \mathcal{N}(0, \tilde{Q}^{-1})}\)
the conditional simulation of the latent vector is \(\mathbf{S_{|Y=y} = S + \tau^{-2} (Q+ \tau^{-2}A_d^TA_d)^{-1} A_d^T(y - S_D)}\)
The estimated or simulated latent vector can be linearly interpolated to any target grid.
The latent vector (of length \(n\)) \(\mathbf{Z \sim \mathcal{N}(0, Q^{-1})}\) is defined on the meshing by its sparse precision matrix \(\mathbf{Q = \Sigma^{-1}}\). \(\mathbf{Q}\) is factorized by the Cholesky method: \(\mathbf{Q = L\, L^{T}}\).
Thus, the Gaussian vector can be rewritten \(\mathbf{Z = (L^T)^{-1} \, U}\) with \(\mathbf{U \sim \mathcal{N}(0, I_n)}\).
Finally the Gaussian vector collecting the values of the random field at the grid nodes \(Y\) is achieved by the interpolation of \(\mathbf{Z}\) on the mesh \(\mathbf{Y_g = A_{g} \, Z}\).
To compute a non conditional simulation on the grid,
# Data
ndat = 100
# Model
rangev = 0.2
sill = 1.
# parameters of the final grid on [0,1]^2
nx = 200 * c(1,1)
dx = 1/(nx-1)
x0 = c(0,0)
# Types of statistics
opers = EStatOption_fromKeys(c("NUM", "MINI", "MAXI", "MEAN", "STDV"))
grid = DbGrid_create(nx,dx,x0)
We define a Matern’s stationary covariance function.
model = Model_createFromParam(ECov_MATERN(), param=1,range=rangev,sill=sill)
We define the observations from the initial non conditional simulation.
dat = Db_create()
err = dat$setColumn(tab = runif(n=ndat), name = "x")
err = dat$setColumn(tab = runif(n=ndat), name = "y")
err = dat$setLocators(names = c("x", "y"), locatorType = ELoc_X(), cleanSameLocator = TRUE)
We perform a quick non conditional simulation on the data location, which will serve for conditioning data set.
err = dat$deleteColumns(names = "Z")
err = simulateSPDE(dbin = NULL, dbout = dat, model = model,
namconv = NamingConvention("Z"))
Plotting the data set
p = plot.init(asp=1) +
plot.symbol(dat, nameSize = "Z", col = "red") +
plot.decoration(xlab = "Easting", ylab = "Northing", title = "Data Set")
plot.end(p)
All traditional methods require a neighborhood definition: it will be defined as a Unique neighborhood.
neighU = NeighUnique()
Kriging is computed using the traditional method.
err = grid$deleteColumns(names = "standardK.*")
err = kriging(dbin = dat, dbout = grid, model = model, neigh = neighU,
namconv = NamingConvention("standardK"))
Plotting the results
p1 = plot.init(asp=1) + plot.raster(grid, name="standardK.Z.estim", palette = "Spectral") +
plot.symbol(dat, col = "red") + plot.decoration(title = "Estimation")
p2 = plot.init(asp=1) + plot.raster(grid, name="standardK.Z.stdev", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "St. dev.")
ggarrange(p1, p2, ncol = 2, nrow = 2)
The conditional simulation is performed using the traditional technique (turning bands)
err = simtub(dbin = dat, dbout = grid, model = model, neigh = neighU, nbsimu = 4,
namconv = NamingConvention("standardCD"))
Plotting the conditional simulations
p1 = plot.init(asp=1) + plot.raster(grid, name="standardCD.Z.1", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD1")
p2 = plot.init(asp=1) + plot.raster(grid, name="standardCD.Z.2", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD2")
p3 = plot.init(asp=1) + plot.raster(grid, name="standardCD.Z.3", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD3")
p4 = plot.init(asp=1) + plot.raster(grid, name="standardCD.Z.4", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD4")
ggarrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
err = grid$deleteColumns(names = "spdeNC.*")
err = simulateSPDE(dbin = NULL, dbout = grid, model = model, nbsimu=4,
namconv = NamingConvention("spdeNC"))
Plotting the non conditional simulations
p1 = plot.init(asp=1) + plot.raster(grid, name="spdeNC.1", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation NC1")
p2 = plot.init(asp=1) + plot.raster(grid, name="spdeNC.2", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation NC2")
p3 = plot.init(asp=1) + plot.raster(grid, name="spdeNC.3", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation NC3")
p4 = plot.init(asp=1) + plot.raster(grid, name="spdeNC.4", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation NC4")
ggarrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
err = grid$deleteColumns(names = "spdeCD.*")
err = simulateSPDE(dbin = dat, dbout = grid, model = model, nbsimu=4,
namconv = NamingConvention("spdeCD"))
Plotting the conditional simulations
p1 = plot.init(asp=1) + plot.raster(grid, name="spdeCD.Z.1", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD1")
p2 = plot.init(asp=1) + plot.raster(grid, name="spdeCD.Z.2", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD2")
p3 = plot.init(asp=1) + plot.raster(grid, name="spdeCD.Z.3", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD3")
p4 = plot.init(asp=1) + plot.raster(grid, name="spdeCD.Z.4", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD4")
ggarrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
err = grid$deleteColumns(names = c("spdeK*.Z.*"))
err = krigingSPDE(dbin = dat, dbout = grid, model = model,
flag_est = TRUE, flag_std = TRUE,
namconv = NamingConvention("spdeK"))
p1 = plot.init(asp=1) + plot.raster(grid, name="spdeK.Z.estim", palette = "Spectral") +
plot.symbol(dat, col = "red") + plot.decoration(title = "Estimation")
p2 = plot.init(asp=1) + plot.raster(grid, name="spdeK.Z.stdev", palette = "Spectral") +
plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "St. dev.")
ggarrange(p1, p2, ncol = 2, nrow = 2)
Correlation between traditional and SPDE method for kriging estimation values
p = plot.init() +
plot.correlation(db1 = grid,
namex = "standardK.Z.estim",
namey = "spdeK.Z.estim", asPoint = FALSE, bins=100,
flagDiag = TRUE, diagColor = "red", flagSameAxes = FALSE) +
plot.decoration(xlab = "Traditional", ylab = "SPDE",
title = "Comparison for Kriging")
plot.end(p)
Correlation between traditional and SPDE method for kriging Standard deviation values
p = plot.init() +
plot.correlation(db1 = grid,
namex = "standardK.Z.stdev",
namey = "spdeK.Z.stdev", asPoint = FALSE, bins=100,
flagDiag = TRUE, diagColor = "red", flagSameAxes = FALSE) +
plot.decoration(xlab = "Traditional", ylab = "SPDE",
title = "Comparison for St. Dev.")
plot.end(p)
Comparative statistics
# statistics
knitr::kable(dbStatisticsMono(db = grid, names = "*.estim", opers = opers)$toTL(),
digits = 3, caption = "Estimations")
Number | Minimum | Maximum | Mean | St. Dev. | |
---|---|---|---|---|---|
standardK.Z.estim | 40000 | -2.670 | 1.845 | -0.105 | 0.717 |
spdeK.Z.estim | 40000 | -2.757 | 1.851 | -0.107 | 0.719 |
knitr::kable(dbStatisticsMono(db = grid, names = "*.stdev", opers = opers)$toTL(),
digits = 3, caption = "St. dev. of Estimation error")
Number | Minimum | Maximum | Mean | St. Dev. | |
---|---|---|---|---|---|
standardK.Z.stdev | 40000 | 0.005 | 0.997 | 0.651 | 0.195 |
spdeK.Z.stdev | 40000 | 0.067 | 1.040 | 0.640 | 0.197 |
Produce some auxiliary statistics for simulations
knitr::kable(dbStatisticsMono(db = grid, names = "*NC.*", opers = opers)$toTL(),
digits = 3, caption = "Non conditional simulations")
Number | Minimum | Maximum | Mean | St. Dev. | |
---|---|---|---|---|---|
spdeNC.1 | 40000 | -3.291 | 2.787 | 0.052 | 0.997 |
spdeNC.2 | 40000 | -4.367 | 3.226 | -0.165 | 0.963 |
spdeNC.3 | 40000 | -3.488 | 3.529 | -0.013 | 1.049 |
spdeNC.4 | 40000 | -3.285 | 3.346 | -0.050 | 1.007 |
knitr::kable(dbStatisticsMono(db = grid, names = "*CD.*", opers = opers)$toTL(),
digits = 3, caption = "Conditional simulations")
Number | Minimum | Maximum | Mean | St. Dev. | |
---|---|---|---|---|---|
standardCD.Z.1 | 40000 | -3.282 | 2.692 | -0.084 | 0.970 |
standardCD.Z.2 | 40000 | -3.272 | 2.716 | -0.077 | 0.962 |
standardCD.Z.3 | 40000 | -3.390 | 2.709 | -0.120 | 0.952 |
standardCD.Z.4 | 40000 | -3.157 | 2.807 | -0.006 | 0.963 |
spdeCD.Z.1 | 40000 | -3.567 | 2.575 | -0.206 | 1.019 |
spdeCD.Z.2 | 40000 | -3.374 | 3.312 | -0.080 | 1.010 |
spdeCD.Z.3 | 40000 | -3.506 | 2.965 | -0.089 | 0.886 |
spdeCD.Z.4 | 40000 | -4.082 | 2.496 | -0.243 | 1.043 |
Lindgren, F., Rue, H., and Lindström, J. (2011). An explicit link between gaussian fields and gaussian markov random fields: the spde approach (with discussion). JR 671 Stat Soc, Series B, 73:423–498.
Pereira, M., Desassis, N., & Allard, D. (2022). Geostatistics for Large Datasets on Riemannian Manifolds: A Matrix-Free Approach. Journal of Data Science, 20(4), 512-532. doi:10.6339/22-JDS1075