Introduction

In this tutorial, we show how to use the API SPDE. This is a free transcription in R of the python note book tuto_SPDE.

SPDE approach

Lindgren et al. (2011) defined an explicit link between Gaussian fields and Gaussian markov random fields. This approach is implemented in the SPDE module of \(gstlearn\). The random field is a weak solution of some stochastic partial differential equation solved using the finite element method. It follows that the standard tools of Geostatistics, kriging and stochastic simulation, can be rewritten using sparse linear algebra.

The SPDE model represents the domain by a mesh and random field by its values for each mesh cell \(\mathbf{Z} \sim \mathcal{N}(\mathbf{0,Q^{-1}})\) where the precision matrix \(\mathbf{Q = \Sigma^{-1}}\) is sparse.

The latent vector \(\mathbf{Z}\) can be linearly interpolated to:

the target locations \(\mathbf{Y_T = A_g \, Z}\), or
the observation locations \(\mathbf{Y = A_d \, Z + \tau W}\).

In the latter case, \(\tau\) is the standard deviation of the noise modelling the error of observation and the modelling error. The number of observations is \(p\).

In this case, the precision matrix of the vector \(\mathbf{(Z, Y_D)}\) is: \[ \mathbf{ \tilde{Q} = \tilde{\Sigma}^{-1}= \begin{bmatrix}\mathbf{Q+\tau^{-2}A_d^T A_d} & \mathbf{-\tau^{-2}A_d^T} \\ \mathbf{-\tau^{-2}A_d} & \mathbf{\tau^{-2}I_p} \end{bmatrix} } \] From this expression the kriging and the conditional simulation can be derived:

the Kriging of \(\mathbf{Z}\) is \(E\{\mathbf{Z|Y = y}\} = \tau^{-2}\mathbf{(Q + \tau^{-2}A_d^TA_d)^{-1}A_d^T y}\)
the conditional variance is \(Cov\{\mathbf{Z|Y = y}\} = \mathbf{(Q + \tau^{-2}A_d^TA_d)^{-1}}\)
the non conditional simulation is \(\mathbf{(S, S_D)\sim \mathcal{N}(0, \tilde{Q}^{-1})}\)
the conditional simulation of the latent vector is \(\mathbf{S_{|Y=y} = S + \tau^{-2} (Q+ \tau^{-2}A_d^TA_d)^{-1} A_d^T(y - S_D)}\)

The estimated or simulated latent vector can be linearly interpolated to any target grid.

The latent vector (of length \(n\)) \(\mathbf{Z \sim \mathcal{N}(0, Q^{-1})}\) is defined on the meshing by its sparse precision matrix \(\mathbf{Q = \Sigma^{-1}}\). \(\mathbf{Q}\) is factorized by the Cholesky method: \(\mathbf{Q = L\, L^{T}}\).

Thus, the Gaussian vector can be rewritten \(\mathbf{Z = (L^T)^{-1} \, U}\) with \(\mathbf{U \sim \mathcal{N}(0, I_n)}\).

Finally the Gaussian vector collecting the values of the random field at the grid nodes \(Y\) is achieved by the interpolation of \(\mathbf{Z}\) on the mesh \(\mathbf{Y_g = A_{g} \, Z}\).

To compute a non conditional simulation on the grid,

Compute the projection matrix of the latent vector \(\mathbf{Z}\) to the grid/
Compute \(\mathbf{L}\) the Cholesky decomposition of the sparse precision matrix \(\mathbf{Q}\)
Compute the normal Gaussian vector \(\mathbf{U \sim N(0, I_n)}\),
Compute the latent vector \(\mathbf{Z}\) solving the sparse linear system \(\mathbf{L^{T} \, Z = U}\),
Compute the interpolation of the latent vector to the target grid \(\mathbf{Y_g = A_{g} \, Z}\),

Initialization of gstlearn

Parameters of the SPDE model

# Data
ndat = 100

# Model
rangev = 0.2
sill = 1.

# parameters of the final grid on [0,1]^2
nx = 200 * c(1,1)
dx = 1/(nx-1)
x0 = c(0,0)

# Types of statistics
opers = EStatOption_fromKeys(c("NUM", "MINI", "MAXI", "MEAN", "STDV"))

Grid definition

grid = DbGrid_create(nx,dx,x0)

Model definition

We define a Matern’s stationary covariance function.

model = Model_createFromParam(ECov_MATERN(), param=1,range=rangev,sill=sill)

Observations of the process

We define the observations from the initial non conditional simulation.

dat = Db_create()
err = dat$setColumn(tab = runif(n=ndat), name = "x")
err = dat$setColumn(tab = runif(n=ndat), name = "y")
err = dat$setLocators(names = c("x", "y"), locatorType = ELoc_X(), cleanSameLocator = TRUE)

We perform a quick non conditional simulation on the data location, which will serve for conditioning data set.

err = dat$deleteColumns(names = "Z")
err = simulateSPDE(dbin = NULL, dbout = dat, model = model, 
                   namconv = NamingConvention("Z"))

Plotting the data set

p = plot.init(asp=1) + 
    plot.symbol(dat, nameSize = "Z", col = "red") +
    plot.decoration(xlab = "Easting", ylab = "Northing", title = "Data Set")
plot.end(p)

Using traditional methods

All traditional methods require a neighborhood definition: it will be defined as a Unique neighborhood.

neighU = NeighUnique()

Kriging

Kriging is computed using the traditional method.

err = grid$deleteColumns(names = "standardK.*")
err = kriging(dbin = dat, dbout = grid, model = model, neigh = neighU,
              namconv = NamingConvention("standardK"))

Plotting the results

p1 = plot.init(asp=1) + plot.raster(grid, name="standardK.Z.estim", palette = "Spectral") +
        plot.symbol(dat, col = "red") + plot.decoration(title = "Estimation")
p2 = plot.init(asp=1) + plot.raster(grid, name="standardK.Z.stdev", palette = "Spectral") +
        plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "St. dev.")
ggarrange(p1, p2, ncol = 2, nrow = 2)

Conditional simulations

The conditional simulation is performed using the traditional technique (turning bands)

err = simtub(dbin = dat, dbout = grid, model = model, neigh = neighU, nbsimu = 4, 
             namconv = NamingConvention("standardCD"))

Plotting the conditional simulations

p1 = plot.init(asp=1) + plot.raster(grid, name="standardCD.Z.1", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD1")
p2 = plot.init(asp=1) + plot.raster(grid, name="standardCD.Z.2", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD2")
p3 = plot.init(asp=1) + plot.raster(grid, name="standardCD.Z.3", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD3")
p4 = plot.init(asp=1) + plot.raster(grid, name="standardCD.Z.4", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD4")
ggarrange(p1, p2, p3, p4, ncol = 2, nrow = 2)

Using SPDE

Non conditional simulations

err = grid$deleteColumns(names = "spdeNC.*")
err = simulateSPDE(dbin = NULL, dbout = grid, model = model, nbsimu=4,
                   namconv = NamingConvention("spdeNC"))

Plotting the non conditional simulations

p1 = plot.init(asp=1) + plot.raster(grid, name="spdeNC.1", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation NC1")
p2 = plot.init(asp=1) + plot.raster(grid, name="spdeNC.2", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation NC2")
p3 = plot.init(asp=1) + plot.raster(grid, name="spdeNC.3", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation NC3")
p4 = plot.init(asp=1) + plot.raster(grid, name="spdeNC.4", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation NC4")
ggarrange(p1, p2, p3, p4, ncol = 2, nrow = 2)

Conditional simulations

err = grid$deleteColumns(names = "spdeCD.*")
err = simulateSPDE(dbin = dat, dbout = grid, model = model, nbsimu=4, 
                   namconv = NamingConvention("spdeCD"))

Plotting the conditional simulations

p1 = plot.init(asp=1) + plot.raster(grid, name="spdeCD.Z.1", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD1")
p2 = plot.init(asp=1) + plot.raster(grid, name="spdeCD.Z.2", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD2")
p3 = plot.init(asp=1) + plot.raster(grid, name="spdeCD.Z.3", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD3")
p4 = plot.init(asp=1) + plot.raster(grid, name="spdeCD.Z.4", palette = "Spectral") + 
      plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "Simulation CD4")
ggarrange(p1, p2, p3, p4, ncol = 2, nrow = 2)

Kriging

err = grid$deleteColumns(names = c("spdeK*.Z.*"))
err = krigingSPDE(dbin = dat, dbout = grid, model = model, 
                  flag_est = TRUE, flag_std = TRUE,
                  namconv = NamingConvention("spdeK"))

p1 = plot.init(asp=1) + plot.raster(grid, name="spdeK.Z.estim", palette = "Spectral") +
        plot.symbol(dat, col = "red") + plot.decoration(title = "Estimation")
p2 = plot.init(asp=1) + plot.raster(grid, name="spdeK.Z.stdev", palette = "Spectral") +
        plot.symbol(dat, col = "black", size=0.1) + plot.decoration(title = "St. dev.")
ggarrange(p1, p2, ncol = 2, nrow = 2)

Correlation between traditional and SPDE method for kriging estimation values

p = plot.init() + 
    plot.correlation(db1 = grid, 
                     namex = "standardK.Z.estim", 
                     namey = "spdeK.Z.estim", asPoint = FALSE, bins=100,
                     flagDiag = TRUE, diagColor = "red", flagSameAxes = FALSE) +
    plot.decoration(xlab = "Traditional", ylab = "SPDE", 
                  title = "Comparison for Kriging")
plot.end(p)

Correlation between traditional and SPDE method for kriging Standard deviation values

p = plot.init() + 
    plot.correlation(db1 = grid, 
                     namex = "standardK.Z.stdev", 
                     namey = "spdeK.Z.stdev", asPoint = FALSE, bins=100,
                     flagDiag = TRUE, diagColor = "red", flagSameAxes = FALSE) +
    plot.decoration(xlab = "Traditional", ylab = "SPDE", 
                  title = "Comparison for St. Dev.")
plot.end(p)

Comparative statistics

# statistics
knitr::kable(dbStatisticsMono(db = grid, names = "*.estim", opers = opers)$toTL(),
             digits = 3, caption = "Estimations")

Estimations
	Number	Minimum	Maximum	Mean	St. Dev.
standardK.Z.estim	40000	-2.670	1.845	-0.105	0.717
spdeK.Z.estim	40000	-2.757	1.851	-0.107	0.719

knitr::kable(dbStatisticsMono(db = grid, names = "*.stdev", opers = opers)$toTL(),
             digits = 3, caption = "St. dev. of Estimation error")

St. dev. of Estimation error
	Number	Minimum	Maximum	Mean	St. Dev.
standardK.Z.stdev	40000	0.005	0.997	0.651	0.195
spdeK.Z.stdev	40000	0.067	1.040	0.640	0.197

Produce some auxiliary statistics for simulations

knitr::kable(dbStatisticsMono(db = grid, names = "*NC.*", opers = opers)$toTL(),
             digits = 3, caption = "Non conditional simulations")

Non conditional simulations
	Number	Minimum	Maximum	Mean	St. Dev.
spdeNC.1	40000	-3.291	2.787	0.052	0.997
spdeNC.2	40000	-4.367	3.226	-0.165	0.963
spdeNC.3	40000	-3.488	3.529	-0.013	1.049
spdeNC.4	40000	-3.285	3.346	-0.050	1.007

knitr::kable(dbStatisticsMono(db = grid, names = "*CD.*", opers = opers)$toTL(),
             digits = 3, caption = "Conditional simulations")

Conditional simulations
	Number	Minimum	Maximum	Mean	St. Dev.
standardCD.Z.1	40000	-3.282	2.692	-0.084	0.970
standardCD.Z.2	40000	-3.272	2.716	-0.077	0.962
standardCD.Z.3	40000	-3.390	2.709	-0.120	0.952
standardCD.Z.4	40000	-3.157	2.807	-0.006	0.963
spdeCD.Z.1	40000	-3.567	2.575	-0.206	1.019
spdeCD.Z.2	40000	-3.374	3.312	-0.080	1.010
spdeCD.Z.3	40000	-3.506	2.965	-0.089	0.886
spdeCD.Z.4	40000	-4.082	2.496	-0.243	1.043

References

Lindgren, F., Rue, H., and Lindström, J. (2011). An explicit link between gaussian fields and gaussian markov random fields: the spde approach (with discussion). JR 671 Stat Soc, Series B, 73:423–498.
Pereira, M., Desassis, N., & Allard, D. (2022). Geostatistics for Large Datasets on Riemannian Manifolds: A Matrix-Free Approach. Journal of Data Science, 20(4), 512-532. doi:10.6339/22-JDS1075

Introduction to SPDE in gstlearn

gstlearn Team

2023-07-10

Introduction

SPDE approach

Initialization of gstlearn

Parameters of the SPDE model

Grid definition

Model definition

Observations of the process

Using traditional methods

Kriging

Conditional simulations

Using SPDE

Non conditional simulations

Conditional simulations

Kriging

References