This notebook presents examples for the co-kriging of linear combinations. Starting from a set of variables \(\mathbf{Z}(\alpha) = (Z_i(\alpha))_{i \in 1:n}\), the task is to compute the co-kriging of a linear combination of the variable at location \(\mathbf{o}\): \[ (\beta^T \mathbf{Z}(\mathbf{o}))^{K} = \sum_{i \in 1:n} \sum_{\alpha \in A_i} \lambda_i^{\alpha} Z_i(\alpha) \] Kriging is linear, hence \((\beta^T \cdot \mathbf{Y}(\mathbf{o}))^{K} = \sum_{i \in 1:p} \beta_i \times Y_i(\mathbf{o})^{K}\) but the variance of the kriging error is not easily computable. However, the matLC option of the kriging function offers the possibility to compute directly the estimate and its kriging variance of any linear combinations. The method is evaluated on a simulated data set.
The following cases are considered:
In all cases, the results seem consistent with the theory.
Kriging systems when kriging a linear combination are given below:
The expression of the estimator is: \[ (\beta^T \, \mathbf{Z_{(o)}})^{SK} = \beta^T \mathbf{m} +\sum_{i \in 1:n} \sum_{\alpha \in A_i} \lambda^{\alpha} (Z_i(\alpha) - m_i) \]
There is no bias as: \[ \mathbb{E}\{(\beta^T \, \mathbf{Z_{(o)}})^{SK}\} = \beta^T \mathbf{m} = \mathbb{E}\{\beta^T \, \mathbf{Z_{(o)}}\}. \]
The variance of the kriging error is: \[ \begin{array}{cl} \text{Var}\{\beta^T \, \mathbf{Z_{(o)}} - (\beta^T \, \mathbf{Z_{(o)}})^{SK}\} = & \beta^T \, \text{Var}\{\mathbf{Z_{(o)}}\} \, \beta \\ & - 2 \sum_{k,i \in 1:n} \sum_{\alpha \in A_i} \beta_k \, \lambda^{\alpha} \text{Cov}(Z_k(o),Z_i(\alpha)) \\ & + \sum_{i,j \in 1:n} \sum_{\alpha \in A_i, \alpha' \in A_j} \beta_k \, \lambda^{\alpha} \lambda^{\alpha'} \text{Cov}(Z_i(\alpha),Z_j(\alpha')) \end{array} \]
Minimizing this variance, we gets the kriging system of for the estimation of the linear combination: \[ \left( \text{Cov}(Z_{i}(\alpha),Z_{j}(\alpha')) \right) \Lambda = \sum_{k\in 1:n} \beta_k \text{Cov}(Z_k(o), Z_i(\alpha)) \]
The variance of the kriging error boils down to: \[ \text{Var}\{\beta^T \, \mathbf{Z_{(o)}} - (\beta^T \, \mathbf{Z_{(o)}})^{SK}\} = \beta^T \, \text{Var}\{\mathbf{Z_{(o)}}\} \, \beta - \sum_{k,i \in 1:n} \sum_{\alpha \in A_i} \beta_k \text{Cov}(Z_k(o),Z_i(\alpha)) \lambda^{\alpha} \]
The expression of the estimator is: \[ (\beta^T \, \mathbf{Z_{(o)}})^{OK} = \sum_{i \in 1:n} \sum_{\alpha \in A_i} \lambda^{\alpha} Z_i(\alpha) \]
The no-bias conditions are: \[ \sum_{\alpha \in A_i} \lambda^{\alpha} = \beta_i, \text{ for } \forall i \in 1:n. \]
The variance of the kriging error is: \[ \begin{array}{cl} \text{Var}\{\beta^T \, \mathbf{Z_{(o)}} - (\beta^T \, \mathbf{Z_{(o)}})^{OK}\} = & \beta^T \, \text{Var}\{\mathbf{Z_{(o)}}\} \, \beta \\ & - 2 \sum_{k\in 1:n} \sum_{i, \alpha \in A_i} \beta_k \text{Cov}(Z_k(o),Z_i(\alpha)) \, \lambda^{\alpha} \\ & + \sum_{i, \alpha \in A_i} \sum_{j, \alpha' \in A_j} \lambda^{\alpha} \text{Cov}(Z_i(\alpha),Z_j(\alpha')) \lambda^{\alpha'} \end{array} \]
Minimizing the Lagrangian, \[ \mathcal{L}(\Lambda, \mu) = \text{Var}\{\beta^T \, \mathbf{Z_{(o)}} - (\beta^T \, \mathbf{Z_{(o)}})^{OK}\} + \sum_{k \in 1:n} \mu_k (\sum_{\alpha \in A_k} \lambda^{\alpha} - \beta_k) \]
we gets the kriging system of for the estimation of the linear combination: \[ \left[ \begin{array}{cc} \text{Cov}(Z_i(\alpha), Z_j(\alpha')) & \mathbf{1}_{\alpha \in A_j} \\ \mathbf{1}_{\alpha' \in A_i} & \mathbf{0} \end{array} \right] \left[ \begin{array}{c} \lambda \\ \mu \end{array} \right] = \left[ \begin{array}{c} \sum_{k \in 1:n} \beta_k\text{Cov}(Z_k(o), Z_i(\alpha)) \\ \beta_i \end{array} \right] \]
The variance of the kriging error boils down to: \[ \text{Var}\{\beta^T \, \mathbf{Z_{(o)}} - (\beta^T \, \mathbf{Z_{(o)}})^{OK}\} = \beta^T \, \text{Var}\{\mathbf{Z_{(o)}}\} \, \beta - \sum_{k \in 1:n} \sum_{i,\alpha \in A_i} \beta_k \text{Cov}(Z_k(o),Z_i(\alpha))\, \lambda^{\alpha} - \sum_{k \in 1:n} \beta_k \, \mu_k \]
We create a reference data set using the the Turning Bands method (simtub):
the distribution is log normal,
the stationary covariance function is exponential.
5% of the two dimensional grid is sampled and considered as the reference data set.
# parameters for the simulation
m = c(1, 2)
sig = c(0.25, 0.75)
rho = 0.7
CO = matrix(c(1, rho, rho, 1), 2, 2)
# initialization of the grid
grd = DbGrid_create(x0=c(0.0,0.0), dx=c(0.01,0.01), nx=c(100,100))
# simulation from a model
model = Model_createFromParam(type = ECov_EXPONENTIAL(), range=0.2, sills = CO)
err = simtub(dbin = NULL, dbout = grd, model = model, nbsimu = 1)
err = grd$setName("Simu.1", "Y1")
err = grd$setName("Simu.2", "Y2")
grd["Z1"] = m[1] * exp(sig[1] * grd["Y1"] - sig[1]**2 / 2)
grd["Z2"] = m[2] * exp(sig[2] * grd["Y2"] - sig[2]**2 / 2)
# Data set (ssp proportion of the grid)
data = Db_createFillRandom(ndat = ssp*grd$getSampleNumber(), nvar = 0)
err = data$setName("x-1", "x1")
err = data$setName("x-2", "x2")
err = migrate(dbin = grd, name = "Z1", dbout = data, dmax = c(0.1, 0.1),
namconv = NamingConvention(""))
err = migrate(dbin = grd, name = "Z2", dbout = data, dmax = c(0.1, 0.1),
namconv = NamingConvention(""))
# plots
p1 = ggDefaultGeographic() +
plot.grid(grd, nameRaster = "Z1", palette = "Spectral") +
plot.point(data, colour = "red") +
plot.decoration(title="Z1")
p2 = ggDefault() +
plot.hist(grd, name='Z1', bins = 25, bg = "orange", color= "gray") +
plot.decoration(xlab = "Raw variable", title="Histogram of the initial variable Z1")
p3 = ggDefaultGeographic() +
plot.grid(grd, nameRaster = "Z2", palette = "Spectral") +
plot.point(data, colour = "red") +
plot.decoration(title="Z2")
p4 = ggDefault() +
plot.hist(grd, name='Z2', bins = 25, bg = "orange", color= "gray") +
plot.decoration(xlab = "Raw variable", title="Histogram of the initial variable Z2")
ggarrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
# defining the linear combination
beta = c(1, -3) # coefficients and computing the linear combination
nm_var = c("Z1", "Z2")
nm_LC = c("LC", "Z1")
matLC = matrix(c(beta, 1, 0), nrow = 2, ncol = 2, byrow = TRUE)
colnames(matLC) <- nm_var
rownames(matLC) <- nm_LC
# computing the linear computation
data["LC"] = matrix(data$getColumns(colnames(matLC)), nrow = data$getSampleNumber(),
ncol = dim(matLC)[2]) %*% t(matLC)[,1]
grd["LC"] = matrix(grd$getColumns(colnames(matLC)), nrow = grd$getSampleNumber(),
ncol = dim(matLC)[2]) %*% t(matLC)[,1]
# statistics
tab = rbind(
dbStatisticsMono(grd, names = c("Z*", "LC"), opers = opers)$toTL(),
dbStatisticsMono(data, names = c("Z*", "LC"), opers = opers)$toTL())
rownames(tab) <- c(
"grid-Z1", "grid-Z2", "grid-LC",
"samples-Z1", "samples-Z2", "samples-LC")
knitr::kable(tab, caption = "Statistics on the reference data set", digits = 4)
Number | Minimum | Maximum | Mean | St. Dev. | |
---|---|---|---|---|---|
grid-Z1 | 10000 | 0.4528 | 2.2903 | 1.0419 | 0.2619 |
grid-Z2 | 10000 | 0.1181 | 15.8685 | 1.9705 | 1.7018 |
grid-LC | 10000 | -47.0365 | 1.5622 | -4.8696 | 5.2569 |
samples-Z1 | 50 | 0.7668 | 1.6995 | 1.1311 | 0.2404 |
samples-Z2 | 50 | 0.2648 | 4.6042 | 1.6170 | 1.1103 |
samples-LC | 50 | -12.9096 | 0.7581 | -3.7199 | 3.4745 |
# Variogram parameters (omnidirectional)
varioParam = VarioParam_createOmniDirection(npas=10, dpas=0.05)
# Variogram basic structures
types = ECov_fromKeys(c("NUGGET", "EXPONENTIAL", "EXPONENTIAL"))
# types = ECov_fromKeys(c("EXPONENTIAL"))
# weighted proportional to the number of pairs and inverse proportional to the distance
opt = Option_AutoFit()
err = opt$setWmode(2)
#define neighborhood
neigh = NeighMoving_create(nmaxi=30, radius=0.5)
# Display the kriging system for a fixed target when using kriging
OptDbg_setReference(0)
## NULL
# function to compute the model of the linear combination (using a square matrix)
# TODO: generalize to rectangular matrix?
model_of_LC <- function(model, A) {
stopifnot(dim(A)[1] == dim(A)[2])
nv = model$getVariableNumber()
stopifnot(nv == dim(A)[1])
ns = model$getCovaNumber()
model_LC = model$clone()
for (k in 1:ns) {
CC = A %*% model$getSillValues(k-1)$toTL() %*% t(A)
for (i in 1:nv) {
for (j in 1:nv) {
err = model_LC$setSill(icov = k-1, ivar=i-1, jvar=j-1, value = CC[i, j])
}
}
}
model_LC
}
# Locate Z
err = data$setLocators("Z*", locatorType = ELoc_Z(), cleanSameLocator = TRUE)
# Compute variogram
var_Z = Vario_computeFromDb(varioParam, data)
# fit model
mod_Z_IRF0 = Model()
err = mod_Z_IRF0$fit(var_Z, types = types, mauto = opt)
# setting the drift "unknown mean" for Ordinary Kriging
err = mod_Z_IRF0$setDriftIRF(order = 0)
# plot
multi.varmod(var_Z, mod_Z_IRF0, flagLegend=TRUE)
if(flag.verbose) {
mod_Z_IRF0
}
err = grd$deleteColumns(paste("OK", nm_var, "*", sep = "."))
err = grd$deleteColumns(paste("LC", "OK", "*", sep = "."))
if(flag.verbose) {
print(paste0(">>> Ordinary kriging of the raw variables"))
}
err = data$setLocators(c("Z1", "Z2"), ELoc_Z(), cleanSameLocator=TRUE)
err = kriging(data, grd, mod_Z_IRF0, neigh,
flag_est = TRUE, flag_std = TRUE,
namconv = NamingConvention("OK"))
# computing the LC of kriging
grd["LC.OK.estim"] = matrix(grd$getColumns(names = paste("OK", colnames(matLC), "estim", sep = ".")),
nrow = grd$getSampleNumber(), ncol = dim(matLC)[2]) %*% t(matLC)[,1]
# statistics
vn = c("Z1", paste("OK", "Z1", c("estim", "stdev"), sep = "."),
"Z2", paste("OK", "Z2", c("estim", "stdev"), sep = "."))
tab = dbStatisticsMono(grd, vn, opers = opers)$toTL()
knitr::kable(tab, caption = "Statistics on the grid for Ordinary Kriging with the initial model", digits = 4)
Number | Minimum | Maximum | Mean | St. Dev. | |
---|---|---|---|---|---|
Z1 | 10000 | 0.4528 | 2.2903 | 1.0419 | 0.2619 |
OK.Z1.estim | 10000 | 0.7978 | 1.6791 | 1.1037 | 0.1053 |
OK.Z1.stdev | 10000 | 0.0443 | 0.2663 | 0.2206 | 0.0273 |
Z2 | 10000 | 0.1181 | 15.8685 | 1.9705 | 1.7018 |
OK.Z2.estim | 10000 | 0.2454 | 3.4234 | 1.7027 | 0.6338 |
OK.Z2.stdev | 10000 | 0.7099 | 1.2544 | 0.8919 | 0.0659 |
# plot of estimated values
p1 = ggDefaultGeographic() +
plot.grid(grd, nameRaster = paste("OK", "Z1", "estim", sep = "."),
palette = "Spectral", flagLegendRaster = TRUE, legendNameRaster = "estim") +
plot.decoration(xlab = "", ylab = "", title = paste0("Ordinary coK for Z1"))
p2 = ggDefaultGeographic() +
plot.grid(grd, nameRaster = paste("OK", "Z1", "stdev", sep = "."),
palette = "Spectral", flagLegendRaster = TRUE, legendNameRaster = "stdev") +
plot.decoration(xlab = "", ylab = "", title = paste0("Std. of Ordinary coK for Z1"))
p3 = ggDefaultGeographic() +
plot.grid(grd, nameRaster = paste("OK", "Z2", "estim", sep = "."),
palette = "Spectral", flagLegendRaster = TRUE, legendNameRaster = "estim") +
plot.decoration(xlab = "", ylab = "", title = paste0("Ordinary coK for Z2"))
p4 = ggDefaultGeographic() +
plot.grid(grd, nameRaster = paste("OK", "Z2", "stdev", sep = "."),
palette = "Spectral", flagLegendRaster = TRUE, legendNameRaster = "stdev") +
plot.decoration(xlab = "", ylab = "", title = paste0("Std. of Ordinary coK for Z2"))
ggarrange(p1, p2, p3, p4, ncol = 2, nrow = 2)