Stochastic Partial Derivative Equations¶
In this tutorial, we show how to use the SPDE. We compare some calculations performed by hand with the results obtained through gstlearn API interfaces. Note that we also consider (probably temporarily) the Old interface (instantiating the SPDE class and performing subsequent calculations within this class) and the new interface where individual functions are designed for Kriging, Simulating and calculating LogLikelihood.
import gstlearn as gl
import gstlearn.plot as gp
import gstlearn.document as gdoc
import numpy as np
import matplotlib.pyplot as plt
from sksparse.cholmod import cholesky
import scipy as sc
from scipy.sparse import *
from scipy.sparse.linalg import *
import numpy as np
gdoc.setNoScroll()
Parameters¶
# Data
np.random.seed(123)
ndat = 1000
# Model
rangev = 0.2
sill = 1.
nugget = 0.1
# Grid
nx = [50,50]
dx = [0.02,0.02]
x0 = [0,0]
#Grid meshing
nxm = [75,75]
dxm = [0.02,0.02]
x0m = [-0.25,-0.25]
figsize = [5,5]
dbfmt = gl.DbStringFormat.createFromFlags(flag_stats=True, names=["*SPDE"])
Grid and Meshing¶
The 'grid' object defines the grid where API calculations will be performed
grid = gl.DbGrid.create(nx,dx,x0)
The 'gridExt' object is a dilated field which is used for manual calculation and to establish the Unique Mesh. The selection 'newSel' masks off the cells which are not covering 'grid'. The vector 'indSel' gives the indices of the active cells.
gridExt = gl.DbGrid.create(nxm,dxm,x0m)
mesh = gl.MeshETurbo(gridExt)
meshes = gl.VectorMeshes()
meshes.push_back(mesh)
gl.dumpMeshes(meshes)
gridExt.addSelection((gridExt["x1"]>0) & (gridExt["x2"]>0) & (gridExt["x1"]<1.) & (gridExt["x2"]<1.))
indSel = np.where(gridExt["NewSel"])[0]
Contents of the VectorMeshes ---------------------------- It contains 1 mesh(es) Turbo Meshing ============= Grid characteristics: --------------------- Origin : -0.250 -0.250 Mesh : 0.020 0.020 Number : 75 75 Euclidean Geometry Space Dimension = 2 Number of Apices per Mesh = 3 Number of Meshes = 10952 Number of Apices = 5625 Bounding Box Extension ---------------------- Dim #1 - Min:-0.25 - Max:1.23 Dim #2 - Min:-0.25 - Max:1.23
Model¶
model = gl.Model.createFromParam(gl.ECov.MATERN,param=1,range=rangev,sill=sill)
model.addCovFromParam(gl.ECov.NUGGET,sill=nugget)
model.setDriftIRF()
Data¶
dat = gl.Db.create()
dat["x"] = np.random.uniform(size=ndat)
dat["y"] = np.random.uniform(size=ndat)
dat.setLocators(["x","y"],gl.ELoc.X)
SPDE non-conditional simulation¶
We perform a non-conditional simulation of the Model on the grid in order to get the graphic rendition of the corresponding texture.
Performed on Grid¶
gl.law_set_random_seed(131351)
gl.simulateSPDE(None, grid, model, 1, -1, meshes)
grid.display(dbfmt)
Data Base Grid Characteristics ============================== Data Base Summary ----------------- File is organized as a regular grid Space dimension = 2 Number of Columns = 4 Total number of samples = 2500 Grid characteristics: --------------------- Origin : 0.000 0.000 Mesh : 0.020 0.020 Number : 50 50 Data Base Statistics -------------------- 4 - Name SimuSPDE - Locator z1 Nb of data = 2500 Nb of active values = 2500 Minimum value = -3.342 Maximum value = 3.300 Mean value = -0.121 Standard Deviation = 1.007 Variance = 1.014 Variables --------- Column = 0 - Name = rank - Locator = NA Column = 1 - Name = x1 - Locator = x1 Column = 2 - Name = x2 - Locator = x2 Column = 3 - Name = SimuSPDE - Locator = z1
gp.init(figsize=figsize,flagEqual=True)
gp.raster(grid)
gp.decoration(title="Non Conditional Simulation")
Performed on Data¶
We also perform a non conditional simulation on the Data location, which will serve for future conditioning information (for Kriging and Conditional Simulations).
gl.law_set_random_seed(131351)
gl.simulateSPDE(None, dat, model, 1, -1, meshes)
dat.display(dbfmt)
Data Base Characteristics ========================= Data Base Summary ----------------- File is organized as a set of isolated points Space dimension = 2 Number of Columns = 3 Total number of samples = 1000 Data Base Statistics -------------------- 3 - Name SimuSPDE - Locator z1 Nb of data = 1000 Nb of active values = 1000 Minimum value = -2.990 Maximum value = 2.941 Mean value = 0.079 Standard Deviation = 0.969 Variance = 0.939 Variables --------- Column = 0 - Name = x - Locator = x1 Column = 1 - Name = y - Locator = x2 Column = 2 - Name = SimuSPDE - Locator = z1
Kriging¶
We perform an estimation by Kriging in the SPDE framework:
- either using the Matrix Free version
err = gl.krigingSPDE(dat, gridExt, model, useCholesky=0, meshesK = meshes, namconv=gl.NamingConvention("KrigingSPDE-Free"), verbose=False)
- or using the Matrix version and the Cholesky decomposition. This is the version that will serve for comparion with the manual demonstration of the methodology further.
err = gl.krigingSPDE(dat, gridExt, model, useCholesky=1, meshesK = meshes, namconv=gl.NamingConvention("KrigingSPDE-Chol"),verbose=False)
Comparison between the two ways to perform SPDE internal algorithm
ax = plt.scatter(gridExt["*Chol.*estim"],gridExt["*Free.*estim"],s=1)
p = plt.plot([-3,3],[-3,3],c="r")
gp.init(figsize=figsize,flagEqual=True)
gp.raster(gridExt)
gp.symbol(dat, c='black')
gp.decoration(title="Estimation")
gridExt.display(dbfmt)
Data Base Grid Characteristics ============================== Data Base Summary ----------------- File is organized as a regular grid Space dimension = 2 Number of Columns = 6 Total number of samples = 5625 Number of active samples = 2500 Grid characteristics: --------------------- Origin : -0.250 -0.250 Mesh : 0.020 0.020 Number : 75 75 Data Base Statistics -------------------- 1 - Name rank - Locator NA Nb of data = 5625 Nb of active values = 2500 Minimum value = 989.000 Maximum value = 4713.000 Mean value = 2851.000 Standard Deviation = 1082.411 Variance = 1171614.500 2 - Name x1 - Locator x1 Nb of data = 5625 Nb of active values = 2500 Minimum value = 0.010 Maximum value = 0.990 Mean value = 0.500 Standard Deviation = 0.289 Variance = 0.083 3 - Name x2 - Locator x2 Nb of data = 5625 Nb of active values = 2500 Minimum value = 0.010 Maximum value = 0.990 Mean value = 0.500 Standard Deviation = 0.289 Variance = 0.083 4 - Name NewSel - Locator sel Nb of data = 5625 Nb of active values = 2500 Minimum value = 1.000 Maximum value = 1.000 Mean value = 1.000 Standard Deviation = 0.000 Variance = 0.000 5 - Name KrigingSPDE-Free.SimuSPDE.estim - Locator NA Nb of data = 5625 Nb of active values = 2500 Minimum value = -2.550 Maximum value = 2.549 Mean value = 0.108 Standard Deviation = 0.867 Variance = 0.752 6 - Name KrigingSPDE-Chol.SimuSPDE.estim - Locator z1 Nb of data = 5625 Nb of active values = 2500 Minimum value = -2.550 Maximum value = 2.549 Mean value = 0.108 Standard Deviation = 0.867 Variance = 0.752 Variables --------- Column = 0 - Name = rank - Locator = NA Column = 1 - Name = x1 - Locator = x1 Column = 2 - Name = x2 - Locator = x2 Column = 3 - Name = NewSel - Locator = sel Column = 4 - Name = KrigingSPDE-Free.SimuSPDE.estim - Locator = NA Column = 5 - Name = KrigingSPDE-Chol.SimuSPDE.estim - Locator = z1
Manually¶
All calculations are performed on 'gridExt'... but the comparison with the API is performed on its restriction to 'grid'.
spdeRes = gl.SPDE(dat, model, True, False, 1)
spdeRes.setMeshes(True, meshes)
err = spdeRes.makeReady(True)
Input Mesh is provided and used as is - Mesh #1/1: NMeshes = 10952, NApices = 5625 Projection of 'dbin' for Kriging: Created from Db and Mesh(es) - Number of variables = 1 - Number of Latents = 1 - Number of Apices = 5625 - Number of Points = 1000
Projection Matrix: mesh to grid¶
Aprojg = gl.ProjMatrix(gridExt, mesh).toTL()
Simulation¶
Q = spdeRes.getQ().toTL()
cholQ = cholesky(Q)
u = np.random.normal(size = Q.shape[0])
gridExt["ManualSimu"] = cholQ.apply_Pt(cholQ.solve_Lt(1./np.sqrt(cholQ.D())*u))
gp.init(figsize=figsize,flagEqual=True)
gp.raster(gridExt, "ManualSimu", useSel=True)
gp.decoration(title="Simulation (Manual)")
print(f"Variance = {round(np.var(gridExt['ManualSimu'][np.where(gridExt['NewSel']==1)]),4)}")
Variance = 1.1733
Kriging¶
Aproj = spdeRes.getProj().toTL()
invnoise = spdeRes.getInvNoise().toTL()
Qop = Q + Aproj.T @ invnoise @ Aproj
cholQop = cholesky(Qop)
kriging = cholQop.solve_A(Aproj.T @ (invnoise @ dat["SimuSPDE"]))
gridExt["ManualKrig"] = kriging
ax = plt.scatter(gridExt["ManualKrig"][indSel],gridExt["*Chol.*estim"][indSel],s=1)
p = plt.plot([-3,3],[-3,3],c="r")
Likelihood¶
Manually with Cholesky vs. matrix-free approach with SPDE api.
def solveMat(cholQop,x):
return cholQop.solve_A(x)
def invSigma(invnoise,Aproj,cholQop,x):
# return 1./sigma2 * (x - 1./sigma2 * Aproj @ solveMat(cholQop, Aproj.T @ x))
return invnoise @ x - invnoise @ Aproj @ solveMat(cholQop, Aproj.T @ (invnoise @ x))
def detQ(cholQ):
return cholQ.logdet()
x = dat["SimuSPDE"]
ones = np.ones_like(x)
invSigmaOnes = invSigma(invnoise,Aproj,cholQop,ones)
mu = np.sum(x * invSigmaOnes) / np.sum(ones * invSigmaOnes)
print(f"Value for MU = {round(mu,4)}")
Value for MU = 0.065
spdeChol = gl.SPDE(dat, model, True, False, 1)
spdeChol.setMeshes(True, meshes)
err = spdeChol.makeReady(True)
invnoise = spdeChol.getInvNoise().toTL()
Q = spdeChol.getQ().toTL()
cholQ = cholesky(Q)
Aproj = spdeChol.getProj().toTL()
Qop = Q + Aproj.T @ invnoise @ Aproj
cholQop = cholesky(Qop)
Input Mesh is provided and used as is - Mesh #1/1: NMeshes = 10952, NApices = 5625 Projection of 'dbin' for Kriging: Created from Db and Mesh(es) - Number of variables = 1 - Number of Latents = 1 - Number of Apices = 5625 - Number of Points = 1000
a_quad = np.sum((x-mu)*invSigma(invnoise,Aproj,cholQop,x-mu))
b_quad = spdeChol.getSPDEOp().computeQuadratic(x - mu)
print(f"Quadratic (manual) = {round(a_quad,4)}")
print(f"Quadratic (spde) = {round(b_quad,4)}")
print(f"-> Relative difference quadratic = {round(100*(b_quad - a_quad) / a_quad,2)}%")
Quadratic (manual) = 1105.1477 Quadratic (spde) = 1105.1477 -> Relative difference quadratic = 0.0%
a_op = detQ(cholQop)
b_op = spdeChol.getSPDEOp().computeLogDetOp()
print(f"log_det_op (manual) = {round(a_op,4)}")
print(f"log_det_op (spde) = {round(b_op,4)}")
print(f"-> Relative difference = {round(100*(b_op-a_op)/a_op, 2)}%")
log_det_op (manual) = 12296.9389 log_det_op (spde) = 12296.9389 -> Relative difference = 0.0%
a_one = detQ(cholQ)
b_one = spdeChol.getPrecisionKrig().computeLogDet()
print(f"log_det_Q (manual) = {round(a_one,4)}")
print(f"log_det_Q (spde) = {round(b_one,4)}")
print(f"-> Relative difference = {round(100*(b_one-a_one)/a_one,2)}%")
log_det_Q (manual) = 11255.7151 log_det_Q (spde) = 11255.7151 -> Relative difference = 0.0%
Comparing the different outputs¶
- Manual calculation
logdetnoise = len(x) * np.log(nugget)
logdet = a_op - a_one + logdetnoise
a = -0.5 * (a_quad + logdet + len(x) * np.log(2. * np.pi))
print("Likelihood calculation (manual):")
print(f"log_det_op = {round(a_op,4)}")
print(f"log_det_Q = {round(a_one,4)}")
print(f"log_det_Noise = {round(logdetnoise,4)}")
print(f"log_determinant = {round(logdet,4)}")
print(f"Quadratic term = {round(a_quad,4)}")
print(f"-> Likelihood (manual) = {round(a,4)}")
Likelihood calculation (manual): log_det_op = 12296.9389 log_det_Q = 11255.7151 log_det_Noise = -2302.5851 log_determinant = -1261.3613 Quadratic term = 1105.1477 -> Likelihood (manual) = -840.8318
We define the set of meshes (using the alredy defined one).
meshes = gl.VectorMeshes()
meshes.push_back(mesh)
useCholesky = 1
meshes = gl.VectorMeshes()
meshes.push_back(spdeChol.getMeshesK()[0])
b2 = gl.logLikelihoodSPDE(dat,model,useCholesky=useCholesky, meshes=meshes, verbose=True)
print(f"-> likelihood (api) cholesky=1 {round(b2,4)}")
Log-likelhood calculation in SPDE framework( Cholesky=1) -------------------------------------------------------- Input Mesh is provided and used as is - Mesh #1/1: NMeshes = 10952, NApices = 5625 Projection of 'dbin' for Kriging: Created from Db and Mesh(es) - Number of variables = 1 - Number of Latents = 1 - Number of Apices = 5625 - Number of Points = 1000 Drift coefficients = 0.065 LogDet of Q + ADA': 12296.938911 LogDet of Q: 11255.715075 LogDet of InvNoise: 2302.585093 Likelihood calculation: Nb. active samples = 1000 Nb. Monte-Carlo = 10 Cholesky = 1 Log-Determinant = -1261.361258 Quadratic term = 1105.147718 Log-likelihood = -840.831763 -> likelihood (api) cholesky=1 -840.8318
useCholesky = 0
spdeMF = gl.SPDE(dat, model, True, False, 0)
spdeMF.setMeshes(True, meshes)
err = spdeMF.makeReady(True)
Input Mesh is provided and used as is - Mesh #1/1: NMeshes = 10952, NApices = 5625 Projection of 'dbin' for Kriging: Created from Db and Mesh(es) - Number of variables = 1 - Number of Latents = 1 - Number of Apices = 5625 - Number of Points = 1000
useCholesky = 0
nMC = 100
params = gl.SPDEParam.create(nMC=nMC)
b2 = gl.logLikelihoodSPDE(dat,model,useCholesky=useCholesky, meshes=meshes, params=params, verbose=True)
print(f"-> likelihood by New API with cholesky=0 {round(b2,4)}")
Log-likelhood calculation in SPDE framework( Cholesky=0) -------------------------------------------------------- Input Mesh is provided and used as is - Mesh #1/1: NMeshes = 10952, NApices = 5625 Projection of 'dbin' for Kriging: Created from Db and Mesh(es) - Number of variables = 1 - Number of Latents = 1 - Number of Apices = 5625 - Number of Points = 1000 Drift coefficients = 0.079 LogDet of Q + ADA': 11569.018180 LogDet of Q: 11585.782711 LogDet of InvNoise: 2302.585093 Likelihood calculation: Nb. active samples = 1000 Nb. Monte-Carlo = 100 Cholesky = 0 Log-Determinant = -2319.349624 Quadratic term = 1105.223469 Log-likelihood = -311.875456 -> likelihood by New API with cholesky=0 -311.8755