Stochastic Partial Derivative Equations¶

In this tutorial, we show how to use the SPDE. We compare some calculations performed by hand with the results obtained through gstlearn API interfaces. Note that we also consider (probably temporarily) the Old interface (instantiating the SPDE class and performing subsequent calculations within this class) and the new interface where individual functions are designed for Kriging, Simulating and calculating LogLikelihood.

In [1]:
import gstlearn as gl
import gstlearn.plot as gp
import gstlearn.document as gdoc
import numpy as np
import matplotlib.pyplot as plt

from sksparse.cholmod import cholesky
import scipy as sc
from scipy.sparse import *
from scipy.sparse.linalg import *
import numpy as np

gdoc.setNoScroll()

Parameters¶

In [2]:
# Data
np.random.seed(123)
ndat = 1000

# Model
rangev = 0.2
sill = 1.
nugget = 0.1

# Grid 
nx = [50,50]
dx = [0.02,0.02]
x0 = [0,0]

#Grid meshing
nxm = [75,75]
dxm = [0.02,0.02]
x0m = [-0.25,-0.25]

figsize = [5,5]

dbfmt = gl.DbStringFormat.createFromFlags(flag_stats=True, names=["*SPDE"])

Grid and Meshing¶

The 'grid' object defines the grid where API calculations will be performed

In [3]:
grid = gl.DbGrid.create(nx,dx,x0)

The 'gridExt' object is a dilated field which is used for manual calculation and to establish the Unique Mesh. The selection 'newSel' masks off the cells which are not covering 'grid'. The vector 'indSel' gives the indices of the active cells.

In [4]:
gridExt = gl.DbGrid.create(nxm,dxm,x0m)

mesh = gl.MeshETurbo(gridExt)
meshes = gl.VectorMeshes()
meshes.push_back(mesh)
gl.dumpMeshes(meshes)

gridExt.addSelection((gridExt["x1"]>0) & (gridExt["x2"]>0) & (gridExt["x1"]<1.) & (gridExt["x2"]<1.))
indSel = np.where(gridExt["NewSel"])[0]
Contents of the VectorMeshes
----------------------------
It contains 1 mesh(es)

Turbo Meshing
=============

Grid characteristics:
---------------------
Origin :     -0.250    -0.250
Mesh   :      0.020     0.020
Number :         75        75
Euclidean Geometry
Space Dimension           = 2
Number of Apices per Mesh = 3
Number of Meshes          = 10952
Number of Apices          = 5625

Bounding Box Extension
----------------------
Dim #1 - Min:-0.25 - Max:1.23
Dim #2 - Min:-0.25 - Max:1.23

Model¶

In [5]:
model = gl.Model.createFromParam(gl.ECov.MATERN,param=1,range=rangev,sill=sill)
model.addCovFromParam(gl.ECov.NUGGET,sill=nugget)
model.setDriftIRF()

Data¶

In [6]:
dat = gl.Db.create()
dat["x"] = np.random.uniform(size=ndat)
dat["y"] = np.random.uniform(size=ndat)
dat.setLocators(["x","y"],gl.ELoc.X)

SPDE non-conditional simulation¶

We perform a non-conditional simulation of the Model on the grid in order to get the graphic rendition of the corresponding texture.

Performed on Grid¶

In [7]:
gl.law_set_random_seed(131351)
gl.simulateSPDE(None, grid, model, 1, -1, meshes)
grid.display(dbfmt)
Data Base Grid Characteristics
==============================

Data Base Summary
-----------------
File is organized as a regular grid
Space dimension              = 2
Number of Columns            = 4
Total number of samples      = 2500

Grid characteristics:
---------------------
Origin :      0.000     0.000
Mesh   :      0.020     0.020
Number :         50        50

Data Base Statistics
--------------------
4 - Name SimuSPDE - Locator z1
 Nb of data          =       2500
 Nb of active values =       2500
 Minimum value       =     -3.342
 Maximum value       =      3.300
 Mean value          =     -0.121
 Standard Deviation  =      1.007
 Variance            =      1.014

Variables
---------
Column = 0 - Name = rank - Locator = NA
Column = 1 - Name = x1 - Locator = x1
Column = 2 - Name = x2 - Locator = x2
Column = 3 - Name = SimuSPDE - Locator = z1
In [8]:
gp.init(figsize=figsize,flagEqual=True)
gp.raster(grid)
gp.decoration(title="Non Conditional Simulation")
No description has been provided for this image

Performed on Data¶

We also perform a non conditional simulation on the Data location, which will serve for future conditioning information (for Kriging and Conditional Simulations).

In [9]:
gl.law_set_random_seed(131351)
gl.simulateSPDE(None, dat, model, 1, -1, meshes)
dat.display(dbfmt)
Data Base Characteristics
=========================

Data Base Summary
-----------------
File is organized as a set of isolated points
Space dimension              = 2
Number of Columns            = 3
Total number of samples      = 1000

Data Base Statistics
--------------------
3 - Name SimuSPDE - Locator z1
 Nb of data          =       1000
 Nb of active values =       1000
 Minimum value       =     -2.990
 Maximum value       =      2.941
 Mean value          =      0.079
 Standard Deviation  =      0.969
 Variance            =      0.939

Variables
---------
Column = 0 - Name = x - Locator = x1
Column = 1 - Name = y - Locator = x2
Column = 2 - Name = SimuSPDE - Locator = z1

Kriging¶

We perform an estimation by Kriging in the SPDE framework:

  • either using the Matrix Free version
In [10]:
err = gl.krigingSPDE(dat, gridExt, model, useCholesky=0, meshesK = meshes, namconv=gl.NamingConvention("KrigingSPDE-Free"), verbose=False)
  • or using the Matrix version and the Cholesky decomposition. This is the version that will serve for comparion with the manual demonstration of the methodology further.
In [11]:
err = gl.krigingSPDE(dat, gridExt, model, useCholesky=1, meshesK = meshes, namconv=gl.NamingConvention("KrigingSPDE-Chol"),verbose=False)

Comparison between the two ways to perform SPDE internal algorithm

In [12]:
ax = plt.scatter(gridExt["*Chol.*estim"],gridExt["*Free.*estim"],s=1)
p = plt.plot([-3,3],[-3,3],c="r")
No description has been provided for this image
In [13]:
gp.init(figsize=figsize,flagEqual=True)
gp.raster(gridExt)
gp.symbol(dat, c='black')
gp.decoration(title="Estimation")
No description has been provided for this image
In [14]:
gridExt.display(dbfmt)
Data Base Grid Characteristics
==============================

Data Base Summary
-----------------
File is organized as a regular grid
Space dimension              = 2
Number of Columns            = 6
Total number of samples      = 5625
Number of active samples     = 2500

Grid characteristics:
---------------------
Origin :     -0.250    -0.250
Mesh   :      0.020     0.020
Number :         75        75

Data Base Statistics
--------------------
1 - Name rank - Locator NA
 Nb of data          =       5625
 Nb of active values =       2500
 Minimum value       =    989.000
 Maximum value       =   4713.000
 Mean value          =   2851.000
 Standard Deviation  =   1082.411
 Variance            = 1171614.500
2 - Name x1 - Locator x1
 Nb of data          =       5625
 Nb of active values =       2500
 Minimum value       =      0.010
 Maximum value       =      0.990
 Mean value          =      0.500
 Standard Deviation  =      0.289
 Variance            =      0.083
3 - Name x2 - Locator x2
 Nb of data          =       5625
 Nb of active values =       2500
 Minimum value       =      0.010
 Maximum value       =      0.990
 Mean value          =      0.500
 Standard Deviation  =      0.289
 Variance            =      0.083
4 - Name NewSel - Locator sel
 Nb of data          =       5625
 Nb of active values =       2500
 Minimum value       =      1.000
 Maximum value       =      1.000
 Mean value          =      1.000
 Standard Deviation  =      0.000
 Variance            =      0.000
5 - Name KrigingSPDE-Free.SimuSPDE.estim - Locator NA
 Nb of data          =       5625
 Nb of active values =       2500
 Minimum value       =     -2.550
 Maximum value       =      2.549
 Mean value          =      0.108
 Standard Deviation  =      0.867
 Variance            =      0.752
6 - Name KrigingSPDE-Chol.SimuSPDE.estim - Locator z1
 Nb of data          =       5625
 Nb of active values =       2500
 Minimum value       =     -2.550
 Maximum value       =      2.549
 Mean value          =      0.108
 Standard Deviation  =      0.867
 Variance            =      0.752

Variables
---------
Column = 0 - Name = rank - Locator = NA
Column = 1 - Name = x1 - Locator = x1
Column = 2 - Name = x2 - Locator = x2
Column = 3 - Name = NewSel - Locator = sel
Column = 4 - Name = KrigingSPDE-Free.SimuSPDE.estim - Locator = NA
Column = 5 - Name = KrigingSPDE-Chol.SimuSPDE.estim - Locator = z1

Manually¶

All calculations are performed on 'gridExt'... but the comparison with the API is performed on its restriction to 'grid'.

In [15]:
spdeRes = gl.SPDE(dat, model, True, False, 1)
spdeRes.setMeshes(True, meshes)
err = spdeRes.makeReady(True)
Input Mesh is provided and used as is
- Mesh #1/1: NMeshes = 10952, NApices = 5625
Projection of 'dbin' for Kriging: Created from Db and Mesh(es)
- Number of variables = 1
- Number of Latents   = 1
- Number of Apices    = 5625
- Number of Points    = 1000

Projection Matrix: mesh to grid¶

In [16]:
Aprojg = gl.ProjMatrix(gridExt, mesh).toTL()

Simulation¶

In [17]:
Q = spdeRes.getQ().toTL()
cholQ = cholesky(Q)
In [18]:
u = np.random.normal(size = Q.shape[0])
gridExt["ManualSimu"] = cholQ.apply_Pt(cholQ.solve_Lt(1./np.sqrt(cholQ.D())*u))

gp.init(figsize=figsize,flagEqual=True)
gp.raster(gridExt, "ManualSimu", useSel=True)
gp.decoration(title="Simulation (Manual)")
print(f"Variance = {round(np.var(gridExt['ManualSimu'][np.where(gridExt['NewSel']==1)]),4)}")
Variance = 1.1733
No description has been provided for this image

Kriging¶

In [19]:
Aproj = spdeRes.getProj().toTL()
invnoise = spdeRes.getInvNoise().toTL()

Qop = Q + Aproj.T @ invnoise @ Aproj
cholQop =  cholesky(Qop)

kriging = cholQop.solve_A(Aproj.T @ (invnoise @ dat["SimuSPDE"]))
gridExt["ManualKrig"] = kriging
In [20]:
ax = plt.scatter(gridExt["ManualKrig"][indSel],gridExt["*Chol.*estim"][indSel],s=1)
p = plt.plot([-3,3],[-3,3],c="r")
No description has been provided for this image

Likelihood¶

Manually with Cholesky vs. matrix-free approach with SPDE api.

In [21]:
def solveMat(cholQop,x):
    return cholQop.solve_A(x)

def invSigma(invnoise,Aproj,cholQop,x):
#    return 1./sigma2 * (x - 1./sigma2 * Aproj @ solveMat(cholQop, Aproj.T @ x))
    return invnoise @ x - invnoise @ Aproj @ solveMat(cholQop, Aproj.T @ (invnoise @ x)) 

def detQ(cholQ):
    return cholQ.logdet()

x = dat["SimuSPDE"]
ones = np.ones_like(x)
invSigmaOnes = invSigma(invnoise,Aproj,cholQop,ones)
mu  = np.sum(x * invSigmaOnes) / np.sum(ones * invSigmaOnes) 
print(f"Value for MU = {round(mu,4)}")
Value for MU = 0.065
In [22]:
spdeChol = gl.SPDE(dat, model, True, False, 1)
spdeChol.setMeshes(True, meshes)
err = spdeChol.makeReady(True)
invnoise = spdeChol.getInvNoise().toTL()

Q = spdeChol.getQ().toTL()
cholQ = cholesky(Q)

Aproj = spdeChol.getProj().toTL()

Qop = Q + Aproj.T @ invnoise @ Aproj
cholQop =  cholesky(Qop)
Input Mesh is provided and used as is
- Mesh #1/1: NMeshes = 10952, NApices = 5625
Projection of 'dbin' for Kriging: Created from Db and Mesh(es)
- Number of variables = 1
- Number of Latents   = 1
- Number of Apices    = 5625
- Number of Points    = 1000
In [23]:
a_quad = np.sum((x-mu)*invSigma(invnoise,Aproj,cholQop,x-mu))
b_quad = spdeChol.getSPDEOp().computeQuadratic(x - mu)
print(f"Quadratic (manual) = {round(a_quad,4)}")
print(f"Quadratic (spde)   = {round(b_quad,4)}")
print(f"-> Relative difference quadratic = {round(100*(b_quad - a_quad) / a_quad,2)}%")
Quadratic (manual) = 1105.1477
Quadratic (spde)   = 1105.1477
-> Relative difference quadratic = 0.0%
In [24]:
a_op = detQ(cholQop)
b_op = spdeChol.getSPDEOp().computeLogDetOp()
print(f"log_det_op (manual) = {round(a_op,4)}")
print(f"log_det_op (spde)   = {round(b_op,4)}")
print(f"-> Relative difference = {round(100*(b_op-a_op)/a_op, 2)}%")
log_det_op (manual) = 12296.9389
log_det_op (spde)   = 12296.9389
-> Relative difference = 0.0%
In [25]:
a_one = detQ(cholQ)
b_one = spdeChol.getPrecisionKrig().computeLogDet()
print(f"log_det_Q (manual) = {round(a_one,4)}")
print(f"log_det_Q (spde)   = {round(b_one,4)}")
print(f"-> Relative difference = {round(100*(b_one-a_one)/a_one,2)}%")
log_det_Q (manual) = 11255.7151
log_det_Q (spde)   = 11255.7151
-> Relative difference = 0.0%

Comparing the different outputs¶

  • Manual calculation
In [26]:
logdetnoise = len(x) * np.log(nugget)
logdet = a_op - a_one + logdetnoise
a = -0.5 * (a_quad + logdet + len(x) * np.log(2. * np.pi))
print("Likelihood calculation (manual):")
print(f"log_det_op      = {round(a_op,4)}")
print(f"log_det_Q       = {round(a_one,4)}")
print(f"log_det_Noise   = {round(logdetnoise,4)}")
print(f"log_determinant = {round(logdet,4)}")
print(f"Quadratic term  = {round(a_quad,4)}")
print(f"-> Likelihood (manual) = {round(a,4)}")
Likelihood calculation (manual):
log_det_op      = 12296.9389
log_det_Q       = 11255.7151
log_det_Noise   = -2302.5851
log_determinant = -1261.3613
Quadratic term  = 1105.1477
-> Likelihood (manual) = -840.8318

We define the set of meshes (using the alredy defined one).

In [27]:
meshes = gl.VectorMeshes()
meshes.push_back(mesh)
In [28]:
useCholesky = 1
meshes = gl.VectorMeshes()
meshes.push_back(spdeChol.getMeshesK()[0])
b2 = gl.logLikelihoodSPDE(dat,model,useCholesky=useCholesky, meshes=meshes, verbose=True)
print(f"-> likelihood (api) cholesky=1 {round(b2,4)}")
Log-likelhood calculation in SPDE framework( Cholesky=1)
--------------------------------------------------------
Input Mesh is provided and used as is
- Mesh #1/1: NMeshes = 10952, NApices = 5625
Projection of 'dbin' for Kriging: Created from Db and Mesh(es)
- Number of variables = 1
- Number of Latents   = 1
- Number of Apices    = 5625
- Number of Points    = 1000
Drift coefficients = 
     0.065
LogDet of Q + ADA': 12296.938911
LogDet of Q: 11255.715075
LogDet of InvNoise: 2302.585093
Likelihood calculation:
Nb. active samples = 1000
Nb. Monte-Carlo    = 10
Cholesky           = 1
Log-Determinant    = -1261.361258
Quadratic term     = 1105.147718
Log-likelihood     = -840.831763
-> likelihood (api) cholesky=1 -840.8318
In [29]:
useCholesky = 0

spdeMF = gl.SPDE(dat, model, True, False, 0)
spdeMF.setMeshes(True, meshes)
err = spdeMF.makeReady(True)
Input Mesh is provided and used as is
- Mesh #1/1: NMeshes = 10952, NApices = 5625
Projection of 'dbin' for Kriging: Created from Db and Mesh(es)
- Number of variables = 1
- Number of Latents   = 1
- Number of Apices    = 5625
- Number of Points    = 1000
In [30]:
useCholesky = 0
nMC = 100
params = gl.SPDEParam.create(nMC=nMC)
b2 = gl.logLikelihoodSPDE(dat,model,useCholesky=useCholesky, meshes=meshes, params=params, verbose=True)
print(f"-> likelihood by New API with cholesky=0 {round(b2,4)}")
Log-likelhood calculation in SPDE framework( Cholesky=0)
--------------------------------------------------------
Input Mesh is provided and used as is
- Mesh #1/1: NMeshes = 10952, NApices = 5625
Projection of 'dbin' for Kriging: Created from Db and Mesh(es)
- Number of variables = 1
- Number of Latents   = 1
- Number of Apices    = 5625
- Number of Points    = 1000
Drift coefficients = 
     0.079
LogDet of Q + ADA': 11569.018180
LogDet of Q: 11585.782711
LogDet of InvNoise: 2302.585093
Likelihood calculation:
Nb. active samples = 1000
Nb. Monte-Carlo    = 100
Cholesky           = 0
Log-Determinant    = -2319.349624
Quadratic term     = 1105.223469
Log-likelihood     = -311.875456
-> likelihood by New API with cholesky=0 -311.8755