Kriging with Inequalities¶

This script is to provide an answer to a Dicussion posted on the web site of gstlearn.

"I have a dataset with a trend and I am using UK for interpolation (after having identified the variogram). I would like now to add constraints (line segments along which the value is constrained , =, <= or >=). Does gstlearn allow that ?"

We are going to generate a small data set and simulate a random variable with a trend. Then we add a set of control points where we wish the estimated value to lie within an interval.

In [1]:
import gstlearn as gl
import gstlearn.plot as gp
import matplotlib.pyplot as plt
import numpy as np

import gstlearn.document as gdoc

gdoc.setNoScroll()
figsize = (5, 5)

We first generate a data set (samples located randomly) over a square field.

In [2]:
ndat = 300
db = gl.Db.createFillRandom(ndat=ndat, ndim=2, nvar=1)
db.display()
Data Base Characteristics
=========================

Data Base Summary
-----------------
File is organized as a set of isolated points
Space dimension              = 2
Number of Columns            = 4
Total number of samples      = 300

Variables
---------
Column = 0 - Name = rank - Locator = NA
Column = 1 - Name = x-1 - Locator = x1
Column = 2 - Name = x-2 - Locator = x2
Column = 3 - Name = z - Locator = z1
In [3]:
fix, ax = plt.subplots(figsize=figsize)
ax.symbol(db, nameSize="z", flagCst=True)
ax.decoration(title="Sample locations")
plt.show()
No description has been provided for this image

For defining a drift, we simply add a trend along the vertical coordinate

In [4]:
db["z"] = db["z"] + db["x-2"] * 10.0
In [5]:
fix, ax = plt.subplots(figsize=figsize)
ax.symbol(db, nameSize="z")
ax.decoration(title="Variable with vertical Trend")
plt.show()
No description has been provided for this image

Some statistics on the target variable

In [6]:
db.getStatsAsTable(names=["z"]).toTL()
Out[6]:
Number Minimum Maximum Mean St. Dev. Variance
z 300.0 -1.095538 11.374783 5.145939 2.965865 8.796356

We select all the samples whose second coordinate lie within the interval [0.4; 0.6]. The selection is called Band.

In [7]:
iuid = db.addSelectionByVariable("x-2", 0.4, 0.6, "Band")

In the next figure, the samples within the Band are represented with large circles.

In [8]:
fix, ax = plt.subplots(figsize=figsize)
db.clearSelection()
ax.symbol(db, nameColor="z")
db.setLocator("Band", gl.ELoc.SEL)
ax.symbol(db, nameSize="z", flagCst=True, c="black", s=100)
ax.decoration(title="Variable with Band Selection (in black)")
plt.show()
No description has been provided for this image

For the samples lying within the band, we define the lower and upper bounds of the limit within which the target variable should lie. These bounds are called BLow and Bup.

As a summary:

  • at samples within the Band, we have defined an interval where the target variable should lie. This interval is defined by variables Top and Bottom. The values of the target variable are set to "NA"
  • otherwise, the bounds are set to "NA".
In [9]:
db.clearSelection()
sel = db.getColumn("Band")
zval = db.getColumn("z")

db["z"] = np.where(sel == 0, zval, np.nan)
Blow = 13
Bup = 14
db["Bot"] = np.where(sel == 0, np.nan, Blow)
db["Top"] = np.where(sel == 0, np.nan, Bup)
db.setLocator("Top", gl.ELoc.U)
db.setLocator("Bot", gl.ELoc.L)

Some statistics

In [10]:
db.getStatsAsTable(names=["z", "Top", "Bot"]).display()
         Number     Minimum     Maximum        Mean    St. Dev.    Variance
  z     242.000      -1.096      11.375       5.155       3.267      10.672
Top      58.000      14.000      14.000      14.000       0.000       0.000
Bot      58.000      13.000      13.000      13.000       0.000       0.000

Traditional estimation¶

In this first step, let us perform an interpolation by Kriging at the nodes of a regular grid. We also need a Model describing the spatial characteristics of the target variable: here this model is given without any control, choosing a basic structure which tends to provide smooth contour lines (e.g. Cubic model).

Note that the Kriging step will only consider the samples where the target variable is defined

In [11]:
grid = gl.DbGrid.createCoveringDb(db, nx=[100, 100])
model = gl.Model.createFromParam(gl.ECov.CUBIC, range=0.4)
neigh = gl.NeighUnique()

err = gl.kriging(db, grid, model, neigh)

The previous step has created a variable in the output grid file corresponding to the estimation of the target variable and called Kriging.z.estim. We represent this variable graphically. The two isolines (e.g. Blow and Bup) are represented for illsutration.

In [12]:
fix, ax = plt.subplots(figsize=figsize)
ax.raster(grid, name="Kriging.z.estim", flagLegend=True)
ax.symbol(db, nameSize="z")
ax.isoline(grid, name="Kriging.z.estim", levels=[Blow, Bup])
ax.decoration(title="Traditional estimation")
plt.show()
No description has been provided for this image

Estimation with Inequality constraints¶

We are going to use the Gibbs sampler in order to provide a value at the samples where the target variable is missing, while honoring the definition interval. This requires a Model to be defined beforehand.

In [13]:
err = gl.gibbs_sampler(
    db,
    model,
    nbsimu=100,
    flag_ce=True,
    flag_cstd=True,
    namconv=gl.NamingConvention("Gibbs"),
)

We can now perform the Kriging estimation using **Gibbs.CE"" as target variable.

In [14]:
err = gl.kriging(db, grid, model, neigh)
In [15]:
fix, ax = plt.subplots(figsize=figsize)
ax.raster(grid, name="Kriging.Gibbs.CE.estim", flagLegend=True)
ax.symbol(db, nameSize="Gibbs.CE", flagCst=True)
db.setLocator("Band", gl.ELoc.SEL)
ax.symbol(db, nameSize="Gibbs.CE", flagCst=True, s=200, c="white")
db.clearSelection()
ax.isoline(grid, name="Kriging.Gibbs.CE.estim", levels=[Blow, Bup])
ax.decoration(title="Estimation with Inequality constraints")
plt.show()
No description has been provided for this image

We can compare both estimation maps side by side.

In [16]:
db
Out[16]:
Data Base Characteristics
=========================

Data Base Summary
-----------------
File is organized as a set of isolated points
Space dimension              = 2
Number of Columns            = 9
Total number of samples      = 300

Variables
---------
Column = 0 - Name = rank - Locator = NA
Column = 1 - Name = x-1 - Locator = x1
Column = 2 - Name = x-2 - Locator = x2
Column = 3 - Name = z - Locator = NA
Column = 4 - Name = Band - Locator = NA
Column = 5 - Name = Bot - Locator = lower1
Column = 6 - Name = Top - Locator = upper1
Column = 7 - Name = Gibbs.CE - Locator = z1
Column = 8 - Name = Gibbs.STD - Locator = NA
In [17]:
db.clearSelection()
fix, ax = plt.subplots(1, 2, figsize=(12, 6))

ax[0].raster(grid, name="Kriging.z.estim", vmin=0, vmax=15)
ax[0].symbol(db, nameSize="z", sizmax=50)
ax[0].isoline(grid, name="Kriging.z.estim", levels=[Blow, Bup])
ax[0].decoration(title="Traditional Estimation")

ax[1].raster(grid, name="Kriging.Gibbs.CE.estim", vmin=0, vmax=15)
ax[1].symbol(db, nameSize="Gibbs.CE", sizmax=50)
db.setLocator("Band", gl.ELoc.SEL)
ax[1].symbol(db, nameSize="Gibbs.CE", flagCst=True, s=200, c="white", sizmax=50)
db.clearSelection()
ax[1].isoline(grid, name="Kriging.Gibbs.CE.estim", levels=[Blow, Bup])
ax[1].decoration(title="Estimation with Inequality constraints")

plt.show()
No description has been provided for this image