# Kriging¶

## Preamble¶

In this preamble, we load the **gstlearn** library.

```
import gstlearn as gl
import gstlearn.plot as gp
import gstlearn.document as gdoc
import matplotlib.pyplot as plt
import numpy as np
import os
gdoc.setNoScroll()
```

Then we download the data base `dat`

.

```
temp_nf = gdoc.loadData("Scotland", "Scotland_Temperatures.NF")
dat = gl.Db.createFromNF(temp_nf)
```

We also create a `Db`

object containing a grid covering the data points in the data base `dat`

. To do so, we start by displaying the minimal and maximal coordinates of the points in `dat`

using the `getExtremas()`

method from the `Db`

class.

```
dat.getExtremas()
```

array([[ 78.2, 460.7], [ 530.4, 1208.9]])

The first (resp. second) element of the list contains the min and max coordinates in the first (resp. second) space dimension. Based on this information, we create a grid covering all the points using the `DbGrid_create`

function. We specify the coordinates of the origin (i.e. lower left corner) of the grid (argument `x0`

), the step sizes in each dimension (argument `dx`

) and the number of points in each dimension (argument `nx`

).

```
grid = gl.DbGrid.create(x0=[65,530],dx=[4.94, 4.96],nx=[82,138])
```

We then print a summary of the content of the grid using the `display`

method of `Db`

class, which we supply with a `DbStringFormat`

object specifying that we would like information about the extent of the grid (argument `flag_extend`

in the `DbStringFormat_createFromFlags`

function).

```
dbfmt = gl.DbStringFormat.createFromFlags(flag_extend=True)
grid.display(dbfmt)
```

Data Base Grid Characteristics ============================== Data Base Summary ----------------- File is organized as a regular grid Space dimension = 2 Number of Columns = 3 Total number of samples = 11316 Grid characteristics: --------------------- Origin : 65.000 530.000 Mesh : 4.940 4.960 Number : 82 138 Data Base Extension ------------------- Coor #1 - Min = 65.000 - Max = 465.140 - Ext = 400.14 Coor #2 - Min = 530.000 - Max = 1209.520 - Ext = 679.52 Variables --------- Column = 0 - Name = rank - Locator = NA Column = 1 - Name = x1 - Locator = x1 Column = 2 - Name = x2 - Locator = x2

## Experimental Variogram and fitted Model¶

We compute the experimental variogram **vario2dir** (in 2 directions) (cf. Variography for more details).

```
varioParamMulti = gl.VarioParam.createMultiple(ndir=2, npas=15, dpas=15.)
vario2dir = gl.Vario(varioParamMulti)
err = vario2dir.compute(dat)
```

We then the fit a model **fitmod**

```
fitmod = gl.Model()
types=[gl.ECov.NUGGET, gl.ECov.EXPONENTIAL, gl.ECov.GAUSSIAN]
err = fitmod.fit(vario2dir,types=types)
```

```
ax = gp.varmod(vario2dir, fitmod)
```

## Simple kriging prediction¶

To perform simple kriging, we use the function called `kriging`

.
We specify:

- the
`Db`

object containing the data points (argument`dbin`

) : the variable used for kriging is the (unique) variable of the data base with a`z`

locator (i.e. it should have locator`z1`

and the other variables should not have a locator starting with`z`

) - the
`Db`

object containing the target points, i.e. the points where the kriging predictor will be computed (argument`dbout`

) - the
`Model`

object containing the model used to define the kriging predictor (argument`model`

): in particular, the mean used to define the predictor is the one set in the`Model`

object - the type of neighborhood used in the prediction (argument
`neigh`

), eg. unique neighborhood (to use all the data points for each predictor) or moving neighborhood (to use only the data points in the vicinity of the target point in the prediction). This argument is defined using a "neighborhood" object (see example below).

Additionally, it is possible to specify whether we wish to compute, at each target point, the kriging predictor (argument `flag_est`

, default=`TRUE`

), the kriging standard-deviation (argument `flag_std`

, default=`TRUE`

) and the kriging variance (argument `flag_varz`

, default=`FALSE`

).

The `kriging`

function then adds new variables to the `Db`

entered in the `dbout`

argument corresponding to these variables. The names of these newly created variables will start by `Kriging`

, but this prefix can be changed using the `namconv`

argument of the `kriging function`

In the next example, we perform a simple kriging prediction (with unique neighborhood) of the variable `January_temp`

in the `dat`

data base, on the grid defined in the `grid`

data base. To do so, we start by selecting the variable `January_temp`

in the `dat`

data base (i.e. we make ensure that it is the only variable with a `z`

locator).

```
dat.setLocator("January_temp",gl.ELoc.Z)
dat
```

Data Base Characteristics ========================= Data Base Summary ----------------- File is organized as a set of isolated points Space dimension = 2 Number of Columns = 5 Total number of samples = 236 Variables --------- Column = 0 - Name = rank - Locator = NA Column = 1 - Name = Longitude - Locator = x1 Column = 2 - Name = Latitude - Locator = x2 Column = 3 - Name = Elevation - Locator = NA Column = 4 - Name = January_temp - Locator = z1

We then create a "neighborhood" object corresponding to the specification of a unique neighborhood: this is done using the `NeighUnique`

function.

```
uniqueNeigh = gl.NeighUnique.create()
```

We now call the `kriging`

function to perform the kriging prediction. We use the model `fitmod`

that we previously fitted on our data, require to compute the kriging predictor and its standard-deviation (but not its variance), and change the prefix of the newly created variables to "SK".

```
err = gl.kriging(dbin=dat, dbout=grid, model=fitmod,
neigh=uniqueNeigh,
flag_est=True, flag_std=True, flag_varz=False,
namconv=gl.NamingConvention("SK")
)
```

We see that the kriging predictor and its standard deviation have been added to the `grid`

data base.

```
grid
```

Data Base Grid Characteristics ============================== Data Base Summary ----------------- File is organized as a regular grid Space dimension = 2 Number of Columns = 5 Total number of samples = 11316 Grid characteristics: --------------------- Origin : 65.000 530.000 Mesh : 4.940 4.960 Number : 82 138 Variables --------- Column = 0 - Name = rank - Locator = NA Column = 1 - Name = x1 - Locator = x1 Column = 2 - Name = x2 - Locator = x2 Column = 3 - Name = SK.January_temp.estim - Locator = z1 Column = 4 - Name = SK.January_temp.stdev - Locator = NA

Finally, we plot the kriging prediction over the grid and the data points.

```
gp.setDefaultGeographic(dims=[8,8])
fig, ax = gp.initGeographic()
ax.raster(grid, flagLegend=True)
ax.symbol(dat, c='black')
ax.decoration(title="Simple Kriging over whole Grid")
plt.show()
```

By default, the plotting function plots the variable with locator `z1`

, which in our case corresponds to the kriging predictor (as the `kriging`

function automatically assigns the locator `z1`

to it). To plot another variable, we can simply specify their name.

For instance, we can plot the kriging standard deviation using the following code.

```
fig, ax = gp.initGeographic()
ax.raster(grid, name="SK.January_temp.stdev", flagLegend=True)
ax.symbol(dat, c='black')
ax.decoration(title="Simple Kriging std-dev over whole Grid")
plt.show()
```

As mentioned above, the mean used in the simple kriging predictor is the one set in the `Model`

object supplied in the `kriging`

function. By default, this mean is zero. It can be changed using the `setMean`

method of the `Model`

object.

For instance, considering the model `fitmod`

previously fitted on the data, we can clone it (using the `clone`

method), and assign it a new mean (equal to 4) as follows.

```
fitmodSK = fitmod.clone()
err = fitmodSK.setMean(mean=4)
```

Then, simple kriging is performed using the same command as before, but with the newly created model.

```
err = gl.kriging(dbin=dat, dbout=grid, model=fitmodSK,
neigh=uniqueNeigh,
flag_est=True, flag_std=True, flag_varz=False,
namconv=gl.NamingConvention("Mean4_SK")
)
```

Finally, we plot the new kriging prediction over the grid and the data points.

```
grid
```

Data Base Grid Characteristics ============================== Data Base Summary ----------------- File is organized as a regular grid Space dimension = 2 Number of Columns = 7 Total number of samples = 11316 Grid characteristics: --------------------- Origin : 65.000 530.000 Mesh : 4.940 4.960 Number : 82 138 Variables --------- Column = 0 - Name = rank - Locator = NA Column = 1 - Name = x1 - Locator = x1 Column = 2 - Name = x2 - Locator = x2 Column = 3 - Name = SK.January_temp.estim - Locator = NA Column = 4 - Name = SK.January_temp.stdev - Locator = NA Column = 5 - Name = Mean4_SK.January_temp.estim - Locator = z1 Column = 6 - Name = Mean4_SK.January_temp.stdev - Locator = NA

```
fig, ax = gp.initGeographic()
ax.raster(grid, name="Mean4_SK.January_temp.estim", flagLegend=True)
ax.symbol(dat, c='black')
ax.decoration(title="Simple Kriging over whole Grid: Mean=4")
plt.show()
```

## Ordinary kriging¶

In gstlearn, ordinary kriging is seen as the formulation of a kriging predictor on a model with a constant drift. Hence, to perform ordinary kriging, we use the `kriging`

function with the same syntax as for simple kriging, but call it with a model that includes a constant drift. Adding a constant drift to a model is done with the `addDrift()`

method.

Note: Removing a drift from a model can be done using the `delDrift`

method (while specifying the index of the drift we wish to remove) or using the `delAllDrifts()`

(to remove all the drifts at once).

Let us go back to our running example. Considering the model `fitmod`

previously fitted on the data, we can clone it (using the `clone`

method), and add a constant drift as follows.

```
fitmodOK = fitmod.clone()
err = fitmodOK.addDrift(gl.DriftM())
```

Then, ordinary kriging is performed using the same command as before, but with the newly created model.

```
err = gl.kriging(dbin=dat, dbout=grid, model=fitmodOK,
neigh=uniqueNeigh,
flag_est=True, flag_std=True, flag_varz=False,
namconv=gl.NamingConvention("OK")
)
```

Finally, we plot the new kriging prediction over the grid and the data points.

```
fig, ax = gp.initGeographic()
ax.raster(grid, name="OK.January_temp.estim", flagLegend=True)
ax.symbol(dat, c='black')
ax.decoration(title="Ordinary Kriging over whole Grid")
plt.show()
```

We also plot the kriging standard deviations.

```
fig, ax = gp.initGeographic()
ax.raster(grid, name="OK.January_temp.stdev", flagLegend=True)
ax.symbol(dat, c='black')
ax.decoration(title="Ordinary Kriging std-dev over whole Grid")
plt.show()
```

Let us compare the results from the simple and ordinary kriging predictors. To do so, we create a correlation plot between the two predictors.

```
ax = gp.correlation(grid,namex="OK.January_temp.estim", namey="SK.January_temp.estim",
bissLine=True, bins=100, flagSameAxes=True, cmin=1)
ax.decoration(title="Estimation Simple vs. Ordinary",
xlabel="Ordinary Kriging", ylabel="Simple Kriging")
```

We also compare the kriging standard-deviations obtained in both cases.

```
ax = gp.correlation(grid,namex="OK.January_temp.stdev", namey="SK.January_temp.stdev",
bissLine=True, bins=100, flagSameAxes=True, cmin=1)
ax.decoration(title="Estimation Simple vs. Ordinary",
xlabel="Ordinary Kriging", ylabel="Simple Kriging")
```

## Working with selections¶

We now load new grid.

```
elev_nf = gdoc.loadData("Scotland", "Scotland_Elevations.NF")
grid = gl.DbGrid.createFromNF(elev_nf)
grid.display(dbfmt)
```

Data Base Grid Characteristics ============================== Data Base Summary ----------------- File is organized as a regular grid Space dimension = 2 Number of Columns = 4 Total number of samples = 11097 Number of active samples = 3092 Grid characteristics: --------------------- Origin : 65.000 535.000 Mesh : 4.938 4.963 Number : 81 137 Data Base Extension ------------------- Coor #1 - Min = 65.000 - Max = 455.123 - Ext = 390.123 Coor #2 - Min = 535.000 - Max = 1200.109 - Ext = 665.109 Variables --------- Column = 0 - Name = Longitude - Locator = x1 Column = 1 - Name = Latitude - Locator = x2 Column = 2 - Name = Elevation - Locator = f1 Column = 3 - Name = inshore - Locator = sel

This new grid data base contains a variable called `inshore`

, with a locator `sel`

. This indicates that this variable is a selection, i.e. a binary variable allowing to select some points, called *active cells*, in a data base (by setting them to 1, while the other points will be 0). Selections allow to restrict computations and operations made on a data base to only the active cells. For, when plotting a data base with a selection, only the active cells are represented.

```
fig, ax = gp.initGeographic()
ax.raster(grid, name="Longitude", flagLegend=True)
ax.symbol(dat, c='black')
ax.decoration(title="Longitude values (restricted to the selection)")
plt.show()
```