Pluri-Gaussian¶

In [1]:
import pandas as pd
import gstlearn as gl
import gstlearn.plot as gp
import gstlearn.document as gdoc

gdoc.setNoScroll()

Prepare the basic gstlearn objects¶

Initial objects are located in a specific Directory which is defined in the next Container operation. Note that this operation automatically modifies the names of all the Files retrieved using the Serialize / Deserialize operations (not when reading from CSV). Also note that the Container name must end with a "/" (as it serves as a Directory).
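For illustration only, a Container could be declared as sketched below. No Container is actually defined in this notebook: the helper name and its signature are an assumption about the gstlearn API, and the directory name is purely hypothetical.

# Hypothetical sketch: declare a Container (directory) used when reading and
# writing Neutral Files; the call below is assumed, adapt it to your setup.
# Note the trailing "/" in the Container name.
gl.ASerializable.setContainerName(True, "./MyContainer/")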

In [2]:
ndim = 2
gl.defineDefaultSpace(gl.ESpaceType.RN, ndim)

Load the data file¶

The data are loaded from a CSV file.

In [3]:
filename = gdoc.loadData("BRGM", "Nitrates_LANU.csv")
datCat = pd.read_csv(filename, sep=";")
datCat.head()
Out[3]:
      X            Y             LANU  HyGeo  NO3    BG
0     428575.6629  5.541444e+06  2     1      0.5    -99.0
1     431573.6549  5.542124e+06  2     1      150.0  -99.0
2     431565.6549  5.542160e+06  2     1      150.0  -99.0
3     435820.6447  5.543920e+06  3     1      78.0   -99.0
4     435955.6443  5.543910e+06  3     1      64.0   -99.0
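For information (a small sketch added here, relying only on pandas), the number of samples and the distinct LANU categories can be checked directly on the data frame:

# Sketch: count the samples and list the distinct land-use (LANU) categories
print(len(datCat), "samples")
print(sorted(datCat["LANU"].unique()))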

Loading polygon from a file¶

The polygon is created by deserializing the Polygon Neutral File.

In [4]:
filename = gdoc.loadData("BRGM", "poly_LANU.ascii")
poly = gl.Polygons.createFromNF(filename)
poly
Out[4]:
Polygons
--------
Number of Polygon Sets = 1

Creation of the gstlearn data base¶

In [5]:
dat = gl.Db()
fields = ["X", "Y", "LANU"]
dat[fields] = datCat[fields].values

Specification of the role of each variable (named "locators" in gstlearn)¶

In [6]:
dat.setLocators(["X", "Y"], gl.ELoc.X)  # Coordinates
dat.setLocator("LANU", gl.ELoc.Z)  # Variable of interest
dat
Out[6]:
Data Base Characteristics
=========================

Data Base Summary
-----------------
File is organized as a set of isolated points
Space dimension              = 2
Number of Columns            = 3
Total number of samples      = 1691

Variables
---------
Column = 0 - Name = X - Locator = x1
Column = 1 - Name = Y - Locator = x2
Column = 2 - Name = LANU - Locator = z1

Creation of the output grid¶

The output grid will contain 47 x 101 nodes. It is built to cover the data extent plus a margin of 50000 x 50000.

In [7]:
Result = gl.DbGrid.createCoveringDb(dat, [47, 101], [], [], [50000, 50000])

Add a selection (mask the cells outside the polygon)¶

The initial selection (obtained by applying the Polygon to the grid data base) must be dilated in order to avoid edge effects.

In [8]:
gl.db_polygon(Result, poly)
Result.setLocator("Polygon", gl.ELoc.Z)
Result.morpho(gl.EMorpho.DILATION, 0.5, 1.5, option=0, verbose=False, radius=[1, 1])
Result["Polygon"] = Result["Morpho.Polygon.*"]
Result.deleteColumn("Morpho.Polygon.*")
Result.setLocator("Polygon", gl.ELoc.SEL)
In [9]:
fig, ax = gp.init(figsize=[10, 10], flagEqual=True)
gp.plot(Result, "Polygon", useSel=False)
gp.plot(dat, nameColor="LANU", s=2)
gp.polygon(poly, linewidth=1, edgecolor="r")
gp.decoration(title="Initial information")
[Figure: Initial information]

Computation of the proportions¶

Compute global proportions (for information)¶

In [10]:
propGlob = gl.dbStatisticsFacies(dat)
ncat = len(propGlob)
for i in range(ncat):
    print("Proportion of facies " + str(i + 1), "=", round(propGlob[i], 3))
Proportion of facies 1 = 0.273
Proportion of facies 2 = 0.47
Proportion of facies 3 = 0.046
Proportion of facies 4 = 0.205
Proportion of facies 5 = 0.007
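As a quick cross-check (a sketch, not part of the original workflow), the same global proportions can be recomputed directly from the pandas data frame loaded above:

# Sketch: recompute the global facies proportions straight from the CSV data;
# the result should match dbStatisticsFacies up to rounding.
print(datCat["LANU"].value_counts(normalize=True).sort_index().round(3))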

Compute local proportions¶

The next parts will be simplified in a future dedicated API.

Creation of the spatial regularization model for proportions¶

In [11]:
model = gl.Model.createFromDb(Result)
cova = gl.CovAniso.createIsotropic(
    model.getContext(),
    gl.ECov.MATERN,
    range=50000.0,
    param=2.0,
    sill=1.0,
)
model.addCov(cova)
In [12]:
err = gl.db_proportion_estimate(dat, Result, model)
In [13]:
dbfmt = gl.DbStringFormat()
dbfmt.setFlags(flag_stats=True)
dbfmt.setNames(["Prop.*"])
Result.display(dbfmt)
Data Base Grid Characteristics
==============================

Data Base Summary
-----------------
File is organized as a regular grid
Space dimension              = 2
Number of Columns            = 8
Total number of samples      = 4747
Number of active samples     = 752

Grid characteristics:
---------------------
Origin :  306764.283 5218227.694
Mesh   :    4787.785   3803.924
Number :          47        101

Data Base Statistics
--------------------
4 - Name Prop.1 - Locator p1
 Nb of data          =        4747
 Nb of active values =         752
 Minimum value       =       0.014
 Maximum value       =       0.867
 Mean value          =       0.267
 Standard Deviation  =       0.153
 Variance            =       0.023
5 - Name Prop.2 - Locator p2
 Nb of data          =        4747
 Nb of active values =         752
 Minimum value       =       0.002
 Maximum value       =       0.937
 Mean value          =       0.481
 Standard Deviation  =       0.211
 Variance            =       0.044
6 - Name Prop.3 - Locator p3
 Nb of data          =        4747
 Nb of active values =         752
 Minimum value       =       0.000
 Maximum value       =       0.427
 Mean value          =       0.053
 Standard Deviation  =       0.052
 Variance            =       0.003
7 - Name Prop.4 - Locator p4
 Nb of data          =        4747
 Nb of active values =         752
 Minimum value       =       0.002
 Maximum value       =       0.975
 Mean value          =       0.193
 Standard Deviation  =       0.184
 Variance            =       0.034
8 - Name Prop.5 - Locator p5
 Nb of data          =        4747
 Nb of active values =         752
 Minimum value       =       0.000
 Maximum value       =       0.129
 Mean value          =       0.007
 Standard Deviation  =       0.014
 Variance            =       0.000

Variables
---------
Column = 0 - Name = x1 - Locator = x1
Column = 1 - Name = x2 - Locator = x2
Column = 2 - Name = Polygon - Locator = sel
Column = 3 - Name = Prop.1 - Locator = p1
Column = 4 - Name = Prop.2 - Locator = p2
Column = 5 - Name = Prop.3 - Locator = p3
Column = 6 - Name = Prop.4 - Locator = p4
Column = 7 - Name = Prop.5 - Locator = p5
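As an optional sanity check (a sketch, not in the original notebook), the five local proportions should sum to 1 on every node where they are defined. The check below assumes that extracting a column with Result[...] returns one value per grid node, with NaN on nodes where the proportion is undefined.

import numpy as np

# Sketch: stack the five proportion columns and verify that they sum to 1
# wherever all of them are defined (undefined nodes are assumed to hold NaN).
props = np.column_stack([Result["Prop." + str(i + 1)] for i in range(ncat)])
active = ~np.isnan(props).any(axis=1)
total = props[active].sum(axis=1)
print("min =", total.min(), "- max =", total.max())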

Display the results¶

In [14]:
for i in range(ncat):
    fig, ax = gp.init(figsize=(10, 10), flagEqual=True)
    ax.raster(Result, name="Prop." + str(i + 1))
    ax.decoration(title="Proportion Facies #" + str(i + 1))
    ax.symbol(dat, nameColor="LANU", s=2, c="black")

    dat.addSelectionByLimit("LANU", gl.Limits((i + 1, i + 1)), "SelPoint")
    ax.symbol(dat, nameColor="LANU", s=0.8, c="red")
    dat.deleteColumn("SelPoint")
    ax.polygon(poly, linewidth=1, edgecolor="r")
[Figures: Proportion maps for Facies #1 to #5, with the LANU data points and the polygon overlaid]

We now create the environment used to infer the lithotype Rule. It relies on a variogram calculated over very few lags close to the origin.

In [15]:
varioParam = gl.VarioParam()
dirparam = gl.DirParam.create(nlag=2, dlag=100)
varioParam.addDir(dirparam)
ruleprop = gl.RuleProp.createFromDb(Result)
ruleprop.fit(dat, varioParam, 1)
ngrf = ruleprop.getRule().getNGRF()
print("Number of GRF =", ngrf)
Number of GRF = 1
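Before plotting it, the inferred Rule can also be printed in order to inspect its characteristics; a minimal sketch relying on the display() method used elsewhere in this notebook (its output is not reproduced here):

# Sketch: print the characteristics of the inferred lithotype Rule
ruleprop.getRule().display()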
In [16]:
res = gp.plot(ruleprop.getRule())
[Figure: Lithotype Rule]
In [17]:
dirparam = gl.DirParam.create(nlag=19, dlag=1000)
covparam = gl.VarioParam()
covparam.addDir(dirparam)
cov = gl.variogram_pgs(dat, covparam, ruleprop);
In [18]:
cov.display()
Non-centered Covariance characteristics
=======================================
Number of variable(s)       = 1
Number of direction(s)      = 1
Space dimension             = 2
Variance-Covariance Matrix      1.000

Direction #1
------------
Number of lags              = 19
Direction coefficients      =       1.000      0.000
Direction angles (degrees)  =       0.000
Tolerance on direction      =      90.000 (degrees)
Calculation lag             =    1000.000
Tolerance on distance       =      50.000 (Percent of the lag value)

For variable 1
       Rank     Npairs   Distance      Value
        -19   9732.000  17998.608      0.017
        -18   9506.000  17006.901     -0.017
        -17   8885.000  16006.927     -0.017
        -16   8815.000  15002.837      0.000
        -15   8068.000  14003.176      0.000
        -14   8019.000  12999.786      0.000
        -13   7684.000  12002.331     -0.052
        -12   7363.000  11000.802     -0.017
        -11   6935.000  10007.370     -0.039
        -10   6310.000   9006.739     -0.039
         -9   5975.000   8003.635     -0.039
         -8   5615.000   7013.830     -0.017
         -7   5035.000   6014.805     -0.052
         -6   4397.000   5019.156     -0.017
         -5   3682.000   4012.782     -0.073
         -4   3011.000   3019.326      0.000
         -3   2345.000   2038.996      0.094
         -2   1290.000   1055.802      0.219
         -1    238.000    307.621      0.781
          0   1691.000      0.000      1.000
          1    238.000   -307.621      0.781
          2   1290.000  -1055.802      0.219
          3   2345.000  -2038.996      0.094
          4   3011.000  -3019.326      0.000
          5   3682.000  -4012.782     -0.073
          6   4397.000  -5019.156     -0.017
          7   5035.000  -6014.805     -0.052
          8   5615.000  -7013.830     -0.017
          9   5975.000  -8003.635     -0.039
         10   6310.000  -9006.739     -0.039
         11   6935.000 -10007.370     -0.039
         12   7363.000 -11000.802     -0.017
         13   7684.000 -12002.331     -0.052
         14   8019.000 -12999.786      0.000
         15   8068.000 -14003.176      0.000
         16   8815.000 -15002.837      0.000
         17   8885.000 -16006.927     -0.017
         18   9506.000 -17006.901     -0.017
         19   9732.000 -17998.608      0.017

We extract the experimental variograms of each GRF.

In [19]:
vario1 = gl.Vario.createReduce(cov, [0], [], True)
if ngrf > 1:
    vario2 = gl.Vario(cov)
    vario2.resetReduce([1], [], True)
In [20]:
vario1.display()
if ngrf > 1:
    vario2.display()
Variogram characteristics
=========================
Number of variable(s)       = 1
Number of direction(s)      = 1
Space dimension             = 2
Variance-Covariance Matrix      1.000

Direction #1
------------
Number of lags              = 19
Direction coefficients      =       1.000      0.000
Direction angles (degrees)  =       0.000
Tolerance on direction      =      90.000 (degrees)
Calculation lag             =    1000.000
Tolerance on distance       =      50.000 (Percent of the lag value)

For variable 1
       Rank     Npairs   Distance      Value
          0    238.000    307.621      0.219
          1   1290.000   1055.802      0.781
          2   2345.000   2038.996      0.906
          3   3011.000   3019.326      1.000
          4   3682.000   4012.782      1.073
          5   4397.000   5019.156      1.017
          6   5035.000   6014.805      1.052
          7   5615.000   7013.830      1.017
          8   5975.000   8003.635      1.039
          9   6310.000   9006.739      1.039
         10   6935.000  10007.370      1.039
         11   7363.000  11000.802      1.017
         12   7684.000  12002.331      1.052
         13   8019.000  12999.786      1.000
         14   8068.000  14003.176      1.000
         15   8815.000  15002.837      1.000
         16   8885.000  16006.927      1.017
         17   9506.000  17006.901      1.017
         18   9732.000  17998.608      0.983

We now fit a model for each GRF, each one being considered as independent.

The fit is performed under the constraint that the total sill equals 1, since the underlying GRFs are standardized (unit variance) Gaussian Random Functions.

In [21]:
ctxt = gl.CovContext(1, 2)  # use default space
constraints = gl.Constraints()
constraints.setConstantSillValue(1.0)
covs = [gl.ECov.MATERN]

modelPGS1 = gl.Model(ctxt)
modelPGS1.fit(vario1, covs, constraints)
modelPGS1.display()

if ngrf > 1:
    modelPGS2 = gl.Model(ctxt)
    modelPGS2.fit(vario2, covs, constraints)
    modelPGS2.display()
else:
    modelPGS2 = None
Model characteristics
=====================
Space dimension              = 2
Number of variable(s)        = 1
Number of basic structure(s) = 1
Number of drift function(s)  = 0
Number of drift equation(s)  = 0

Covariance Part
---------------
Matern (Third Parameter = 1.04026)
- Sill         =       1.000
- Range        =    1652.384
- Theo. Range  =     467.681
Total Sill     =       1.000
Known Mean(s)      0.000

For each GRF, we can plot the experimental variogram as well as the fitted model.

In [22]:
res = gp.varmod(vario1, modelPGS1)
if ngrf > 1:
    res = gp.varmod(vario2, modelPGS2)
[Figure: experimental variogram and fitted model of the GRF]

In this section, we compare the experimental indicator variograms to those derived from the Model of the underlying GRF(s).

In [23]:
dirparamindic = gl.DirParam.create(nlag=19, dlag=1000)
varioparamindic = gl.VarioParam()
varioparamindic.addDir(dirparamindic)
varioindic = gl.Vario(varioparamindic)
err = varioindic.computeIndic(dat)
In [24]:
varioindic2 = gl.model_pgs(dat, varioparamindic, ruleprop, modelPGS1, modelPGS2);
In [25]:
gp.varmod(varioindic, varioLinestyle="solid")
gp.geometry(dims=[10, 10])
gp.varmod(varioindic2, varioLinestyle="dashed")
gp.close()
[Figure: experimental indicator variograms (solid) versus indicator variograms derived from the PluriGaussian model (dashed)]
Finally, we perform the PluriGaussian simulation on the output grid, conditioned on the data and using a Unique Neighborhood.

In [26]:
neigh = gl.NeighUnique.create()
err = gl.simpgs(dat, Result, ruleprop, modelPGS1, modelPGS2, neigh)
In [27]:
gp.plot(Result)
gp.geometry(dims=[10, 10])
[Figure: simulated facies on the output grid]
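As a final optional check, the proportions of the simulated facies could be compared with the global proportions computed earlier. This is only a sketch: the name of the variable created by simpgs (assumed here to match "SimPGS*") and the use of a wildcard in setLocator are assumptions, not taken from this notebook.

# Sketch: compare simulated facies proportions with the global data proportions.
# The output variable name ("SimPGS*") and the wildcard in setLocator are
# assumptions; adapt them to the variable actually created by simpgs.
Result.setLocator("SimPGS*", gl.ELoc.Z)
propSimu = gl.dbStatisticsFacies(Result)
for i in range(len(propSimu)):
    print("Facies", i + 1, ": simulated =", round(propSimu[i], 3),
          "- data =", round(propGlob[i], 3))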