Neighborhood¶
This file is meant to demonstrate the use of gstlearn for Moving Neighborhood search
import numpy as np
import pandas as pd
import sys
import os
import gstlearn as gl
import gstlearn.plot as gp
import gstlearn.document as gdoc
import matplotlib.pyplot as plt
import random as rnd
gdoc.setNoScroll()
Setting some global variables
# Set the Global Options
verbose = True
flagGraphic = True
# Define the Space Dimension
ndim = 2
gl.defineDefaultSpace(gl.ESpaceType.RN, ndim)
# Set the Seed for the Random Number generator
gl.law_set_random_seed(5584)
rnd.seed(13155)
In this paragraph, we generate a Poisson data set and check various neighborhoods around one specific node of a regular grid.
dxref = 0.1
grid = gl.DbGrid.create(nx=[10,10],dx=[dxref,dxref])
xlim = grid.getExtrema(0)
ylim = grid.getExtrema(1)
Random Point data set¶
coormin = grid.getCoorMinimum()
coormax = grid.getCoorMaximum()
nech = 100
data = gl.Db.createFromBox(nech, coormin, coormax)
gp.setDefaultGeographic(xlim=xlim, ylim=ylim, dims=[7,7])
cmap = gp.getColorMap(100)
ax = data.plot()
Checking standard neighborhood¶
Defining a standard Moving Neighborhood
nmini = 1
nmaxi = 15
radius = 0.3
nsect = 8
nsmax = 3
neigh = gl.NeighMoving.create(False, nmaxi, radius, nmini, nsect, nsmax)
neigh
Moving Neighborhood =================== Minimum number of samples = 1 Maximum number of samples = 15 Number of angular sectors = 8 Maximum number of points per sector = 3 Maximum horizontal distance = 0.3
Checking the neighborhood around a central grid node
node = 55
neigh.attach(data, grid)
ranks = gl.VectorInt()
neigh.select(node, ranks)
dataSel = data.clone()
dum = dataSel.addSelectionByRanks(ranks)
ax = data.plot()
ax = dataSel.plot(color='blue')
ax.neigh(neigh, grid, node, flagCell=False)
ax.decoration("Standard Neighborhood")
Defining variable block extensions¶
In this section, we will generate variables in the Grid which contain the cell extension: this cell extension replaces the constant value of the mesh.
nech = grid.getSampleNumber()
mini = 0.5
maxi = 2.5
blx = []
bly = []
for i in range(nech):
blx.append(dxref * rnd.uniform(mini, maxi))
bly.append(dxref * rnd.uniform(mini, maxi))
dum = grid.addColumns(blx, "X-ext", gl.ELoc.BLEX, 0)
dum = grid.addColumns(bly, "Y-ext", gl.ELoc.BLEX, 1)
The following display shows each block with its center and its cell extension.
ax = data.plot()
ax.decoration(title="Variable Block Size")
for node in range(nech):
ax = gp.sample(grid.getSampleCoordinates(node))
ax = gp.curve(grid.getCellEdges(node))
We choose a specific cell again and determine the standard block neighborhood
node = 56
neigh.attach(data, grid)
ranks = gl.VectorInt()
neigh.select(node, ranks)
dataSel = data.clone()
dum = dataSel.addSelectionByRanks(ranks)
ax = data.plot()
ax = dataSel.plot(color='blue')
ax.neigh(neigh, grid, node, flagCell=True)
ax.decoration(title="Standard Neighborhood")
Use the Cell neighborhood to force the selection of all samples belonging to the block
nmini = 1
neigh = gl.NeighCell.create(False, nmini)
neigh
Cell Neighborhood ================= Reject samples which do not belong to target Block
node = 56
neigh.attach(data, grid)
ranks = gl.VectorInt()
neigh.select(node, ranks)
dataSel = data.clone()
dum = dataSel.addSelectionByRanks(ranks)
ax = data.plot()
ax = dataSel.plot(color='blue')
ax.neigh(neigh, grid, node, flagCell=True)
ax.decoration(title="Neighborhood forced to the Cell")
Data gathered along lines¶
This chapter is meant to illustrate the specific problem when data are located along lines.
We assume that the distance between lines is much larger that the distance within each line. For illustrating this case, we generate the data as the nodes of a regular 2D grid with a grid mesh which is very different along X (across line) and Y (along line). The file is ultimately converted into a Point file to make the rest of the procedure more flexible.µ
grid = gl.DbGrid.create(nx=[20,100], dx=[10,1])
iuid = grid.addColumns(gl.VectorHelper.simulateUniform(grid.getSampleNumber()), "z", gl.ELoc.Z)
point = gl.Db.createReduce(grid)
gp.setDefaultGeographic(xlim='reset',ylim='reset',dims=[14,9])
ax = point.plot(size=2)
Check the neighborhood search for a target located close to one line, say at coordinates [98, 50]
target = gl.Db.createFromOnePoint([98, 50])
Create a standard neighborhood, say with 10 samples per neighborhood and a radius of 7.
nmini = 1
nmaxi = 10
radius = 7
neigh = gl.NeighMoving.create(False, nmaxi, radius, nmini)
neigh
Moving Neighborhood =================== Minimum number of samples = 1 Maximum number of samples = 10 Maximum horizontal distance = 7
neigh.attach(point, target)
ranks = gl.VectorInt()
neigh.select(0, ranks)
pointSel = point.clone()
dum = pointSel.addSelectionByRanks(ranks)
ax = point.plot(size=2)
ax = pointSel.plot(color='blue')
ax.neigh(neigh, target, 0)
ax.decoration("Standard Neighborhood - Small Radius")
With no surprise, all samples are gathered on the closest line. This may be a problem when performing the Estimation (using Kriging) of a variable which is characterized by a First Order Random Function: as a matter of fact, the Linear System is singular if all samples are located on a single line (whatever its orientation). Note that in the case of a Second Order Random Function, we must avoid having samples located on a second order Variety, e.g. a circle, an ellipse or 2 lines).
The first idea is to use the angular sectors: here we consider a new neighborhood adding this constraints when selecting the samples.
nmini = 1
nmaxi = 10
radius = 7
nsect = 8
nsmax = 3
neigh = gl.NeighMoving.create(False, nmaxi, radius, nmini, nsect, nsmax)
neigh
Moving Neighborhood =================== Minimum number of samples = 1 Maximum number of samples = 10 Number of angular sectors = 8 Maximum number of points per sector = 3 Maximum horizontal distance = 7
neigh.attach(point, target)
ranks = gl.VectorInt()
neigh.select(0, ranks)
pointSel = point.clone()
dum = pointSel.addSelectionByRanks(ranks)
ax = point.plot(size=2)
ax = pointSel.plot(color='blue')
ax.neigh(neigh, target, 0)
ax.decoration("Small Radius - Few Sectors")
Despite the use of angular sectors, the samples selected in the neighborhood still belong to a single line. The obvious reason is the size of the neighborhood radius which must be enlarged.
nmini = 1
nmaxi = 10
radius = 20
nsect = 8
nsmax = 3
neigh = gl.NeighMoving.create(False, nmaxi, radius, nmini, nsect, nsmax)
neigh
Moving Neighborhood =================== Minimum number of samples = 1 Maximum number of samples = 10 Number of angular sectors = 8 Maximum number of points per sector = 3 Maximum horizontal distance = 20
neigh.attach(point, target)
ranks = gl.VectorInt()
neigh.select(0, ranks)
pointSel = point.clone()
dum = pointSel.addSelectionByRanks(ranks)
ax = point.plot(size=2)
ax = pointSel.plot(color='blue')
ax.neigh(neigh, target, 0)
ax.decoration("Large Radius - Few Sectors")
The rule is now to consider each sector in turn and to select the sample closest to the target. This time, selected samples belong to more than a single line. The solution then is to increase the number of angular sectors so as to make it equal to the number of samples per neighborhood.
In that case, we will have better tendency to reach a line located further from the target, although this is not guaranteed.
nmini = 1
nmaxi = 20
radius = 20
nsect = 20
nsmax = 1
neigh = gl.NeighMoving.create(False, nmaxi, radius, nmini, nsect, nsmax)
neigh
Moving Neighborhood =================== Minimum number of samples = 1 Maximum number of samples = 20 Number of angular sectors = 20 Maximum number of points per sector = 1 Maximum horizontal distance = 20
neigh.attach(point, target)
ranks = gl.VectorInt()
neigh.select(0, ranks)
pointSel = point.clone()
dum = pointSel.addSelectionByRanks(ranks)
ax = point.plot(size=2)
ax = pointSel.plot(color='blue')
ax.neigh(neigh, target, 0)
ax.decoration("Large Radius - Many Sectors")
A idea could be to introduce a search ellipse (rather than a circle). The following example is the proof that this does not provide a valuable solution: it modifies the selected samples but does not extend the neighbrohood pattern.
nmini = 1
nmaxi = 20
radius = 20
nsect = 20
nsmax = 1
neigh = gl.NeighMoving.create(False, nmaxi, radius, nmini, nsect, nsmax, coeffs=[2,1])
neigh
Moving Neighborhood =================== Minimum number of samples = 1 Maximum number of samples = 20 Number of angular sectors = 20 Maximum number of points per sector = 1 Anisotropic Ranges : 40.000 20.000
neigh.attach(point, target)
ranks = gl.VectorInt()
neigh.select(0, ranks)
pointSel = point.clone()
dum = pointSel.addSelectionByRanks(ranks)
ax = point.plot(size=2)
ax = pointSel.plot(color='blue')
ax.neigh(neigh, target, 0)
ax.decoration("Ellipse Radius - Many Sectors")
Obviously, to reach samples located on more remote lines, we should extend the anisotropy dramatically (in the horizontal direction). Note that this would correspond to an exaggeration which would be exactly opposite to the direction of large density of the information (vertical direction).