Introduction

The gstlearn R package is a cross-platform R package wrapping the gstlearn C++ library. It gives R users access to the well-known geostatistical methodologies developed or invented by the Geostatistics Team of the Geosciences Research Center. It is the successor of the RGeostats R package.

To install the gstlearn R Package, you need R 4.2 (or higher). You can then execute the following R command:

install.packages("gstlearn",repos="https://soft.mines-paristech.fr/cran")
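The version requirement stated above can be checked from within R before installing. The following is only a sketch: `r_version_ok` is a hypothetical helper written for this illustration and is not part of gstlearn.

```r
# Hypothetical helper (not part of gstlearn): returns TRUE when the given
# R version is at least the minimum required version (4.2 by default).
r_version_ok = function(version = getRversion(), minimum = "4.2") {
  package_version(as.character(version)) >= package_version(minimum)
}

# Report whether the running R session meets the requirement.
if (r_version_ok()) {
  message("R ", as.character(getRversion()), " is recent enough for gstlearn")
} else {
  message("R ", as.character(getRversion()), " is too old: gstlearn needs R 4.2 or higher")
}
```

If the check reports a sufficient version, the install.packages command can be run as shown.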

About C++ & R

The gstlearn R package is generated using a particular version of SWIG. As an R user, you are probably more familiar with the Rcpp package. We chose SWIG in order to share the wrapper code of the gstlearn C++ library across several target languages.

The documentation of classes and functions is provided with the gstlearn C++ library as HTML files generated by Doxygen. Please refer to the gstlearn C++ library API for more details. Only the public methods are exported by SWIG, so only these are relevant in the R package.

There is currently no R documentation for the gstlearn R package. Users can refer to the C++ documentation and adapt the code to R by following these “conversion rules”:

  • C++ classes are automatically converted into S4 classes. After creating an instance of an S4 class, its methods (i.e. class functions) must be called by applying the $ operator to the instance (i.e. an object of that class); see db$display() in the example below.

  • If you ask for the class of a gstlearn object in R (e.g. class(mygrid)), you will obtain the C++ class name prefixed with ‘_p_’ (e.g. _p_DbGrid).

  • In gstlearn functions, when indices are mentioned, they start at 0 (following the C++ convention). For example, the first variable is the variable with index 0.

  • Static C++ methods (e.g. the createFromNF method) defined in a class (e.g. DbGrid) are renamed by joining the class name and the method name (e.g. DbGrid_createFromNF). Note: static methods do not apply to object instances (e.g. mygrid$createFromNF() makes no sense).

  • Static C++ variables (e.g. X locator) defined in a class (e.g. ELoc ‘enum’ class) must be accessed in R using special functions named following the same rules as static methods (e.g. ELoc_X())

  • All basic C++ types (double, int, bool, etc…) are automatically converted to/from R native types (numeric, integer, logical,…)

  • The C++ classes VectorDouble, VectorInt, etc… are automatically converted to/from R vectors (e.g. c(1,2,3))

  • The C++ classes VectorVectorDouble, VectorVectorInt, etc… are automatically converted to/from lists of R vectors (e.g. list(c(1,2,3), c(4,5,6)))

  • Some classes of the gstlearn C++ library have been extended in R:

    • Almost all classes are ‘stringable’ (those which inherit from AStringable). This means that you can type the object name at the R console prompt and press the ‘Enter’ key to obtain a detailed description of the object content. The same output text is obtained using the display method (e.g. mygrid$display()).
    • Some classes have an additional R method named toTL (i.e. ‘to Target Language’) that converts an object into the corresponding R type. For example, the instruction df = mygrid$toTL() creates an R data.frame from a Db object. In that case, the newly created data.frame contains all variables from the Db, but locators and grid parameters (for DbGrid) are lost.
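The conversion rules above can be sketched in a few lines. This is only an illustration: `to_gstlearn_index` is a hypothetical helper (not part of gstlearn), and the gstlearn part uses only functions named in this document and is skipped when the package is not installed.

```r
# Hypothetical helper (not part of gstlearn): convert a 1-based R position
# into the 0-based index expected by gstlearn functions.
to_gstlearn_index = function(r_position) {
  as.integer(r_position) - 1L
}
to_gstlearn_index(1)   # first variable -> index 0

# R values that are converted to/from the C++ vector classes
vec  = c(1, 2, 3)                      # VectorDouble
vvec = list(c(1, 2, 3), c(4, 5, 6))    # VectorVectorDouble

# The gstlearn part only runs when the package is available.
if (requireNamespace("gstlearn", quietly = TRUE)) {
  library(gstlearn)
  mygrid = DbGrid_create(nx = c(100, 100))  # static method: <Class>_<method>
  print(class(mygrid))                      # C++ class name with the '_p_' prefix
  mygrid$display()                          # instance method called with $
  x_locator = ELoc_X()                      # static variable accessed via a function
  grid_df = mygrid$toTL()                   # Db converted to an R data.frame
  str(grid_df)
}
```

Keeping the index shift in one small helper makes it harder to forget the off-by-one difference between R positions and gstlearn indices.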

Loading the package

library(gstlearn)
library(ggplot2)

Calling the next function (acknowledge_gstlearn) at startup is good practice, as it lets you check the version of gstlearn you are currently running:

#acknowledge_gstlearn()
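Independently of acknowledge_gstlearn, base R can report the installed version of any package. The sketch below relies only on the standard `requireNamespace` and `utils::packageVersion` functions; `installed_version` is a hypothetical helper name chosen for this illustration.

```r
# Hypothetical helper: return the installed version of a package as a string,
# or NA when the package is not installed.
installed_version = function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# Version of gstlearn in this R library (NA if it is not installed)
print(installed_version("gstlearn"))
```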

First code: Create and display a database

We create a regular 2-D grid and simulate a variable using a geostatistical model:

db = DbGrid_create(nx=c(100,100))
model = Model_createFromParam(type = ECov_CUBIC(), range = 30)
err = simtub(NULL, db, model, nbtuba=1000)

The simulated result is then plotted:

p = ggDefaultGeographic()
p = p + plot.grid(db)
p = p + plot.decoration(title="Check is successful!")
ggPrint(p)

If you obtain a nice-looking image corresponding to the simulation result on the grid, the installation of gstlearn is successful. Here is the description of the content of your grid database:

db$display()
## 
## Data Base Grid Characteristics
## ==============================
## 
## Data Base Summary
## -----------------
## File is organized as a regular grid
## Space dimension              = 2
## Number of Columns            = 4
## Maximum Number of UIDs       = 4
## Total number of samples      = 10000
## 
## Grid characteristics:
## ---------------------
## Origin :      0.000     0.000
## Mesh   :      1.000     1.000
## Number :        100       100
## 
## Variables
## ---------
## Column = 0 - Name = rank - Locator = NA
## Column = 1 - Name = x1 - Locator = x1
## Column = 2 - Name = x2 - Locator = x2
## Column = 3 - Name = Simu - Locator = z1
## NULL
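The simulated variable can also be inspected with ordinary R tools once the grid is converted with toTL. The sketch below repeats the grid creation and simulation so that it is self-contained, and assumes the simulated column is named Simu as in the listing above; it is skipped when gstlearn is not installed.

```r
# Only runs when gstlearn is available.
if (requireNamespace("gstlearn", quietly = TRUE)) {
  library(gstlearn)

  # Same steps as in the example: grid, model, simulation
  db = DbGrid_create(nx = c(100, 100))
  model = Model_createFromParam(type = ECov_CUBIC(), range = 30)
  err = simtub(NULL, db, model, nbtuba = 1000)

  # Convert the Db into a data.frame (locators and grid parameters are lost)
  sim_df = db$toTL()
  print(dim(sim_df))            # number of rows and columns
  print(summary(sim_df$Simu))   # basic statistics of the simulated variable
}
```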

RGeostats to gstlearn

For former RGeostats users, here is a description of some new concepts introduced in gstlearn. We are currently working hard to improve the user experience. Some of the following differences are drawbacks that will be fixed in the near future.

# Create a Neutral File storing the content of db
db$dumpToNF("mygrid.backup") # This could produce an error - see below
#...
# Restart the R session
#...
# Load the Neutral File and recreate db
db = DbGrid_createFromNF("mygrid.backup")

Known caveats

When calling a method with the $ operator, you may encounter the following error (shown here as printed by a French-locale R session):

## Erreur dans validObject(.Object) : 
##   objet de classe “MethodWithNext” incorrect: Error : C stack usage  7972404 is too close to the limit

… there are two possible reasons:

  • You made a mistake in the method name.

  • R cannot find the method because it is defined in an inherited class. For example, dumpToNF is a method of ASerializable, and db$dumpToNF currently produces this error.

In the second case, you can still use the generic function behind the method by calling:

ASerializable_dumpToNF(db, "mygrid.backup")
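Combined with the reload step, the workaround gives a complete save and restore round trip. The sketch below uses only functions named in this document, writes mygrid.backup to the current working directory, and is skipped when gstlearn is not installed.

```r
# Only runs when gstlearn is available.
if (requireNamespace("gstlearn", quietly = TRUE)) {
  library(gstlearn)

  # A grid to save
  db = DbGrid_create(nx = c(100, 100))

  # Save through the generic function instead of db$dumpToNF(...)
  ASerializable_dumpToNF(db, "mygrid.backup")

  # Reload the Neutral File into a new object and describe it
  restored = DbGrid_createFromNF("mygrid.backup")
  restored$display()
}
```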