In this preamble, we load the gstlearn library and clean the workspace.
rm(list=ls())
library(gstlearn)
library(ggplot2)
library(ggpubr)
library(ggnewscale)
plot.setDefaultGeographic(dims=c(8,8))
We load two data bases:
dat containing point observations of two variables across Scotland: the elevation (Elevation) and the temperature (January_temp, then renamed as Temperature)target containing a grid of points covering Scotland with a selection variable (inshore) selecting the points that are on land, and a variable (Elevation) giving the elevation at every point on land## Data points
fileNF = loadData("Scotland", "Scotland_Temperatures.NF")
dat = Db_createFromNF(fileNF)
### Rename the temperature variable
dat$setName("*temp", "Temperature")
## Target grid
fileNF = loadData("Scotland", "Scotland_Elevations.NF")
target = DbGrid_createFromNF(fileNF)
The target data base a (grid) map of the elevation across Scotland.
target
##
## Data Base Grid Characteristics
## ==============================
##
## Data Base Summary
## -----------------
## File is organized as a regular grid
## Space dimension = 2
## Number of Columns = 4
## Total number of samples = 11097
## Number of active samples = 3092
##
## Grid characteristics:
## ---------------------
## Origin : 65.000 535.000
## Mesh : 4.938 4.963
## Number : 81 137
##
## Variables
## ---------
## Column = 0 - Name = Longitude - Locator = x1
## Column = 1 - Name = Latitude - Locator = x2
## Column = 2 - Name = Elevation - Locator = f1
## Column = 3 - Name = inshore - Locator = sel
Let us plot its content.
ggDefaultGeographic() + plot(target, nameRaster="Elevation",
flagLegendRaster=TRUE,palette="Spectral",
legendNameRaster="Elevation (m)")
The dat data base contains 236 (point) samples of two variables across Scotland: elevation and temperature.
dat
##
## Data Base Characteristics
## =========================
##
## Data Base Summary
## -----------------
## File is organized as a set of isolated points
## Space dimension = 2
## Number of Columns = 5
## Total number of samples = 236
##
## Variables
## ---------
## Column = 0 - Name = rank - Locator = NA
## Column = 1 - Name = Longitude - Locator = x1
## Column = 2 - Name = Latitude - Locator = x2
## Column = 3 - Name = Elevation - Locator = NA
## Column = 4 - Name = Temperature - Locator = z1
We can use the dbStatisticsPrint function to compute statistics on variables of a Db. We specify
Db of interest (argument db),names),opers). This last argument is set through a EStatOption_fromKeys function, which we call with a vector containing the (abbreviated) names of the statistics (run EStatOption_printAll() for the full list)flagIso allowing to specify whether we want to compute the statistics for each variable separately (flagIso=TRUE), or whether we want to compute "filtered" statistics (flagIso=FALSE). In the latter case, we compute the statistics while only using the points where all the variables are defined.For instance, let us count the number of observations of each variable using the dbStatisticsPrint.
dbStatisticsPrint(dat, names = c("Elevation", "Temperature"),
opers=EStatOption_fromKeys(c("NUM")),
flagIso = FALSE, title="Number of observations", radix="")
##
## Number of observations
## ----------------------
## Number
## Elevation 236
## Temperature 151
## NULL
Since the data base dat contains 236 samples, we can conclude that the Elevation variable is defined at every point, but not the Temperature variable. Similarly, we can compute the mean and the extreme values of each variable in the observation data base.
dbStatisticsPrint(dat, names = c("Elevation", "Temperature"),
opers=EStatOption_fromKeys(c("MEAN","MINI","MAXI")),
flagIso = FALSE, title="Statistics of observations", radix="")
##
## Statistics of observations
## --------------------------
## Minimum Maximum Mean
## Elevation 2.000 800.000 146.441
## Temperature 0.600 5.200 2.815
## NULL
Finally, we compute the mean and the extreme values of the elevation in the target data base.
dbStatisticsPrint(target, names = c("Elevation"),
opers=EStatOption_fromKeys(c("MEAN","MINI","MAXI")),
flagIso = FALSE, title="Statistics of target", radix="")
##
## Statistics of target
## --------------------
## Minimum Maximum Mean
## Elevation 0.000 1270.000 241.152
## NULL
We can then compute the filtered mean, minimum and maximum for the Elevation and Temperature variables as follows:
dbStatisticsPrint(dat, names = c("Elevation", "Temperature"),
opers=EStatOption_fromKeys(c("MEAN","MINI","MAXI")),
flagIso = TRUE, title="Filtered statistics of observations", radix="")
##
## Filtered statistics of observations
## -----------------------------------
## Minimum Maximum Mean
## Elevation 3.000 387.000 87.974
## Temperature 0.600 5.200 2.815
## NULL
As explained above, the first row of the table contains contains the mean, the minimum and the maximum of the observations of the Elevation variable, over all the locations where both the Elevation na Temperature variables are defined (i.e. in our case, all the points where the temperature is defined). Hence, we see that the points where the temperature is observed are located at relatively low altitudes within the Scottish landscape.
To confirm that, we plot the points where the temperature is sampled on top of the elevation map of Scotland.
p = ggDefaultGeographic()
p = p + plot.grid(target, nameRaster="Elevation",
flagLegendRaster=TRUE,
palette="Spectral",
legendNameRaster="Elevation (m)")
#p = p + new_scale_color()
#p = p + new_scale("linetype")
p = p + plot.point(dat, nameSize="Temperature",
sizmin = 0.5, sizmax = 3,
flagLegendSize=TRUE,legendNameSize="°C")
ggPrint(p)