Title: | Significance Tests for Palaeoenvironmental Reconstructions |
---|---|
Description: | Several tests of quantitative palaeoenvironmental reconstructions from microfossil assemblages, including the null model tests of the statistically significant of reconstructions developed by Telford and Birks (2011) <doi:10.1016/j.quascirev.2011.03.002>, and tests of the effect of spatial autocorrelation on transfer function model performance using methods from Telford and Birks (2009) <doi:10.1016/j.quascirev.2008.12.020> and Trachsel and Telford (2016) <doi:10.5194/cp-12-1215-2016>. Age-depth models with generalized mixed-effect regression from Heegaard et al (2005) <doi:10.1191/0959683605hl836rr> are also included. |
Authors: | Richard Telford [aut, cre, cph], Mathias Trachsel [ctb] |
Maintainer: | Richard Telford <[email protected]> |
License: | GPL-3 |
Version: | 2.1-2 |
Built: | 2024-11-24 04:32:46 UTC |
Source: | https://github.com/richardjtelford/palaeosig |
Significance tests for quantitative palaeoenvironmental reconstructions derived from transfer functions. Functions from the autocorTF package are now included in palaeoSig.
This package includes:
significance tests for quantitative palaeoenvironmental reconstructions (randomTF
, obs.cor
)
graphical methods to show autocorrelation in transfer functions (rne
)
null model test of transfer functions performance in a spatially autocorrelated environment - see vignette.
Several functions have from autocorTF version 1.0 and palaeoSig version 1.0 have been rewritten or replaced with more flexible or user friendly functions. See news(package="palaeoSig")
for details.
See also my blog at https://quantpalaeo.wordpress.com/
Package: | palaeoSig |
Type: | Package |
Version: | 2.0-7 |
Date: | 2022-11-29 |
License: | GPL - 3 |
LazyLoad: | yes |
Richard Telford [email protected]
Telford, R. J. and Birks, H. J. B. (2009) Evaluation of transfer functions in spatially structured environments. Quaternary Science Reviews 28: 1309–1316. doi:10.1016/j.quascirev.2008.12.020
Telford, R. J. and Birks, H. J. B. (2011) A novel method for assessing the statistical significance of quantitative reconstructions inferred from biotic assemblages. Quaternary Science Reviews 30: 1272–1278. doi:10.1016/j.quascirev.2011.03.002
Generates species abundances based on species response functions and environmental variables.
abundances(env, spp, nc)
abundances(env, spp, nc)
env |
Environmental variables. Usually generated by |
spp |
Species parameters. Usually generated by |
nc |
Number of counts to be simulated. If omitted no simulation of the counting process is carried out. |
spp |
Data frame containing species abundances. |
env |
Data frame containing environmental variables. |
Mathias Trachsel and Richard J Telford
Minchin, P.R. (1987) Multidimensional Community Patterns: Towards a Comprehensive Model. Vegetatio, 71, 145-156.
spec <- species(nspp = 30,ndim = 10,Amax = runif,fun = runif, xpar = c(-50,150), srange = 200, alpha = 4, gamma = 4) env.var <- make.env(100,elen =rep(100,10),emean = rep(50,10), edistr ='uniform', ndim = 10) spec.abun <- abundances(env.var,spec,200)
spec <- species(nspp = 30,ndim = 10,Amax = runif,fun = runif, xpar = c(-50,150), srange = 200, alpha = 4, gamma = 4) env.var <- make.env(100,elen =rep(100,10),emean = rep(50,10), edistr ='uniform', ndim = 10) spec.abun <- abundances(env.var,spec,200)
Estimates the relationship of Calibrated age and depth for palaeorecords. The function uses a smooth spline of the mgcv library by Simon Wood. It produces predicted confidence interval for the relationship approximating a mixed effect model, as there are two levels of uncertainty, i.e. within dated object and between dated objects.
agelme(depup, depdo, bpup, bpdo, use, weights=c(1,rep(0,length(depup)-1)), vspan=1, k=length(depup)-1, m=2, diagnostic=FALSE)
agelme(depup, depdo, bpup, bpdo, use, weights=c(1,rep(0,length(depup)-1)), vspan=1, k=length(depup)-1, m=2, diagnostic=FALSE)
depup |
The upper depths of the dated slides |
depdo |
The lower depths of the dated slides |
bpup |
The younger calibrated ages of the dated slides |
bpdo |
The older calibrated ages of the dated slides |
use |
Logical vector of dates to include in the model. Default is to use all. |
weights |
Weights to be used for the estimation, default is fixed top-layer followed by inverse variance of within dated object |
vspan |
The span to be used for the diagnostic plots, default span = 1 |
k |
Number of base function to start the shrinkage in the gam estimation procedure |
m |
The order of penalty for the term, i.e. the degree of continuity at the knots (default, m = 2 gives cubic smooth spline) |
diagnostic |
Logical, should diagnostic plots be made. |
Note that the fixation of the top layer is done by a weight = 1, whereas the other weights follows inverse variance within object.
The diagnostic plots is used to check the quality of the estimation and to see if there is a need for an assumption of between object variance proportional to mean. The latter however is rarely encountered for palaeodata.
tdf |
Degrees of freedom used by the cubic smooth spline, a vector with first value for constant variance and second vector for variance equal to mu. |
weights |
A vector of the weights used by the cubic smooth spline |
RES |
A vector of the Residual sum of squares |
Models |
A list with the models from the cubic smooth spline, constant and mu variance, respectively |
Data |
A data.frame including the data used for the estimation |
Einar Heegaard <[email protected]>
Heegaard, E., Birks, HJB. & Telford, RJ. 2005. Relationships between calibrated ages and depth in stratigraphical sequences: an estimation procedure by mixed-effect regression. The Holocene 15: 612-618
data(STOR) fit.mod <- with(STOR,agelme(depthup,depthdo,cageup,cagedo)) #Predicting using the constant variance model, #for each cm between 70 and 400 cm. fit.pre <- predict(fit.mod,1,70:400) plot(fit.pre)
data(STOR) fit.mod <- with(STOR,agelme(depthup,depthdo,cageup,cagedo)) #Predicting using the constant variance model, #for each cm between 70 and 400 cm. fit.pre <- predict(fit.mod,1,70:400) plot(fit.pre)
Creates functions that transform arbitrary distributions into a Gaussian distributions, and vice versa.
anamorph(x, k, plot = FALSE)
anamorph(x, k, plot = FALSE)
x |
data |
k |
number of Hermite polynomials |
plot |
plot |
Increasing k can give a better fit.
Returns two function in a list
xtog |
Function to transform arbitrary variable x into a Gaussian distribution |
gtox |
The back transformation |
Richard Telford [email protected]
Wackernagel, H. (2003) Multivariate Geostatistics. 3rd edition, Springer-Verlag, Berlin.
set.seed(42) x <- c(rnorm(50, 0, 1), rnorm(50, 6, 1)) hist(x) ana.fun <- anamorph(x, 30, plot = TRUE) xg <- ana.fun$xtog(x) qqnorm(xg) qqline(xg) all.equal(x, ana.fun$gtox(xg))
set.seed(42) x <- c(rnorm(50, 0, 1), rnorm(50, 6, 1)) hist(x) ana.fun <- anamorph(x, 30, plot = TRUE) xg <- ana.fun$xtog(x) qqnorm(xg) qqline(xg) all.equal(x, ana.fun$gtox(xg))
Arctic pollen percent data and associated environmental data
data(arctic.pollen) data(arctic.env)
data(arctic.pollen) data(arctic.env)
A data frame with 828 observations on the percentage of 39 pollen taxa
Environmental data for the pollen sites
Data extracted from North American Pollen Database and New et al. (2002) by Fréchette et al. (2008). Following Fréchette (Pers. Comm.), three duplicate sites have been deleted.
Fréchette, B., de Vernal, A., Guiot, J., Wolfe, A. P., Miller, G. H., Fredskild, B., Kerwin, M. W. and Richard, P. J. H. (2008) Methodological basis for quantitative reconstruction of air temperature and sunshine from pollen assemblages in Arctic Canada and Greenland. Quaternary Science Reviews 27, 1197–1216 doi:10.1016/j.quascirev.2008.02.016
A dataset containing over 1000 foram assemblages from the Atlantic from Kucera et al (2005) and the 50m SST for the warmest season. Rare taxa and co-located assemblages are removed.
data(Atlantic)
data(Atlantic)
A data frame with 1093 rows and 33 variables. summ50 is 50m water temperature of the warmest season
doi:10.1594/PANGAEA.227322 https://www.nodc.noaa.gov/cgi-bin/OC5/woa13/woa13.pl?parameter=t
Plot of species WA optima and tolerance
centipede_plot(x, spp, minN2 = 1, mult = 1)
centipede_plot(x, spp, minN2 = 1, mult = 1)
x |
A tolerance weighted weighted-average model from
|
spp |
data.frame of species data used to train the WA model |
minN2 |
numeric giving minimum N2 for inclusion in plot |
mult |
numeric multiplier for the tolerances |
Extracts and sorts WA
optima and tolerances and
generates a ggplot.
Tends only to work well when there are a reasonable number of taxa,
otherwise it is difficult to read the names on the axis.
Rare taxa can be excluded with the minN2
argument.
The tol.cut
argument in WA
may need to be set to
prevent very small tolerances in rare taxa.
This function is very similar to the caterpillar
plot, but produces a ggplot
A ggplot
object.
library(rioja) data(SWAP) mod <- WA(SWAP$spec, SWAP$pH, tolDW = TRUE) coef(mod) centipede_plot(mod, spp = SWAP$spec, minN2 = 20)
library(rioja) data(SWAP) mod <- WA(SWAP$spec, SWAP$pH, tolDW = TRUE) coef(mod) centipede_plot(mod, spp = SWAP$spec, minN2 = 20)
Generates a correlation matrix for the environmental variables generated in make.env
and for correlated species optima in species
. Only used when correlated environmental variables or optima are generated.
cor.mat.fun(ndim, cors)
cor.mat.fun(ndim, cors)
ndim |
Number of environmental variables that are subsequently generated with |
cors |
List of correlations between environmental variables. Each element of the list consists of three numbers, the first two numbers indicate the variables that are correlated, the third number is the correlation coefficient. If correlations between two variables are omitted the correlation remains 0. |
Correlation matrix
Mathias Trachsel
correlations <- list(c(1, 2, 0.5), c(1, 4, 0.1), c(2, 5, 0.6)) cor.mat <- cor.mat.fun(5, correlations)
correlations <- list(c(1, 2, 0.5), c(1, 4, 0.1), c(2, 5, 0.6)) cor.mat <- cor.mat.fun(5, correlations)
A simple diagnostic plot showing the coverage of fossil taxa in modern calibration set
coverage_plot(spp, fos, n2_rare = 5, label = NULL)
coverage_plot(spp, fos, n2_rare = 5, label = NULL)
spp |
data.frame of modern species abundances |
fos |
data.frame of fossil species abundances |
n2_rare |
numeric value of Hill's N2 below which species are highlighted as rare |
label |
numeric label taxa where maximum fossil abundance - maximum modern abundance > label. Defaults to NULL which does not add labels |
Finds the maximum abundance of fossil taxa and plots this against
the maximum abundance the taxa in the modern calibration set.
Taxa with a Hill's N2 less than rare
in the calibration set are
highlighted in blue.
Taxa absent from the calibration set are highlighted in red.
If there are many taxa above the 1:1 line, or important fossil taxa have a
low N2 in the calibration set,
reconstructions should be interpreted with caution.
A ggplot
object.
data("SWAP", package = "rioja") data("RLGH", package = "rioja") coverage_plot(spp = SWAP$spec, fos = RLGH$spec, n2_rare = 5, label = 0)
data("SWAP", package = "rioja") data("RLGH", package = "rioja") coverage_plot(spp = SWAP$spec, fos = RLGH$spec, n2_rare = 5, label = 0)
Gives a measure of the species diversity in the fossil data.
Hill.N2.core(spp)
Hill.N2.core(spp)
spp |
Species data |
Uses Hill.N2
from the rioja package
Minimum, first quartile and median effective number of species
If the effective number of species is small, WA based reconstructions are unlikely to be significant, and MAT based reconstructions should be tested instead.
Richard Telford
Hill, M. O. (1973) Diversity and evenness: a unifying notation and its consequences. Ecology 54: 427–432.
require(rioja) data(RLGH) Hill.N2.core(RLGH$spec)
require(rioja) data(RLGH) Hill.N2.core(RLGH$spec)
Generates synthetic variables with different proportion of two environmental variables, and tests how much variance in the fossil data reconstructions of these synthetic variables explain.
jointsig(spp, fos, var1, var2, method = "randomTF", n = 99, r = 32, ...) ## S3 method for class 'js' plot(x, names.v1, names.v2, ...)
jointsig(spp, fos, var1, var2, method = "randomTF", n = 99, r = 32, ...) ## S3 method for class 'js' plot(x, names.v1, names.v2, ...)
spp |
Data frame of modern training set species data, transformed as required, for example with sqrt |
fos |
Data frame of fossil species data, with same species codes and transformations as spp |
var1 |
Training set environmental variable 1 |
var2 |
Training set environmental variable 2 |
method |
Which significance test to use. Current option are randomTF and obs.cor. The latter may give strange results - use with caution. |
n |
number of random training sets used to generate the null model |
r |
How many synthetic variables to make. More is better but slower |
... |
Other arguments to plot |
x |
Output from jointsig |
names.v1 |
Vector length 2 with names of the end members of the first environmental variable, e.g., c("cold", "warm") for temperature. |
names.v2 |
Ditto for the second variable. |
With method="randomTF"
, the function calculates the proportion of
variance in the fossil data explained by transfer function reconstructions of
synthetic variables.
The synthetic variables are composed of two environmental variables, weighted
between -1 and +1, so to represent a circle.
This is compared with a null distribution of the proportion of variance
explained by reconstructions based on random environmental variables.
Any transfer function in the rioja library can be used.
With method="obs.cor", the aim is the same, but the function reports the
correlation between the species weighted average optima on the synthetic
variables and the species first axis scores.
This option has some pathological behaviour and should probably be avoided.
A list with components
PCA The unconstrained ordination of the fossil data.
preds A list of the containing the reconstructions for each environmental variable.
MAX Proportion of the variance explained by the first axis of the unconstrained ordination. This is the maximum amount that a reconstruction of a single variable can explain.
EX The proportion of the variance in the fossil data explained by each reconstruction.
sim.ex The proportion of variance explained by each of the random environmental variables.
sig The p-value of each reconstruction.
plot(js)
: Plot js object
Richard Telford [email protected]
Unpublished method - use with caution. Can give spurious results with weighted averaging.
require(rioja) data(SWAP) data(RLGH) rlgh.js <- jointsig( spp = sqrt(SWAP$spec), fos = sqrt(RLGH$spec), var1 = SWAP$pH, var2 = sample(SWAP$pH), method = "randomTF", n = 49, r = 32, fun = WA, col = 1 ) # nonsense second variable plot(rlgh.js, c("acid", "alkaline"), c("down", "up"))
require(rioja) data(SWAP) data(RLGH) rlgh.js <- jointsig( spp = sqrt(SWAP$spec), fos = sqrt(RLGH$spec), var1 = SWAP$pH, var2 = sample(SWAP$pH), method = "randomTF", n = 49, r = 32, fun = WA, col = 1 ) # nonsense second variable plot(rlgh.js, c("acid", "alkaline"), c("down", "up"))
Generates environmental variables used for generating species abundances. Environmental variables may be correlated, and may follow different distributions.
make.env(n, elen, emean, edistr, ecor, ndim)
make.env(n, elen, emean, edistr, ecor, ndim)
n |
Number of samples to be generated. |
elen |
Range of the environmental variables. Single number or vector of length ndim. |
emean |
Mean of the environmental variables. Single number or vector of length ndim. |
edistr |
Distribution of the environmental variables. Currently 'uniform' and 'Gaussian' are supported. |
ecor |
Correlation matrix of the environmental variables supplied by user. Object generated by |
ndim |
Number of environmental variables to generate. |
Matrix of environmental variables. n rows and ndim columns.
Mathias Trachsel and Richard J. Telford
Minchin, P.R. (1987) Multidimensional Community Patterns: Towards a Comprehensive Model. Vegetatio, 71, 145-156.
env.vars <- make.env(100,elen =rep(100,10),emean = rep(50,10), edistr ='uniform',ndim = 10)
env.vars <- make.env(100,elen =rep(100,10),emean = rep(50,10), edistr ='uniform',ndim = 10)
Function to simulate species data following Minchin (1987). This functions generates species response functions, simulates environmental variables and simulates species assemblages based on species response functions and environmental variables. Users can as well supply own species parameters (e.g. when simulating calibration and fossil datasets) and own environmental variables.
make.set(ndim, n, elen, emean, edistr, ecor, cnt, spec, env,...)
make.set(ndim, n, elen, emean, edistr, ecor, cnt, spec, env,...)
ndim |
Number of environmental variables to generate. |
n |
Number of samples to be generated. |
elen |
Range of the environmental variables. Single number or vector of length ndim. |
emean |
Mean of the environmental variables. Single number or vector of length ndim. |
edistr |
Distribution of the environmental variables. Currently 'uniform' and 'Gaussian' are supported. |
ecor |
Correlation matrix of the environmental variables supplied by user. Object generated by |
cnt |
Number of counts to be simulated. |
spec |
Users may supply their own species parameters. |
env |
Users may supply their own environmental variables. |
... |
Arguments passed to |
spp |
Species abundance data. |
env |
Environmental variables used to simulate species abundance data. |
spec |
Species parameters. |
Mathias Trachsel and Richard J. Telford
Minchin, P.R. (1987) Multidimensional Community Patterns: Towards a Comprehensive Model. Vegetatio, 71, 145-156.
make.env
, species
, cor.mat.fun
calib <- make.set(nspp = 90,ndim = 3,Amax = runif,fun = runif, xpar = c(-50,150), srange = 400, alpha = 4, gamma = 4,n = 100,elen =rep(100,3),emean = rep(50,3), edistr ='uniform', cnt = 1000) # Provide species parameters generated above, so that the fossil data use the # same species parameters. fos <- make.set(ndim = 3,n = 100,elen =rep(100,3),emean = rep(50,3), edistr ='uniform', cnt = 1000, spec = calib$spec) # Supplying own environmental variables and species parameters. env.vars <- make.env(100,elen =rep(100,3),emean = rep(50,3), edistr ='uniform',ndim = 3) fos <- make.set(cnt = 1000, spec = calib$spec, env = env.vars)
calib <- make.set(nspp = 90,ndim = 3,Amax = runif,fun = runif, xpar = c(-50,150), srange = 400, alpha = 4, gamma = 4,n = 100,elen =rep(100,3),emean = rep(50,3), edistr ='uniform', cnt = 1000) # Provide species parameters generated above, so that the fossil data use the # same species parameters. fos <- make.set(ndim = 3,n = 100,elen =rep(100,3),emean = rep(50,3), edistr ='uniform', cnt = 1000, spec = calib$spec) # Supplying own environmental variables and species parameters. env.vars <- make.env(100,elen =rep(100,3),emean = rep(50,3), edistr ='uniform',ndim = 3) fos <- make.set(cnt = 1000, spec = calib$spec, env = env.vars)
MAT for many environmental variables simultaneously. More efficient than calculating them separately for each variable.
multi.mat(training.spp, envs, core.spp, noanalogues = 10, method = "sq-chord", run = "both")
multi.mat(training.spp, envs, core.spp, noanalogues = 10, method = "sq-chord", run = "both")
training.spp |
Community data |
envs |
Environmental variables - or simulations |
core.spp |
Optional fossil data to make predictions for |
noanalogues |
Number of analogues to use |
method |
distance metric to use |
run |
Return LOO predictions or predictions for fossil data |
Matrix of predictions
Richard Telford [email protected]
Telford, R. J. and Birks, H. J. B. (2009) Evaluation of transfer functions in spatially structured environments. Quaternary Science Reviews 28: 1309–1316. doi:10.1016/j.quascirev.2008.12.020
data(arctic.env) data(arctic.pollen) mMAT <- multi.mat(arctic.pollen, arctic.env[,9:67], noanalogues = 5)
data(arctic.env) data(arctic.pollen) mMAT <- multi.mat(arctic.pollen, arctic.env[,9:67], noanalogues = 5)
obs.cor calculates the weighted correlation between the species weighted average optima and the axis one species scores of an ordination constrained by the WA reconstruction.
obs.cor( spp, env, fos, ord = rda, n = 99, min.occur = 1, autosim, permute = FALSE ) ## S3 method for class 'obscor' plot( x, xlab, ylab, f = 5, which = 1, variable_names = "env", abun = "abun.calib", p_val = 0.05, ... ) ## S3 method for class 'obscor' identify(x, labels, ...) ## S3 method for class 'obscor' autoplot( object, which = 1, variable_names = "env", abun = "abun.calib", p_val = 0.05, nbins = 20, top = 0.7, ... )
obs.cor( spp, env, fos, ord = rda, n = 99, min.occur = 1, autosim, permute = FALSE ) ## S3 method for class 'obscor' plot( x, xlab, ylab, f = 5, which = 1, variable_names = "env", abun = "abun.calib", p_val = 0.05, ... ) ## S3 method for class 'obscor' identify(x, labels, ...) ## S3 method for class 'obscor' autoplot( object, which = 1, variable_names = "env", abun = "abun.calib", p_val = 0.05, nbins = 20, top = 0.7, ... )
spp |
Data frame of modern training set species data, transformed if
required, for example with |
env |
Vector of a single environmental variable |
fos |
Data frame of fossil species data. Species codes and transformations should match those in spp. |
ord |
Constrained ordination method to use.
|
n |
number of random training sets. More is better. |
min.occur |
Minimum number of occurrences of species in the species and fossil data. |
autosim |
Optional data frame of random values. This is useful if the
training set is spatially autocorrelated and the supplied data frame contains
autocorrelated random variables.
If |
permute |
logical value. Generate random environmental variables by
permuting existing variable.
Only possible if there is only one environmental variable and |
x |
An obscor object. |
xlab |
X-axis label if the default is unsatisfactory. |
ylab |
Y-axis label if the default is unsatisfactory. |
f |
Scale factor for the abundances, the maximum cex of points for the which=1 plot. |
which |
Which type of plot. which = 1 gives a plot of RDA scores against species optima. which = 2 gives a histogram showing the null distribution of correlations between RDA scores and species optima, together with the observed correlation. |
variable_names |
Name of environmental variable (only 1 currently) for the label on the observed correlation with which = 2 |
abun |
Which species weighting required for plots. See details |
p_val |
P value to draw a line vertical line at (with which=2) |
... |
Other arguments to plot or identify |
labels |
Labels for the points in identify. By default, the species names from intersection of colnames(spp) and colnames(fos) are used. |
object |
An obscor object. |
nbins |
integer giving number of bins for the histogram |
top |
Proportion of the figure below the environmental name labels. |
Obs.cor calculates the (weighted) correlation between the species WA optima in the calibration set and their ordination axis one scores in the fossil data. Seven different weights for the species are implemented.
"abun.fos" - weight by the mean abundance in the fossil data.
"abun.calib" - weight by the mean abundance in the calibration data
"abun.joint" - weight by the product of the mean abundance in the fossil and calibration data
"n2.fos" - weight by the effective number of occurrences (Hill's N2) of each species in the fossil data
"n2.calib" - weight by the effective number of occurrences (Hill's N2) of each species in the calibration data
"n2.joint" - weight by the product of n2.calib and n2.fos
"unweighted" - all species receive same weight. This is unlikely to be the best option but is included for completeness.
It is unclear which of these weights is likely to be best: research is in progress. A square root transformation of the species data is often useful. n = 99 is too small in practice to give a smooth histogram of the null model. n = 999 is better.
obs.cor returns an obscor object, which is a list
ob Observed correlation. Data.frame with columns Optima, RDA1 and abun containing the species optima, ordination axis 1 scores, and abundance used to weight the species respectively and a vector containing the weighted and unweighted correlations between species optima and ordination scores.
sim Matrix with the correlation between species weighted average optima and ordination scores in the first column and the weighted correlation in the second column. Each row represents a different random environmental variable.
sigs p-value for the observed correlation between species weighted average optima and ordination scores for each of the weights.
plot(obscor)
: Plots for obscor object
identify(obscor)
: Identify species on obs.cor plot
autoplot(obscor)
: autoplot for obscor object
The test of the weighted correlation between species optima and
ordination axis scores is more powerful, especially with a small number of
fossil observations, that the test of variance explained in
randomTF
but is only applicable to WA and will have a large
type II error if there are few species.
Richard Telford [email protected]
Telford, R. J. and Birks, H. J. B. (2011) A novel method for assessing the statistical significance of quantitative reconstructions inferred from biotic assemblages. Quaternary Science Reviews 30: 1272–1278. doi:10.1016/j.quascirev.2011.03.002
require(rioja) data(SWAP) data(RLGH) rlgh.obs <- obs.cor( spp = sqrt(SWAP$spec), env = SWAP$pH, fos = sqrt(RLGH$spec), n = 49 # low number for speed ) rlgh.obs$sig plot(rlgh.obs, which = 1) plot(rlgh.obs, which = 2) require(ggplot2) autoplot(rlgh.obs, which = 1) autoplot(rlgh.obs, which = 2, variable_names = "pH")
require(rioja) data(SWAP) data(RLGH) rlgh.obs <- obs.cor( spp = sqrt(SWAP$spec), env = SWAP$pH, fos = sqrt(RLGH$spec), n = 49 # low number for speed ) rlgh.obs$sig plot(rlgh.obs, which = 1) plot(rlgh.obs, which = 2) require(ggplot2) autoplot(rlgh.obs, which = 1) autoplot(rlgh.obs, which = 2, variable_names = "pH")
Plots fitted agelme model and dates
## S3 method for class 'fittedAgelme' plot(x, main, xlab = "Depth", ylab = "Calibrated Age", ...)
## S3 method for class 'fittedAgelme' plot(x, main, xlab = "Depth", ylab = "Calibrated Age", ...)
x |
Fitted agelme model. |
main |
Title of the plot. |
xlab |
x axis label of the plot. |
ylab |
y axis label of the plot. |
... |
Other arguments to plot. |
data(STOR) fit.mod <- with(STOR, agelme(depthup, depthdo, cageup, cagedo)) #Predicting using the constant variance model, #for each cm between 70 and 400 cm. fit.pre <- predict(fit.mod, 1, 70:400) plot(fit.pre)
data(STOR) fit.mod <- with(STOR, agelme(depthup, depthdo, cageup, cagedo)) #Predicting using the constant variance model, #for each cm between 70 and 400 cm. fit.pre <- predict(fit.mod, 1, 70:400) plot(fit.pre)
Calculates effect of deleting sites from training set at random, from a geographic neighbourhood, or from an environmental neighbourhood. A simple graphical technique for gauging the effect of spatial autocorrelation on the transfer function model.
## S3 method for class 'RNE' plot(x, which = 1, ylim, ...) rne( y, env, geodist, fun, neighbours, subsets = c(1, 0.75, 0.5, 0.25, 0.1), ... )
## S3 method for class 'RNE' plot(x, which = 1, ylim, ...) rne( y, env, geodist, fun, neighbours, subsets = c(1, 0.75, 0.5, 0.25, 0.1), ... )
x |
RNE object to be plotted |
which |
Which column of the results to plot eg if more than one WAPLS component is calculated |
ylim |
Y-limits of the plot |
... |
Arguments passed to fun |
y |
Community data, or distance object, or distance matrix |
env |
Environmental variable |
geodist |
Matrix of geographical distances between sites |
fun |
Transfer function |
neighbours |
Neighbourhood radii |
subsets |
Proportion of sites to retain in random deletion |
Finds the leave-one-out transfer function performance if sites are deleted at random, from a neighbourhood zone, or by deleting environmentally close sites.
Prior to version 2.1, this function would repeat the random removal 10 times to reduce variance in results. This is no longer done as the variance is small for large training sets, it took a long time, and treats one treatment of the data differently.
Returns an RNE object, list with two components
random Performance with random deletion.
neighbour Performance with deletion by neighbourhood, or environment
plot(RNE)
: Plot RNE object
Richard Telford [email protected]
Telford, R. J. and Birks, H. J. B. (2009) Evaluation of transfer functions in spatially structured environments. Quaternary Science Reviews 28: 1309–1316. doi:10.1016/j.quascirev.2008.12.020
require(rioja) require(sf) data(arctic.env) data(arctic.pollen) # using just the first 100 sites so that code runs quickly (about 15 seconds for all 828 sites) # convert environmental data into an sf object arctic.env <- st_as_sf( x = arctic.env, coords = c("Longitude", "Latitude"), crs = 4326 ) # find great circle distances and remove units arctic.dist <- st_distance(arctic.env[1:100, ]) |> units::set_units("km") |> units::set_units(NULL) # rne arctic.rne <- rne( y = arctic.pollen[1:100, ], env = arctic.env$tjul[1:100], geodist = arctic.dist, fun = MAT, neighbours = c(0, 200), subsets = c(1, .5), k = 5 ) plot(arctic.rne)
require(rioja) require(sf) data(arctic.env) data(arctic.pollen) # using just the first 100 sites so that code runs quickly (about 15 seconds for all 828 sites) # convert environmental data into an sf object arctic.env <- st_as_sf( x = arctic.env, coords = c("Longitude", "Latitude"), crs = 4326 ) # find great circle distances and remove units arctic.dist <- st_distance(arctic.env[1:100, ]) |> units::set_units("km") |> units::set_units(NULL) # rne arctic.rne <- rne( y = arctic.pollen[1:100, ], env = arctic.env$tjul[1:100], geodist = arctic.dist, fun = MAT, neighbours = c(0, 200), subsets = c(1, .5), k = 5 ) plot(arctic.rne)
This function uses the output from 'agelme' to predict the Calibrated ages for specified depths.
## S3 method for class 'agelme' predict(object, v = 1, depth,...)
## S3 method for class 'agelme' predict(object, v = 1, depth,...)
object |
An 'agelme' model |
v |
Using constant (1) or mu (2) variance |
depth |
A vector of the depths to be predicted |
... |
Other arguments, currently unused. |
A list with three items
v Whether constant variance or mu variance used.
fit A data.frame of the predictions including 95% confidence interval.
DepthThe depths for the predicted ages
EstagePredicted age
LowlimLower 95% confidence interval
UpplimUpper 95% confidence interval
TsdTotal standard deviation
data A data.frame containing the age and depth information of the radiocarbon dates.
Einar Heegaard <[email protected]
Heegaard, E., Birks, HJB. & Telford, RJ. 2005. Relationships between calibrated ages and depth in stratigraphical sequences: an estimation procedure by mixed-effect regression. The Holocene 15: 612-618
data(STOR) fit.mod <- with(STOR,agelme(depthup,depthdo,cageup,cagedo)) #Predicting using the constant variance model, #for each cm between 70 and 400 cm. fit.pre <- predict(fit.mod,1,70:400) plot(fit.pre)
data(STOR) fit.mod <- with(STOR,agelme(depthup,depthdo,cageup,cagedo)) #Predicting using the constant variance model, #for each cm between 70 and 400 cm. fit.pre <- predict(fit.mod,1,70:400) plot(fit.pre)
Calculate the proportion of variance in the fossil data explained by an environmental reconstruction with a constrained ordination. This value is compared with a null distribution calculated as the proportion of variance in the fossil data explained by reconstructions from transfer functions trained on random data.
randomTF( spp, env, fos, n = 99, fun, col, condition = NULL, autosim, ord = rda, permute = FALSE, models, make_models = FALSE, ... ) ## S3 method for class 'palaeoSig' plot(x, variable_names, top = 0.7, adj = c(0, 0.5), p_val = 0.05, ...) ## S3 method for class 'palaeoSig' autoplot(object, variable_names, nbins = 20, top = 0.7, p_val = 0.05)
randomTF( spp, env, fos, n = 99, fun, col, condition = NULL, autosim, ord = rda, permute = FALSE, models, make_models = FALSE, ... ) ## S3 method for class 'palaeoSig' plot(x, variable_names, top = 0.7, adj = c(0, 0.5), p_val = 0.05, ...) ## S3 method for class 'palaeoSig' autoplot(object, variable_names, nbins = 20, top = 0.7, p_val = 0.05)
spp |
Data frame of modern training set species data, transformed as
required for example with |
env |
Data frame of training set environmental variables or vector with single environmental variable |
fos |
Data frame of fossil species data, with same species codes and
transformations as |
n |
number of random training sets. More is better. |
fun |
Transfer function method.
Additional arguments to |
col |
Some transfer functions return more than one column of results,
for example with different |
condition |
Optional data frame of reconstructions to partial out when testing if multiple independent reconstructions are possible. |
autosim |
Optional data frame of random values.
This is useful if the training set is spatially autocorrelated and the
supplied data frame contains autocorrelated random variables.
If |
ord |
Constrained ordination method to use. |
permute |
logical value. Generate random environmental variables by
permuting existing variable. Only possible if there is only one environmental
variable and |
models |
list of models made by |
make_models |
logical, should a list of transfer functions trained on random data be returned |
... |
Other arguments to the transfer function. For example to change
the distance metric in |
x |
Output from randomTF |
variable_names |
Names of environmental variables. If missing, taken
from |
top |
Proportion of the figure below the environmental name labels. |
adj |
Adjust the position that the environmental names are plotted at. |
p_val |
P value to draw a line vertical line at (with which=2) |
object |
Output from randomTF |
nbins |
integer giving number of bins for the histogram |
The function calculates the proportion of variance in the fossil data explained by the transfer function reconstruction. This is compared with a null distribution of the proportion of variance explained by reconstructions based on random environmental variables. Reconstructions can be partialled out to test if multiple reconstructions are statistically significant. If the environment is spatially autocorrelated, a red-noise null should be used instead of the default white noise null. Red noise environmental variables can be generated with the gstat package.
Any transfer function in the rioja package can be used. Other methods (e.g. random forests) can be used by making a wrapper function.
If reconstructions from several sites are to be tested using the same
training set it can be much faster to train the models on random
environmental data once and then use them repeatedly.
This can be done with make_models = TRUE
and then running
randomTF
again giving the resultant models to the models
argument.
make_models
does not work with MAT.
For some transfer function methods, including WA
, the code can be made
somewhat faster by coercing the modern and fossil species data to matrices
(spp <- as.matrix(spp)
), otherwise WA
has to do this repeatedly.
With MAT
, this should not be done as it might cause an error.
A list with components
PCA The unconstrained ordination of the fossil data.
preds A list of the containing the reconstructions for each environmental variable.
MAX Proportion of the variance explained by the first axis of the unconstrained ordination. This is the maximum amount that a reconstruction of a single variable can explain.
EX The proportion of the variance in the fossil data explained by each reconstruction.
sim.ex The proportion of variance explained by each of the random environmental variables.
sig The p-value of each reconstruction.
If make_models = TRUE
, a list of transfer function models is returned.
autoplot.palaeoSig
returns a ggplot2
object
plot(palaeoSig)
: Plot palaeoSig object
autoplot(palaeoSig)
: autoplot function for palaeoSig object
If there are only a few fossil levels, obs.cor
might have
more power.
If there are few taxa, tests on MAT
reconstructions have
more statistical power than those based on WA
.
Richard Telford [email protected]
Telford, R. J. and Birks, H. J. B. (2011) A novel method for assessing the statistical significance of quantitative reconstructions inferred from biotic assemblages. Quaternary Science Reviews 30: 1272–1278. doi:10.1016/j.quascirev.2011.03.002
obs.cor
, WA
,
MAT
, WAPLS
,
rda
, cca
require(rioja) data(SWAP) data(RLGH) rlghr <- randomTF( spp = sqrt(SWAP$spec), env = data.frame(pH = SWAP$pH), fos = sqrt(RLGH$spec), n = 49, fun = WA, col = "WA.inv" ) rlghr$sig plot(rlghr, "pH") require("ggplot2") autoplot(rlghr, "pH")
require(rioja) data(SWAP) data(RLGH) rlghr <- randomTF( spp = sqrt(SWAP$spec), env = data.frame(pH = SWAP$pH), fos = sqrt(RLGH$spec), n = 49, fun = WA, col = "WA.inv" ) rlghr$sig plot(rlghr, "pH") require("ggplot2") autoplot(rlghr, "pH")
Generates species response parameters to n environmental variables following Minchin (1987).
species(nspp = 30, Amax, fun, xpar, srange, alpha = 4, gamma= 4, ndim, sdistr, ocor, odistr)
species(nspp = 30, Amax, fun, xpar, srange, alpha = 4, gamma= 4, ndim, sdistr, ocor, odistr)
nspp |
Number of species to be generated. |
Amax |
Maximum abundance of a species. 'Amax' currently allows three options: i) a function how to generate maximum abundances (e.g. 'runif', 'rgamma') ii) a vector of length 'nspp' iii) a single number that is used as maximum abundance for all the species. |
fun |
Function to generate species optima (e.g. 'rnorm', 'runif'). The two parameters in 'xpar' are passed to function 'fun'. If omitted species optima are generated at regular intervals between the two values in 'xpar'. |
xpar |
Two numbers describing a distribution e.g mu and sigma for a normal distribution, lower and upper bound for a random uniform distribution. |
srange |
Length of the ecological gradient to which individual species respond. Either one number or a matrix with 'nspp' rows and 'ndim' columns. If 'srange' should be different for different environmental variables a simpler solution is to change argument elen in |
alpha |
Shape parameter of the beta distribution. One number or a matrix with 'nspp' rows and 'ndim' columns. |
gamma |
Shape parameter of the beta distribution. One number or a matrix with 'nspp' rows and 'ndim' columns. |
ndim |
Number of environmental variables to which generated species should respond. |
sdistr |
Users may supply own distributions of species optima. Matrix with 'nspp' rows and 'ndim' columns (as well in the special case of 'ndim = 1'). |
ocor |
Correlation matrix of the species optima. May be generated by code cor.mat.fun. |
odistr |
Distribution of the correlated optima either 'uniform' or 'Gaussian' |
Details on the exact generation of species response functions from parameters 'Amax', 'm', 'r', 'gamma' and 'alpha' are given in Minchin (1987). Species response curves are determined by five parameters: a parameter determining the maximum abundance ('Amax') and one describing the location ('m') of this mode. A parameter determining to which environmental range the species respond ('srange' in the input 'r' in the output) and two parameters ('alpha', 'gamma') describing the shape of the species response function. If 'alpha' = 'gamma' the response curve is symmetric ('alpha' = 'gamma' = 4, yields Gaussian distributions). Additionally, species optima for several environmental variables may be correlated. Currently this is only possible for gaussian or uniform distributions of species optima. Users may as well supply previously generated optima (e.g. optima similar to a real dataset).
List with 'ndim' elements. Each list contains the species response parameters to one environmental gradient.
Mathias Trachsel and Richard J. Telford
Minchin, P.R. (1987) Multidimensional Community Patterns: Towards a Comprehensive Model. Vegetatio, 71, 145-156.
spec.par <- species(nspp = 30, Amax = runif, srange = 200, fun = runif, xpar = c(-50, 150), ndim = 5, alpha = 4, gamma = 4) spec.par <- species(nspp = 30, ndim = 3, Amax = runif, xpar = c(-50, 150), srange = 200, alpha = 4, gamma = 4) # example where srange, alpha and gamma are different for each species and environmental gradient. spec.par <- species(nspp = 30, ndim = 3, Amax = runif, xpar = c(-50, 150), srange = matrix(ncol = 3, runif(90, 100, 200)), alpha = matrix(ncol = 3, runif(90, 1, 5)), gamma = matrix(ncol = 3, runif(90, 1, 5))) # example where species optima are correlated correlations <- list(c(1, 2, 0.5), c(1, 3, 0.3), c(2, 3, 0.1)) spec.cor.mat <- cor.mat.fun(3, correlations) spec.par <- species(nspp = 30, ndim = 3, Amax = runif, xpar = c(50, 50), srange = 200, alpha = 4, gamma = 4,ocor = spec.cor.mat, odistr = 'Gaussian') # example for species response curves (users should alter alpha and gamma) spec.par <- species(nspp = 1, Amax = 200, srange = 200, fun = runif, xpar = c(50, 50), ndim = 1, alpha = 3, gamma = 1) env <- -50:150 response <- palaeoSig:::make.abundances(env = -50:150, param = spec.par[[1]]$spp) plot(env, response, type='l')
spec.par <- species(nspp = 30, Amax = runif, srange = 200, fun = runif, xpar = c(-50, 150), ndim = 5, alpha = 4, gamma = 4) spec.par <- species(nspp = 30, ndim = 3, Amax = runif, xpar = c(-50, 150), srange = 200, alpha = 4, gamma = 4) # example where srange, alpha and gamma are different for each species and environmental gradient. spec.par <- species(nspp = 30, ndim = 3, Amax = runif, xpar = c(-50, 150), srange = matrix(ncol = 3, runif(90, 100, 200)), alpha = matrix(ncol = 3, runif(90, 1, 5)), gamma = matrix(ncol = 3, runif(90, 1, 5))) # example where species optima are correlated correlations <- list(c(1, 2, 0.5), c(1, 3, 0.3), c(2, 3, 0.1)) spec.cor.mat <- cor.mat.fun(3, correlations) spec.par <- species(nspp = 30, ndim = 3, Amax = runif, xpar = c(50, 50), srange = 200, alpha = 4, gamma = 4,ocor = spec.cor.mat, odistr = 'Gaussian') # example for species response curves (users should alter alpha and gamma) spec.par <- species(nspp = 1, Amax = 200, srange = 200, fun = runif, xpar = c(50, 50), ndim = 1, alpha = 3, gamma = 1) env <- -50:150 response <- palaeoSig:::make.abundances(env = -50:150, param = spec.par[[1]]$spp) plot(env, response, type='l')
Storsandsvatnet is a lake in western Norway. From the sediments a core was obtained, and 11 samples was submitted for radiocarbon dating. The data contain the depths of the slides dated and the younger and older calibrated ages for each slide.
data(STOR)
data(STOR)
A data.frame with 11 observations on the following 4 variables.
The upper border of the dated slide
The lower border of the dated slide
The younger calibrated age of the dated slide
The older calibrated age of the dated slide
The calibrated ages is obtained by calibration of the radiocarbon dates. The borders represent mean calibrated age +/- 1 SD of calibrated age.
The data are unpublished and provided by H. John B. Birks [email protected] and Sylvia M. Peglar
Heegaard, E., Birks, HJB. & Telford, RJ. 2005. Relationships between calibrated ages and depth in stratigraphical sequences: an estimation procedure by mixed-effect regression. The Holocene 15: 612-618 doi:10.1191/0959683605hl836rr