| Title: | Test for Spatial Bivariate Association Across Omics Types |
|---|---|
| Description: | The sbivar package implements a suite of tests for Spatial BIVARiate association across omics modalities, with possibly disjoint coordinate sets. Implemented tests are generalized additive models (GAMs), modified t-test, bivariate Moran's I and Gaussian processes (GPs). Both single images and replicated experiments can be analysed. |
| Authors: | Stijn Hawinkel [cre, aut] (ORCID: <https://orcid.org/0000-0002-4501-5180>), Wangjun Hu [aut] |
| Maintainer: | Stijn Hawinkel <[email protected]> |
| License: | GPL-2 |
| Version: | 0.99.19 |
| Built: | 2026-06-09 16:42:06 UTC |
| Source: | https://github.com/BiocStaging/sbivar |
Add columns with feature names to a matrix by splitting rownames, and remove rownames
addFeatureColumn(x)addFeatureColumn(x)
x |
A results matrix |
The matrix extended with two columns of feature names in front
Construct the nxnxg/2 array of derivatives for a nxn matrix to the g/2 covariance matrix parameters.
arrayDeriv(fittedGP, distMat, what)arrayDeriv(fittedGP, distMat, what)
fittedGP |
The fitted Gaussian process (vector of 4 parameters) |
distMat |
The distance matrix |
what |
For which parameter is the derivative required? |
The matrix of derivatives
Array resulting from all matrix products of slices of array A of size mxmxp, with matrix M of size mxm.
This is a faster version of vapply(seq_len(p), FUN.VALUE = mat, function(i){mat %*% arr[,,i]}),
although it may consume more memory.
arrayMatProd(A, M)arrayMatProd(A, M)
A |
An mxmxp array |
M |
An mxm matrix |
The speedup comes from a single call to %*%, very efficient in BLAS.
An mxmxp array
A faster version of vapply(seq_len(p), FUN.VALUE = double(p), function(i){
vapply(seq_len(p), FUN.VALUE = double(1), function(j){
tr(arr[,,i] %*% arr2[,,j])
})
})
arrayProd2tr(A, B)arrayProd2tr(A, B)
A, B
|
mxmxp arrays |
pxp matrix of traces
Returns the matrix resulting from the traces of all matrix products of slices of array A mxmxp
with those of array B of the same dimensions. It is a faster version of
vapply(seq_len(p), FUN.VALUE = double(p), function(i){
vapply(seq_len(p), FUN.VALUE = double(1), function(j){
tr(crossprod(arr2[,,i], arr[,,j]))
})
})
arrayProdTr(A, B)arrayProdTr(A, B)
A, B
|
mxmxp arrays |
The speedup comes from a single call to crossprod, very efficient in BLAS.
A pxp matrix of traces
Extract an assay, and transpose
assayT(x, assayName)assayT(x, assayName)
x |
The SummarizedExperiment object |
assayName |
The name of the assay |
The required assay, transposed
A wrapper for Matrix::bdiag maintaining names
bdiagn(A, B)bdiagn(A, B)
A, B
|
Matrix to be used in bdiag |
Same as bdiag but with dimnames
These matrices are used to test for bivariate association at different length scales.
Each alternative covariance matrix has the block structure
where is the Gaussian cross-covariance kernel evaluated at length scale .
Only the cross-blocks are returned; the identity diagonal is implicit.
buildAltSigmas(distMat, numLscAlts, Quants, idN, idM)buildAltSigmas(distMat, numLscAlts, Quants, idN, idM)
distMat |
The complete distance matrix of Cx and Ey stacked. Only needed when
|
numLscAlts |
Number of length scales (and thus number of cross-blocks to be built) |
Quants |
Most extreme quantiles of the distance distribution to be used as length scales. |
idN, idM
|
indices for x and y in the distance matrix |
An array of cross-blocks
Build a 3-slice array of covariance-parameter derivatives
buildDerivArray(fittedGP, distMat, suffix)buildDerivArray(fittedGP, distMat, suffix)
fittedGP |
The fitted Gaussian process (vector of 4 parameters) |
distMat |
Distance matrix for the relevant modality |
suffix |
Character suffix to append to parameter names ("X" or "Y") |
An array with 3rd-dim names
c("sigma<suffix>", "range<suffix>", "nugget<suffix>")
Given two coordinate matrices, concave hulls are found around them. The intersection between these two hulls is found, and in that area an evenly spaced, discrete grid is constructed.
buildNewGrid(Cx, Ey, n_points_grid)buildNewGrid(Cx, Ey, n_points_grid)
Cx, Ey
|
The coordinate matrices |
n_points_grid |
An integer, the number of points desired for the new grid |
The new grid will contain approximately the number of new points requested, depending on the size of the concave hull
A data frame of two columns with all points of the grid, with column names x and y.
This function is mainly used to create a grid on which two fitted GAMs can be evaluated to calculate correlations.
Cx <- matrix(runif(40, 0, 1), 20, 2) Ey <- matrix(runif(30, 0, 1), 15, 2) buildNewGrid(Cx, Ey, n_points_grid = 50)Cx <- matrix(runif(40, 0, 1), 20, 2) Ey <- matrix(runif(30, 0, 1), 15, 2) buildNewGrid(Cx, Ey, n_points_grid = 50)
Build the SAC matrix for a Gaussian process
buildSigmaGp(pars, distMat)buildSigmaGp(pars, distMat)
pars |
A vector of parameters |
distMat |
The complete distance matrix of Cx and Ey stacked. Only needed when
|
The covariance matrix
Build a weight matrix to be used in the calculation of the bivariate Moran's I statistic, normalized to sum to one.
buildWeightMat( Cx, Ey, wo, eta, numNN, distMat = spatstat.geom::crossdist(Cx[, 1], Cx[, 2], Ey[, 1], Ey[, 2]) )buildWeightMat( Cx, Ey, wo, eta, numNN, distMat = spatstat.geom::crossdist(Cx[, 1], Cx[, 2], Ey[, 1], Ey[, 2]) )
Cx, Ey
|
Two coordinate matrices |
wo |
The weighting option, either "nn" or "Gauss" |
eta |
parameter that controls the decay of the weights with distance, see details |
numNN |
An integer, the number of neighbours |
distMat |
The distance matrix |
For wo = "Gauss", the weight decays as exp(-d^2/eta) with d the distance between observations. For wo = "nn" the numNN nearest neighbours are given equal weight, all others are zero.
A weight matrix
The CCT function takes in a numeric vector of p-values,
and returns the aggregated p-value using Cauchy combination rule.
The code was taken from the xihaoli/STAAR github repo (Liu and Xie 2020), and adapted.
CCT(pvals)CCT(pvals)
pvals |
a numeric vector of p-values, where each of the elements is between 0 to 1, to be combined. |
The aggregated p-value
Liu Y, Xie J (2020). “Cauchy combination test: A powerful test with analytic p-value calculation under arbitrary dependency structures.” J. Am. Stat. Assoc., 115(529), 393 - 402. doi:10.1080/01621459.2018.1554485. https://pubmed.ncbi.nlm.nih.gov/33012899.
Checks dimensions of matrices and lists, to be used for all exported functions
checkInputSingle(X, Y, Cx, Ey)checkInputSingle(X, Y, Cx, Ey)
X, Y
|
Matrices of omics measurements |
Cx, Ey
|
Coordinate matrices of dimension two, belonging to X and Y respectively |
Throws error when a fault is found with the input, otherwise finishes silently
Find all raw cross-correlations between lists of observations matrices from different modalities.
correlationsMulti(Xl, Yl, featuresX, featuresY, verbose)correlationsMulti(Xl, Yl, featuresX, featuresY, verbose)
Xl, Yl
|
Lists of matrices of omics measurements |
featuresX, featuresY
|
Features to be tested. Defaults to all features, but specifying them allows to test a limited feature set, while using the whole matrix to calculate library sizes as offset or for normalization. |
verbose |
Should progress be printed? |
A list of named correlation vectors
Evaluate a variogram on a set of distances
evalVariogram(vg, distVec)evalVariogram(vg, distVec)
vg |
The variogram model resulting from a call to fit.variogram) |
distVec |
A vector of pairwise distances |
A vector of covariances
The weighting functions for calculating bivariate Moran's I resulting from different choices of decay parameters eta are visualized in a lineplot.
exploreWeights( etas, dists = seq(0, 0.2, length.out = 1000), palette = "paired", legend.position = "topright" )exploreWeights( etas, dists = seq(0, 0.2, length.out = 1000), palette = "paired", legend.position = "topright" )
etas |
A vector of positive decay parameters |
dists |
A set of distances, smaller than the square root of 2 since all coordinates are scaled to the unit square. |
palette |
The colour palette |
legend.position |
Legend position, passed onto legend as x |
Plots the weighting functions
exploreWeights(10^c(-5, -4, -3))exploreWeights(10^c(-5, -4, -3))
Is there any double underscore in the character vector?
findDoubleUnderScore(charVec)findDoubleUnderScore(charVec)
charVec |
The character vector |
A boolean
Fit a generalized additive model (GAM) captures spatial dependence through a bivariate spatial spline.
fitGAM(df, outcome, family = gaussian(), offset = NULL, includeGPsmooth)fitGAM(df, outcome, family = gaussian(), offset = NULL, includeGPsmooth)
df |
The dataframe containing outcome and coordinates |
outcome |
A character vector indicating the outcome variable |
family |
A character string indicating the family, see family. |
offset |
A numeric vector with the offset, e.g. the library sizes for spatial transcriptomic data, already on the scale of the regressor, so log-transformed for count models. |
includeGPsmooth |
Should a Gaussian random field smoother for stochastic neighbourhood similarity be included? |
If a gamma fit is attempted and fails, which frequently happens for sparse data, a negative binomial fit is attempted instead
A fitted GAM model, or try-error when the fit fails
A Gaussian process (GP) models the spatial autocovariance using a kernel function that describes the covariance as a decreasing function of distance between observations.
fitGP(x, coord, GPmethod, correlation, optControl)fitGP(x, coord, GPmethod, correlation, optControl)
x |
outcome vector |
coord |
Coordinate matrix |
GPmethod |
The method by which to fit the Gaussian processes, passed onto gls as "method". |
correlation |
A corStruct object, see corStruct. At this point, corGaus is hard-coded, |
optControl |
List of control values, see glsControl. |
A vector of length 4 with components range, nugget, sigma and mean
Fitting GPs can be computation and memory intensive!
Fit a linear model for an individual feature pair
fitLinModel(ff, y, Control, Terms, modMat, MM, Assign, weights = NULL)fitLinModel(ff, y, Control, Terms, modMat, MM, Assign, weights = NULL)
ff |
The prepared frame |
y |
outcome vector |
Control |
A control list for lmerTest::lmer |
modMat |
Design matrix of the fixed effects model |
MM |
a Boolean, should a mixed model fit be attempted? |
Assign, Terms
|
Added to fitted fixed effects model |
weights |
weights vector |
The code is based on fitSingleLmmModel, but may diverge
A fitted model
A fitted lmer or lm model
Given measures estimated from replicated images using sbivarMulti, fit linear models to determine significance. To maintain interpretability of the intercept, continuous fixed effect variables are centered, and sum coding is used for the categorical ones.
extractResultsMulti() returns the results as matrix, including adjusted p-values, and sorted by p-value.
fitLinModels( result, designDf, Formula, verbose = TRUE, inverseWeigh = FALSE, scaleByMax = TRUE, Control = lmerControl(check.conv.grad = .makeCC("ignore", tol = 0.002, relTol = NULL), check.conv.singular = .makeCC(action = "ignore", tol = 1e-04), check.conv.hess = .makeCC(action = "ignore", tol = 1e-06)) ) extractResultsMulti(result, designDf, method = "BH")fitLinModels( result, designDf, Formula, verbose = TRUE, inverseWeigh = FALSE, scaleByMax = TRUE, Control = lmerControl(check.conv.grad = .makeCC("ignore", tol = 0.002, relTol = NULL), check.conv.singular = .makeCC(action = "ignore", tol = 1e-04), check.conv.hess = .makeCC(action = "ignore", tol = 1e-06)) ) extractResultsMulti(result, designDf, method = "BH")
result |
A result with the measures of bivariate spatial association, from a call to the sbivar function with multiple images |
designDf |
A design dataframe |
Formula |
A formula for the linear model to be fitted, can contain random effects. |
verbose |
Should a message with number of linear models and cores be printed? |
inverseWeigh |
A boolean, should estimates be inverse weighed by variance for GAMs and Moran's I? |
scaleByMax |
A boolean, should Moran's I be scaled by maximum values before plugging into the linear model? |
Control |
A control list for lmerTest::lmer |
method |
Multiplicity correction method passed onto p.adjust |
The left hand side of "Formula" can be provided or not, but it will be overridden by "out" for downstream analysis
For fitLinModels(), a list of linear models
For extractResultsMulti() a list of matrices, all containing estimate, standard error, p-value and adjusted p-value
lmer, lm, sbivarMulti, p.adjust
# Multi-image analysis on Vicari data, using GAMs data(Vicari) # Subset to 5 images and 500 spots limit computing time VicariMultiTest <- lapply(Vicari, function(x) lapply(x[1:5], function(y) y[1:500, ])) VicariRes <- sbivar(VicariMultiTest$TranscriptOutcomes, VicariMultiTest$MetaboliteOutcomes, VicariMultiTest$TranscriptCoords, VicariMultiTest$MetaboliteCoords, normX = "rel", normY = "rel", method = "GAM" ) mouse <- substr(names(Vicari$TranscriptOutcomes)[1:5], 1, 10) designDf <- data.frame("mouse" = mouse) # The design matrix multiGAMLmms <- fitLinModels(VicariRes, designDf, Formula = ~ (1 | mouse)) # Extract the results resGAMsMulti <- extractResultsMulti(multiGAMLmms, designDf = designDf) head(resGAMsMulti$result$Intercept)# Multi-image analysis on Vicari data, using GAMs data(Vicari) # Subset to 5 images and 500 spots limit computing time VicariMultiTest <- lapply(Vicari, function(x) lapply(x[1:5], function(y) y[1:500, ])) VicariRes <- sbivar(VicariMultiTest$TranscriptOutcomes, VicariMultiTest$MetaboliteOutcomes, VicariMultiTest$TranscriptCoords, VicariMultiTest$MetaboliteCoords, normX = "rel", normY = "rel", method = "GAM" ) mouse <- substr(names(Vicari$TranscriptOutcomes)[1:5], 1, 10) designDf <- data.frame("mouse" = mouse) # The design matrix multiGAMLmms <- fitLinModels(VicariRes, designDf, Formula = ~ (1 | mouse)) # Extract the results resGAMsMulti <- extractResultsMulti(multiGAMLmms, designDf = designDf) head(resGAMsMulti$result$Intercept)
Fit GAMs to all columns of a dataframe, as a wrapper for fitGAM
fitManyGAMs( mat, coord, family = gaussian(), modality, features, pseudoCount = 1e-08, ... )fitManyGAMs( mat, coord, family = gaussian(), modality, features, pseudoCount = 1e-08, ... )
mat |
The matrix of outcomes |
coord |
The coordinate matrix |
family |
A character string indicating the family, see family. |
modality |
Character vector indicating which modality is being fit. For debugging purposes mainly |
features |
Features to be fit, others are only used to estimate the offset |
pseudoCount |
Pseudocount added to avoid zeroes for gamma distribution |
... |
Passed onto fitGAM |
A list of GAM models
A wrapper to fit GPs on all columns of a matrix
fitManyGPs(mat, coord, features, ...)fitManyGPs(mat, coord, features, ...)
mat |
Matrix of observations |
coord |
Matrix of coordinates |
features |
Features to be fit, others are only used to estimate the offset |
... |
passed onto fitGP |
Matrix of fitted GP components
Wraps GAMsSingle for lists
GAMsMulti( Xl, Yl, Cxl, Eyl, families, n_points_grid, verbose, includeGPsmooth, testSmooth, featuresX, featuresY, findVariances = FALSE )GAMsMulti( Xl, Yl, Cxl, Eyl, families, n_points_grid, verbose, includeGPsmooth, testSmooth, featuresX, featuresY, findVariances = FALSE )
Xl, Yl
|
Lists of matrices of omics measurements |
Cxl, Eyl
|
Lists of corresponding coordinate matrices of dimension two |
families, n_points_grid, includeGPsmooth, testSmooth
|
See GAMsSingle |
verbose |
Should info on type of analysis be printed? |
featuresX, featuresY
|
Features to be tested. Defaults to all features, but specifying them allows to test a limited feature set, while using the whole matrix to calculate library sizes as offset or for normalization. |
findVariances |
Should variances be calculated? For internal use only |
A list named like Xl, containing all results
Fit univariate GAMs with 2D cubic smoothing splines for both modalities, and test for correlation between all bivariate combinations.
GAMsSingle( X, Y, Cx, Ey, families, n_points_grid, verbose, featuresX, featuresY, includeGPsmooth, testSmooth, findVariances = TRUE )GAMsSingle( X, Y, Cx, Ey, families, n_points_grid, verbose, featuresX, featuresY, includeGPsmooth, testSmooth, findVariances = TRUE )
X, Y
|
Matrices of omics measurements |
Cx, Ey
|
Coordinate matrices of dimension two, belonging to X and Y respectively |
families |
A vector of length 2 giving the distributional families for the outcome values. See details of sbivarSingle. |
n_points_grid |
The number of points in the new grid for the GAMs to be evaluated on. |
verbose |
Should info on type of analysis be printed? |
featuresX, featuresY
|
Features to be tested. Defaults to all features, but specifying them allows to test a limited feature set, while using the whole matrix to calculate library sizes as offset or for normalization. |
includeGPsmooth |
Should a Gaussian random field smoother for stochastic neighbourhood similarity be included? |
testSmooth |
A character string indicating which smooth factor should be tested for, either "trend" for a deterministic process or "field" for the Gaussian random field |
findVariances |
Should variances be calculated? For internal use only |
A named list of results
Construct the SAC part of the covariance matrix using the Gaussian covariance kernel,
GaussKernel(distMat, range)GaussKernel(distMat, range)
distMat |
The complete distance matrix of Cx and Ey stacked. Only needed when
|
range |
Range (or length-scale) parameter of the Gaussian covariance kernel |
The SAC covariance matrix
Computes using the factored form
, where is the basis matrix and
. This avoids materialising the
prediction covariance matrix.
getApproxVar(predInfo, cen, x, link)getApproxVar(predInfo, cen, x, link)
predInfo |
List returned by vcovPredGam (must contain
|
cen |
Centered predictions of the other modality |
x |
Raw predictions (needed for non-identity links) |
link |
Link function name |
Scalar approximate variance
Get all categrocial variables from a dataframe
getDiscreteVars(df)getDiscreteVars(df)
df |
The data frane |
A character vector of variable names
Return feature names from a list
getFeaturesList(Xl)getFeaturesList(Xl)
Xl |
A list of feature matrices |
A vector of feature names
Construct the size variable
getSize(X, Y, normX, normY, size, scaleBySampleSums)getSize(X, Y, normX, normY, size, scaleBySampleSums)
X, Y
|
Matrices of omics measurements |
normX, normY
|
Character strings, indicating what normalization is required for X and Y matrices, respectively, before plotting, see details. |
size |
The maximum output size |
scaleBySampleSums |
A boolean, should the size of the spots be scaled by their sample sum, e.g. library size or total ion count? Recommended to reflect differences in certainty depending on sample sums. |
The final point size
Extract coordinate matrix
getSpatialCoords(X, Cx)getSpatialCoords(X, Cx)
X |
The matrix or SpatialExperiment object |
Cx |
The coordinate matrix |
A coordinate matrix
Extract data matrix
getX(X, assay)getX(X, assay)
X |
The matrix or SpatialExperiment object |
assay |
The name of the assay |
A matrix
Fit all univariate GPs on both modalities, and perform all bivariate tests across them
GPsSingle( X, Y, Cx, Ey, gpParams, numLscAlts, Quants, GPmethod, correlation, optControl, verbose, featuresX, featuresY )GPsSingle( X, Y, Cx, Ey, gpParams, numLscAlts, Quants, GPmethod, correlation, optControl, verbose, featuresX, featuresY )
X, Y
|
Matrices of omics measurements |
Cx, Ey
|
Coordinate matrices of dimension two, belonging to X and Y respectively |
gpParams |
Parameters of the Gaussian processes, see details |
numLscAlts |
Number of length scales to be tested for bivariate association |
Quants |
Most extreme quantiles of the distance distribution to take as length scales |
GPmethod |
The method by which to fit the Gaussian processes, passed onto gls as "method". |
correlation |
A corStruct object, see corStruct. At this point, corGaus is hard-coded, |
optControl |
List of control values, see glsControl. |
verbose |
Should info on type of analysis be printed? |
featuresX, featuresY
|
Features to be tested. Defaults to all features, but specifying them allows to test a limited feature set, while using the whole matrix to calculate library sizes as offset or for normalization. |
gpParams must be a list of length 2 with names 'X' and 'Y', consisting of matrices with rownames "mean", "nugget", "range" and "sigma", and column names as in X and Y. This argument allows to pass parameters of the Gaussian processes estimated with other software (e.g. with GPU acceleration) to perform the score test.
A named list of results
Make unique names
makeNames(featX, featY)makeNames(featX, featY)
featX, featY
|
vectors of feature names |
A vector of names
Make a list of offsets
makeOffset(X, family)makeOffset(X, family)
X |
The data matrix |
family |
The distribution family |
A list of length two with offsets
Convert z-value to p-value
makePval(z)makePval(z)
z |
The z-value to be converted |
The p-value
Estimate variograms using Matheron's binning estimator for many features at once, and evaluate
matheronVariograms(X, Cx, width, cutoff, variogramModels)matheronVariograms(X, Cx, width, cutoff, variogramModels)
X |
Outcome matrix |
Cx |
Coordinate matrix |
cutoff, width
|
Cutoff and width of the variogram estimation, passed onto vgm |
variogramModels |
A character vector, indicating the variogram model passed onto vgm. Currently, only "Exp" and "Lin" are implemented for computational reasons. |
The best fitting variogram model, measured by the squared error, will be used.
A list of evaluated variograms
The modified t-test is applied to all pairs, which tests for the significance of the Pearson correlation while accounting for spatial autocorrelation (Clifford et al. 1989).
ModTtestSingle(X, Y, Cx, verbose)ModTtestSingle(X, Y, Cx, verbose)
X, Y
|
Matrices of omics measurements |
Cx |
The shared coordinate matrix |
verbose |
Should info on type of analysis be printed? |
A dataframe of results sorted by p-value, also containing effective sample size (ESS) and correlation estimate.
Clifford P, Richardson S, Hemon D (1989). “Assessing the Significance of the Correlation between Two Spatial Processes.” Biometrics, 45(1), 123 - 134. ISSN 0006341X, 15410420. http://www.jstor.org/stable/2532039.
Find all Moran's I values for lists of observations matrices from different modalities, by calling the MoransISingle function.
MoransIMulti( Xl, Yl, Cxl, Eyl, findVariances, verbose, featuresX, featuresY, findMaxW, ... )MoransIMulti( Xl, Yl, Cxl, Eyl, findVariances, verbose, featuresX, featuresY, findMaxW, ... )
Xl, Yl
|
Lists of matrices of omics measurements |
Cxl, Eyl
|
Lists of corresponding coordinate matrices of dimension two |
findVariances |
Should variances be calculated? For internal use only |
verbose |
Should info on type of analysis be printed? |
featuresX, featuresY
|
Features to be tested. Defaults to all features, but specifying them allows to test a limited feature set, while using the whole matrix to calculate library sizes as offset or for normalization. |
findMaxW |
Is the maximum bivariate Moran's I needed? |
... |
passed onto MoransISingle |
A list of Moran's I estimates, standard errors and maximum values
The variance calculation requires estimation of the spatial autocorrelation structure of every feature separately, using Matheron's variogram estimator (Matheron 1963).
MoransISingle( X, Y, Cx, Ey, wo, etas, numNNs, cutoff, width, verbose, findMaxW, variogramModels, returnSEsMoransI, featuresX, featuresY, findVariances = TRUE, ... )MoransISingle( X, Y, Cx, Ey, wo, etas, numNNs, cutoff, width, verbose, findMaxW, variogramModels, returnSEsMoransI, featuresX, featuresY, findVariances = TRUE, ... )
X, Y
|
Matrices of omics measurements |
Cx, Ey
|
Coordinate matrices of dimension two, belonging to X and Y respectively |
wo |
type of weight parameter, passed onto buildWeightMat |
numNNs, etas
|
Vectors of weight matrix parameters, whose elements are passed onto buildWeightMat |
cutoff, width
|
Cutoff and width of the variogram estimation, passed onto vgm |
verbose |
Should info on type of analysis be printed? |
findMaxW |
Is the maximum bivariate Moran's I needed? |
variogramModels |
A character vector, indicating the variogram model passed onto vgm. Currently, only "Exp" and "Lin" are implemented for computational reasons. |
returnSEsMoransI |
A boolean, are standard errors of Moran's I to be returned? |
featuresX, featuresY
|
Features to be tested. Defaults to all features, but specifying them allows to test a limited feature set, while using the whole matrix to calculate library sizes as offset or for normalization. |
findVariances |
Should variances be calculated? For internal use only |
... |
passed onto variogram |
By default, a number of range parameters and corresponding weight matrices are screened for spatial association, and their p-value combined using the Cauchy combination rule by (Liu and Xie 2020). The maximum value of the bivariate Moran's I statistics are returned conditionally, as it is computation intensive and not always needed.
A dataframe of results sorted by p-value, also containing the estimated Moran's I statistic and its variance. In addition, the maximum value of the Moran's I statistic, and the parameters of the weight matrix
No multithreading is implemented for the variance calculation, as the matrix calculations involved may use inherent multithreading with OpenBLAS.
Liu Y, Xie J (2020).
“Cauchy combination test: A powerful test with analytic p-value calculation under arbitrary dependency structures.”
J. Am. Stat. Assoc., 115(529), 393 - 402.
doi:10.1080/01621459.2018.1554485.
https://pubmed.ncbi.nlm.nih.gov/33012899.
Matheron G (1963).
“Principles of geostatistics.”
Economic Geology, 58(8), 1246 - 1266.
doi:10.2113/gsecongeo.58.8.1246.
Move two sets of coordinates to 0-1. without shifting them with respect to each other
moveTwoCoords(Cx, Ey)moveTwoCoords(Cx, Ey)
Cx, Ey
|
Coordinate matrices |
A list with the shifted and scaled coordinates
Normalize to relative expression, and potentially add pseudocount and log-normalize.
normMat(x, norm, pseudoCount = 1e-08)normMat(x, norm, pseudoCount = 1e-08)
x |
The matrix |
norm |
A character string, either "none", "log" or "rel" |
pseudoCount |
A pseudocount added prior to log-normalization to avoid taking the log of zero |
norm = "none" is pass-through, norm = "rel" divides by sample sums, "log" adds a pseudocount, divides by sample sums and log-normalizes.
A normalized matrix
mat <- matrix(rpois(2000, lambda = 3), 40, 50) nMat <- normMat(mat, norm = "rel")mat <- matrix(rpois(2000, lambda = 3), 40, 50) nMat <- normMat(mat, norm = "rel")
Plot the coordinates of two modalities onto the same coordinate framework, in two different colours. This is a useful check of the alignment and overlap.
plotCoords(Cx, Ey, pchX = 1, pchY = 3, cex = 0.8, ...) plotCoordsMulti(Cxl, Eyl, ...)plotCoords(Cx, Ey, pchX = 1, pchY = 3, cex = 0.8, ...) plotCoordsMulti(Cxl, Eyl, ...)
Cx, Ey
|
Coordinate matrices of dimension two, belonging to X and Y respectively |
pchX, pchY
|
Point shapes for x and y |
cex |
Expansion factor |
... |
passed onto plot() |
Cxl, Eyl
|
Lists of corresponding coordinate matrices of dimension two |
plotCoordsMulti() is a wrapper for plotCoords for lists of coordinates, and requires the user to set par(mar = ) appropriately, such that all plots are shown.
Plots to the plotting device
data(Vicari) plotCoords(Vicari$TranscriptCoords[[1]], Vicari$MetaboliteCoords[[1]]) # For multiple coordinates par(mfrow = c(2, 3)) foo <- lapply(names(Vicari$TranscriptCoords), function(nam) { plotCoords(Vicari$TranscriptCoords[[nam]], Vicari$MetaboliteCoords[[nam]], main = nam) }) par(mfrow = c(1, 1))data(Vicari) plotCoords(Vicari$TranscriptCoords[[1]], Vicari$MetaboliteCoords[[1]]) # For multiple coordinates par(mfrow = c(2, 3)) foo <- lapply(names(Vicari$TranscriptCoords), function(nam) { plotCoords(Vicari$TranscriptCoords[[nam]], Vicari$MetaboliteCoords[[nam]], main = nam) }) par(mfrow = c(1, 1))
Spawns a three-panel plot with the splines fitted for the two variables, plus a visualization of regions with positive and negative correlations between those splines. The splines are refitted using fitGAM, so no GAM-objects can be provided. This function takes entire outcome matrices X and Y as argument, to be able to account for offsets in refitting the GAMs. plotGAMsTopResults() plots the feature with the smallest p-value in the 'results' object.
plotGAMs( X, Y, Cx, Ey, features, offsets = list(), scaleFun = "scaleMinusOne", families = list(X = gaussian(), Y = gaussian()), addTitle = TRUE, normX = c("none", "rel", "log"), normY = c("none", "rel", "log"), n_points_grid = 600, includeGPsmooth = TRUE, smooth = "trend", ... ) plotGAMsTopResults( results, X, Y, Cx, Ey, topRank = 1, parameter = "Intercept", families = results$families, ... )plotGAMs( X, Y, Cx, Ey, features, offsets = list(), scaleFun = "scaleMinusOne", families = list(X = gaussian(), Y = gaussian()), addTitle = TRUE, normX = c("none", "rel", "log"), normY = c("none", "rel", "log"), n_points_grid = 600, includeGPsmooth = TRUE, smooth = "trend", ... ) plotGAMsTopResults( results, X, Y, Cx, Ey, topRank = 1, parameter = "Intercept", families = results$families, ... )
X, Y
|
Matrices of omics measurements, or lists thereof |
Cx, Ey
|
Corresponding coordinate matrices of dimension two, or lists thereof |
features |
The features to plot |
offsets |
List of length two with offsets |
scaleFun |
The scaling function to be applied before plotting |
families |
A vector of length 2 giving the distributional families for the outcome values. See details of sbivarSingle. |
addTitle |
A boolean, should a title be plotted |
normX, normY
|
Character strings, indicating what normalization is required for X and Y matrices, respectively, before plotting, see details. |
n_points_grid |
The number of points in the new grid for the GAMs to be evaluated on. |
includeGPsmooth |
Should a Gaussian random field smoother for stochastic neighbourhood similarity be included? |
smooth |
Which smooth to plot, either "trend" for the deterministic surface, or "field" for the Gaussian random field |
... |
passed onto fitGAM |
results |
Result of a call to sbivar (single-image) or to fitLinModels (multi-image) |
topRank |
An integer, the feature pair with the rank-th smallest p-value is plotted |
parameter |
The linear model parameter used to find the feature with the strongest effect. The default is the intercept, i.e. the overall effect. |
A ggplot object
Both spline surfaces are scaled to the [-1,1] range, the same as the correlation has naturally, for legibility.
# Single image example(sbivar, "sbivar") plotGAMs(X, Y, Cx, Ey, features = c("X1", "Y2")) plotGAMsTopResults(resGAMs, X, Y, Cx = Cx, Ey = Ey) # Multi image, arbitrary pair data(Vicari) plotGAMs(Vicari$TranscriptOutcomes, Vicari$MetaboliteOutcomes, Vicari$TranscriptCoords, Vicari$MetaboliteCoords, features = c("Pcp4", "Dopamine") )# Single image example(sbivar, "sbivar") plotGAMs(X, Y, Cx, Ey, features = c("X1", "Y2")) plotGAMsTopResults(resGAMs, X, Y, Cx = Cx, Ey = Ey) # Multi image, arbitrary pair data(Vicari) plotGAMs(Vicari$TranscriptOutcomes, Vicari$MetaboliteOutcomes, Vicari$TranscriptCoords, Vicari$MetaboliteCoords, features = c("Pcp4", "Dopamine") )
Plot a chosen feature pair, or the highest ranking feature pair, for a single image or multiple images.
plotTopPair( results, ..., normX = results$normX, normY = results$normY, topRank = 1, parameter = "Intercept", scaleBySampleSums = FALSE ) plotPairSingle( X, Y, Cx, Ey, features, normX = c("none", "rel", "log"), normY = c("none", "rel", "log"), assayX, assayY, scaleBySampleSums = TRUE, size = 1.5, ... ) plotPairMulti( Xl, Yl, Cxl, Eyl, features, normX = c("none", "rel", "log"), scaleBySampleSums = FALSE, normY = c("none", "rel", "log"), size = 1.25, assayX, assayY, theme = theme_bw() ) plotPairSingleVectors( x, y, Cx, Ey, size, modalityNames = c("Modality X", "Modality Y"), theme = theme_bw(), ... )plotTopPair( results, ..., normX = results$normX, normY = results$normY, topRank = 1, parameter = "Intercept", scaleBySampleSums = FALSE ) plotPairSingle( X, Y, Cx, Ey, features, normX = c("none", "rel", "log"), normY = c("none", "rel", "log"), assayX, assayY, scaleBySampleSums = TRUE, size = 1.5, ... ) plotPairMulti( Xl, Yl, Cxl, Eyl, features, normX = c("none", "rel", "log"), scaleBySampleSums = FALSE, normY = c("none", "rel", "log"), size = 1.25, assayX, assayY, theme = theme_bw() ) plotPairSingleVectors( x, y, Cx, Ey, size, modalityNames = c("Modality X", "Modality Y"), theme = theme_bw(), ... )
results |
Results returned by sbivarSingle or extractResultsMulti |
... |
passed onto lower level functions |
normX, normY
|
Character strings, indicating what normalization is required for X and Y matrices, respectively, before plotting, see details. |
topRank |
An integer, the feature pair with the rank-th smallest p-value is plotted |
parameter |
The linear model parameter used to find the feature with the strongest effect. The default is the intercept, i.e. the overall effect. |
scaleBySampleSums |
A boolean, should the size of the spots be scaled by their sample sum, e.g. library size or total ion count? Recommended to reflect differences in certainty depending on sample sums. |
X |
the input object, containing measurements of the first modality, see methods('sbivar') |
Y |
Matrix or SpatialExperiment object of second modality |
Cx, Ey
|
Coordinate matrices of dimension two, belonging to X and Y respectively |
features |
Feature vector of length 2 to be plotted |
assayX, assayY
|
Assay names to be used in the analysis, see assay |
size |
Point size |
Xl, Yl
|
Lists of matrices of omics measurements |
Cxl, Eyl
|
Lists of corresponding coordinate matrices of dimension two |
theme |
the ggplot2 theme |
x, y
|
Outcome vectors |
modalityNames |
Names to be given to the modalities, appearing in the strip text of the columns. For plotTopPair() and plotPairSingle(), the feature names are used. |
For sequence count data, such as transcriptomics, normalization may be indicated to achieve clear plots (normX = "rel" or "log", see normMat). The normalization used for plotting is not necessarily the same as the one used for the analysis.
A ggplot object
extractResultsMulti, fitLinModels
### Single image # Single image analysis on synthetic data n <- 8e1 m <- 9e1 p <- 3 k <- 2 X <- matrix(rnorm(n * p), n, p, dimnames = list(paste0("sampleX", seq_len(n)), paste0("X", seq_len(p))) ) Y <- matrix(rnorm(m * k), m, k, dimnames = list(paste0("sampleY", seq_len(m)), paste0("Y", seq_len(k))) ) Cx <- matrix(runif(n * 2), n, 2, dimnames = list(rownames(X), c("x", "y"))) Ey <- matrix(runif(m * 2), m, 2, dimnames = list(rownames(Y), c("x", "y"))) resMoransI <- sbivar(X, Y, Cx, Ey, method = "Moran's I") # Plot the feature pair with the most significant signal plotTopPair(resMoransI, X, Y, Cx, Ey) # Plot an arbitrary feature pair plotPairSingle(X, Y, Cx, Ey, features = c("X1", "Y1")) ### Multi image data(Vicari) # Plot an arbitrary feature pair plotPairMulti(Vicari$TranscriptOutcomes, Vicari$MetaboliteOutcomes, Vicari$TranscriptCoords, Vicari$MetaboliteCoords, normX = "rel", normY = "rel", features = c("Gnas", "Tocopherol") )### Single image # Single image analysis on synthetic data n <- 8e1 m <- 9e1 p <- 3 k <- 2 X <- matrix(rnorm(n * p), n, p, dimnames = list(paste0("sampleX", seq_len(n)), paste0("X", seq_len(p))) ) Y <- matrix(rnorm(m * k), m, k, dimnames = list(paste0("sampleY", seq_len(m)), paste0("Y", seq_len(k))) ) Cx <- matrix(runif(n * 2), n, 2, dimnames = list(rownames(X), c("x", "y"))) Ey <- matrix(runif(m * 2), m, 2, dimnames = list(rownames(Y), c("x", "y"))) resMoransI <- sbivar(X, Y, Cx, Ey, method = "Moran's I") # Plot the feature pair with the most significant signal plotTopPair(resMoransI, X, Y, Cx, Ey) # Plot an arbitrary feature pair plotPairSingle(X, Y, Cx, Ey, features = c("X1", "Y1")) ### Multi image data(Vicari) # Plot an arbitrary feature pair plotPairMulti(Vicari$TranscriptOutcomes, Vicari$MetaboliteOutcomes, Vicari$TranscriptCoords, Vicari$MetaboliteCoords, normX = "rel", normY = "rel", features = c("Gnas", "Tocopherol") )
Print a message for the current iteration
printIteration(current, all)printIteration(current, all)
current |
current state of the iterator |
all |
vector of all iterators |
prints message to output
Print feature progress
printProgress(feat, allFeats, verbose)printProgress(feat, allFeats, verbose)
feat |
The current feature |
allFeats |
Vector of all features |
verbose |
Boolean, should output be printed? |
Prints progress message to output
Replace the left hand side of a formula by a fixed string
replaceLhs(x, repl = "out")replaceLhs(x, repl = "out")
x |
a formula |
repl |
the replacement string |
A formula
Perform a bivariate spatial association analysis, either on a single or multiple images. Depending on the input, the workhorse functions sbivarSingle (single-image) or sbivarMulti (multi-image) are called.
sbivar(X, ...) ## S4 method for signature 'matrix' sbivar(X, Y, Cx, Ey, ...) ## S4 method for signature 'list' sbivar(X, Y, Cx, Ey, assayX = NULL, assayY = NULL, ...) ## S4 method for signature 'SpatialExperiment' sbivar(X, Y, assayX, assayY, sample_id_x, sample_id_y = sample_id_x, ...) ## S4 method for signature 'MultiAssayExperiment' sbivar(X, experimentX, experimentY, assayX, assayY, ...)sbivar(X, ...) ## S4 method for signature 'matrix' sbivar(X, Y, Cx, Ey, ...) ## S4 method for signature 'list' sbivar(X, Y, Cx, Ey, assayX = NULL, assayY = NULL, ...) ## S4 method for signature 'SpatialExperiment' sbivar(X, Y, assayX, assayY, sample_id_x, sample_id_y = sample_id_x, ...) ## S4 method for signature 'MultiAssayExperiment' sbivar(X, experimentX, experimentY, assayX, assayY, ...)
X |
the input object, containing measurements of the first modality, see methods('sbivar') |
... |
additional constructor and analysis arguments |
Y |
Matrix or SpatialExperiment object of second modality |
Cx, Ey
|
Coordinate matrices of dimension two, belonging to X and Y respectively |
assayX, assayY
|
Assay names to be used in the analysis, see assay |
sample_id_x, sample_id_y
|
If provided, these are used to discriminate between different images included in the same SpatialExperiment objects X and Y. By default, they are assumed to be the same for both X and Y. |
experimentX, experimentY
|
Names of the experiments in X and Y to be used in the analysis |
A list containing the analysis result, along with parameters used in the analysis
# Single image analysis on synthetic data n <- 8e1 m <- 9e1 p <- 3 k <- 2 X <- matrix(rnorm(n * p), n, p, dimnames = list(paste0("sampleX", seq_len(n)), paste0("X", seq_len(p))) ) Y <- matrix(rnorm(m * k), m, k, dimnames = list(paste0("sampleY", seq_len(m)), paste0("Y", seq_len(k))) ) Cx <- matrix(runif(n * 2), n, 2, dimnames = list(rownames(X), c("x", "y"))) Ey <- matrix(runif(m * 2), m, 2, dimnames = list(rownames(Y), c("x", "y"))) resMoransI <- sbivar(X, Y, Cx, Ey, method = "Moran's I") resGAMs <- sbivar(X, Y, Cx, Ey, method = "GAMs") Y2 <- matrix(rnorm(n * k), n, k, dimnames = list(paste0("sampleX", seq_len(n)), paste0("Y", seq_len(k))) ) resModtTestJoint <- sbivar(X, Y2, Cx, method = "Modified") # Single image analysis on synthetic data, converted to SpatialExperiment if (require(SpatialExperiment)) { seX <- SpatialExperiment( assays = list("transcripts" = t(X)), spatialCoords = Cx ) seY <- SpatialExperiment( assays = list("metabolites" = t(Y)), spatialCoords = Ey ) resModtGPs <- sbivar(seX, seY, assayX = "transcripts", assayY = "metabolites", method = "GPs" ) }# Single image analysis on synthetic data n <- 8e1 m <- 9e1 p <- 3 k <- 2 X <- matrix(rnorm(n * p), n, p, dimnames = list(paste0("sampleX", seq_len(n)), paste0("X", seq_len(p))) ) Y <- matrix(rnorm(m * k), m, k, dimnames = list(paste0("sampleY", seq_len(m)), paste0("Y", seq_len(k))) ) Cx <- matrix(runif(n * 2), n, 2, dimnames = list(rownames(X), c("x", "y"))) Ey <- matrix(runif(m * 2), m, 2, dimnames = list(rownames(Y), c("x", "y"))) resMoransI <- sbivar(X, Y, Cx, Ey, method = "Moran's I") resGAMs <- sbivar(X, Y, Cx, Ey, method = "GAMs") Y2 <- matrix(rnorm(n * k), n, k, dimnames = list(paste0("sampleX", seq_len(n)), paste0("Y", seq_len(k))) ) resModtTestJoint <- sbivar(X, Y2, Cx, method = "Modified") # Single image analysis on synthetic data, converted to SpatialExperiment if (require(SpatialExperiment)) { seX <- SpatialExperiment( assays = list("transcripts" = t(X)), spatialCoords = Cx ) seY <- SpatialExperiment( assays = list("metabolites" = t(Y)), spatialCoords = Ey ) resModtGPs <- sbivar(seX, seY, assayX = "transcripts", assayY = "metabolites", method = "GPs" ) }
This function calculates measures of spatial association for every image. The resulting estimates can then be analysed further using the fitLinModels function.
sbivarMulti( Xl, Yl, Cxl, Eyl, families = list(X = gaussian(), Y = gaussian()), method = c("Moran's I", "GAMs", "Correlation"), wo = c("Gauss", "nn"), numNNs = c(4, 8, 24), etas = c(5e-06, 2e-04, 0.02), featuresX = getFeaturesList(Xl), featuresY = getFeaturesList(Yl), normX = c("none", "rel", "log"), normY = c("none", "rel", "log"), variogramModels = c("Exp", "Lin"), width = cutoff/15, cutoff = sqrt(2)/3, pseudoCount = 1e-08, n_points_grid = 600, verbose = TRUE, findVariances = FALSE, findMaxW = TRUE, includeGPsmooth = TRUE, testSmooth = c("trend", "field") )sbivarMulti( Xl, Yl, Cxl, Eyl, families = list(X = gaussian(), Y = gaussian()), method = c("Moran's I", "GAMs", "Correlation"), wo = c("Gauss", "nn"), numNNs = c(4, 8, 24), etas = c(5e-06, 2e-04, 0.02), featuresX = getFeaturesList(Xl), featuresY = getFeaturesList(Yl), normX = c("none", "rel", "log"), normY = c("none", "rel", "log"), variogramModels = c("Exp", "Lin"), width = cutoff/15, cutoff = sqrt(2)/3, pseudoCount = 1e-08, n_points_grid = 600, verbose = TRUE, findVariances = FALSE, findMaxW = TRUE, includeGPsmooth = TRUE, testSmooth = c("trend", "field") )
Xl, Yl
|
Lists of matrices of omics measurements |
Cxl, Eyl
|
Lists of corresponding coordinate matrices of dimension two |
families, n_points_grid
|
Passed onto GAMsMulti |
method |
A character string, indicating which method to apply |
wo, numNNs, etas
|
Passed onto MoransISingle |
featuresX, featuresY
|
Features to be tested. Defaults to all features, but specifying them allows to test a limited feature set, while using the whole matrix to calculate library sizes as offset or for normalization. |
normX, normY, pseudoCount
|
Normalization parameters, passed onto normMat |
variogramModels |
A character vector, indicating the variogram model passed onto vgm. Currently, only "Exp" and "Lin" are implemented for computational reasons. |
cutoff, width
|
Cutoff and width of the variogram estimation, passed onto vgm |
verbose |
Should info on type of analysis be printed? |
findVariances |
Should variances be calculated? For internal use only |
findMaxW |
Is the maximum bivariate Moran's I needed? |
includeGPsmooth |
Should a Gaussian random field smoother for stochastic neighbourhood similarity be included? |
testSmooth |
A character string indicating which smooth factor should be tested for, either "trend" for a deterministic process or "field" for the Gaussian random field |
A list containing
estimates |
The estimated measures of association |
multi |
TRUE, a flag for the type of analysis |
normX, normY, method
|
As provided |
families, wo, wParams
|
Optional, as provided. wParams are either etas or numNNs |
All methods use multithreading on the cluster provided using the BiocParallel package
fitLinModels, MoransIMulti, correlationsMulti, GAMsMulti
The tests can be applied to either joint or disjoint coordinate sets.
sbivarSingle( X, Y, Cx, Ey, method = c("Moran's I", "GAMs", "Modified t-test", "GPs"), normX = c("none", "rel", "log"), normY = c("none", "rel", "log"), pseudoCount = 1e-08, etas = c(5e-06, 2e-04, 0.02), findMaxW = FALSE, returnSEsMoransI = TRUE, families = list(X = gaussian(), Y = gaussian()), featuresX = colnames(X), featuresY = colnames(Y), n_points_grid = 600, verbose = TRUE, testSmooth = c("trend", "field"), variogramModels = c("Exp", "Lin"), width = cutoff/15, cutoff = sqrt(2)/3, wo = c("Gauss", "nn"), numNNs = c(4, 8, 24), includeGPsmooth = TRUE, GPmethod = c("REML", "ML"), gpParams, Quants = c(0.005, 0.5), numLscAlts = 5, optControl = lmeControl(opt = "optim", maxIter = 500, msMaxIter = 500, niterEM = 1000, msMaxEval = 1000), correlation = corGaus(form = ~x + y, nugget = TRUE, value = c(1, 0.25)) )sbivarSingle( X, Y, Cx, Ey, method = c("Moran's I", "GAMs", "Modified t-test", "GPs"), normX = c("none", "rel", "log"), normY = c("none", "rel", "log"), pseudoCount = 1e-08, etas = c(5e-06, 2e-04, 0.02), findMaxW = FALSE, returnSEsMoransI = TRUE, families = list(X = gaussian(), Y = gaussian()), featuresX = colnames(X), featuresY = colnames(Y), n_points_grid = 600, verbose = TRUE, testSmooth = c("trend", "field"), variogramModels = c("Exp", "Lin"), width = cutoff/15, cutoff = sqrt(2)/3, wo = c("Gauss", "nn"), numNNs = c(4, 8, 24), includeGPsmooth = TRUE, GPmethod = c("REML", "ML"), gpParams, Quants = c(0.005, 0.5), numLscAlts = 5, optControl = lmeControl(opt = "optim", maxIter = 500, msMaxIter = 500, niterEM = 1000, msMaxEval = 1000), correlation = corGaus(form = ~x + y, nugget = TRUE, value = c(1, 0.25)) )
X, Y
|
Matrices of omics measurements |
Cx, Ey
|
Coordinate matrices of dimension two, belonging to X and Y respectively |
method |
A character string, indicating which method to apply |
normX, normY, pseudoCount
|
Normalization parameters, passed onto normMat |
featuresX, featuresY
|
Features to be tested. Defaults to all features, but specifying them allows to test a limited feature set, while using the whole matrix to calculate library sizes as offset or for normalization. |
n_points_grid, families, includeGPsmooth, testSmooth
|
Passed onto GAMsSingle |
verbose |
Should info on type of analysis be printed? |
wo, variogramModels, numNNs, etas, cutoff, width, returnSEsMoransI, findMaxW
|
Parameters for the calculation of Moran's I, passed onto buildWeightMat |
GPmethod, Quants, numLscAlts, optControl, gpParams, correlation
|
Passed onto fitGP |
If only Cx is supplied and X and Y have the same number of rows, a joint analysis is performed If Cx and Ey are provided, and X and Y have the same number of rows, equality of Cx and Ey is checked. If true, a joint analysis is run, with a warning.
X and Y need to have rownames for matching to the coordinates, and column names for identifying the features. Cx and Ey must have rownames matching those in X and Y, and have two columns. For GAMs, usually no normalization is needed, as the non-gaussianity is taken care of by the outcome distribution, offset and link functions. Currently, identity, inverse and log-link are implemented.
A list with at least the following components
result |
A matrix which contains at least a p-values ("pVal") and a Benjamini-Hochberg adjusted p-value ("pAdj"), sorted by increasing p-value. |
multi |
FALSE, a flag for the type of analysis |
method, normX, normY
|
As provided |
families, wo, wParams
|
Optional, as provided. wParams are either etas or numNNs |
All methods use multithreading on the cluster provided using the BiocParallel package
MoransISingle, ModTtestSingle, GAMsSingle, GPsSingle
Wrapper to normalize, select feature and scale
scaleHelpFun(X, feat)scaleHelpFun(X, feat)
X |
data matrix |
feat |
the feature name |
Returns vector of NA if feature not found, leading to grey in the plots
A vector of values
Scale to [-1,1] range
scaleMinusOne(y, na.rm = TRUE)scaleMinusOne(y, na.rm = TRUE)
y |
The vector to be scaled |
na.rm |
The scaled vector
Scale to [0,1] range
scaleZeroOne(y, na.rm = TRUE)scaleZeroOne(y, na.rm = TRUE)
y |
The vector to be scaled |
na.rm |
The scaled vector
Name a character vector after itself
selfName(x)selfName(x)
x |
The vector to be named |
The named vector
selfName(LETTERS[1:5])selfName(LETTERS[1:5])
Split Spatial Experiment into a list of SpatialExperiment objects, based on a variable present in the colData slot
splitSpatialExperiment(spe, sample_id)splitSpatialExperiment(spe, sample_id)
spe |
The SpatialExperiment object |
sample_id |
A character vector |
A list of SpatialExperiment objects
Split a string
sund(string, split = "__")sund(string, split = "__")
string |
The string |
split |
string to split by |
A character vector of length 2
The variance of the correlation between the spline surfaces is found by propagating the uncertainties on the spline parameters through uncertainty on the spline predictions to uncertainty on the correlation estimate.
testGAM(modelx, modely, predx, predy, findVariances)testGAM(modelx, modely, predx, predy, findVariances)
modelx, modely
|
Two fitted GAMs |
predx, predy
|
Predictions and covariance matrices of fitted GAMs in common grid |
findVariances |
Should variances be calculated? For internal use only |
A vector with the correlation estimate, its standard error and the p-value
This function tests for the variance of a random effect, causing the covariance, to be zero. It is a score test as developed by (Zhang and Lin 2003), with the test statistic having a scaled chi-square distribution.
testGP(x, y, crossBlocks, solXonly, solYonly, sx, sy, derivX, derivY, distMat)testGP(x, y, crossBlocks, solXonly, solYonly, sx, sy, derivX, derivY, distMat)
x, y
|
outcome vectors |
crossBlocks |
An |
solXonly, solYonly
|
Parameters of the Gaussian processes of x and y |
sx, sy
|
Inverses of the covariance matrices of x and y respectively.
Computed from |
derivX, derivY
|
|
distMat |
The complete distance matrix of Cx and Ey stacked. Only needed when
|
Two tests are performed, one for positive and one for negative association, and two times the smallest p-value to achieve a two-sided test. The sign indicates which direction was most significant.
A vector of length 2: a p-value and an indicator of the sign: +1 for positive association, -1 for negative
Zhang D, Lin X (2003). “Hypothesis testing in semiparametric additive mixed models.” Biostatistics, 4(1), 57 - 74.
A (mxm) matric has one trace (the sum of the diagonal elements), a (mxmxp) array has p traces
tr(x, dim = c(1, 2))tr(x, dim = c(1, 2))
x |
Matrix or array |
dim |
Dimensions defining matrices to find traces over |
A trace or vector of traces
Spline predictions are , where is the basis (design) matrix
and are the spline coefficients. Rather than materialising the full
prediction covariance
, only ()
and () are returned. The quadratic form
is then computed cheaply in dimensions by getApproxVar.
vcovPredGam(model, newdata, testSmooth, findVariances = TRUE)vcovPredGam(model, newdata, testSmooth, findVariances = TRUE)
model |
The fitted GAM |
newdata |
The grid on which predictions are made |
testSmooth |
A character string indicating which smooth factor should be tested for, either "trend" for a deterministic process or "field" for the Gaussian random field |
findVariances |
Should variances be calculated? For internal use only |
A list with components
pred |
A vector of predictions on the response scale |
basis |
|
coef_cov |
|
Spatial transcriptomics and metabolomics data measured on the same tissue sections of mouse brains on a regular grid by Vicari et al. 2024. Only a subset of the data, consisting of the 5 most abundant transcripts and metabolites for 6 samples, are included in the package for computational and memory reasons. The images were pre-aligned manually with the help of MAGPIE (Williams et al. 2025). The data consist of two lists of outcome variables and their coordinates.
data(Vicari)data(Vicari)
Four lists of data matrices:
Coordinate lists
Outcome matrices
doi:10.1038/s41587-023-01937-y
Vicari M, Mirzazadeh R, Nilsson A, Shariatgorji R, Bjärterot P, Larsson L, Lee H, Nilsson M, Foyer J, Ekvall M, Czarnewski P, Zhang X, Svenningsson P, Käll L, Andrén PE, Lundeberg J (2024).
“Spatial multimodal analysis of transcriptomes and metabolomes in tissues.”
Nat. Biotechnol., 42(7), 1046 - 1050.
doi:10.1038/nmeth.2089.
https://pubmed.ncbi.nlm.nih.gov/37667091.
Williams EC, Franzén L, Lindvall MO, Hamm G, Oag S, Majumder MM, Denholm J, Hamidinekoo A, Morlanes JE, Vicari M, others (2025).
“Spatially resolved integrative analysis of transcriptomic and metabolomic changes in tissue injury studies.”
bioRxiv, 2025 - 2002.
The results of single- or multi-image analysis are written to an excel spreadsheet with separate tabs per parameter tested, sorted by increasing p-value.
writeSbivarToXlsx( results, file, overwrite = FALSE, digits = 3, sigLevel = 0.05 )writeSbivarToXlsx( results, file, overwrite = FALSE, digits = 3, sigLevel = 0.05 )
results |
The analysis results, from a call to sbivar (single-image) or extractResultsMulti (multi-image) |
file |
The file to write the results to |
overwrite |
A boolean, should the file be overwritten if it exists already? |
digits |
An integer, the number of significant digits to retain for the effect size, raw and adjusted p-values |
sigLevel |
The significance level threshold to use for the adjusted p-values, only features exceeding the threshold are written to the file. Set this parameter to 1 to write all features |
If no feature exceeds the significance threshold for a certain parameter, an empty tab is created. For each fixed effect, a single tab is written. The "baseline" tabs indicate the overall patterns, the other tabs are named after the fixed effects and indicate departure from this baseline for this fixed effect.
Returns invisible with a message when writing operation successful, otherwise throws a warning.
example(sbivar, "sbivar") # The significance level is set to 1 here for illustration, # meaning that all feature pairs will be written to the spreadsheet. # Single result writeSbivarToXlsx(resGAMs, file = tmpFile <- tempfile(fileext = ".xlsx"), sigLevel = 1) file.exists(tmpFile)example(sbivar, "sbivar") # The significance level is set to 1 here for illustration, # meaning that all feature pairs will be written to the spreadsheet. # Single result writeSbivarToXlsx(resGAMs, file = tmpFile <- tempfile(fileext = ".xlsx"), sigLevel = 1) file.exists(tmpFile)