multipointRmultipointR
is a package to fit point process models from spatstat.model
on SpatialExperiment
and SpatialFeatureExperiment
objects. Cells are approximated as points and their distribution in
space is modelled as the log-linear combination of spatial (or
non-spatial) covariates. multipointR
has a mode to do inferences on the level of a single image with
fitModel and across multiple images with
fitModelAcrossImages.
multipointR can be installed and loaded from Bioconductor as follows
For this document, we will use the well established MIBI TOF dataset from Keren et al. (2018).
spe <- SpatialDatasets::spe_Keren_2018()
spe
#> class: SpatialExperiment
#> dim: 48 197678
#> metadata(0):
#> assays(1): intensities
#> rownames(48): Na Si ... Ta Au
#> rowData names(0):
#> colnames(197678): 1 2 ... 197677 197678
#> colData names(40): CellID imageID ... Censored sample_id
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : x y
#> imgData names(0):The preprocessing and visualisation of this dataset is taken from the Statial vignette.
#> [1] "Keratin_Tumour" "dn_T_CD3" "B_cell" "CD4_T_cell"
#> [5] "DC_or_Mono" "Unidentified" "Macrophages" "CD8_T_cell"
#> [9] "Other_Immune" "Endothelial" "Mono_or_Neu" "Mesenchymal"
#> [13] "Neutrophils" "NK" "Tumour" "DC"
#> [17] "Tregs"
Plot of the entire dataset with meta-cell type labels “Immune”, “Tumour” and “p53+ Tumour”.
Parametric point process models (PPMs), as implemented in the
spatstat package, model the distribution of points in space
as an (in)homogeneous Poisson distribution (Baddeley et al. 2014). Thus, the expected
number of points falling in a region \(B\) follows an inhomogeneous Poisson
process with rate \(\lambda(u)\), the
local intensity (Baddeley et
al. 2016).
\[ \mathbb{E}[n(\mathbf{X} \cap B)] = \int_B \lambda(u)du \]
The PPM then models this inhomogeneous intensity \(\lambda(u)\) as a function of spatial and non-spatial covariates
\[ \lambda_{\Theta}(u) = \exp(\beta_0(u) + \Theta^TZ(u)) \]
where \(\beta_0(u)\) is the baseline intensity, \(Z(u)\) are spatially varying covariate functions and \(\Theta\) is the parameter vector to be estimated. In particular, parameters within \(\Theta\) would be relevant for making inferences in order to make claims about spatial associations.
In the Poisson point process, points can be arbitrarily close together, which in a tissue of cells is not feasible. However, a constraint on cell co-localization can be imposed by a Gibbs point process. This class of models allows inhibitory or repulsive interactions between points to be modeled. In our case, we model the interaction between cells as a so-called “hard core” process, whereby the resulting conditional intensity at a location \(u\) is given by
\[ \lambda(u \mid \mathbf{x}) = \cases{ \phi(u) & \text{if $u$ is permissible}\\ 0 & \text{if $u$ is not permissible}\\ } \]
where \(\phi(u)\) is the intensity function of an inhomogeneous Poisson process (note the change in notation to before). The hardcore permission is given if the distance of \(u\) to any other point \(x_i\) is greater than the hard core diameter \(r\). This diameter \(r\) can be either provided by the user (e.g. prior knowledge on the average cell size) or estimated from the data (Baddeley et al. 2016).
The estimation from a dataset then becomes the following:
\[ \lambda_{\Theta}(u \mid \mathbf{x}) = \exp(\beta_0(u) + \Theta^TZ(u)) \cdot h(u,r,\mathbf{x}) \]
where \(\Theta\) are the first order terms (similar to the Poisson point process), whereas \(h(u,r,\mathbf{x})\), the second order terms, define the hard core interaction with interaction radius \(r\), subject to the constraint:
\[ h(u,r,\mathbf{x}) = \cases{ 0 & if $\lVert u-v \rVert \leq r, v \in \mathbf{x}\setminus\{u\}$\\ 1 & if $\lVert u-v \rVert > r, v \in \mathbf{x}\setminus\{u\}$\\ } \]
(Baddeley et al. 2016; Baddeley and Turner 2000).
For the single image case, we will analyse the relationship between the distribution of p53+ tumour cells and immune cells in image \(6\).
We notice qualitatively a clearly separated tumour with p53+ cells and immune cells at the tumour border.
First, we will estimate the distribution of p53+ tumour cells as a homogeneous intensity in space, i.e., we fit a model of the form
\[ \lambda(u) = \exp(\beta_0) \cdot h(u,r,\mathbf{x}) \]
with a constant intercept \(\beta_0\)
speSub <- subset(spe, , imageID == "6")
m0 <- fitModel(spe = speSub,
marks = "cellTypeNew",
interaction = "Hardcore",
formula = as.formula("p53_Tumour ~ 1")
)
m0
#> Stationary Hard core process
#> Fitted to point pattern dataset 'p53_Tumour'
#>
#> First order term: beta = 0.0001885579
#>
#> Hard core distance: 10.1799
#>
#> For standard errors, type coef(summary(x))The model is very basic and has not many parameters, let’s look at the spatial trend instead
The resulting spatial trend is a flat surface.
This is not a very sensible model since we see clear a inhomogeneous distribution of points in space. Furthermore, we note that the Gibbs hard core model estimated some regions with an intensity of zero because the points are too close to each other. This raises the question whether the cells are actually closer than \(10 \mu m\).
This leads to a conceptual question on how to parametrise the Gibbs model best for biological tissue. An alternative to the hard core interaction model is the Strauss interaction model that parametrises not a hard cut-off to zero but rather an interaction probability \(\gamma\). A middle ground is the Fiksel process or a Strauss-Hardcore process that combines both a hard core effect for effects \(r_h\) where no cells are found and a Strauss/Fiksel process where cells are less likely to be found. This can either be a fixed probability \(\gamma\) (Strauss-Hard) or follow a double exponential decay (Fiksel). We will use a Fiksel process for this tutorial as it is more flexible in modelling interactions (Takacs and Fiksel 1986).
\[ h(u,r_h,r_f,\mathbf{x}) = \cases{ 0 & if $\lVert u-v \rVert \leq r_h, v \in \mathbf{x}\setminus\{u\}$\\ \exp(a\exp(-\kappa d)) & if $r_h < \lVert u-v \rVert \leq r_f, v \in \mathbf{x}\setminus\{u\}$\\ 1 & if $\lVert u-v \rVert > r_f, v \in \mathbf{x}\setminus\{u\}$\\ } \]
where \(r_h\) is the hard core radius (estimated from the data) and \(r_f\) is the Fiksel interaction radius (user provided) (Baddeley et al. 2016).
Next, we will formulate a null model of the distribution of p53 positive cells. For this model, we will specify an inhomogeneous intensity varying with a bivariate spline model of \(x\) and \(y\) (\(s(x,y)\)). The model we fit is of the form
\[ \lambda(u) = \exp(\beta_0(u)) \cdot Z_0(u) \cdot h(u,r_h,r_f,\mathbf{x}) \]
with a spatially-varying intercept \(\beta_0(u)\), which we specify as an offset image of the intensity \(Z_0(u)\)
m1 <- fitModel(spe = speSub,
marks = "cellTypeNew",
formula = as.formula("p53_Tumour ~ s(x,y)"),
interaction = "Fiksel",
use.gam = TRUE
)
m1
#> Nonstationary Fiksel process
#> Fitted to point pattern dataset 'p53_Tumour'
#>
#> Log trend: ~s(x, y)
#>
#> Fitted trend coefficients:
#> (Intercept) s(x,y).1 s(x,y).2 s(x,y).3 s(x,y).4 s(x,y).5
#> -9.1541554 -0.6476596 0.1653769 0.3654559 -1.5665713 0.1309464
#> s(x,y).6 s(x,y).7 s(x,y).8 s(x,y).9 s(x,y).10 s(x,y).11
#> -0.8590169 2.2184675 0.2961891 0.5272093 -1.3243271 -0.6068033
#> s(x,y).12 s(x,y).13 s(x,y).14 s(x,y).15 s(x,y).16 s(x,y).17
#> 0.5682193 1.4398388 0.4610939 0.7776327 0.3963535 -1.2996663
#> s(x,y).18 s(x,y).19 s(x,y).20 s(x,y).21 s(x,y).22 s(x,y).23
#> 0.3940691 0.9222074 -0.3882411 -0.9136774 0.3616286 -1.1287907
#> s(x,y).24 s(x,y).25 s(x,y).26 s(x,y).27 s(x,y).28 s(x,y).29
#> -0.5855092 1.0305721 -0.4833980 7.5097828 -0.7967504 -1.4568735
#>
#> Interaction distance: 14.7799
#> Hard core distance: 10.1799
#> Rate parameter: 0.5
#> Fitted interaction strength a: -117.497
#>
#> Relevant coefficients:
#> Interaction
#> -117.4974
#>
#> For standard errors, type coef(summary(x))We can look at the influence of the parameters with a likelihood
ratio test. This is a bit less verbose than the individual \(z\)-tests for each basis function that
summary(m1) would provide.
anova(m1, test="LRT")
#> Warning: Models were re-fitted with use.gam=TRUE
#> Warning: Deviance adjustment is not available for gam fits; unadjusted
#> composite deviance calculated.
#> Analysis of Deviance Table
#>
#> Terms added sequentially (first to last)
#>
#> Model 1: ~1 Fiksel
#> Model 2: ~s(x, y) Fiksel
#> Npar Df Deviance Pr(>Chi)
#> 1 2.000
#> 2 30.608 28.608 819.62 < 2.2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1Given the degree to which the log-likelihood improves by adding the intensity function (i.e., the LRT), we can conclude that model of inhomogeneity is more suitable than the homogeneous model for this dataset.
This is also clear from the spatial trend, highlighting that most regions have a very low probability of p53+ cells.
From the literature, we know that p53+ tumour cells have immune-modulatory function. Our hypothesis is therefore that p53+ tumour cells will interact spatially with immune cells. In order to test this, we formulate a Gibbs model of distance to immune cells while accounting for an underlying inhomogeneous distribution of p53+ tumour cells.
The model is then:
\[ \lambda(u) = \exp(\beta_0 + \beta_{\text{dist}}Z_{\text{dist}}(u)) \cdot Z_0(u) \cdot h(u,r_h,r_f,\mathbf{x}) \]
m2 <- fitModel(spe = speSub,
marks = "cellTypeNew",
formula = as.formula("p53_Tumour ~ s(x,y) + distfun(Immune)"),
interaction = "Fiksel",
use.gam = TRUE
)
m2
#> Nonstationary Fiksel process
#> Fitted to point pattern dataset 'p53_Tumour'
#>
#> Log trend: ~s(x, y) + distfun.Immune.
#>
#> Fitted trend coefficients:
#> (Intercept) distfun.Immune. s(x,y).1 s(x,y).2 s(x,y).3
#> -9.593984802 0.009042154 -0.568528495 0.246894412 -0.473754627
#> s(x,y).4 s(x,y).5 s(x,y).6 s(x,y).7 s(x,y).8
#> -1.108346438 -0.079495043 -0.246843580 0.540848809 -0.075926450
#> s(x,y).9 s(x,y).10 s(x,y).11 s(x,y).12 s(x,y).13
#> -0.223434969 0.433273290 0.017014433 0.300267792 0.859605937
#> s(x,y).14 s(x,y).15 s(x,y).16 s(x,y).17 s(x,y).18
#> 0.371853146 0.509506428 -1.019652941 0.061193231 0.113557091
#> s(x,y).19 s(x,y).20 s(x,y).21 s(x,y).22 s(x,y).23
#> 1.212949045 0.343160189 -0.032280463 0.446734511 -0.140139990
#> s(x,y).24 s(x,y).25 s(x,y).26 s(x,y).27 s(x,y).28
#> -0.474576194 -0.331429761 0.699081765 0.277137722 -0.628914230
#> s(x,y).29
#> -1.085549125
#>
#> Interaction distance: 14.7799
#> Hard core distance: 10.1799
#> Rate parameter: 0.5
#> Fitted interaction strength a: -151.72
#>
#> Relevant coefficients:
#> Interaction
#> -151.7204
#>
#> For standard errors, type coef(summary(x))We can again look at the contribution of the model parameters with a LRT
anova(m2, test="LRT")
#> Warning: Models were re-fitted with use.gam=TRUE
#> Warning: Deviance adjustment is not available for gam fits; unadjusted
#> composite deviance calculated.
#> Analysis of Deviance Table
#>
#> Terms added sequentially (first to last)
#>
#> Model 1: ~1 Fiksel
#> Model 2: ~s(x, y) Fiksel
#> Model 3: ~s(x, y) + distfun.Immune. Fiksel
#> Npar Df Deviance Pr(>Chi)
#> 1 2.000
#> 2 30.608 28.60799 819.62 < 2.2e-16 ***
#> 3 31.537 0.92869 55.26 8.693e-14 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1We see that the model including the distance to immune cells fits better than the homogeneous and the inhomogeneous models respectively.
In comparison to the spatial trend of the purely inhomogeneous model, the inhomogeneous + distance to immune cells model has a more localised trend in the bottom right corner.
An important step in fitting complex spatial models is to check goodness-of-fit
#> Model diagnostics (raw residuals)
#> Diagnostics available:
#> four-panel plot
#> mark plot
#> smoothed residual field
#> x cumulative residuals
#> y cumulative residuals
#> sum of all residuals
#> sum of raw residuals in clipped window = -8.798e-12
#> area of clipped window = 3835000
#> quadrature area = 3934000
#> range of smoothed field = [-1.085e-05, 1.993e-05]
We note that there is still some unexplained variance of the residuals along both the \(x\) and \(y\) coordinates but overlaying both in a 2D density the deviations are only minor and centered around zero.
We can also look at the residuals of the model which are in this case not completely normally distributed, indicating some residual unexplained variance. For computational reasons we will reduce the number of simulations to 50.
p
#> Q-Q plot of point process residuals of type 'raw'
#> based on 50 simulations
#>
#> Simulations from fitted model: Nonstationary Fiksel processLooking at a simulated Q-Q plot of the residuals, we see slightly heavier tails than expected. Given the complexity of the dataset, the model fit is acceptable.
Another option that users have is to define a segmented polygon as
spatial covariate in their ppm model.
To do this, we first perform a segmentation on our image. For
simplicity, we will use the Bioconductor package sosta. Of
course, one can use any other segmentation method, the only requirement
being that the polygon is stored as an sf object.
segmentedTumour <- sosta::reconstructShapeDensityImage(
speSub,
marks = "cellTypeNew",
markSelect = c("Tumour"),
thres = 1.1e-4
)
p1 + geom_sf(data = segmentedTumour, inherit.aes = FALSE,
fill = NA, color = "red", linewidth = 1)
#> Coordinate system already present.
#> ℹ Adding new coordinate system, which will replace the existing one.In order to save this polygon we will convert our
SpatialExperiment object into a
SpatialFeatureExperiment object and store it in the
annotGeometries.
sfeSub <- toSpatialFeatureExperiment(speSub)
annotGeometry(sfeSub, "tumour_mask") <- segmentedTumourAfter this, we can specify our ppm with the segmented
tumour as spatial covariate. The result will be a logical covariate,
indicating the intensity of the response within the tumour
(tumour_mask == TRUE) as compared to outside.
m3 <- fitModel(spe = sfeSub,
marks = "cellTypeNew",
formula = as.formula("p53_Tumour ~ s(x,y) + tumour_mask + distfun(Immune)"),
interaction = "Fiksel",
use.gam = TRUE
)
m3
#> Nonstationary Fiksel process
#> Fitted to point pattern dataset 'p53_Tumour'
#>
#> Log trend: ~s(x, y) + tumour_mask + distfun.Immune.
#>
#> Fitted trend coefficients:
#> (Intercept) tumour_maskTRUE distfun.Immune. s(x,y).1 s(x,y).2
#> -11.077551770 1.831930831 0.007634018 -0.461920492 0.322118595
#> s(x,y).3 s(x,y).4 s(x,y).5 s(x,y).6 s(x,y).7
#> -1.297220439 -0.413798975 -0.202490213 0.537006375 -1.311916890
#> s(x,y).8 s(x,y).9 s(x,y).10 s(x,y).11 s(x,y).12
#> -0.493874715 -1.045207007 2.490537385 0.731000810 -0.026302475
#> s(x,y).13 s(x,y).14 s(x,y).15 s(x,y).16 s(x,y).17
#> -0.026421386 -0.015395257 0.081074225 -2.804937043 1.774649538
#> s(x,y).18 s(x,y).19 s(x,y).20 s(x,y).21 s(x,y).22
#> -0.146809101 1.372501825 1.251277836 1.034272424 0.515231976
#> s(x,y).23 s(x,y).24 s(x,y).25 s(x,y).26 s(x,y).27
#> 1.061321723 -0.164969717 -2.084148409 2.273095370 -9.015969693
#> s(x,y).28 s(x,y).29
#> -0.342711177 -0.783840883
#>
#> Interaction distance: 14.7799
#> Hard core distance: 10.1799
#> Rate parameter: 0.5
#> Fitted interaction strength a: -160.246
#>
#> Relevant coefficients:
#> Interaction
#> -160.2456
#>
#> For standard errors, type coef(summary(x))We will check again for the importance of this covariate with a LRT
anova(m3, test="LRT")
#> Warning: Models were re-fitted with use.gam=TRUE
#> Warning: Deviance adjustment is not available for gam fits; unadjusted
#> composite deviance calculated.
#> Analysis of Deviance Table
#>
#> Terms added sequentially (first to last)
#>
#> Model 1: ~1 Fiksel
#> Model 2: ~s(x, y) Fiksel
#> Model 3: ~s(x, y) + tumour_mask Fiksel
#> Model 4: ~s(x, y) + tumour_mask + distfun.Immune. Fiksel
#> Npar Df Deviance Pr(>Chi)
#> 1 2.000
#> 2 30.608 28.60799 819.62 < 2.2e-16 ***
#> 3 31.387 0.77870 106.30 < 2.2e-16 ***
#> 4 32.329 0.94183 39.18 3.334e-10 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1We can interpret from this test that the covariate
tumour_mask is important for describing the distribution of
p53+ cells in space. This makes sense, since p53+ tumour cells are a
subset of tumour cells and can only be found within the tumour.
plot(m3) + geom_sf(data = segmentedTumour, inherit.aes = FALSE,
fill = NA, color = "darkred", linewidth = 0.6)
#> Coordinate system already present.
#> ℹ Adding new coordinate system, which will replace the existing one.The model looks very similar to the spatial trend above in the model with just the distance to immune cells defined. We can again look at the improvement of the model diagnostics
#> Model diagnostics (raw residuals)
#> Diagnostics available:
#> four-panel plot
#> mark plot
#> smoothed residual field
#> x cumulative residuals
#> y cumulative residuals
#> sum of all residuals
#> sum of raw residuals in clipped window = -5.013e-12
#> area of clipped window = 3835000
#> quadrature area = 3934000
#> range of smoothed field = [-1.063e-05, 1.806e-05]
The model diagnostics look very similar to the model above. For computational reasons, we did not include the Q-Q plot here, but the result is also similar to the Q-Q plot above.
As mentioned above, this dataset contains many images. We may want to fit separate models like those above to each image and look at the resulting distribution of the coefficients relating the distance to immune cells \(\beta_{\text{dist}}\)
mdlLs <- fitModelAcrossImages(spe = spe,
imageId = "imageID",
marks = "cellTypeNew",
sharedModel = FALSE,
interaction = "Hardcore",
formula = as.formula("p53_Tumour ~ s(x,y) + distfun(Immune)"),
use.gam = TRUE,
threshold = 10)
#> Fitting ppm to image 1
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 2
#> Fitting ppm to image 3
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 4
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 5
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 6
#> Fitting ppm to image 7
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 8
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 9
#> Fitting ppm to image 10
#> Fitting ppm to image 11
#> Fitting ppm to image 12
#> Fitting ppm to image 13
#> Fitting ppm to image 14
#> Fitting ppm to image 15
#> Fitting ppm to image 16
#> Fitting ppm to image 17
#> Fitting ppm to image 18
#> Fitting ppm to image 19
#> Fitting ppm to image 20
#> Fitting ppm to image 21
#> Fitting ppm to image 22
#> Fitting ppm to image 23
#> Fitting ppm to image 24
#> Fitting ppm to image 25
#> Fitting ppm to image 26
#> Fitting ppm to image 27
#> Fitting ppm to image 28
#> Fitting ppm to image 29
#> Fitting ppm to image 31
#> Fitting ppm to image 32
#> Fitting ppm to image 33
#> Fitting ppm to image 34
#> Fitting ppm to image 35
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 36
#> Fitting ppm to image 37
#> Fitting ppm to image 38
#> Fitting ppm to image 39
#> Warning in newton(lsp = lsp, X = G$X, y = G$y, Eb = G$Eb, UrS = G$UrS, L = G$L,
#> : Iteration limit reached without full convergence - check carefully
#> Fitting ppm to image 40
#> Fitting ppm to image 41
#> There were less than 10 points to compute an intensity onThe dataset describes three tumour subtypes, cold, compartmentalised
and mixed. We want to investigate the differences in the distance to
immune cells coefficient \(\beta_{\text{dist}}\) across these
subtypes. First, we need to extract the coefficients from the model and
convert this to a data.frame.
mdlDfSub <- mdlDf %>% filter(covariate %in% c("distfun.Immune."))
ggplot(mdlDfSub, aes(x = covariate, y = Estimate, label = imageID)) +
geom_boxplot(outlier.shape = NA, alpha = 0.3) +
geom_jitter(aes(color = log10(S.E.)), position = position_jitter(seed = 123)) +
geom_text(aes(color = log10(S.E.)), hjust = 0, vjust = 0, position = position_jitter(seed = 123)) +
theme_light() +
facet_wrap(~tumour_type)
#> Warning: The following aesthetics were dropped during statistical transformation: label.
#> ℹ This can happen when ggplot fails to infer the correct grouping structure in
#> the data.
#> ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
#> variable into a factor?
#> The following aesthetics were dropped during statistical transformation: label.
#> ℹ This can happen when ggplot fails to infer the correct grouping structure in
#> the data.
#> ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
#> variable into a factor?
#> The following aesthetics were dropped during statistical transformation: label.
#> ℹ This can happen when ggplot fails to infer the correct grouping structure in
#> the data.
#> ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
#> variable into a factor?We see a difference between the tumour types with cold tumours showing no effect and a generally positive effect of immune cell distance on p53+ cells in compartmentalised and mixed tumours.
There are two of these models that are clear outliers, let’s look at them in more detail
#> Model diagnostics (raw residuals)
#> Diagnostics available:
#> four-panel plot
#> mark plot
#> smoothed residual field
#> x cumulative residuals
#> y cumulative residuals
#> sum of all residuals
#> sum of raw residuals in clipped window = 5.467e-12
#> area of clipped window = 3890000
#> quadrature area = 3942000
#> range of smoothed field = [-2.726e-05, 1.946e-05]
#model 13
plot(mdlLs[[13]])
#> Model diagnostics (raw residuals)
#> Diagnostics available:
#> four-panel plot
#> mark plot
#> smoothed residual field
#> x cumulative residuals
#> y cumulative residuals
#> sum of all residuals
#> sum of raw residuals in clipped window = -3.48e-12
#> area of clipped window = 3868000
#> quadrature area = 3937000
#> range of smoothed field = [-3.978e-06, 5.085e-06]
The first model seems reasonable in terms of diagnostics, however it contains many cells and thus shows a stronger effect than the other images. The second image shows larger variation in the residuals at both higher \(x\) and \(y\) values, suggesting that the model is misspecified for this image.
We can also test the effect of tumour type (cold, compartmentalised, mixed) on the distance to immune cells coefficient \(\beta_{\text{dist}}\) with a linear model. We weight each observation by the inverse of the standard error, as done by Gerber et al. (2026).
The model is then parametrised as follows:
\[ {\hat{\beta}}_{\text{dist}} = \gamma_0 + \gamma Z_{\text{stage}} + \epsilon \quad \epsilon \sim \mathcal{N}\!\left(0,\, \sigma^2 \text{SE}_{\text{dist}}\right) \]
We can fit this model
mdl <- lm(Estimate ~ tumour_type, data = mdlDf,
weights = 1/(mdlDf$S.E.),
subset = covariate == "distfun.Immune.")
print(summary(mdl))
#>
#> Call:
#> lm(formula = Estimate ~ tumour_type, data = mdlDf, subset = covariate ==
#> "distfun.Immune.", weights = 1/(mdlDf$S.E.))
#>
#> Weighted Residuals:
#> Min 1Q Median 3Q Max
#> -0.27623 -0.10621 -0.00024 0.07264 0.73770
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.003047 0.002870 1.062 0.2985
#> tumour_typecompartmentalised 0.010898 0.004616 2.361 0.0263 *
#> tumour_typemixed 0.009261 0.003869 2.394 0.0245 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.218 on 25 degrees of freedom
#> (4 observations deleted due to missingness)
#> Multiple R-squared: 0.2364, Adjusted R-squared: 0.1753
#> F-statistic: 3.869 on 2 and 25 DF, p-value: 0.03436We note a significant but modest positive effect for both compartmentalised and mixed tumours in comparison to cold tumours.