Introduction to multipointR

Introduction

multipointR is a package to fit point process models from spatstat.model on SpatialExperiment and SpatialFeatureExperiment objects. Cells are approximated as points and their distribution in space is modelled as the log-linear combination of spatial (or non-spatial) covariates. multipointR has a mode to do inferences on the level of a single image with fitModel and across multiple images with fitModelAcrossImages.

Installation

multipointR can be installed and loaded from Bioconductor as follows

library("multipointR")
library("SpatialExperiment")
library("SpatialFeatureExperiment")
library("dplyr")
library("ggplot2")
library("spatstat.model")
library("patchwork")

Dataset

For this document, we will use the well established MIBI TOF dataset from Keren et al. (2018).

spe <- SpatialDatasets::spe_Keren_2018()
spe
#> class: SpatialExperiment 
#> dim: 48 197678 
#> metadata(0):
#> assays(1): intensities
#> rownames(48): Na Si ... Ta Au
#> rowData names(0):
#> colnames(197678): 1 2 ... 197677 197678
#> colData names(40): CellID imageID ... Censored sample_id
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : x y
#> imgData names(0):

The preprocessing and visualisation of this dataset is taken from the Statial vignette.

#>  [1] "Keratin_Tumour" "dn_T_CD3"       "B_cell"         "CD4_T_cell"    
#>  [5] "DC_or_Mono"     "Unidentified"   "Macrophages"    "CD8_T_cell"    
#>  [9] "Other_Immune"   "Endothelial"    "Mono_or_Neu"    "Mesenchymal"   
#> [13] "Neutrophils"    "NK"             "Tumour"         "DC"            
#> [17] "Tregs"

Plot of the entire dataset with meta-cell type labels “Immune”, “Tumour” and “p53+ Tumour”.

Inference with parametric point process models

The inhomogeneous Poisson point process

Parametric point process models (PPMs), as implemented in the spatstat package, model the distribution of points in space as an (in)homogeneous Poisson distribution (Baddeley et al. 2014). Thus, the expected number of points falling in a region \(B\) follows an inhomogeneous Poisson process with rate \(\lambda(u)\), the local intensity (Baddeley et al. 2016).

\[ \mathbb{E}[n(\mathbf{X} \cap B)] = \int_B \lambda(u)du \]

The PPM then models this inhomogeneous intensity \(\lambda(u)\) as a function of spatial and non-spatial covariates

\[ \lambda_{\Theta}(u) = \exp(\beta_0(u) + \Theta^TZ(u)) \]

where \(\beta_0(u)\) is the baseline intensity, \(Z(u)\) are spatially varying covariate functions and \(\Theta\) is the parameter vector to be estimated. In particular, parameters within \(\Theta\) would be relevant for making inferences in order to make claims about spatial associations.

The Gibbs point process

In the Poisson point process, points can be arbitrarily close together, which in a tissue of cells is not feasible. However, a constraint on cell co-localization can be imposed by a Gibbs point process. This class of models allows inhibitory or repulsive interactions between points to be modeled. In our case, we model the interaction between cells as a so-called “hard core” process, whereby the resulting conditional intensity at a location \(u\) is given by

\[ \lambda(u \mid \mathbf{x}) = \cases{ \phi(u) & \text{if $u$ is permissible}\\ 0 & \text{if $u$ is not permissible}\\ } \]

where \(\phi(u)\) is the intensity function of an inhomogeneous Poisson process (note the change in notation to before). The hardcore permission is given if the distance of \(u\) to any other point \(x_i\) is greater than the hard core diameter \(r\). This diameter \(r\) can be either provided by the user (e.g. prior knowledge on the average cell size) or estimated from the data (Baddeley et al. 2016).

The estimation from a dataset then becomes the following:

\[ \lambda_{\Theta}(u \mid \mathbf{x}) = \exp(\beta_0(u) + \Theta^TZ(u)) \cdot h(u,r,\mathbf{x}) \]

where \(\Theta\) are the first order terms (similar to the Poisson point process), whereas \(h(u,r,\mathbf{x})\), the second order terms, define the hard core interaction with interaction radius \(r\), subject to the constraint:

\[ h(u,r,\mathbf{x}) = \cases{ 0 & if $\lVert u-v \rVert \leq r, v \in \mathbf{x}\setminus\{u\}$\\ 1 & if $\lVert u-v \rVert > r, v \in \mathbf{x}\setminus\{u\}$\\ } \]

(Baddeley et al. 2016; Baddeley and Turner 2000).

Single image inference

For the single image case, we will analyse the relationship between the distribution of p53+ tumour cells and immune cells in image \(6\).

We notice qualitatively a clearly separated tumour with p53+ cells and immune cells at the tumour border.

Model of homogeneous intensity

First, we will estimate the distribution of p53+ tumour cells as a homogeneous intensity in space, i.e., we fit a model of the form

\[ \lambda(u) = \exp(\beta_0) \cdot h(u,r,\mathbf{x}) \]

with a constant intercept \(\beta_0\)

speSub <- subset(spe, , imageID == "6")

m0 <- fitModel(spe = speSub,
                marks = "cellTypeNew",
                interaction = "Hardcore",
                formula = as.formula("p53_Tumour ~ 1")
)

m0
#> Stationary Hard core process
#> Fitted to point pattern dataset 'p53_Tumour'
#> 
#> First order term:  beta = 0.0001885579
#> 
#> Hard core distance:  10.1799
#> 
#> For standard errors, type coef(summary(x))

The model is very basic and has not many parameters, let’s look at the spatial trend instead

plot(m0)

The resulting spatial trend is a flat surface.

This is not a very sensible model since we see clear a inhomogeneous distribution of points in space. Furthermore, we note that the Gibbs hard core model estimated some regions with an intensity of zero because the points are too close to each other. This raises the question whether the cells are actually closer than \(10 \mu m\).

This leads to a conceptual question on how to parametrise the Gibbs model best for biological tissue. An alternative to the hard core interaction model is the Strauss interaction model that parametrises not a hard cut-off to zero but rather an interaction probability \(\gamma\). A middle ground is the Fiksel process or a Strauss-Hardcore process that combines both a hard core effect for effects \(r_h\) where no cells are found and a Strauss/Fiksel process where cells are less likely to be found. This can either be a fixed probability \(\gamma\) (Strauss-Hard) or follow a double exponential decay (Fiksel). We will use a Fiksel process for this tutorial as it is more flexible in modelling interactions (Takacs and Fiksel 1986).

\[ h(u,r_h,r_f,\mathbf{x}) = \cases{ 0 & if $\lVert u-v \rVert \leq r_h, v \in \mathbf{x}\setminus\{u\}$\\ \exp(a\exp(-\kappa d)) & if $r_h < \lVert u-v \rVert \leq r_f, v \in \mathbf{x}\setminus\{u\}$\\ 1 & if $\lVert u-v \rVert > r_f, v \in \mathbf{x}\setminus\{u\}$\\ } \]

where \(r_h\) is the hard core radius (estimated from the data) and \(r_f\) is the Fiksel interaction radius (user provided) (Baddeley et al. 2016).

Model of inhomogeneous intensity

Next, we will formulate a null model of the distribution of p53 positive cells. For this model, we will specify an inhomogeneous intensity varying with a bivariate spline model of \(x\) and \(y\) (\(s(x,y)\)). The model we fit is of the form

\[ \lambda(u) = \exp(\beta_0(u)) \cdot Z_0(u) \cdot h(u,r_h,r_f,\mathbf{x}) \]

with a spatially-varying intercept \(\beta_0(u)\), which we specify as an offset image of the intensity \(Z_0(u)\)

m1 <- fitModel(spe = speSub,
                marks = "cellTypeNew",
                formula = as.formula("p53_Tumour ~ s(x,y)"),
                interaction = "Fiksel",
                use.gam = TRUE
)
m1
#> Nonstationary Fiksel process
#> Fitted to point pattern dataset 'p53_Tumour'
#> 
#> Log trend:  ~s(x, y)
#> 
#> Fitted trend coefficients:
#> (Intercept)    s(x,y).1    s(x,y).2    s(x,y).3    s(x,y).4    s(x,y).5 
#>  -9.1541554  -0.6476596   0.1653769   0.3654559  -1.5665713   0.1309464 
#>    s(x,y).6    s(x,y).7    s(x,y).8    s(x,y).9   s(x,y).10   s(x,y).11 
#>  -0.8590169   2.2184675   0.2961891   0.5272093  -1.3243271  -0.6068033 
#>   s(x,y).12   s(x,y).13   s(x,y).14   s(x,y).15   s(x,y).16   s(x,y).17 
#>   0.5682193   1.4398388   0.4610939   0.7776327   0.3963535  -1.2996663 
#>   s(x,y).18   s(x,y).19   s(x,y).20   s(x,y).21   s(x,y).22   s(x,y).23 
#>   0.3940691   0.9222074  -0.3882411  -0.9136774   0.3616286  -1.1287907 
#>   s(x,y).24   s(x,y).25   s(x,y).26   s(x,y).27   s(x,y).28   s(x,y).29 
#>  -0.5855092   1.0305721  -0.4833980   7.5097828  -0.7967504  -1.4568735 
#> 
#> Interaction distance:    14.7799
#> Hard core distance:  10.1799
#> Rate parameter:  0.5
#> Fitted interaction strength a:   -117.497
#> 
#> Relevant coefficients:
#> Interaction 
#>   -117.4974 
#> 
#> For standard errors, type coef(summary(x))

We can look at the influence of the parameters with a likelihood ratio test. This is a bit less verbose than the individual \(z\)-tests for each basis function that summary(m1) would provide.

anova(m1, test="LRT")
#> Warning: Models were re-fitted with use.gam=TRUE
#> Warning: Deviance adjustment is not available for gam fits; unadjusted
#> composite deviance calculated.
#> Analysis of Deviance Table
#> 
#> Terms added sequentially (first to last)
#> 
#> Model 1: ~1   Fiksel
#> Model 2: ~s(x, y)     Fiksel
#>     Npar     Df Deviance  Pr(>Chi)    
#> 1  2.000                              
#> 2 30.608 28.608   819.62 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Given the degree to which the log-likelihood improves by adding the intensity function (i.e., the LRT), we can conclude that model of inhomogeneity is more suitable than the homogeneous model for this dataset.

plot(m1)

This is also clear from the spatial trend, highlighting that most regions have a very low probability of p53+ cells.

Model of distance to immune cells

From the literature, we know that p53+ tumour cells have immune-modulatory function. Our hypothesis is therefore that p53+ tumour cells will interact spatially with immune cells. In order to test this, we formulate a Gibbs model of distance to immune cells while accounting for an underlying inhomogeneous distribution of p53+ tumour cells.

The model is then:

\[ \lambda(u) = \exp(\beta_0 + \beta_{\text{dist}}Z_{\text{dist}}(u)) \cdot Z_0(u) \cdot h(u,r_h,r_f,\mathbf{x}) \]

m2 <- fitModel(spe = speSub,
                marks = "cellTypeNew",
                formula = as.formula("p53_Tumour ~ s(x,y) + distfun(Immune)"),
                interaction = "Fiksel",
                use.gam = TRUE
)

m2
#> Nonstationary Fiksel process
#> Fitted to point pattern dataset 'p53_Tumour'
#> 
#> Log trend:  ~s(x, y) + distfun.Immune.
#> 
#> Fitted trend coefficients:
#>     (Intercept) distfun.Immune.        s(x,y).1        s(x,y).2        s(x,y).3 
#>    -9.593984802     0.009042154    -0.568528495     0.246894412    -0.473754627 
#>        s(x,y).4        s(x,y).5        s(x,y).6        s(x,y).7        s(x,y).8 
#>    -1.108346438    -0.079495043    -0.246843580     0.540848809    -0.075926450 
#>        s(x,y).9       s(x,y).10       s(x,y).11       s(x,y).12       s(x,y).13 
#>    -0.223434969     0.433273290     0.017014433     0.300267792     0.859605937 
#>       s(x,y).14       s(x,y).15       s(x,y).16       s(x,y).17       s(x,y).18 
#>     0.371853146     0.509506428    -1.019652941     0.061193231     0.113557091 
#>       s(x,y).19       s(x,y).20       s(x,y).21       s(x,y).22       s(x,y).23 
#>     1.212949045     0.343160189    -0.032280463     0.446734511    -0.140139990 
#>       s(x,y).24       s(x,y).25       s(x,y).26       s(x,y).27       s(x,y).28 
#>    -0.474576194    -0.331429761     0.699081765     0.277137722    -0.628914230 
#>       s(x,y).29 
#>    -1.085549125 
#> 
#> Interaction distance:    14.7799
#> Hard core distance:  10.1799
#> Rate parameter:  0.5
#> Fitted interaction strength a:   -151.72
#> 
#> Relevant coefficients:
#> Interaction 
#>   -151.7204 
#> 
#> For standard errors, type coef(summary(x))

We can again look at the contribution of the model parameters with a LRT

anova(m2, test="LRT")
#> Warning: Models were re-fitted with use.gam=TRUE
#> Warning: Deviance adjustment is not available for gam fits; unadjusted
#> composite deviance calculated.
#> Analysis of Deviance Table
#> 
#> Terms added sequentially (first to last)
#> 
#> Model 1: ~1   Fiksel
#> Model 2: ~s(x, y)     Fiksel
#> Model 3: ~s(x, y) + distfun.Immune.   Fiksel
#>     Npar       Df Deviance  Pr(>Chi)    
#> 1  2.000                                
#> 2 30.608 28.60799   819.62 < 2.2e-16 ***
#> 3 31.537  0.92869    55.26 8.693e-14 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We see that the model including the distance to immune cells fits better than the homogeneous and the inhomogeneous models respectively.

plot(m2)

In comparison to the spatial trend of the purely inhomogeneous model, the inhomogeneous + distance to immune cells model has a more localised trend in the bottom right corner.

An important step in fitting complex spatial models is to check goodness-of-fit

diagnose.ppm(m2)

#> Model diagnostics (raw residuals)
#> Diagnostics available:
#>  four-panel plot
#>  mark plot 
#>  smoothed residual field
#>  x cumulative residuals
#>  y cumulative residuals
#>  sum of all residuals
#> sum of raw residuals in clipped window = -8.798e-12
#> area of clipped window = 3835000
#> quadrature area = 3934000
#> range of smoothed field =  [-1.085e-05, 1.993e-05]

We note that there is still some unexplained variance of the residuals along both the \(x\) and \(y\) coordinates but overlaying both in a 2D density the deviations are only minor and centered around zero.

We can also look at the residuals of the model which are in this case not completely normally distributed, indicating some residual unexplained variance. For computational reasons we will reduce the number of simulations to 50.

p <- qqplot.ppm(m2,nsim=50)

p
#> Q-Q plot of point process residuals of type 'raw'
#>  based on 50 simulations
#> 
#> Simulations from fitted model: Nonstationary Fiksel process

Looking at a simulated Q-Q plot of the residuals, we see slightly heavier tails than expected. Given the complexity of the dataset, the model fit is acceptable.

Model of segmented objects

Another option that users have is to define a segmented polygon as spatial covariate in their ppm model.

To do this, we first perform a segmentation on our image. For simplicity, we will use the Bioconductor package sosta. Of course, one can use any other segmentation method, the only requirement being that the polygon is stored as an sf object.

segmentedTumour <- sosta::reconstructShapeDensityImage(
  speSub,
  marks      = "cellTypeNew",
  markSelect = c("Tumour"),
  thres = 1.1e-4
)

p1 + geom_sf(data = segmentedTumour, inherit.aes = FALSE,
          fill = NA, color = "red", linewidth = 1)
#> Coordinate system already present.
#> ℹ Adding new coordinate system, which will replace the existing one.

In order to save this polygon we will convert our SpatialExperiment object into a SpatialFeatureExperiment object and store it in the annotGeometries.

sfeSub <- toSpatialFeatureExperiment(speSub)
annotGeometry(sfeSub, "tumour_mask") <- segmentedTumour

After this, we can specify our ppm with the segmented tumour as spatial covariate. The result will be a logical covariate, indicating the intensity of the response within the tumour (tumour_mask == TRUE) as compared to outside.

m3 <- fitModel(spe = sfeSub,
                marks = "cellTypeNew",
                formula = as.formula("p53_Tumour ~ s(x,y) + tumour_mask + distfun(Immune)"),
                interaction = "Fiksel",
                use.gam = TRUE
)

m3
#> Nonstationary Fiksel process
#> Fitted to point pattern dataset 'p53_Tumour'
#> 
#> Log trend:  ~s(x, y) + tumour_mask + distfun.Immune.
#> 
#> Fitted trend coefficients:
#>     (Intercept) tumour_maskTRUE distfun.Immune.        s(x,y).1        s(x,y).2 
#>   -11.077551770     1.831930831     0.007634018    -0.461920492     0.322118595 
#>        s(x,y).3        s(x,y).4        s(x,y).5        s(x,y).6        s(x,y).7 
#>    -1.297220439    -0.413798975    -0.202490213     0.537006375    -1.311916890 
#>        s(x,y).8        s(x,y).9       s(x,y).10       s(x,y).11       s(x,y).12 
#>    -0.493874715    -1.045207007     2.490537385     0.731000810    -0.026302475 
#>       s(x,y).13       s(x,y).14       s(x,y).15       s(x,y).16       s(x,y).17 
#>    -0.026421386    -0.015395257     0.081074225    -2.804937043     1.774649538 
#>       s(x,y).18       s(x,y).19       s(x,y).20       s(x,y).21       s(x,y).22 
#>    -0.146809101     1.372501825     1.251277836     1.034272424     0.515231976 
#>       s(x,y).23       s(x,y).24       s(x,y).25       s(x,y).26       s(x,y).27 
#>     1.061321723    -0.164969717    -2.084148409     2.273095370    -9.015969693 
#>       s(x,y).28       s(x,y).29 
#>    -0.342711177    -0.783840883 
#> 
#> Interaction distance:    14.7799
#> Hard core distance:  10.1799
#> Rate parameter:  0.5
#> Fitted interaction strength a:   -160.246
#> 
#> Relevant coefficients:
#> Interaction 
#>   -160.2456 
#> 
#> For standard errors, type coef(summary(x))

We will check again for the importance of this covariate with a LRT

anova(m3, test="LRT")
#> Warning: Models were re-fitted with use.gam=TRUE
#> Warning: Deviance adjustment is not available for gam fits; unadjusted
#> composite deviance calculated.
#> Analysis of Deviance Table
#> 
#> Terms added sequentially (first to last)
#> 
#> Model 1: ~1   Fiksel
#> Model 2: ~s(x, y)     Fiksel
#> Model 3: ~s(x, y) + tumour_mask   Fiksel
#> Model 4: ~s(x, y) + tumour_mask + distfun.Immune.     Fiksel
#>     Npar       Df Deviance  Pr(>Chi)    
#> 1  2.000                                
#> 2 30.608 28.60799   819.62 < 2.2e-16 ***
#> 3 31.387  0.77870   106.30 < 2.2e-16 ***
#> 4 32.329  0.94183    39.18 3.334e-10 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can interpret from this test that the covariate tumour_mask is important for describing the distribution of p53+ cells in space. This makes sense, since p53+ tumour cells are a subset of tumour cells and can only be found within the tumour.

plot(m3) + geom_sf(data = segmentedTumour, inherit.aes = FALSE,
          fill = NA, color = "darkred", linewidth = 0.6)
#> Coordinate system already present.
#> ℹ Adding new coordinate system, which will replace the existing one.

The model looks very similar to the spatial trend above in the model with just the distance to immune cells defined. We can again look at the improvement of the model diagnostics

diagnose.ppm(m3)

#> Model diagnostics (raw residuals)
#> Diagnostics available:
#>  four-panel plot
#>  mark plot 
#>  smoothed residual field
#>  x cumulative residuals
#>  y cumulative residuals
#>  sum of all residuals
#> sum of raw residuals in clipped window = -5.013e-12
#> area of clipped window = 3835000
#> quadrature area = 3934000
#> range of smoothed field =  [-1.063e-05, 1.806e-05]

The model diagnostics look very similar to the model above. For computational reasons, we did not include the Q-Q plot here, but the result is also similar to the Q-Q plot above.

Multi-image inference

Fitting a model per image

As mentioned above, this dataset contains many images. We may want to fit separate models like those above to each image and look at the resulting distribution of the coefficients relating the distance to immune cells \(\beta_{\text{dist}}\)

mdlLs <- fitModelAcrossImages(spe = spe,
                imageId = "imageID",
                marks = "cellTypeNew",
                sharedModel = FALSE,
                interaction = "Hardcore",
                formula = as.formula("p53_Tumour ~ s(x,y) + distfun(Immune)"),
                use.gam = TRUE,
                threshold = 10)
#> Fitting ppm to image 1
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 2
#> Fitting ppm to image 3
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 4
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 5
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 6
#> Fitting ppm to image 7
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 8
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 9
#> Fitting ppm to image 10
#> Fitting ppm to image 11
#> Fitting ppm to image 12
#> Fitting ppm to image 13
#> Fitting ppm to image 14
#> Fitting ppm to image 15
#> Fitting ppm to image 16
#> Fitting ppm to image 17
#> Fitting ppm to image 18
#> Fitting ppm to image 19
#> Fitting ppm to image 20
#> Fitting ppm to image 21
#> Fitting ppm to image 22
#> Fitting ppm to image 23
#> Fitting ppm to image 24
#> Fitting ppm to image 25
#> Fitting ppm to image 26
#> Fitting ppm to image 27
#> Fitting ppm to image 28
#> Fitting ppm to image 29
#> Fitting ppm to image 31
#> Fitting ppm to image 32
#> Fitting ppm to image 33
#> Fitting ppm to image 34
#> Fitting ppm to image 35
#> There were less than 10 points to compute an intensity on
#> Fitting ppm to image 36
#> Fitting ppm to image 37
#> Fitting ppm to image 38
#> Fitting ppm to image 39
#> Warning in newton(lsp = lsp, X = G$X, y = G$y, Eb = G$Eb, UrS = G$UrS, L = G$L,
#> : Iteration limit reached without full convergence - check carefully
#> Fitting ppm to image 40
#> Fitting ppm to image 41
#> There were less than 10 points to compute an intensity on

The dataset describes three tumour subtypes, cold, compartmentalised and mixed. We want to investigate the differences in the distance to immune cells coefficient \(\beta_{\text{dist}}\) across these subtypes. First, we need to extract the coefficients from the model and convert this to a data.frame.

mdlDf <- mdlToDf(mdlLs = mdlLs,
                imageCovariates = c("imageID",
                                    "tumour_type"))
mdlDfSub <- mdlDf %>% filter(covariate %in% c("distfun.Immune.")) 

ggplot(mdlDfSub, aes(x = covariate, y = Estimate, label = imageID)) +
  geom_boxplot(outlier.shape = NA, alpha = 0.3) +
  geom_jitter(aes(color = log10(S.E.)), position = position_jitter(seed = 123)) +
  geom_text(aes(color = log10(S.E.)), hjust = 0, vjust = 0, position = position_jitter(seed = 123)) +
  theme_light() +
  facet_wrap(~tumour_type)
#> Warning: The following aesthetics were dropped during statistical transformation: label.
#> ℹ This can happen when ggplot fails to infer the correct grouping structure in
#>   the data.
#> ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
#>   variable into a factor?
#> The following aesthetics were dropped during statistical transformation: label.
#> ℹ This can happen when ggplot fails to infer the correct grouping structure in
#>   the data.
#> ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
#>   variable into a factor?
#> The following aesthetics were dropped during statistical transformation: label.
#> ℹ This can happen when ggplot fails to infer the correct grouping structure in
#>   the data.
#> ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
#>   variable into a factor?

We see a difference between the tumour types with cold tumours showing no effect and a generally positive effect of immune cell distance on p53+ cells in compartmentalised and mixed tumours.

There are two of these models that are clear outliers, let’s look at them in more detail

#model 12
plot(mdlLs[[12]])

diagnose.ppm(mdlLs[[12]])

#> Model diagnostics (raw residuals)
#> Diagnostics available:
#>  four-panel plot
#>  mark plot 
#>  smoothed residual field
#>  x cumulative residuals
#>  y cumulative residuals
#>  sum of all residuals
#> sum of raw residuals in clipped window = 5.467e-12
#> area of clipped window = 3890000
#> quadrature area = 3942000
#> range of smoothed field =  [-2.726e-05, 1.946e-05]
#model 13
plot(mdlLs[[13]])

diagnose.ppm(mdlLs[[13]])

#> Model diagnostics (raw residuals)
#> Diagnostics available:
#>  four-panel plot
#>  mark plot 
#>  smoothed residual field
#>  x cumulative residuals
#>  y cumulative residuals
#>  sum of all residuals
#> sum of raw residuals in clipped window = -3.48e-12
#> area of clipped window = 3868000
#> quadrature area = 3937000
#> range of smoothed field =  [-3.978e-06, 5.085e-06]

The first model seems reasonable in terms of diagnostics, however it contains many cells and thus shows a stronger effect than the other images. The second image shows larger variation in the residuals at both higher \(x\) and \(y\) values, suggesting that the model is misspecified for this image.

Second-stage model of coefficients

We can also test the effect of tumour type (cold, compartmentalised, mixed) on the distance to immune cells coefficient \(\beta_{\text{dist}}\) with a linear model. We weight each observation by the inverse of the standard error, as done by Gerber et al. (2026).

The model is then parametrised as follows:

\[ {\hat{\beta}}_{\text{dist}} = \gamma_0 + \gamma Z_{\text{stage}} + \epsilon \quad \epsilon \sim \mathcal{N}\!\left(0,\, \sigma^2 \text{SE}_{\text{dist}}\right) \]

We can fit this model

mdl <- lm(Estimate ~ tumour_type, data = mdlDf, 
          weights = 1/(mdlDf$S.E.), 
          subset = covariate == "distfun.Immune.") 

print(summary(mdl))
#> 
#> Call:
#> lm(formula = Estimate ~ tumour_type, data = mdlDf, subset = covariate == 
#>     "distfun.Immune.", weights = 1/(mdlDf$S.E.))
#> 
#> Weighted Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.27623 -0.10621 -0.00024  0.07264  0.73770 
#> 
#> Coefficients:
#>                              Estimate Std. Error t value Pr(>|t|)  
#> (Intercept)                  0.003047   0.002870   1.062   0.2985  
#> tumour_typecompartmentalised 0.010898   0.004616   2.361   0.0263 *
#> tumour_typemixed             0.009261   0.003869   2.394   0.0245 *
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.218 on 25 degrees of freedom
#>   (4 observations deleted due to missingness)
#> Multiple R-squared:  0.2364, Adjusted R-squared:  0.1753 
#> F-statistic: 3.869 on 2 and 25 DF,  p-value: 0.03436

We note a significant but modest positive effect for both compartmentalised and mixed tumours in comparison to cold tumours.

Fitting a shared model across images

We can also fit a shared model across all images. In that case, we parametrise an interaction effect between the tumour type and the distance to immune cells. Furthermore, we specify a random intercept per sample ID.

In comparison to the ppm models from spatstat.model mppm offers less flexibility in terms of model parametrisation. If you need more flexibility, we recommend you to consider the previous section.

For instance, gam fits are not available in mppm, therefore we account for inhomogeneity with a kernel density estimation of the point distribution. As offsets are also not supported by mppm we can add the KDE as a parameter to our estimation via a power law.

mdl <- fitModelAcrossImages(spe = spe,
                imageId = "imageID",
                imageLs = seq(10,15),
                marks = "cellTypeNew",
                interaction = "Fiksel",
                formula = as.formula("p53_Tumour ~  log(lambda) + tumour_type:distfun(Immune) + (1|DONOR_NO)"),
                threshold = 10)
#> Warning: glm.fit: algorithm did not converge
#> Warning: glm.fit: algorithm did not converge
#> Warning: glm.fit: algorithm did not converge
#> Warning: glm.fit: algorithm did not converge
#> (Fixing subset index..)
#> iteration 1 
#> iteration 2 
#> iteration 3
mdl |> summary()
#> Point process model fitted to 6 point patterns
#> Call: spatstat.model::mppm(formula = feFormula, random = reFormula,
#> Log trend formula: ~log(lambda) + tumour_type:distfun.Immune.
#> 
#> Fixed effects:
#>                                  (Intercept) 
#>                                 6.8099763559 
#>                                  log(lambda) 
#>                                 1.8018079221 
#>              tumour_typecold:distfun.Immune. 
#>                                 0.0002376832 
#> tumour_typecompartmentalised:distfun.Immune. 
#>                                 0.0087401920 
#>             tumour_typemixed:distfun.Immune. 
#>                                 0.0063141223 
#> Random effects:
#>       (Intercept)
#> 30740   1.7313934
#> 30753  -1.2428599
#> 30766  -1.6322514
#> 30770  -0.2350876
#> 30781   0.2534301
#> 30782   1.1253754
#> 
#> Fitted coefficients of interpoint interaction:
#>             x 
#> -2.973352e-05 
#> All fitted coefficients:
#>                                  (Intercept) 
#>                                 6.809976e+00 
#>                                  log(lambda) 
#>                                 1.801808e+00 
#>                                            x 
#>                                -2.973352e-05 
#>              tumour_typecold:distfun.Immune. 
#>                                 2.376832e-04 
#> tumour_typecompartmentalised:distfun.Immune. 
#>                                 8.740192e-03 
#>             tumour_typemixed:distfun.Immune. 
#>                                 6.314122e-03 
#> 
#> Random effects summary:
#> Random effects:
#>  Formula: ~1 | DONOR_NO
#>         (Intercept) Residual
#> StdDev:    1.037534        1
#> 
#> Variance function:
#>  Structure: fixed weights
#>  Formula: ~invwt 
#> 
#> Interaction for each pattern:    Fiksel process
#> 
#> Interaction on row 1:
#> Fiksel process
#> Interaction distance:    17.30489
#> Hard core distance:  12.70489
#> Rate parameter:  0.5
#> Fitted interaction strength a:   NaN
#> 
#> Interaction on row 2:
#> Fiksel process
#> Interaction distance:    17.25228
#> Hard core distance:  12.65228
#> Rate parameter:  0.5
#> Fitted interaction strength a:   NaN
#> 
#> Interaction on row 3:
#> Fiksel process
#> Interaction distance:    12.4811
#> Hard core distance:  7.8811
#> Rate parameter:  0.5
#> Fitted interaction strength a:   NaN
#> 
#> Interaction on row 4:
#> Fiksel process
#> Interaction distance:    15.04625
#> Hard core distance:  10.44625
#> Rate parameter:  0.5
#> Fitted interaction strength a:   NaN
#> 
#> Interaction on row 5:
#> Fiksel process
#> Interaction distance:    11.90547
#> Hard core distance:  7.30547
#> Rate parameter:  0.5
#> Fitted interaction strength a:   NaN
#> 
#> Interaction on row 6:
#> Fiksel process
#> Interaction distance:    24.55344
#> Hard core distance:  21.95344
#> Rate parameter:  0.5
#> Fitted interaction strength a:   NaN
#> 
#> --- Gory details: ---
#> Combined data frame has 47246 rows
#> Linear mixed-effects model fit by maximum likelihood
#>   Data: moadf 
#>   Subset: glmmsubset 
#>   AIC BIC    logLik
#>    NA  NA -114140.9
#> 
#> Random effects:
#>  Formula: ~1 | DONOR_NO
#>         (Intercept) Residual
#> StdDev:    1.200575 1.157142
#> 
#> Variance function:
#>  Structure: fixed weights
#>  Formula: ~invwt 
#> Fixed effects:  .mpl.Y ~ log(lambda) + tumour_type:distfun.Immune. + x 
#>                                                  Value Std.Error    DF  t-value
#> (Intercept)                                   6.809976 0.6127021 40986 11.11466
#> log(lambda)                                   1.801808 0.0415021 40986 43.41483
#> x                                            -0.000030 0.0000219 40986 -1.35698
#> tumour_typecold:distfun.Immune.               0.000238 0.0028626 40986  0.08303
#> tumour_typecompartmentalised:distfun.Immune.  0.008740 0.0030142 40986  2.89964
#> tumour_typemixed:distfun.Immune.              0.006314 0.0005585 40986 11.30586
#>                                              p-value
#> (Intercept)                                   0.0000
#> log(lambda)                                   0.0000
#> x                                             0.1748
#> tumour_typecold:distfun.Immune.               0.9338
#> tumour_typecompartmentalised:distfun.Immune.  0.0037
#> tumour_typemixed:distfun.Immune.              0.0000
#>  Correlation: 
#>                                              (Intr) lg(lm) x      tmr_typcl:.I.
#> log(lambda)                                   0.588                            
#> x                                            -0.142 -0.167                     
#> tumour_typecold:distfun.Immune.              -0.065  0.030 -0.004              
#> tumour_typecompartmentalised:distfun.Immune. -0.088 -0.085  0.084  0.002       
#> tumour_typemixed:distfun.Immune.             -0.144 -0.205  0.108 -0.004       
#>                                              tmr_typcm:.I.
#> log(lambda)                                               
#> x                                                         
#> tumour_typecold:distfun.Immune.                           
#> tumour_typecompartmentalised:distfun.Immune.              
#> tumour_typemixed:distfun.Immune.              0.024       
#> 
#> Standardized Within-Group Residuals:
#>        Min         Q1        Med         Q3        Max 
#> -1.3835491 -0.4519201 -0.3231232 -0.0224754 12.0839852 
#> 
#> Number of Observations: 40997
#> Number of Groups: 6

We notice again a significant effect of the interaction of compartmentalised tumour with distance to immune cells as well as the mixed tumour types. The effect sizes are in both cases very subtle.

In order to assess the model fit we can calculate the sum of the residuals per image and plot them as a function of the condition. In expectation these should be zero.

res <- residuals(mdl, type="raw")
resInt <- sapply(res, integral.msr)
print(resInt)
#>            1            2            3            4            5            6 
#>   -8.9452996   -0.4684389 -428.9582856  -12.7916616 -668.2419431    0.2285766

These residuals are not perfectly centered around zero for some of the models, implying that the model fit from mppm is not ideal in this scenario.

sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#>  [1] SpatialDatasets_1.11.0          ExperimentHub_3.3.1            
#>  [3] AnnotationHub_4.3.1             BiocFileCache_3.3.0            
#>  [5] dbplyr_2.6.0                    patchwork_1.3.2                
#>  [7] spatstat.model_3.7-1            rpart_4.1.27                   
#>  [9] spatstat.explore_3.8-1          nlme_3.1-169                   
#> [11] spatstat.random_3.5-0           spatstat.geom_3.8-1            
#> [13] spatstat.univar_3.2-0           spatstat.data_3.1-9            
#> [15] ggplot2_4.0.3                   dplyr_1.2.1                    
#> [17] SpatialFeatureExperiment_1.15.0 SpatialExperiment_1.23.0       
#> [19] SingleCellExperiment_1.35.1     SummarizedExperiment_1.43.0    
#> [21] Biobase_2.73.1                  GenomicRanges_1.65.0           
#> [23] Seqinfo_1.3.0                   IRanges_2.47.2                 
#> [25] S4Vectors_0.51.3                BiocGenerics_0.59.7            
#> [27] generics_0.1.4                  MatrixGenerics_1.25.0          
#> [29] matrixStats_1.5.0               multipointR_0.99.0             
#> [31] BiocStyle_2.41.0               
#> 
#> loaded via a namespace (and not attached):
#>   [1] splines_4.6.0             bitops_1.0-9             
#>   [3] filelock_1.0.3            tibble_3.3.1             
#>   [5] R.oo_1.27.1               polyclip_1.10-7          
#>   [7] lifecycle_1.0.5           httr2_1.2.2              
#>   [9] Rdpack_2.6.6              formula.tools_1.7.1      
#>  [11] sf_1.1-1                  edgeR_4.11.1             
#>  [13] lattice_0.22-9            MASS_7.3-65              
#>  [15] backports_1.5.1           magrittr_2.0.5           
#>  [17] limma_3.69.2              sass_0.4.10              
#>  [19] rmarkdown_2.31            jquerylib_0.1.4          
#>  [21] yaml_2.3.12               otel_0.2.0               
#>  [23] sp_2.2-1                  spatstat.sparse_3.2-0    
#>  [25] DBI_1.3.0                 buildtools_1.0.0         
#>  [27] RColorBrewer_1.1-3        multcomp_1.4-30          
#>  [29] abind_1.4-8               spatialreg_1.4-3         
#>  [31] purrr_1.2.2               R.utils_2.13.0           
#>  [33] RCurl_1.98-1.19           TH.data_1.1-5            
#>  [35] rappdirs_0.3.4            sandwich_3.1-1           
#>  [37] spatstat.utils_3.2-3      maketools_1.3.2          
#>  [39] terra_1.9-27              units_1.0-1              
#>  [41] goftest_1.2-3             dqrng_0.4.1              
#>  [43] DelayedMatrixStats_1.35.0 codetools_0.2-20         
#>  [45] DropletUtils_1.33.0       DelayedArray_0.39.3      
#>  [47] scuttle_1.23.1            tidyselect_1.2.1         
#>  [49] farver_2.1.2              jsonlite_2.0.0           
#>  [51] BiocNeighbors_2.7.2       e1071_1.7-17             
#>  [53] survival_3.8-6            sosta_1.5.1              
#>  [55] smoothr_1.3.0             tools_4.6.0              
#>  [57] Rcpp_1.1.1-1.1            glue_1.8.1               
#>  [59] SparseArray_1.13.2        xfun_0.58                
#>  [61] mgcv_1.9-4                EBImage_4.55.0           
#>  [63] HDF5Array_1.41.0          withr_3.0.2              
#>  [65] BiocManager_1.30.27       fastmap_1.2.0            
#>  [67] boot_1.3-32               rhdf5filters_1.25.0      
#>  [69] spData_2.3.5              digest_0.6.39            
#>  [71] R6_2.6.1                  wk_0.9.5                 
#>  [73] LearnBayes_2.15.2         tensor_1.5.1             
#>  [75] jpeg_0.1-11               RSQLite_3.53.2           
#>  [77] R.methodsS3_1.8.2         h5mread_1.5.0            
#>  [79] data.table_1.18.4         class_7.3-23             
#>  [81] httr_1.4.8                htmlwidgets_1.6.4        
#>  [83] S4Arrays_1.13.0           spdep_1.4-2              
#>  [85] pkgconfig_2.0.3           gtable_0.3.6             
#>  [87] blob_1.3.0                S7_0.2.2                 
#>  [89] XVector_0.53.0            sys_3.4.3                
#>  [91] htmltools_0.5.9           fftwtools_0.9-11         
#>  [93] scales_1.4.0              png_0.1-9                
#>  [95] reformulas_0.4.4          knitr_1.51               
#>  [97] rjson_0.2.23              coda_0.19-4.1            
#>  [99] curl_7.1.0                proxy_0.4-29             
#> [101] cachem_1.1.0              zoo_1.8-15               
#> [103] rhdf5_2.57.1              operator.tools_1.6.3.1   
#> [105] BiocVersion_3.24.0        KernSmooth_2.23-26       
#> [107] parallel_4.6.0            AnnotationDbi_1.75.0     
#> [109] s2_1.1.11                 pillar_1.11.1            
#> [111] grid_4.6.0                vctrs_0.7.3              
#> [113] beachmat_2.29.0           sfheaders_0.4.5          
#> [115] evaluate_1.0.5            zeallot_0.2.0            
#> [117] magick_2.9.1              mvtnorm_1.4-1            
#> [119] cli_3.6.6                 locfit_1.5-9.12          
#> [121] compiler_4.6.0            crayon_1.5.3             
#> [123] rlang_1.2.0               labeling_0.4.3           
#> [125] classInt_0.4-11           viridisLite_0.4.3        
#> [127] deldir_2.0-4              BiocParallel_1.47.0      
#> [129] Biostrings_2.81.3         tiff_0.1-12              
#> [131] marginaleffects_0.32.0    Matrix_1.7-5             
#> [133] sparseMatrixStats_1.25.0  bit64_4.8.2              
#> [135] Rhdf5lib_2.1.0            KEGGREST_1.53.4          
#> [137] statmod_1.5.2             rbibutils_2.4.1          
#> [139] memoise_2.0.1             bslib_0.11.0             
#> [141] bit_4.6.0
Baddeley, Adrian, Ege Rubak, Rolf Turner, et al. 2016. Spatial Point Patterns: Methodology and Applications with r. Vol. 1. CRC press Boca Raton.
Baddeley, Adrian, and Rolf Turner. 2000. “Practical Maximum Pseudolikelihood for Spatial Point Patterns: (With Discussion).” Australian & New Zealand Journal of Statistics 42 (3): 283–322.
Baddeley, Adrian, Rolf Turner, et al. 2014. “Package ‘Spatstat’.” The Comprehensive R Archive Network () 146.
Gerber, Reto, Jake Griner, Silvia Guglietta, Carsten Krieg, and Mark D Robinson. 2026. “MIMIC: A Flexible Pipeline to Register and Summarize IMC-MSI Experiments.” Communications Biology.
Keren, Leeat, Marc Bosse, Diana Marquez, et al. 2018. “A Structured Tumor-Immune Microenvironment in Triple Negative Breast Cancer Revealed by Multiplexed Ion Beam Imaging.” Cell 174 (6): 1373–87.
Takacs, Roland, and Th Fiksel. 1986. “Interaction Pair-Potentials for a System of Ant’s Nests.” Biometrical Journal 28 (8): 1007–13.