Skip to contents

Abstract

Perform linear and mixed-effects models to assess and visualise changes in cell localisation across disease conditions.

Installation

if (!require("BiocManager")) {
  install.packages("BiocManager")
}
BiocManager::install("spicyR")

Overview

This guide provides step-by-step instructions on how to apply a linear model to multiple segmented and labelled images to assess how the localisation of different cell types changes across different disease conditions.

Example data

We use the Keren et al. (2018) breast cancer dataset to compare the spatial distribution of immune cells in individuals with different levels of tumour infiltration (cold and compartmentalised).

The data is stored as a SpatialExperiment object and contains single-cell spatial data from 41 images.

kerenSPE <- SpatialDatasets::spe_Keren_2018()

The cell types in this dataset includes 11 immune cell types (double negative CD3 T cells, CD4 T cells, B cells, monocytes, macrophages, CD8 T cells, neutrophils, natural killer cells, dendritic cells, regulatory T cells), 2 structural cell types (endothelial, mesenchymal), 2 tumour cell types (keratin+ tumour, tumour) and one unidentified category.

Linear modelling

To investigate changes in localisation between two different cell types, we measure the level of localisation between two cell types by modelling with the L-function. The L-function is a variance-stabilised K-function given by the equation

Liĵ(r)=Kiĵ(r)π \widehat{L_{ij}} (r) = \sqrt{\frac{\widehat{K_{ij}}(r)}{\pi}}

with Kiĵ\widehat{K_{ij}} defined as

Kiĵ(r)=|W|ninjninj1{dijr}eij(r) \widehat{K_{ij}} (r) = \frac{|W|}{n_i n_j} \sum_{n_i} \sum_{n_j} 1 \{d_{ij} \leq r \} e_{ij} (r)

where Kiĵ\widehat{K_{ij}} summarises the degree of co-localisation of cell type jj with cell type ii, nin_i and njn_j are the number of cells of type ii and jj, |W||W| is the image area, dijd_{ij} is the distance between two cells and eij(r)e_{ij} (r) is an edge correcting factor.

Specifically, the mean difference between the experimental function and the theoretical function is used as a measure for the level of localisation, defined as

u=r=rminrmaxL̂ij,Experimental(r)L̂ij,Poisson(r) u = \sum_{r' = r_{\text{min}}}^{r_{\text{max}}} \widehat L_{ij, \text{Experimental}} (r') - \widehat L_{ij, \text{Poisson}} (r')

where uu is the sum is taken over a discrete range of rr between rminr_{\text{min}} and rmaxr_{\text{max}}. Differences of the statistic uu between two conditions is modelled using a weighted linear model.

Test for change in localisation for a specific pair of cells

Firstly, we can test whether one cell type tends to be more localised with another cell type in one condition compared to the other. This can be done using the spicy() function, where we specify the condition parameter.

In this example, we want to see whether or not neutrophils (to) tend to be found around CD8 T cells (from) in compartmentalised tumours compared to cold tumours. Given that there are 3 conditions, we can specify the desired conditions by setting the order of our condition factor. spicy will choose the first level of the factor as the base condition and the second level as the comparison condition. spicy will also naturally coerce the condition column into a factor if it is not already a factor. The column containing cell type annotations can be specified using the cellType argument. By default, spicy uses the column named cellType in the SpatialExperiment object.

spicyTestPair <- spicy(
  kerenSPE,
  condition = "tumour_type",
  from = "CD8_T_cell",
  to = "Neutrophils"
)

topPairs(spicyTestPair)
#>                         intercept coefficient      p.value   adj.pvalue
#> CD8_T_cell__Neutrophils  -109.081    112.0185 2.166646e-05 2.166646e-05
#>                               from          to
#> CD8_T_cell__Neutrophils CD8_T_cell Neutrophils

We obtain a spicy object which details the results of the modelling performed. The topPairs() function can be used to obtain the associated coefficients and p-value.

As the coefficient in spicyTestPair is positive, we find that neutrophils are significantly more likely to be found near CD8 T cells in the compartmentalised tumours group compared to the cold tumour group.

Test for change in localisation for all pairwise cell combinations

We can perform what we did above for all pairwise combinations of cell types by excluding the from and to parameters in spicy().

spicyTest <- spicy(
  kerenSPE,
  condition = "tumour_type"
)

topPairs(spicyTest)
#>                             intercept coefficient      p.value   adj.pvalue
#> Macrophages__dn_T_CD3       56.446064   -50.08474 1.080273e-07 3.035568e-05
#> dn_T_CD3__Macrophages       54.987151   -48.38664 2.194018e-07 3.082595e-05
#> Macrophages__DC_or_Mono     73.239404   -59.90361 5.224660e-06 4.893765e-04
#> DC_or_Mono__Macrophages     71.777087   -58.46833 7.431172e-06 5.220399e-04
#> dn_T_CD3__dn_T_CD3         -63.786032   100.61010 2.878804e-05 1.208706e-03
#> Neutrophils__dn_T_CD3      -63.141840    69.64356 2.891872e-05 1.208706e-03
#> dn_T_CD3__Neutrophils      -63.133725    70.15508 3.011012e-05 1.208706e-03
#> DC__Macrophages             96.893239   -92.55112 1.801300e-04 5.758129e-03
#> Macrophages__DC             96.896215   -93.25194 1.844241e-04 5.758129e-03
#> CD4_T_cell__Keratin_Tumour  -4.845037   -22.14995 2.834659e-04 7.409016e-03
#>                                   from             to
#> Macrophages__dn_T_CD3      Macrophages       dn_T_CD3
#> dn_T_CD3__Macrophages         dn_T_CD3    Macrophages
#> Macrophages__DC_or_Mono    Macrophages     DC_or_Mono
#> DC_or_Mono__Macrophages     DC_or_Mono    Macrophages
#> dn_T_CD3__dn_T_CD3            dn_T_CD3       dn_T_CD3
#> Neutrophils__dn_T_CD3      Neutrophils       dn_T_CD3
#> dn_T_CD3__Neutrophils         dn_T_CD3    Neutrophils
#> DC__Macrophages                     DC    Macrophages
#> Macrophages__DC            Macrophages             DC
#> CD4_T_cell__Keratin_Tumour  CD4_T_cell Keratin_Tumour

Again, we obtain a spicy object which outlines the result of the linear models performed for each pairwise combination of cell types.

We can also examine the L-function metrics of individual images by using the convenient bind() function on our spicyTest results object.

bind(spicyTest)[1:5, 1:5]
#>   imageID         condition Keratin_Tumour__Keratin_Tumour
#> 1       1             mixed                      -2.300602
#> 2       2             mixed                      -1.989699
#> 3       3 compartmentalised                      11.373530
#> 4       4 compartmentalised                      33.931133
#> 5       5 compartmentalised                      17.922818
#>   dn_T_CD3__Keratin_Tumour B_cell__Keratin_Tumour
#> 1                -5.298543             -20.827279
#> 2               -16.020022               3.025815
#> 3               -21.857447             -24.962913
#> 4               -36.438476             -40.470221
#> 5               -20.816783             -38.138076

The results can be represented as a bubble plot using the signifPlot() function.

signifPlot(
  spicyTest,
  breaks = c(-3, 3, 1),
  marksToPlot = c("Macrophages", "DC_or_Mono", "dn_T_CD3", "Neutrophils",
                  "CD8_T_cell", "Keratin_Tumour")
)

Here, we can observe that the most significant relationships occur between macrophages and double negative CD3 T cells, suggesting that the two cell types are far more dispersed in compartmentalised tumours compared to cold tumours.

To examine a specific cell type-cell type relationship in more detail, we can use spicyBoxplot() and specify either from = "Macrophages" and to = "dn_T_CD3" or rank = 1.

spicyBoxPlot(results = spicyTest, 
             # from = "Macrophages",
             # to = "dn_T_CD3"
             rank = 1)

Linear modelling for custom metrics

spicyR can also be applied to custom distance or abundance metrics. A kNN interactions graph can be generated with the function buildSpatialGraph from the imcRtools package. This generates a colPairs object inside of the SpatialExperiment object.

spicyR provides the function convPairs for converting a colPairs object into an abundance matrix by calculating the average number of nearby cells types for every cell type for a given k. For example, if there exists on average 5 neutrophils for every macrophage in image 1, the column Neutrophil__Macrophage would have a value of 5 for image 1.

kerenSPE <- imcRtools::buildSpatialGraph(kerenSPE, 
                                         img_id = "imageID", 
                                         type = "knn", k = 20,
                                        coords = c("x", "y"))

pairAbundances <- convPairs(kerenSPE,
                  colPair = "knn_interaction_graph")

head(pairAbundances["B_cell__B_cell"])
#>    B_cell__B_cell
#> 1      12.7349608
#> 10      0.2777778
#> 11      0.0000000
#> 12      1.3333333
#> 13      1.2200957
#> 14      0.0000000

The custom distance or abundance metrics can then be included in the analysis with the alternateResult parameter. The Statial package contains other custom distance metrics which can be used with spicy.

spicyTestColPairs <- spicy(
  kerenSPE,
  condition = "tumour_type",
  alternateResult = pairAbundances,
  weights = FALSE
)

topPairs(spicyTestColPairs)
#>                            intercept coefficient     p.value adj.pvalue
#> CD8_T_cell__Neutrophils  0.833333333  -0.7592968 0.002645466  0.3291833
#> B_cell__Tumour           0.001937984   0.2602822 0.004872664  0.3291833
#> Other_Immune__NK         0.012698413   0.2612881 0.005673068  0.3291833
#> Unidentified__CD8_T_cell 0.106626794   0.6387339 0.005906526  0.3291833
#> dn_T_CD3__NK             0.004242424   0.2148797 0.006317829  0.3291833
#> CD4_T_cell__Neutrophils  0.036213602   0.2947696 0.007902670  0.3291833
#> Tregs__CD4_T_cell        0.128876212   0.5726201 0.010207087  0.3291833
#> Endothelial__DC          0.008771930   0.3008523 0.011189533  0.3291833
#> Tumour__Neutrophils      0.021638939   0.2529045 0.011388850  0.3291833
#> Mesenchymal__Neutrophils 0.004504505   0.2494301 0.012761315  0.3291833
#>                                  from          to
#> CD8_T_cell__Neutrophils    CD8_T_cell Neutrophils
#> B_cell__Tumour                 B_cell      Tumour
#> Other_Immune__NK         Other_Immune          NK
#> Unidentified__CD8_T_cell Unidentified  CD8_T_cell
#> dn_T_CD3__NK                 dn_T_CD3          NK
#> CD4_T_cell__Neutrophils    CD4_T_cell Neutrophils
#> Tregs__CD4_T_cell               Tregs  CD4_T_cell
#> Endothelial__DC           Endothelial          DC
#> Tumour__Neutrophils            Tumour Neutrophils
#> Mesenchymal__Neutrophils  Mesenchymal Neutrophils
signifPlot(
  spicyTestColPairs,
  breaks = c(-3, 3, 1),
  marksToPlot = c("Macrophages", "dn_T_CD3", "CD4_T_cell", 
                  "B_cell", "DC_or_Mono", "Neutrophils", "CD8_T_cell")
)

Performing survival analysis

spicy can also be used to perform survival analysis to asses whether changes in co-localisation between cell types are associated with survival probability. spicy requires the SingleCellExperiment object being used to contain a column called survival as a Surv object.

kerenSPE$event = 1 - kerenSPE$Censored
kerenSPE$survival = Surv(kerenSPE$`Survival_days_capped*`, kerenSPE$event)

We can then perform survival analysis using the spicy function by specifying condition = "survival". We can then access the corresponding coefficients and p-values by accessing the survivalResults slot in the spicy results object.

# Running survival analysis
spicySurvival = spicy(kerenSPE,
                      condition = "survival")

# top 10 significant pairs
head(spicySurvival$survivalResults, 10)
#> # A tibble: 10 × 4
#>    test                       coef se.coef    p.value
#>    <chr>                     <dbl>   <dbl>      <dbl>
#>  1 Other_Immune__Tregs     0.0236  0.00866 0.00000893
#>  2 CD4_T_cell__Tregs       0.0177  0.00685 0.0000124 
#>  3 Tregs__Other_Immune     0.0237  0.00873 0.0000126 
#>  4 Tregs__CD4_T_cell       0.0171  0.00676 0.0000285 
#>  5 CD8_T_cell__CD8_T_cell  0.00605 0.00272 0.000332  
#>  6 Tumour__CD8_T_cell     -0.0305  0.0114  0.000617  
#>  7 CD8_T_cell__Tumour     -0.0305  0.0116  0.000721  
#>  8 CD4_T_cell__dn_T_CD3    0.00845 0.00353 0.000794  
#>  9 dn_T_CD3__CD4_T_cell    0.00840 0.00353 0.000937  
#> 10 DC__Other_Immune       -0.0289  0.0123  0.00103

Accounting for tissue inhomogeneity

The spicy function can also account for tissue inhomogeneity to avoid false positives or negatives. This can be done by setting the sigma = parameter within the spicy function. By default, sigma is set to NULL, and spicy assumes a homogeneous tissue structure.

For example, when we examine the L-function for Keratin_Tumour__Neutrophils when sigma = NULL and Rs = 100, the value is positive, indicating attraction between the two cell types.

# filter SPE object to obtain image 24 data
kerenSubset = kerenSPE[, colData(kerenSPE)$imageID == "24"]

pairwiseAssoc = getPairwise(kerenSubset, 
                            sigma = NULL, 
                            Rs = 100) |>
  as.data.frame()

pairwiseAssoc[["Keratin_Tumour__Neutrophils"]]
#> [1] 10.88892

When we specify sigma = 20 and re-calculate the L-function, it indicates that there is no relationship between Keratin_Tumour and Neutrophils, i.e., there is no major attraction or dispersion, as it now takes into account tissue inhomogeneity.

pairwiseAssoc = getPairwise(kerenSubset, 
                            sigma = 20, 
                            Rs = 100) |>
  as.data.frame()

pairwiseAssoc[["Keratin_Tumour__Neutrophils"]]
#> [1] 0.9024836
# obtain colData for image 24
cData = colData(kerenSPE) |> as.data.frame() |> 
          dplyr::filter(imageID == "24")

# obtain cells present in image 24
coords = spatialCoords(kerenSPE) |> as.data.frame()
coords$cellID = rownames(coords)
coords = coords |> dplyr::filter(cellID %in% cData$CellID)

cData$X = coords$x
cData$Y = coords$y

cData = cData |> 
  dplyr::mutate(cellTypeNew = ifelse(cellType %in% c("Keratin_Tumour", "Neutrophils"), 
                                     cellType, "Other"))

 pal = setNames(c("#d6b11c", "#850f07"), 
                c("Keratin_Tumour", "Neutrophils"))

ggplot() +
    stat_density_2d(data = cData, aes(x = X, y = Y, fill = after_stat(density)), 
                    geom = "raster", 
                    contour = FALSE) +
    geom_point(data = cData |> filter(cellType != "Other"),
               aes(x = X, y = Y, colour = cellTypeNew), size = 1) +
    scale_color_manual(values = pal) +
    scale_fill_distiller(palette = "Blues", direction = 1) +
    theme_classic() +
    labs(title = "image ID: 24")

Plotting image 24 shows that the supposed co-localisation occurs due to the dense cluster of cells near the top of the image.

Mixed effects modelling

spicyR supports mixed effects modelling when multiple images are obtained for each subject. In this case, subject is treated as a random effect and condition is treated as a fixed effect. To perform mixed effects modelling, we can specify the subject parameter in the spicy function.

spicyMixedTest <- spicy(
  diabetesData,
  condition = "stage",
  subject = "case"
)

References

sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> time zone: UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#>  [1] survival_3.7-0              dplyr_1.1.4                
#>  [3] imcRtools_1.12.0            SpatialDatasets_1.4.0      
#>  [5] ExperimentHub_2.14.0        AnnotationHub_3.14.0       
#>  [7] BiocFileCache_2.14.0        dbplyr_2.5.0               
#>  [9] SpatialExperiment_1.16.0    SingleCellExperiment_1.28.1
#> [11] SummarizedExperiment_1.36.0 Biobase_2.66.0             
#> [13] GenomicRanges_1.58.0        GenomeInfoDb_1.42.0        
#> [15] IRanges_2.40.0              S4Vectors_0.44.0           
#> [17] BiocGenerics_0.52.0         MatrixGenerics_1.18.0      
#> [19] matrixStats_1.4.1           ggplot2_3.5.1              
#> [21] spicyR_1.19.3               BiocStyle_2.34.0           
#> 
#> loaded via a namespace (and not attached):
#>   [1] fs_1.6.5                    spatstat.sparse_3.1-0      
#>   [3] bitops_1.0-9                sf_1.0-19                  
#>   [5] EBImage_4.48.0              httr_1.4.7                 
#>   [7] RColorBrewer_1.1-3          numDeriv_2016.8-1.1        
#>   [9] tools_4.4.2                 backports_1.5.0            
#>  [11] utf8_1.2.4                  R6_2.5.1                   
#>  [13] DT_0.33                     HDF5Array_1.34.0           
#>  [15] mgcv_1.9-1                  rhdf5filters_1.18.0        
#>  [17] withr_3.0.2                 sp_2.1-4                   
#>  [19] gridExtra_2.3               coxme_2.2-22               
#>  [21] ClassifyR_3.10.0            cli_3.6.3                  
#>  [23] textshaping_0.4.0           spatstat.explore_3.3-3     
#>  [25] labeling_0.4.3              sass_0.4.9                 
#>  [27] nnls_1.6                    spatstat.data_3.1-4        
#>  [29] readr_2.1.5                 proxy_0.4-27               
#>  [31] pkgdown_2.1.1               systemfonts_1.1.0          
#>  [33] ggupset_0.4.0               svglite_2.1.3              
#>  [35] RSQLite_2.3.8               generics_0.1.3             
#>  [37] vroom_1.6.5                 spatstat.random_3.3-2      
#>  [39] car_3.1-3                   scam_1.2-17                
#>  [41] Matrix_1.7-1                ggbeeswarm_0.7.2           
#>  [43] fansi_1.0.6                 abind_1.4-8                
#>  [45] terra_1.7-83                lifecycle_1.0.4            
#>  [47] yaml_2.3.10                 carData_3.0-5              
#>  [49] rhdf5_2.50.0                SparseArray_1.6.0          
#>  [51] grid_4.4.2                  blob_1.2.4                 
#>  [53] promises_1.3.0              crayon_1.5.3               
#>  [55] bdsmatrix_1.3-7             shinydashboard_0.7.2       
#>  [57] lattice_0.22-6              beachmat_2.22.0            
#>  [59] KEGGREST_1.46.0             magick_2.8.5               
#>  [61] cytomapper_1.18.0           pillar_1.9.0               
#>  [63] knitr_1.49                  RTriangle_1.6-0.14         
#>  [65] rjson_0.2.23                boot_1.3-31                
#>  [67] codetools_0.2-20            glue_1.8.0                 
#>  [69] spatstat.univar_3.1-1       data.table_1.16.2          
#>  [71] MultiAssayExperiment_1.32.0 vctrs_0.6.5                
#>  [73] png_0.1-8                   gtable_0.3.6               
#>  [75] cachem_1.1.0                xfun_0.49                  
#>  [77] S4Arrays_1.6.0              mime_0.12                  
#>  [79] tidygraph_1.3.1             pheatmap_1.0.12            
#>  [81] units_0.8-5                 nlme_3.1-166               
#>  [83] bit64_4.5.2                 filelock_1.0.3             
#>  [85] bslib_0.8.0                 svgPanZoom_0.3.4           
#>  [87] KernSmooth_2.23-24          vipor_0.4.7                
#>  [89] colorspace_2.1-1            DBI_1.2.3                  
#>  [91] raster_3.6-30               tidyselect_1.2.1           
#>  [93] bit_4.5.0                   compiler_4.4.2             
#>  [95] curl_6.0.1                  BiocNeighbors_2.0.0        
#>  [97] desc_1.4.3                  DelayedArray_0.32.0        
#>  [99] bookdown_0.41               scales_1.3.0               
#> [101] classInt_0.4-10             distances_0.1.11           
#> [103] rappdirs_0.3.3              tiff_0.1-12                
#> [105] stringr_1.5.1               digest_0.6.37              
#> [107] goftest_1.2-3               fftwtools_0.9-11           
#> [109] spatstat.utils_3.1-1        minqa_1.2.8                
#> [111] rmarkdown_2.29              XVector_0.46.0             
#> [113] htmltools_0.5.8.1           pkgconfig_2.0.3            
#> [115] jpeg_0.1-10                 lme4_1.1-35.5              
#> [117] fastmap_1.2.0               rlang_1.1.4                
#> [119] htmlwidgets_1.6.4           ggthemes_5.1.0             
#> [121] UCSC.utils_1.2.0            shiny_1.9.1                
#> [123] ggh4x_0.2.8                 farver_2.1.2               
#> [125] jquerylib_0.1.4             jsonlite_1.8.9             
#> [127] BiocParallel_1.40.0         RCurl_1.98-1.16            
#> [129] magrittr_2.0.3              scuttle_1.16.0             
#> [131] Formula_1.2-5               GenomeInfoDbData_1.2.13    
#> [133] Rhdf5lib_1.28.0             munsell_0.5.1              
#> [135] Rcpp_1.0.13-1               ggnewscale_0.5.0           
#> [137] viridis_0.6.5               stringi_1.8.4              
#> [139] ggraph_2.2.1                zlibbioc_1.52.0            
#> [141] MASS_7.3-61                 plyr_1.8.9                 
#> [143] parallel_4.4.2              ggrepel_0.9.6              
#> [145] deldir_2.0-4                Biostrings_2.74.0          
#> [147] graphlayouts_1.2.1          splines_4.4.2              
#> [149] tensor_1.5                  hms_1.1.3                  
#> [151] locfit_1.5-9.10             igraph_2.1.1               
#> [153] ggpubr_0.6.0                spatstat.geom_3.3-4        
#> [155] ggsignif_0.6.4              reshape2_1.4.4             
#> [157] BiocVersion_3.20.0          evaluate_1.0.1             
#> [159] BiocManager_1.30.25         tzdb_0.4.0                 
#> [161] nloptr_2.1.1                tweenr_2.0.3               
#> [163] httpuv_1.6.15               tidyr_1.3.1                
#> [165] purrr_1.0.2                 polyclip_1.10-7            
#> [167] ggforce_0.4.2               broom_1.0.7                
#> [169] xtable_1.8-4                e1071_1.7-16               
#> [171] rstatix_0.7.2               later_1.3.2                
#> [173] class_7.3-22                viridisLite_0.4.2          
#> [175] ragg_1.3.3                  tibble_3.2.1               
#> [177] lmerTest_3.1-3              memoise_2.0.1              
#> [179] beeswarm_0.4.0              AnnotationDbi_1.68.0       
#> [181] concaveman_1.1.0
Keren, L, M Bosse, D Marquez, R Angoshtari, S Jain, S Varma, S Yang, et al. 2018. “A Structured Tumor-Immune Microenvironment in Triple Negative Breast Cancer Revealed by Multiplexed Ion Beam Imaging.” Cell 174 (6): 1373–1387.e19.