9  Cell relationships relative to expected behaviour

Understanding the spatial relationships of cells within tissue is crucial for interpreting their roles within tissues. However, without defining appropriate contexts, these relationships can be misinterpreted due to confounding factors such as tissue stricture and the choice of region to image. In this section, we will demonstrate the use of our Statial package to derive appropriate contexts for evaluating cell localisation.

# set parameters
set.seed(51773)
use_mc <- TRUE
if (use_mc) {
  nCores <- max(parallel::detectCores()/2, 1)
} else {
  nCores <- 1
}
BPPARAM <- simpleSeg:::generateBPParam(nCores)
theme_set(theme_classic())

9.1 Load datasets

kerenSPE <- SpatialDatasets::spe_Keren_2018()

# remove any missing data in our outcome columns
kerenSPE = kerenSPE[, complete.cases(colData(kerenSPE)[, c("Censored", "Survival_days_capped*",
                                                           "tumour_type")])]

9.2 Kontextual: Context aware cell localisation

Kontextual is a method for performing inference on cell localisation which explicitly defines the contexts in which spatial relationships between cells can be identified and interpreted. These contexts may represent landmarks, spatial domains, or groups of functionally similar cells which are consistent across regions. By modelling spatial relationships between cells relative to these contexts, Kontextual produces robust spatial quantifications that are not confounded by biases such as the choice of region to image and the tissue structure present in the images. The Kontextual function is available in the Statial package.

In this example we demonstrate how cell type hierarchies can be used as a means to derive appropriate “contexts” for the evaluation of cell localisation. We then demonstrate the types of conclusions which Kontextual enables.

9.2.1 Using cell type hierarchies to define a “context”

A cell type hierarchy may be used to define the “context” in which cell type relationships are evaluated within. A cell type hierarchy defines how cell types are functionally related to one another. The bottom of the hierarchy represents homogeneous populations of a cell type (child), and the cell populations at the nodes of the hierarchy represent broader parent populations with shared generalised function. For example, CD4 T cells may be considered a child population to the Immune parent population.

There are two ways to define the cell type hierarchy. First, they can be defined based on our biological understanding of the cell types. We can represent this by creating a named list containing the names of each parent and the associated vector of child cell types.

Note

The all vector must be created to include cell types which do not have a parent e.g. the undefined cell type in this data set.

# Examine all cell types in image
unique(kerenSPE$cellType)
 [1] "Keratin_Tumour" "dn_T_CD3"       "B_cell"         "CD4_T_cell"    
 [5] "DC_or_Mono"     "Unidentified"   "Macrophages"    "CD8_T_cell"    
 [9] "Other_Immune"   "Endothelial"    "Mono_or_Neu"    "Mesenchymal"   
[13] "Neutrophils"    "NK"             "Tumour"         "DC"            
[17] "Tregs"         
# Named list of parents and their child cell types
biologicalHierarchy = list(
  "tumour" = c("Keratin_Tumour", "Tumour"),
  "tcells" = c("dn_T_CD3", "CD4_T_cell", "CD8_T_cell", "Tregs"),
  "myeloid" = c("DC_or_Mono", "DC", "Mono_or_Neu", "Macrophages", "Neutrophils"),
  "tissue" = c("Endothelial", "Mesenchymal")
)

# Adding more broader immune parent populations
biologicalHierarchy$immune = c(biologicalHierarchy$bcells,
                               biologicalHierarchy$tcells,
                               biologicalHierarchy$myeloid,
                               "NK", "Other_Immune", "B_cell")


# Creating a vector for all cellTypes
all <- unique(kerenSPE$cellType)

Alternatively, you can use the treeKoR package on Bioconductor to define these hierarchies in a data driven way.

Note

These parent populations may not be accurate as we are using a small subset of the data.

# Calculate hierarchy using treekoR
kerenTree <- treekoR::getClusterTree(t(assay(kerenSPE, "intensities")),
                            kerenSPE$cellType,
                            hierarchy_method = "hopach",
                            hopach_K = 1)

# Convert treekoR result to a name list of parents and children.
treekorParents = getParentPhylo(kerenTree)

treekorParents
$parent_1
 [1] "dn_T_CD3"     "B_cell"       "CD4_T_cell"   "DC_or_Mono"   "Macrophages" 
 [6] "CD8_T_cell"   "Other_Immune" "NK"           "DC"           "Tregs"       

$parent_2
[1] "Unidentified" "Endothelial"  "Mesenchymal" 

$parent_3
[1] "Keratin_Tumour" "Mono_or_Neu"    "Neutrophils"    "Tumour"        

9.2.2 Application on triple negative breast cancer image

Here we examine an image highlighted in the Keren 2018 manuscript where accounting for context information enabled new conclusions.

# Lets define a new cell type vector
kerenSPE$cellTypeNew <- kerenSPE$cellType

# Select for all cells that express higher than baseline level of p53
p53Pos <- assay(kerenSPE)["p53", ] > -0.300460

# Find p53+ tumour cells
kerenSPE$cellTypeNew[kerenSPE$cellType %in% biologicalHierarchy$tumour] <- "Tumour"
kerenSPE$cellTypeNew[p53Pos & kerenSPE$cellType %in% biologicalHierarchy$tumour] <- "p53_Tumour"

# Group all immune cells under the name "Immune"
kerenSPE$cellTypeNew[kerenSPE$cellType %in% biologicalHierarchy$immune] <- "Immune"

kerenSPE$x <- spatialCoords(kerenSPE)[,"x"]
kerenSPE$y <- spatialCoords(kerenSPE)[,"y"]

# Plot image 6
kerenSPE |>
  colData() |>
  as.data.frame() |>
  filter(imageID == "6") |>
  filter(cellTypeNew %in% c("Immune", "Tumour", "p53_Tumour")) |>
  arrange(cellTypeNew) |>
  ggplot(aes(x = x, y = y, color = cellTypeNew)) +
  geom_point(size = 1) +
  scale_colour_manual(values = c("Immune" = "#505050", "p53_Tumour" = "#64BC46", "Tumour" = "#D6D6D6")) +
  guides(colour = guide_legend(title = "Cell types", override.aes = list(size = 3)))

In image 6 of the Keren 2018 dataset given above, we can see that p53+ tumour cells and immune cells are dispersed. However, we can also see that p53+ tumour cells appear much more localised to immune cells relative to the tumour context (tumour cells and p53+ tumour cells).

We can calculate a context-aware spatial co-localisation metric using Kontextual. Kontextual accepts a SingleCellExperiment object, a single image, or list of images from a SingleCellExperiment object, which gets passed into the cells argument. The two cell types which will be evaluated are specified in the to and from arguments. A parent population must also be specified in the parent argument. Note the parent cell population must include the to cell type. The argument r will specify the radius which the cell relationship will be evaluated on. Kontextual supports parallel processing, and the number of cores can be specified using the cores argument. Kontextual can take a single value or multiple values for each argument and will test all combinations of the arguments specified.

We can calculate these relationships across all images for a single radius (r = 100).

p53_Kontextual <- Kontextual(
  cells = kerenSPE,
  r = 100,
  from = "Immune",
  to = "p53_Tumour",
  parent = c("p53_Tumour", "Tumour"),
  cellType = "cellTypeNew"
)

head(p53_Kontextual)
  imageID               test   original kontextual   r inhomL
1       1 Immune__p53_Tumour -16.212016  -1.681595 100  FALSE
2      10 Immune__p53_Tumour -14.715356  -1.793741 100  FALSE
3      11 Immune__p53_Tumour -11.696597  -7.461566 100  FALSE
4      12 Immune__p53_Tumour  -9.777271  -2.628700 100  FALSE
5      13 Immune__p53_Tumour -15.613023  -3.993736 100  FALSE
6      14 Immune__p53_Tumour -14.671281  -4.287914 100  FALSE

Time for this code chunk to run with 5.5 cores: 12.82 seconds

The kontextCurve function plots the L-function value and Kontextual values over a range of radii. If the points lie above the red line (expected pattern) then localisation is indicated for that radius, if the points lie below the red line then dispersion is indicated.

As seen in the following plot the L-function produces negative values over a range of radii, indicating that p53+ tumour cells and immune cells are dispersed from one another. However by taking into account the tumour context, Kontextual shows positive values over some radii, indicating localisation between p53+ tumour cells and immune cells.

curves <- kontextCurve(
  cells = kerenSPE,
  from = "Immune",
  to = "p53_Tumour",
  parent = c("p53_Tumour", "Tumour"),
  rs = seq(50, 510, 50),
  image = "6",
  cellType = "cellTypeNew",
  cores = nCores
)

kontextPlot(curves)

Time for this code chunk to run with 5.5 cores: 14.56 seconds

Alternatively, we can also test all pairwise cell relationships and their corresponding parent in the dataset. First, we create a data frame with all pairwise combinations using the parentCombinations function. This function takes in a vector of all the cells, as well as the named list of parents and children created earlier in the parentList argument. As shown below, the output is a data frame specifying the to, from, and parent arguments for Kontextual.

Note

If you want to use the treeKor defined hierarchy instead, the output of getParentPhylo can be passed to the parentList argument.

# Get all relationships between cell types and their parents
parentDf <- parentCombinations(
  all = all,
  parentList = biologicalHierarchy
)

head(parentDf)
            from     to       parent parent_name
1 Keratin_Tumour B_cell dn_T_CD3....      immune
2       dn_T_CD3 B_cell dn_T_CD3....      immune
3     CD4_T_cell B_cell dn_T_CD3....      immune
4     DC_or_Mono B_cell dn_T_CD3....      immune
5   Unidentified B_cell dn_T_CD3....      immune
6    Macrophages B_cell dn_T_CD3....      immune

9.2.3 Calculating all pairwise relationships

Rather than specifying to, from, and parent in Kontextual, the output from parentCombinations can be inputted into Kontextual using the parentDf argument to examine all pairwise relationships in the dataset.

# Running Kontextual on all relationships across all images.
kerenKontextual <- Kontextual(
  cells = kerenSPE,
  parentDf = parentDf,
  r = 100,
  cores = nCores
)

head(kerenKontextual)
  imageID                           test   original kontextual   r inhomL
1       1 Keratin_Tumour__B_cell__immune -32.547645 -20.812972 100  FALSE
2      10 Keratin_Tumour__B_cell__immune -32.497368 -26.795851 100  FALSE
3      11 Keratin_Tumour__B_cell__immune   4.829389   1.697818 100  FALSE
4      12 Keratin_Tumour__B_cell__immune -13.468515  -8.937432 100  FALSE
5      13 Keratin_Tumour__B_cell__immune -21.768051 -14.206190 100  FALSE
6      14 Keratin_Tumour__B_cell__immune         NA         NA 100  FALSE

Time for this code chunk to run with 5.5 cores: 82.94 seconds

For every pairwise relationship (named accordingly: from__to__parent) Kontextual outputs the L-function values (original) and the Kontextual value. The relationships where the L-function and Kontextual disagree (e.g. one metric is positive and the other is negative) represent relationships where adding context information results in different conclusions on the spatial relationship between the two cell types.

9.2.4 Associating the relationships with survival outcomes

We can then examine whether Kontexual can be associated with patient outcomes or groupings. For this, we can use the spicy function from the spicyR package. First, the Kontextual results must be converted from a data.frame to a wide matrix. This can be done with the prepMatrix function.

Note

To extract the original L-function values, specify column = "original" in prepMatrix.

# Converting Kontextual result into data matrix
kontextMat <- prepMatrix(kerenKontextual)

# Ensuring rownames of kontextMat match up with the image IDs of the SCE object
kontextMat <- kontextMat[kerenSPE$imageID |> unique(), ]

# Replace NAs with 0
kontextMat[is.na(kontextMat)] <- 0

Finally, both the SingleCellExperiment object and the Kontextual matrix are passed into the spicy function, with condition = "survival". The resulting coefficients and p values can be obtained by accessing the survivalResults slot.

kerenSPE$event = 1 - kerenSPE$Censored
kerenSPE$survival = Surv(kerenSPE$`Survival_days_capped*`, kerenSPE$event)

# Running survival analysis using spicy
survivalResults = spicy(cells = kerenSPE,
                        alternateResult = kontextMat,
                        condition = "survival",
                        weights = TRUE)

head(survivalResults$survivalResults, 10)
# A tibble: 10 × 4
   test                                    coef se.coef     p.value
   <chr>                                  <dbl>   <dbl>       <dbl>
 1 dn_T_CD3__Tregs__immune               0.0187 0.00716 0.000000568
 2 Other_Immune__Tregs__immune           0.0255 0.00885 0.000000696
 3 Tregs__dn_T_CD3__immune               0.0189 0.00716 0.00000176 
 4 Neutrophils__CD8_T_cell__immune       0.0178 0.00652 0.00000177 
 5 DC__dn_T_CD3__immune                 -0.0195 0.00662 0.00000730 
 6 DC__dn_T_CD3__tcells                 -0.0171 0.00759 0.0000165  
 7 Other_Immune__Keratin_Tumour__tumour  0.137  0.0409  0.0000207  
 8 CD4_T_cell__Tregs__immune             0.0216 0.00902 0.0000290  
 9 Other_Immune__DC__myeloid            -0.0181 0.00664 0.0000490  
10 Other_Immune__DC__immune             -0.0186 0.00669 0.0000652  

Time for this code chunk to run with 5.5 cores: 3.84 seconds

The survival results can also be visualised using the signifPlot function. The “Context” slot in the legend describes the parent population with respect to which the Kontextual metric is calculated.

signifPlot(survivalResults)

As we can see from the results, dn_T_CD3__Tregs__immune is the one of the most significant pairwise relationships which contributes to patient survival. That is the relationship between double negative CD3 T cells and regulatory T cells, relative to the parent population of immune cells. We can see that there is a positive coefficient associated with this relationship, which tells us an increase in localisation of these cell types relative to immune cells leads to worse survival outcomes for patients.

The association between dn_T_CD3__Tregs__immune and survival can also be visualised on a Kaplan-Meier curve. First, we extract survival data from the SingleCellExperiment object and create a survival vector.

# Extracting survival data
survData <- kerenSPE |>
  colData() |>
  data.frame() |>
  select(imageID, survival) |>
  unique()

# Creating survival vector
kerenSurv <- survData$survival
names(kerenSurv) <- survData$imageID

kerenSurv
    1     2     3     4     5     6     7     8     9    10    11    12    13 
 2612   745 3130+ 2523+ 1683+ 2275+   584   946 3767+ 3822+ 3774+ 4353+  1072 
   14    15    16    17    18    19    20    21    23    24    25    26    27 
4145+  1754   530  2842 5063+ 3725+ 4761+   635    91   194 4785+ 4430+  3658 
   28    29    31    32    33    34    35    36    37    39    40    41 
3767+  1319  1009 1568+ 1738+ 2832+ 2759+ 3063+ 2853+ 2096+  3573 3355+ 

Next, we extract the Kontextual values of this relationship across all images. We then determine if double negative CD3 T cells and regulatory T cells cells are relatively attracted or avoiding in each image by comparing the Kontextual value in each image to the median Kontextual value.

Finally, we plot a Kaplan-Meier curve using the ggsurvfit package. As shown below, when double negative CD3 T cells and regulatory T cells are more localised to one another relative to the immune cell population, patients tend to have worse survival outcomes.

# Selecting most significant relationship
survRelationship <- kontextMat[["dn_T_CD3__Tregs__immune"]]
survRelationship <- ifelse(survRelationship > median(survRelationship), "Localised", "Dispersed")

# Plotting Kaplan-Meier curve
survfit2(kerenSurv ~ survRelationship) |>
  ggsurvfit() +
  ggtitle("dn_T_CD3__Tregs__immune")

In the next section, we will look at how we can identify spatial domains, or regions of co-localisation between cell types, and how they can be associated with clinical outcomes.

9.3 sessionInfo

R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.4.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Sydney
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] ggsurvfit_1.1.0             treekoR_1.14.0             
 [3] tibble_3.2.1                survival_3.7-0             
 [5] dplyr_1.1.4                 imcRtools_1.12.0           
 [7] SpatialDatasets_1.4.0       ExperimentHub_2.14.0       
 [9] AnnotationHub_3.14.0        BiocFileCache_2.14.0       
[11] dbplyr_2.5.0                SpatialExperiment_1.16.0   
[13] SingleCellExperiment_1.28.1 SummarizedExperiment_1.36.0
[15] Biobase_2.66.0              GenomicRanges_1.58.0       
[17] GenomeInfoDb_1.42.0         IRanges_2.40.0             
[19] S4Vectors_0.44.0            BiocGenerics_0.52.0        
[21] MatrixGenerics_1.18.0       matrixStats_1.4.1          
[23] ggplot2_3.5.1               Statial_1.8.0              
[25] spicyR_1.18.0              

loaded via a namespace (and not attached):
  [1] vroom_1.6.5                 tiff_0.1-12                
  [3] dcanr_1.22.0                goftest_1.2-3              
  [5] DT_0.33                     Biostrings_2.74.0          
  [7] HDF5Array_1.34.0            TH.data_1.1-2              
  [9] vctrs_0.6.5                 spatstat.random_3.3-2      
 [11] digest_0.6.37               png_0.1-8                  
 [13] shape_1.4.6.1               proxy_0.4-27               
 [15] ggrepel_0.9.6               deldir_2.0-4               
 [17] magick_2.8.5                MASS_7.3-61                
 [19] reshape2_1.4.4              httpuv_1.6.15              
 [21] foreach_1.5.2               withr_3.0.2                
 [23] ggfun_0.1.7                 xfun_0.49                  
 [25] ggpubr_0.6.0                doRNG_1.8.6.1              
 [27] memoise_2.0.1               RTriangle_1.6-0.14         
 [29] cytomapper_1.18.0           ggbeeswarm_0.7.2           
 [31] RProtoBufLib_2.18.0         systemfonts_1.1.0          
 [33] tidytree_0.4.6              zoo_1.8-12                 
 [35] GlobalOptions_0.1.2         Formula_1.2-5              
 [37] KEGGREST_1.46.0             promises_1.3.2             
 [39] httr_1.4.7                  rstatix_0.7.2              
 [41] rhdf5filters_1.18.0         rhdf5_2.50.0               
 [43] rstudioapi_0.17.1           UCSC.utils_1.2.0           
 [45] units_0.8-5                 generics_0.1.3             
 [47] concaveman_1.1.0            curl_6.0.1                 
 [49] zlibbioc_1.52.0             ggraph_2.2.1               
 [51] polyclip_1.10-7             GenomeInfoDbData_1.2.13    
 [53] SparseArray_1.6.0           fftwtools_0.9-11           
 [55] xtable_1.8-4                stringr_1.5.1              
 [57] doParallel_1.0.17           evaluate_1.0.1             
 [59] S4Arrays_1.6.0              hms_1.1.3                  
 [61] colorspace_2.1-1            filelock_1.0.3             
 [63] spatstat.data_3.1-4         magrittr_2.0.3             
 [65] readr_2.1.5                 later_1.4.1                
 [67] viridis_0.6.5               ggtree_3.14.0              
 [69] lattice_0.22-6              spatstat.geom_3.3-4        
 [71] XML_3.99-0.17               scuttle_1.16.0             
 [73] ggupset_0.4.0               class_7.3-22               
 [75] svgPanZoom_0.3.4            pillar_1.9.0               
 [77] simpleSeg_1.8.0             nlme_3.1-166               
 [79] iterators_1.0.14            EBImage_4.48.0             
 [81] compiler_4.4.1              beachmat_2.22.0            
 [83] stringi_1.8.4               sf_1.0-19                  
 [85] tensor_1.5                  minqa_1.2.8                
 [87] ClassifyR_3.10.5            plyr_1.8.9                 
 [89] crayon_1.5.3                abind_1.4-8                
 [91] gridGraphics_0.5-1          locfit_1.5-9.10            
 [93] sp_2.1-4                    graphlayouts_1.2.1         
 [95] bit_4.5.0                   terra_1.7-78               
 [97] sandwich_3.1-1              codetools_0.2-20           
 [99] multcomp_1.4-26             e1071_1.7-16               
[101] GetoptLong_1.0.5            plotly_4.10.4              
[103] mime_0.12                   MultiAssayExperiment_1.32.0
[105] splines_4.4.1               circlize_0.4.16            
[107] Rcpp_1.0.13-1               knitr_1.49                 
[109] blob_1.2.4                  utf8_1.2.4                 
[111] clue_0.3-66                 BiocVersion_3.20.0         
[113] lme4_1.1-35.5               fs_1.6.5                   
[115] nnls_1.6                    ggplotify_0.1.2            
[117] ggsignif_0.6.4              Matrix_1.7-1               
[119] scam_1.2-17                 statmod_1.5.0              
[121] tzdb_0.4.0                  svglite_2.1.3              
[123] tweenr_2.0.3                pkgconfig_2.0.3            
[125] pheatmap_1.0.12             tools_4.4.1                
[127] cachem_1.1.0                RSQLite_2.3.8              
[129] viridisLite_0.4.2           DBI_1.2.3                  
[131] numDeriv_2016.8-1.1         fastmap_1.2.0              
[133] rmarkdown_2.29              scales_1.3.0               
[135] grid_4.4.1                  shinydashboard_0.7.2       
[137] broom_1.0.7                 patchwork_1.3.0            
[139] BiocManager_1.30.25         carData_3.0-5              
[141] farver_2.1.2                tidygraph_1.3.1            
[143] mgcv_1.9-1                  yaml_2.3.10                
[145] ggthemes_5.1.0              cli_3.6.3                  
[147] purrr_1.0.2                 hopach_2.66.0              
[149] lifecycle_1.0.4             mvtnorm_1.3-2              
[151] backports_1.5.0             BiocParallel_1.40.0        
[153] cytolib_2.18.0              gtable_0.3.6               
[155] rjson_0.2.23                parallel_4.4.1             
[157] ape_5.8                     limma_3.62.1               
[159] jsonlite_1.8.9              edgeR_4.4.0                
[161] bitops_1.0-9                bit64_4.5.2                
[163] Rtsne_0.17                  FlowSOM_2.14.0             
[165] yulab.utils_0.1.8           spatstat.utils_3.1-1       
[167] BiocNeighbors_2.0.0         ranger_0.17.0              
[169] flowCore_2.18.0             bdsmatrix_1.3-7            
[171] spatstat.univar_3.1-1       lazyeval_0.2.2             
[173] shiny_1.9.1                 ConsensusClusterPlus_1.70.0
[175] htmltools_0.5.8.1           diffcyt_1.26.0             
[177] rappdirs_0.3.3              distances_0.1.11           
[179] glue_1.8.0                  XVector_0.46.0             
[181] RCurl_1.98-1.16             treeio_1.30.0              
[183] classInt_0.4-10             coxme_2.2-22               
[185] jpeg_0.1-10                 gridExtra_2.3              
[187] boot_1.3-31                 igraph_2.1.1               
[189] R6_2.5.1                    tidyr_1.3.1                
[191] ggiraph_0.8.11              labeling_0.4.3             
[193] ggh4x_0.2.8                 cluster_2.1.6              
[195] rngtools_1.5.2              Rhdf5lib_1.28.0            
[197] aplot_0.2.3                 nloptr_2.1.1               
[199] DelayedArray_0.32.0         tidyselect_1.2.1           
[201] vipor_0.4.7                 ggforce_0.4.2              
[203] raster_3.6-30               car_3.1-3                  
[205] AnnotationDbi_1.68.0        munsell_0.5.1              
[207] KernSmooth_2.23-24          data.table_1.16.2          
[209] htmlwidgets_1.6.4           ComplexHeatmap_2.22.0      
[211] RColorBrewer_1.1-3          rlang_1.1.4                
[213] spatstat.sparse_3.1-0       spatstat.explore_3.3-3     
[215] lmerTest_3.1-3              uuid_1.2-1                 
[217] colorRamps_2.3.4            ggnewscale_0.5.0           
[219] fansi_1.0.6                 beeswarm_0.4.0