An introduction to the SpatialFeatures Package
Guan Gui, Shila Ghazanfar
School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, AustraliaCharles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, AustraliaSydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, AustraliaSource:
vignettes/SpatialFeatures.Rmd
SpatialFeatures.Rmd
SpatialFeatures
The R package SpatialFeatures
contains functions to
extract features from molecule-resolved spatial transcriptomics data
using objects from the MoleculeExperiment
class.
Load Required Libraries
library(SpatialFeatures)
library(MoleculeExperiment)
library(SingleCellExperiment)
library(ggplot2)
library(dplyr)
Create Example Dataset
Load an example MoleculeExperiment
object with Xenium
data:
repoDir <- system.file("extdata", package = "MoleculeExperiment")
repoDir <- paste0(repoDir, "/xenium_V1_FF_Mouse_Brain")
me <- readXenium(repoDir, keepCols = "essential")
me
#> MoleculeExperiment class
#>
#> molecules slot (1): detected
#> - detected:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- features (137): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> ---- molecules (962)
#> ---- location range: [4900,4919.98] x [6400.02,6420]
#> -- sample2:
#> ---- features (143): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> ---- molecules (777)
#> ---- location range: [4900.01,4919.98] x [6400.16,6419.97]
#>
#>
#> boundaries slot (1): cell
#> - cell:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (5): 67500 67512 67515 67521 67527
#> -- sample2:
#> ---- segments (9): 65043 65044 ... 65070 65071
Quick start
The spatialFeatures
function performs three key steps.
First, it generates new boundaries corresponding to the feature types.
Second, it calculates entropy-based metrics for each of the feature
types. Third, it combines all this information and returns a
SingleCellExperiment object.
SpatialFeatures can be run with the following basic command. This will extract out all feature types as a single assay.
se = spatialFeatures(me)
se
#> class: SingleCellExperiment
#> dim: 178 14
#> metadata(0):
#> assays(4): subsector subconcentric supersector superconcentric
#> rownames(178): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> rowData names(0):
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#> sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
You can also calculate gene counts within the spatialFeatures function, either as additional concatenated features, or with each feature type belonging to its own assay.
se = spatialFeatures(me, includeCounts = TRUE, concatenateFeatures = FALSE)
se
#> class: SingleCellExperiment
#> dim: 178 14
#> metadata(0):
#> assays(5): subsector subconcentric supersector superconcentric
#> genecount
#> rownames(178): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> rowData names(0):
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#> sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
Step-by-step SpatialFeatures
Generate new boundaries based on subcellular and supercellular features
This function loadBoundaries(me)
processes the given
data (me
) and returns feature data based on multiple assay
types, including: - Sub-Sector - Sub-Concentric - Super-Sector -
Super-Concentric
me <- SpatialFeatures::loadBoundaries(me)
me
#> MoleculeExperiment class
#>
#> molecules slot (1): detected
#> - detected:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- features (137): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> ---- molecules (962)
#> ---- location range: [4900,4919.98] x [6400.02,6420]
#> -- sample2:
#> ---- features (143): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> ---- molecules (777)
#> ---- location range: [4900.01,4919.98] x [6400.16,6419.97]
#>
#>
#> boundaries slot (5): cell subsector subconcentric supersector
#> superconcentric
#> - cell:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (5): 67500 67512 67515 67521 67527
#> -- sample2:
#> ---- segments (9): 65043 65044 ... 65070 65071
#> - subsector:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (60): 67500_01 67500_02 ... 67527_11 67527_12
#> -- sample2:
#> ---- segments (108): 65043_01 65043_02 ... 65071_11 65071_12
#> - subconcentric:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (30): 67500_01 67500_02 ... 67527_05 67527_06
#> -- sample2:
#> ---- segments (54): 65043_01 65043_02 ... 65071_05 65071_06
#> - supersector:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (55): 67500_01 67500_02 ... 67527_10 67527_11
#> -- sample2:
#> ---- segments (99): 65043_01 65043_02 ... 65071_10 65071_11
#> - superconcentric:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (35): 67500_01 67500_02 ... 67527_06 67527_07
#> -- sample2:
#> ---- segments (63): 65043_01 65043_02 ... 65071_06 65071_07
Draw the Feature Boundary Plots
Visualize the spatial molecule and cell boundary map with SpatialFeatures boundaries.
ggplot_me() +
# add molecule points and colour by feature name
geom_point_me(me, byColour = "feature_id", size = 0.1) +
# add nuclei segments and colour the border with red
geom_polygon_me(me, assayName = "subsector", fill = NA, colour = "red") +
# add cell segments and colour by cell id
geom_polygon_me(me, byFill = "segment_id", colour = "black", alpha = 0.1) +
# zoom in to selected patch area
coord_cartesian(xlim = c(4875, 4950), ylim = c(6385, 6455))
ggplot_me() +
# add molecule points and colour by feature name
geom_point_me(me, byColour = "feature_id", size = 0.1) +
# add nuclei segments and colour the border with red
geom_polygon_me(me, assayName = "subconcentric", fill = NA, colour = "red") +
# add cell segments and colour by cell id
geom_polygon_me(me, byFill = "segment_id", colour = "black", alpha = 0.1) +
# zoom in to selected patch area
coord_cartesian(xlim = c(4875, 4950), ylim = c(6385, 6455))
ggplot_me() +
# add molecule points and colour by feature name
geom_point_me(me, byColour = "feature_id", size = 0.1) +
# add nuclei segments and colour the border with red
geom_polygon_me(me, assayName = "supersector", fill = NA, colour = "blue") +
# add cell segments and colour by cell id
geom_polygon_me(me, byFill = "segment_id", colour = "black", alpha = 0.1) +
# zoom in to selected patch area
coord_cartesian(xlim = c(4875, 4950), ylim = c(6385, 6455))
ggplot_me() +
# add molecule points and colour by feature name
geom_point_me(me, byColour = "feature_id", size = 0.1) +
# add nuclei segments and colour the border with red
geom_polygon_me(me, assayName = "superconcentric", fill = NA, colour = "blue") +
# add cell segments and colour by cell id
geom_polygon_me(me, byFill = "segment_id", colour = "black", alpha = 0.1) +
# zoom in to selected patch area
coord_cartesian(xlim = c(4875, 4950), ylim = c(6385, 6455))
Create Entropy Matrix
This function EntropyMatrix()
computes the entropy of a
counts matrix from a MoleculeExperiment
object based on the
given assay type.
ent = SpatialFeatures::EntropyMatrix(me, c("subsector", "subconcentric", "supersector", "superconcentric"), nCores = 1)
lapply(ent, head, n = 4)
#> $subsector
#> sample1.67500 sample1.67512 sample1.67515 sample1.67521
#> 2010300C02Rik 0.5623351 0.6931472 0.6365142 0.6931472
#> Acsbg1 0.6931472 0.6931472 0.6931472 0.0000000
#> Adamts2 0.0000000 0.0000000 0.0000000 0.0000000
#> Adamtsl1 0.0000000 0.0000000 0.0000000 0.0000000
#> sample1.67527 sample2.65043 sample2.65044 sample2.65051
#> 2010300C02Rik 0 0 0 0.0000000
#> Acsbg1 0 0 0 0.6931472
#> Adamts2 0 0 0 0.0000000
#> Adamtsl1 0 0 0 0.0000000
#> sample2.65055 sample2.65063 sample2.65064 sample2.65067
#> 2010300C02Rik 1.5498260 0 0.6931472 0
#> Acsbg1 0.6931472 0 0.0000000 0
#> Adamts2 0.0000000 0 0.0000000 0
#> Adamtsl1 0.0000000 0 0.0000000 0
#> sample2.65070 sample2.65071
#> 2010300C02Rik 0 0.0000000
#> Acsbg1 0 0.6931472
#> Adamts2 0 0.0000000
#> Adamtsl1 0 0.0000000
#>
#> $subconcentric
#> sample1.67500 sample1.67512 sample1.67515 sample1.67521
#> 2010300C02Rik 0.5623351 0.6931472 0.6365142 0
#> Acsbg1 0.6931472 0.0000000 0.6931472 0
#> Adamts2 0.0000000 0.0000000 0.0000000 0
#> Adamtsl1 0.0000000 0.0000000 0.0000000 0
#> sample1.67527 sample2.65043 sample2.65044 sample2.65051
#> 2010300C02Rik 0 0 0 0
#> Acsbg1 0 0 0 0
#> Adamts2 0 0 0 0
#> Adamtsl1 0 0 0 0
#> sample2.65055 sample2.65063 sample2.65064 sample2.65067
#> 2010300C02Rik 1.2770343 0 0.6931472 0
#> Acsbg1 0.6931472 0 0.0000000 0
#> Adamts2 0.0000000 0 0.0000000 0
#> Adamtsl1 0.0000000 0 0.0000000 0
#> sample2.65070 sample2.65071
#> 2010300C02Rik 0 0
#> Acsbg1 0 0
#> Adamts2 0 0
#> Adamtsl1 0 0
#>
#> $supersector
#> sample1.67500 sample1.67512 sample1.67515 sample1.67521
#> 2010300C02Rik 0 1.332179 0.6931472 0.6931472
#> Acsbg1 0 0.000000 0.6931472 0.6931472
#> Adamts2 0 0.000000 0.0000000 0.0000000
#> Adamtsl1 0 0.000000 0.0000000 0.0000000
#> sample1.67527 sample2.65043 sample2.65044 sample2.65051
#> 2010300C02Rik 0 0.5623351 1.039721 0.5982696
#> Acsbg1 0 0.0000000 0.000000 0.9502705
#> Adamts2 0 0.0000000 0.000000 0.0000000
#> Adamtsl1 0 0.0000000 0.000000 0.0000000
#> sample2.65055 sample2.65063 sample2.65064 sample2.65067
#> 2010300C02Rik 0.000000 0.6931472 0 0.0000000
#> Acsbg1 1.559581 1.0114043 0 0.6931472
#> Adamts2 0.000000 0.0000000 0 0.0000000
#> Adamtsl1 0.000000 0.0000000 0 0.0000000
#> sample2.65070 sample2.65071
#> 2010300C02Rik 0.0000000 0.0000000
#> Acsbg1 0.5623351 0.6365142
#> Adamts2 0.0000000 0.0000000
#> Adamtsl1 0.0000000 0.0000000
#>
#> $superconcentric
#> sample1.67500 sample1.67512 sample1.67515 sample1.67521
#> 2010300C02Rik 1.5607104 1.05492 0.6931472 0.0000000
#> Acsbg1 0.6931472 0.00000 0.6931472 0.6931472
#> Adamts2 0.0000000 0.00000 0.0000000 0.0000000
#> Adamtsl1 0.0000000 0.00000 0.0000000 0.0000000
#> sample1.67527 sample2.65043 sample2.65044 sample2.65051
#> 2010300C02Rik 0.0000000 1.039721 1.386294 1.549826
#> Acsbg1 0.6931472 0.000000 0.000000 1.054920
#> Adamts2 0.0000000 0.000000 0.000000 0.000000
#> Adamtsl1 0.0000000 0.000000 0.000000 0.000000
#> sample2.65055 sample2.65063 sample2.65064 sample2.65067
#> 2010300C02Rik 0.6931472 0.6931472 1.386294 0.0000000
#> Acsbg1 1.3208883 1.3296613 0.000000 0.6931472
#> Adamts2 0.0000000 0.0000000 0.000000 0.0000000
#> Adamtsl1 0.0000000 0.0000000 0.000000 0.0000000
#> sample2.65070 sample2.65071
#> 2010300C02Rik 0.6931472 0.0000000
#> Acsbg1 0.6931472 0.6365142
#> Adamts2 0.0000000 0.0000000
#> Adamtsl1 0.0000000 0.0000000
Create SingleCellExperiment
Object
This function EntropySingleCellExperiment()
Convert the
entropy matrix into a SingleCellExperiment
object.
se = SpatialFeatures::EntropySingleCellExperiment(ent, me)
se
#> class: SingleCellExperiment
#> dim: 178 14
#> metadata(0):
#> assays(4): subsector subconcentric supersector superconcentric
#> rownames(178): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> rowData names(0):
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#> sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
If you also want the gene counts, you can include the parameter
se = EntropySingleCellExperiment(ent, me, includeCounts = TRUE)
se
#> class: SingleCellExperiment
#> dim: 178 14
#> metadata(0):
#> assays(5): subsector subconcentric supersector superconcentric
#> genecount
#> rownames(178): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> rowData names(0):
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#> sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
Inspect the SpatialFeatures SingleCellExperiment Object
Check the structure and content of the SingleCellExperiment object. This is a SingleCellExperiment with four assays corresponding to the different types of features that have been extracted.
se
#> class: SingleCellExperiment
#> dim: 178 14
#> metadata(0):
#> assays(5): subsector subconcentric supersector superconcentric
#> genecount
#> rownames(178): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> rowData names(0):
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#> sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
colData(se)
#> DataFrame with 14 rows and 5 columns
#> Cell Sample_id x_central y_central
#> <character> <character> <numeric> <numeric>
#> sample1.67500 sample1.67500 sample1 4896.19 6415.14
#> sample1.67512 sample1.67512 sample1 4909.74 6411.80
#> sample1.67515 sample1.67515 sample1 4912.42 6402.93
#> sample1.67521 sample1.67521 sample1 4916.01 6415.75
#> sample1.67527 sample1.67527 sample1 4923.68 6414.74
#> ... ... ... ... ...
#> sample2.65063 sample2.65063 sample2 4928.03 6415.25
#> sample2.65064 sample2.65064 sample2 4896.37 6406.56
#> sample2.65067 sample2.65067 sample2 4923.71 6406.08
#> sample2.65070 sample2.65070 sample2 4902.14 6395.17
#> sample2.65071 sample2.65071 sample2 4920.63 6398.99
#> boundaries
#> <AsIs>
#> sample1.67500 4896.85:6413.46,4895.15:6413.89,4894.30:6417.71
#> sample1.67512 4901.95:6417.07,4901.52:6418.56,4900.89:6419.41
#> sample1.67515 4916.40:6412.61,4912.15:6420.05,4910.45:6430.89
#> sample1.67521 4911.51:6397.95,4907.48:6400.71,4904.07:6407.30
#> sample1.67527 4929.57:6408.36,4924.69:6409.85,4921.29:6409.85
#> ... ...
#> sample2.65063 4904.71:6399.65,4898.76:6401.35,4894.30:6407.94
#> sample2.65064 4905.77:6404.11,4906.20:6407.73,4907.05:6412.19
#> sample2.65067 4908.96:6395.61,4904.93:6399.44,4905.56:6403.26
#> sample2.65070 4919.59:6408.36,4915.98:6411.34,4915.76:6411.98
#> sample2.65071 4922.35:6404.54,4919.80:6407.94,4920.23:6413.46
Alternatively, we could use the concatenateFeatures
argument to extract out all features as a single assay, as shown below.
In this case, the type of feature is recorded in the
rowData
of the SingleCellExperiment
object.
se_concat = EntropySingleCellExperiment(ent, me,
includeCounts = TRUE,
concatenateFeatures = TRUE)
se_concat
#> class: SingleCellExperiment
#> dim: 890 14
#> metadata(0):
#> assays(1): spatialFeatures
#> rownames(890): subsector_2010300C02Rik subsector_Acsbg1 ...
#> genecount_Zfp536 genecount_Zfpm2
#> rowData names(2): FeatureCategory FeatureGene
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#> sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
rowData(se_concat)
#> DataFrame with 890 rows and 2 columns
#> FeatureCategory FeatureGene
#> <character> <character>
#> subsector_2010300C02Rik subsector 2010300C02Rik
#> subsector_Acsbg1 subsector Acsbg1
#> subsector_Adamts2 subsector Adamts2
#> subsector_Adamtsl1 subsector Adamtsl1
#> subsector_Adgrl4 subsector Adgrl4
#> ... ... ...
#> genecount_Vip genecount Vip
#> genecount_Vwc2l genecount Vwc2l
#> genecount_Wfs1 genecount Wfs1
#> genecount_Zfp536 genecount Zfp536
#> genecount_Zfpm2 genecount Zfpm2
Finish
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] parallel stats4 stats graphics grDevices utils datasets
#> [8] methods base
#>
#> other attached packages:
#> [1] ggplot2_3.5.1 SpatialFeatures_0.99.1
#> [3] purrr_1.0.2 dplyr_1.1.4
#> [5] SingleCellExperiment_1.26.0 SummarizedExperiment_1.34.0
#> [7] Biobase_2.64.0 GenomicRanges_1.56.2
#> [9] GenomeInfoDb_1.40.1 IRanges_2.38.1
#> [11] S4Vectors_0.42.1 BiocGenerics_0.50.0
#> [13] MatrixGenerics_1.16.0 matrixStats_1.4.1
#> [15] MoleculeExperiment_1.4.1 BiocStyle_2.32.1
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.2.1 EBImage_4.46.0 farver_2.1.2
#> [4] bitops_1.0-9 fastmap_1.2.0 RCurl_1.98-1.16
#> [7] digest_0.6.37 lifecycle_1.0.4 terra_1.7-83
#> [10] magrittr_2.0.3 compiler_4.4.1 rlang_1.1.4
#> [13] sass_0.4.9 tools_4.4.1 utf8_1.2.4
#> [16] yaml_2.3.10 data.table_1.16.2 knitr_1.48
#> [19] labeling_0.4.3 S4Arrays_1.4.1 htmlwidgets_1.6.4
#> [22] bit_4.5.0 DelayedArray_0.30.1 abind_1.4-8
#> [25] BiocParallel_1.38.0 withr_3.0.1 desc_1.4.3
#> [28] grid_4.4.1 fansi_1.0.6 colorspace_2.1-1
#> [31] scales_1.3.0 cli_3.6.3 rmarkdown_2.28
#> [34] crayon_1.5.3 ragg_1.3.3 generics_0.1.3
#> [37] httr_1.4.7 rjson_0.2.23 cachem_1.1.0
#> [40] zlibbioc_1.50.0 BiocManager_1.30.25 XVector_0.44.0
#> [43] tiff_0.1-12 vctrs_0.6.5 Matrix_1.7-0
#> [46] jsonlite_1.8.9 bookdown_0.41 fftwtools_0.9-11
#> [49] bit64_4.5.2 systemfonts_1.1.0 jpeg_0.1-10
#> [52] magick_2.8.5 locfit_1.5-9.10 jquerylib_0.1.4
#> [55] glue_1.8.0 pkgdown_2.1.1 codetools_0.2-20
#> [58] gtable_0.3.5 UCSC.utils_1.0.0 munsell_0.5.1
#> [61] tibble_3.2.1 pillar_1.9.0 htmltools_0.5.8.1
#> [64] GenomeInfoDbData_1.2.12 R6_2.5.1 textshaping_0.4.0
#> [67] evaluate_1.0.1 lattice_0.22-6 highr_0.11
#> [70] png_0.1-8 SpatialExperiment_1.14.0 bslib_0.8.0
#> [73] Rcpp_1.0.13 SparseArray_1.4.8 xfun_0.48
#> [76] fs_1.6.4 pkgconfig_2.0.3