Convert a transcript (molecule) or boundary dataframe to the ME list format
Source: R/dataframeToMEList.R
This function standardises transcript and boundary dataframes into the ME list format, so that they can be used as input to a MoleculeExperiment object.
Usage
dataframeToMEList(
df,
dfType = NULL,
assayName = NULL,
sampleCol = "sample_id",
factorCol = NULL,
xCol = "x_location",
yCol = "y_location",
keepCols = "essential",
scaleFactor = 1
)
Arguments
- df
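A data.frame containing the transcript information or the boundary information. NOTE: this dataframe should, at a minimum, have the following 4 columns: the sample id (named by sampleCol, "sample_id" by default), the grouping factor (named by factorCol, e.g., feature_id in transcripts, or cell_id in boundaries), and the x and y coordinates (named by xCol and yCol, "x_location" and "y_location" by default).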
- dfType
Character string specifying contents of the dataframe. Can be either "molecules" or "boundaries".
- assayName
Character string specifying the name with which to identify the information later on in an ME object.
- sampleCol
Character string specifying the name of the column with the sample id.
- factorCol
Character string specifying the name of the column with the factors by which to group the data in the lists. When working with molecules, this column would be, e.g., "feature_id" in Xenium data. When working with boundaries, this column would be, e.g., "cell_id" in Xenium data.
- xCol
Character string specifying the name of the column with global x coordinates.
- yCol
Character string specifying the name of the column with global y coordinates.
- keepCols
Character string which can be either "essential" or "all". If "essential" (the default), the function will only work with the x and y location information. If "all", the other columns of the dataframe are kept as well.
- scaleFactor
Number specifying the scale factor by which to change the scale of the x and y locations (e.g., to change from pixel to micron). The default value is 1, which leaves the locations unchanged.
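The arguments above describe a grouping of rows by sample and then by factor, with the coordinate columns renamed to x_location and y_location. The base R sketch below illustrates that idea only; it is not the package implementation. The helper name toNestedList is hypothetical, the result is not wrapped in an assay name, and applying scaleFactor by multiplication is an assumption.

```r
# Hypothetical helper illustrating the sample -> factor nesting.
# NOT the package implementation; scaling by multiplication is assumed.
toNestedList <- function(df, sampleCol, factorCol, xCol, yCol,
                         scaleFactor = 1) {
    # Keep only the essential columns, under standardised names.
    std <- data.frame(
        x_location = df[[xCol]] * scaleFactor,
        y_location = df[[yCol]] * scaleFactor
    )
    # First level: row indices per sample.
    bySample <- split(seq_len(nrow(df)), df[[sampleCol]])
    # Second level: within each sample, one table per factor value.
    lapply(bySample, function(idx) {
        split(std[idx, , drop = FALSE], df[[factorCol]][idx], drop = TRUE)
    })
}

toyDf <- data.frame(
    sample_id = c("s1", "s1", "s2"),
    feature_id = c("geneA", "geneB", "geneA"),
    x = c(10, 20, 30),
    y = c(1, 2, 3)
)
nested <- toNestedList(toyDf, "sample_id", "feature_id", "x", "y",
                       scaleFactor = 0.5)
nested$s1$geneA
```

Each leaf is a small table of coordinates, reached as nested$<sample>$<factor>, which mirrors the layout that dataframeToMEList prints under the assay name.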
Examples
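library(MoleculeExperiment)

# Toy transcript table: 50 molecules across two samples and two genes.
# Coordinates are random and no seed is set, so values differ between runs.
moleculesDf <- data.frame(
    sample_id = rep(c("sample1", "sample2"), times = c(30, 20)),
    features = rep(c("gene1", "gene2"), times = c(20, 30)),
    x_coords = runif(50),
    y_coords = runif(50)
)

# The column names differ from the defaults, so they are passed explicitly.
moleculesMEList <- dataframeToMEList(moleculesDf,
    dfType = "molecules",
    assayName = "detected",
    sampleCol = "sample_id",
    factorCol = "features",
    xCol = "x_coords",
    yCol = "y_coords")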
moleculesMEList
#> $detected
#> $detected$sample1
#> $detected$sample1$gene1
#> # A tibble: 20 × 2
#> x_location y_location
#> <dbl> <dbl>
#> 1 0.568 0.453
#> 2 0.273 0.426
#> 3 0.973 0.766
#> 4 0.694 0.0527
#> 5 0.391 0.325
#> 6 0.0696 0.176
#> 7 0.102 0.843
#> 8 0.132 0.862
#> 9 0.283 0.569
#> 10 0.373 0.641
#> 11 0.985 0.548
#> 12 0.617 0.198
#> 13 0.0465 0.367
#> 14 0.830 0.158
#> 15 0.452 0.825
#> 16 0.804 0.836
#> 17 0.668 0.416
#> 18 0.116 0.917
#> 19 0.651 0.193
#> 20 0.114 0.510
#>
#> $detected$sample1$gene2
#> # A tibble: 10 × 2
#> x_location y_location
#> <dbl> <dbl>
#> 1 0.957 0.816
#> 2 0.109 0.822
#> 3 0.741 0.582
#> 4 0.979 0.218
#> 5 0.0887 0.116
#> 6 0.441 0.643
#> 7 0.473 0.546
#> 8 0.675 0.777
#> 9 0.406 0.315
#> 10 0.206 0.367
#>
#>
#> $detected$sample2
#> $detected$sample2$gene2
#> # A tibble: 20 × 2
#> x_location y_location
#> <dbl> <dbl>
#> 1 0.386 0.145
#> 2 0.728 0.258
#> 3 0.104 0.768
#> 4 0.0229 0.421
#> 5 0.701 0.351
#> 6 0.574 0.725
#> 7 0.498 0.627
#> 8 0.749 0.744
#> 9 0.331 0.739
#> 10 0.936 0.461
#> 11 0.554 0.893
#> 12 0.873 0.611
#> 13 0.942 0.286
#> 14 0.659 0.702
#> 15 0.956 0.297
#> 16 0.694 0.0900
#> 17 0.262 0.322
#> 18 0.0142 0.0419
#> 19 0.992 0.814
#> 20 0.494 0.476
#>
#>
#>
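The printed result is a nested list ordered as assay, then sample, then factor, with one coordinate table per leaf. As a minimal sketch of how such a structure can be summarised with base R, the block below builds a stand-in list of the same shape (plain data.frames instead of tibbles, random coordinates) and counts the rows per factor within each sample. The object names meList and counts are hypothetical.

```r
# Stand-in for the printed structure: assay -> sample -> factor -> table.
# Plain data.frames are used here in place of the tibbles shown above.
meList <- list(
    detected = list(
        sample1 = list(
            gene1 = data.frame(x_location = runif(20), y_location = runif(20)),
            gene2 = data.frame(x_location = runif(10), y_location = runif(10))
        ),
        sample2 = list(
            gene2 = data.frame(x_location = runif(20), y_location = runif(20))
        )
    )
)

# Number of rows (molecules) per factor, computed separately per sample.
counts <- lapply(meList$detected, function(sampleList) {
    vapply(sampleList, nrow, integer(1))
})
counts
```

The same lapply/vapply pattern should apply to the list returned by dataframeToMEList, since nrow works on tibbles as well.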