Given an error matrix, identify the k that maximises the accuracy (i.e. minimises the error) for cells belonging to a provided labelling/grouping. If no labelling is given, a cell-cell similarity network is expected instead, and the k that maximises the accuracy for cells within each neighbourhood is identified. If neither is given, all cells are treated as if they share the same labelling/grouping.
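The label-based case can be illustrated with a minimal sketch. It assumes that "best" means the smallest mean error within each group, which the description does not state explicitly. The helper `best_k_by_group` and the toy data are hypothetical, for illustration only, and are not the package's implementation.

```r
# Hypothetical helper, not part of the package: choose one k per group
# by minimising the mean error within that group (an assumption).
best_k_by_group <- function(E, labels) {
  labels <- factor(labels)
  # Mean error per group (rows) and candidate k (columns)
  group_means <- rowsum(E, labels) / as.vector(table(labels))
  # Column index of the smallest mean error in each group
  best <- apply(group_means, 1, which.min)
  setNames(colnames(E)[best], rownames(group_means))
}

# Toy error matrix: 4 cells, 3 candidate k values
E_toy <- rbind(
  c(0.9, 0.1, 0.5),
  c(0.8, 0.2, 0.6),
  c(0.1, 0.7, 0.9),
  c(0.2, 0.9, 0.8)
)
colnames(E_toy) <- paste0("K_", 1:3)
labels_toy <- c("a", "a", "b", "b")

best_k_toy <- best_k_by_group(E_toy, labels_toy)
# Group "a" selects "K_2"; group "b" selects "K_1"
```

When neither labels nor a neighbourhood index is available, the same sketch applies with a single group containing every cell.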
Arguments
- E
An error matrix with rows corresponding to cells and columns corresponding to candidate k values. Entries are error values: either binary (from a single classification) or continuous (after multiple classifications).
- labels
Group labels for cells.
- local
A neighbourhood index representation, as typically returned by BiocNeighbors::findKNN().
- outputPerCell
Logical; whether to return the adaptive k for each cell rather than only for each label (used when labels is given).
- ...
Further arguments, including return_colnames: logical; whether to return the column names of the best-selected k values (default TRUE) rather than just their column indices.
Examples
E <- matrix(runif(100), 20, 5)
colnames(E) <- paste0("K_", 1:5)
# generate cell labels
labels <- factor(rep(letters[1:2], each = 10))
# generate nearest neighbourhood index representation
data <- matrix(rpois(10 * 20, 10), 10, 20) # 10 genes, 20 cells
local <- BiocNeighbors::findKNN(t(data), k = 5, get.distance = FALSE)$index
#> Warning: detected tied distances to neighbors, see ?'BiocNeighbors-ties'
best_k_labels <- getAdaptiveK(E, labels = labels)
best_k_local <- getAdaptiveK(E, local = local)
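To make the neighbourhood-based case concrete, here is a minimal sketch. It assumes each cell is assigned the k with the smallest mean error across its listed neighbours. The helper `best_k_by_neighbourhood` is hypothetical and not the package's implementation. The index matrix is built by hand in the layout of a k-nearest-neighbour index (one row per cell, one column per neighbour) so that the block runs without BiocNeighbors.

```r
# Hypothetical helper, not part of the package: choose one k per cell by
# minimising the mean error over that cell's neighbours (an assumption).
best_k_by_neighbourhood <- function(E, local) {
  best <- apply(local, 1, function(nbrs) {
    # Mean error of the neighbouring cells for each candidate k
    which.min(colMeans(E[nbrs, , drop = FALSE]))
  })
  colnames(E)[best]
}

# Toy error matrix: 6 cells, 2 candidate k values
E_nb <- matrix(
  c(0.9, 0.8, 0.7, 0.1, 0.2, 0.3,   # errors for K_1
    0.1, 0.2, 0.3, 0.7, 0.9, 0.8),  # errors for K_2
  ncol = 2
)
colnames(E_nb) <- c("K_1", "K_2")

# Hand-built neighbour index: cells 1-3 and cells 4-6 form two neighbourhoods
local_nb <- rbind(c(2, 3), c(1, 3), c(1, 2), c(5, 6), c(4, 6), c(4, 5))

best_k_nb <- best_k_by_neighbourhood(E_nb, local_nb)
# Cells 1-3 select "K_2"; cells 4-6 select "K_1"
```

In this sketch a cell's own error does not contribute, because the hand-built index lists only other cells. Whether the actual function includes the cell itself is not stated here.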