Train and test scClassify model

scClassify(
  exprsMat_train = NULL,
  cellTypes_train = NULL,
  exprsMat_test = NULL,
  cellTypes_test = NULL,
  tree = "HOPACH",
  algorithm = "WKNN",
  selectFeatures = "limma",
  similarity = "pearson",
  cutoff_method = c("dynamic", "static"),
  weighted_ensemble = FALSE,
  weights = NULL,
  weighted_jointClassification = TRUE,
  cellType_tree = NULL,
  k = 10,
  topN = 50,
  hopach_kmax = 5,
  pSig = 0.01,
  prob_threshold = 0.7,
  cor_threshold_static = 0.5,
  cor_threshold_high = 0.7,
  returnList = TRUE,
  parallel = FALSE,
  BPPARAM = BiocParallel::SerialParam(),
  verbose = FALSE
)

Arguments

exprsMat_train
    A log-transformed expression matrix of the reference dataset.

cellTypes_train
    A vector of cell types of the reference dataset.

exprsMat_test
    A list or a matrix giving the expression matrices of the query datasets.

cellTypes_test
    A list or a vector giving the cell types of the query datasets (optional).

tree
    A vector indicating the method used to build the hierarchical tree, set as "HOPACH" by default. This should be one of "HOPACH" and "HC" (using hclust).

algorithm
    A vector indicating the KNN method(s) used, set as "WKNN" by default. This should be one or more of "WKNN", "KNN", "DWKNN".

selectFeatures
    A vector indicating the gene selection method(s), set as "limma" by default. This should be one or more of "limma", "DV", "DD", "chisq", "BI".

similarity
    A vector indicating the similarity measure(s) used, set as "pearson" by default. This should be one or more of "pearson", "spearman", "cosine", "jaccard", "kendall", "binomial", "weighted_rank", "manhattan".

cutoff_method
    A vector indicating the method used to cut off the correlation distribution, either "dynamic" or "static". Set as "dynamic" by default.

weighted_ensemble
    A logical indicating whether, in ensemble learning, the results are combined using a weighted score for each base classifier.

weights
    A vector of weights for the ensemble.

weighted_jointClassification
    A logical indicating whether, in joint classification using multiple training datasets, the results are combined using a weighted score for each training model.

cellType_tree
    A list giving a user-provided cell type tree. NULL by default. Only supported when a single training dataset is supplied.

k
    An integer giving the number of neighbours.

topN
    An integer giving the number of top features that are selected.

hopach_kmax
    An integer between 1 and 9 specifying the maximum number of children at each node in the HOPACH tree.

pSig
    A numeric giving the p-value cutoff for features.

prob_threshold
    A numeric giving the probability threshold for KNN/WKNN/DWKNN.

cor_threshold_static
    A numeric giving the static correlation threshold.

cor_threshold_high
    A numeric giving the highest correlation threshold.

returnList
    A logical indicating whether the output will be of class list.

parallel
    A logical indicating whether to run in parallel.

BPPARAM
    A BiocParallelParam class object from the BiocParallel package. Default is SerialParam().

verbose
    A logical indicating whether the intermediate steps will be printed.
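To give a rough intuition for how k, similarity = "pearson" and prob_threshold might interact, the following base R sketch classifies one query cell by a plain k-nearest-neighbour vote over Pearson correlations. It is a toy illustration under stated assumptions, not scClassify's internal implementation: the helper knn_predict, the simulated data, the "unassigned" label and the unweighted vote are all hypothetical, and the hierarchical tree, feature selection and weighting steps are left out.

```r
# Toy illustration only: NOT scClassify's internal code.
# Layout assumed by this sketch: genes in rows, cells in columns.
set.seed(1)
n_genes <- 20

# Two hypothetical cell types with opposite expression profiles
profile_a <- seq_len(n_genes)
profile_b <- rev(profile_a)

# 15 noisy reference cells per type
train <- cbind(
  replicate(15, profile_a + rnorm(n_genes, sd = 0.3)),
  replicate(15, profile_b + rnorm(n_genes, sd = 0.3))
)
labels <- rep(c("A", "B"), each = 15)

# One query cell drawn from type A
query <- profile_a + rnorm(n_genes, sd = 0.3)

# Hypothetical helper: unweighted KNN vote with a probability threshold
knn_predict <- function(train, labels, query, k = 10, prob_threshold = 0.7) {
  # Pearson correlation between the query cell and every reference cell
  sims <- as.vector(cor(train, query, method = "pearson"))
  # Indices of the k most similar reference cells
  nn <- order(sims, decreasing = TRUE)[seq_len(k)]
  # Share of the k neighbours voting for each label
  votes <- table(labels[nn]) / k
  best <- names(votes)[which.max(votes)]
  # Leave the cell unassigned when the winning vote share is below the threshold
  if (max(votes) < prob_threshold) "unassigned" else best
}

pred <- knn_predict(train, labels, query)
pred_all <- knn_predict(train, labels, query, k = 30)
```

With k = 10 the neighbours all come from type A, so the query is labelled "A". With k = 30 every reference cell votes, the winning share drops to 0.5, and the sketch leaves the cell unassigned because 0.5 is below the 0.7 threshold.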

Value

A list of results, including testRes, which stores the results for the test datasets, and trainRes, which stores the training model information.

Examples


data("scClassify_example")
xin_cellTypes <- scClassify_example$xin_cellTypes
exprsMat_xin_subset <- scClassify_example$exprsMat_xin_subset
wang_cellTypes <- scClassify_example$wang_cellTypes
exprsMat_wang_subset <- scClassify_example$exprsMat_wang_subset

scClassify_res <- scClassify(exprsMat_train = exprsMat_xin_subset,
                             cellTypes_train = xin_cellTypes,
                             exprsMat_test = list(wang = exprsMat_wang_subset),
                             cellTypes_test = list(wang = wang_cellTypes),
                             tree = "HOPACH",
                             algorithm = "WKNN",
                             selectFeatures = c("limma"),
                             similarity = c("pearson"),
                             returnList = FALSE,
                             verbose = FALSE)