Train and test scClassify model
scClassify(
  exprsMat_train = NULL,
  cellTypes_train = NULL,
  exprsMat_test = NULL,
  cellTypes_test = NULL,
  tree = "HOPACH",
  algorithm = "WKNN",
  selectFeatures = "limma",
  similarity = "pearson",
  cutoff_method = c("dynamic", "static"),
  weighted_ensemble = FALSE,
  weights = NULL,
  weighted_jointClassification = TRUE,
  cellType_tree = NULL,
  k = 10,
  topN = 50,
  hopach_kmax = 5,
  pSig = 0.01,
  prob_threshold = 0.7,
  cor_threshold_static = 0.5,
  cor_threshold_high = 0.7,
  returnList = TRUE,
  parallel = FALSE,
  BPPARAM = BiocParallel::SerialParam(),
  verbose = FALSE
)
exprsMat_train | A log-transformed expression matrix of the reference dataset |
---|---|
cellTypes_train | A vector of cell types for the reference dataset |
exprsMat_test | A list or a matrix containing the expression matrices of the query datasets |
cellTypes_test | A list or a vector containing the cell types of the query datasets (optional). |
tree | The method used to build the hierarchical tree, "HOPACH" by default. Must be one of "HOPACH" or "HC" (which uses hclust). |
algorithm | The KNN method(s) to use, "WKNN" by default. Must be one or more of "WKNN", "KNN", "DWKNN". |
selectFeatures | The gene selection method(s), "limma" by default. Must be one or more of "limma", "DV", "DD", "chisq", "BI". |
similarity | The similarity measure(s) to use, "pearson" by default. Must be one or more of "pearson", "spearman", "cosine", "jaccard", "kendall", "binomial", "weighted_rank", "manhattan". |
cutoff_method | The method used to set the cutoff on the correlation distribution, either "dynamic" or "static". Set to "dynamic" by default. |
weighted_ensemble | A logical indicating whether, in ensemble learning, the results are combined using a weighted score for each base classifier. |
weights | A vector of weights for the ensemble. |
weighted_jointClassification | A logical indicating whether, in joint classification using multiple training datasets, the results are combined using a weighted score for each training model. |
cellType_tree | A list specifying a user-provided cell type tree (NULL by default). Only applies when a single training dataset is supplied. |
k | An integer giving the number of neighbours. |
topN | An integer giving the number of top features to select. |
hopach_kmax | An integer between 1 and 9 specifying the maximum number of children at each node in the HOPACH tree. |
pSig | A numeric giving the p-value cutoff for features. |
prob_threshold | A numeric giving the probability threshold for KNN/WKNN/DWKNN. |
cor_threshold_static | A numeric giving the static correlation threshold. |
cor_threshold_high | A numeric giving the highest correlation threshold. |
returnList | A logical indicating whether the output is returned as a list. |
parallel | A logical indicating whether to run in parallel. |
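BPPARAM | A BiocParallel parameter object controlling parallel execution. Defaults to BiocParallel::SerialParam(). |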
verbose | A logical indicating whether intermediate steps are printed. |
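
Several of the arguments above accept more than one value, in which case scClassify trains one base classifier per combination. The following is a minimal sketch of such a call, assuming the bundled `scClassify_example` data and two similarity measures; the object name `res_ensemble` is made up for illustration, and the layout of the returned results is inspected with `names()` rather than assumed.

```r
library("scClassify")

# Example data shipped with the package
data("scClassify_example")
ex <- scClassify_example

# Two similarity measures, so two base classifiers are trained.
# weighted_ensemble = FALSE combines them without per-classifier weights.
res_ensemble <- scClassify(
  exprsMat_train = ex$exprsMat_xin_subset,
  cellTypes_train = ex$xin_cellTypes,
  exprsMat_test = list(wang = ex$exprsMat_wang_subset),
  cellTypes_test = list(wang = ex$wang_cellTypes),
  tree = "HOPACH",
  algorithm = "WKNN",
  selectFeatures = "limma",
  similarity = c("pearson", "cosine"),
  weighted_ensemble = FALSE,
  returnList = FALSE,
  verbose = FALSE
)

# List what was stored for the "wang" query dataset
names(res_ensemble$testRes$wang)
```

Setting `weighted_ensemble = TRUE` instead would weight each base classifier when combining results, as described for that argument.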
A list of results, including testRes, which stores the results for the test (query) datasets, and trainRes, which stores the training model information.
data("scClassify_example")
xin_cellTypes <- scClassify_example$xin_cellTypes
exprsMat_xin_subset <- scClassify_example$exprsMat_xin_subset
wang_cellTypes <- scClassify_example$wang_cellTypes
exprsMat_wang_subset <- scClassify_example$exprsMat_wang_subset

scClassify_res <- scClassify(exprsMat_train = exprsMat_xin_subset,
                             cellTypes_train = xin_cellTypes,
                             exprsMat_test = list(wang = exprsMat_wang_subset),
                             cellTypes_test = list(wang = wang_cellTypes),
                             tree = "HOPACH",
                             algorithm = "WKNN",
                             selectFeatures = c("limma"),
                             similarity = c("pearson"),
                             returnList = FALSE,
                             verbose = FALSE)
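
Once the call returns, the predictions can be compared against the known query labels. The sketch below repeats the same call in compact form so that it runs on its own, then cross-tabulates predicted and true cell types. The element name `pearson_WKNN_limma` and its `predRes` field follow the naming used in the package vignette and should be treated as an assumption here; check `names()` on your own result if they differ.

```r
library("scClassify")

data("scClassify_example")
ex <- scClassify_example

# Same settings as the example above (tree, algorithm, selectFeatures and
# similarity are left at their defaults: HOPACH, WKNN, limma, pearson)
res <- scClassify(
  exprsMat_train = ex$exprsMat_xin_subset,
  cellTypes_train = ex$xin_cellTypes,
  exprsMat_test = list(wang = ex$exprsMat_wang_subset),
  cellTypes_test = list(wang = ex$wang_cellTypes),
  returnList = FALSE,
  verbose = FALSE
)

# Top-level components of the result
names(res)

# Results stored for the "wang" query dataset
names(res$testRes$wang)

# Assumed element name (similarity_algorithm_selectFeatures), per the vignette
pred <- res$testRes$wang$pearson_WKNN_limma$predRes

# Predicted labels (rows) against the provided labels (columns)
table(pred, ex$wang_cellTypes)
```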