Train and test scClassify model

scClassify(
  exprsMat_train = NULL,
  cellTypes_train = NULL,
  exprsMat_test = NULL,
  cellTypes_test = NULL,
  tree = "HOPACH",
  algorithm = "WKNN",
  selectFeatures = "limma",
  similarity = "pearson",
  cutoff_method = c("dynamic", "static"),
  weighted_ensemble = FALSE,
  weights = NULL,
  weighted_jointClassification = TRUE,
  cellType_tree = NULL,
  k = 10,
  topN = 50,
  hopach_kmax = 5,
  pSig = 0.01,
  prob_threshold = 0.7,
  cor_threshold_static = 0.5,
  cor_threshold_high = 0.7,
  returnList = TRUE,
  parallel = FALSE,
  BPPARAM = BiocParallel::SerialParam(),
  verbose = FALSE
)

Arguments

exprsMat_train	A matrix of log-transformed expression values for the reference dataset.
cellTypes_train	A vector of cell types for the reference dataset.
exprsMat_test	A list or a matrix giving the expression matrices of the query datasets.
cellTypes_test	A list or a vector giving the cell types of the query datasets (optional).
tree	A vector indicating the method used to build the hierarchical tree, "HOPACH" by default. This should be one of "HOPACH" and "HC" (using hclust).
algorithm	A vector indicating the KNN method(s) to use, "WKNN" by default. This should be one or more of "WKNN", "KNN", "DWKNN".
selectFeatures	A vector indicating the gene selection method(s), "limma" by default. This should be one or more of "limma", "DV", "DD", "chisq", "BI".
similarity	A vector indicating the similarity measure(s) to use, "pearson" by default. This should be one or more of "pearson", "spearman", "cosine", "jaccard", "kendall", "binomial", "weighted_rank", "manhattan".
cutoff_method	A vector indicating the method used to cut off the correlation distribution, either "dynamic" or "static". Set as "dynamic" by default.
weighted_ensemble	A logical indicating whether, in ensemble learning, the results are combined using a weighted score for each base classifier.
weights	A vector of weights for the ensemble.
weighted_jointClassification	A logical indicating whether, in joint classification using multiple training datasets, the results are combined using a weighted score for each training model.
cellType_tree	A list giving a user-provided cell type tree (NULL by default). Only applies when a single training dataset is supplied.
k	An integer giving the number of neighbours.
topN	An integer giving the number of top features to select.
hopach_kmax	An integer between 1 and 9 specifying the maximum number of children at each node in the HOPACH tree.
pSig	A numeric giving the p-value cutoff for features.
prob_threshold	A numeric giving the probability threshold for KNN/WKNN/DWKNN.
cor_threshold_static	A numeric giving the static correlation threshold.
cor_threshold_high	A numeric giving the highest correlation threshold.
returnList	A logical indicating whether the output will be of class list.
parallel	A logical indicating whether to run in parallel.
BPPARAM	A `BiocParallelParam` object from the `BiocParallel` package. Default is SerialParam().
verbose	A logical indicating whether intermediate steps will be printed.
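
To build intuition for how k, similarity and prob_threshold might interact, the following toy sketch in base R classifies one query cell by a similarity-weighted vote among its k most correlated reference cells. It is an illustration under simplifying assumptions and is not scClassify's internal code: the helper `knn_vote`, the synthetic data and the "unassigned" label are all hypothetical.

```r
# Toy illustration only; NOT the scClassify implementation.
# Deterministic synthetic data: 20 genes, 3 reference cells per cell type.
genes <- 1:20
profile_a <- sin(genes)
profile_b <- cos(genes)
perturb <- function(i) 0.05 * cos(3 * genes + i)  # small cell-specific deviation

ref <- cbind(
  sapply(1:3, function(i) profile_a + perturb(i)),
  sapply(4:6, function(i) profile_b + perturb(i))
)
ref_labels <- rep(c("A", "B"), each = 3)

# Hypothetical helper: similarity-weighted vote among the k nearest
# reference cells, rejecting the call if the top share is too low.
knn_vote <- function(query, ref, ref_labels, k = 3, prob_threshold = 0.7) {
  sims <- as.vector(cor(query, ref, method = "pearson"))
  nn <- order(sims, decreasing = TRUE)[seq_len(k)]
  w <- pmax(sims[nn], 0)  # ignore negative correlations
  labs <- unique(ref_labels[nn])
  votes <- vapply(labs, function(l) sum(w[ref_labels[nn] == l]), numeric(1))
  probs <- votes / sum(votes)
  best <- names(probs)[which.max(probs)]
  if (probs[[best]] >= prob_threshold) best else "unassigned"
}

clear_query <- profile_a + perturb(7)   # resembles type A
mixed_query <- profile_a + profile_b    # equally similar to A and B

clear_call <- knn_vote(clear_query, ref, ref_labels, k = 3)
mixed_call <- knn_vote(mixed_query, ref, ref_labels, k = 6)
```

In this sketch the clear query is labelled "A", while the mixed query splits its vote roughly evenly between the two types and so falls below the 0.7 threshold.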

Value

A list of results, including testRes, which stores the results for the test (query) datasets, and trainRes, which stores the training model information.

Examples


data("scClassify_example")
xin_cellTypes <- scClassify_example$xin_cellTypes
exprsMat_xin_subset <- scClassify_example$exprsMat_xin_subset
wang_cellTypes <- scClassify_example$wang_cellTypes
exprsMat_wang_subset <- scClassify_example$exprsMat_wang_subset

scClassify_res <- scClassify(exprsMat_train = exprsMat_xin_subset,
                             cellTypes_train = xin_cellTypes,
                             exprsMat_test = list(wang = exprsMat_wang_subset),
                             cellTypes_test = list(wang = wang_cellTypes),
                             tree = "HOPACH",
                             algorithm = "WKNN",
                             selectFeatures = c("limma"),
                             similarity = c("pearson"),
                             returnList = FALSE,
                             verbose = FALSE)
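
Because algorithm, selectFeatures and similarity each accept more than one value, the same call can be used for ensemble learning. The sketch below reuses the example data with two similarity measures and then looks at the top level of the returned object. It assumes the scClassify package is installed; the object name `scClassify_res_ensemble` is chosen here for illustration, and the nested layout is printed with `str()` rather than assumed.

```r
library(scClassify)

data("scClassify_example")
xin_cellTypes <- scClassify_example$xin_cellTypes
exprsMat_xin_subset <- scClassify_example$exprsMat_xin_subset
wang_cellTypes <- scClassify_example$wang_cellTypes
exprsMat_wang_subset <- scClassify_example$exprsMat_wang_subset

# Two similarity measures give two base classifiers;
# weighted_ensemble = FALSE combines them without weighting.
scClassify_res_ensemble <- scClassify(exprsMat_train = exprsMat_xin_subset,
                                      cellTypes_train = xin_cellTypes,
                                      exprsMat_test = list(wang = exprsMat_wang_subset),
                                      cellTypes_test = list(wang = wang_cellTypes),
                                      tree = "HOPACH",
                                      algorithm = "WKNN",
                                      selectFeatures = c("limma"),
                                      similarity = c("pearson", "spearman"),
                                      weighted_ensemble = FALSE,
                                      returnList = FALSE,
                                      verbose = FALSE)

# Top-level components described under Value
names(scClassify_res_ensemble)

# Inspect the per-dataset test results without assuming their inner names
str(scClassify_res_ensemble$testRes, max.level = 2)
```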