A function to perform pairwise cross validation
crissCrossValidate.Rd
This function has been designed to perform cross-validation and model prediction on datasets in a pairwise manner.
Usage
crissCrossValidate(
  measurements,
  outcomes,
  nFeatures = 20,
  selectionMethod = "auto",
  selectionOptimisation = "Resubstitution",
  trainType = c("modelTrain", "modelTest"),
  performanceType = "auto",
  doRandomFeatures = FALSE,
  classifier = "auto",
  nFolds = 5,
  nRepeats = 20,
  nCores = 1,
  verbose = 0
)
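A call following the usage above might look like the sketch below. It assumes that each data set is a matrix with samples in rows and features in columns, that both lists carry matching data set names, and that classification outcomes are factors; the helpers makeMeasurements and makeOutcomes and the simulated values are hypothetical and only serve as illustration. The call itself runs only if the ClassifyR package is installed.

```r
set.seed(2)

# Hypothetical helper: one simulated data set, samples in rows, features in columns.
makeMeasurements <- function(nSamples = 40, nFeatures = 30) {
  matrix(rnorm(nSamples * nFeatures), nrow = nSamples,
         dimnames = list(paste0("Sample", seq_len(nSamples)),
                         paste0("Feature", seq_len(nFeatures))))
}

# Hypothetical helper: a two-class outcome with equal class sizes.
makeOutcomes <- function(nSamples = 40) {
  factor(rep(c("Healthy", "Disease"), each = nSamples / 2))
}

# One list element per data set; names are assumed to identify the data sets.
measurements <- list(Study1 = makeMeasurements(), Study2 = makeMeasurements())
outcomes <- list(Study1 = makeOutcomes(), Study2 = makeOutcomes())

# Guarded so that the sketch still runs when ClassifyR is not available.
if (requireNamespace("ClassifyR", quietly = TRUE)) {
  result <- ClassifyR::crissCrossValidate(measurements, outcomes,
                                          nFeatures = 5, nFolds = 5,
                                          nRepeats = 2, nCores = 1)
  str(result, max.level = 1)
}
```

A small nRepeats keeps the run short here; the default of 20 gives more stable estimates at a higher computational cost.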
Arguments
- measurements
A list of either DataFrame, data.frame or matrix class measurements.
- outcomes
A list of vectors that respectively correspond to outcomes of the samples in the measurements list.
- nFeatures
The number of features to be used for modelling.
- selectionMethod
Default: "auto". A character keyword of the feature selection algorithm to be used. If "auto", t-test (two categories) / F-test (three or more categories) ranking and top nFeatures optimisation is done for a classification task. For a survival task, the ranking method is per-feature Cox proportional hazards p-value.
- selectionOptimisation
Default: "Resubstitution". A character keyword, one of "Resubstitution", "Nested CV" or "none", specifying the approach used to optimise nFeatures.
- trainType
Default: "modelTrain". A keyword specifying whether a fully trained model is used to make predictions on the test set or if only the feature identifiers are chosen using the training data set and a number of training-predictions are made by cross-validation in the test set.
- performanceType
Default: "auto". If "auto", then balanced accuracy for classification or C-index for survival. Otherwise, any one of the options described in calcPerformance may be specified.
- doRandomFeatures
Default: FALSE. Whether to perform random feature selection to establish a baseline performance. Either FALSE or TRUE are permitted values.
- classifier
Default: "auto". A character keyword of the modelling algorithm to be used. If "auto", then a random forest is used for a classification task or Cox proportional hazards model for a survival task.
- nFolds
A numeric specifying the number of folds to use for cross-validation.
- nRepeats
A numeric specifying the number of repeats or permutations to use for cross-validation.
- nCores
A numeric specifying the number of cores used if the user wants to use parallelisation.
- verbose
Default: 0. A number between 0 and 3 for the amount of progress messages to give. A higher number will produce more messages as more lower-level functions print messages.
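The pairwise idea behind the arguments above can be sketched in base R. This is not ClassifyR's implementation: the helper names are hypothetical, a nearest-centroid classifier stands in for the random forest default, the data are simulated, and no repeated cross-validation is done, so the same-data-set cells are optimistic resubstitution estimates. The sketch ranks features by t-test p-value on each training data set, keeps the top nFeatures, trains a model, and scores it on every data set by balanced accuracy.

```r
set.seed(1)

# Hypothetical helper: simulate one data set, samples in rows, features in columns.
simulateDataset <- function(nSamples = 40, nFeatures = 30, shift = 1.5) {
  classes <- factor(rep(c("A", "B"), each = nSamples / 2))
  values <- matrix(rnorm(nSamples * nFeatures), nrow = nSamples,
                   dimnames = list(NULL, paste0("Feature", seq_len(nFeatures))))
  # The first five features differ between the two classes.
  values[classes == "B", 1:5] <- values[classes == "B", 1:5] + shift
  list(measurements = values, outcomes = classes)
}
datasets <- list(Study1 = simulateDataset(), Study2 = simulateDataset())

# Rank features by two-sample t-test p-value and keep the top nFeatures.
selectTop <- function(measurements, outcomes, nFeatures) {
  first <- outcomes == levels(outcomes)[1]
  pValues <- apply(measurements, 2, function(feature)
    t.test(feature[first], feature[!first])$p.value)
  names(sort(pValues))[seq_len(nFeatures)]
}

# Nearest-centroid classifier, a simple stand-in for a random forest.
trainCentroids <- function(measurements, outcomes, features) {
  centroids <- sapply(levels(outcomes), function(class)
    colMeans(measurements[outcomes == class, features, drop = FALSE]))
  list(centroids = centroids, features = features)
}
predictCentroids <- function(model, measurements) {
  distances <- sapply(colnames(model$centroids), function(class) {
    centred <- sweep(measurements[, model$features, drop = FALSE], 2,
                     model$centroids[, class])
    rowSums(centred^2)
  })
  colnames(distances)[apply(distances, 1, which.min)]
}

# Balanced accuracy: the mean of the per-class recalls.
balancedAccuracy <- function(actual, predicted) {
  mean(sapply(levels(actual), function(class)
    mean(predicted[actual == class] == class)))
}

# Train on each data set (rows) and test on every data set (columns).
performance <- matrix(NA_real_, length(datasets), length(datasets),
                      dimnames = list(train = names(datasets), test = names(datasets)))
for (trainName in names(datasets)) {
  train <- datasets[[trainName]]
  features <- selectTop(train$measurements, train$outcomes, nFeatures = 5)
  model <- trainCentroids(train$measurements, train$outcomes, features)
  for (testName in names(datasets)) {
    test <- datasets[[testName]]
    predicted <- predictCentroids(model, test$measurements)
    performance[trainName, testName] <- balancedAccuracy(test$outcomes, predicted)
  }
}
print(round(performance, 2))
```

Off-diagonal cells show how well a model trained on one data set transfers to another, which is the comparison a pairwise design is meant to expose.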