Perform a Single Classification — runTest • ClassifyR

Runs the classification process once for a data set of features and samples. The process consists of data transformation, feature selection, classifier training and testing.

Usage

# S4 method for class 'matrix'
runTest(measurementsTrain, outcomeTrain, measurementsTest, outcomeTest, ...)

# S4 method for class 'DataFrame'
runTest(
  measurementsTrain,
  outcomeTrain,
  measurementsTest,
  outcomeTest,
  crossValParams = CrossValParams(),
  modellingParams = ModellingParams(),
  characteristics = S4Vectors::DataFrame(),
  ...,
  verbose = 1,
  .iteration = NULL
)

# S4 method for class 'MultiAssayExperiment'
runTest(measurementsTrain, measurementsTest, outcomeColumns, ...)

Arguments

measurementsTrain: Either a matrix, DataFrame or MultiAssayExperiment containing the training data. For a matrix or DataFrame, the rows are samples, and the columns are features.
...: Variables used by neither the matrix nor the MultiAssayExperiment method, which are passed into and used by the DataFrame method or passed onwards to prepareData.
outcomeTrain: Either a factor vector of classes, a Surv object, or a character string (or vector of such strings) naming the column(s) that contain either classes or survival time and event information. If column names of survival information are given, time must be in the first column and event status in the second.
measurementsTest: Same data type as measurementsTrain, but only the test samples.
outcomeTest: Same data type as outcomeTrain, but for only the test samples.
crossValParams: An object of class CrossValParams, specifying the kind of cross-validation to be done, if nested cross-validation is used to tune any parameters.
modellingParams: An object of class ModellingParams, specifying the class rebalancing, transformation (if any), feature selection (if any), training and prediction to be done on the data set.
characteristics: A DataFrame describing the characteristics of the classification used. First column must be named "characteristic" and second column must be named "value". Useful for automated plot annotation by plotting functions within this package. When the transformation, selection and prediction functions provided by this package are used, the characteristics are determined automatically and this argument can be left blank.
verbose: Default: 1. A number between 0 and 3 for the amount of progress messages to give. A higher number will produce more messages as more lower-level functions print messages.
.iteration: Not to be set by a user. This value is used to keep track of the cross-validation iteration, if called by runTests.
outcomeColumns: If measurementsTrain is a MultiAssayExperiment, the names of the column (class) or columns (survival) in the table extracted by colData(data) that contain(s) the samples' outcome to use for prediction.
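The accepted forms of the outcome and characteristics arguments can be illustrated with a minimal sketch. All sample values, class labels and column names below are hypothetical and chosen only for illustration. The column names of the characteristics table follow those printed for a ClassifyResult in the Examples.

```r
library(S4Vectors)  # provides DataFrame
library(survival)   # provides Surv

# Hypothetical class outcome for four training samples.
classOutcome <- factor(c("Healthy", "Asthma", "Asthma", "Healthy"))

# Hypothetical survival outcome: follow-up time and event status (1 = event).
survivalOutcome <- Surv(time = c(5, 12, 9, 20), event = c(1, 0, 1, 0))

# Outcome given by column names instead; time first, then event status.
# "time" and "event" are assumed column names, not ones the package requires.
outcomeByName <- c("time", "event")

# Hypothetical user-supplied annotation in the two-column layout.
characteristics <- DataFrame(
  characteristic = c("Data Set", "Analysis Version"),
  value = c("Asthma", "Draft 1")
)
```

Any one of the three outcome objects could be supplied as outcomeTrain, with an object of the same type supplied as outcomeTest.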

Value

If called directly by the user rather than being used internally by runTests, a ClassifyResult object is returned. Otherwise, a list of different aspects of the result is returned, which is passed back to runTests.

Details

This function only performs one classification and prediction. See runTests for a driver function that enables a number of different cross-validation schemes to be applied and uses this function to perform each iteration.
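The relationship between a driver and a single classification can be sketched in base R. This is not how runTests is implemented. It is an illustrative k-fold split under assumed values (nSamples and nFolds are hypothetical), where each pair of index vectors corresponds to one call of the kind runTest handles.

```r
set.seed(1)
nSamples <- 20  # hypothetical number of samples
nFolds <- 5     # hypothetical number of folds

# Randomly assign each sample to one fold, keeping fold sizes equal.
foldOfSample <- sample(rep(seq_len(nFolds), length.out = nSamples))

# For each fold, the samples in it are the test set and the rest are training.
splits <- lapply(seq_len(nFolds), function(fold) {
  list(train = which(foldOfSample != fold),
       test = which(foldOfSample == fold))
})

# One iteration would then resemble:
# runTest(measurements[splits[[1]]$train, ], classes[splits[[1]]$train],
#         measurements[splits[[1]]$test, ], classes[splits[[1]]$test])
```

Each sample appears in exactly one test set, so the predictions from all iterations together cover the whole data set once.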

Author

Dario Strbenac

Examples


library(ClassifyR)

# Load the example data set, which provides measurements and classes.
data(asthma)

# Tune the number of selected features by resubstitution.
CVparams <- CrossValParams(tuneMode = "Resubstitution")
tuneList <- list(nFeatures = seq(5, 25, 5))
attr(tuneList, "performanceType") <- "Balanced Error"
selectParams <- SelectParams("limma", tuneParams = tuneList)
modellingParams <- ModellingParams(selectParams = selectParams)

# Odd-numbered samples form the training set, even-numbered the test set.
trainIndices <- seq(1, nrow(measurements), 2)
testIndices <- seq(2, nrow(measurements), 2)

runTest(measurements[trainIndices, ], classes[trainIndices],
        measurements[testIndices, ], classes[testIndices],
        crossValParams = CVparams, modellingParams = modellingParams)
#> An object of class 'ClassifyResult'.
#> Characteristics:
#>     characteristic             value
#>               topN                 5
#>  Balanced Accuracy 0.852941176470588
#>     Selection Name  Moderated t-test
#>    Classifier Name      Diagonal LDA
#>   Cross-validation   Independent Set
#> Features: List of length 1 of feature identifiers.
#> Predictions: A data frame of 95 rows.
#> Performance Measures: None calculated yet.