
Collects and checks the parameters required for cross-validation by runTests.

Usage

CrossValParams(
  samplesSplits = c("Permute k-Fold", "Permute Percentage Split", "Leave-k-Out",
    "k-Fold"),
  permutations = 100,
  percentTest = 25,
  folds = 5,
  leave = 2,
  tuneMode = c("none", "Resubstitution", "Nested CV"),
  performanceType = "auto",
  adaptiveResamplingDelta = NULL,
  parallelParams = bpparam()
)

Arguments

samplesSplits

Default: "Permute k-Fold". A character value specifying the kind of sample splitting to do. One of "Permute k-Fold", "Permute Percentage Split", "Leave-k-Out" or "k-Fold".

permutations

Default: 100. Number of times to permute the data set before it is split into training and test sets. Only relevant if samplesSplits is either "Permute k-Fold" or "Permute Percentage Split".

percentTest

Default: 25. The percentage of the data set to assign to the test set, with the remainder of the samples belonging to the training set. Only relevant if samplesSplits is "Permute Percentage Split".

folds

Default: 5. The number of approximately equal-sized folds to partition the samples into. Only relevant if samplesSplits is "Permute k-Fold" or "k-Fold".

leave

Default: 2. The number of samples to leave out as the test set. Every possible combination of this many samples is used as a test set in turn. Only relevant if samplesSplits is "Leave-k-Out". If set to 1, this is traditional leave-one-out cross-validation, sometimes written as LOOCV.

tuneMode

Default: "none". The cross-validation scheme to use for selecting any tuning parameters. Valid values are "none", "Resubstitution" and "Nested CV".

performanceType

Default: "auto". The performance metric to use if tuneMode is not "none".

adaptiveResamplingDelta

Default: NULL. If not NULL, adaptive resampling of training samples is performed. The iterative process stops once the class probability or risk of every sample changes by less than this number between consecutive iterations. 0.01 was used in the original publication.

parallelParams

An instance of BiocParallelParam specifying the kind of parallelisation to use. The default is that of bpparam: two fewer cores than the computer has if it has four or more cores, otherwise one core. To make results fully reproducible, choose a specific back-end suited to your operating system and set its RNGseed to a number.
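
The splitting schemes named by samplesSplits can be pictured with a few lines of base R. This is only a sketch to show what permutations, folds, percentTest and leave control. It is not ClassifyR's internal implementation, and the sample count, the rounding rule for the test set size and all variable names are assumptions made for illustration.

```r
# Illustration only: base R, not ClassifyR's internal implementation.
set.seed(1)
n_samples <- 23          # hypothetical data set size

# "Permute k-Fold": shuffle the samples, cut them into folds, and repeat.
permutations <- 3
folds <- 5
fold_id <- rep(seq_len(folds), length.out = n_samples)   # fold sizes differ by at most one
permuted_folds <- lapply(seq_len(permutations), function(p) {
  split(sample(n_samples), fold_id)                       # test-set indices, one element per fold
})

# "Permute Percentage Split": hold out a fixed percentage of randomly chosen samples.
percent_test <- 25
n_test <- round(n_samples * percent_test / 100)           # rounding rule is an assumption
percentage_tests <- lapply(seq_len(permutations), function(p) sample(n_samples, n_test))

# "Leave-k-Out": every combination of `leave` samples is a test set exactly once.
leave <- 2
leave_out_tests <- combn(n_samples, leave, simplify = FALSE)

length(permuted_folds) * folds   # number of train/test splits for "Permute k-Fold"
length(leave_out_tests)          # equals choose(n_samples, leave)
```

The last line shows why "Leave-k-Out" becomes expensive quickly: the number of test sets grows as choose(n, k), whereas the permuted schemes always produce a fixed number of splits chosen by the user.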

Author

Dario Strbenac

Examples


  CrossValParams() # Default is 100 permutations, each split into 5 folds.
#> An object of class "CrossValParams"
#> Slot "samplesSplits":
#> [1] "Permute k-Fold"
#> 
#> Slot "permutations":
#> [1] 100
#> 
#> Slot "percentTest":
#> NULL
#> 
#> Slot "folds":
#> [1] 5
#> 
#> Slot "leave":
#> NULL
#> 
#> Slot "tuneMode":
#> [1] "none"
#> 
#> Slot "performanceType":
#> [1] "auto"
#> 
#> Slot "adaptiveResamplingDelta":
#> NULL
#> 
#> Slot "parallelParams":
#> class: MulticoreParam
#>   bpisup: FALSE; bpnworkers: 30; bptasks: 0; bpjobname: BPJOB
#>   bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
#>   bpRNGseed: ; bptimeout: NA; bpprogressbar: FALSE
#>   bpexportglobals: TRUE; bpexportvariables: FALSE; bpforceGC: FALSE
#>   bpfallback: TRUE
#>   bplogdir: NA
#>   bpresultdir: NA
#>   cluster type: FORK
#> 
  snow <- SnowParam(workers = 2, RNGseed = 999)
  CrossValParams("Leave-k-Out", leave = 2, parallelParams = snow)
#> An object of class "CrossValParams"
#> Slot "samplesSplits":
#> [1] "Leave-k-Out"
#> 
#> Slot "permutations":
#> NULL
#> 
#> Slot "percentTest":
#> NULL
#> 
#> Slot "folds":
#> NULL
#> 
#> Slot "leave":
#> [1] 2
#> 
#> Slot "tuneMode":
#> [1] "none"
#> 
#> Slot "performanceType":
#> [1] "auto"
#> 
#> Slot "adaptiveResamplingDelta":
#> NULL
#> 
#> Slot "parallelParams":
#> class: SnowParam
#>   bpisup: FALSE; bpnworkers: 2; bptasks: 0; bpjobname: BPJOB
#>   bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
#>   bpRNGseed: 999; bptimeout: NA; bpprogressbar: FALSE
#>   bpexportglobals: TRUE; bpexportvariables: TRUE; bpforceGC: FALSE
#>   bpfallback: TRUE
#>   bplogdir: NA
#>   bpresultdir: NA
#>   cluster type: SOCK
#> 
  # The call above gives fully reproducible leave-2-out cross-validation on 2 workers,
  # even if feature selection or the classifier uses random sampling.
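
One possible way to choose a back-end by operating system, as the parallelParams description suggests, is sketched below. It assumes BiocParallel and ClassifyR are installed. The worker count and seed are arbitrary illustrative values. Forked workers (MulticoreParam) are not available on Windows, so socket workers (SnowParam) are used there.

```r
library(BiocParallel)
library(ClassifyR)

# Assumption: 2 workers and seed 999 are arbitrary illustrative values.
backend <- if (.Platform$OS.type == "windows") {
  SnowParam(workers = 2, RNGseed = 999)       # socket cluster, works on all platforms
} else {
  MulticoreParam(workers = 2, RNGseed = 999)  # forked workers, Unix-alikes only
}

params <- CrossValParams("Leave-k-Out", leave = 2, parallelParams = backend)
```

Setting RNGseed on the back-end, rather than calling set.seed beforehand, is what lets the workers draw reproducible random number streams.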