Introduction

You may have a new transformation, selection, or modelling algorithm that is not part of ClassifyR but could be useful to others. This guide explains the steps to complete before making a pull request to have the new functionality added.

The core framework dispatches the data as a DataFrame object, so the new function must accept a DataFrame as its first argument.
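As a rough illustration of that requirement, the sketch below checks the first argument before any modelling is attempted. The helper name checkMeasurements is hypothetical and not part of ClassifyR, and the example call uses a base data.frame purely as a stand-in so that the snippet runs without Bioconductor packages installed.

```r
# Hypothetical helper, not part of ClassifyR: stop early if the first
# argument is not tabular. inherits() also follows S4 inheritance, so an
# S4Vectors DataFrame passes the first test.
checkMeasurements <- function(measurementsTrain)
{
  isTabular <- inherits(measurementsTrain, "DataFrame") || is.data.frame(measurementsTrain)
  if(!isTabular)
    stop("measurementsTrain must be a DataFrame (or a data.frame).")
  invisible(TRUE)
}

# Stand-in input: a base data.frame, used only so this runs without S4Vectors.
measurements <- data.frame(geneA = c(1.2, 0.4, 2.2), geneB = c(0.1, 0.9, 0.5))
checkMeasurements(measurements)
```

A check of this kind at the top of a new function gives a clear error message if the function is ever called outside the framework with the wrong input type.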

Steps to Add a New Function

  1. Define an ordinary R function and attach a name to it. This will be useful in automated annotation of modelling results later. A basic outline is:
newFancyClassifier <- function(measurementsTrain, classesTrain, alpha, ..., verbose = 3)
{
  # Build a model.  
}
attr(newFancyClassifier, "name") <- "newFancyClassifier"
  2. Open the file constants.R. In the .ClassifyRenvir[["functionsTable"]] two-column matrix, add a row specifying the function name and a nice title to use for plot labelling. For example,
.ClassifyRenvir[["functionsTable"]] <- matrix(
    ...        ...
"newFancyClassifier", "Fancy Classifier"),
    ...        ...
  3. Add a keyword that users will specify when they want to use the function in cross-validation, along with a short one-sentence description, to one of the two-column keyword matrices. In the example, the new function is a classifier, so it is added to the end of the "classifyKeywords" matrix.
.ClassifyRenvir[["classifyKeywords"]] <- matrix(
    ...        ...
"fancy", "Very good classifier that predicts amazingly well."),
    ...        ...
  4. Depending on what kind of functionality the new function provides, add an entry to either the .selectionKeywordToFunction or the .classifierKeywordToParams function in utilities.R. The first function maps a feature selection keyword to a wrapper function, and the second maps a classifier keyword to a pair of TrainParams and PredictParams objects, which specify the functions used in the training and prediction stages.

  5. If the function is a classifier, open simpleParams.R and define the TrainParams and PredictParams parameter sets there. If the classifier has any tuning parameters, specify a small range of values to try, because crossValidate, for simplicity, accepts only a keyword from the user.

  6. Open the vignette in vignettes/ClassifyR.Rmd and look for the section heading #### Provided Methods for Feature Selection and Classification. Add the new function to the appropriate table.
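The registration and keyword-mapping steps can be sketched in a self-contained way as follows. This is only an illustration under stated assumptions: registry stands in for the package's internal .ClassifyRenvir environment, the column names and placeholder rows are invented, and keywordToClassifier is a hypothetical simplification of the keyword-mapping functions in utilities.R, which in the package return parameter objects rather than a bare function.

```r
# Stand-in for the package's internal environment (an assumption for this sketch).
registry <- new.env()
registry[["functionsTable"]] <- matrix(
  c("existingClassifier", "Existing Classifier"),        # invented placeholder row
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("functionName", "niceName")))  # column names are assumed
registry[["classifyKeywords"]] <- matrix(
  c("existing", "An invented placeholder keyword."),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("keyword", "description")))

# Define the function and attach its name.
newFancyClassifier <- function(measurementsTrain, classesTrain, alpha, ..., verbose = 3)
{
  # Build a model (omitted in this sketch).
  NULL
}
attr(newFancyClassifier, "name") <- "newFancyClassifier"

# Add a row pairing the function name with a plot-friendly title.
registry[["functionsTable"]] <- rbind(registry[["functionsTable"]],
                                      c("newFancyClassifier", "Fancy Classifier"))

# Add a keyword and its one-sentence description.
registry[["classifyKeywords"]] <- rbind(registry[["classifyKeywords"]],
                                        c("fancy", "Very good classifier that predicts amazingly well."))

# Hypothetical keyword lookup; the unnamed final switch() argument is the default.
keywordToClassifier <- function(keyword)
{
  switch(keyword,
         fancy = newFancyClassifier,
         stop("Unknown classifier keyword: ", keyword))
}

chosen <- keywordToClassifier("fancy")
attr(chosen, "name")
```

Because the function carries its own "name" attribute, whatever the keyword lookup returns can still be traced back to its row in the functions table.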

Now that the new function is defined and described, users will be able to discover it using available() at the R command line. Results will automatically store nice names for the functions used, and those names can be shown in plots without further work.
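One way to picture the automatic naming is a lookup from a function's "name" attribute to its title in the functions table. The sketch below is an assumption about the general idea, not ClassifyR's actual code; functionsTable, its column names, and the helper niceName are invented for illustration.

```r
# Invented two-column table containing the example row from this guide.
functionsTable <- matrix(
  c("newFancyClassifier", "Fancy Classifier"),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("functionName", "niceName")))

newFancyClassifier <- function(measurementsTrain, classesTrain, ...) NULL
attr(newFancyClassifier, "name") <- "newFancyClassifier"

# Hypothetical helper: translate a function's "name" attribute into a plot label,
# falling back to the raw name when the function was never registered.
niceName <- function(aFunction, lookup = functionsTable)
{
  rawName <- attr(aFunction, "name")
  if(is.null(rawName)) return(NA_character_)
  rowIndex <- match(rawName, lookup[, "functionName"])
  if(is.na(rowIndex)) rawName else unname(lookup[rowIndex, "niceName"])
}

niceName(newFancyClassifier)
```

The fallback to the raw name is a design choice of this sketch: a function that was named but never added to the table would still produce a usable, if less polished, label.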