An object containing a dataset and methods for evaluating analytical tasks against ground truths for the dataset.

Value

A Trio object

Public fields

data

The data

auxData

The auxiliary data attached to the dataset

metrics

The metrics for evaluating tasks against the gold standards

cachePath

The path to the data cache

dataSource

The data repository that the data were retrieved from

dataSourceID

The dataset ID for dataSource

splitIndices

Indices for cross-validation

splitSeed

The seed used to generate the split indices

verbose

Set the verbosity of Trio. Defaults to FALSE.

Methods


Method new()

Create a Trio object

Usage

Trio$new(
  datasetID = NULL,
  data = NULL,
  dataLoader = NULL,
  cachePath = FALSE,
  verbose = FALSE
)

Arguments

datasetID

A string specifying a dataset, either a name from curated-trio-data or a format string of the form source:source_id.

data

An object to use as the Trio dataset.

dataLoader

A custom loading function that takes the path of a downloaded file and returns a single dataset, ready to be used in evaluation tasks.

cachePath

The path to the data cache

verbose

Set the verbosity of Trio. Defaults to FALSE.
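
The source:source_id form of datasetID can be illustrated with a small base R sketch. The helper parse_dataset_id() below is hypothetical and is not part of Trio; it only shows one way such a string could be separated into a source and an ID, and it assumes that a string without a colon is a curated dataset name.

```r
# Hypothetical helper for illustration only; not Trio's own parsing code.
parse_dataset_id <- function(datasetID) {
  # Locate the first colon so that any later colons stay in the ID.
  pos <- as.integer(regexpr(":", datasetID, fixed = TRUE))
  if (pos < 0) {
    # Assumption: no colon means a curated dataset name.
    return(list(source = NULL, id = datasetID))
  }
  list(
    source = substr(datasetID, 1, pos - 1),
    id = substr(datasetID, pos + 1, nchar(datasetID))
  )
}

parsed <- parse_dataset_id("figshare:26054188/47112109")
```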


Method addAuxData()

Add a gold standard to the Trio.

Usage

Trio$addAuxData(name, auxData, metrics, args = NULL)

Arguments

name

A string specifying the name of the gold standard.

auxData

The auxiliary data. An object to be compared or a function to be run on the data.

metrics

A list of one or more metric names used to compare the gold standard with the input to evaluate.

args

A named list of parameters and values to be passed to the function.


Method addMetric()

Add a metric to the Trio.

Usage

Trio$addMetric(name, metric, args = NULL)

Arguments

name

A string specifying the name of the metric.

metric

The metric. A function that compares the input to evaluate with the gold standard. It should be of the form f(x, y, ...), where x is the "truth" and y is the output to be evaluated. If the desired metric has a different signature, supply a wrapper function instead.

args

A named list of parameters and values to be passed to the function.
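
As a minimal sketch of the f(x, y, ...) convention, the base R code below defines one metric that already has the expected form and one wrapper around a metric whose arguments are in the opposite order. The functions accuracy(), recall_pred_first(), and recall() are hypothetical names made up for this illustration, as is the positive parameter.

```r
# A metric already in the expected form: x is the truth, y is the output.
accuracy <- function(x, y, ...) {
  mean(x == y)
}

# Hypothetical existing metric that takes the prediction first.
recall_pred_first <- function(predicted, actual, positive) {
  true_pos <- sum(predicted == positive & actual == positive)
  true_pos / sum(actual == positive)
}

# Wrapper that adapts it to f(x, y, ...), with x as the truth.
recall <- function(x, y, positive = "a", ...) {
  recall_pred_first(predicted = y, actual = x, positive = positive)
}

truth <- c("a", "b", "a", "b")
pred <- c("a", "b", "b", "b")
acc <- accuracy(truth, pred)
rec <- recall(truth, pred, positive = "a")
```

Given the usage shown above, and assuming a Trio object named trio, such a wrapper could presumably be registered with trio$addMetric("recall", recall, args = list(positive = "a")).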


Method getMetrics()

Get metrics by gold standard name.

Usage

Trio$getMetrics(auxDataName)

Arguments

auxDataName

A string specifying the name of the gold standard.


Method getAuxData()

Get auxiliary data by name.

Usage

Trio$getAuxData(name)

Arguments

name

A string specifying the name of the auxiliary data.


Method evaluate()

Evaluate against gold standards

Usage

Trio$evaluate(input, splitIndex = NULL)

Arguments

input

A named list of objects to be evaluated against gold standards.

splitIndex

An optional index for subsetting data during evaluation using the indices created by the split method.


Method split()

Create cross-validation indices.

Usage

Trio$split(
  y,
  n_fold = 5L,
  n_repeat = 1L,
  stratify = TRUE,
  seed = NULL,
  overwrite = FALSE
)

Arguments

y

A variable to use for stratified sampling. If stratify is FALSE, a vector with the same length as the data.

n_fold

Number of folds. Defaults to 5L.

n_repeat

Number of repeats. Defaults to 1L.

stratify

If TRUE, uses stratified sampling. Defaults to TRUE.

seed

An optional seed for split generation. Defaults to NULL. If NULL, the seed is set to the current time.

overwrite

If TRUE, overwrites the current split. Defaults to FALSE.
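
To make the roles of y, n_fold, stratify, and seed concrete, here is a base R sketch of stratified fold assignment. It is not the implementation behind split(). The helper make_folds() is hypothetical, it ignores n_repeat, and it leaves the random state untouched when seed is NULL rather than seeding from the current time.

```r
# Hypothetical helper for illustration only; not Trio's implementation.
make_folds <- function(y, n_fold = 5L, stratify = TRUE, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  fold_id <- integer(length(y))
  # Without stratification, treat all observations as a single group.
  groups <- if (stratify) split(seq_along(y), y) else list(seq_along(y))
  for (idx in groups) {
    # Shuffle within the group, then deal fold labels round-robin.
    shuffled <- idx[sample.int(length(idx))]
    fold_id[shuffled] <- rep_len(seq_len(n_fold), length(shuffled))
  }
  # Return one vector of held-out indices per fold.
  unname(split(seq_along(y), factor(fold_id, levels = seq_len(n_fold))))
}

y <- rep(c("case", "control"), times = c(10, 20))
folds <- make_folds(y, n_fold = 5L, seed = 1)
```

With stratification, each fold in this sketch receives the same share of every level of y, so class proportions in the held-out sets match those of the full data.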


Method print()

Print method to display key information about the Trio object.

Usage

Trio$print()


Method clone()

The objects of this class are cloneable with this method.

Usage

Trio$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

trio <- Trio$new("figshare:26054188/47112109", cachePath = tempdir())