Transform a Table of Feature Abundances into a Table of Feature Set Abundances.
Represents each feature set by the mean or median measurement of all features belonging to it.
Usage
# S4 method for matrix
featureSetSummary(
  measurements,
  location = c("median", "mean"),
  featureSets,
  minimumOverlapPercent = 80,
  verbose = 3
)
# S4 method for DataFrame
featureSetSummary(
  measurements,
  location = c("median", "mean"),
  featureSets,
  minimumOverlapPercent = 80,
  verbose = 3
)
# S4 method for MultiAssayExperiment
featureSetSummary(
  measurements,
  target = NULL,
  location = c("median", "mean"),
  featureSets,
  minimumOverlapPercent = 80,
  verbose = 3
)
Arguments
- measurements
Either a matrix, DataFrame or MultiAssayExperiment containing the training data. For a matrix, the rows are samples, and the columns are features. If of type DataFrame or MultiAssayExperiment, the data set is subset to only those features of type numeric.
- location
Default: "median". The measure of location, either "median" or "mean", used to summarise the features belonging to a feature set.
- featureSets
An object of type FeatureSetCollection which defines the feature sets.
- minimumOverlapPercent
Default: 80. The minimum percentage of overlapping features between the data set and a feature set defined in featureSets for that feature set to not be discarded from the analysis.
- verbose
Default: 3. A number between 0 and 3 for the amount of progress messages to give. This function only prints progress messages if the value is 3.
- target
If the input is a MultiAssayExperiment, this specifies which data set will be transformed. Can either be an integer index or a character string specifying the name of the table. Must have length 1.
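The filtering controlled by minimumOverlapPercent can be illustrated with a minimal base R sketch. It assumes the percentage is taken relative to the size of each feature set and that the threshold comparison is inclusive; the package's own calculation may differ in either respect. The helper overlapPercent and the Transport set are hypothetical and exist only for this illustration.

```r
# Hypothetical helper, not part of the package: the percentage of a feature
# set's members that are present among the data set's feature names.
overlapPercent <- function(setFeatures, dataFeatures)
  100 * mean(setFeatures %in% dataFeatures)

dataFeatures <- paste("Gene", 1:10)
sets <- list(Adhesion = c("Gene 1", "Gene 2", "Gene 3"),
             Transport = c("Gene 9", "Gene 10", "Gene 11", "Gene 12"))

# Adhesion is fully covered by the data; Transport is only half covered.
overlaps <- sapply(sets, overlapPercent, dataFeatures = dataFeatures)

# Assumed inclusive threshold, mirroring minimumOverlapPercent = 80.
keptSets <- sets[overlaps >= 80]
names(keptSets)
```

Under these assumptions only the Adhesion set would be kept, because half of the Transport features are missing from the data set.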
Value
An object of the same class as the input variable measurements, with the individual features summarised to feature sets. The number of samples remains unchanged, so only one dimension of measurements is altered.
Details
This feature transformation method is unusual because the mean or median feature of a feature set for one sample may be different to another sample, whereas most other feature transformation methods do not result in different features being compared between samples during classification.
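The point about different features being compared can be seen in a toy base R sketch. It does not call the package. It simply takes the row-wise median of a single three-feature set and reports which feature supplied that median for each sample. The data values are made up for illustration.

```r
# Toy data: 2 samples (rows) by 3 features (columns), all in one feature set.
# matrix() fills column by column, so Patient 1 has 9, 8, 7 and
# Patient 2 has 7, 6, 8.
measurements <- matrix(c(9, 7,
                         8, 6,
                         7, 8),
                       nrow = 2,
                       dimnames = list(c("Patient 1", "Patient 2"),
                                       c("Gene 1", "Gene 2", "Gene 3")))

# Summarise the set by its median within each sample.
setMedians <- apply(measurements, 1, median)

# With an odd number of features the median is one of the observed values,
# so the feature that supplied it can be looked up for each sample.
medianFeature <- colnames(measurements)[
  sapply(seq_len(nrow(measurements)),
         function(i) which(measurements[i, ] == setMedians[i])[1])]
names(medianFeature) <- rownames(measurements)

setMedians
medianFeature
```

Here the summary for Patient 1 comes from Gene 2 and the summary for Patient 2 comes from Gene 1, so a classifier using the summarised values is effectively comparing different genes between the two samples.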
References
Network-based biomarkers enhance classical approaches to prognostic gene expression signatures, Rebecca L Barter, Sarah-Jane Schramm, Graham J Mann and Yee Hwa Yang, 2014, BMC Systems Biology, Volume 8 Supplement 4 Article S5, https://bmcsystbiol.biomedcentral.com/articles/10.1186/1752-0509-8-S4-S5.
Examples
sets <- list(Adhesion = c("Gene 1", "Gene 2", "Gene 3"),
             `Cell Cycle` = c("Gene 8", "Gene 9", "Gene 10"))
featureSets <- FeatureSetCollection(sets)
# Adhesion genes have a median gene difference between classes.
genesMatrix <- matrix(c(rnorm(5, 9, 0.3), rnorm(5, 7, 0.3), rnorm(5, 8, 0.3),
                        rnorm(5, 6, 0.3), rnorm(10, 7, 0.3), rnorm(70, 5, 0.1)),
                      nrow = 10)
rownames(genesMatrix) <- paste("Patient", 1:10)
colnames(genesMatrix) <- paste("Gene", 1:10)
classes <- factor(rep(c("Poor", "Good"), each = 5)) # But not used for transformation.
featureSetSummary(genesMatrix, featureSets = featureSets)
#> Summarising features to feature sets.
#>            Adhesion Cell Cycle
#> Patient 1  8.343683   5.089886
#> Patient 2  7.889564   5.032990
#> Patient 3  8.204776   5.004191
#> Patient 4  7.935472   5.006646
#> Patient 5  7.853649   4.939587
#> Patient 6  6.663139   5.048671
#> Patient 7  7.056860   4.987702
#> Patient 8  7.093365   4.936592
#> Patient 9  6.955994   5.082145
#> Patient 10 7.252357   5.010829