Performs location/scale adjustment on the data used as input to RUVIII, with the option to select the optimal RUVk according to the silhouette coefficient.

scRUVIII(
  Y = Y,
  M = M,
  ctl = ctl,
  fullalpha = NULL,
  k = k,
  cell_type = NULL,
  batch = NULL,
  return_all_RUV = TRUE,
  BPPARAM = SerialParam(),
  BSPARAM = ExactParam(),
  svd_k = 50
)

Arguments

Y

The unnormalised single-cell (SC) data: an m by n matrix, where m is the number of observations and n is the number of features.

M

The replicate mapping matrix. The mapping matrix has m rows (one for each observation), and each column represents a set of replicates. The (i, j)-th entry of the mapping matrix is 1 if the i-th observation is in replicate set j, and 0 otherwise. See ruv::RUVIII for more details.
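As a minimal sketch of the layout described above, the following base R snippet builds a mapping matrix from a vector of replicate-set labels. The labels and the variable names (rep_set, sets, M_example) are hypothetical and chosen only for illustration; they are not part of the scMerge API.

```r
# Hypothetical replicate-set labels for six observations
rep_set <- c("a", "a", "b", "c", "b", "c")
sets <- unique(rep_set)

# Entry (i, j) is 1 when observation i belongs to replicate set j, else 0
M_example <- outer(rep_set, sets, "==") * 1
colnames(M_example) <- sets

dim(M_example)      # one row per observation, one column per replicate set
rowSums(M_example)  # each observation belongs to exactly one set here
```

In this toy case every observation falls in exactly one replicate set, so each row contains a single 1.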

ctl

An index vector to specify the negative controls. Either a logical vector of length n or a vector of integers.
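The two accepted forms of ctl can describe the same set of negative controls. A small base R sketch, assuming a hypothetical feature count n_features and hypothetical control positions:

```r
# Hypothetical setup: 10 features, with features 2, 5 and 7 as negative controls
n_features <- 10
ctl_int <- c(2L, 5L, 7L)

# Equivalent logical form: one TRUE/FALSE entry per feature
ctl_lgl <- seq_len(n_features) %in% ctl_int

# Both forms select the same columns of an m by n matrix
identical(which(ctl_lgl), ctl_int)
```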

fullalpha

Not used. Please ignore.

k

The number of unwanted factors to remove. This is inherited from the ruvK argument of the scMerge::scMerge function.

cell_type

An optional vector indicating the cell type of each cell in the batch-combined matrix. If it is NULL, the pseudo-replicate procedure will be run to identify cell types.

batch

Batch information inherited from the scMerge::scMerge function.

return_all_RUV

Whether to return extra information on the RUV function, inherited from the scMerge::scMerge function.

BPPARAM

A BiocParallelParam class object from the BiocParallel package. Default is SerialParam().

BSPARAM

A BiocSingularParam class object from the BiocSingular package. Default is ExactParam().

svd_k

If BSPARAM is set to a RandomParam or IrlbaParam class object from the BiocSingular package, then svd_k will be used to reduce the computational cost of singular value decomposition. Defaults to 50.

Value

A list consisting of:

  • RUV-normalised matrices: If k has multiple values, then the RUV-normalised matrices using all the supplied k values will be returned.

  • optimal_ruvK: The optimal RUV k value as determined by the silhouette coefficient.
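To illustrate how a silhouette coefficient can rank candidate k values, here is a rough sketch using cluster::silhouette on simulated stand-in matrices. It is not the internal procedure of scRUVIII: the stand-in data, the helper make_mat, and the selection rule (largest average silhouette width over cell-type labels) are all assumptions made for illustration, and the package's actual criterion may use other information such as batch labels.

```r
library(cluster)  # recommended package shipped with R; provides silhouette()

set.seed(1)
# Hypothetical cell-type labels: 3 types, 20 observations each
cell_type <- rep(1:3, each = 20)

# Hypothetical stand-in for a normalised matrix (observations by features);
# a larger `noise` gives less well-separated cell types
make_mat <- function(noise) {
  centers <- matrix(rnorm(3 * 5, sd = 3), nrow = 3)
  centers[cell_type, ] + matrix(rnorm(60 * 5, sd = noise), nrow = 60)
}
candidates <- list(k5 = make_mat(0.5), k10 = make_mat(2), k15 = make_mat(4))

# Average silhouette width of the cell-type labels in each candidate matrix
avg_sil <- vapply(candidates, function(mat) {
  sil <- silhouette(cell_type, dist(mat))
  mean(sil[, "sil_width"])
}, numeric(1))

# Assumed selection rule: keep the candidate with the largest average width
best <- names(avg_sil)[which.max(avg_sil)]
```

Silhouette widths lie between -1 and 1, with larger values indicating that observations sit closer to their own group than to the nearest other group.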

Author

Yingxin Lin, Kevin Wang

Examples

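# Simulate count data with 3 cell types and 2 batches
L = ruvSimulate(m = 200, n = 1000, nc = 100, nCelltypes = 3, nBatch = 2, lambda = 0.1, sce = FALSE)
# Log2-transform and transpose the counts; extract the replicate matrix, controls and batch labels
Y = t(log2(L$Y + 1L)); M = L$M; ctl = L$ctl; batch = L$batch;
# Run RUVIII over several candidate k values and select the optimal one
res = scRUVIII(Y = Y, M = M, ctl = ctl, k = c(5, 10, 15, 20), batch = batch)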
#> Selecting optimal RUVk 
#> No cell type info, replicate matrix will be used as cell type info 
#> optimal ruvK:5