genNEMoE.Rd
This function generates a dataset that follows a mixture distribution, with a configurable sample size and number of variables.
genNEMoE( n = NULL, p = NULL, q = 30, K0 = 2, Sigma = NULL, eta = 0.5, c_g = 1, c_e = 1, s1 = 3, s2 = 4, p_L = c(10, 20, 50), fix_X = NULL, gen_Micro = "zinLDA", prev_filt = 0.3, var_filt = 1e-06, method = "comp", scale = T, link = "probit", beta_max = 100, ... )
Argument | Description |
---|---|
n | Number of samples in the generated dataset. Defaults to 200. |
p | Number of variables for the experts network input. Defaults to 30. |
q | Number of variables for the gating network input. Defaults to 30. |
K0 | Number of latent classes (mixture components) in the dataset. Defaults to 2. |
Sigma | Covariance matrix for the gating network input. If NULL, the identity matrix is used. |
eta | Coefficient of the separation parameter used when generating data for the gating network. Defaults to 0.5. |
c_g | Coefficient of the signal strength parameter used when generating data for the gating network. Defaults to 1. |
c_e | Coefficient of the signal strength parameter used when generating data for the experts network. Defaults to 1. |
s1 | Number of non-zero coefficients in the experts network input. Defaults to 3. |
s2 | Number of non-zero coefficients in the gating network input. Defaults to 4. |
p_L | A numeric vector of length (L-1); each entry gives the number of variables at the corresponding level. Defaults to c(10, 20, 50). |
fix_X | Fixed microbiome input matrix. If NULL, the matrix is generated using the zinLDA model. |
gen_Micro | A character string indicating which model is used to generate the microbiome data. One of "zinLDA" (zero-inflated latent Dirichlet allocation model), "dm" (Dirichlet multinomial model) or "mgauss" (multivariate Gaussian model). Defaults to "zinLDA". |
prev_filt | Prevalence threshold for selecting the variables that have non-zero coefficients. Defaults to 0.3. |
var_filt | Variance threshold for selecting the variables that have non-zero coefficients. Defaults to 1e-6. |
method | The transformation used to construct the relationship in the experts network. If method = "comp", compositional data are used. If method = "asin", arcsine-transformed compositional data are used. If method = "clr", centered log-ratio transformed compositional data are used. Defaults to "comp". |
scale | Logical value indicating whether to use scaled coefficients. Defaults to TRUE. |
link | The link function used to generate the response. Defaults to "probit". |
beta_max | Maximal number of coefficients for the experts network. Defaults to 100. |
... | Other parameters passed on by genNEMoE, e.g. the zinLDA parameters (K = 5, Alpha = 10, Pi = 0.4, a = 0.05, b = 10). |
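The three options for method can be illustrated with a short base R sketch. This is only an illustration, not the package's own code: the count matrix is made up, the arcsine transform is assumed to be the common arcsine square-root form, and the pseudo-count used before taking logs is likewise an assumption.

```r
# Hypothetical count matrix: 4 samples (rows) x 3 taxa (columns).
counts <- matrix(c(10,  0, 30,
                    5,  5, 10,
                    0, 20, 20,
                    8,  1,  1),
                 nrow = 4, byrow = TRUE)

# "comp": convert counts to proportions, so each row sums to 1.
comp <- counts / rowSums(counts)

# "asin": arcsine square-root transform of the proportions
# (assumed form; the package may define it differently).
asin_x <- asin(sqrt(comp))

# "clr": centered log-ratio. A pseudo-count (an assumption here)
# avoids log(0) for taxa that are absent from a sample.
pseudo  <- 0.5
comp_pc <- (counts + pseudo) / rowSums(counts + pseudo)
log_x   <- log(comp_pc)
clr_x   <- log_x - rowMeans(log_x)  # subtract each row's mean log-proportion
```

After the centered log-ratio transform every row sums to zero, which is a quick sanity check when comparing the transformations.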
A list containing the generated microbiome data X, nutrition data W, health response y, coefficients of experts network beta, coefficients of gating network gamma, simulated observed logits pi, simulated latent group latent and simulated response probability y_prob.
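How the returned components relate to one another can be sketched with a toy two-class mixture in base R. This is a minimal illustration under stated assumptions, not the implementation inside genNEMoE: the inputs are plain Gaussian draws rather than zinLDA microbiome data, the gating step uses a logistic function, and the coefficient values are made up. Only the element names, the probit link and the sparsity levels s1 = 3 and s2 = 4 are taken from this page.

```r
set.seed(1)
n <- 200; p <- 30; q <- 30; K0 <- 2

# Stand-ins for the experts input (X) and gating input (W).
X <- matrix(rnorm(n * p), n, p)
W <- matrix(rnorm(n * q), n, q)

# Sparse experts coefficients: s1 = 3 non-zero entries per latent class.
beta <- matrix(0, p, K0)
beta[1:3, 1] <-  1
beta[1:3, 2] <- -1

# Sparse gating coefficients: s2 = 4 non-zero entries.
# With K0 = 2 a single coefficient vector is enough.
gamma <- rep(0, q)
gamma[1:4] <- 1

# Gating network: probability of belonging to class 2 (logistic form assumed).
gate_prob <- drop(plogis(W %*% gamma))
latent <- rbinom(n, 1, gate_prob) + 1   # latent class label, 1 or 2

# Experts network: each sample uses the coefficients of its own latent class,
# and a probit link maps the linear predictor to a response probability.
lin_pred <- X %*% beta                              # n x K0 matrix
y_prob <- pnorm(lin_pred[cbind(seq_len(n), latent)])
y <- rbinom(n, 1, y_prob)                           # binary health response
```

In this sketch table(latent, y) gives a quick view of how the response differs between the two latent classes.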
dat <- genNEMoE(n = 10, p = 10000, q = 30)