genNEMoE.Rd
This function generates data that follow a mixture distribution, with a user-specified sample size and number of variables.
genNEMoE(
  n = NULL,
  p = NULL,
  q = 30,
  K0 = 2,
  Sigma = NULL,
  eta = 0.5,
  c_g = 1,
  c_e = 1,
  s1 = 3,
  s2 = 4,
  p_L = c(10, 20, 50),
  fix_X = NULL,
  gen_Micro = "zinLDA",
  prev_filt = 0.3,
  var_filt = 1e-06,
  method = "comp",
  scale = T,
  link = "probit",
  beta_max = 100,
  ...
)
| Argument | Description |
|---|---|
| n | Number of samples to generate. Documented default is 200 (the usage signature shows n = NULL). |
| p | Number of variables in the experts network input. Documented default is 30 (the usage signature shows p = NULL). |
| q | Number of variables in the gating network input. Default is 30, per the usage signature. |
| K0 | Number of latent-class components in the dataset. Default is 2. |
| Sigma | Covariance matrix for the gating network input. If NULL, the identity matrix is used. |
| eta | Coefficient of the separation parameters used when generating data for the gating network. Default is 0.5. |
| c_g | Coefficient of the signal-strength parameters used when generating data for the gating network. Default is 1. |
| c_e | Coefficient of the signal-strength parameters used when generating data for the experts network. Default is 1. |
| s1 | Number of non-zero coefficients in the experts network input. Default is 3, per the usage signature. |
| s2 | Number of non-zero coefficients in the gating network input. Default is 4, per the usage signature. |
| p_L | A numeric vector of length (L-1); each entry gives the number of variables at the corresponding level. Default is c(10, 20, 50). |
| fix_X | Fixed microbiome input matrix. If NULL, the input is generated using the zinLDA model. |
| gen_Micro | A character string indicating which model is used to generate the microbiome data. One of "zinLDA" (zero-inflated latent Dirichlet allocation model), "dm" (Dirichlet multinomial model) or "mgauss" (multivariate Gaussian model). Default is "zinLDA". |
| prev_filt | Prevalence threshold for selecting the variables that receive non-zero coefficients. Default is 0.3. |
| var_filt | Variance threshold for selecting the variables that receive non-zero coefficients. Default is 1e-6. |
| method | The transformation used to construct the relationship in the experts network. If method = "comp", use compositional data. If method = "asin", use arcsine-transformed compositional data. If method = "clr", use centered log-ratio transformed compositional data. Default is "comp". |
| scale | Logical; whether to use scaled coefficients. Default is TRUE. |
| link | The link method used to generate the response. Default is "probit". |
| beta_max | Maximal number of coefficients for the experts network. Default is 100. |
| ... | Other parameters passed on by genNEMoE, e.g. the zinLDA parameters (K = 5, Alpha = 10, Pi = 0.4, a = 0.05, b = 10). |
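
The three values of method can be illustrated with a small base R sketch. This is not the code used inside genNEMoE: the toy count matrix is hypothetical, and the arcsine square-root and centered log-ratio formulas below are the standard textbook definitions, assumed here rather than taken from the package source.

```r
# Hypothetical toy counts: 4 samples x 5 taxa (not produced by genNEMoE).
# Adding 1 keeps every count positive so that the logarithm below is finite.
set.seed(1)
counts <- matrix(rpois(20, lambda = 5) + 1, nrow = 4, ncol = 5)

# "comp": turn each row into proportions that sum to 1.
comp <- counts / rowSums(counts)

# "asin": arcsine square-root transform of the proportions (assumed definition).
asin_x <- asin(sqrt(comp))

# "clr": centered log-ratio, i.e. log proportion minus the row mean of the logs.
log_comp <- log(comp)
clr_x <- log_comp - rowMeans(log_comp)
```

Each row of comp sums to 1 and each row of clr_x sums to 0, which is a quick way to check that a transformation was applied row-wise.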
A list containing the generated microbiome data X, the nutrition data W, the health response y, the experts network coefficients beta, the gating network coefficients gamma, the simulated observed logits pi, the simulated latent group latent, and the simulated response probability y_prob.
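
To show how the returned components relate to one another, here is a minimal base R sketch of a two-component mixture of experts with a probit response. It is an illustration under stated assumptions, not the genNEMoE implementation: the sizes are hypothetical, the experts input is a Gaussian stand-in for microbiome data, the gating probability uses a logistic function by assumption, and the name pi_gate is used instead of pi to avoid masking R's built-in constant.

```r
set.seed(42)
n <- 200; p <- 30; q <- 20   # hypothetical sample size and dimensions
s1 <- 3; s2 <- 4             # non-zero coefficients in experts / gating input

# Gating input W (identity covariance) and experts input X (Gaussian stand-in).
W <- matrix(rnorm(n * q), n, q)
X <- matrix(rnorm(n * p), n, p)

# Sparse gating coefficients: the first s2 entries are non-zero.
gamma <- c(rep(1, s2), rep(0, q - s2))

# Sparse expert coefficients, one column per latent class (K0 = 2 assumed).
beta <- matrix(0, p, 2)
beta[1:s1, 1] <- 1
beta[1:s1, 2] <- -1

# Gating probability of class 2 (logistic gating is an assumption here).
pi_gate <- drop(plogis(W %*% gamma))
latent <- rbinom(n, 1, pi_gate) + 1      # latent class in {1, 2}

# Probit link: each sample uses the expert that matches its latent class.
lin_pred <- X %*% beta                   # n x 2 linear predictors
y_prob <- pnorm(lin_pred[cbind(seq_len(n), latent)])
y <- rbinom(n, 1, y_prob)                # binary response
```

Collecting X, W, y, beta, gamma, pi_gate, latent and y_prob in a list would mirror the structure of the value described above.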
dat <- genNEMoE(n = 10, p = 10000, q = 30)