Mixed effect neural network: Genotypes -> Unobserved intermediate traits -> Phenotypes
Tip: center the phenotypes so that they have zero mean.
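The centering step can be sketched in plain Julia as follows. This is a minimal illustration on a made-up vector, not part of the JWAS API; the values and the variable names `y`, `ybar`, and `y_centered` are assumptions chosen for the example.

```julia
using Statistics

# Toy phenotype vector with one missing record (values are made up for illustration).
y = [1.2, missing, 0.7, 2.1, 1.0]

# Mean over the observed records only; missing entries are skipped.
ybar = mean(skipmissing(y))

# Subtract the mean from every record; missing entries stay missing.
y_centered = y .- ybar

println(mean(skipmissing(y_centered)))  # zero up to floating-point rounding
```

For a DataFrame with a phenotype column named `y1`, as in the example in this section, the same idea would read `phenotypes.y1 = phenotypes.y1 .- mean(skipmissing(phenotypes.y1))`.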
Example(a): fully-connected neural network, all intermediate traits are unobserved
- nonlinear function (defines the relationship between the middle layer and the phenotype): tanh (other supported activation functions: "sigmoid", "relu", "leakyrelu", "linear")
- number of nodes in the middle layer: 3
- Bayesian model: multiple independent single-trait BayesC (to sample marker effects on intermediate traits). Note: to use multi-trait Bayesian Alphabet models, set `mega_trait=false` in the `runMCMC()` function.
- sampling of the unobserved intermediate traits in the middle layer: Hamiltonian Monte Carlo
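The network structure described in the list above can be sketched as a forward pass in plain Julia. This is only an illustration of the shape of the model (genotypes, then 3 middle nodes, then a tanh-weighted sum for the phenotype). It does not use JWAS and performs no sampling; all sizes, values, and variable names (`M`, `alpha`, `mu`, `w0`, `w`) are assumptions made up for the example, and the residual terms are omitted.

```julia
using Random

Random.seed!(1)

n, p, k = 5, 10, 3             # individuals, markers, middle nodes (toy sizes)
M     = rand(0.0:2.0, n, p)    # toy genotype matrix coded 0/1/2
alpha = 0.1 .* randn(p, k)     # marker effects, one column per middle node
mu    = zeros(k)               # intercept of each middle node
w0    = 0.0                    # bias of the phenotype
w     = randn(k)               # weights between middle nodes and phenotype

# Input layer -> middle layer: each middle node = intercept + genotypes * marker effects
Z = M * alpha .+ mu'

# Middle layer -> output layer: the nonlinear function (tanh) followed by a weighted sum
yhat = w0 .+ tanh.(Z) * w

println(size(Z))       # one row per individual, one column per middle node
println(length(yhat))  # one expected phenotype per individual
```

In the actual analysis the middle-node values are unobserved and are sampled rather than computed directly, so this sketch only shows how the layers connect.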
# Step 1: Load packages
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets,Random
Random.seed!(123)
# Step 2: Read data
phenofile = dataset("phenotypes.csv") #get example data path
genofile = dataset("genotypes.csv") #get example data path
phenotypes = CSV.read(phenofile,DataFrame,delim = ',',header=true,missingstring="NA") #read phenotypes (output layer); "NA" marks missing values
genotypes = get_genotypes(genofile,separator=',',method="BayesC"); #read genotypes (input layer)
# Step 3: Build Model Equations
model_equation = "y1 = intercept + genotypes" #the phenotype is named "y1" in the phenotypes data
#the genotypes are named "genotypes" (user-defined in the previous step)
#the single-trait mixed model used between the input layer and each node in the middle layer is: middle node = intercept + genotypes
model = build_model(model_equation,
num_hidden_nodes=3, #number of nodes in middle layer is 3
nonlinear_function="tanh"); #tanh function is used to approximate relationship between middle layer and phenotype
# Step 4: Run Analysis
out=runMCMC(model,phenotypes,chain_length=5000);
# Step 5: Check Accuracy
results = innerjoin(out["EBV_NonLinear"], phenotypes, on = :ID)
accuracy = cor(results[!,:EBV],results[!,:bv1])
Example output files
The i-th middle node is named "trait name"+"i". In our example, the observed trait is named "y1" and there are 3 middle nodes, so the middle nodes are named "y11", "y12", and "y13", respectively.
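The naming rule can be reproduced with a short snippet. This is a sketch of the rule as stated above, not a call into JWAS; the variable names `trait`, `node_names`, and `ebv_files` are chosen for illustration.

```julia
trait = "y1"            # observed trait name from the model equation
num_hidden_nodes = 3    # number of middle nodes

# Middle-node names: the trait name followed by the node index.
node_names = [trait * string(i) for i in 1:num_hidden_nodes]

# Per-node EBV file names, following the pattern EBV_<node name>.txt used in this section.
ebv_files = ["EBV_" * name * ".txt" for name in node_names]

println(node_names)  # ["y11", "y12", "y13"]
println(ebv_files)   # ["EBV_y11.txt", "EBV_y12.txt", "EBV_y13.txt"]
```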
Below is a list of files containing estimates and standard deviations for variables of interest.
file name | description |
---|---|
`EBV_NonLinear.txt` | estimated breeding values for observed trait |
`EBV_y11.txt` | estimated breeding values for middle node 1 |
`EBV_y12.txt` | estimated breeding values for middle node 2 |
`EBV_y13.txt` | estimated breeding values for middle node 3 |
`genetic_variance.txt` | estimated genetic variance-covariance of all middle nodes |
`heritability.txt` | estimated heritability of all middle nodes |
`location_parameters.txt` | estimated bias of all middle nodes |
`neural_networks_bias_and_weights.txt` | estimated bias of phenotypes and weights between middle nodes and phenotypes |
`pi_genotypes.txt` | estimated pi of all middle nodes |
`marker_effects_genotypes.txt` | estimated marker effects of all middle nodes |
`residual_variance.txt` | estimated residual variance-covariance for all middle nodes |
Below is a list of files containing MCMC samples for variables of interest.
file name | description |
---|---|
`MCMC_samples_EBV_NonLinear.txt` | MCMC samples from the posterior distribution of breeding values for phenotypes |
`MCMC_samples_EBV_y11.txt` | MCMC samples from the posterior distribution of breeding values for middle node 1 |
`MCMC_samples_EBV_y12.txt` | MCMC samples from the posterior distribution of breeding values for middle node 2 |
`MCMC_samples_EBV_y13.txt` | MCMC samples from the posterior distribution of breeding values for middle node 3 |
`MCMC_samples_genetic_variance.txt` | MCMC samples from the posterior distribution of genetic variance-covariance for all middle nodes |
`MCMC_samples_heritability.txt` | MCMC samples from the posterior distribution of heritability for all middle nodes |
`MCMC_samples_marker_effects_genotypes_y11.txt` | MCMC samples from the posterior distribution of marker effects for middle node 1 |
`MCMC_samples_marker_effects_genotypes_y12.txt` | MCMC samples from the posterior distribution of marker effects for middle node 2 |
`MCMC_samples_marker_effects_genotypes_y13.txt` | MCMC samples from the posterior distribution of marker effects for middle node 3 |
`MCMC_samples_marker_effects_variances_genotypes.txt` | MCMC samples from the posterior distribution of marker effect variance for all middle nodes |
`MCMC_samples_neural_networks_bias_and_weights.txt` | MCMC samples from the posterior distribution of bias of observed trait and weights between middle nodes and phenotypes |
`MCMC_samples_pi_genotypes.txt` | MCMC samples from the posterior distribution of pi for all middle nodes |
`MCMC_samples_residual_variance.txt` | MCMC samples from the posterior distribution of residual variance-covariance for all middle nodes |
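To illustrate how an estimate and its standard deviation relate to a set of MCMC samples, the sketch below summarizes a toy sample matrix. It assumes the reported estimates are posterior means and the standard deviations are posterior standard deviations. The matrix is simulated, and its layout (one row per saved sample, one column per individual) is an assumption for illustration, not a description of the file format.

```julia
using Statistics, Random

Random.seed!(1)

# Toy stand-in for MCMC samples: 1000 saved samples (rows) for 4 individuals (columns).
true_means = [0.5, -1.0, 0.0, 2.0]
samples = randn(1000, 4) .+ true_means'

# Summarize each column: mean and standard deviation across the saved samples.
post_mean = vec(mean(samples, dims=1))
post_sd   = vec(std(samples, dims=1))

println(post_mean)
println(post_sd)
```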