Library
Index
Main.XSim.CohortMain.XSim.breedMain.XSim.build_genomeMain.XSim.build_phenomeMain.XSim.genetic_evaluationMain.XSim.mateMain.XSim.select
Contents
Main.XSim.build_genome — FunctionQuick Start
Quick setup by assigning number of markers and chromosomes.
build_genome(;n_loci ::Int64=-1,
n_chr ::Int64=10,
species ::String="none",
args...)Arguments
n_loci: Number of simulated markers for each chromosomen_chr: Number of simulated chromosome with length of 100 centimorganspecies: Infer genetic position (Morgan) by pre-load linkage maps, available species are: ["cattle", and "pig"]
Examples
julia> build_genome(n_chr = 2,
n_loci = 10000)
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 2
[ Info:
[ Info: Chromosome Length (cM): 200.0
[ Info: [100.0, 100.0]
[ Info:
[ Info: Number of Loci : 20000
[ Info: [10000, 10000]
[ Info:
[ Info: Genotyping Error : 0.0
[ Info: Mutation Rate : 0.0
[ Info:Define genome by a file or a DataFrame
Define genome by providing a formatted dataframe or a path to the file.
build_genome(dt ::DataFrame;
species ::String="none",
args...)
build_genome(filename::String;
species ::String="none",
args...)Arguments
dt: ADataFramewith required columns ofchrandcM, orchrandbpifspeciesis provided for the inference.filename: A filepath to the file containing genome information.species: Adjust genetic position (Morgan) by pre-load linkage maps, available species are: ["cattle", and "pig"]
Example of the DataFrame
4×7 DataFrame
Row │ id chr bp cM MAF eff_1 eff_2
│ String Int64 Int64 Float64 Float64 Float64 Float64
─────┼────────────────────────────────────────────────────────────
1 │ snp_1 1 1818249 50.8 0.5 0.1 0.0
2 │ snp_2 1 6557697 80.3 0.5 0.0 0.0
3 │ snp_3 2 2298800 39.2 0.5 0.2 0.0
4 │ snp_4 2 5015698 66.3 0.5 0.0 0.5Examples
By a filepath
julia> build_genome("path/map.csv";
rate_mutation=0.005, rate_error=0.01)or a dataframe
julia> using DataFrames
julia> data = CSV.read("path/map.csv", DataFrame)
julia> build_genome(data;
rate_mutation=0.005, rate_error=0.01)
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 2
[ Info:
[ Info: Chromosome Length (cM): 146.6
[ Info: [80.3, 66.3]
[ Info:
[ Info: Number of Loci : 4
[ Info: [2, 2]
[ Info:
[ Info: Genotyping Error : 0.01
[ Info: Mutation Rate : 0.005
[ Info:Use cattle genome as reference to infer the genetic positions
julia> build_genome("path/map.csv"; species="cattle")
[ Info: Arias,J.A. et al. (2009) A high density linkage map of the bovine genome. BMC Genetics, 10, 18.
[ Info: Reference Genome : Btau 4.0
[ Info: SNP Chip : Affymetrix GeneChip Bovine Mapping 10K SNP kit
┌ Warning: The provided genetic distances will be replaced with ones infered from preloaded linkage maps
└ @ XSim ~/Dropbox/projects/XSim/src/objects/global.jl:118
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 2
[ Info:
[ Info: Chromosome Length (cM): 16.8
[ Info: [15.1, 1.7]
[ Info:
[ Info: Number of Loci : 4
[ Info: [2, 2]
[ Info:
[ Info: Genotyping Error : 0.0
[ Info: Mutation Rate : 0.0
[ Info:
Explict Definition
Define genome by providing genetic information of each loci explicitly.
build_genome(chromosome ::Array{Int64, 1},
bp ::Array{Int64, 1},
cM ::Array{Float64, 1},
maf ::Array{Float64, 1};
rate_mutation ::Float64=0.0,
rate_error ::Float64=0.0)Arguments
chromosome: Chromosome codesbp: Physical positionscM: Genetic positionsmaf: Minor allele frequenciesrate_mutation: Mutation raterate_error: Error rate of genotyping
Examples
julia> ch = [1, 1, 2, 2, 2]
julia> bp = [130, 205, 186, 503, 780]
julia> cM = [85.7, 149.1, 37.4, 83.6, 134.3]
julia> maf = [0.5, 0.5, 0.5, 0.5, 0.5]
julia> build_genome(ch, bp, cM, maf)
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 2
[ Info:
[ Info: Chromosome Length (cM): 283.4
[ Info: [149.1, 134.3]
[ Info:
[ Info: Number of Loci : 5
[ Info: [2, 3]
[ Info:
[ Info: Genotyping Error : 0.0
[ Info: Mutation Rate : 0.0
[ Info:Main.XSim.build_phenome — FunctionQuick Start
Quick setup by assigning number of QTL.
build_phenome(n_qtls ::Union{Array{Int64, 1}, Int64};
args...)Arguments
n_qtls: Number of simulatedQTLs. It can be an array of integers for multiple traits.vg: Genetic (co)variances ofQTLs.h2: Heritability of simulated traits. This will define the residual (co)variances.
Examples
Single trait
julia> build_phenome(10)
[ Info: --------- Phenome Summary ---------
[ Info: Number of Traits : 1
[ Info: Heritability (h2) : [0.5]
┌ Info:
│ Genetic_Variance =
│ 1×1 Array{Float64,2}:
└ 1.0
┌ Info:
│ Residual_Variance =
│ 1×1 Array{Float64,2}:
└ 1.0
[ Info: Number of QTLs : [10]Multi-trait with additional information
julia> build_phenome([10, 15];
vg = [1 .5
.5 1],
h2 = [.3, .8])
[ Info: --------- Phenome Summary ---------
[ Info: Number of Traits : 2
[ Info: Heritability (h2) : [0.3, 0.8]
┌ Info:
│ Genetic_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.5
└ 0.5 1.0
┌ Info:
│ Residual_Variance =
│ 2×2 Array{Float64,2}:
│ 2.33333 0.0
└ 0.0 0.25
[ Info: Number of QTLs : [10 25]────────────────────────────────────────────────────────────────
Define phenome by a file or a DataFrame
Define genome by providing a formatted dataframe or a path to the file.
build_phenome(dt ::DataFrame; args...)
build_phenome(filename ::String; args...)Arguments
dt: ADataFramewith required columns ofeff_prefixed specifying marker effects.filename: A filepath to the file containing phenome information.
Example of the DataFrame
4×7 DataFrame
Row │ id chr bp cM MAF eff_1 eff_2
│ String Int64 Int64 Float64 Float64 Float64 Float64
─────┼────────────────────────────────────────────────────────────
1 │ snp 1 1 1818249 50.8 0.5 0.1 0.0
2 │ snp 2 1 6557697 80.3 0.5 0.0 -0.3
3 │ snp_3 2 2298800 39.2 0.5 0.2 0.0
4 │ snp 4 2 5015698 66.3 0.5 0.0 0.5Examples
By a filepath
julia> build_phenome("path/map.csv", h2 = [0.3, 0.5])or a dataframe
julia> using DataFrames
julia> data = CSV.read("path/map.csv", DataFrame)
julia> build_phenome(data, h2 = [0.3, 0.5])
[ Info: --------- Phenome Summary ---------
[ Info: Number of Traits : 2
[ Info: Heritability (h2) : [0.3, 0.5]
┌ Info:
│ Genetic_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.0
└ 0.0 1.0
┌ Info:
│ Residual_Variance =
│ 2×2 Array{Float64,2}:
│ 2.33333 0.0
└ 0.0 1.0
[ Info: Number of QTLs : [2 2]────────────────────────────────────────────────────────────────
Define phenome by a matrix of QTL effects
build_phenome(QTL_effects ::Union{Array{Float64}, SparseMatrixCSC}; args...)Arguments
QTL_effects: A matrix storing marker effects with the dimension of individuals by markers.
Examples
julia> effects = [0.1 0.0
0.0 -0.3
0.2 0.0
0.0 0.5]
julia> build_phenome(effects)
[ Info: --------- Phenome Summary ---------
[ Info: Number of Traits : 2
[ Info: Heritability (h2) : [0.5, 0.5]
┌ Info:
│ Genetic_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.0
└ 0.0 1.0
┌ Info:
│ Residual_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.0
└ 0.0 1.0
[ Info: Number of QTLs : [2 2]It's also possible to add additional information such as heritability.
julia> build_phenome(effects, h2=[0.1, 0.8])
[ Info: --------- Phenome Summary ---------
[ Info: Number of Traits : 2
[ Info: Heritability (h2) : [0.1, 0.8]
┌ Info:
│ Genetic_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.0
└ 0.0 1.0
┌ Info:
│ Residual_Variance =
│ 2×2 Array{Float64,2}:
│ 9.0 0.0
└ 0.0 0.25
[ Info: Number of QTLs : [2 2]Main.XSim.Cohort — TypeInitialize a cohort by population size
Cohort(n::Int64=0)Arguments
n: An integer to assign the population size.
Examples
julia> cohort = Cohort(5)
[ Info: Cohort (5 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [1.265 1.697]
[ Info:
[ Info: Variance of breeding values:
[ Info: [1.6 1.4]──────────────────────────────────────────────────────────────
Initialize a cohort by genotypes/haplotypes files
Cohort(genetic_data ::Union{DataFrame, Array{Int64}}; args...)
Cohort(filename ::String; args...)Arguments
genetic_data: Adataframe/2D-arraythat stores genotypes/haplotypes in the dimension of individuals by markers.filename: Afilepathto a file storing genotypes/haplotypes data.n: Number of lines to be loaded from the file. The default value is-1and the entire file will be loaded.random: By default it's set totrueto randomly selectnlines (individuals) from the file to generate the cohort.alter_maf: It will update MAF based on the provided genotypes if it's set totrue(default).
Example of the demo_genotypes.csv and demo_haplotypes.csv
Both demo files store marker information for 5 individuals and 4 markers. Use DATA("demo_genotypes.csv") to interact with demo files.
# demo_genotypes.csv
# rows: individuals, columns: markers
# Homozygote is coded as 0 and 2, otherwise is coded as 1
2,0,0,1
0,0,1,0
0,1,0,2
1,1,0,2
2,0,2,0
# demo_haplotypes.csv
# rows: individuals, columns: alleles
# Reference allele is coded as 0, otherwise is coded as 1
1,1,0,0,0,0,1,0
0,0,0,0,1,0,0,0
0,0,0,1,0,0,1,1
1,0,1,0,0,0,1,1
1,1,0,0,1,1,0,0Example
# Load entire file
julia> cohort = Cohort("demo_haplotypes.csv")
julia> get_genotypes(cohort)
5×4 Array{Int64,2}:
2 0 0 1
2 0 2 0
0 0 1 0
1 1 0 2
0 1 0 2
# Randomly load 3 individuals with a dataframe.
julia> data = DATA("demo_haplotypes.csv", header=false)
julia> cohort = Cohort(data, random=true, n=3)
julia> get_genotypes(cohort)
3×4 Array{Int64,2}:
2 0 2 0
0 1 0 2
1 1 0 2
# Replace marker MAF by the provided file
julia> cohort = Cohort("demo_haplotypes.csv", alter_maf=true)
[ Info: MAF has been updated based on provided haplotypes/genotypes
[ Info: Cohort (5 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [1.418]
[ Info:
[ Info: Variance of breeding values:
[ Info: [2.012]──────────────────────────────────────────────────────────────
Functions that insepct Cohort properties:
All the listed functions can take a keyword argument ID=true to insert individuals' IDs as the first column.
Genotypes
Genotype matrix in the dimension of individuals by markers
julia> get_genotypes(cohort)
5×4 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
0 0 1 0
2 0 2 0
2 0 0 1
0 1 0 2
1 1 0 2QTLs
QTLs matrix in the dimension of individuals by markers
julia> get_QTLs(cohort)
5×3 Array{Int64,2}:
2 2 0
0 0 2
0 1 0
1 0 2
2 0 1Breeding values
Breeding values in the dimenstion individuals by traits
julia> get_BVs(cohort)
5×2 LinearAlgebra.Adjoint{Float64,Array{Float64,2}}:
1.26491 0.0
3.79473 0.0
1.26491 1.21268
0.0 1.69775
0.632456 1.69775Pedigree
Pedigree matrix, listed columns are in the order of individuals' ID, sire ID, and dam ID.
julia> get_pedigree(cohort)
5×3 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0Minor Allele Frequencies (MAF)
In the case where we have 3 QTLs out of 4 markers, we want to compare their allel frequencies.
julia> get_MAF(cohort)
4-element Array{Float64,1}:
0.5
0.2
0.3
0.5Phenotypes
Simulate cohort phenotypes based on the defined phenome. h2 and ve can be assigned specifically for this one-time simulation.
julia> get_phenotypes(cohort)
5×1 Array{Float64,2}:
1.1126064336992942
-0.8337021175232547
-0.363621019381922
4.042256656472762
1.7828800511223049Main.XSim.mate — FunctionMating function
mate(cohort_A ::Cohort,
cohort_B ::Cohort;
nA ::Int64=cohort_A.n,
nB_per_A ::Int64=1,
n_per_mate ::Int64=1,
replace_A ::Bool =false,
replace_B ::Bool =false,
ratio_malefemale ::Union{Float64, Int64}=0,
scheme ::String ="none",
args...)
mate(cohort::Cohort; args...) = mate(cohort, cohort; args...)Arguments
Positional arguments
cohort_A: Acohortobject that is treated as common mating parents.cohort_B: Acohortobject that is a mating pool from which individuals are sampled to mate withcohort_A.
Keyword arguments
nA:nAindividuals will be sampled fromcohort_Aand treated as common parents.nB_per_A:nB_per_Aindividuals sampled fromcohort_Bwill mate with each individual fromcohort_A.n_per_mate:n_per_mateprogenies will be reproduced from each pair of mating parent.replace_A: Whether the sampling is replacable incohort_A.replace_B: Whether the sampling is replacable incohort_B.ratio_malefemale: By default, two cohorts which are male and female progenies will be returned.ratio_malefemaledefined the progenies ratio of males over females. Ifratio_malefemale=0, only one cohort will be returned.scheme: Available options are ["random", "diallel cross", "selfing", "DH"]. See the examples for more details.
Outputs
By default, two cohort objects will be returned. The first cohort is assumed to be male progenies and the other cohort are female progenies. The size of two cohorts will folow the ratio raiot_malefemale. When ratio_malefemale is set to 0, only one cohort will be returned.
Example
Random mating (Default)
Initialize cohorts
julia> cohort_A = Cohort(5)
julia> cohort_B = Cohort(10)Define mating events
julia> args = Dict(:nA => cohort_A.n,
:nB_per_A => 1,
:replace_A => false,
:replace_B => false,
:n_per_mate => 1)
julia> progenies = mate(cohort_A, cohort_B; args...)
# Equivalent
julia> progenies = mate(cohort_A, cohort_B)
# Equivalent
julia> progenies = mate(cohort_A, cohort_B; scheme="random")
# Equivalent
julia> progenies = cohort_A * cohort_BCheck the pedigree to see if the mating goes as desired.
julia> get_pedigree(progenies)
5×3 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
19 1 8
16 2 6
17 3 10
20 4 15
18 5 14Diallel cross
Initialize cohorts
julia> cohort_A = Cohort(2)
julia> cohort_B = Cohort(5)Define mating events
julia> args = Dict(:nA => sires.n,
:nB_per_A => dams.n,
:replace_A => false,
:replace_B => false,
:n_per_mate => 1,
:ratio_malefemale=> 1)
julia> male, female = mate(cohort_A, cohort_B; args...)
# Equivalent
julia> male, female = mate(cohort_A, cohort_B; scheme="diallel cross")Check the pedigree to see if the mating goes as desired.
julia> get_pedigree(male + female)
10×3 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
12 2 7
10 2 6
11 2 4
14 1 3
15 1 5
9 2 5
13 1 6
17 1 4
8 2 3
16 1 7Selfing
Initialize cohorts
julia> parents = Cohort(5)In the selfing scheme, only one cohort is required.
julia> args = Dict(:nA => 3,
:replace_A => false,
:n_per_mate => 5,
:scheme => "selfing")
julia> progenies = mate(parents; args...)Inspect the pedigree to verify the mating behavior
julia> get_pedigree(progenies)
15×3 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
6 4 4
7 4 4
8 4 4
9 4 4
10 4 4
11 1 1
12 1 1
13 1 1
14 1 1
15 1 1
16 5 5
17 5 5
18 5 5
19 5 5
20 5 5Main.XSim.select — FunctionSelection function
select(cohort ::Cohort,
n ::Int64,
criteria ::Union{String, Array} = "phenotypes";
h2 ::Union{Array{Float64}, Float64}=GLOBAL("h2"),
ve ::Union{Array{Float64}, Float64}=GLOBAL("Ve"),
weights ::Array{Float64, 1} =[1.0],
return_log ::Bool =false,
is_random ::Bool =false,
silent ::Bool =GLOBAL("silent")
select(cohort::Cohort, ratio::Float64; args...)Arguments
Positional arguments
cohort: Acohortfrom which individuals are selected.n:nindividuals are selected.ratio:ratioportion of individuals are selected.criteria:Criteriathat will be used for the selecition. Default "phenotypes", the options are ["phenotypes", "GBLUP", array]. If set to "GBLUP", a genetic evaluation is carried out byJWASand the estimated breeding values will be thecriteria. It's also avaialbe to provdie thecriteria(e.g., phenotypes matrix) directly for the selection.
Keyword arguments
h2: The heritabilityh2of the simulated phenotypes.ve: The residual covarianceveof the simulated phenotypes.weight: Linear coefficients of traits for the selection. The selection is more sensitive to traits with greaterweight. Negativereturn_log: Defaultfalse. Settrueto return selection differential and selection response besides the selected cohort.silent: Defaultfalse. Settrueto mute the log messages.
Outputs
A selected cohort object will be returned. If return_log is set to true, a dictionary object containing the selected cohort, selection differential, and selection response will be returned.
─────────────────────────────────────────────────────────
Example
Single Trait Selection
Set demo genome and phenome with single traits controlled by 50 QTLs.
julia> build_demo()
julia> build_phenome(50)
julia> summary()
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 10
[ Info:
[ Info: Chromosome Length (cM): 1500.0
[ Info: [150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0]
[ Info:
[ Info: Number of Loci : 1000
[ Info: [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
[ Info:
[ Info: Genotyping Error : 0.0
[ Info: Mutation Rate : 0.0
[ Info:
[ Info: --------- Phenome Summary ---------
[ Info: Number of Traits : 1
[ Info: Heritability (h2) : [0.5]
┌ Info:
│ Genetic_Variance =
│ 1×1 Array{Float64,2}:
└ 1.0
┌ Info:
│ Residual_Variance =
│ 1×1 Array{Float64,2}:
└ 1.0
[ Info: Number of QTLs : [50]Initialize a cohort with 100 individuals
julia> cohort = Cohort(100)Select 30 individuals
# Select top 30 individuals
julia> cohort_s = select(cohort, 30)
# Equivalent
julia> cohort_s = select(cohort, 0.3)
[ Info: --------- Selection Summary ---------
[ Info: Select 30 individuals out of 100 individuals
[ Info: Selection differential (P): [1.174]
[ Info: Selection response (G): [0.843]
┌ Info:
│ Residual_Variance =
│ 1×1 Array{Float64,2}:
└ 1.0
[ Info: --------- Offsprings Summary ---------
[ Info: Cohort (30 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [1.448]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.367]Assign Heritability h2 or Residual Covariance ve
# Assign heritability
julia> progenies = select(cohort, 30, h2=0.1)
# Equivalent in the case where genetic variance `vg` is 1.0
julia> progenies = select(cohort, 30, ve=9.0)
[ Info: --------- Selection Summary ---------
[ Info: Select 30 individuals out of 100 individuals
[ Info: Selection differential (P): [1.182]
[ Info: Selection response (G): [0.338]
┌ Info:
│ Residual_Variance =
│ 1×1 Array{Float64,2}:
└ 9.0
[ Info: --------- Offsprings Summary ---------
[ Info: Cohort (30 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [0.956]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.643]Negative Selection
Set is_positive=false to rank individuals in ascending order
julia> progenies = select(cohort, 30, is_positive=false)
[ Info: --------- Selection Summary ---------
[ Info: Select 30 individuals out of 100 individuals
[ Info: Selection differential (P): [-1.19]
[ Info: Selection response (G): [-0.89]
┌ Info:
│ Residual_Variance =
│ 1×1 Array{Float64,2}:
└ 1.0
[ Info: --------- Offsprings Summary ---------
[ Info: Cohort (30 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [-0.24]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.566]Random Selection
julia> progenies = select(cohort, 30, is_random=true)
[ Info: --------- Selection Summary ---------
[ Info: Select 30 individuals out of 100 individuals
[ Info: Selection differential (P): [-0.06]
[ Info: Selection response (G): [-0.191]
┌ Info:
│ Residual_Variance =
│ 1×1 Array{Float64,2}:
└ 1.0
[ Info: --------- Offsprings Summary ---------
[ Info: Cohort (30 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [0.441]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.946]Selection wiht Multiple Parameters
It's possible specify multiple parameters described above in one selection. User can either enclose parameters as keyword arguments, or pass them through a dictionary object.
# Keyword args
julia> progenies = select(cohort, 30, h2=0.3, is_positive=false)
# Equivalent
julia> args = Dict(:h2=>0.3,
:is_positive=>false)
julia> progenies = select(cohort, 30; args...)
[ Info: --------- Selection Summary ---------
[ Info: Select 30 individuals out of 100 individuals
[ Info: Selection differential (P): [-1.086]
[ Info: Selection response (G): [-0.486]
┌ Info:
│ Residual_Variance =
│ 1×1 Array{Float64,2}:
└ 2.3333333333333335
[ Info: --------- Offsprings Summary ---------
[ Info: Cohort (30 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [0.154]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.818]─────────────────────────────────────────────────────────
Multi-Trait Selection
Set demo genome and phenome with single traits controlled by 50 QTLs.
julia> build_demo()
julia> summary()
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 10
[ Info:
[ Info: Chromosome Length (cM): 1500.0
[ Info: [150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0]
[ Info:
[ Info: Number of Loci : 1000
[ Info: [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
[ Info:
[ Info: Genotyping Error : 0.0
[ Info: Mutation Rate : 0.0
[ Info:
[ Info: --------- Phenome Summary ---------
[ Info: Number of Traits : 2
[ Info: Heritability (h2) : [0.5, 0.5]
┌ Info:
│ Genetic_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.0
└ 0.0 1.0
┌ Info:
│ Residual_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.0
└ 0.0 1.0
[ Info: Number of QTLs : [3 8]Initialize a cohort with 100 individuals
julia> cohort = Cohort(100)Assign Heritabilities for Multiple Traits
julia> progenies = select(cohort, 30, h2=[0.9, 0.3])
[ Info: --------- Selection Summary ---------
[ Info: Select 30 individuals out of 100 individuals
[ Info: Selection differential (P): [0.468 1.028]
[ Info: Selection response (G): [0.383 0.636]
┌ Info:
│ Residual_Variance =
│ 2×2 Array{Float64,2}:
│ 0.111111 0.0
└ 0.0 2.33333
[ Info: --------- Offsprings Summary ---------
[ Info: Cohort (30 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [-0.889 0.28]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.947 0.625]Assign Trait Correlations via Residual Covariance
julia> progenies = select(cohort, 30, ve=[1 0.3
0.3 1])
[ Info: --------- Selection Summary ---------
[ Info: Select 30 individuals out of 100 individuals
[ Info: Selection differential (P): [0.866 0.925]
[ Info: Selection response (G): [0.662 0.762]
┌ Info:
│ Residual_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.3
└ 0.3 1.0
[ Info: --------- Offsprings Summary ---------
[ Info: Cohort (30 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [-0.608 0.406]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.549 0.476]Derive Selection Index for Multiple Traits
Assigning a vector to the parameter weights to derive a selection index which is a linear combintation of the weights and the phenotypes. In this example, we demonstrate two traits with the heritability of 0.3 and 0.8, respectively. And we can select traits with more weight on the second trait which is more heritable, and negatively select the first trait.
julia> progenies = select(cohort, 30, h2=[.3, .8], weights=[-0.1, 0.9])
[ Info: --------- Selection Summary ---------
[ Info: Select 30 individuals out of 100 individuals
[ Info: Selection differential (P): [-0.318 1.027]
[ Info: Selection response (G): [-0.233 0.869]
┌ Info:
│ Residual_Variance =
│ 2×2 Array{Float64,2}:
│ 2.33333 0.0
└ 0.0 0.25
[ Info: --------- Offsprings Summary ---------
[ Info: Cohort (30 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [-1.508 0.513]
[ Info:
[ Info: Variance of breeding values:
[ Info: [1.053 0.458]Main.XSim.breed — FunctionWrap-up function: breed()
breed(cohort_A ::Cohort,
cohort_B ::Cohort;
n_gens ::Int64=1,
n_select ::Int64=cohort_A.n + cohort_B.n,
n_select_A ::Int64=cohort_A.n,
n_select_B ::Int64=cohort_B.n,
select_all_gens ::Bool=false,
args...)
breed(cohort::Cohort, n_gens::Int64, args...) =
breed(cohort, cohort, n_gens; args...)Arguments
Positional arguments
cohort_A: Acohortobject that is treated as common mating parents. It's asssumed to be sires/males parents.cohort_B: Acohortobject that is a mating pool from which individuals are sampled to mate withcohort_A. It's assumed to be dams/female parents.
Keywords arguments
n_gens: An integer specifying how many mate-select generations are carried out.n_select: Used whenratio_malefemaleis set to0.n_selectindividuals are selected as parents for the next generation.n_select_A: Used whenratio_malefemaleis not0.n_select_Awill be selected as male parents for the next generation.n_select_B: Used whenratio_malefemaleis not0.n_select_Bwill be selected as female parents for the next generation.select_all_gens: Default "false" and parents are not included in the next generation pool for selections. Setselect_all_gensto "true" if the selections consider individuals from all generations.
Outputs
By default, two cohort objects will be returned. The first cohort is assumed to be male progenies and the other cohort are female progenies. The size of two cohorts will folow the ratio raiot_malefemale. When ratio_malefemale is set to 0, only one cohort will be returned.
Examples
We can have 10 sires and mate each sire with 5 dams for 5 generations. In each generation, we randomly select 10 male progenies as sires and all female progenies as dams for the next generation. We can derive such breeding scheme as below:
julia> args = Dict(# mating arguments
:nA => 10,
:nB_per_A => 5,
:n_per_mate => 2,
:ratio_malefemale => 1.0,
# selection arguments
:is_random => true,
# breeding arguments
:n_gens => 5,
:nA_select => 10,
:select_all_gens => true)Simulate 10 sires and 50 dams as founders.
julia> sires = Founders(10)
julia> dams = Founders(50)Breed cohorts based on the defined arguments.
julia> sires, dams = breed(sires, dams; args...)
[ Info: Gen 0 -> Mean of BVs: [1.665 2.745], Variance of BVs: [1.008 0.479]
[ Info: Gen 1 -> Mean of BVs: [1.719 2.715], Variance of BVs: [0.96 0.546]
[ Info: Gen 2 -> Mean of BVs: [1.754 2.777], Variance of BVs: [0.936 0.724]
[ Info: Gen 3 -> Mean of BVs: [1.766 2.796], Variance of BVs: [0.927 0.775]
[ Info: Gen 4 -> Mean of BVs: [1.773 2.771], Variance of BVs: [0.951 0.781]
[ Info: Gen 5 -> Mean of BVs: [1.813 2.751], Variance of BVs: [0.952 0.804]
([ Info: Cohort (60 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [1.855 2.593]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.884 0.697]
, [ Info: Cohort (300 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [1.805 2.782]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.969 0.822]The result is equivalent to the following mate-select iterations:
julia> for _ in 1:5
males, females = mate(sires, dams, args...)
sires += select(males, n_sires, args...)
dams += females
endMain.XSim.genetic_evaluation — FunctionGenetic evaluation
genetic_evaluation(cohort ::Cohort,
phenotypes ::DataFrame=DataFrame();
model_equation ::String="",
covariates ::String="",
random_iid ::String="",
random_str ::String="",
methods ::String="GBLUP",
add_genotypes ::Bool=true,
idx_missing_p ::Any=[],
return_out ::Bool=false,
args...)Arguments
cohort: The evaluatedcohort.phenotypes: Adataframewiht columns ofid, traits, and other studied factors.h2: Default 0.5. Heritability for simulating cohort phenotypes if nophenotypeis provided.ve: Default 1. Residual variance for simulating cohort phenotypes if nophenotypeis provided.model_equation: Default "y ~ intercept". The equation for fitting the breeding values.covariates: Specifies terms inphenotypesas continuous factors. Must be included inmodel_equationas well.random_iid: Specifies terms inphenotypesas random effects (i.i.d.). Must be included inmodel_equationas well.random_str: Specifies terms inphenotypesas random effects (pedigree). Must be included inmodel_equationas well.methods: Default "GBLUP". Defines what models/methods used in the genetic evaluation.add_genotypes: Defaulttrue. Genotypes will be included in the equation.idx_missing_p: A vector assigning which individuals are not phenotyped.return_out: Defaultfalse. Set totrueto return the completeJWASoutputs.
Outputs
A n by t matrix containing breeding values will be return if return_out = false, where n is number of individuals and t is number of evaluated traits. A complete JWAS outputs will be returned if return_out = true.
Example of the phenotypes dataframe
Row │ ID y1 y2 factor_1 factor_2
│ String Float64? Float64? Int64 Int64
─────┼─────────────────────────────────────────────────────────────
1 │ 1 0.88976 -0.0798048 1 1
2 │ 2 -0.783203 0.988616 1 1
3 │ 3 missing missing 1 1
4 │ 4 missing missing 1 1
5 │ 5 missing missing 1 1
6 │ 6 missing missing 1 2
7 │ 7 missing missing 1 2
8 │ 8 -1.76058 0.277289 1 2
9 │ 9 0.938871 2.57784 1 2
10 │ 10 0.37026 3.15993 1 2
11 │ 11 -1.91869 0.0935064 2 3
12 │ 12 -0.89847 1.8987 2 3
13 │ 13 1.69663 -0.949513 2 3
14 │ 14 3.48862 0.654378 2 3
15 │ 15 1.39615 1.80355 2 3
16 │ 16 2.31685 2.13446 2 4
17 │ 17 -3.81017 0.0186156 2 4
18 │ 18 -1.71216 -0.0976809 2 4
19 │ 19 1.80917 1.34104 2 4
20 │ 20 -0.504771 3.22665 2 4Examples
We will use demo genome and phenome for the example. A cohort with 20 individuals is simulated.
julia> build_demo()
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 10
[ Info:
[ Info: Chromosome Length (cM):
[ Info: [150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0, 150.0]
[ Info:
[ Info: Number of Loci : 1000
[ Info: [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
[ Info:
[ Info: Genotyping Error : 0.0
[ Info: Mutation Rate : 0.0
[ Info:
[ Info: --------- Phenome Summary ---------
[ Info: Number of Traits : 2
[ Info: Heritability (h2) : [0.5, 0.5]
┌ Info:
│ Genetic_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.0
└ 0.0 1.0
┌ Info:
│ Residual_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.0
└ 0.0 1.0
[ Info: Number of QTLs : [3 8]
julia> cohort = Cohort(20)
[ Info: Cohort (20 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [-0.862 -0.913]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.814 1.448]─────────────────────────────────────────────────────────
Basic usage
By default, GBLUP will be performed without providing any argument.
julia> out = genetic_evaluation(cohort)
20×2 Array{Float64,2}:
42.2908 -5.89224
6.38041 -4.5895
19.5065 -57.4513
24.05 -84.5086
11.5925 45.5247
-11.5926 -1.91879
-29.2152 -35.5977
-1.92563 63.8841
19.8686 76.2722
22.6728 45.5179
-58.2754 42.5953
28.556 -21.0314
11.5918 25.1126
-0.517653 -0.237724
-11.4712 -47.3073
-10.9575 -3.58754
-11.7835 -9.62802
-17.7895 -22.0821
-39.5114 -3.46026
6.53008 -1.61416The breeding values out can be used as criteria for the selection directly.
julia> select(cohort, 5, out)
[ Info: --------- Selection Summary ---------
[ Info: Select 5 individuals out of 20 individuals
[ Info: Selection differential (P): [0.783 1.192]
[ Info: Selection response (G): [0.344 1.063]
┌ Info:
│ Residual_Variance =
│ 2×2 Array{Float64,2}:
│ 1.0 0.0
└ 0.0 1.0
[ Info: --------- Offsprings Summary ---------
[ Info: Cohort (5 individuals)
[ Info:
[ Info: Mean of breeding values:
[ Info: [-0.552 0.367]
[ Info:
[ Info: Variance of breeding values:
[ Info: [0.181 0.539]Conditions of the evaluation can be further defined by h2 or ve.
julia> out = genetic_evaluation(cohort, ve = [1 0.5
0.5 1])Set return_out = true to obtain the complete JWAS outputs.
julia> out = genetic_evaluation(cohort, return_out = true)
Dict{Any,Any} with 7 entries:
"EBV_y2" => 20×3 DataFrame…
"EBV_y1" => 20×3 DataFrame…
"heritability" => 2×3 DataFrame…
"location parameters" => 2×5 DataFrame…
"residual variance" => 4×3 DataFrame…
"marker effects geno" => 40×5 DataFrame…
"genetic_variance" => 4×3 DataFrame…─────────────────────────────────────────────────────────
Customized phenotypes and factors.
Obtain JWAS-compatible dataframe.
dt_p = get_phenotypes(cohort, "JWAS")Assign un-phenotyped individuals
julia> idx = 3:6
julia> allowmissing!(dt_p);
julia> dt_p[idx, 2:end] .= missing;
julia> first(dt_p, 10)
10×3 DataFrame
Row │ ID y1 y2
│ String? Float64? Float64?
─────┼──────────────────────────────────────────
1 │ 1 -0.0933375 -0.882781
2 │ 2 -1.50748 -2.12898
3 │ 3 missing missing
4 │ 4 missing missing
5 │ 5 missing missing
6 │ 6 missing missing
7 │ 7 -0.431361 -0.624666
8 │ 8 -1.17867 0.415607
9 │ 9 0.266733 1.02123
10 │ 10 -1.15588 0.737982Provide customized phenotypes for the evaluation
julia> out = genetic_evaluation(dams, dt_p)It's equivalent to set idx_missing_p = 3:6 for the missing phenotypes.
julia> out = genetic_evaluation(dams, dt_p, idx_missing_p = 3:6)
The marker IDs are set to 1,2,...,#markers
The individual IDs is set to 1,2,...,#observations
Genotype informatin:
#markers: 1000; #individuals: 20
The folder results is created to save results.
Checking genotypes...
Checking phenotypes...
Individual IDs (strings) are provided in the first column of the phenotypic data.
Phenotypes for 16 individuals are used in the analysis.These individual IDs are saved in the file IDs_for_individuals_with_phenotypes.txt.We can simulated factor_1 and factor_2 as fixed and random effects, respectively. And we use both factor_1 and factor_2 to fit y1, and factor_1 alone to fit y2.
julia> dt_p[:, "factor_1"] = [i for i in 1:4 for j in 1:5];
julia> dt_p[:, "factor_2"] = [i for i in 1:2 for j in 1:10];
julia> out = genetic_evaluation(cohort, dt_p,
model_equation="y1 = intercept + factor_1 + factor_2
y2 = intercept + factor_1",
random_iid="factor_2",
return_out=true)
factor_2 is not found in model equation 2.
The marker IDs are set to 1,2,...,#markers
The individual IDs is set to 1,2,...,#observations
Genotype informatin:
#markers: 1000; #individuals: 20
The folder results is created to save results.
Checking genotypes...
Checking phenotypes...
Individual IDs (strings) are provided in the first column of the phenotypic data.
Phenotypes for 16 individuals are used in the analysis.These individual IDs are saved in the file IDs_for_individuals_with_phenotypes.txt.
Prior information for random effect variance is not provided and is generated from the data.
Pi (Π) is not provided.
Pi (Π) is generated assuming all markers have effects on all traits.
A Linear Mixed Model was build using model equations:
y1 = intercept + factor_1 + factor_2
y2 = intercept + factor_1
Model Information:
Term C/F F/R nLevels
intercept factor fixed 1
factor_1 factor fixed 4
factor_2 factor random 2
MCMC Information:
chain_length 100
burnin 0
starting_value true
printout_frequency 101
output_samples_frequency 1
constraint false
missing_phenotypes true
update_priors_frequency 0
seed false
Hyper-parameters Information:
random effect variances (y1:factor_2):
0.45
residual variances:
1.0f0 0.0f0
0.0f0 1.0f0
Genomic Information:
complete genomic data (i.e., non-single-step analysis)
Genomic Category geno
Method GBLUP
genetic variances (genomic):
1.0 0.0
0.0 1.0
estimateScale false
Degree of freedom for hyper-parameters:
residual variances: 6.000
random effect variances: 5.000
marker effect variances: 6.000
The version of Julia and Platform in use:
Julia Version 1.5.4
Commit 69fcb5745b (2021-03-11 19:13 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin18.7.0)
CPU: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
Environment:
JULIA_EDITOR = code
JULIA_NUM_THREADS =
The analysis has finished. Results are saved in the returned variable and text files. MCMC samples are saved in text files.
Dict{Any,Any} with 7 entries:
"EBV_y2" => 20×3 DataFrame…
"EBV_y1" => 20×3 DataFrame…
"heritability" => 2×3 DataFrame…
"location parameters" => 12×5 DataFrame…
"residual variance" => 4×3 DataFrame…
"marker effects geno" => 32×5 DataFrame…
"genetic_variance" => 4×3 DataFrame…