Build Genome
build_genome()
defines the genetic information including numbers of chromosomes and loci, genetic position, physical position, minor allele frequency, mutation rates, and genotyping error rates.
Quick Start
Quick setup by assigning number of markers
and chromosomes
.
build_genome(;n_loci ::Int64=-1,
n_chr ::Int64=10,
species ::String="none",
args...)
Arguments
n_loci
: Number of simulated markers for each chromosomen_chr
: Number of simulated chromosome with length of 100 centimorganspecies
: Infer genetic position (Morgan) by pre-load linkage maps. Available species are: ['pig', 'cattle', 'maize', 'rice'].
Examples
julia> build_genome(n_chr = 2,
n_loci = 10000)
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 2
[ Info:
[ Info: Chromosome Length (cM): 200.0
[ Info: [100.0, 100.0]
[ Info:
[ Info: Number of Loci : 20000
[ Info: [10000, 10000]
[ Info:
[ Info: Genotyping Error : 0.0
[ Info: Mutation Rate : 0.0
[ Info:
Build by a File or a DataFrame
Define genome by providing a formatted dataframe or a path to the file.
build_genome(dt ::DataFrame;
species ::String="none",
args...)
build_genome(filename::String;
species ::String="none",
args...)
Arguments
dt
: ADataFrame
with required columns ofchr
andcM
, orchr
andbp
ifspecies
is provided for the inference.filename
: A filepath to the file containing genome information.species
: Adjust genetic position (Morgan) by pre-load linkage maps, available species are: ["cattle", and "pig"]
Example of the DataFrame
4×7 DataFrame
Row │ id chr bp cM MAF eff_1 eff_2
│ String Int64 Int64 Float64 Float64 Float64 Float64
─────┼────────────────────────────────────────────────────────────
1 │ snp_1 1 1818249 50.8 0.5 0.1 0.0
2 │ snp_2 1 6557697 80.3 0.5 0.0 0.0
3 │ snp_3 2 2298800 39.2 0.5 0.2 0.0
4 │ snp_4 2 5015698 66.3 0.5 0.0 0.5
Examples
The preloaded file can be located through:
# Filepath to the preloaded map file
julia> filepath = PATH("map")
Build genome by a filepath
julia> build_genome(filepath;
rate_mutation=0.005, rate_error=0.01)
or by a dataframe
julia> using DataFrames
julia> data = CSV.read(filepath, DataFrame)
julia> build_genome(data;
rate_mutation=0.005, rate_error=0.01)
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 2
[ Info:
[ Info: Chromosome Length (cM): 146.6
[ Info: [80.3, 66.3]
[ Info:
[ Info: Number of Loci : 4
[ Info: [2, 2]
[ Info:
[ Info: Genotyping Error : 0.01
[ Info: Mutation Rate : 0.005
[ Info:
Use cattle genome as reference to infer the genetic positions
julia> build_genome(filepath; species="cattle")
[ Info: Arias,J.A. et al. (2009) A high density linkage map of the bovine genome. BMC Genetics, 10, 18.
[ Info: Reference Genome : Btau 4.0
[ Info: SNP Chip : Affymetrix GeneChip Bovine Mapping 10K SNP kit
┌ Warning: The provided genetic distances will be replaced with ones infered from preloaded linkage maps
└ @ XSim ~/Dropbox/projects/XSim/src/objects/global.jl:118
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 2
[ Info:
[ Info: Chromosome Length (cM): 16.8
[ Info: [15.1, 1.7]
[ Info:
[ Info: Number of Loci : 4
[ Info: [2, 2]
[ Info:
[ Info: Genotyping Error : 0.0
[ Info: Mutation Rate : 0.0
[ Info:
Build by Pre-Loaded Genome
Define genome by pre-loaded genome from reference.
build_genome(;species :: String="none")
Arguments
species
: Adjust genetic position (Morgan) by pre-load linkage maps, available species are: ["cattle", and "pig"]
Examples
julia> build_genome(species="pig")
[ Info: Tortereau,F. et al. (2012) A high density recombination map of the pig reveals a correlation between sex-specific recombination and GC content. BMC Genomics, 13, 586.
[ Info: Reference Genome : Sscrofa 10.2
[ Info: SNP Chip : PorcineSNP60 BeadChip
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 18
[ Info:
[ Info: Chromosome Length (cM):
[ Info: [98.0, 94.7, 96.8, 92.2, 89.3, 124.1, 112.8, 94.8, 95.4, 84.4, 64.6, 77.0, 97.6, 106.6, 93.5, 66.5, 53.9, 49.1]
[ Info:
[ Info: Number of Loci : 45292
[ Info: [6580, 2356, 1938, 3682, 2217, 1766, 3489, 2100, 2538, 1281, 1805, 1072, 3529, 4053, 2612, 1513, 1646, 1115]
[ Info:
[ Info: Genotyping Error : 0.0
[ Info: Mutation Rate : 0.0
[ Info:
Build by Explicit Definition
Define genome by providing genetic information of each loci explicitly.
build_genome(chromosome ::Array{Int64, 1},
bp ::Array{Int64, 1},
cM ::Array{Float64, 1},
maf ::Array{Float64, 1};
rate_mutation ::Float64=0.0,
rate_error ::Float64=0.0)
Arguments
chromosome
: Chromosome codesbp
: Physical positionscM
: Genetic positionsmaf
: Minor allele frequenciesrate_mutation
: Mutation raterate_error
: Error rate of genotyping
Examples
julia> ch = [1, 1, 2, 2, 2]
julia> bp = [130, 205, 186, 503, 780]
julia> cM = [85.7, 149.1, 37.4, 83.6, 134.3]
julia> maf = [0.5, 0.5, 0.5, 0.5, 0.5]
julia> build_genome(ch, bp, cM, maf)
[ Info: --------- Genome Summary ---------
[ Info: Number of Chromosome : 2
[ Info:
[ Info: Chromosome Length (cM): 283.4
[ Info: [149.1, 134.3]
[ Info:
[ Info: Number of Loci : 5
[ Info: [2, 3]
[ Info:
[ Info: Genotyping Error : 0.0
[ Info: Mutation Rate : 0.0
[ Info: