simulated genotype
Recently Published Documents


2021 ◽  
Author(s):  
Daniel A Reidenbach ◽  
Avantika Lal ◽  
Lotfi Slim ◽  
Ohad Mosafi ◽  
Johnny Israeli

Many computational methods aim to identify genetic variants associated with diseases and complex traits. Because ground truth data are unavailable, simulated genotype and phenotype data are needed to benchmark these methods. However, phenotypes are frequently simulated as an additive function of randomly selected variants, neglecting biological complexity such as non-random occurrence of causal SNPs, epistatic effects, heritability and dominance. Including such features would improve benchmarking studies and accelerate the development of methods for genetic analysis. Here, we describe GEPSi (GWAS Epistatic Phenotype Simulator), a user-friendly Python package to simulate phenotype data based on user-supplied genotype data for a population. GEPSi incorporates diverse biological parameters such as heritability, dominance, population stratification and epistatic interactions between SNPs. We demonstrate the use of this package to compare machine learning methods for GWAS analysis. GEPSi is freely available under an Apache 2.0 license and can be downloaded from https://github.com/clara-parabricks/GEPSi.
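To make the idea concrete, the following is a minimal sketch of simulating a phenotype from genotype data with additive, dominance and pairwise epistatic terms, with noise scaled to reach a target heritability. It does not use or reproduce the GEPSi API: the function `simulate_phenotype`, its parameters, and the toy effect sizes are all hypothetical names and values chosen for illustration, and the genotypes here are random draws rather than user-supplied data.

```python
import random
import statistics


def simulate_phenotype(genotypes, additive, dominance, pairs, heritability, seed=0):
    """Illustrative phenotype simulator (hypothetical, not the GEPSi API).

    genotypes:    list of individuals, each a list of allele counts (0, 1 or 2)
    additive:     dict {snp_index: effect per copy of the counted allele}
    dominance:    dict {snp_index: extra effect applied to heterozygotes}
    pairs:        dict {(i, j): effect applied to the product of allele counts}
    heritability: target share of phenotypic variance that is genetic, in (0, 1]
    """
    if not 0 < heritability <= 1:
        raise ValueError("heritability must be in (0, 1]")
    rng = random.Random(seed)

    genetic = []
    for g in genotypes:
        value = sum(beta * g[i] for i, beta in additive.items())
        value += sum(d for i, d in dominance.items() if g[i] == 1)
        value += sum(w * g[i] * g[j] for (i, j), w in pairs.items())
        genetic.append(value)

    # Choose the noise variance so that var_g / (var_g + var_e) = heritability.
    var_g = statistics.pvariance(genetic)
    sd_e = (var_g * (1 - heritability) / heritability) ** 0.5
    phenotypes = [gv + rng.gauss(0, sd_e) for gv in genetic]
    return phenotypes, genetic


# Toy input: 2000 individuals, 20 SNPs, allele frequency 0.3 (assumed values).
geno_rng = random.Random(1)
genotypes = [
    [(geno_rng.random() < 0.3) + (geno_rng.random() < 0.3) for _ in range(20)]
    for _ in range(2000)
]
phenotypes, genetic_values = simulate_phenotype(
    genotypes,
    additive={0: 0.5, 3: -0.4, 7: 0.3},
    dominance={3: 0.2},
    pairs={(0, 7): 0.25},
    heritability=0.5,
)
realized_h2 = statistics.pvariance(genetic_values) / statistics.pvariance(phenotypes)
print(f"realized heritability: {realized_h2:.2f}")
```

The realized heritability only approximates the target, because the noise is a finite random sample. A benchmarking study would typically also control which SNPs are causal, which this sketch fixes by hand.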


2019 ◽  
Vol 10 (2) ◽  
pp. 783-795 ◽  
Author(s):  
Pascal Duenk ◽  
Piter Bijma ◽  
Mario P. L. Calus ◽  
Yvonne C. J. Wientjes ◽  
Julius H. J. van der Werf

Average effects of alleles can show considerable differences between populations. The magnitude of these differences can be measured by the additive genetic correlation between populations (rg). This rg can be lower than one due to the presence of non-additive genetic effects together with differences in allele frequencies between populations. However, the relationship between the nature of non-additive effects, differences in allele frequencies, and the value of rg remains unclear, and was therefore the focus of this study. We simulated genotype data of two populations that have diverged under drift only, or under drift and selection, and we simulated traits where the genetic model and magnitude of non-additive effects were varied. Results showed that larger differences in allele frequencies and larger non-additive effects resulted in lower values of rg. In addition, we found that with epistasis, rg decreases as the number of interactions per locus increases. For both dominance and epistasis, we found that, when non-additive effects became extremely large, rg had a lower bound that was determined by the type of inter-allelic interaction and the difference in allele frequencies between populations. Given that dominance variance is usually small, our results show that it is unlikely that true rg values lower than 0.80 are due to dominance effects alone. With realistic levels of epistasis, rg dropped as low as 0.45. These results may contribute to the understanding of differences in genetic expression of complex traits between populations, and may help in explaining the inefficiency of genomic trait prediction across populations.
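The dependence of rg on dominance and allele-frequency differences can be illustrated with a small sketch. It assumes the standard single-locus average effect of allele substitution, α = a + d(q − p) with q = 1 − p, and approximates rg as the correlation, across individuals of population 1, between additive genetic values computed with each population's average effects. This is not the authors' simulation design: allele frequencies here are simply perturbed at random rather than diverged under drift or selection, and `simulate_rg` with all its parameters is a hypothetical name chosen for illustration.

```python
import random
import statistics


def average_effect(a, d, p):
    """Average effect of allele substitution: alpha = a + d * (q - p), q = 1 - p."""
    return a + d * (1 - 2 * p)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_rg(n_loci, n_ind, dominance_scale, freq_shift, seed=0):
    """Illustrative estimate of rg under a dominance-only model (hypothetical)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n_loci)]
    d = [dominance_scale * rng.gauss(0, 1) for _ in range(n_loci)]
    p1 = [rng.uniform(0.1, 0.9) for _ in range(n_loci)]
    # Population 2 frequencies: population 1 plus a bounded random shift.
    p2 = [min(0.95, max(0.05, p + rng.uniform(-freq_shift, freq_shift))) for p in p1]

    alpha1 = [average_effect(ai, di, p) for ai, di, p in zip(a, d, p1)]
    alpha2 = [average_effect(ai, di, p) for ai, di, p in zip(a, d, p2)]

    bv1, bv2 = [], []
    for _ in range(n_ind):
        # Genotypes drawn from population 1 under Hardy-Weinberg proportions.
        g = [(rng.random() < p) + (rng.random() < p) for p in p1]
        bv1.append(sum(gi * al for gi, al in zip(g, alpha1)))
        bv2.append(sum(gi * al for gi, al in zip(g, alpha2)))
    return pearson(bv1, bv2)


rg_additive = simulate_rg(n_loci=50, n_ind=500, dominance_scale=0.0, freq_shift=0.3)
rg_dominance = simulate_rg(n_loci=50, n_ind=500, dominance_scale=1.0, freq_shift=0.3)
print(f"rg without dominance: {rg_additive:.3f}")
print(f"rg with dominance:    {rg_dominance:.3f}")
```

With no dominance the average effects are identical in both populations, so the correlation is one regardless of the frequency shift. Increasing `dominance_scale` or `freq_shift` lowers it, in line with the qualitative result described above.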

