GALLO: An R package for genomic annotation and integration of multiple data sources in livestock for positional candidate loci

GigaScience ◽  
2020 ◽  
Vol 9 (12) ◽  
Author(s):  
Pablo A S Fonseca ◽  
Aroa Suárez-Vega ◽  
Gabriele Marras ◽  
Ángela Cánovas

Abstract Background The development of high-throughput sequencing and genotyping methodologies has enabled the identification of thousands of genomic regions associated with several complex traits. The integration of multiple sources of biological information is a crucial step required to better understand patterns regulating the development of these traits. Findings Genomic Annotation in Livestock for positional candidate LOci (GALLO) is an R package developed for the accurate annotation of genes and quantitative trait loci (QTLs) located in regions identified in common genomic analyses performed in livestock, such as genome-wide association studies and transcriptomics using RNA sequencing. Moreover, GALLO allows the graphical visualization of gene and QTL annotation results, data comparison among different grouping factors (e.g., methods, breeds, tissues, statistical models, studies), and QTL enrichment in different livestock species such as cattle, pigs, sheep, and chickens. Conclusions Consequently, GALLO is a useful package for annotation, identification of hidden patterns across datasets, and data mining previously reported associations, as well as the efficient examination of the genetic architecture of complex traits in livestock.
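The positional annotation step described in this abstract can be illustrated with a minimal Python sketch. This is not GALLO's API (GALLO is an R package); the names `Feature` and `annotate`, the coordinate convention, and the toy genes and markers are all hypothetical. The sketch only shows the underlying idea: report every gene or QTL whose interval overlaps a window around a candidate marker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A gene or QTL interval (hypothetical structure, 1-based inclusive coordinates)."""
    name: str
    chrom: str
    start: int
    end: int


def annotate(markers, features, window=0):
    """Map each marker id to the names of features overlapping [pos - window, pos + window]."""
    hits = {}
    for marker_id, chrom, pos in markers:
        lo, hi = pos - window, pos + window
        hits[marker_id] = [
            f.name
            for f in features
            if f.chrom == chrom and f.start <= hi and f.end >= lo
        ]
    return hits


# Toy data, invented for illustration only.
genes = [
    Feature("geneA", "1", 1000, 5000),
    Feature("geneB", "1", 90000, 95000),
    Feature("geneC", "2", 1000, 5000),
]
markers = [("snp1", "1", 5500), ("snp2", "2", 3000)]

result = annotate(markers, genes, window=1000)
print(result)
```

A linear scan is enough for a handful of markers; for genome-scale annotation files an interval tree or sorted-interval join would be the more usual choice.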

Heredity ◽  
2021 ◽  
Author(s):  
Yasuhiro Sato ◽  
Eiji Yamamoto ◽  
Kentaro K. Shimizu ◽  
Atsushi J. Nagano

Abstract An increasing number of field studies have shown that the phenotype of an individual plant depends not only on its genotype but also on those of neighboring plants; however, this fact is not taken into consideration in genome-wide association studies (GWAS). Based on the Ising model of ferromagnetism, we incorporated neighbor genotypic identity into a regression model, named “Neighbor GWAS”. Our simulations showed that the effective range of neighbor effects could be estimated using an observed phenotype when the proportion of phenotypic variation explained (PVE) by neighbor effects peaked. The spatial scale of the first nearest neighbors gave the maximum power to detect the causal variants responsible for neighbor effects, unless their effective range was too broad. However, if the effective range of the neighbor effects was broad and minor allele frequencies were low, there was collinearity between the self and neighbor effects. To suppress the false positive detection of neighbor effects, the fixed effect and variance components involved in the neighbor effects should be tested in comparison with a standard GWAS model. We applied neighbor GWAS to field herbivory data from 199 accessions of Arabidopsis thaliana and found that neighbor effects explained 8% more of the PVE of the observed damage than standard GWAS. The neighbor GWAS method provides a novel tool that could facilitate the analysis of complex traits in spatially structured environments and is available as an R package at CRAN (https://cran.r-project.org/package=rNeighborGWAS).
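One way to read "incorporating neighbor genotypic identity into a regression model" is sketched below in Python. The sketch assumes genotypes coded as -1/+1 (as spins are in the Ising model) and defines a neighbor covariate as the mean of x_i * x_j over plants j within a chosen radius of plant i. The function name `neighbor_covariate`, the toy coordinates, and the ordinary least-squares fit are illustrative assumptions; this is not the rNeighborGWAS interface, and the published method additionally involves variance components that are omitted here.

```python
import numpy as np


def neighbor_covariate(geno, coords, radius):
    """Mean of x_i * x_j over neighbors j within `radius` of plant i (0 if no neighbors)."""
    geno = np.asarray(geno, dtype=float)
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between plant positions.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    is_neighbor = dist <= radius
    np.fill_diagonal(is_neighbor, False)  # a plant is not its own neighbor
    counts = is_neighbor.sum(axis=1)
    sums = (is_neighbor * geno[None, :]).sum(axis=1) * geno
    return np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)


# Four plants on a line with genotypes coded -1/+1 (toy data).
geno = np.array([1.0, 1.0, -1.0, -1.0])
coords = np.array([[0, 0], [1, 0], [2, 0], [3, 0]])
cov = neighbor_covariate(geno, coords, radius=1.0)

# Noiseless phenotype built from a self effect (0.5) and a neighbor effect (1.5).
y = 2.0 + 0.5 * geno + 1.5 * cov
design = np.column_stack([np.ones_like(geno), geno, cov])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
print(cov, coef)
```

Repeating the fit over a range of radii and recording the variation explained by the neighbor term is the kind of scan the abstract describes for estimating the effective range of neighbor effects.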


2019 ◽  
Author(s):  
Yasuhiro Sato ◽  
Eiji Yamamoto ◽  
Kentaro K. Shimizu ◽  
Atsushi J. Nagano

Abstract An increasing number of field studies have shown that the phenotype of an individual plant depends not only on its genotype but also on those of neighboring plants; however, this fact is not taken into consideration in genome-wide association studies (GWAS). Based on the Ising model of ferromagnetism, we incorporated neighbor genotypic identity into a regression model, named “Neighbor GWAS”. Our simulations showed that the effective range of neighbor effects could be estimated using an observed phenotype from when the proportion of phenotypic variation explained (PVE) by neighbor effects peaked. The spatial scale of the first nearest neighbors gave the maximum power to detect the causal variants responsible for neighbor effects, unless their effective range was too broad. However, if the effective range of the neighbor effects was broad and minor allele frequencies were low, there was collinearity between the self and neighbor effects. To suppress the false positive detection of neighbor effects, the fixed effect and variance components involved in the neighbor effects should be tested in comparison with a standard GWAS model. We applied neighbor GWAS to field herbivory data from 199 accessions of Arabidopsis thaliana and found that neighbor effects explained 8% more of the PVE of the observed damage than standard GWAS. The neighbor GWAS method provides a novel tool that could facilitate the analysis of complex traits in spatially structured environments and is available as an R package at CRAN (https://cran.r-project.org/package=rNeighborGWAS).


2020 ◽  
Vol 98 (Supplement_3) ◽  
pp. 161-161
Author(s):  
Hanna Wackel ◽  
Cedric Gondro ◽  
Donghyun Shin

Abstract The identification of quantitative trait loci (QTL) within different breeds of a species is important for polygenic traits such as meat quality and reproductive traits. If different breeds are selected for the same phenotype, the genetic regions that ultimately undergo positive selection will not necessarily be the same. One of the most common ways to identify these QTL is through genome-wide association studies (GWAS). Outlining differences in significant QTL of complex traits can give insights into selection in one breed by using information from another. The objective of this study was to estimate heritabilities, to identify QTL within purebred Yorkshire (YK) and Landrace (LR) populations by use of GWAS, and to compare significant SNP between the breeds. A total of 8,202 animals (5,053 Yorkshire and 3,149 Landrace) were genotyped with a 50k Illumina SNP chip, then phased and imputed to correct for any missing SNP calls using EAGLE and MINIMAC. The R package gwaR was used to estimate variance components with a GBLUP model that included fixed effects of parity, contemporary group and sex. The response variables were carcass traits, specifically meat percent (MP), backfat average (BFA), backfat depth (BFD), daily weight (DW) and day-90 weight (DW90), each assessed per breed. Heritability estimates of each trait can be found in Table 1 and were in line with previous studies. Significant SNP of each trait were compared between the two breeds by estimating the p-value of each SNP using gwaR. The breeds showed similar significant signals, but differences arose within BFD, with additional significant peaks in LR. This could be due to real genotypic differences or could be an effect of the difference in sample size. The comparison between these two breeds can lead to insights in other pig breeds as well as guide more informed selection decisions in the future.


2020 ◽  
Vol 36 (14) ◽  
pp. 4222-4224
Author(s):  
Zhong Wang ◽  
Nating Wang ◽  
Zilu Wang ◽  
Libo Jiang ◽  
Yaqun Wang ◽  
...  

Abstract Summary Genome-wide association studies (GWAS), particularly designed with thousands and thousands of single-nucleotide polymorphisms (SNPs) (big p) genotyped on tens of thousands of subjects (small n), face a major challenge of p ≫ n. Although the integration of longitudinal information can significantly enhance a GWAS’s power to comprehend the genetic architecture of complex traits and diseases, an additional challenge is generated by an autocorrelative process. We have developed several statistical models for addressing these two challenges by implementing dimension reduction methods and longitudinal data analysis. To make these models computationally accessible to applied geneticists, we wrote an R package, HiGwas, designed to analyze longitudinal GWAS datasets. Functions in the package encompass single SNP analyses, significance-level adjustment, preconditioning and model selection for a high-dimensional set of SNPs. HiGwas provides the estimates of genetic parameters and the confidence intervals of these estimates. We demonstrate the features of HiGwas through real data analysis and a vignette document in the package. Availability and implementation https://github.com/wzhy2000/higwas. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 12 ◽  
Author(s):  
Binglan Li ◽  
Marylyn D. Ritchie

Since their inception, genome-wide association studies (GWAS) have identified more than a hundred thousand single nucleotide polymorphism (SNP) loci that are associated with various complex human diseases or traits. The majority of GWAS discoveries are located in non-coding regions of the human genome and have unknown functions. The valley between non-coding GWAS discoveries and downstream affected genes hinders the investigation of complex disease mechanism and the utilization of human genetics for the improvement of clinical care. Meanwhile, advances in high-throughput sequencing technologies reveal important genomic regulatory roles that non-coding regions play in the transcriptional activities of genes. In this review, we focus on data integrative bioinformatics methods that combine GWAS with functional genomics knowledge to identify genetically regulated genes. We categorize and describe two types of data integrative methods. First, we describe fine-mapping methods. Fine-mapping is an exploratory approach that calibrates likely causal variants underneath GWAS signals. Fine-mapping methods connect GWAS signals to potentially causal genes through statistical methods and/or functional annotations. Second, we discuss gene-prioritization methods. These are hypothesis generating approaches that evaluate whether genetic variants regulate genes via certain genetic regulatory mechanisms to influence complex traits, including colocalization, Mendelian randomization, and the transcriptome-wide association study (TWAS). TWAS is a gene-based association approach that investigates associations between genetically regulated gene expression and complex diseases or traits. TWAS has gained popularity over the years due to its ability to reduce multiple testing burden in comparison to other variant-based analytic approaches. Multiple types of TWAS methods have been developed with varied methodological designs and biological hypotheses over the past 5 years. We dive into discussions of how TWAS methods differ in many aspects and the challenges that different TWAS methods face. Overall, TWAS is a powerful tool for identifying complex trait-associated genes. With the advent of single-cell sequencing, chromosome conformation capture, gene editing technologies, and multiplexing reporter assays, we are expecting a more comprehensive understanding of genomic regulation and genetically regulated genes underlying complex human diseases and traits in the future.


2021 ◽  
Vol 22 (5) ◽  
pp. 2556
Author(s):  
Isabelle M. McGrath ◽  
Sally Mortlock ◽  
Grant W. Montgomery

There is substantial genetic variation for common traits associated with reproductive lifespan and for common diseases influencing female fertility. Progress in high-throughput sequencing and genome-wide association studies (GWAS) has transformed our understanding of common genetic risk factors for complex traits and diseases influencing reproductive lifespan and fertility. The data emerging from GWAS demonstrate the utility of genetics to explain epidemiological observations, revealing shared biological pathways linking puberty timing, fertility, reproductive ageing and health outcomes. The observations also identify unique genetic risk factors specific to different reproductive diseases impacting on female fertility. Sequencing in patients with primary ovarian insufficiency (POI) has identified mutations in a large number of genes, while GWAS have revealed shared genetic risk factors for POI and ovarian ageing. Studies on age at menopause implicate DNA damage/repair genes with implications for follicle health and ageing. In addition to the discovery of individual genes and pathways, the increasingly powerful studies on common genetic risk factors help interpret the underlying relationships and direction of causation in the regulation of reproductive lifespan, fertility and related traits.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Chao-Yu Guo ◽  
Reng-Hong Wang ◽  
Hsin-Chou Yang

Abstract After the genome-wide association studies (GWAS) era, whole-genome sequencing is highly engaged in identifying the association of complex traits with rare variations. A score-based variance-component test has been proposed to identify common and rare genetic variants associated with complex traits while quickly adjusting for covariates. Such a kernel score statistic allows for familial dependencies and adjusts for random confounding effects. However, the etiology of complex traits may involve the effects of genetic and environmental factors and the complex interactions between genes and the environment. Therefore, in this research, a novel method is proposed to detect gene and gene-environment interactions in a complex family-based association study with various correlated structures. We also developed an R function for the Fast Gene-Environment Sequence Kernel Association Test (FGE-SKAT), which is freely available as supplementary material for easy GWAS implementation to unveil such family-based joint effects. Simulation studies confirmed the validity of the new strategy and the superior statistical power. The FGE-SKAT was applied to the whole genome sequence data provided by Genetic Analysis Workshop 18 (GAW18) and discovered concordant and discordant regions compared to the methods without considering gene by environment interactions.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Gregory R. Keele ◽  
Jeremy W. Prokop ◽  
Hong He ◽  
Katie Holl ◽  
John Littrell ◽  
...  

Abstract Chronic kidney disease (CKD), which can ultimately progress to kidney failure, is influenced by genetics and the environment. Genes identified in human genome-wide association studies (GWAS) explain only a small proportion of the heritable variation and lack functional validation, indicating the need for additional model systems. Outbred heterogeneous stock (HS) rats have been used for genetic fine-mapping of complex traits, but have not previously been used for CKD traits. We performed GWAS for urinary protein excretion (UPE) and CKD-related serum biochemistries in 245 male HS rats. Quantitative trait loci (QTL) were identified using a linear mixed effect model that tested for association with imputed genotypes. Candidate genes were identified using bioinformatics tools and targeted RNAseq followed by testing in a novel in vitro model of human tubule, hypoxia-induced damage. We identified two QTL for UPE and five for serum biochemistries. Protein modeling identified a missense variant within Septin 8 (Sept8) as a candidate for UPE. Sept8/SEPTIN8 expression increased in HS rats with elevated UPE and tubulointerstitial injury and in the in vitro hypoxia model. SEPTIN8 is detected within proximal tubule cells in human kidney samples and localizes with acetyl-alpha tubulin in the culture system. After hypoxia, SEPTIN8 staining becomes diffuse and appears to relocalize with actin. These data suggest a role of SEPTIN8 in cellular organization and structure in response to environmental stress. This study demonstrates that integration of a rat genetic model with an environmentally induced tubule damage system identifies Sept8/SEPTIN8 and informs novel aspects of the complex gene by environmental interactions contributing to CKD risk.


2016 ◽  
Vol 283 (1835) ◽  
pp. 20160569 ◽  
Author(s):  
M. E. Goddard ◽  
K. E. Kemper ◽  
I. M. MacLeod ◽  
A. J. Chamberlain ◽  
B. J. Hayes

Complex or quantitative traits are important in medicine, agriculture and evolution, yet, until recently, few of the polymorphisms that cause variation in these traits were known. Genome-wide association studies (GWAS), based on the ability to assay thousands of single nucleotide polymorphisms (SNPs), have revolutionized our understanding of the genetics of complex traits. We advocate the analysis of GWAS data by a statistical method that fits all SNP effects simultaneously, assuming that these effects are drawn from a prior distribution. We illustrate how this method can be used to predict future phenotypes, to map and identify the causal mutations, and to study the genetic architecture of complex traits. The genetic architecture of complex traits is even more complex than previously thought: in almost every trait studied there are thousands of polymorphisms that explain genetic variation. Methods of predicting future phenotypes, collectively known as genomic selection or genomic prediction, have been widely adopted in livestock and crop breeding, leading to increased rates of genetic improvement.
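The idea of fitting all SNP effects simultaneously under a prior distribution can be sketched in Python for the simplest case. The abstract does not say which prior is used, so the sketch assumes independent Gaussian effects, b ~ N(0, sigma_b^2 I), for which the posterior mean coincides with ridge regression with penalty lam = sigma_e^2 / sigma_b^2. The function name `fit_all_snps`, the simulated genotypes, and the chosen penalty values are illustrative assumptions only.

```python
import numpy as np


def fit_all_snps(X, y, lam):
    """Jointly estimate all SNP effects under a Gaussian prior (ridge solution).

    lam is the assumed ratio of residual variance to per-SNP effect variance.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)  # center genotypes so the intercept is not shrunk
    yc = y - y.mean()
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    intercept = y.mean() - X.mean(axis=0) @ beta
    return intercept, beta


# Simulated toy data: 50 individuals, 200 SNPs coded 0/1/2, 5 causal SNPs.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(50, 200)).astype(float)
true_beta = np.zeros(200)
true_beta[:5] = 1.0
y = X @ true_beta + rng.normal(scale=1.0, size=50)

b0_weak, beta_weak = fit_all_snps(X, y, lam=1.0)
b0_strong, beta_strong = fit_all_snps(X, y, lam=100.0)

# Predicted phenotype (a genomic prediction) for the first individual.
prediction = b0_strong + X[0] @ beta_strong
print(np.linalg.norm(beta_weak), np.linalg.norm(beta_strong), prediction)
```

Because the prior supplies the regularization, the system is solvable even though there are more SNPs than individuals, which single-SNP least squares fitted jointly could not handle. Sparser or mixture priors change the estimator but not this overall structure.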

