The power of a multivariate approach to genome-wide association studies: an example with Drosophila melanogaster wing shape

2017 ◽  
Author(s):  
William Pitchers ◽  
Jessica Nye ◽  
Eladio J. Márquez ◽  
Alycia Kowalski ◽  
Ian Dworkin ◽  
...  

Abstract: Due to the complexity of genotype-phenotype relationships, simultaneous analyses of genomic associations with multiple traits will be more powerful and more informative than a series of univariate analyses. In most cases, however, studies of genotype-phenotype relationships have analyzed only one trait at a time, even as the rapid advances in molecular tools have expanded our view of the genotype to include whole genomes. Here, we report the results of a fully integrated multivariate genome-wide association analysis of the shape of the Drosophila melanogaster wing in the Drosophila Genetic Reference Panel. Genotypic effects on wing shape were highly correlated between two different labs. We found 2,396 significant SNPs using a 5% FDR cutoff in the multivariate analyses, but just 4 significant SNPs in univariate analyses of scores on the first 20 principal component axes. A key advantage of multivariate analysis is that the direction of the estimated phenotypic effect is much more informative than a univariate one. Exploiting this feature, we show that the directions of effects were on average replicable in an unrelated panel of inbred lines. Effects of knockdowns of genes implicated in the initial screen were on average more similar than expected under a null model. Association studies that take a phenomic approach in considering many traits simultaneously are an important complement to the power of genomics. Multivariate analyses of such data are more powerful, more informative, and allow the unbiased study of pleiotropy.
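The "5% FDR cutoff" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure, which the abstract does not name, and the function name `benjamini_hochberg` and the toy p-values are hypothetical, not taken from the study.

```python
import numpy as np

def benjamini_hochberg(p_values, fdr=0.05):
    """Return a boolean mask marking p-values declared significant at the given FDR.

    Assumption: Benjamini-Hochberg step-up procedure (not confirmed by the abstract).
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                      # indices that sort p ascending
    ranked = p[order]
    thresholds = fdr * np.arange(1, m + 1) / m  # critical value for each rank
    below = ranked <= thresholds
    significant = np.zeros(m, dtype=bool)
    if below.any():
        # Largest rank whose p-value is under its critical value;
        # every hypothesis up to that rank is rejected.
        k = np.nonzero(below)[0].max()
        significant[order[: k + 1]] = True
    return significant

# Toy p-values for ten hypothetical SNPs (illustrative only)
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
mask = benjamini_hochberg(toy_p, fdr=0.05)
```

With these toy values only the two smallest p-values fall under their rank-specific critical values, so `mask` flags two tests.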

2019 ◽  
Author(s):  
Michael C. Turchin ◽  
Matthew Stephens

Abstract: Genome-wide association studies (GWAS) have now been conducted for hundreds of phenotypes of relevance to human health. Many such GWAS involve multiple closely-related phenotypes collected on the same samples. However, the vast majority of these GWAS have been analyzed using simple univariate analyses, which consider one phenotype at a time. This is despite the fact that, at least in simulation experiments, multivariate analyses have been shown to be more powerful at detecting associations. Here, we conduct multivariate association analyses on 13 different publicly-available GWAS datasets that involve multiple closely-related phenotypes. These data include large studies of anthropometric traits (GIANT), plasma lipid traits (GlobalLipids), and red blood cell traits (HaemgenRBC). Our analyses identify many new associations (433 in total across the 13 studies), many of which replicate when follow-up samples are available. Overall, our results demonstrate that multivariate analyses can help make more effective use of data from both existing and future GWAS.

Author Summary: Genome-wide association studies (GWAS) have become a common and powerful tool for identifying significant correlations between markers of genetic variation and physical traits of interest. Often these studies are conducted by comparing genetic variation against single traits one at a time (‘univariate’); however, it has previously been shown that it is possible to increase the power to detect significant associations by comparing genetic variation against multiple traits simultaneously (‘multivariate’). Despite this apparent increase in power, researchers still rarely conduct multivariate GWAS, even when studies have multiple traits readily available. Here, we reanalyze 13 previously published GWAS using a multivariate method and find >400 additional associations. Our method makes use of univariate GWAS summary statistics and is available as a software package, thus making it accessible to other researchers interested in conducting the same analyses. We also show, using studies that have multiple releases, that our new associations have high rates of replication. Overall, we argue that multivariate approaches in GWAS should no longer be overlooked and that there is often low-hanging fruit in the form of new associations to be found by running these methods on data already collected.
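To make the claimed power gain from combining univariate summary statistics concrete, here is a small illustrative sketch. It is not the authors' method or software: it assumes a generic 2-degree-of-freedom chi-square test on two z-scores with a known null correlation `r`, and all numbers and function names are hypothetical.

```python
import math

def univariate_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))

def bivariate_stat(z1, z2, r):
    """Quadratic form z' R^{-1} z for two traits whose z-scores
    have correlation r under the null (assumed known)."""
    return (z1 * z1 - 2 * r * z1 * z2 + z2 * z2) / (1 - r * r)

def bivariate_p(z1, z2, r):
    """P-value of the statistic above; a chi-square with 2 df
    has survival function exp(-x / 2)."""
    return math.exp(-bivariate_stat(z1, z2, r) / 2)

# Hypothetical SNP: modest effects in opposite directions on two
# positively correlated traits.
z1, z2, r = 2.5, -2.5, 0.8
p_uni = min(univariate_p(z1), univariate_p(z2))
p_multi = bivariate_p(z1, z2, r)
```

In this toy case neither trait alone comes close to a genome-wide threshold, while the joint statistic is far more extreme, because the observed pattern of effects runs against the correlation expected under the null.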


Animals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1531
Author(s):  
Yasemin Öner ◽  
Malena Serrano ◽  
Pilar Sarto ◽  
Laura Pilar Iguácel ◽  
María Piquer-Sabanza ◽  
...  

A genome-wide association study (GWAS) was performed to identify new single nucleotide polymorphisms (SNPs) and genes associated with mastitis resistance in Assaf sheep by using the Illumina Ovine Infinium® HD SNP BeadChip (680K). In total, 6173 records from 1894 multiparous Assaf ewes with at least three test day records and aged between 2 and 7 years old were used to estimate a corrected phenotype for somatic cell score (SCS). Then, 192 ewes were selected from the top (n = 96) and bottom (n = 96) tails of the corrected SCS phenotype distribution to be used in a GWAS. Although no significant SNPs were found at the genome level, four SNPs (rs419096188, rs415580501, rs410336647, and rs424642424) were significant at the chromosome level (FDR 10%) in two different regions of OAR19. The SNP rs419096188 was located in intron 1 of the NUP210 gene and close to the HDAC11 gene (61 kb apart), while the other three SNPs were totally linked and located 171 kb from the ARPP21 gene. These three genes were related to the immune system response. These results were validated for two SNPs (rs419096188 and rs424642424) in the total population (n = 1894) by Kompetitive Allele-Specific PCR (KASP) genotyping. Furthermore, rs419096188 was also associated with lactose content.


2019 ◽  
Author(s):  
Bongsong Kim

Abstract: Population structure is widely perceived as a noise factor that undermines the quality of association between an SNP variable and a phenotypic variable in genome-wide association studies (GWAS). The linear model for GWAS generally accounts for population-structure variables to obtain an adjusted phenotype with less noise, which is known to amplify the contrast between significant and insignificant SNPs in the resulting Manhattan plot. In practice, however, conventional GWAS often implements the linear model in an unusual way: the population-structure variables enter the model as continuous variables rather than factor variables. If the coefficients for the population-structure variables change from SNP to SNP, then each SNP variable is regressed against a differently adjusted phenotypic variable, making the GWAS process unreliable. Focusing on this concern, this study investigated whether accounting for population-structure variables in the linear model for GWAS ensures that the adjusted phenotypes are consistent across all SNPs. The adjusted phenotypes were not consistent across SNPs, which is alarming given that conventional GWAS practice accounts for population structure in this way.
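The concern raised in this abstract can be sketched with simulated data. The example below is an illustration under stated assumptions, not the study's analysis: it uses one hypothetical continuous population-structure covariate `pc`, two simulated SNPs, and ordinary least squares, and shows that the covariate's fitted coefficient (and hence the adjusted phenotype) depends on which SNP is in the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical continuous population-structure covariate (e.g. a principal component)
pc = rng.normal(size=n)

# Two simulated SNPs coded 0/1/2: one unrelated to structure, one correlated with it
snp_a = rng.integers(0, 3, size=n).astype(float)
snp_b = np.clip(np.round(1 + 0.8 * pc + rng.normal(scale=0.5, size=n)), 0, 2)

# Simulated phenotype that depends on structure only
y = 0.5 * pc + rng.normal(size=n)

def pc_coefficient(snp):
    """Fit y ~ intercept + pc + snp by least squares and return the pc coefficient."""
    X = np.column_stack([np.ones(n), pc, snp])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

coef_a = pc_coefficient(snp_a)
coef_b = pc_coefficient(snp_b)

# "Adjusted phenotype": phenotype with the fitted structure effect removed
adjusted_a = y - coef_a * pc
adjusted_b = y - coef_b * pc
```

Because the covariate is re-estimated jointly with each SNP, `coef_a` and `coef_b` differ, so `adjusted_a` and `adjusted_b` are not the same vector.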


Genes ◽  
2019 ◽  
Vol 10 (6) ◽  
pp. 463 ◽  
Author(s):  
Xiaoming Ma ◽  
Congjun Jia ◽  
Donghai Fu ◽  
Min Chu ◽  
Xuezhi Ding ◽  
...  

Yak (Bos grunniens) is an important domestic animal living in high-altitude plateaus. Due to inadequate disease prevention, the yak industry suffers significant economic losses each year. The identification of causal genes that affect blood- and immunity-related cells could provide preliminary reference guidelines for the prevention of diseases in yak populations. Genome-wide association studies (GWASs) using single-marker and haplotype methods were employed to analyze 15 hematological traits in 315 unrelated yaks. Single-marker GWASs identified a total of 43 significant SNPs, including 35 suggestive and eight genome-wide significant SNPs, associated with nine traits. Haplotype analysis detected nine significant haplotype blocks, including two genome-wide and seven suggestive blocks, associated with seven traits. The study provides data on the genetic variability of hematological traits in the yak. Five essential genes (GPLD1, EDNRA, APOB, HIST1H1E, and HIST1H2BI) were identified, which affect the HCT, HGB, RBC, PDW, PLT, and RDWSD traits and can serve as candidate genes for regulating hematological traits. The results provide a valuable reference to be used in the analysis of blood properties and immune diseases in the yak.


Animals ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 2259
Author(s):  
Ismail Mohamed Abdalla ◽  
Xubin Lu ◽  
Mudasir Nazar ◽  
Abdelaziz Adam Idriss Arbab ◽  
Tianle Xu ◽  
...  

Feet and leg conformation traits are considered among the most important economic traits in dairy cattle and have a great impact on the profitability of milk production. Therefore, identifying the single nucleotide polymorphisms (SNPs), genes, and pathways associated with these traits might contribute to genomic selection and long-term selection planning for dairy cattle. We conducted genome-wide association studies (GWASs) using the fixed and random model circulating probability unification (FarmCPU) method to identify SNPs associated with bone quality, heel depth, rear leg side view, and rear leg rear view of Chinese Holstein cows. Phenotypic measurements were collected from 1000 Chinese Holstein cattle, and the GeneSeek Genomic Profiler Bovine 100 K SNP chip was used for genotyping. After quality control, 984 cows and 84,906 SNPs remained for the GWAS; we identified 20 significant SNPs after Bonferroni correction. Several candidate genes were identified within 200 kb upstream or downstream of the significant SNPs, including ADIPOR2, INPP4A, DNMT3A, ALDH1A2, PCDH7, XKR4, and CADPS. Further bioinformatics analyses showed that 34 gene ontology terms and two signaling pathways were significantly enriched (p ≤ 0.05). Many terms and pathways are related to biological quality, metabolism, and development processes; the identified SNPs and genes could provide useful information about the genetic architecture of feet and leg traits, thus improving the longevity and productivity of Chinese Holstein dairy cattle.
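For orientation, the per-SNP significance threshold implied by a Bonferroni correction over the 84,906 SNPs retained after quality control can be worked out as below. The family-wise error rate of 0.05 is an assumption; the abstract does not state the level used, and the function name is hypothetical.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that controls the family-wise error rate at alpha."""
    return alpha / n_tests

n_snps = 84_906   # SNPs remaining after quality control (from the abstract)
alpha = 0.05      # assumed family-wise error rate; not given in the abstract

threshold = bonferroni_threshold(alpha, n_snps)
neg_log10 = -math.log10(threshold)  # height of the corresponding line on a Manhattan plot

print(f"threshold = {threshold:.3e}, -log10 = {neg_log10:.2f}")
```

Under that assumed level, a SNP would need a p-value below roughly 5.9 × 10⁻⁷ to be declared significant.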


PLoS Genetics ◽  
2012 ◽  
Vol 8 (11) ◽  
pp. e1003057 ◽  
Author(s):  
Michael M. Magwire ◽  
Daniel K. Fabian ◽  
Hannah Schweyen ◽  
Chuan Cao ◽  
Ben Longdon ◽  
...  

2021 ◽  
Author(s):  
Jack W. O’Sullivan ◽  
John P. A. Ioannidis

Abstract: With the establishment of large biobanks, discovery of single nucleotide polymorphisms (SNPs) associated with various phenotypes has accelerated. An open question is whether genome-wide significant SNPs identified in earlier genome-wide association studies (GWAS) are replicated in later GWAS conducted in biobanks. To address this, we examined a publicly available GWAS database and identified two independent GWAS on the same phenotype (an earlier “discovery” GWAS and a later “replication” GWAS done in UK Biobank). The analysis evaluated 136,318,924 SNPs (of which 6,289 reached p<5e-8 in the discovery GWAS) from 4,397,962 participants across nine phenotypes. The overall replication rate was 85.0%, although it was lower for binary than for quantitative phenotypes (58.1% versus 94.8%, respectively). There was an 18.0% decrease in SNP effect size for binary phenotypes, but a 12.0% increase for quantitative phenotypes. Using the discovery SNP effect size, phenotype trait (binary or quantitative), and discovery p-value, we built and validated a model that predicted SNP replication with an area under the receiver operating characteristic curve of 0.90. While non-replication may reflect lack of power rather than genuine false positives, these results provide insights about which discovered associations are likely to be replicated across subsequent GWAS.


Author(s):  
Nana Matoba ◽  
Michael I. Love ◽  
Jason L. Stein

Abstract: Human brain structure traits have been hypothesized to be broad endophenotypes for neuropsychiatric disorders, implying that brain structure traits are comparatively ‘closer to the underlying biology’. Genome-wide association studies from large sample sizes allow for the comparison of common variant genetic architectures between traits to test the evidence supporting this claim. Endophenotypes, compared to neuropsychiatric disorders, are hypothesized to have less polygenicity, with greater effect size of each susceptible SNP, requiring smaller sample sizes to discover them. Here, we compare polygenicity and discoverability of brain structure traits, neuropsychiatric disorders, and other traits (89 in total) to directly test this hypothesis. We found reduced polygenicity (FDR = 0.01) and increased discoverability of cortical brain structure traits, as compared to neuropsychiatric disorders (FDR = 3.68 × 10⁻⁹). We predict that ~8M samples will be required to explain the full heritability of cortical surface area by genome-wide significant SNPs, whereas sample sizes over 20M will be required to explain the full heritability of major depressive disorder. In conclusion, we find reduced polygenicity and increased discoverability of cortical structure compared to neuropsychiatric disorders, which is consistent with brain structure satisfying the higher power criterion of endophenotypes.


Animals ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 806
Author(s):  
Yang Li ◽  
Lei Pu ◽  
Liangyu Shi ◽  
Hongding Gao ◽  
Pengfei Zhang ◽  
...  

The number of teats is related to the nursing ability of sows. In the present study, we conducted genome-wide association studies (GWAS) for traits related to teat number in a Duroc pig population. Two mixed models, one for count traits and another for binary traits, were employed to analyze seven traits: the right (RTN), left (LTN), and total (TTN) teat numbers; maximum teat number on a side (MAX); left minus right side teat number (LR); the absolute value of LR (ALR); and the presence of symmetry between left and right teat numbers (SLR). We identified 11, 1, 4, 13, and 9 significant SNPs associated with traits RTN, LTN, MAX, TTN, and SLR, respectively. One significant SNP (MARC0038565) was found to be simultaneously associated with RTN, LTN, MAX, and TTN. Two annotated genes (VRTN and SYNDIG1L) were located in the genomic region around this SNP. Three significant SNPs were associated with the TTN, RTN, and MAX traits. Seven significant SNPs were simultaneously detected for the two traits TTN and RTN. Two other SNPs were identified only for TTN. These 13 SNPs were clustered in the genomic region between 96.10 and 98.09 Mb on chromosome 7. Moreover, nine significant SNPs were associated with SLR. In total, four and 22 SNPs surpassed genome-wide significance and suggestive significance levels, respectively. Among the candidate genes annotated, eight have a documented association with teat number-related traits. Of these, the DPF3 gene on Sus scrofa chromosome (SSC) 7 and the NRP1 gene on SSC 10 were new candidate genes identified in this study. Our findings demonstrate the genetic mechanism of teat number-related traits and provide a reference to further improve reproductive performance in practical pig breeding programs.

