Meta-GWAS for quantitative trait loci identification in soybean

Abstract We report a meta-Genome Wide Association Study involving 73 published studies in soybean (Glycine max L. [Merr.]) covering 17,556 unique accessions, with improved statistical power for robust detection of loci associated with a broad range of traits. De novo GWAS and meta-analysis were conducted for composition traits including fatty acid and amino acid composition traits, disease resistance traits, and agronomic traits including seed yield, plant height, stem lodging, seed weight, seed mottling, seed quality, flowering timing, and pod shattering. To examine differences in detectability and test statistical power between single- and multi-environment GWAS, comparisons of meta-GWAS results with those from the constituent experiments were performed. Using meta-GWAS analysis and the analysis of individual studies, we report 483 peaks at 393 unique loci. Using stringent criteria to detect significant marker-trait associations, 59 candidate genes were identified, including 17 agronomic trait loci, 19 for seed-related traits, and 33 for disease reaction traits. This study identified potentially valuable candidate genes that affect multiple traits. The success in narrowing down the genomic region for some loci through overlapping mapping results of multiple studies is a promising avenue for community-based studies and plant breeding applications.

Meta-GWAS for quantitative trait loci identification in soybean

10.1101/2020.10.17.343707 ◽

2020 ◽

Author(s):

Johnathon M. Shook ◽

Jiaoping Zhang ◽

Sarah E. Jones ◽

Arti Singh ◽

Brian W. Diers ◽

...

Keyword(s):

Quantitative Trait Loci ◽

Candidate Genes ◽

Quantitative Trait ◽

Statistical Power ◽

Genome Wide Association Study ◽

Seed Quality ◽

De Novo ◽

Agronomic Traits ◽

Robust Detection ◽

Trait Loci

ABSTRACT We report a meta-Genome Wide Association Study involving 73 published studies in soybean (Glycine max L. [Merr.]) covering 17,556 unique accessions, with improved statistical power for robust detection of loci associated with a broad range of traits. De novo GWAS and meta-analysis were conducted for composition traits including fatty acid and amino acid composition traits, disease resistance traits, and agronomic traits including seed yield, plant height, stem lodging, seed weight, seed mottling, seed quality, flowering timing, and pod shattering. To examine differences in detectability and test statistical power between single- and multi-environment GWAS, comparisons of meta-GWAS results with those from the constituent experiments were performed. Using meta-GWAS analysis and the analysis of individual studies, we report 483 quantitative trait loci (QTL) at 393 unique loci. Using stringent criteria to detect significant marker-trait associations, 66 candidate genes were identified, including 17 candidate genes for agronomic traits, 19 for seed-related traits, and 33 for disease reaction traits. This study identified potentially valuable candidate genes that affect multiple traits. The success in narrowing down the genomic region for some loci through overlapping mapping results of multiple studies is a promising avenue for community-based studies and plant breeding applications.
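
The abstract above does not say how per-study results were combined. As a minimal sketch of why pooling studies can improve statistical power, the following assumes a fixed-effect, inverse-variance weighted meta-analysis of a single marker, which is one common choice and not necessarily the authors' method. The function name and the example effect sizes are hypothetical.

```python
import math


def inverse_variance_meta(betas, std_errors):
    """Pool per-study effect estimates for one marker (fixed-effect model).

    Assumption: each study reports an effect size (beta) and its standard
    error for the same marker and allele coding. Hypothetical helper, not
    taken from the paper.
    """
    if not betas or len(betas) != len(std_errors):
        raise ValueError("need one standard error per effect estimate")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical example: two studies of the same marker.
beta, se, z, p = inverse_variance_meta([0.2, 0.4], [0.1, 0.1])
print(f"pooled beta={beta:.3f}, se={se:.4f}, z={z:.2f}, p={p:.2e}")
```

In this sketch the pooled standard error is smaller than that of either input study, which is the sense in which combining studies sharpens detection.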

Genome-wide association study and genomic selection for plant height, maturity, seed weight, and yield in soybean

10.21203/rs.2.17481/v1 ◽

2019 ◽

Author(s):

Waltram Ravelombola ◽

Jun Qin ◽

Ainong Shi ◽

Fengmin Wang ◽

Yan Feng ◽

...

Keyword(s):

Association Study ◽

Genomic Selection ◽

Candidate Genes ◽

Plant Height ◽

Seed Weight ◽

Genome Wide Association Study ◽

Agronomic Traits ◽

Snp Markers ◽

Genome Wide Association ◽

Genome Wide

Abstract Background Soybean [Glycine max (L.) Merr.] is a legume of great interest worldwide. Enhancing genetic gain for agronomic traits via molecular approaches has long been considered the main task for soybean breeders and geneticists. The objectives of this study were to evaluate maturity, plant height, seed weight, and yield in a diverse soybean accession panel, to conduct a genome-wide association study (GWAS) for these traits and identify SNP markers associated with the four traits, and to assess genomic selection (GS) accuracy. Results A total of 250 soybean accessions were evaluated for maturity, plant height, seed weight, and yield over three years. This panel was genotyped with a total of 10,259 high-quality SNPs derived from genotyping by sequencing (GBS). GWAS was performed using a Bayesian Information and Linkage Disequilibrium Iteratively Nested Keyway (BLINK) model, and GS was evaluated using a ridge regression best linear unbiased predictor (rrBLUP) model. The results revealed that a total of 20, 31, 37, 31, and 23 SNPs were significantly associated with the average 3-year data for maturity, plant height, seed weight, and yield, respectively; some significant SNPs were mapped into previously described loci (E2, E4, and Dt1) affecting maturity and plant height in soybean, and a new locus mapped on chromosome 20 was significantly associated with plant height; Glyma.10g228900, Glyma.19g200800, Glyma.09g196700, and Glyma.09g038300 were candidate genes found in the vicinity of the top or the second best SNP for maturity, plant height, seed weight, and yield, respectively; an 11.5-Mb region of chromosome 10 was associated with both seed weight and yield; and GS accuracy was trait-, year-, and population structure-dependent.
Conclusions The SNP markers identified in this study for plant height, maturity, seed weight, and yield can be used to improve the four agronomic traits through marker-assisted selection (MAS) and GS in soybean breeding programs. After validation, the candidate genes can be transferred to new cultivars using SNP markers through MAS. The high GS accuracy has confirmed that the four agronomic traits can be selected in molecular breeding through GS.

RAPID COMMUNICATION: Multi-breed validation study unraveled genomic regions associated with puberty traits segregating across tropically adapted breeds

Journal of Animal Science ◽

10.1093/jas/skz121 ◽

2019 ◽

Vol 97 (7) ◽

pp. 3027-3033 ◽

Cited By ~ 4

Author(s):

Thaise P Melo ◽

Marina R S Fortes ◽

Gerardo A Fernandes Junior ◽

Lucia G Albuquerque ◽

Roberto Carvalheiro

Keyword(s):

Candidate Genes ◽

Genome Wide Association Study ◽

Meta Analysis ◽

Reference Population ◽

High Linkage Disequilibrium ◽

P Value ◽

Early Puberty ◽

Sexual Precocity ◽

Tropical Conditions ◽

Genomic Regions

Abstract An efficient strategy to improve QTL detection power is performing across-breed validation studies. Variants segregating across breeds are expected to be in high linkage disequilibrium (LD) with causal mutations affecting economically important traits. The aim of this study was to validate, in a Tropical Composite cattle (TC) population, QTL associations identified for sexual precocity traits in a Nellore and Brahman meta-analysis genome-wide association study. In total, 2,816 TC, 8,001 Nellore, and 2,210 Brahman animals were available for the analysis. To that end, genomic regions significantly associated with puberty traits in the meta-analysis study were validated for the following sexual precocity traits in TC: age at first corpus luteum (AGECL), first postpartum anestrus interval (PPAI), and scrotal circumference at 18 months of age (SC). We considered validated QTL those underpinned by significant markers from the Nellore and Brahman meta-analysis (P ≤ 10⁻⁴) that were also significant for a TC trait, i.e., presenting a P-value ≤ 10⁻³ for AGECL, PPAI, or SC. We also considered as validated QTL those regions where significant markers in the reference population were within ±250 kb of significant markers in the validation population. Using these criteria, 49 SNPs were validated for AGECL, 4 for PPAI, and 14 for SC, of which 5 were in common with AGECL, totaling 62 validated SNPs for these traits and 30 candidate genes surrounding them. Considering just the candidate genes closest to the top SNP of each chromosome, 8 candidate genes were identified for AGECL: COL8A1, PENK, ENSBTAG00000047425, BPNT1, ADAMTS17, CCHCR1, SUFU, and ENSBTAG00000046374. For PPAI, 3 genes emerged as candidates (PCBP3, KCNK10, and MRPS5), and for SC 8 candidate genes were identified (SNORA70, TRAC, ASS1, BPNT1, LRRK1, PKHD1, PTPRM, and ENSBTAG00000045690). Several candidate regions presented here were previously associated with puberty traits in cattle.
The majority of emerging candidate genes are related to biological processes involved in reproductive events, such as maintenance of gestation, and some are known to be expressed in reproductive tissues. Our results suggest that some QTL controlling early puberty are segregating across cattle breeds adapted to tropical conditions.
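
The validation rule described in this abstract can be sketched in a few lines. The thresholds (P ≤ 10⁻⁴ in the reference meta-analysis, P ≤ 10⁻³ in the validation population, ±250 kb window) come from the abstract; the `Marker` record, the function name, and the example markers are hypothetical, and this is an illustration of the stated criteria, not the authors' pipeline.

```python
from dataclasses import dataclass

REF_P_MAX = 1e-4       # reference (Nellore and Brahman meta-analysis) threshold
VAL_P_MAX = 1e-3       # validation (Tropical Composite) threshold
WINDOW_BP = 250_000    # +/-250 kb window around a significant marker


@dataclass(frozen=True)
class Marker:
    snp_id: str
    chrom: str
    pos: int        # base-pair position on the chromosome
    p_value: float


def validated_markers(reference, validation):
    """Return IDs of reference markers that pass the validation rule.

    A reference marker counts as validated if it is significant in the
    reference set and either (a) the same marker is significant in the
    validation set, or (b) a significant validation marker lies within
    WINDOW_BP on the same chromosome.
    """
    val_sig = [m for m in validation if m.p_value <= VAL_P_MAX]
    val_ids = {m.snp_id for m in val_sig}
    validated = []
    for ref in reference:
        if ref.p_value > REF_P_MAX:
            continue
        same_marker = ref.snp_id in val_ids
        nearby = any(
            v.chrom == ref.chrom and abs(v.pos - ref.pos) <= WINDOW_BP
            for v in val_sig
        )
        if same_marker or nearby:
            validated.append(ref.snp_id)
    return validated


# Hypothetical markers, for illustration only.
reference = [
    Marker("snpA", "1", 1_000_000, 5e-5),
    Marker("snpB", "1", 5_000_000, 2e-5),
    Marker("snpC", "2", 1_000_000, 1e-2),   # not significant in reference
]
validation = [
    Marker("snpA", "1", 1_000_000, 5e-4),   # same marker, significant
    Marker("snpX", "1", 5_200_000, 1e-4),   # 200 kb from snpB
    Marker("snpY", "2", 1_100_000, 1e-5),   # near snpC, which fails REF_P_MAX
]
print(validated_markers(reference, validation))
```

The nested scan is quadratic in the number of significant markers, which is acceptable for short lists; a position-sorted index would be the natural change for genome-scale inputs.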

Genome-Wide Association Study Identifying Candidate Genes Influencing Important Agronomic Traits of Flax (Linum usitatissimum L.) Using SLAF-seq

Frontiers in Plant Science ◽

10.3389/fpls.2017.02232 ◽

2018 ◽

Vol 8 ◽

Cited By ~ 10

Author(s):

Dongwei Xie ◽

Zhigang Dai ◽

Zemao Yang ◽

Jian Sun ◽

Debao Zhao ◽

...

Keyword(s):

Association Study ◽

Candidate Genes ◽

Linum Usitatissimum ◽

Genome Wide Association Study ◽

Agronomic Traits ◽

Genome Wide Association ◽

Linum Usitatissimum L ◽

Genome Wide

M-DATA: A statistical approach to jointly analyzing de novo mutations for multiple traits

PLoS Genetics ◽

10.1371/journal.pgen.1009849 ◽

2021 ◽

Vol 17 (11) ◽

pp. e1009849

Author(s):

Yuhan Xie ◽

Mo Li ◽

Weilai Dong ◽

Wei Jiang ◽

Hongyu Zhao

Keyword(s):

Statistical Power ◽

De Novo ◽

Expectation Maximization Algorithm ◽

Joint Analysis ◽

De Novo Mutations ◽

Multiple Traits ◽

Disease Etiology ◽

Functional Annotations ◽

Degree Of Association ◽

Shared Risk

Recent studies have demonstrated that multiple early-onset diseases have shared risk genes, based on findings from de novo mutations (DNMs). Therefore, we may leverage information from one trait to improve statistical power to identify genes for another trait. However, there are few methods that can jointly analyze DNMs from multiple traits. In this study, we develop a framework called M-DATA (Multi-trait framework for De novo mutation Association Test with Annotations) to increase the statistical power of association analysis by integrating data from multiple correlated traits and their functional annotations. Using the number of DNMs from multiple diseases, we develop a method based on an Expectation-Maximization algorithm both to infer the degree of association between two diseases and to estimate the gene association probability for each disease. We apply our method to a case study of jointly analyzing data from congenital heart disease (CHD) and autism. Our method was able to identify 23 genes for CHD from joint analysis, including 12 novel genes, which is substantially more than single-trait analysis, leading to novel insights into CHD disease etiology.

Genomic insights into the origin, domestication and genetic basis of agronomic traits of castor bean

Genome Biology ◽

10.1186/s13059-021-02333-y ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Wei Xu ◽

Di Wu ◽

Tianquan Yang ◽

Chao Sun ◽

Zaiqing Wang ◽

...

Keyword(s):

Castor Bean ◽

Genetic Basis ◽

Genome Wide Association Study ◽

De Novo ◽

Agronomic Traits ◽

Ricinus Communis ◽

Oilseed Crop ◽

A Genome ◽

Trait Locus ◽

Wild Progenitors

Abstract Background Castor bean (Ricinus communis L.) is an important oil crop that belongs to the Euphorbiaceae family. The seed oil of castor bean is currently the only commercial source of ricinoleic acid, which can be used for producing about 2000 industrial products. However, the origin, domestication, and genetic basis of key traits of castor bean remain largely unknown. Results Here we perform a de novo chromosome-level genome assembly of the wild progenitor of castor bean. By resequencing and analyzing 505 worldwide accessions, we reveal that the accessions from East Africa are the extant wild progenitors of castor bean and that domestication occurred ~3200 years ago. We demonstrate that significant genetic differentiation between wild populations in Kenya and Ethiopia is associated with past climate fluctuation in the Turkana depression ~7000 years ago. This dramatic change in climate may have caused the genetic bottleneck in wild castor bean populations. By a genome-wide association study, combined with quantitative trait locus analysis, we identify important candidate genes associated with plant architecture and seed size. Conclusions This study provides novel insights into the domestication and genome evolution of castor bean, which facilitates genomics-based breeding of this important oilseed crop and potentially other tree-like crops in the future.

Genome-wide association study uncovers new genetic loci and candidate genes underlying seed chilling-germination in maize

PeerJ ◽

10.7717/peerj.11707 ◽

2021 ◽

Vol 9 ◽

pp. e11707

Author(s):

Yinchao Zhang ◽

Peng Liu ◽

Chen Wang ◽

Na Zhang ◽

Yuxiao Zhu ◽

...

Keyword(s):

Abiotic Stress ◽

Seed Germination ◽

Association Study ◽

Candidate Genes ◽

Molecular Mechanisms ◽

Genome Wide Association Study ◽

Chilling Stress ◽

Nucleotide Polymorphisms ◽

Plant Tolerance ◽

Multiple Traits

As one of the major crops, maize (Zea mays L.) is mainly distributed in tropical and temperate regions. However, with changing environments, chilling stress has become a significant abiotic stress affecting seed germination and thus the reproduction and biomass accumulation of maize. Herein, we investigated five seed germination-related phenotypes among 300 inbred lines under low-temperature conditions (10 °C). By combining 43,943 single nucleotide polymorphisms (SNPs), a total of 15 significant (P < 2.03 × 10⁻⁶) SNPs were identified as correlated with seed germination under cold stress based on the FarmCPU model in GWAS, among which three loci were repeatedly associated with multiple traits. Ten gene models were closely linked to these three variations, among which Zm00001d010454, Zm00001d010458, Zm00001d010459, and Zm00001d050021 were further verified by candidate gene association study and expression pattern analysis. Importantly, these candidate genes were previously reported to be involved in plant tolerance to chilling stress and other abiotic stresses. Our findings contribute to the understanding of the genetic and molecular mechanisms underlying chilling germination in maize.

A genome‐wide association study approach to the identification of candidate genes underlying agronomic traits in alfalfa (Medicago sativa L.)

Plant Biotechnology Journal ◽

10.1111/pbi.13251 ◽

2019 ◽

Vol 18 (3) ◽

pp. 611-613 ◽

Cited By ~ 2

Author(s):

Zan Wang ◽

Xuemin Wang ◽

Han Zhang ◽

Lin Ma ◽

Haiming Zhao ◽

...

Keyword(s):

Medicago Sativa ◽

Association Study ◽

Candidate Genes ◽

Genome Wide Association Study ◽

Agronomic Traits ◽

Genome Wide Association ◽

Study Approach ◽

Genome Wide ◽

A Genome ◽

Medicago Sativa L

Pathway-based analysis of anthocyanin diversity in diploid potato

PLoS ONE ◽

10.1371/journal.pone.0250861 ◽

2021 ◽

Vol 16 (4) ◽

pp. e0250861

Author(s):

Maria Angelica Parra-Galindo ◽

Johana Carolina Soto-Sedano ◽

Teresa Mosquera-Vásquez ◽

Federico Roda

Keyword(s):

Genome Wide Association Study ◽

Agronomic Traits ◽

Anthocyanin Biosynthesis ◽

Genomic Region ◽

Myb Transcription Factor ◽

Anthocyanin Content ◽

Gene Encoding ◽

Potential Health ◽

A Genome ◽

Dioxygenase Gene

Anthocyanin biosynthesis is one of the most studied pathways in plants due to the important ecological role played by these compounds and the potential health benefits of anthocyanin consumption. Given the interest in identifying new genetic factors underlying anthocyanin content, we studied a diverse collection of diploid potatoes by combining a genome-wide association study and pathway-based analyses. By using an expanded SNP dataset, we identified candidate genes that had not been associated with anthocyanin variation in potatoes, namely a Myb transcription factor, a leucoanthocyanidin dioxygenase gene, and a vacuolar membrane protein. Importantly, a genomic region on chromosome 10 harbored the SNPs with the strongest associations with anthocyanin content in GWAS. Some of these SNPs were associated with multiple anthocyanin compounds and therefore could underline the existence of pleiotropic genes or anthocyanin biosynthetic clusters. We identified multiple anthocyanin homologs in this genomic region, including four transcription factors and five enzymes that could be governing anthocyanin variation. For instance, a SNP linked to the phenylalanine ammonia-lyase gene, encoding the first enzyme in the phenylpropanoid biosynthetic pathway, was associated with all five of the anthocyanins measured. Finally, we combined a pathway analysis and GWAS of other agronomic traits to identify pathways related to anthocyanin biosynthesis in potatoes. We found that methionine metabolism and the production of sugars and hydroxycinnamic acids are genetically correlated with anthocyanin biosynthesis. The results contribute to the understanding of anthocyanin regulation in potatoes and can be used in future breeding programs focused on nutraceutical food.

Meta-analysis of 2,104 trios provides support for 10 novel candidate genes for intellectual disability

10.1101/052670 ◽

2016 ◽

Cited By ~ 2

Author(s):

Stefan H. Lelieveld ◽

Margot R.F. Reijnders ◽

Rolph Pfundt ◽

Helger G. Yntema ◽

Erik-Jan Kamsteeg ◽

...

Keyword(s):

Intellectual Disability ◽

Candidate Genes ◽

De Novo ◽

Meta Analysis ◽

Statistical Analyses ◽

Id Genes

ABSTRACT To identify novel candidate intellectual disability genes, we performed a meta-analysis on 2,637 de novo mutations, identified from the exomes of 2,104 ID trios. Statistical analyses identified 10 novel candidate ID genes, including DLG4, PPM1D, RAC1, SMAD6, SON, SOX5, SYNCRIP, TCF20, TLK2 and TRIP12. In addition, we show that these genes are intolerant to non-synonymous variation, and that mutations in these genes are associated with specific clinical ID phenotypes.
