Leveraging breeding programs and genomic data in Norway spruce (Picea abies L. Karst) for GWAS analysis

2021
Vol 22 (1)
Author(s):
Zhi-Qiang Chen
Yanjun Zan
Pascal Milesi
Linghua Zhou
Jun Chen
...  

Abstract Background Genome-wide association studies (GWAS) identify loci underlying the variation of complex traits. One of the main limitations of GWAS is the availability of reliable phenotypic data, particularly for long-lived tree species. Although an extensive amount of phenotypic data already exists in breeding programs, accounting for its high heterogeneity is a great challenge. We combine spatial and factor-analytic analyses to standardize the heterogeneous data from 120 field experiments of 483,424 progenies of Norway spruce to implement the largest reported GWAS for trees using 134,605 SNPs from exome sequencing of 5056 parental trees. Results We identify 55 novel quantitative trait loci (QTLs) that are associated with phenotypic variation. The largest number of QTLs is associated with the budburst stage, followed by diameter at breast height, wood quality, and frost damage. Two QTLs with the largest effect have a pleiotropic effect for budburst stage, frost damage, and diameter and are associated with MAP3K genes. Genotype data called from exome capture, a recently developed SNP array, and gene expression data indirectly support this discovery. Conclusion Several important QTLs associated with growth and frost damage have been verified in several southern and northern progeny plantations, indicating that these loci can be used in QTL-assisted genomic selection. Our study also demonstrates that existing heterogeneous phenotypic data from breeding programs, collected over several decades, are an important source for GWAS and that such integration into GWAS should be a major area of inquiry in the future.

BMC Genomics
2019
Vol 20 (1)
Author(s):
Fernando P. Guerra
Haktan Suren
Jason Holliday
James H. Richards
Oliver Fiehn
...  

Abstract Background Populus trichocarpa is an important forest tree species for the generation of lignocellulosic ethanol. Understanding the genomic basis of biomass production and chemical composition of wood is fundamental in supporting genetic improvement programs. Considerable variation has been observed in this species for complex traits related to growth, phenology, ecophysiology and wood chemistry. Those traits are influenced by both polygenic control and environmental effects, and their genome architecture and regulation are only partially understood. Genome-wide association studies (GWAS) represent an approach to advance that aim using thousands of single nucleotide polymorphisms (SNPs). Genotyping using exome capture methodologies represents an efficient approach to identify specific functional regions of genomes underlying phenotypic variation. Results We identified 813 K SNPs, which were utilized for genotyping 461 P. trichocarpa clones, representing 101 provenances collected from Oregon and Washington, and established in California. A GWAS performed on 20 traits, considering single SNP-marker tests, identified a variable number of significant SNPs (p-value < 6.1479E-8) in association with diameter, height, leaf carbon and nitrogen contents, and δ15N. The number of significant SNPs ranged from 2 to 220 per trait. Additionally, multiple-marker analyses by sliding-window tests detected between 6 and 192 significant windows for the analyzed traits. The significant SNPs resided within genes that encode proteins belonging to different functional classes such as protein synthesis, energy/metabolism and DNA/RNA metabolism, among others. Conclusions SNP-markers within genes associated with traits of importance for biomass production were detected. They help characterize the genomic architecture of P. trichocarpa biomass required to support the development and application of marker breeding technologies.
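The significance cutoff quoted in this abstract (p-value < 6.1479E-8) is numerically close to what a Bonferroni correction at a family-wise level of 0.05 would give for roughly 813 thousand tests. The abstract does not name the correction used, so the sketch below only illustrates that arithmetic under the assumption of a Bonferroni adjustment; the function names and the value alpha = 0.05 are illustrative and do not come from the paper.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_number_of_tests(alpha: float, threshold: float) -> float:
    """Number of tests a Bonferroni cutoff would correspond to (inverse of the above)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# Cutoff reported in the abstract; alpha = 0.05 is an assumption, not stated there.
REPORTED_THRESHOLD = 6.1479e-8
ASSUMED_ALPHA = 0.05

n_implied = implied_number_of_tests(ASSUMED_ALPHA, REPORTED_THRESHOLD)

if __name__ == "__main__":
    # Prints a value of roughly 813 thousand, in line with the "813 K SNPs" in the abstract.
    print(f"Implied number of tests: {n_implied:,.0f}")
```

If the assumption holds, the cutoff simply scales inversely with the number of SNPs tested, which is why such thresholds become very strict for dense marker sets.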


2020
Vol 10 (1)
Author(s):
J. Baison
Linghua Zhou
Nils Forsberg
Tommy Mörling
Thomas Grahn
...  

Abstract Through the use of genome-wide association study (GWAS) mapping it is possible to establish the genetic basis of phenotypic trait variation. Our GWAS presents the first such effort in Norway spruce (Picea abies (L.) Karst.) for traits related to wood tracheid characteristics. The study employed an exome capture genotyping approach that generated 178 101 Single Nucleotide Polymorphisms (SNPs) from 40 018 probes within a population of 517 Norway spruce mother trees. We applied a least absolute shrinkage and selection operator (LASSO) based association mapping method using a functional multi-locus mapping approach, with a stability selection probability method as the hypothesis testing approach to determine significant Quantitative Trait Loci (QTLs). The analysis has provided 30 significant associations, the majority of which show specific expression in wood-forming tissues or high ubiquitous expression, potentially controlling tracheid dimensions, cell wall thickness and microfibril angle. Among the most promising candidates based on our results and prior information for other species are: Picea abies BIG GRAIN 2 (PabBG2) with a predicted function in auxin transport and sensitivity, and MA_373300g0010 encoding a protein similar to wall-associated receptor kinases, which were both associated with cell wall thickness. The results demonstrate the feasibility of GWAS to identify novel candidate genes controlling industrially-relevant tracheid traits in Norway spruce.


2021
Vol 118 (25)
pp. e2023184118
Author(s):
Yuchang Wu
Xiaoyuan Zhong
Yunong Lin
Zijie Zhao
Jiawen Chen
...  

Marginal effect estimates in genome-wide association studies (GWAS) are mixtures of direct and indirect genetic effects. Existing methods to dissect these effects require family-based, individual-level genetic, and phenotypic data with large samples, which is difficult to obtain in practice. Here, we propose a statistical framework to estimate direct and indirect genetic effects using summary statistics from GWAS conducted on own and offspring phenotypes. Applied to birth weight, our method showed nearly identical results with those obtained using individual-level data. We also decomposed direct and indirect genetic effects of educational attainment (EA), which showed distinct patterns of genetic correlations with 45 complex traits. The known genetic correlations between EA and higher height, lower body mass index, less-active smoking behavior, and better health outcomes were mostly explained by the indirect genetic component of EA. In contrast, the consistently identified genetic correlation of autism spectrum disorder (ASD) with higher EA resides in the direct genetic component. A polygenic transmission disequilibrium test showed a significant overtransmission of the direct component of EA from healthy parents to ASD probands. Taken together, we demonstrate that traditional GWAS approaches, in conjunction with offspring phenotypic data collection in existing cohorts, could greatly benefit studies on genetic nurture and shed important light on the interpretation of genetic associations for human complex traits.


2018
Author(s):
Pascal Milesi
Mats Berlin
Jun Chen
Marion Orsucci
Lili Li
...  

Abstract Norway spruce (Picea abies) is a dominant conifer species of major economic importance in Northern Europe. Extensive breeding programs were established to improve phenotypic traits of interest. In southern Sweden, seeds used to create progeny tests were collected from about 3000 trees of outstanding phenotype (“plus” trees) across the region. Some were of local origin but many were recent introductions from the rest of the natural range. The mixed origin of the trees together with partial sequencing of the exome of >1,500 of these trees and phenotypic data retrieved from the Swedish breeding program offered us a unique opportunity to dissect the genetic basis of local adaptation of three quantitative traits (height, diameter and budburst). Through a combination of multivariate analyses and genome-wide association studies, we showed that there was a very strong effect of geographical origin on growth (height and diameter) and phenology (budburst) with trees from southern origins outperforming local provenances. Association studies also indicated that growth traits were highly polygenic and budburst somewhat less. Hence, our results suggest that assisted gene flow and genomic selection approaches could help alleviate the effect of climate change on P. abies breeding programs in Sweden.


2019
Author(s):
Fernando P. Guerra
Haktan Suren
Jason Holliday
James H. Richards
Oliver Fiehn
...  

Abstract Background: Populus trichocarpa is an important forest tree species for the generation of lignocellulosic ethanol. Understanding the genomic basis of biomass production and chemical composition of wood is fundamental in supporting genetic improvement programs. Considerable variation has been observed in this species for complex traits related to growth, phenology, ecophysiology and wood chemistry. Those traits are influenced by both polygenic control and environmental effects, and their genome architecture and regulation are only partially understood. Genome-wide association studies (GWAS) represent an approach to advance that aim using thousands of single nucleotide polymorphisms (SNPs). Genotyping using exome capture methodologies represents an efficient approach to perform GWAS. Results: A GWAS using 461 P. trichocarpa clones, representing 101 provenances collected from Oregon and Washington, and 813K single nucleotide polymorphisms (SNPs), identified a variable number of significant SNPs in association with the assessed traits. Associated single-markers (q < 0.1) ranged from 3 to 110 per trait. The SNPs had a cumulative effect of up to 40.6% of the phenotypic variation of any given trait. Similarly, multiple-marker analyses detected between 16 and 291 significant windows for the phenotypes. The SNPs resided within genes that encode proteins belonging to different functional classes as well as in intergenic regions. Conclusion: SNP-markers within and proximal to genes associated with traits of importance for biomass production were detected. They help characterize the genomic architecture of P. trichocarpa biomass required to support the development and application of marker breeding technologies.
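This version of the abstract reports associations at q < 0.1, that is, with a false discovery rate criterion instead of a fixed p-value cutoff. The abstract does not say how the q-values were computed. As a minimal sketch, assuming the common Benjamini-Hochberg adjustment (one of several ways to obtain q-values, swapped in here for illustration), the calculation looks like this; the function name and the toy p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values for four hypothetical markers (not data from the study).
toy_p = [0.001, 0.02, 0.03, 0.5]
toy_q = benjamini_hochberg(toy_p)
significant = [i for i, q in enumerate(toy_q) if q < 0.1]

if __name__ == "__main__":
    print("adjusted:", toy_q)
    print("markers passing q < 0.1:", significant)
```

In this toy example the first three markers pass the q < 0.1 criterion, whereas a Bonferroni cutoff of 0.05 / 4 would keep only the first one, which illustrates why FDR control tends to report more associations.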


Author(s):  
Yuchang Wu
Xiaoyuan Zhong
Yunong Lin
Zijie Zhao
Jiawen Chen
...  

Abstract Marginal effect estimates in genome-wide association studies (GWAS) are mixtures of direct and indirect genetic effects. Existing methods to dissect these effects require family-based, individual-level genetic and phenotypic data with large samples, which is difficult to obtain in practice. Here, we propose a novel statistical framework to estimate direct and indirect genetic effects using summary statistics from GWAS conducted on own and offspring phenotypes. Applied to birth weight, our method showed nearly identical results with those obtained using individual-level data. We also decomposed direct and indirect genetic effects of educational attainment (EA), which showed distinct patterns of genetic correlations with 45 complex traits. The known genetic correlations between EA and higher height, lower BMI, less active smoking behavior, and better health outcomes were mostly explained by the indirect genetic component of EA. In contrast, the consistently identified genetic correlation of autism spectrum disorder (ASD) with higher EA resides in the direct genetic component. Polygenic transmission disequilibrium test showed a significant over-transmission of the direct component of EA from healthy parents to ASD probands. Taken together, we demonstrate that traditional GWAS approaches, in conjunction with offspring phenotypic data collection in existing cohorts, could greatly benefit studies on genetic nurture and shed important light on the interpretation of genetic associations for human complex traits.


2021
Vol 11 (1)
Author(s):
Chao-Yu Guo
Reng-Hong Wang
Hsin-Chou Yang

Abstract After the genome-wide association studies (GWAS) era, whole-genome sequencing is highly engaged in identifying the association of complex traits with rare variations. A score-based variance-component test has been proposed to identify common and rare genetic variants associated with complex traits while quickly adjusting for covariates. Such a kernel score statistic allows for familial dependencies and adjusts for random confounding effects. However, the etiology of complex traits may involve the effects of genetic and environmental factors and the complex interactions between genes and the environment. Therefore, in this research, a novel method is proposed to detect gene and gene-environment interactions in a complex family-based association study with various correlated structures. We also developed an R function for the Fast Gene-Environment Sequence Kernel Association Test (FGE-SKAT), which is freely available as supplementary material for easy GWAS implementation to unveil such family-based joint effects. Simulation studies confirmed the validity of the new strategy and its superior statistical power. The FGE-SKAT was applied to the whole genome sequence data provided by Genetic Analysis Workshop 18 (GAW18) and discovered concordant and discordant regions compared to the methods without considering gene by environment interactions.


Nature
2021
Vol 590 (7845)
pp. 290-299
Author(s):
Daniel Taliun
Daniel N. Harris
Michael D. Kessler
Jedidiah Carlson
...  

Abstract The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.


Animals
2021
Vol 11 (3)
pp. 599
Author(s):  
Miguel A. Gutierrez-Reinoso
Pedro M. Aponte
Manuel Garcia-Herreros

Genomics comprises a set of current and valuable technologies implemented as selection tools in dairy cattle commercial breeding programs. The intensive progeny testing for production and reproductive traits based on genomic breeding values (GEBVs) has been crucial to increasing dairy cattle productivity. The knowledge of key genes and haplotypes, including their regulation mechanisms, as markers for productivity traits, may improve present and future strategies for dairy cattle selection. Genome-wide association studies (GWAS) such as quantitative trait loci (QTL), single nucleotide polymorphisms (SNPs), or single-step genomic best linear unbiased prediction (ssGBLUP) methods have already been included in global dairy programs for the estimation of marker-assisted selection-derived effects. The increase in genetic progress based on genomic prediction accuracy has also contributed to the understanding of genetic effects in dairy cattle offspring. However, crossing within inbred lines critically increased homozygosis, with accumulated negative effects of inbreeding such as a decline in reproductive performance. Thus, inaccurate, biased estimations based on empirical-conventional models of dairy production systems face an increased risk of providing suboptimal results derived from errors in the selection of candidates of high genetic merit based just on low-heritability phenotypic traits. This extends the generation intervals and increases costs due to the significant reduction of genetic gains. The remarkable progress of genomic prediction increases the accurate selection of superior candidates. The scope of the present review is to summarize and discuss the advances and challenges of genomic tools for dairy cattle selection for optimizing breeding programs and controlling negative inbreeding depression effects on productivity and, consequently, achieving cost-effective advances in food production efficiency. Particular attention is given to the potential genomic selection-derived results to facilitate precision management on modern dairy farms, including an overview of novel genome editing methodologies as perspectives toward the future.

