Trait Association and Prediction Through Integrative K-mer Analysis

2021 ◽  
Author(s):  
Cheng He ◽  
Jacob D Washburn ◽  
Yangfan Hao ◽  
Zhiwu Zhang ◽  
Jinliang Yang ◽  
...  

Genome-wide association study (GWAS) with single nucleotide polymorphisms (SNPs) has been widely used to explore genetic controls of phenotypic traits. Here we employed a GWAS approach using k-mers, short substrings from sequencing reads. Using maize cob and kernel color traits, we demonstrated that k-mer GWAS can effectively identify associated k-mers. Co-expression analysis of kernel color k-mers and pathway genes directly found k-mers from causal genes. Analyzing complex traits of kernel oil and leaf angle resulted in k-mers from both known and candidate genes. Evolution analysis revealed most k-mers positively correlated with kernel oil were strongly selected against in maize populations, while most k-mers for upright leaf angle were positively selected. In addition, phenotypic prediction of kernel oil, leaf angle, and flowering time using k-mer data showed at least a similarly high prediction accuracy to the standard SNP-based method. Collectively, our results demonstrated the bridging role of k-mers for data integration and functional gene discovery.
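As a rough illustration of what "k-mers, short substrings from sequencing reads" means in practice, the following minimal sketch counts every length-k substring across a handful of reads. The function name `count_kmers`, the toy reads, and the choice of k = 3 are illustrative assumptions; the abstract does not state the k-mer length or the counting tool the authors used, and real pipelines typically also collapse each k-mer with its reverse complement.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across a list of reads.

    Hypothetical helper for illustration only; not the authors' pipeline.
    """
    counts = Counter()
    for read in reads:
        # Slide a window of width k along the read, one base at a time.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Toy reads (made up for the example).
reads = ["ACGTAC", "CGTACG"]
kmer_counts = count_kmers(reads, k=3)
```

In a k-mer GWAS setting, a table of such counts (or presence/absence calls) per sample would stand in for the SNP genotype matrix when testing association with a trait.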

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Alvaro N. Barbeira ◽  
Rodrigo Bonazzola ◽  
Eric R. Gamazon ◽  
Yanyu Liang ◽  
...  

The resources generated by the GTEx consortium offer unprecedented opportunities to advance our understanding of the biology of human diseases. Here, we present an in-depth examination of the phenotypic consequences of transcriptome regulation and a blueprint for the functional interpretation of genome-wide association study-discovered loci. Across a broad set of complex traits and diseases, we demonstrate widespread dose-dependent effects of RNA expression and splicing. We develop a data-driven framework to benchmark methods that prioritize causal genes and find no single approach outperforms the combination of multiple approaches. Using colocalization and association approaches that take into account the observed allelic heterogeneity of gene expression, we propose potential target genes for 47% (2519 out of 5385) of the GWAS loci examined.


2021 ◽  
Vol 15 ◽  
Author(s):  
Hayley S. Mountford ◽  
Amanda Hill ◽  
Anna L. Barnett ◽  
Dianne F. Newbury

The ability to finely control our movement is key to achieving many of the educational milestones and life-skills we develop throughout our lives. Despite the centrality of coordination to early development, there is a vast gap in our understanding of the underlying biology. Like most complex traits, motor coordination is influenced by both genetics and environment; however, the specific genes, early environmental risk factors, and molecular pathways are unknown. Previous studies have shown that about 5% of school-age children experience unexplained difficulties with motor coordination. These children are said to have Developmental Coordination Disorder (DCD). For children with DCD, these motor coordination difficulties significantly impact their everyday life and learning. DCD is associated with poorer academic achievement and reduced quality of life, and it can constrain career opportunities and increase the risk of mental health issues in adulthood. Despite the high prevalence of coordination difficulties, many children remain undiagnosed by healthcare professionals. Compounding under-diagnosis in the clinic, research into the etiology of DCD is severely underrepresented in the literature. Here we present the first genome-wide association study to examine the genetic basis of early motor coordination in the context of motor difficulties. Using data from the Avon Longitudinal Study of Parents and Children, we generate a derived measure of motor coordination from four components of the Movement Assessment Battery for Children, providing an overall measure of coordination across the full range of ability. We perform the first genome-wide association analysis focused on motor coordination (N = 4542). No single nucleotide polymorphisms (SNPs) met the threshold for genome-wide significance; however, 59 SNPs showed suggestive associations. Three regions contained multiple suggestively associated SNPs, within five preliminary candidate genes: IQSEC1, LRCC1, SYNJ2B2, ADAM20, and ADAM21.
Association to the gene IQSEC1 suggests a potential link to axon guidance and dendritic projection processes as a potential underlying mechanism of motor coordination difficulties. This represents an interesting potential mechanism, and whilst further validation is essential, it generates a direct window into the biology of motor coordination difficulties. This research has identified potential biological drivers of DCD, a first step towards understanding this common, yet neglected neurodevelopmental disorder.


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Y. Tilahun ◽  
T. A. Gipson ◽  
T. Alexander ◽  
M. L. McCallum ◽  
P. R. Hoyt

This paper reports an exploratory study based on quantitative genomic analysis in dairy traits of American Alpine goats. The dairy traits are quality-determining components in goat milk, cheese, ice cream, etc. Alpine goat phenotypes for quality components have been routinely recorded for many years and deposited in the Council on Dairy Cattle Breeding (CDCB) repository. The data collected were used to conduct an exploratory genome-wide association study (GWAS) from 72 female Alpine goats originating from locations throughout the U.S. Genotypes were identified with the Illumina Goat 50K single-nucleotide polymorphism (SNP) BeadChip. The analysis used a polygenic model where the dropping criterion was a call rate ≥ 0.95. The initial dataset was composed of ~60,000 rows of SNPs and 21 columns of phenotypic traits and composed of 53,384 scaffolds containing other informative data points used for genomic predictive power. Phenotypic association with the 50K BeadChip revealed 26,074 reads of candidate genes. These candidate genes segregated as separate novel SNPs and were identified as statistically significant regions for genome and chromosome level trait associations. Candidate genes associated differently for each of the following phenotypic traits: test day milk yield (13,469 candidate genes), test day protein yield (25,690 candidate genes), test day fat yield (25,690 candidate genes), percentage protein (25,690 candidate genes), percentage fat (25,690 candidate genes), and percentage lactose content (25,690 candidate genes). The outcome of this study supports elucidation of novel genes that are important for livestock species in association to key phenotypic traits. Validation towards the development of marker-based selection that provides precision breeding methods will thereby increase the breeding value.


2019 ◽  
Author(s):  
Alvaro N Barbeira ◽  
Rodrigo Bonazzola ◽  
Eric R Gamazon ◽  
Yanyu Liang ◽  
YoSon Park ◽  
...  

The resources generated by the GTEx consortium offer unprecedented opportunities to advance our understanding of the biology of human diseases. Here, we present an in-depth examination of the phenotypic consequences of transcriptome regulation and a blueprint for the functional interpretation of genome-wide association study-discovered loci. Across a broad set of complex traits and diseases, we demonstrate widespread dose-dependent effects of RNA expression and splicing. We develop a data-driven framework to benchmark methods that prioritize causal genes and find no single approach outperforms the combination of multiple approaches. Using colocalization and association approaches that take into account the observed allelic heterogeneity of gene expression, we propose potential target genes for 47% (2,519 out of 5,385) of the GWAS loci examined. Our results demonstrate the translational relevance of the GTEx resources and highlight the need to increase their resolution and breadth to further our understanding of the genotype-phenotype link.


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 1002
Author(s):  
Yagoub Adam ◽  
Chaimae Samtal ◽  
Jean-tristan Brandenburg ◽  
Oluwadamilare Falola ◽  
Ezekiel Adebiyi

Genome-wide association studies (GWAS) provide a large amount of information on statistically significant single-nucleotide polymorphisms (SNPs) associated with various human complex traits and diseases. By performing GWAS, scientists have successfully identified the association of hundreds of thousands to millions of SNPs to a single phenotype. Moreover, the association of some SNPs with rare diseases has been intensively tested. However, classic GWAS have not yet provided solid, knowledgeable insight into functional and biological mechanisms underlying phenotypes or mechanisms of diseases. Therefore, several post-GWAS (pGWAS) methods have been recommended. Currently, there is no simple scientific document to provide a quick guide for performing pGWAS analysis. pGWAS is a crucial step for a better understanding of the biological machinery beyond the SNPs. Here, we provide an overview of performing pGWAS analysis and demonstrate the challenges behind each method. Furthermore, we direct readers to key articles for each pGWAS method and present the overall issues in pGWAS analysis. Finally, we include a custom pGWAS pipeline to guide new users when performing their research.


2021 ◽  
Vol 22 (13) ◽  
pp. 6722
Author(s):  
Do Yoon Hyun ◽  
Raveendar Sebastin ◽  
Gi-An Lee ◽  
Kyung Jun Lee ◽  
Seong-Hoon Kim ◽  
...  

Melon (Cucumis melo L.) is an economically important horticultural crop with abundant morphological and genetic variability. Complex genetic variations exist even among melon varieties and remain unclear to date. Therefore, unraveling the genetic variability among the three different melon varieties, muskmelon (C. melo subsp. melo), makuwa (C. melo L. var. makuwa), and cantaloupes (C. melo subsp. melo var. cantalupensis), could provide a basis for evolutionary research. In this study, we attempted a systematic approach with genotyping-by-sequencing (GBS)-derived single nucleotide polymorphisms (SNPs) to reveal the genetic structure and diversity, haplotype differences, and marker-based variety differentiation. A total of 6406 GBS-derived SNPs were selected for the diversity analysis, in which the muskmelon varieties showed higher heterozygote SNPs. Linkage disequilibrium (LD) decay varied significantly among the three melon varieties, in which more rapid LD decay was observed in muskmelon (r² = 0.25) varieties. The Bayesian phylogenetic tree provided the intraspecific relationships among the three melon varieties that formed, as expected, individual clusters exhibiting the greatest genetic distance based on the posterior probability. The haplotype analysis also supported the phylogeny result by generating three major networks for 48 haplotypes. Further investigation for variety discrimination allowed us to detect a total of 52 SNP markers that discriminated muskmelon from makuwa varieties, of which two SNPs were converted into cleaved amplified polymorphic sequence markers for practical use. In addition to these markers, the genome-wide association study identified two SNPs located in the genes on chromosome 6, which were significantly associated with the phenotypic traits of melon seed. This study demonstrated that a systematic approach using GBS-derived SNPs could serve to efficiently classify and manage the melon varieties in the genebank.
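For readers unfamiliar with the LD statistic quoted above, a minimal sketch of the standard pairwise r² calculation follows. It assumes phased haplotypes coded 0/1 at two biallelic loci, which is a simplification: GBS data are unphased genotypes, and the abstract does not say which software or estimator the authors used. The function name `ld_r2` and the toy haplotypes are hypothetical.

```python
def ld_r2(hap_a, hap_b):
    """Squared correlation (r^2) between two biallelic loci.

    hap_a and hap_b are equal-length lists of 0/1 alleles, one entry per
    phased haplotype. Illustrative helper, not the authors' method.
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n                       # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n                       # frequency of allele 1 at locus B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Toy haplotypes (made up): perfectly linked loci, then independent loci.
r2_linked = ld_r2([0, 1, 0, 1], [0, 1, 0, 1])
r2_independent = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

LD decay is then usually summarized by plotting such pairwise r² values against the physical distance between markers and noting where the curve falls below a chosen level.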


2020 ◽  
Vol 24 (8) ◽  
pp. 836-843
Author(s):  
A. Y. Krivoruchko ◽  
O. A. Yatsyk ◽  
E. Y. Safaryan

Genome-wide association studies allow identification of loci and polymorphisms associated with the formation of relevant phenotypes. When conducting a full genome analysis of sheep, particularly promising is the study of individuals with outstanding productivity indicators – exhibition animals, representatives of the super-elite class. The aim of this study was to identify new candidate genes for economically valuable traits based on the search for single nucleotide polymorphisms (SNPs) associated with belonging to different evaluation classes in rams of the Russian meat merino breed. Animal genotyping was performed using Ovine Infinium HD BeadChip 600K DNA, and the association search was performed using PLINK v. 1.07 software. Highly reliable associations were found between animals belonging to different evaluation classes and the frequency of occurrence of individual SNPs on chromosomes 2, 6, 10, 13, and 20. Most of the substitutions with high association reliability are concentrated on chromosome 10 in the region 10:30859297–31873769. To search for candidate genes, 15 polymorphisms with the highest association reliability were selected (–log10(p) > 9). Determining the location of the analyzed SNPs relative to the latest annotation Oar_rambouillet_v1.0 allowed us to identify 11 candidate genes presumably associated with the formation of a complex of phenotypic traits of animals in the exhibition group: RXFP2, ALOX5AP, MEDAG, OPN5, PRDM5, PTPRT, TRNAS-GGA, EEF1A1, FRY, ZBTB21-like, and B3GLCT-like. The listed genes encode proteins involved in the control of the cell cycle and DNA replication, regulation of cell proliferation and apoptosis, lipid and carbohydrate metabolism, the development of the inflammatory process and the work of circadian rhythms. Thus, the candidate genes under consideration can influence the formation of exterior features and productive qualities of sheep.
However, further research is needed to confirm the influence of genes and determine the exact mechanisms for implementing this influence on the phenotype.
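The selection rule quoted in this abstract, –log10(p) > 9, can be sketched as a simple filter over per-SNP p-values. The SNP identifiers and p-values below are invented for illustration, and `significant_snps` is a hypothetical helper rather than a PLINK function; only the threshold of 9 comes from the abstract.

```python
import math


def significant_snps(p_values, threshold=9.0):
    """Return SNP IDs whose -log10(p) exceeds the threshold.

    p_values maps SNP ID -> association p-value. Illustrative only.
    """
    selected = []
    for snp_id, p in p_values.items():
        # Skip non-positive p-values, for which log10 is undefined.
        if p > 0 and -math.log10(p) > threshold:
            selected.append(snp_id)
    return selected


# Made-up p-values: only the first is below 1e-9.
example_p_values = {"snp1": 1e-10, "snp2": 5e-9, "snp3": 1e-5}
hits = significant_snps(example_p_values)
```

A threshold of –log10(p) > 9 is the same as requiring p < 1e-9, which is stricter than the conventional genome-wide significance level of 5e-8.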


2020 ◽  
Vol 11 ◽  
Author(s):  
Waldiodio Seck ◽  
Davoud Torkamaneh ◽  
François Belzile

Increasing the understanding of the genetic basis of the variability in root system architecture (RSA) is essential to improve resource-use efficiency in agriculture systems and to develop climate-resilient crop cultivars. Roots being underground, their direct observation and detailed characterization are challenging. Here, we characterized twelve RSA-related traits in a panel of 137 early maturing soybean lines (Canadian soybean core collection) using rhizoboxes and two-dimensional imaging. Significant phenotypic variation (P < 0.001) was observed among these lines for different RSA-related traits. This panel was genotyped with 2.18 million genome-wide single-nucleotide polymorphisms (SNPs) using a combination of genotyping-by-sequencing and whole-genome sequencing. A total of 10 quantitative trait locus (QTL) regions were detected for root total length and primary root diameter through a comprehensive genome-wide association study. These QTL regions explained from 15 to 25% of the phenotypic variation and contained two putative candidate genes with homology to genes previously reported to play a role in RSA in other species. These genes can serve to accelerate future efforts aimed to dissect genetic architecture of RSA and breed more resilient varieties.


2021 ◽  
Vol 7 (11) ◽  
pp. eabd1239
Author(s):  
Mark Simcoe ◽  
Ana Valdes ◽  
Fan Liu ◽  
Nicholas A. Furlotte ◽  
David M. Evans ◽  
...  

Human eye color is highly heritable, but its genetic architecture is not yet fully understood. We report the results of the largest genome-wide association study for eye color to date, involving up to 192,986 European participants from 10 populations. We identify 124 independent associations arising from 61 discrete genomic regions, including 50 previously unidentified. We find evidence for genes involved in melanin pigmentation, but we also find associations with genes involved in iris morphology and structure. Further analyses in 1636 Asian participants from two populations suggest that iris pigmentation variation in Asians is genetically similar to Europeans, albeit with smaller effect sizes. Our findings collectively explain 53.2% (95% confidence interval, 45.4 to 61.0%) of eye color variation using common single-nucleotide polymorphisms. Overall, our study outcomes demonstrate that the genetic complexity of human eye color considerably exceeds previous knowledge and expectations, highlighting eye color as a genetically highly complex human trait.


Agronomy ◽  
2020 ◽  
Vol 11 (1) ◽  
pp. 27
Author(s):  
Archana Khadgi ◽  
Courtney A. Weber

Red raspberry (Rubus idaeus L.) is an expanding high-value berry crop worldwide. The presence of prickles, outgrowths of epidermal tissues lacking vasculature, on the canes, petioles, and undersides of leaves complicates both field management and harvest. The utilization of cultivars with fewer prickles or prickle-free canes simplifies production. A previously generated population segregating for prickles utilizing the s locus between the prickle-free cultivar Joan J (ss) and the prickled cultivar Caroline (Ss) was analyzed to identify the genomic region associated with prickle development in red raspberry. Genotype by sequencing (GBS) was combined with a genome-wide association study (GWAS) using fixed and random model circulating probability unification (FarmCPU) to analyze 8474 single nucleotide polymorphisms (SNPs) and identify significant markers associated with the prickle-free trait. A total of four SNPs were identified on chromosome 4 that were associated with the phenotype and were located near or in annotated genes. This study demonstrates how association genetics can be used to decipher the genetic control of important horticultural traits in Rubus, and provides valuable information about the genomic region and potential genes underlying the prickle-free trait.

