CottonGVD: A Comprehensive Genomic Variation Database for Cultivated Cottons

2021 ◽  
Vol 12 ◽  
Author(s):  
Zhen Peng ◽  
Hongge Li ◽  
Gaofei Sun ◽  
Panhong Dai ◽  
Xiaoli Geng ◽  
...  

Cultivated cottons are among the most important economic crops, producing natural fiber for the textile industry. In recent years, the genetic basis of several essential traits in cultivated cottons has been gradually elucidated by decoding their genomic variations. Although abundant resequencing data are publicly available, a comprehensive tool for presenting genomic variations and genome-wide association study (GWAS) results is still lacking. To help cotton researchers use these data efficiently and conveniently, we constructed the cotton genomic variation database (CottonGVD; http://120.78.174.209/ or http://db.cngb.org/cottonGVD). This database contains the published genomic information of three cultivated cotton species, the corresponding population variations (SNP and InDel markers), and visualized GWAS results for major traits. Various built-in genomic tools help users retrieve, browse, and query the variations. The database also provides interactive maps (e.g., Manhattan map, scatter plot, heatmap, and linkage disequilibrium block) to display GWAS and expression GWAS results. Cotton researchers can easily visualize the phenotype-associated loci they are interested in and screen for candidate genes. Moreover, CottonGVD will continue to be updated with more data and functions.
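
As a rough illustration of what a Manhattan map displays, the sketch below converts association p-values to -log10 scores and flags loci that pass a Bonferroni threshold. It is not CottonGVD's implementation: the function name, the chromosome labels, the p-values, and the choice of a Bonferroni cut-off are all assumptions made for the example.

```python
import math


def manhattan_points(results, alpha=0.05):
    """Turn (chrom, pos, p) tuples into plot-ready points.

    The significance line is a Bonferroni threshold (alpha / number of tests),
    which is an assumption of this sketch, not a documented CottonGVD setting.
    """
    threshold = -math.log10(alpha / len(results))
    points = []
    for chrom, pos, p in results:
        score = -math.log10(p)  # height of the point on the Manhattan map
        points.append(
            {"chrom": chrom, "pos": pos, "score": score, "significant": score >= threshold}
        )
    return points, threshold


# Hypothetical association results: (chromosome, position, p-value)
toy_results = [("A01", 1200, 0.5), ("A01", 5300, 1e-9), ("D05", 880, 0.01), ("D05", 4100, 1e-3)]
points, threshold = manhattan_points(toy_results)
```

With only four toy markers the threshold is very lenient; a real scan over millions of SNPs would give a far stricter line.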

2021 ◽  
Author(s):  
Poppy Channa Sakti Sephton-Clark ◽  
Jennifer Tenor ◽  
Dena Toffaletti ◽  
Nancy Meyers ◽  
Charles Giamberardino ◽  
...  

Cryptococcus neoformans is the causative agent of cryptococcosis, a disease with poor patient outcomes, accounting for approximately 180,000 deaths each year. Patient outcomes may be impacted by the underlying genetics of the infecting isolate; however, our current understanding of how genetic diversity contributes to clinical outcomes is limited. Here, we leverage clinical, in vitro growth and genomic data for 284 C. neoformans isolates to identify clinically relevant pathogen variants within a population of clinical isolates from patients with HIV-associated cryptococcosis in Malawi. Through a genome-wide association study (GWAS) approach, we identify variants associated with fungal burden and growth rate. We also find both small and large-scale variation, including aneuploidy, associated with alternate growth phenotypes, which may impact the course of infection. Genes impacted by these variants are involved in transcriptional regulation, signal transduction, glycolysis, sugar transport, and glycosylation. When combined with clinical data, we show that growth within the CNS is reliant upon glycolysis in an animal model, and likely impacts patient mortality, as CNS burden modulates patient outcome. Additionally, we find genes with roles in sugar transport are under selection in the majority of these clinical isolates. Further, we demonstrate that two hypothetical proteins identified by GWAS impact virulence in animal models. Our approach illustrates links between genetic variation and clinically relevant phenotypes, shedding light on survival mechanisms within the CNS and pathways involved in this persistence.


2019 ◽  
Author(s):  
Smitha P K ◽  
Vishnupriyan K ◽  
Ananya S. Kar ◽  
Anil Kumar M ◽  
Christopher Bathula ◽  
...  

Abstract Background: Cotton is one of the most important commercial crops as a source of natural fiber, oil and fodder. To protect it from harmful pest populations, a number of newer transgenic lines have been developed. For quick expression checks in agriculture, qPCR (quantitative polymerase chain reaction) has become extremely popular. The selection of appropriate reference genes plays a critical role in the outcome of such experiments, as the method quantifies expression of the target gene relative to the reference. Traditionally, the most commonly used reference genes are the “house-keeping genes” involved in basic cellular processes. However, expression levels of such genes often vary in response to experimental conditions, forcing researchers to validate the reference genes for every experimental platform. This study presents a data-science-driven, unbiased genome-wide search for reference genes by assessing variation of >50,000 genes in a publicly available RNA-seq dataset of the cotton species Gossypium hirsutum. Result: Five genes (TMN5, TBL6, UTR5B, AT1g65240 and CYP76B6) identified by the data-science-driven analysis, along with two commonly used reference genes found in the literature (PP2A1 and UBQ14), were taken through qPCR in a set of 33 experimental samples consisting of different tissues (leaves, square, stem and root), different stages of leaf (young and mature) and square development (small, medium and large) in both transgenic and non-transgenic plants. Expression stability of the genes was evaluated using four algorithms: geNorm, BestKeeper, NormFinder and RefFinder. Conclusion: Based on the results, we recommend TMN5 and TBL6 as the optimal candidate reference genes in qPCR experiments with normal and transgenic cotton plant tissues. AT1g65240 and PP2A1 can also be used if the expression study includes squares.
This study demonstrates for the first time a data-science-driven genome-wide search followed by experimental validation as a method of choice for selecting stable reference genes, in preference to selection based on function alone.
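
One simple way to screen an expression matrix for stable genes is to rank genes by their coefficient of variation (standard deviation divided by mean) across samples. The sketch below shows that idea only; the abstract does not state which variation metric the authors used, so the CV criterion, the `min_mean` filter, and the gene names and values are all hypothetical.

```python
from statistics import mean, pstdev


def rank_by_cv(expression, min_mean=1.0):
    """Rank genes by coefficient of variation across samples, most stable first.

    expression maps a gene name to its expression values (one per sample).
    Genes whose mean falls below min_mean are skipped, since weakly expressed
    genes make poor qPCR references (an assumption of this sketch).
    """
    ranked = []
    for gene, values in expression.items():
        m = mean(values)
        if m < min_mean:
            continue
        ranked.append((gene, pstdev(values) / m))
    return sorted(ranked, key=lambda item: item[1])


# Hypothetical expression values across four samples
toy_expression = {
    "gene_stable": [100, 102, 98, 100],
    "gene_variable": [10, 200, 50, 140],
    "gene_low": [0.1, 0.2, 0.0, 0.1],
}
ranked = rank_by_cv(toy_expression)
```

Candidates shortlisted this way would still need experimental validation, as the study did with qPCR and the four stability algorithms.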


Plants ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 1722
Author(s):  
Byeong Yong Jeong ◽  
Yoonjung Lee ◽  
Yebin Kwon ◽  
Jee Hye Kim ◽  
Tae-Ho Ham ◽  
...  

A genome-wide association study (GWAS) was used to investigate the genetic basis of chilling tolerance in a collection of 117 rice accessions, including 26 Korean landraces and 29 weedy rices, at the reproductive stage. To assess chilling tolerance at the early young microspore stage, plants were treated at 12 °C for 5 days, and tolerance was evaluated using seed set fertility. GWAS, together with principal component analysis and kinship matrix analysis, revealed five quantitative trait loci (QTLs) associated with chilling tolerance on chromosomes 3, 6, and 7. The percentage of phenotypic variation explained by the QTLs was 11–19%. The genomic region underlying the QTL on chromosome 3 overlapped with a previously reported QTL associated with spikelet fertility. Subsequent bioinformatic and haplotype analyses suggested three candidate chilling-tolerance genes within the QTL linkage disequilibrium block: Os03g0305700, encoding a protein similar to peptide chain release factor 2; Os06g0495700, encoding a beta tubulin, autoregulation binding-site-domain-containing protein; and Os07g0137800, encoding a protein kinase, core-domain-containing protein. Further analysis of the detected QTLs and the candidate chilling-tolerance genes will facilitate strategies for developing chilling-tolerant rice cultivars in breeding programs.
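
The "percentage of phenotypic variation explained" by a locus can be illustrated with a single-marker regression, where the R² of phenotype on allele dosage serves as that percentage. This is a deliberately simplified sketch: it omits the principal component and kinship corrections the study describes, and the marker names, dosages, and fertility values are invented for the example.

```python
def single_marker_scan(genotypes, phenotype):
    """Regress the phenotype on each marker's allele dosage.

    Returns, per marker, the regression slope and R^2 (used here as the
    proportion of phenotypic variation explained). No population-structure
    or kinship correction is applied in this sketch.
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    ss_y = sum((y - mean_y) ** 2 for y in phenotype)
    results = {}
    for marker, dosages in genotypes.items():
        mean_x = sum(dosages) / n
        ss_x = sum((x - mean_x) ** 2 for x in dosages)
        if ss_x == 0 or ss_y == 0:
            continue  # monomorphic marker or constant phenotype: nothing to fit
        s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotype))
        results[marker] = {"slope": s_xy / ss_x, "pve": s_xy ** 2 / (ss_x * ss_y)}
    return results


# Hypothetical data: seed-set fertility (%) for six accessions and three markers
toy_phenotype = [10, 20, 30, 40, 50, 60]
toy_genotypes = {
    "snp_a": [0, 0, 1, 1, 2, 2],
    "snp_b": [0, 2, 0, 2, 0, 2],
    "snp_c": [1, 1, 1, 1, 1, 1],
}
scan = single_marker_scan(toy_genotypes, toy_phenotype)
```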


Author(s):  
Tao Yan ◽  
Yao Yao ◽  
Dezhi Wu ◽  
Lixi Jiang

Abstract Rapeseed (Brassica napus L.) is a typical polyploid crop and one of the most important oilseed crops worldwide. With the rapid progress in high-throughput sequencing technologies and the reduction in sequencing cost, large-scale genomic data for specific crops have become available. However, raw sequence data are mostly deposited in the Sequence Read Archive of the National Center for Biotechnology Information (NCBI) and the European Nucleotide Archive (ENA), which are freely accessible to all researchers. Practical tools are needed to use these large raw datasets efficiently. Here, we report a web-based rapeseed genomic variation database (BnaGVD, http://rapeseed.biocloud.net/home) from which genomic variations, such as single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) across a worldwide collection of rapeseed accessions, can be retrieved. The current release of BnaGVD contains 34,591,899 high-quality SNPs and 12,281,923 high-quality InDels and provides search tools to retrieve genomic variations and gene annotations across 1,007 accessions of worldwide rapeseed germplasm. We implement a variety of built-in tools (e.g., BnaGWAS, BnaPCA, and BnaStructure) to help users perform in-depth analyses. We recommend this web resource for accelerating studies on functional genomics and the screening of molecular markers for rapeseed breeding.


2019 ◽  
Author(s):  
Yanhong Lou ◽  
Yun Chen ◽  
Zhihao Liu ◽  
Mingjie Sun ◽  
Fei Han ◽  
...  

Abstract Background: Foxtail millet [Setaria italica (L.) P. Beauv.] is a particularly important cereal and fodder crop in arid and semi-arid regions. The genomic variation and alleles underpinning agronomic and quality traits are important for foxtail millet improvement. To better understand the diversity of foxtail millet and facilitate the genetic dissection of its agronomic and quality traits, we used high-quality single nucleotide polymorphisms (SNPs) to perform a genome-wide association study (GWAS). Results: Using genotyping-by-sequencing, 107 foxtail millet accessions were sequenced, and further analysis revealed 72,181 high-quality SNPs, of which 53 were significantly associated with 15 agronomic and quality traits. These SNPs were distributed across the nine chromosomes of foxtail millet; 44 were located in intergenic regions, whereas one and eight SNPs were located in exon and intron regions, respectively. The GWAS revealed that 28 SNPs were associated with a single trait. Conclusions: For some of the significant SNPs, favourable genotypes showed pyramiding effects for several traits. The 53 loci identified in this study will therefore be useful for breeding programs aimed at foxtail millet improvement.


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Junji Su ◽  
Caixiang Wang ◽  
Qi Ma ◽  
Ai Zhang ◽  
Chunhui Shi ◽  
...  

Abstract Background: Cotton (Gossypium spp.) fiber yield is one of the key target traits, and improved fiber yield has always been regarded as an important objective in breeding programs and production. Although some studies have examined the genetic basis of cotton yield-related traits, the number of detected quantitative trait loci (QTL) for these traits is still very limited. To uncover the whole-genome QTL controlling three yield-related traits in upland cotton (Gossypium hirsutum L.), phenotypic traits were investigated under four planting environments, and 9244 single-nucleotide polymorphism linkage disequilibrium block (SNPLDB) markers were developed in an association panel consisting of 315 accessions. Results: A total of 53, 70 and 68 significant SNPLDB loci associated with boll number (BN), boll weight (BW) and lint percentage (LP), respectively, were detected through a restricted two-stage multi-locus multi-allele genome-wide association study (RTM-GWAS) procedure in multiple environments. The haplotype/allele effects of the significant SNPLDB loci were estimated, and the QTL-allele matrices were organized to give an abbreviated genetic composition of the population. Among the significant SNPLDB loci, six were identified in two or more single planting environments and were regarded as stable SNPLDB loci. Additionally, a total of 115 genes were annotated in the nearby regions of the six stable SNPLDB loci, and 16 common potential candidate genes among them controlling the target traits were predicted using two RNA-seq datasets. One of the 16 genes (GH_D06G2161) was mainly expressed in the early ovule-development stages, and the stable SNPLDB locus (LDB_19_62926589) was mapped to its promoter region.
Conclusion: This study identified QTL alleles and candidate genes that could provide important insights into the genetic basis of yield-related traits in upland cotton and might facilitate breeding cotton varieties with high yield.


2019 ◽  
Vol 5 (2) ◽  
pp. e310 ◽  
Author(s):  
Neha S. Raghavan ◽  
Badri Vardarajan ◽  
Richard Mayeux

Objective: To determine the putative protective relationship of educational attainment on Alzheimer disease (AD) risk using Mendelian randomization and to test the hypothesis that by using genetic regions surrounding individually associated single nucleotide polymorphisms (SNPs) as the instrumental variable, we can identify genes that contribute to the relationship. Methods: We performed Mendelian randomization using genome-wide association study summary statistics from studies of educational attainment and AD in two stages. Our instrumental variable comprised (1) 1,271 SNPs significantly associated with educational attainment and (2) individual 2-Mb regions surrounding the genome-wide significant SNPs. Results: A causal inverse relationship between educational attainment and AD was identified by the 1,271 SNPs (odds ratio = 0.63; 95% confidence interval, 0.54–0.74; p = 4.08 × 10^-8). Analysis of individual loci identified 2 regions that significantly replicated the causal relationship. Genes within these regions included LRRC2, SSBP2, and NEGR1; the latter a regulator of neuronal growth. Conclusions: Educational attainment is an important protective factor for AD. Genomic regions that significantly paralleled the overall causal relationship contain genes expressed in neurons or involved in the regulation of neuronal development.
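
To show how GWAS summary statistics can yield an odds ratio of the kind reported above, here is a minimal sketch of a fixed-effect inverse-variance weighted (IVW) Mendelian randomization estimate. The abstract does not name the estimator the authors used, so IVW is an assumption, and the three instruments below are invented numbers, not values from the study.

```python
import math


def ivw_estimate(instruments):
    """Fixed-effect inverse-variance weighted MR estimate.

    Each instrument is (beta_exposure, beta_outcome, se_outcome) for one SNP.
    Returns the causal log-odds estimate and its standard error.
    """
    numerator = sum(bx * by / se_y ** 2 for bx, by, se_y in instruments)
    denominator = sum(bx ** 2 / se_y ** 2 for bx, _, se_y in instruments)
    return numerator / denominator, math.sqrt(1.0 / denominator)


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds estimate to an odds ratio with a 95% interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical per-SNP summary statistics (exposure effect, outcome effect, outcome SE)
toy_instruments = [(0.10, -0.05, 0.02), (0.20, -0.08, 0.02), (0.15, -0.09, 0.03)]
beta, se = ivw_estimate(toy_instruments)
odds_ratio, ci_low, ci_high = odds_ratio_ci(beta, se)
```

An odds ratio below 1 with an interval that excludes 1 would indicate an inverse relationship between exposure and outcome, which is the pattern the abstract reports.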


2019 ◽  
Vol 19 (1) ◽  
Author(s):  
P. K. Smitha ◽  
K. Vishnupriyan ◽  
Ananya S. Kar ◽  
M. Anil Kumar ◽  
Christopher Bathula ◽  
...  

Abstract Background: Cotton is one of the most important commercial crops as a source of natural fiber, oil and fodder. To protect it from harmful pest populations, a number of newer transgenic lines have been developed. For quick expression checks in agriculture, qPCR (quantitative polymerase chain reaction) has become extremely popular. The selection of appropriate reference genes plays a critical role in the outcome of such experiments, as the method quantifies expression of the target gene relative to the reference. Traditionally, the most commonly used reference genes are the “house-keeping genes” involved in basic cellular processes. However, expression levels of such genes often vary in response to experimental conditions, forcing researchers to validate the reference genes for every experimental platform. This study presents a data-science-driven, unbiased genome-wide search for reference genes by assessing variation of >50,000 genes in a publicly available RNA-seq dataset of the cotton species Gossypium hirsutum. Result: Five genes (TMN5, TBL6, UTR5B, AT1g65240 and CYP76B6) identified by the data-science-driven analysis, along with two commonly used reference genes found in the literature (PP2A1 and UBQ14), were taken through qPCR in a set of 33 experimental samples consisting of different tissues (leaves, square, stem and root), different stages of leaf (young and mature) and square development (small, medium and large) in both transgenic and non-transgenic plants. Expression stability of the genes was evaluated using four algorithms: geNorm, BestKeeper, NormFinder and RefFinder. Conclusion: Based on the results, we recommend TMN5 and TBL6 as the optimal candidate reference genes in qPCR experiments with normal and transgenic cotton plant tissues. AT1g65240 and PP2A1 can also be used if the expression study includes squares.
This study demonstrates for the first time a data-science-driven genome-wide search followed by experimental validation as a method of choice for selecting stable reference genes, in preference to selection based on function alone.


2021 ◽  
Author(s):  
Bernadette C Young ◽  
Chieh-Hsi Wu ◽  
Jane Charlesworth ◽  
Sarah Earle ◽  
James R Price ◽  
...  

Abstract Background: Staphylococcus aureus is a major bacterial pathogen in humans, and a dominant cause of severe bloodstream infections. Globally, antimicrobial resistance (AMR) in S. aureus remains challenging. While human risk factors for infection have been defined, contradictory evidence exists for the role of bacterial genomic variation in S. aureus disease. Methods: To investigate the contribution of bacterial lineage and genomic variation to the development of bloodstream infection, we undertook a genome-wide association study comparing bacteria from 1017 individuals with bacteraemia to 984 adults with asymptomatic S. aureus nasal carriage. Within the 984 carriage isolates, we also compared healthcare-associated (HA) carriage with community-associated (CA) carriage. Results: All major global lineages were represented in both bacteraemia and carriage, with no evidence for different attack rates. However, kmers tagging trimethoprim resistance-conferring mutation F99Y in dfrB were significantly associated with bacteraemia-vs-carriage (p = 10^-8.9 to 10^-9.3). Pooling variation within genes, bacteraemia-vs-carriage was associated with the presence of mecA (HMP = 10^-5.3) as well as the presence of SCCmec (HMP = 10^-4.4). Among S. aureus carriers, no lineages were associated with HA-vs-CA carriage. However, we found a novel signal of HA-vs-CA carriage in the foldase protein prsA, where kmers representing a conserved sequence allele were associated with CA carriage (p = 10^-7.1 to 10^-19.4), while in gyrA, a ciprofloxacin resistance-conferring mutation, L84S, was associated with HA carriage (p = 10^-7.2). Conclusions: In an extensive study of S. aureus bacteraemia and nasal carriage in the UK, we found strong evidence that all S. aureus lineages are equally capable of causing bloodstream infection, and of being carried in the healthcare environment. Genomic variation in the foldase protein prsA is a novel genomic marker of healthcare origin in S. aureus but was not associated with bacteraemia.
AMR determinants were associated with both bacteraemia and hospital-associated carriage, suggesting that AMR increases the propensity not only to survive in hospital environments, but also to cause invasive disease.


2021 ◽  
Vol 7 (11) ◽  
Author(s):  
Bernadette C. Young ◽  
Chieh-Hsi Wu ◽  
Jane Charlesworth ◽  
Sarah Earle ◽  
James R. Price ◽  
...  

Staphylococcus aureus is a major bacterial pathogen in humans, and a dominant cause of severe bloodstream infections. Globally, antimicrobial resistance (AMR) in S. aureus remains challenging. While human risk factors for infection have been defined, contradictory evidence exists for the role of bacterial genomic variation in S. aureus disease. To investigate the contribution of bacterial lineage and genomic variation to the development of bloodstream infection, we undertook a genome-wide association study comparing bacteria from 1017 individuals with bacteraemia to 984 adults with asymptomatic S. aureus nasal carriage. Within the 984 carriage isolates, we also compared healthcare-associated (HA) carriage with community-associated (CA) carriage. All major global lineages were represented in both bacteraemia and carriage, with no evidence for different infection rates. However, kmers tagging trimethoprim resistance-conferring mutation F99Y in dfrB were significantly associated with bacteraemia-vs-carriage (P = 10^-8.9 to 10^-9.3). Pooling variation within genes, bacteraemia-vs-carriage was associated with the presence of mecA (HMP = 10^-5.3) as well as the presence of SCCmec (HMP = 10^-4.4). Among S. aureus carriers, no lineages were associated with HA-vs-CA carriage. However, we found a novel signal of HA-vs-CA carriage in the foldase protein prsA, where kmers representing a conserved sequence allele were associated with CA carriage (P = 10^-7.1 to 10^-19.4), while in gyrA, a ciprofloxacin resistance-conferring mutation, L84S, was associated with HA carriage (P = 10^-7.2). In an extensive study of S. aureus bacteraemia and nasal carriage in the UK, we found strong evidence that all S. aureus lineages are equally capable of causing bloodstream infection, and of being carried in the healthcare environment. Genomic variation in the foldase protein prsA is a novel genomic marker of healthcare origin in S. aureus but was not associated with bacteraemia.
AMR determinants were associated with both bacteraemia and healthcare-associated carriage, suggesting that AMR increases the propensity not only to survive in healthcare environments, but also to cause invasive disease.

