Genome, Transcriptome, and Germplasm Sequencing Uncovers Functional Variation in the Warm-Season Grain Legume Horsegram Macrotyloma uniflorum (Lam.) Verdc.

2021 ◽  
Vol 12 ◽  
Author(s):  
H. B. Mahesh ◽  
M. K. Prasannakumar ◽  
K. G. Manasa ◽  
Sampath Perumal ◽  
Yogendra Khedikar ◽  
...  

Horsegram is a grain legume with excellent nutritional and remedial properties and good climate resilience, able to adapt to harsh environmental conditions. Here, we used a combination of short- and long-read sequencing technologies to generate a genome sequence of 279.12 Mb, covering 83.53% of the estimated total size of the horsegram genome, and we annotated 24,521 genes. De novo prediction of DNA repeats showed that approximately 25.04% of the horsegram genome was made up of repetitive sequences, the lowest among the legume genomes sequenced so far. The major transcription factors identified in the horsegram genome were bHLH, ERF, C2H2, WRKY, NAC, MYB, and bZIP, suggesting that horsegram is resistant to drought. Interestingly, the genome is abundant in Bowman–Birk protease inhibitors (BBIs), which can be used as a functional food ingredient. The results of maximum likelihood phylogenetic and estimated synonymous substitution analyses suggested that horsegram is closely related to the common bean and diverged from it approximately 10.17 million years ago. Double-digested restriction-associated DNA (ddRAD) sequencing of 40 germplasms allowed us to identify 3,942 high-quality SNPs in the horsegram genome. A genome-wide association study of powdery mildew identified 10 significant associations similar to the MLO and RPW8.2 genes. The reference genome and other genomic information presented in this study will be of great value to horsegram breeding programs. In addition, given the increasing demand for food with nutraceutical value, these genomic data provide opportunities to explore horsegram as a source of food and nutraceuticals.

2021 ◽  
Author(s):  
Toshimitsu Suzuki ◽  
Tetsuya Tatsukawa ◽  
Genki Sudo ◽  
Caroline Delandre ◽  
Yun Jin Pai ◽  
...  

The CUX2 gene encodes a transcription factor that controls neuronal proliferation, dendrite branching and synapse formation, and is located in the epilepsy-associated chromosomal region 12q24 that we previously identified by a genome-wide association study (GWAS) in the Japanese population. A recurrent de novo CUX2 variant, p.E590K, has been described in patients with rare epileptic encephalopathies, and the gene is a candidate for the locus. However, this mutation alone may not be enough to generate genome-wide significance in the GWAS, and it remains to be investigated whether CUX2 variants appear in other types of epilepsy and which physiopathological mechanisms are involved. In this study, we conducted targeted sequencing of CUX2, its paralog CUX1, and CASP, a short isoform of CUX1 harboring a unique C-terminus, in 271 Japanese patients with a variety of epilepsies. We found that multiple CUX2 missense variants other than p.E590K, and some CASP variants including a deletion, appeared predominantly in patients with temporal lobe epilepsy (TLE). Human cell culture and fly dendritic arborization analyses revealed loss-of-function properties for the CUX2 variants. Cux2- and Casp-specific knockout mice both showed high susceptibility to kainate, an increased number of excitatory cells in the entorhinal cortex, and significant enhancement of glutamatergic synaptic transmission to the hippocampus. CASP and CUX2 proteins physiologically bound to each other and were co-expressed in excitatory neurons in brain regions including the entorhinal cortex. These results suggest that CUX2 and CASP variants contribute to TLE pathology by facilitating excitatory synaptic transmission from the entorhinal cortex to the hippocampus.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Wei Xu ◽  
Di Wu ◽  
Tianquan Yang ◽  
Chao Sun ◽  
Zaiqing Wang ◽  
...  

Background: Castor bean (Ricinus communis L.) is an important oil crop that belongs to the Euphorbiaceae family. The seed oil of castor bean is currently the only commercial source of ricinoleic acid, which can be used to produce about 2000 industrial products. However, the origin, domestication, and genetic basis of key traits of castor bean remain largely unknown. Results: Here we perform a de novo chromosome-level genome assembly of the wild progenitor of castor bean. By resequencing and analyzing 505 worldwide accessions, we reveal that the accessions from East Africa are the extant wild progenitors of castor bean, and that domestication occurred ~3200 years ago. We demonstrate that significant genetic differentiation between wild populations in Kenya and Ethiopia is associated with past climate fluctuation in the Turkana depression ~7000 years ago. This dramatic change in climate may have caused the genetic bottleneck in wild castor bean populations. By a genome-wide association study, combined with quantitative trait locus analysis, we identify important candidate genes associated with plant architecture and seed size. Conclusions: This study provides novel insights into the domestication and genome evolution of castor bean, which facilitates genomics-based breeding of this important oilseed crop and potentially other tree-like crops in the future.


2020 ◽  
Vol 10 (10) ◽  
pp. 3811-3819 ◽  
Author(s):  
Austin Compton ◽  
Jiangtao Liang ◽  
Chujia Chen ◽  
Varvara Lukyanchikova ◽  
Yumin Qi ◽  
...  

Chromosome-level assemblies are accumulating in various taxonomic groups including mosquitoes. However, even in the few reference-quality mosquito assemblies, a significant portion of the heterochromatic regions, including telomeres, remains unresolved. Here we produce a de novo assembly of the New World malaria mosquito, Anopheles albimanus, by integrating Oxford Nanopore sequencing, Illumina, Hi-C and optical mapping. This 172.6 Mbp female assembly, which we call AalbS3, is obtained by scaffolding polished large contigs (contig N50 = 13.7 Mbp) into three chromosomes. All chromosome arms end with telomeric repeats, which is a first among mosquito assemblies and represents a significant step toward the completion of a genome assembly. These telomeres consist of tandem repeats of a novel 30-32 bp Telomeric Repeat Unit (TRU) and are confirmed by analyzing the termini of long reads and through both chromosomal in situ hybridization and a Bal31 sensitivity assay. The AalbS3 assembly included previously uncharacterized centromeric and rDNA clusters and more than doubled the content of transposable elements and other repetitive sequences. This telomere-to-telomere assembly, although still containing gaps, represents a significant step toward resolving biologically important but previously hidden genomic components. The comparison of different scaffolding methods will also inform future efforts to obtain reference-quality genomes for other mosquito species.
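For readers unfamiliar with the contig N50 statistic quoted above, here is a minimal sketch of how it is conventionally computed: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions, not values or code from the AalbS3 study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with made-up contig lengths (not real assembly data).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, it says nothing about correctness; a misassembled contig can inflate it, which is one reason studies pair it with independent validation.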


2019 ◽  
Vol 295 (7) ◽  
pp. 1889-1897 ◽  
Author(s):  
Gergely Karsai ◽  
Museer Lone ◽  
Zoltán Kutalik ◽  
J. Thomas Brenna ◽  
Hongde Li ◽  
...  

Sphingolipids (SLs) are structurally diverse lipids that are defined by the presence of a long-chain base (LCB) backbone. Typically, LCBs contain a single Δ4E double bond (DB) (mostly d18:1), whereas the dienic LCB sphingadienine (d18:2) contains a second DB at the Δ14Z position. The enzyme introducing the Δ14Z DB is unknown. We analyzed the LCB plasma profile in a gender-, age-, and BMI-matched subgroup of the CoLaus cohort (n = 658). Sphingadienine levels showed a significant association with gender, being on average ∼30% higher in females. A genome-wide association study (GWAS) revealed variants in the fatty acid desaturase 3 (FADS3) gene to be significantly associated with the plasma d18:2/d18:1 ratio (−log p = 7.9). Metabolic labeling assays, FADS3 overexpression and knockdown approaches, and plasma LCB profiling in FADS3-deficient mice confirmed that FADS3 is a bona fide LCB desaturase and is required for the introduction of the Δ14Z double bond. Moreover, we showed that FADS3 is required for the conversion of the atypical cytotoxic 1-deoxysphinganine (1-deoxySA, m18:0) to 1-deoxysphingosine (1-deoxySO, m18:1). HEK293 cells overexpressing FADS3 were more resistant to m18:0 toxicity than WT cells. In summary, using a combination of metabolic profiling and GWAS, we identified FADS3 to be essential for forming Δ14Z DB-containing LCBs, such as d18:2 and m18:1. Our results reveal FADS3 as a Δ14Z LCB desaturase, thereby disclosing the last missing enzyme of the SL de novo synthesis pathway.


2020 ◽  
Author(s):  
Vincent Mérel ◽  
Patricia Gibert ◽  
Inessa Buch ◽  
Valentina Rodriguez Rada ◽  
Arnaud Estoup ◽  
...  

Transposable Elements (TEs) are ubiquitous and mobile repeated sequences. They are major determinants of host fitness. Here, we portrayed the TE content of the spotted wing fly Drosophila suzukii. Using a recently improved genome assembly, we reconstructed TE sequences de novo, and found that TEs occupy 47% of the genome and are mostly located in gene-poor regions. The majority of TE insertions segregate at low frequencies, indicating recent and probably ongoing TE activity. To explore TE dynamics in the context of biological invasions, we studied variation of TE abundance in genomic data from 16 invasive and six native populations of D. suzukii. We found a large increase of the TE load in invasive populations, correlated with a reduced Watterson estimate of genetic diversity, a proxy of effective population size. We did not find any correlation between TE contents and bio-climatic variables, indicating a minor effect of environmentally induced TE activity. A genome-wide association study revealed that ca. 5,000 genomic regions are associated with TE abundance. We did not find, however, any evidence in such regions of an enrichment for genes known to interact with TE activity (e.g. transcription factor encoding genes or genes of the piRNA pathway). Finally, the study of TE insertion frequencies revealed 15 putatively adaptive TE insertions, six of them being likely associated with the recent invasion history of the species.
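The Watterson estimate mentioned above has a simple closed form: the number of segregating sites S divided by the harmonic number a_n = 1 + 1/2 + ... + 1/(n-1), where n is the number of sampled sequences. The sketch below shows that textbook formula only; the function name and the toy numbers are assumptions for illustration, and the study's own estimates (from population genomic data) may have been computed with a different, specialised estimator.

```python
def watterson_theta(segregating_sites, sample_size):
    """Textbook Watterson estimator: theta_W = S / a_n.

    segregating_sites: number of polymorphic sites S in the sample.
    sample_size: number of sampled sequences n (must be >= 2).
    """
    if sample_size < 2:
        raise ValueError("sample_size must be at least 2")
    if segregating_sites < 0:
        raise ValueError("segregating_sites must be non-negative")
    # a_n is the (n-1)-th harmonic number.
    a_n = sum(1.0 / i for i in range(1, sample_size))
    return segregating_sites / a_n


# Toy numbers, not data from the study: 11 segregating sites in 4 sequences.
print(watterson_theta(11, 4))
```

Under the standard neutral model theta equals 4·Ne·mu for a diploid population, which is why a lower estimate is read as a proxy for a smaller effective population size.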


2020 ◽  
Vol 16 (11) ◽  
pp. e1008325
Author(s):  
Hyungtaek Jung ◽  
Tomer Ventura ◽  
J. Sook Chung ◽  
Woo-Jin Kim ◽  
Bo-Hye Nam ◽  
...  

Eukaryotic genome sequencing and de novo assembly, once the exclusive domain of well-funded international consortia, have become increasingly affordable, thus fitting the budgets of individual research groups. Third-generation long-read DNA sequencing technologies are increasingly used, providing extensive genomic toolkits that were once reserved for a few select model organisms. Generating high-quality genome assemblies and annotations for many aquatic species still presents significant challenges due to their large genome sizes, complexity, and high chromosome numbers. Indeed, selecting the most appropriate sequencing and software platforms and annotation pipelines for a new genome project can be daunting because tools often only work in limited contexts. In genomics, generating a high-quality genome assembly/annotation has become an indispensable tool for better understanding the biology of any species. Herein, we state 12 steps to help researchers get started in genome projects by presenting guidelines that are broadly applicable (to any species), sustainable over time, and cover all aspects of genome assembly and annotation projects from start to finish. We review some commonly used approaches, including practical methods to extract high-quality DNA and choices for the best sequencing platforms and library preparations. In addition, we discuss the range of potential bioinformatics pipelines, including structural and functional annotations (e.g., transposable elements and repetitive sequences). This paper also includes information on how to build a wide community for a genome project, the importance of data management, and how to make the data and results Findable, Accessible, Interoperable, and Reusable (FAIR) by submitting them to a public repository and sharing them with the research community.


2020 ◽  
Vol 375 (1795) ◽  
pp. 20190335 ◽  
Author(s):  
Wilson McKerrow ◽  
Zuojian Tang ◽  
Jared P. Steranka ◽  
Lindsay M. Payer ◽  
Jef D. Boeke ◽  
...  

Long interspersed element-1 (LINE-1, L1) sequences, which comprise about 17% of the human genome, are the product of one of the most active types of mobile DNAs in modern humans. LINE-1 insertion alleles can cause inherited and de novo genetic diseases, and LINE-1-encoded proteins are highly expressed in some cancers. Genome-wide LINE-1 mapping in single cells could be useful for defining somatic and germline retrotransposition rates, and for enabling studies to characterize tumour heterogeneity, relate insertions to transcriptional and epigenetic effects at the cellular level, or describe cellular phylogenies in development. Our laboratories have reported a genome-wide LINE-1 insertion site mapping method for bulk DNA, named transposon insertion profiling by sequencing (TIPseq). There have been significant barriers to applying LINE-1 mapping to single cells, owing to chimeric artefacts and features of repetitive sequences. Here, we optimize a modified TIPseq protocol and show its utility for LINE-1 mapping in single lymphoblastoid cells. Results from single-cell TIPseq experiments compare well to known LINE-1 insertions found by whole-genome sequencing and TIPseq on bulk DNA. Among the several approaches we tested, whole-genome amplification by multiple displacement amplification followed by restriction enzyme digestion, vectorette ligation and LINE-1-targeted PCR had the best assay performance. This article is part of a discussion meeting issue ‘Crossroads between transposons and gene regulation’.


Author(s):  
Joris A. Alkemade ◽  
Nelson Nazzicari ◽  
Monika M. Messmer ◽  
Paolo Annicchiarico ◽  
Barbara Ferrari ◽  
...  

Key message: GWAS identifies a candidate gene controlling resistance to anthracnose disease in white lupin. Abstract: White lupin (Lupinus albus L.) is a promising grain legume to meet the growing demand for plant-based protein. Its cultivation, however, is severely threatened by anthracnose disease caused by the fungal pathogen Colletotrichum lupini. To dissect the genetic architecture for anthracnose resistance, genotyping by sequencing (GBS) was performed on white lupin accessions collected from the center of domestication and traditional cultivation regions. GBS resulted in 4611 high-quality single-nucleotide polymorphisms (SNPs) for 181 accessions, which were combined with resistance data observed under controlled conditions to perform a genome-wide association study (GWAS). Obtained disease phenotypes were shown to highly correlate with overall three-year disease assessments under Swiss field conditions (r > 0.8). GWAS results identified two significant SNPs associated with anthracnose resistance on gene Lalb_Chr05_g0216161 encoding a RING zinc-finger E3 ubiquitin ligase which is potentially involved in plant immunity. Population analysis showed a remarkably fast linkage disequilibrium decay, weak population structure and grouping of commercial varieties with landraces, corresponding to the slow domestication history and scarcity of modern breeding efforts in white lupin. Together with 15 highly resistant accessions identified in the resistance assay, our findings show promise for further crop improvement. This study provides the basis for marker-assisted selection, genomic prediction and studies aimed at understanding anthracnose resistance mechanisms in white lupin and contributes to improving breeding programs worldwide.


Author(s):  
Cong Feng ◽  
Min Dai ◽  
Yongjing Liu ◽  
Ming Chen

DNA repeats are abundant in eukaryotic genomes and have been shown to play a vital role in genome evolution and regulation. A large number of approaches have been proposed to identify various repeats in the genome. Some de novo repeat identification tools can efficiently generate sequence repetitiveness scores based on k-mer counting for repeat detection. However, we noticed that these tools can still be improved in terms of repetitiveness score calculation, sensitivity to segmental duplications and detection specificity. Therefore, we present a new computational approach named Repeat Locator (RepLoc), which is based on weighted k-mer coverage to quantify the repetitiveness of the genome sequence and locate the repetitive sequences. According to the repetitiveness map of the human genome generated by RepLoc, we found that there may be relationships between sequence repetitiveness and genome structures. A comprehensive benchmark shows that RepLoc is a more efficient k-mer counting based tool for de novo repeat detection. The RepLoc software is freely available at http://bis.zju.edu.cn/reploc.
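To make the idea of a k-mer counting based repetitiveness score concrete, here is a deliberately simple sketch: count every k-mer in a sequence, then give each base the highest count among the k-mers that cover it. This is a plain, unweighted illustration and is not RepLoc's weighted k-mer coverage algorithm, whose details are not given in the abstract; the function name, scoring rule and toy sequence are all assumptions.

```python
from collections import Counter


def kmer_repetitiveness(seq, k):
    """Score each base by the maximum count of any k-mer covering it.

    A score of 1 means every k-mer over that base is unique in `seq`;
    higher scores flag bases inside repeated k-mers.
    """
    if k <= 0 or k > len(seq):
        raise ValueError("k must be between 1 and len(seq)")
    # All overlapping k-mers, in order of their start position.
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    scores = [0] * len(seq)
    for start, kmer in enumerate(kmers):
        count = counts[kmer]
        # Propagate the k-mer count to every base the k-mer covers.
        for pos in range(start, start + k):
            scores[pos] = max(scores[pos], count)
    return scores


# Toy sequence: "ACGT" occurs twice, so the first eight bases score 2.
print(kmer_repetitiveness("ACGTACGTTT", 4))
```

A real tool would count k-mers across a whole genome with a memory-efficient counter and treat reverse complements as equivalent; this sketch ignores both for brevity.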


2019 ◽  
Vol 20 (22) ◽  
pp. 5675 ◽  
Author(s):  
Lang Wu ◽  
Peng Wang ◽  
Yihao Wang ◽  
Qing Cheng ◽  
Qiaohua Lu ◽  
...  

There are many agronomic traits of pepper (Capsicum L.) with abundant phenotypes that can benefit pepper growth. Using specific-locus amplified fragment sequencing (SLAF-seq), a genome-wide association study (GWAS) of 36 agronomic traits was carried out for 287 representative pepper accessions. To ensure the accuracy and reliability of the GWAS results, we analyzed the genetic diversity, distribution of labels (SLAF tags and single nucleotide polymorphisms (SNPs)) and population differentiation and determined the optimal statistical model. In our study, 1487 SNPs were highly significantly associated with 26 agronomic traits, and 2126 candidate genes were detected in the 100-kb region up- and down-stream near these SNPs. Furthermore, 13 major association peaks were identified for 11 key agronomic traits. Then we examined the correlations among the 36 agronomic traits and analyzed SNP distribution and found 37 SNP polymerization regions (total size: 264.69 Mbp) that could be selected areas in pepper breeding. We found that the stronger the correlation between the two traits, the greater the possibility of them being in more than one polymerization region, suggesting that they may be linked or that one pleiotropic gene controls them. These results provide a theoretical foundation for future multi-trait pyramid breeding of pepper. Finally, we found that the GWAS signals were highly consistent with those from the nuclear restorer-of-fertility (Rf) gene for cytoplasmic male sterility (CMS), verifying their reliability. We further identified Capana06g002967 and Capana06g002969 as Rf candidate genes by functional annotation and expression analysis, which provided a reference for the study of cytoplasmic male sterility in Capsicum.

