Arabidopsis thaliana genome
Recently Published Documents


TOTAL DOCUMENTS: 70 (FIVE YEARS: 12)

H-INDEX: 24 (FIVE YEARS: 2)

Plants ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 2681
Author(s):  
Ilya Kirov ◽  
Pavel Merkulov ◽  
Maxim Dudnikov ◽  
Ekaterina Polkhovskaya ◽  
Roman A. Komakhin ◽  
...  

Long-read data is a great tool to discover new active transposable elements (TEs). However, no ready-to-use tools were available to gather this information from low-coverage ONT datasets. Here, we developed a novel pipeline, nanotei, that allows detection of TE-containing structural variants, including individual TE transpositions. We used this pipeline to identify TE insertions in the Arabidopsis thaliana genome. Using nanotei, we identified tens of TE copies, including ones for the well-characterized ONSEN retrotransposon family, that were hidden in genome assembly gaps. The results demonstrate that some TEs are inaccessible for analysis with the current A. thaliana (TAIR10.1) genome assembly. We further explored the mobilome of the ddm1 mutant with elevated TE activity. Nanotei captured all TEs previously known to be active in ddm1 and also identified transposition of non-autonomous TEs. Of them, one non-autonomous TE, derived from AT5TE33540, belongs to TR-GAG retrotransposons with a single open reading frame (ORF) encoding the GAG protein. These results provide the first direct evidence that TR-GAGs and other non-autonomous LTR retrotransposons can transpose in the plant genome, albeit in the absence of most of the encoded proteins. In summary, nanotei is a useful tool to detect active TEs and their insertions in plant genomes using low-coverage data from Nanopore genome sequencing.


2021 ◽  
Author(s):  
Bo Wang ◽  
Yanyan Jia ◽  
Peng Jia ◽  
Quanbin Dong ◽  
Xiaofei Yang ◽  
...  

Here, we report a high-quality (HQ) and almost complete genome assembly, with a single gap and a quality value (QV) larger than 60, of the model plant Arabidopsis thaliana ecotype Columbia (Col-0), generated using a combination of Oxford Nanopore Technology (ONT) ultra-long reads, high-fidelity (HiFi) reads and Hi-C data. The total genome assembly size is 133,877,291 bp (chr1: 32,659,241 bp, chr2: 22,712,559 bp, chr3: 26,161,332 bp, chr4: 22,250,686 bp and chr5: 30,093,473 bp), and it introduces 14.73 Mb of novel sequence (96% of which belongs to centromeres) compared to the TAIR10.1 reference genome. All five chromosomes of our HQ assembly are highly accurate with QV larger than 60, ranging from QV62 to QV68, which is significantly higher than the TAIR10.1 reference (44-51) and a recently published genome (41-43). We have completely resolved chr3 and chr5 from telomere to telomere. Chr2 and chr4 are completely resolved apart from the nucleolar organizing regions, which are composed of long, highly repetitive DNA fragments. It has been reported that centromere 1 is about 9 Mb long and is hard to assemble because it contains tens of thousands of CEN180 satellite repeats. Based on the cutting-edge sequencing data, we assembled about 4 Mb of continuous sequence of centromere 1. We found different identity patterns across the five centromeres, and all centromeres were significantly enriched with CENH3 ChIP-seq signals, confirming the accuracy of the assembly. We obtained four clusters of CEN180 repeats and found that CENH3 showed a strong preference for cluster 3. Moreover, we observed hypomethylation patterns in CENH3-enriched regions. This high-quality genome assembly will be a valuable reference for understanding the global patterns of centromeric polymorphism, both genetic and epigenetic, in naturally inbred lines of Arabidopsis thaliana.
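
The figures quoted in this abstract can be sanity-checked with a few lines of arithmetic. The sketch below sums the five chromosome lengths and converts a QV into a per-base error rate. It assumes the QV is Phred-scaled in the usual way (error probability = 10^(-QV/10)), which the abstract does not state explicitly; the function and variable names are illustrative and are not taken from the paper.

```python
# Chromosome lengths (bp) exactly as quoted in the abstract.
CHROM_LENGTHS = {
    "chr1": 32_659_241,
    "chr2": 22_712_559,
    "chr3": 26_161_332,
    "chr4": 22_250_686,
    "chr5": 30_093_473,
}
REPORTED_TOTAL = 133_877_291  # total assembly size quoted in the abstract


def qv_to_error_rate(qv: float) -> float:
    """Per-base error probability, ASSUMING standard Phred scaling."""
    return 10 ** (-qv / 10)


def expected_errors(qv: float, length_bp: int) -> float:
    """Expected number of erroneous bases in a sequence of the given length."""
    return qv_to_error_rate(qv) * length_bp


total = sum(CHROM_LENGTHS.values())
print("sum of chromosomes matches reported total:", total == REPORTED_TOTAL)

# Compare the lowest QV reported for this assembly (62) with the lower
# bound quoted for TAIR10.1 (44), over the whole assembly length.
for qv in (62, 44):
    print(f"QV{qv}: ~{expected_errors(qv, total):.0f} expected errors genome-wide")
```

Under that assumption, QV60 corresponds to one expected error per million bases, which is why a difference of roughly 20 QV units amounts to about a hundredfold change in error rate.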


2021 ◽  
Author(s):  
Matthew Naish ◽  
Michael Alonge ◽  
Piotr Wlodzimierz ◽  
Andrew J Tock ◽  
Bradley W Abramson ◽  
...  

Centromeres attach chromosomes to spindle microtubules during cell division and, despite this conserved role, show paradoxically rapid evolution and are typified by complex repeats. We used ultra-long-read sequencing to generate the Col-CEN Arabidopsis thaliana genome assembly that resolves all five centromeres. The centromeres consist of megabase-scale tandemly repeated satellite arrays, which support high CENH3 occupancy and are densely DNA methylated, with satellite variants private to each chromosome. CENH3 preferentially occupies satellites with least divergence and greatest higher-order repetition. The centromeres are invaded by ATHILA retrotransposons, which disrupt genetic and epigenetic organization of the centromeres. Crossover recombination is suppressed within the centromeres, yet low levels of meiotic DSBs occur that are regulated by DNA methylation. We propose that Arabidopsis centromeres are evolving via cycles of satellite homogenization and retrotransposon-driven diversification.


2021 ◽  
Vol 22 (3) ◽  
pp. 1360
Author(s):  
Lingling Xuan ◽  
Jie Zhang ◽  
Weitai Lu ◽  
Pawel Gluza ◽  
Berit Ebert ◽  
...  

Glycosyltransferases (GTs) catalyze the synthesis of glycosidic linkages and are essential in the biosynthesis of glycans, glycoconjugates (glycolipids and glycoproteins), and glycosides. Plant genomes generally encode many more GTs than animal genomes due to the synthesis of a cell wall and a wide variety of glycosylated secondary metabolites. The Arabidopsis thaliana genome is predicted to encode over 573 GTs that are currently classified into 42 diverse families. The biochemical functions of most of these GTs are still unknown. In this study, we updated the JBEI Arabidopsis GT clone collection by cloning an additional 105 GT cDNAs, 508 in total (89%), into Gateway-compatible vectors for downstream characterization. We further established a functional analysis pipeline using transient expression in tobacco (Nicotiana benthamiana) followed by enzymatic assays, fractionation of enzymatic products by reversed-phase HPLC (RP-HPLC) and characterization by mass spectrometry (MS). Using the GT14 family as an exemplar, we outline a strategy for identifying effective substrates of GT enzymes. By addition of UDP-GlcA as donor and the synthetic acceptors galactose-nitrobenzodiazole (Gal-NBD), β-1,6-galactotetraose (β-1,6-Gal4) and β-1,3-galactopentose (β-1,3-Gal5) to microsomes expressing individual GT14 enzymes, we verified the β-glucuronosyltransferase (GlcAT) activity of three members of this family (AtGlcAT14A, B, and E). In addition, a new family member (AT4G27480, 248) was shown to possess significantly higher activity than other GT14 enzymes. Our data indicate a likely role in arabinogalactan-protein (AGP) biosynthesis for these GT14 members. Together, the updated Arabidopsis GT clone collection and the biochemical analysis pipeline present an efficient means to identify and characterize novel GT catalytic activities.


Genes ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 135
Author(s):  
Eugene V. Korotkov ◽  
Yulia M. Suvorova ◽  
Dmitrii O. Kostenko ◽  
Maria A. Korotkova

In this study, we developed a new mathematical method for performing multiple alignment of highly divergent sequences (MAHDS), i.e., sequences that have on average more than 2.5 substitutions per position (x). We generated sets of artificial DNA sequences with x ranging from 0 to 4.4 and applied MAHDS as well as currently used multiple sequence alignment algorithms, including ClustalW, MAFFT, T-Coffee, Kalign, and Muscle to these sets. The results indicated that most of the existing methods could produce statistically significant alignments only for the sets with x < 2.5, whereas MAHDS could operate on sequences with x = 4.4. We also used MAHDS to analyze a set of promoter sequences from the Arabidopsis thaliana genome and discovered many conserved regions upstream of the transcription initiation site (from −499 to +1 bp); a part of the downstream region (from +1 to +70 bp) also significantly contributed to the obtained alignments. The possibilities of applying the newly developed method for the identification of promoter sequences in any genome are discussed. A server for multiple alignment of nucleotide sequences has been created.
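
To make the divergence parameter x concrete, here is a minimal sketch of how artificial DNA sequences with a chosen average number of substitutions per position might be generated. The abstract does not describe the authors' simulation procedure, so the model here (uniformly random positions, each event replacing the base with a different one) and all function names are assumptions chosen for illustration. This is not the MAHDS algorithm itself.

```python
import random

BASES = "ACGT"


def random_sequence(length: int, rng: random.Random) -> str:
    """Uniformly random DNA sequence of the given length."""
    return "".join(rng.choice(BASES) for _ in range(length))


def mutate(seq: str, x: float, rng: random.Random) -> str:
    """Apply round(x * len(seq)) substitution events at random positions.

    Each event replaces the current base with a different base, so the
    average number of substitutions per position is x (assumed model).
    """
    bases = list(seq)
    n_events = round(x * len(bases))
    for _ in range(n_events):
        pos = rng.randrange(len(bases))
        bases[pos] = rng.choice([b for b in BASES if b != bases[pos]])
    return "".join(bases)


def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences agree."""
    return sum(p == q for p, q in zip(a, b)) / len(a)


rng = random.Random(0)  # fixed seed so the run is repeatable
ancestor = random_sequence(2000, rng)
for x in (0.0, 1.0, 2.5, 4.4):
    derived = mutate(ancestor, x, rng)
    print(f"x = {x}: identity to ancestor = {identity(ancestor, derived):.3f}")
```

In this model, repeated hits on the same position erase earlier changes, so identity to the ancestor falls toward the 25% expected by chance for four equally likely bases. That saturation is what makes sequences with large x hard to align.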


Author(s):  
Haidong Yan ◽  
Federica Torchiana ◽  
Aureliano Bombarely

Background: Transposable elements (TEs) constitute the vast majority of all eukaryotic DNA, and display extreme diversity, with thousands of families. Given their abundance and diversity, TE discovery and annotation are challenging. At present, tools and databases build libraries to mask TEs in genomes based on de novo and homology-based identification strategies, but no consensus criteria have been proposed about which tools should be used. Results: For the de novo strategy, we compared the performance of TE libraries developed by four commonly used tools, RepeatModeler, LTR_FINDER, LTRharvest, and MITE_Hunter, using a simulated genome as a standard control. The results showed that the performance of RepeatModeler decreased when it was combined with either LTR_FINDER or LTRharvest. The combination of RepeatModeler and MITE_Hunter performed better than either tool alone. For the homology-based strategy, we evaluated different sources from a taxonomic point of view to build an accurate TE library. When we selected a library from databases to identify TEs in the Arabidopsis thaliana genome, a library from a genus genetically closer to Arabidopsis performed better than libraries from more distant genera. Without the Arabidopsis library, a combination of the three genera closest to Arabidopsis performed better than a combination of all genera. Conclusion: This study proposes a series of recommendations for accurate TE annotation: 1) for the de novo strategy, RepeatModeler and MITE_Hunter are suggested for building a TE library; 2) for the homology-based strategy, it is recommended to use a library from a genus genetically close to the species rather than a combined library from all genera.


2020 ◽  
Vol 21 (6) ◽  
pp. 2065
Author(s):  
Jan Fíla ◽  
Božena Klodová ◽  
David Potěšil ◽  
Miloslav Juříček ◽  
Petr Šesták ◽  
...  

The nascent polypeptide-associated (NAC) complex was described in yeast as a heterodimer composed of two subunits, α and β, and was shown to bind to the nascent polypeptides newly emerging from the ribosomes. NAC function has been widely described in yeast, and some information is also available about its role in plants. Knockdown of individual NAC subunit(s) usually led to a higher sensitivity to stress. In the Arabidopsis thaliana genome, there are five genes encoding the NACα subunit and two genes encoding NACβ. A double homozygous mutant in both genes coding for NACβ was obtained; it showed delayed development compared to the wild type and had an abnormal number of flower organs, shorter siliques and a greatly reduced seed set. Both NACβ genes were characterized in more detail; the phenotype of the double homozygous mutant was complemented by a functional NACβ copy. Then, both NACβ genes were localized to nuclei and cytoplasm and their promoters were active in many organs (leaves, cauline leaves, flowers, pollen grains, and siliques together with seeds). Since flowers were the organs most affected by the nacβ mutation, the flower buds’ transcriptome was identified by RNA sequencing, and their proteome by a gel-free approach. The differential expression analyses of transcriptomic and proteomic datasets suggest the involvement of NACβ subunits in stress responses, male gametophyte development, and photosynthesis.


Nature ◽  
2019 ◽  
Vol 574 (7778) ◽  
pp. E16-E16
Author(s):  
Moises Exposito-Alonso ◽  
Hernán A. Burbano ◽  
Oliver Bossdorf ◽  
Rasmus Nielsen ◽  
...  
