Homology-guided re-annotation improves the gene models of the alloploid Nicotiana benthamiana

2018 ◽  
Author(s):  
Jiorgos Kourelis ◽  
Farnusch Kaschani ◽  
Friederike M. Grosse-Holz ◽  
Felix Homma ◽  
Markus Kaiser ◽  
...  

Nicotiana benthamiana is an important model organism of the Solanaceae (Nightshade) family. Several draft assemblies of the N. benthamiana genome have been generated, but many of the gene models in these draft assemblies appear incorrect. Here we present an improved re-annotation of the Niben1.0.1 draft genome assembly guided by gene models from other Nicotiana species. This approach overcomes problems caused by mis-annotated exon-intron boundaries and by short-read transcripts mis-assigned to homeologs in polyploid genomes. With an estimated 98.1% completeness, only 53,411 protein-encoding genes, and improved protein lengths and functional annotations, this new predicted proteome is better than the preceding proteome annotations. This dataset is more sensitive and accurate in proteomics applications, clarifying the detection by activity-based proteomics of proteins that were previously mis-annotated to be inactive. Phylogenetic analysis of the subtilase family of hydrolases reveals a pseudogenisation of likely homeologs, associated with a contraction of the functional genome in this alloploid plant species. We use this gene annotation to assign extracellular proteins in comparison to a total leaf proteome, to display the enrichment of hydrolases in the apoplast.

BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Jiorgos Kourelis ◽  
Farnusch Kaschani ◽  
Friederike M. Grosse-Holz ◽  
Felix Homma ◽  
Markus Kaiser ◽  
...  

Abstract Background: Nicotiana benthamiana is an important model organism of the Solanaceae (Nightshade) family. Several draft assemblies of the N. benthamiana genome have been generated, but many of the gene models in these draft assemblies appear incorrect. Results: Here we present an improved proteome based on the Niben1.0.1 draft genome assembly guided by gene models from other Nicotiana species. Due to the fragmented nature of the Niben1.0.1 draft genome, many protein-encoding genes are missing or partial. We complement these missing proteins by similarly annotating other draft genome assemblies. This approach overcomes problems caused by mis-annotated exon-intron boundaries and by short-read transcripts mis-assigned to homeologs in polyploid genomes. With an estimated 98.1% completeness, only 53,411 protein-encoding genes, and improved protein lengths and functional annotations, this new predicted proteome is better in assigning spectra than the preceding proteome annotations. This dataset is more sensitive and accurate in proteomics applications, clarifying the detection by activity-based proteomics of proteins that were previously predicted to be inactive. Phylogenetic analysis of the subtilase family of hydrolases reveals inactivation of likely homeologs, associated with a contraction of the functional genome in this alloploid plant species. Finally, we use this new proteome annotation to characterize the extracellular proteome as compared to a total leaf proteome, which highlights the enrichment of hydrolases in the apoplast. Conclusions: This proteome annotation provides the community working with Nicotiana benthamiana with an important new resource for functional proteomics.


2017 ◽  
Vol 5 (32) ◽  
Author(s):  
Chang-Young Hong ◽  
Su-Yeon Lee ◽  
Sun-Hwa Ryu ◽  
Sung-Suk Lee ◽  
Myungkil Kim

ABSTRACT Phanerochaete chrysosporium (ATCC 20696) has a catabolic ability to degrade lignin. Here, we report whole-genome sequencing used to identify genes related to lignin modification. We determined the 39-Mb draft genome sequence of this fungus, comprising 13,560 predicted gene models. Gene annotation provided crucial information about the location and function of protein-encoding genes.


Archaea ◽  
2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Anja Poehlein ◽  
Rolf Daniel ◽  
Henning Seedorf

Methanobrevibacter arboriphilus strain DH1 is an autotrophic methanogen that was isolated from the wetwood of methane-emitting trees. This species has been of considerable interest for its unusual oxygen tolerance and has been studied as a model organism for more than four decades. Strain DH1 is closely related to other host-associated Methanobrevibacter species from intestinal tracts of animals and the rumen, making this strain an interesting candidate for comparative analysis to identify factors important for colonizing intestinal environments. Here, the genome sequence of M. arboriphilus strain DH1 is reported. The draft genome is composed of 2,445,031 bp with an average GC content of 25.44% and is predicted to harbour 1,964 protein-encoding genes. Among the predicted genes, there are also more than 50 putative genes for the so-called adhesin-like proteins (ALPs). The presence of ALP-encoding genes in the genome of this non-host-associated methanogen strongly suggests that target surfaces for ALPs other than host tissues also need to be considered as potential interaction partners. The high abundance of ALPs may also indicate that these types of proteins are characteristic of specific phylogenetic groups of methanogens rather than indicative of a particular environment the methanogens thrive in.


2019 ◽  
Vol 8 (31) ◽  
Author(s):  
Valérie Polonais ◽  
Sebastian Niehus ◽  
Ivan Wawrzyniak ◽  
Adrien Franchet ◽  
Christine Gaspin ◽  
...  

We present the draft genome sequence of Tubulinosema ratisbonensis, a microsporidium species infecting Drosophila melanogaster. A total of 3,013 protein-encoding genes and an array of transposable elements were identified. This work represents a necessary step to develop a novel model of host-parasite relationships using the highly tractable genetic model D. melanogaster.


2017 ◽  
Vol 5 (7) ◽  
Author(s):  
Yannick Lara ◽  
Benoit Durieu ◽  
Luc Cornet ◽  
Olivier Verlaine ◽  
Rosmarie Rippka ◽  
...  

ABSTRACT Phormidesmis priestleyi ULC007 is an Antarctic freshwater cyanobacterium. Its draft genome is 5,684,389 bp long. It contains a total of 5,604 protein-encoding genes, of which 22.2% have no clear homologues in known genomes. To date, this draft genome is the first one ever determined for an axenic cyanobacterium from Antarctica.


2017 ◽  
Author(s):  
Timothy H Webster ◽  
Greer A. Dolby ◽  
Melissa Wilson Sayres ◽  
Kenro Kusumi

Exogenous sequence contamination presents a challenge in first-draft genomes because it can lead to non-contiguous, chimeric assembled sequences. This can mislead downstream analyses reliant on synteny, such as linkage-based analyses. Recently, the Mojave Desert Tortoise (Gopherus agassizii) draft genome was published as a resource to advance conservation efforts for the threatened species and discover more about chelonian biology and evolution. Here, we illustrate steps taken to improve the desert tortoise draft genome by removing contaminating sequences, actions that are typically carried out after the initial release of a draft genome assembly. We used information from NCBI's VecScreen output to remove intra-scaffold contamination and trim leading and trailing Ns. We then reordered and renamed scaffolds, and transferred the gene annotation onto this assembly. Finally, we describe the tools developed for this pipeline, freely available on GitHub (https://github.com/thw17/G_agassizii_reference_update), which facilitate post-assembly processing of other draft genomes. The new gopAga1.1 genome has an N50 of 251 kb, L50 of 2592 scaffolds, and its annotation retains 17,201 of the original 20,172 genes that were unaffected by the scaffold processing.


2018 ◽  
Author(s):  
Timothy H Webster ◽  
Greer A Dolby ◽  
Melissa A Wilson Sayres ◽  
Kenro Kusumi

Exogenous sequence contamination presents a challenge in first-draft genomes because it can lead to non-contiguous, chimeric assembled sequences. This can mislead downstream analyses reliant on synteny, such as linkage-based analyses. Recently, the Mojave Desert Tortoise (Gopherus agassizii) draft genome was published as a resource to advance conservation efforts for the threatened species and discover more about chelonian biology and evolution. Here, we illustrate steps taken to improve the desert tortoise draft genome by removing contaminating sequences, actions that are typically carried out after the initial release of a draft genome assembly. We used information from NCBI's VecScreen output to remove intra-scaffold contamination and trim leading and trailing Ns. We then reordered and renamed scaffolds, and transferred the gene annotation onto this assembly. Finally, we describe the tools developed for this pipeline, freely available on GitHub (https://github.com/thw17/G_agassizii_reference_update), which facilitate post-assembly processing of other draft genomes. The new gopAga1.1 genome has an N50 of 251 kb, L50 of 2592 scaffolds, and its annotation retains 17,201 of the original 20,172 genes that were unaffected by the scaffold processing.
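
For readers unfamiliar with the assembly statistics quoted in this abstract, the following is a minimal Python sketch of how flanking Ns can be trimmed from scaffolds and how N50 and L50 are conventionally computed. It is not the authors' pipeline (their tools are in the repository linked above); the function names `trim_flanking_ns` and `n50_l50` and the toy sequences are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple


def trim_flanking_ns(seq: str) -> str:
    """Remove runs of N (either case) from both ends of a scaffold sequence.

    Internal Ns (gaps inside the scaffold) are left untouched.
    """
    return seq.strip("Nn")


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of scaffold lengths.

    N50 is the length of the shortest scaffold in the smallest set of
    longest scaffolds that together cover at least half of the assembly;
    L50 is the number of scaffolds in that set.
    """
    ordered = sorted((n for n in lengths if n > 0), reverse=True)
    if not ordered:
        raise ValueError("no non-empty scaffolds")
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise AssertionError("unreachable: cumulative sum always reaches half")


# Toy scaffolds (hypothetical data, not taken from gopAga1.1).
scaffolds = ["NNNACGTACGTNN", "ACGTNNACGT", "NNAC", "ACGTAC"]
trimmed = [trim_flanking_ns(s) for s in scaffolds]
n50, l50 = n50_l50(len(s) for s in trimmed)
print(trimmed, n50, l50)
```

With these definitions, an N50 of 251 kb and an L50 of 2592 would mean that the 2592 longest scaffolds, each at least 251 kb long, together cover at least half of the assembled bases.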


2018 ◽  
Vol 6 (9) ◽  
pp. e01487-17
Author(s):  
Awa Diop ◽  
Khoudia Diop ◽  
Enora Tomei ◽  
Didier Raoult ◽  
Florence Fenollar ◽  
...  

ABSTRACT We report here the draft genome sequence of Ezakiella peruensis strain M6.X2T. The draft genome is 1,672,788 bp long and harbors 1,589 predicted protein-encoding genes, including 26 antibiotic resistance genes with 1 gene encoding vancomycin resistance. The genome also exhibits 1 clustered regularly interspaced short palindromic repeat region and 333 genes acquired by horizontal gene transfer.


2016 ◽  
Vol 4 (1) ◽  
Author(s):  
Brandon S. Guida ◽  
Ferran Garcia-Pichel

Mastigocoleus testarum strain BC008 is a model organism used to study marine photoautotrophic carbonate dissolution. It is a multicellular, filamentous, diazotrophic, euendolithic cyanobacterium ubiquitously found in marine benthic environments. We present an accurate draft genome assembly of 172 contigs spanning 12,700,239 bp, with 9,131 annotated genes and an average G+C content of 37.3%.


2018 ◽  
Vol 6 (18) ◽  
Author(s):  
Anja Poehlein ◽  
Tim Böer ◽  
Kerrin Steensen ◽  
Rolf Daniel

ABSTRACT The spore-forming, thermophilic, and obligate anaerobic bacterium Moorella stamsii was isolated from digester sludge. Apart from its ability to use carbon monoxide for growth, M. stamsii harbors several enzymes for the use of different sugars. The draft genome has a size of 3.329 Mb and contains 3,306 predicted protein-encoding genes.

