FusoPortal: An interactive repository of hybrid MinION-sequenced Fusobacterium genomes improves gene identification and characterization

2018 ◽  
Author(s):  
Blake E. Sanders ◽  
Ariana Umana ◽  
Justin A. Lemkul ◽  
Daniel J. Slade

Abstract Here we present FusoPortal, an interactive repository of Fusobacterium genomes that were sequenced using a hybrid MinION long-read sequencing pipeline, followed by assembly and annotation using a diverse portfolio of predominantly open-source software. Significant efforts were made to provide genomic and bioinformatic data as downloadable files, including raw sequencing reads, genome maps, gene annotations, protein functional analysis and classifications, and a custom BLAST server for FusoPortal genomes. FusoPortal has been initiated with eight complete genomes, of which seven were previously only drafts that varied from 24-67 contigs. We showcase that genomes in FusoPortal provide accurate open reading frame annotations, and have corrected a number of large genes (>3 kb) that were previously misannotated due to contig boundaries. In summary, FusoPortal (http://fusoportal.org) is the first database of MinION sequenced and completely assembled Fusobacterium genomes, and this central Fusobacterium genomic and bioinformatic resource will aid the scientific community in developing a deeper understanding of how this human pathogen contributes to an array of diseases including periodontitis and colorectal cancer.

Importance In this study, we report a hybrid MinION whole genome sequencing pipeline, and describe the genomic characteristics of the first eight strains deposited in the FusoPortal database. This collection of highly accurate and complete genomes drastically improves upon previous multi-contig assemblies by correcting and newly identifying a significant number of open reading frames. We believe this resource will result in the discovery of proteins and molecular mechanisms used by an oral pathogen, with the potential to further our understanding of how F. nucleatum contributes to a repertoire of diseases including periodontitis, pre-term birth, and colorectal cancer.

mSphere ◽  
2018 ◽  
Vol 3 (4) ◽  
Author(s):  
Blake E. Sanders ◽  
Ariana Umana ◽  
Justin A. Lemkul ◽  
Daniel J. Slade

ABSTRACT Here we present FusoPortal, an interactive repository of Fusobacterium genomes that were sequenced using a hybrid MinION long-read sequencing pipeline, followed by assembly and annotation using a diverse portfolio of predominantly open-source software. Significant efforts were made to provide genomic and bioinformatic data as downloadable files, including raw sequencing reads, genome maps, gene annotations, protein functional analysis and classifications, and a custom BLAST server for FusoPortal genomes. FusoPortal has been initiated with eight complete genomes, of which seven were previously only drafts that ranged from 24 to 67 contigs. We have showcased that the genomes in FusoPortal provide accurate open reading frame annotations and have corrected a number of large (>3-kb) genes that were previously misannotated due to contig boundaries. In summary, FusoPortal (http://fusoportal.org) is the first database of MinION-sequenced and completely assembled Fusobacterium genomes, and this central Fusobacterium genomic and bioinformatic resource will aid the scientific community in developing a deeper understanding of how this human pathogen contributes to an array of diseases, including periodontitis and colorectal cancer.

IMPORTANCE In this report, we describe a hybrid MinION whole-genome sequencing pipeline and the genomic characteristics of the first eight Fusobacterium strains deposited in the FusoPortal database. This collection of highly accurate and complete genomes drastically improves upon previous multicontig assemblies by correcting and newly identifying a significant number of open reading frames. We believe that the availability of this resource will result in the discovery of proteins and molecular mechanisms used by an oral pathogen, with the potential to further our understanding of how Fusobacterium nucleatum contributes to a repertoire of diseases, including periodontitis, preterm birth, and colorectal cancer.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Robin-Lee Troskie ◽  
Yohaann Jafrani ◽  
Tim R. Mercer ◽  
Adam D. Ewing ◽  
Geoffrey J. Faulkner ◽  
...  

Abstract Pseudogenes are gene copies presumed to mainly be functionless relics of evolution due to acquired deleterious mutations or transcriptional silencing. Using deep full-length PacBio cDNA sequencing of normal human tissues and cancer cell lines, we identify here hundreds of novel transcribed pseudogenes expressed in tissue-specific patterns. Some pseudogene transcripts have intact open reading frames and are translated in cultured cells, representing unannotated protein-coding genes. To assess the biological impact of noncoding pseudogenes, we CRISPR-Cas9 delete the nucleus-enriched pseudogene PDCL3P4 and observe hundreds of perturbed genes. This study highlights pseudogenes as a complex and dynamic component of the human transcriptional landscape.


1988 ◽  
Vol 8 (9) ◽  
pp. 3827-3836
Author(s):  
N P Williams ◽  
P P Mueller ◽  
A G Hinnebusch

Translational control of GCN4 expression in the yeast Saccharomyces cerevisiae is mediated by multiple AUG codons present in the leader of GCN4 mRNA, each of which initiates a short open reading frame of only two or three codons. Upstream AUG codons 3 and 4 are required to repress GCN4 expression in normal growth conditions; AUG codons 1 and 2 are needed to overcome this repression in amino acid starvation conditions. We show that the regulatory function of AUG codons 1 and 2 can be qualitatively mimicked by the AUG codons of two heterologous upstream open reading frames (URFs) containing the initiation regions of the yeast genes PGK and TRP1. These AUG codons inhibit GCN4 expression when present singly in the mRNA leader; however, they stimulate GCN4 expression in derepressing conditions when inserted upstream from AUG codons 3 and 4. This finding supports the idea that AUG codons 1 and 2 function in the control mechanism as translation initiation sites and further suggests that suppression of the inhibitory effects of AUG codons 3 and 4 is a general consequence of the translation of URF 1 and 2 sequences upstream. Several observations suggest that AUG codons 3 and 4 are efficient initiation sites; however, these sequences do not act as positive regulatory elements when placed upstream from URF 1. This result suggests that efficient translation is only one of the important properties of the 5' proximal URFs in GCN4 mRNA. We propose that a second property is the ability to permit reinitiation following termination of translation and that URF 1 is optimized for this regulatory function.
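As an illustration of what "a short open reading frame of only two or three codons" in an mRNA leader looks like computationally, the sketch below scans a leader sequence for AUG codons followed by an in-frame stop codon. The function name `find_uorfs` and the toy sequences are hypothetical and are not taken from the GCN4 leader or from the paper; the code says nothing about repression or reinitiation, only where candidate upstream ORFs sit.

```python
# Minimal sketch: locate upstream open reading frames (uORFs) in an mRNA leader.
# Assumption: a uORF is any AUG followed by an in-frame stop codon within the leader.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(leader):
    """Return (start, end, sense_codons) for each AUG with an in-frame stop.

    start is the 0-based index of the A in AUG, end is the index just past
    the stop codon, and sense_codons counts the AUG plus the codons before
    the stop.
    """
    leader = leader.upper().replace("T", "U")  # accept DNA or RNA letters
    uorfs = []
    for start in range(len(leader) - 2):
        if leader[start:start + 3] != "AUG":
            continue
        # Walk codon by codon from the AUG until the first in-frame stop.
        for i in range(start + 3, len(leader) - 2, 3):
            if leader[i:i + 3] in STOP_CODONS:
                uorfs.append((start, i + 3, (i - start) // 3))
                break
    return uorfs


# Toy leader (hypothetical): one AUG, one sense codon after it, then a stop.
print(find_uorfs("GGAUGGCUUAACC"))
```

An AUG with no in-frame stop inside the leader is skipped here, since it would read through into downstream sequence rather than form a self-contained short ORF.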


F1000Research ◽  
2015 ◽  
Vol 4 ◽  
pp. 155 ◽  
Author(s):  
Sandeep Chakraborty ◽  
Monica Britton ◽  
Jill Wegrzyn ◽  
Timothy Butterfield ◽  
Pedro José Martínez-García ◽  
...  

The transcriptome provides a functional footprint of the genome by enumerating the molecular components of cells and tissues. The field of transcript discovery has been revolutionized through high-throughput mRNA sequencing (RNA-seq). Here, we present a methodology that replicates and improves existing methodologies, and implements a workflow for error estimation and correction followed by genome annotation and transcript abundance estimation for RNA-seq derived transcriptome sequences (YeATS - Yet Another Tool Suite for analyzing RNA-seq derived transcriptome). A unique feature of YeATS is the upfront determination of the errors in the sequencing or transcript assembly process by analyzing open reading frames of transcripts. YeATS identifies transcripts that have not been merged, result in broken open reading frames or contain long repeats as erroneous transcripts. We present the YeATS workflow using a representative sample of the transcriptome from the tissue at the heartwood/sapwood transition zone in black walnut. A novel feature of the transcriptome that emerged from our analysis was the identification of a highly abundant transcript that had no known homologous genes (GenBank accession: KT023102). The amino acid composition of the longest open reading frame of this gene classifies this as a putative extensin. Also, we corroborated the transcriptional abundance of proline-rich proteins, dehydrins, senescence-associated proteins, and the DNAJ family of chaperone proteins. Thus, YeATS presents a workflow for analyzing RNA-seq data with several innovative features that differentiate it from existing software.


2007 ◽  
Vol 74 (4) ◽  
pp. 1281-1283 ◽  
Author(s):  
Donald A. Comfort ◽  
Chung-Jung Chou ◽  
Shannon B. Conners ◽  
Amy L. VanFossen ◽  
Robert M. Kelly

ABSTRACT Bioinformatics analysis and transcriptional response information for Pyrococcus furiosus grown on α-glucans led to the identification of a novel isomaltase (PF0132) representing a new glycoside hydrolase (GH) family, a novel GH57 β-amylase (PF0870), and an extracellular starch-binding protein (1,141 amino acids; PF1109-PF1110), in addition to several other putative α-glucan-processing enzymes.


Pathogens ◽  
2019 ◽  
Vol 8 (2) ◽  
pp. 57 ◽  
Author(s):  
Kadriye Çağlayan ◽  
Vahid Roumi ◽  
Mona Gazel ◽  
Eminur Elçi ◽  
Mehtap Acioğlu ◽  
...  

High throughput sequencing of total RNA isolated from symptomatic leaves of a sweet cherry tree (Prunus avium cv. 0900 Ziraat) from Turkey identified a new member of the genus Robigovirus designated cherry virus Turkey (CVTR). The presence of the virus was confirmed by electron microscopy and overlapping RT-PCR for sequencing its whole-genome. The virus has a ssRNA genome of 8464 nucleotides which encodes five open reading frames (ORFs) and comprises two non-coding regions, 5′ UTR and 3′ UTR of 97 and 296 nt, respectively. Compared to the five most closely related robigoviruses, RdRp, TGB1, TGB2, TGB3 and CP share amino acid identities ranging from 43–53%, 44–60%, 39–43%, 38–44% and 45–50%, respectively. Unlike the four cherry robigoviruses, CVTR lacks ORFs 2a and 5a. Its genome organization is therefore more similar to African oil palm ringspot virus (AOPRV). Using specific primers, the presence of CVTR was confirmed in 15 sweet cherries and two sour cherries out of 156 tested samples collected from three regions in Turkey. Among them, five samples were showing slight chlorotic symptoms on the leaves. It seems that CVTR infects cherry trees with or without eliciting obvious symptoms, but these data should be confirmed by bioassays in woody and possible herbaceous hosts in future studies.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Jonathan Bohlen ◽  
Liza Harbrecht ◽  
Saioa Blanco ◽  
Katharina Clemm von Hohenberg ◽  
Kai Fenzl ◽  
...  

Abstract Translation efficiency varies considerably between different mRNAs, thereby impacting protein expression. Translation of the stress response master-regulator ATF4 increases upon stress, but the molecular mechanisms are not well understood. We discover here that translation factors DENR, MCTS1 and eIF2D are required to induce ATF4 translation upon stress by promoting translation reinitiation in the ATF4 5′UTR. We find DENR and MCTS1 are only needed for reinitiation after upstream Open Reading Frames (uORFs) containing certain penultimate codons, perhaps because DENR•MCTS1 are needed to evict only certain tRNAs from post-termination 40S ribosomes. This provides a model for how DENR and MCTS1 promote translation reinitiation. Cancer cells, which are exposed to many stresses, require ATF4 for survival and proliferation. We find a strong correlation between DENR•MCTS1 expression and ATF4 activity across cancers. Furthermore, additional oncogenes including a-Raf, c-Raf and Cdk4 have long uORFs and are translated in a DENR•MCTS1 dependent manner.


1999 ◽  
Vol 10 (04) ◽  
pp. 635-643 ◽  
Author(s):  
AGNIESZKA GIERLIK ◽  
PAWEŁ MACKIEWICZ ◽  
MARIA KOWALCZUK ◽  
STANISŁAW CEBRAT ◽  
MIROSŁAW R. DUDEK

Coding sequences of DNA generate Open Reading Frames (ORFs) inside them with much higher frequency than random DNA sequences do, especially in the antisense strand. This is a specific feature of the genetic code. Since coding sequences are selected for their length, the generated ORFs are indirect results of this selection and their length is also influenced by selection. That is why ORFs found in any genome, even much longer ones than those spontaneously generated in random DNA sequences, should be considered as two different sets of ORFs: The first one coding for proteins, the second one generated by the coding ORFs. Even intergenic sequences possess greater capacity for generating ORFs than random DNA sequences of the same nucleotide composition, which seems to be a premise that intergenic sequences were generated from coding sequences by recombinational mechanisms.
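The comparison described above, ORFs in a real sequence versus ORFs in random DNA of the same nucleotide composition, can be sketched roughly as follows. This is a simplified illustration and not the authors' method: it assumes an "open" stretch is any run of codons without a stop codon (no start codon required), and it uses a shuffled copy of the input as the random control. All function names are hypothetical.

```python
# Minimal sketch: longest stop-free stretch in each of the six reading frames,
# plus a composition-preserving shuffle to use as a random control.
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antisense strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_open_run(seq, frame):
    """Length in codons of the longest stop-free stretch in one frame."""
    longest = current = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


def longest_open_run_six_frames(seq):
    """Map (strand, frame) to the longest stop-free stretch in that frame."""
    strands = (("+", seq), ("-", reverse_complement(seq)))
    return {
        (strand, frame): longest_open_run(s, frame)
        for strand, s in strands
        for frame in range(3)
    }


def shuffled(seq, seed=0):
    """Random control with the same nucleotide composition as seq."""
    rng = random.Random(seed)
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)
```

In use, one would run `longest_open_run_six_frames` on a coding sequence and on several `shuffled` copies (different seeds) and compare the per-frame values; a single shuffle is too noisy to support any conclusion.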


2004 ◽  
Vol 78 (21) ◽  
pp. 11544-11550 ◽  
Author(s):  
Paul Kraft ◽  
Andrea Oeckinghaus ◽  
Daniel Kümmel ◽  
George H. Gauss ◽  
John Gilmore ◽  
...  

ABSTRACT Sulfolobus spindle-shaped viruses (SSVs), or Fuselloviridae, are ubiquitous crenarchaeal viruses found in high-temperature acidic hot springs around the world (pH ≤4.0; temperature of ≥70°C). Because they are relatively easy to isolate, they represent the best studied of the crenarchaeal viruses. This is particularly true for the type virus, SSV1, which contains a double-stranded DNA genome of 15.5 kilobases, encoding 34 putative open reading frames. Interestingly, the genome shows little sequence similarity to organisms other than its SSV homologues. Together, sequence similarity and biochemical analyses have suggested functions for only 6 of the 34 open reading frames. Thus, even though SSV1 is the best-studied crenarchaeal virus, functions for most (28) of its open reading frames remain unknown. We have undertaken biochemical and structural studies for the gene product of open reading frame F-93. We find that F-93 exists as a homodimer in solution and that a tight dimer is also present in the 2.7-Å crystal structure. Further, the crystal structure reveals a fold that is homologous to the SlyA and MarR subfamilies of winged-helix DNA binding proteins. This strongly suggests that F-93 functions as a transcription factor that recognizes a (pseudo-)palindromic DNA target sequence.


2008 ◽  
Vol 82 (17) ◽  
pp. 8917-8921 ◽  
Author(s):  
Christopher J. McCormick ◽  
Omar Salim ◽  
Paul R. Lambden ◽  
Ian N. Clarke

ABSTRACT A generally accepted view of norovirus replication is that capsid expression requires production of a subgenomic transcript, the presence of capsid often being used as a surrogate marker to indicate the occurrence of viral replication. Using a polymerase II-based baculovirus delivery system, we observed capsid expression following introduction of a full-length genogroup 3 norovirus genome into HepG2 cells. However, capsid expression occurred as a result of a novel translation termination/reinitiation event between the nonstructural-protein and capsid open reading frames, a feature that may be unique to genogroup 3 noroviruses.

