Gene organization and transcription of TED, a lepidopteran retrotransposon integrated within the baculovirus genome.

1990 ◽  
Vol 10 (6) ◽  
pp. 3067-3077 ◽  
Author(s):  
P D Friesen ◽  
M S Nissen

A single copy of the retrotransposon TED, from the moth Trichoplusia ni (a lepidopteran noctuid), was identified within the DNA genome of the baculovirus Autographa californica nuclear polyhedrosis virus. Determination of the complete nucleotide sequence (7,510 base pairs) of the integrated copy indicated that TED belongs to the family of retrotransposons that includes Drosophila melanogaster elements 17.6 and gypsy and thus represents the first nondipteran member of this invertebrate group to be identified. The internal portion of TED, flanked by long terminal repeats (LTRs), is composed of three long open reading frames comparable in size and location to the gag, pol, and env genes of the vertebrate retroviruses. Sequence similarity with the dipteran elements was the highest within individual domains of TED open reading frame 2 (pol region) that are also conserved among the retroviruses and encode protease, reverse transcriptase, and integrase functions, respectively. Mapping the 5' and 3' termini of TED RNAs indicated that the LTRs have a retroviral U3-R-U5 structural organization that is capable of directing the synthesis of transcripts that represent potential substrates for reverse transcription and intermediates in transposition. Abundant RNAs were also initiated from a site within the 5' LTR that matches the consensus motif for the promoter of late, hyperexpressed baculovirus genes. The presence of this viruslike promoter within TED and its subsequent activation only after integration within the viral genome suggest a possible symbiotic relationship with the baculovirus that could extend transposon host range.


1989 ◽  
Vol 9 (3) ◽  
pp. 935-945
Author(s):  
L A Johnston ◽  
M A Kotarski ◽  
D J Jerry ◽  
L P Kozak

While studying the organization of the mouse glycerol-phosphate dehydrogenase gene (Gdc-1 on chromosome 15), we identified a novel transcriptional unit located only 3.4 kilobases (kb) upstream of the 5' end of the Gdc-1 gene. This gene has been provisionally named D15Kz1. The unusual proximity of these two genes led us to investigate the pattern of expression and sequence characteristics of the new gene for comparison with those of Gdc-1. D15Kz1 was found to have transcripts of 3.2 and 3.4 kb in length. The 3.4-kb transcript was expressed at low levels in all tissues examined, whereas the 3.2-kb transcript was detected only in the cerebral cortex and the brown fat. D15Kz1 and Gdc-1 are not coordinately regulated, as evidenced by the characteristics of their expression in several tissues and in differentiating 3T3-F442A adipocyte cultures. A cDNA sequence of 3,105 bases isolated from an embryonal carcinoma lambda gt10 cDNA library had a large open reading frame of 461 amino acids at one end followed by 1.6 kb of sequence with multiple stop codons. Algorithms used to search the protein and nucleic acid data bases detected no significant sequence similarity to any other protein or gene. Southern blot analysis of genomic DNA using the D15Kz1 cDNA as a probe indicated that D15Kz1 is a single-copy gene in the mouse genome and that it is conserved in humans, rats, and chickens. This conservation of gene sequences suggests that D15Kz1 encodes a protein with an important cellular function.


Genome ◽  
2009 ◽  
Vol 52 (11) ◽  
pp. 904-911 ◽  
Author(s):  
M. Buti ◽  
T. Giordani ◽  
M. Vukich ◽  
L. Gentzbittel ◽  
L. Pistelli ◽  
...  

In this paper we report on the isolation and characterization, for the first time, of a complete 6511 bp retrotransposon of sunflower. Considering its protein domain order and sequence similarity to other copia elements of dicotyledons, this retrotransposon was assigned to the copia retrotransposon superfamily and named HACRE1 ( Helianthus annuus copia-like retroelement 1). HACRE1 carries 5′ and 3′ long terminal repeats (LTRs) flanking an internal region of 4661 bp. The LTRs are identical in their sequence except for two deletions of 7 and 5 nucleotides in the 5′ LTR. Based on the sequence identity of the LTRs, HACRE1 was estimated to have inserted within the last ∼84 000 years. The isolated sequence contains a complete open reading frame with only one complete reading frame. The absence of nonsense mutations agrees with the very high sequence identity between LTRs, confirming that HACRE1 insertion is recent. The haploid genome of sunflower (inbred line HCM) contains about 160 copies of HACRE1. This retrotransposon is expressed in leaflets from 7-day-old plantlets under different light conditions, probably in relation to the occurrence of many putative light-related regulatory cis-elements in the LTRs. However, sequenced cDNAs show less variability than HACRE1 genomic sequences, indicating that only a subset of this family is expressed under these conditions.


2004 ◽  
Vol 78 (21) ◽  
pp. 11544-11550 ◽  
Author(s):  
Paul Kraft ◽  
Andrea Oeckinghaus ◽  
Daniel Kümmel ◽  
George H. Gauss ◽  
John Gilmore ◽  
...  

ABSTRACT Sulfolobus spindle-shaped viruses (SSVs), or Fuselloviridae, are ubiquitous crenarchaeal viruses found in high-temperature acidic hot springs around the world (pH ≤4.0; temperature of ≥70°C). Because they are relatively easy to isolate, they represent the best studied of the crenarchaeal viruses. This is particularly true for the type virus, SSV1, which contains a double-stranded DNA genome of 15.5 kilobases, encoding 34 putative open reading frames. Interestingly, the genome shows little sequence similarity to organisms other than its SSV homologues. Together, sequence similarity and biochemical analyses have suggested functions for only 6 of the 34 open reading frames. Thus, even though SSV1 is the best-studied crenarchaeal virus, functions for most (28) of its open reading frames remain unknown. We have undertaken biochemical and structural studies for the gene product of open reading frame F-93. We find that F-93 exists as a homodimer in solution and that a tight dimer is also present in the 2.7-Å crystal structure. Further, the crystal structure reveals a fold that is homologous to the SlyA and MarR subfamilies of winged-helix DNA binding proteins. This strongly suggests that F-93 functions as a transcription factor that recognizes a (pseudo-)palindromic DNA target sequence.


2019 ◽  
Vol 11 (9) ◽  
pp. 2678-2690 ◽  
Author(s):  
Ann M McCartney ◽  
Edel M Hyland ◽  
Paul Cormican ◽  
Raymond J Moran ◽  
Andrew E Webb ◽  
...  

Abstract Gene fusion occurs when two or more individual genes with independent open reading frames become juxtaposed under the same open reading frame, creating a new fused gene. A small number of gene fusions described in detail have been associated with novel functions, for example, the hominid-specific PIPSL gene, TNFSF12, and the TWE-PRIL gene family. We use Sequence Similarity Networks and species level comparisons of great ape genomes to identify 45 new genes that have emerged by transcriptional readthrough, that is, transcription-derived gene fusion. For 35 of these putative gene fusions, we have been able to assess available RNAseq data to determine whether there are reads that map to each breakpoint. A total of 29 of the putative gene fusions had annotated transcripts (9/29 of which are human-specific). We carried out RT-qPCR in a range of human tissues (placenta, lung, liver, brain, and testes) and found that 23 of the putative gene fusion events were expressed in at least one tissue. Examining the available ribosome foot-printing data, we find evidence for translation of three of the fused genes in human. Finally, we find enrichment for transcription-derived gene fusions in regions of known segmental duplication in human. Together, our results implicate chromosomal structural variation brought about by segmental duplication in the emergence of novel transcripts and translated protein products.


1992 ◽  
Vol 3 (4) ◽  
pp. 403-414 ◽  
Author(s):  
L C Smith ◽  
R J Britten ◽  
E H Davidson

SpCoel1 is a single copy gene that is specifically expressed in most of the coelomocytes of the adult purple sea urchin, Strongylocentrotus purpuratus. The 4-kb transcript from this gene has a relatively short (426 nucleotide) open reading frame (ORF) with long 3' and 5' untranslated regions. The ORF encodes a protein that has strong amino acid sequence similarity to profilins from yeast to mammals. Transcript titrations of SpCoel1 show significant increases per coelomocyte in animals that have been physiologically challenged. Increases in transcript levels are of similar magnitudes between animals receiving different treatments, such as injuries from needle punctures or from injections of foreign cells. The evidence presented here implies a molecular mechanism by which this lower deuterostome defense system responds to external insult, viz that an external "injury signal" activates a signal transduction system, which in turn mediates the alterations in cytoskeletal state that are required for coelomocyte activation.


1989 ◽  
Vol 35 (1) ◽  
pp. 200-204 ◽  
Author(s):  
Johannes Auer ◽  
Konrad Lechner ◽  
August Bock

Two transcriptional units coding for ribosomal proteins and protein synthesis elongation factors in Methanococcus vannielii have been cloned and analysed in detail. They correspond to the "streptomycin operon" and "spectinomycin operon" of the Escherichia coli chromosome. The following general conclusions can be drawn from comparison of the nucleotide and the derived amino acid sequences of ribosomal proteins from Methanococcus with those from eubacteria and eukaryotes. (i) Ribosomal protein and elongation factor genes in Methanococcus are clustered in transcriptional units corresponding closely to E. coli ribosomal protein operons with respect to both gene composition and organization. (ii) These transcriptional units contain, in addition, a few open reading frames whose putative gene products share sequence similarity with eukaryotic 80S, but not with eubacterial, ribosomal proteins. They may correspond to "additional" ribosomal proteins of the Methanococcus ribosome, there being no functional homologues in the eubacterial ribosome. (iii) Methanococcus ribosomal proteins and elongation factors almost exclusively exhibit a higher sequence similarity to eukaryotic 80S ribosomal proteins than to those of eubacteria. (iv) Many Methanococcus ribosomal proteins have a size intermediate between those of their eukaryotic and eubacterial homologues. These results are discussed in terms of a hypothesis which implies that the recent eubacterial ribosome developed by a "minimization" process from a more complex organelle and that the archaebacterial ribosome has maintained features of this ancestor. Key words: archaebacteria, Methanococcus, transcription factors, clonal analysis.


1998 ◽  
Vol 180 (19) ◽  
pp. 5192-5202 ◽  
Author(s):  
Ping Hu ◽  
Jeffrey Elliott ◽  
Paula McCready ◽  
Evan Skowronski ◽  
Jeffrey Garnes ◽  
...  

ABSTRACT The complete nucleotide sequence and gene organization of the three virulence plasmids from Yersinia pestis KIM5 were determined. Plasmid pPCP1 (9,610 bp) has a GC content of 45.3% and encodes two previously known virulence factors, an associated protein, and a single copy of IS100. Plasmid pCD1 (70,504 bp) has a GC content of 44.8%. It is known to encode a number of essential virulence determinants, regulatory functions, and a multiprotein secretory system comprising the low-calcium response stimulon that is shared with the other two Yersinia species pathogenic for humans (Y. pseudotuberculosis and Y. enterocolitica). A new pseudogene, which occurs as an intact gene in the Y. enterocolitica and Y. pseudotuberculosis-derived analogues, was found in pCD1. It corresponds to that encoding the lipoprotein YlpA. Several intact and partial insertion sequences and/or transposons were also found in pCD1, as well as six putative structural genes with high homology to proteins of unknown function in other yersiniae. The sequences of the genes involved in the replication of pCD1 are highly homologous to those of the cognate plasmids in Y. pseudotuberculosis and Y. enterocolitica, but their localization within the plasmid differs markedly from those of the latter. Plasmid pMT1 (100,984 bp) has a GC content of 50.2%. It possesses two copies of IS100, which are located 25 kb apart and in opposite orientations. Adjacent to one of these IS100 inserts is a partial copy of IS285. A single copy of an IS200-like element (recently named IS1541) was also located in pMT1. In addition to 5 previously described genes, such as murine toxin, capsule antigen, capsule anchoring protein, etc., 30 homologues to genes of several bacterial species were found in this plasmid, and another 44 open reading frames without homology to any known or hypothetical protein in the databases were predicted.


2020 ◽  
Vol 36 (19) ◽  
pp. 4827-4832
Author(s):  
C S Casimiro-Soriguer ◽  
M M Rigual ◽  
A M Brokate-Llanos ◽  
M J Muñoz ◽  
A Garzón ◽  
...  

Abstract Motivation: Short bioactive peptides encoded by small open reading frames (sORFs) play important roles in eukaryotes. Bioinformatics prediction of ORFs is an early step in a genome sequence analysis, but sORFs encoding short peptides, often using non-AUG initiation codons, are not easily discriminated from false ORFs occurring by chance. Results: AnABlast is a computational tool designed to highlight putative protein-coding regions in genomic DNA sequences. This protein-coding finder is independent of ORF length and reading frame shifts, making AnABlast a potentially useful tool to predict sORFs. Using this algorithm, here, we report the identification of 82 putative new intergenic sORFs in the Caenorhabditis elegans genome. Sequence similarity, motif presence, expression data and RNA interference experiments support that the underlined sORFs likely encode functional peptides, encouraging the use of AnABlast as a new approach for the accurate prediction of intergenic sORFs in annotated eukaryotic genomes. Availability and implementation: AnABlast is freely available at http://www.bioinfocabd.upo.es/ab/. The C. elegans genome browser with AnABlast results, annotated genes and all data used in this study is available at http://www.bioinfocabd.upo.es/celegans. Supplementary information: Supplementary data are available at Bioinformatics online.

