TwinCons: Conservation score for uncovering deep sequence similarity and divergence

2021 ◽  
Vol 17 (10) ◽  
pp. e1009541
Author(s):  
Petar I. Penev ◽  
Claudia Alvarez-Carreño ◽  
Eric Smith ◽  
Anton S. Petrov ◽  
Loren Dean Williams

We have developed the program TwinCons, to detect noisy signals of deep ancestry of proteins or nucleic acids. As input, the program uses a composite alignment containing pre-defined groups, and mathematically determines a ‘cost’ of transforming one group to the other at each position of the alignment. The output distinguishes conserved, variable and signature positions. A signature is conserved within groups but differs between groups. The method automatically detects continuous characteristic stretches (segments) within alignments. TwinCons provides a convenient representation of conserved, variable and signature positions as a single score, enabling the structural mapping and visualization of these characteristics. Structure is more conserved than sequence. TwinCons highlights alternative sequences of conserved structures. Using TwinCons, we detected highly similar segments between proteins from the translation and transcription systems. TwinCons detects conserved residues within regions of high functional importance for the ribosomal RNA (rRNA) and demonstrates that signatures are not confined to specific regions but are distributed across the rRNA structure. The ability to evaluate both nucleic acid and protein alignments allows TwinCons to be used in combined sequence and structural analysis of signatures and conservation in rRNA and in ribosomal proteins (rProteins). TwinCons detects a strong sequence conservation signal between bacterial and archaeal rProteins related by circular permutation. This conserved sequence is structurally colocalized with conserved rRNA, indicated by TwinCons scores of rRNA alignments of bacterial and archaeal groups. This combined analysis revealed deep co-evolution of rRNA and rProtein buried within the deepest branching points in the tree of life.
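The distinction the abstract draws between conserved, variable and signature positions can be illustrated with a minimal sketch. This is not the TwinCons scoring method: the abstract describes a mathematically determined transformation cost, whereas the code below swaps in a much simpler majority-residue rule. The function names, the 0.8 threshold, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def dominant(column):
    """Return (most common residue, its frequency) for one group's column."""
    counts = Counter(column)
    residue, n = counts.most_common(1)[0]
    return residue, n / len(column)


def classify_columns(group_a, group_b, threshold=0.8):
    """Label each alignment column as conserved, signature, or variable.

    Simplified stand-in for a per-position score (hypothetical rule):
      - conserved: both groups are dominated by the same residue
      - signature: each group is dominated by a residue, but they differ
      - variable:  at least one group has no dominant residue
    """
    labels = []
    # zip(*group) transposes a list of aligned sequences into columns.
    for col_a, col_b in zip(zip(*group_a), zip(*group_b)):
        res_a, freq_a = dominant(col_a)
        res_b, freq_b = dominant(col_b)
        if freq_a >= threshold and freq_b >= threshold:
            labels.append("conserved" if res_a == res_b else "signature")
        else:
            labels.append("variable")
    return labels


# Toy composite alignment with two pre-defined groups (made-up sequences).
group_a = ["MKVA", "MKIA", "MKLA"]
group_b = ["MRVG", "MRTG", "MRSG"]
print(classify_columns(group_a, group_b))
# → ['conserved', 'signature', 'variable', 'signature']
```

A real score would weight substitutions by their biochemical similarity rather than treating every mismatch equally, which is what lets a single number separate these three classes along a continuous scale.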

1992 ◽  
Vol 12 (1) ◽  
pp. 56-67
Author(s):  
D A Maslov ◽  
N R Sturm ◽  
B M Niner ◽  
E S Gruszynski ◽  
M Peris ◽  
...  

Six short G-rich intergenic regions in the maxicircle of Leishmania tarentolae are conserved in location and polarity in two other kinetoplastid species. We show here that G-rich region 6 (G6) represents a pan-edited cryptogene which contains at least two domains edited independently in a 3'-to-5' manner connected by short unedited regions. In the completely edited RNA, 117 uridines are added at 49 sites and 32 uridines are deleted at 13 sites, creating a translated 85-amino-acid polypeptide. Similar polypeptides are probably encoded by pan-edited G6 transcripts in two other species. The G6 polypeptide has significant sequence similarity to the family of S12 ribosomal proteins. A minicircle-encoded gRNA overlaps 12 editing sites in G6 mRNA, and chimeric gRNA/mRNA molecules were shown to exist, in agreement with the transesterification model for editing.


2003 ◽  
Vol 185 (14) ◽  
pp. 4144-4151 ◽  
Author(s):  
Sheng Ye ◽  
Frank von Delft ◽  
Alexei Brooun ◽  
Mark W. Knuth ◽  
Ronald V. Swanson ◽  
...  

ABSTRACT Shikimate dehydrogenase catalyzes the NADPH-dependent reversible reduction of 3-dehydroshikimate to shikimate. We report the first X-ray structure of shikimate dehydrogenase from Haemophilus influenzae to 2.4-Å resolution and its complex with NADPH to 1.95-Å resolution. The molecule contains two domains, a catalytic domain with a novel open twisted α/β motif and an NADPH binding domain with a typical Rossmann fold. The enzyme contains a unique glycine-rich P-loop with a conserved sequence motif, GAGGXX, that results in NADPH adopting a nonstandard binding mode with the nicotinamide and ribose moieties disordered in the binary complex. A deep pocket with a narrow entrance between the two domains, containing strictly conserved residues primarily contributed by the catalytic domain, is identified as a potential 3-dehydroshikimate binding pocket. The flexibility of the nicotinamide mononucleotide portion of NADPH may be necessary for the substrate 3-dehydroshikimate to enter the pocket and for the release of the product shikimate.


1997 ◽  
Vol 139 (7) ◽  
pp. 1655-1661 ◽  
Author(s):  
Jonathan S. Rosenblum ◽  
Lucy F. Pemberton ◽  
Günter Blobel

A limited number of transport factors, or karyopherins, ferry particular substrates between the cytoplasm and nucleoplasm. We identified the Saccharomyces cerevisiae gene YDR395w/SXM1 as a potential karyopherin on the basis of limited sequence similarity to known karyopherins. From yeast cytosol, we isolated Sxm1p in complex with several potential import substrates. These substrates included Lhp1p, the yeast homologue of the human autoantigen La that has recently been shown to facilitate maturation of pre-tRNA, and three distinct ribosomal proteins, Rpl16p, Rpl25p, and Rpl34p. Further, we demonstrate that Lhp1p is specifically imported by Sxm1p. In the absence of Sxm1p, Lhp1p was mislocalized to the cytoplasm. Sxm1p and Lhp1p represent the karyopherin and a cognate substrate of a unique nuclear import pathway, one that operates upstream of a major pathway of pre-tRNA maturation, which itself is upstream of tRNA export in wild-type cells. In addition, through its association with ribosomal proteins, Sxm1p may have a role in coordinating ribosome biogenesis with tRNA processing.


1999 ◽  
Vol 19 (11) ◽  
pp. 7461-7472 ◽  
Author(s):  
Yeganeh Zebarjadian ◽  
Tom King ◽  
Maurille J. Fournier ◽  
Louise Clarke ◽  
John Carbon

ABSTRACT In budding yeast (Saccharomyces cerevisiae), the majority of box H/ACA small nucleolar RNPs (snoRNPs) have been shown to direct site-specific pseudouridylation of rRNA. Among the known protein components of H/ACA snoRNPs, the essential nucleolar protein Cbf5p is the most likely pseudouridine (Ψ) synthase. Cbf5p has considerable sequence similarity to Escherichia coli TruBp, a known Ψ synthase, and shares the “KP” and “XLD” conserved sequence motifs found in the catalytic domains of three distinct families of known and putative Ψ synthases. To gain additional evidence on the role of Cbf5p in rRNA biosynthesis, we have used in vitro mutagenesis techniques to introduce various alanine substitutions into the putative Ψ synthase domain of Cbf5p. Yeast strains expressing these mutated cbf5 genes in a cbf5Δ null background are viable at 25°C but display pronounced cold- and heat-sensitive growth phenotypes. Most of the mutants contain reduced levels of Ψ in rRNA at extreme temperatures. Substitution of alanine for an aspartic acid residue in the conserved XLD motif of Cbf5p (mutant cbf5D95A) abolishes in vivo pseudouridylation of rRNA. Some of the mutants are temperature sensitive both for growth and for formation of Ψ in the rRNA. In most cases, the impaired growth phenotypes are not relieved by transcription of the rRNA from a polymerase II-driven promoter, indicating the absence of polymerase I-related transcriptional defects. There is little or no abnormal accumulation of pre-rRNAs in these mutants, although preferential inhibition of 18S rRNA synthesis is seen in mutant cbf5D95A, which lacks Ψ in rRNA. A subset of mutations in the Ψ synthase domain impairs association of the altered Cbf5p proteins with selected box H/ACA snoRNAs, suggesting that the functional catalytic domain is essential for that interaction. Our results provide additional evidence that Cbf5p is the Ψ synthase component of box H/ACA snoRNPs and suggest that the pseudouridylation of rRNA, although not absolutely required for cell survival, is essential for the formation of fully functional ribosomes.


1998 ◽  
Vol 62 (10) ◽  
pp. 2008-2015 ◽  
Author(s):  
Tetsuo OHMACHI ◽  
Ryo FUKUOKA ◽  
Yoshie KIMURA ◽  
Yoshihiro ASADA ◽  
Herbert L. ENNIS

1987 ◽  
Vol 7 (6) ◽  
pp. 2070-2079 ◽  
Author(s):  
R A Ach ◽  
A M Weiner

Formation of the 3' end of U1 and U2 small nuclear RNA (snRNA) precursors is directed by a conserved sequence called the 3' box located 9 to 28 nucleotides downstream of all metazoan U1 to U4 snRNA genes sequenced so far. Deletion of part or all of the 3' box from human U1 and U2 genes drastically reduces 3'-end formation. To define the essential nucleotides within this box that direct 3'-end formation, we constructed a set of point mutations in the conserved residues of the human U1 3' box. The ability of the various mutations to direct 3'-end formation was tested by microinjection into Xenopus oocytes and transfection into HeLa cells. We found that the point mutations had diverse effects on 3'-end formation, ranging from no effect at all to severe inhibition; however, no single or double point mutation we tested completely eliminated 3'-end formation. We also showed that a rat U3 3' flank can effectively substitute for the human U1 3' flank, indicating that the 3' boxes of the different U snRNA genes are functionally equivalent.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12434
Author(s):  
Bijendra Khadka ◽  
Radhey S. Gupta

Both SARS-CoV-2 and SARS coronaviruses (CoVs) are members of the subgenus Sarbecovirus. To understand the origin of SARS-CoV-2, sequences for the spike and nucleocapsid proteins from sarbecoviruses were analyzed to identify molecular markers consisting of conserved inserts or deletions (termed CSIs) that are specific for either a particular clade of Sarbecovirus or are commonly shared by two or more clades of these viruses. Three novel CSIs in the N-terminal domain (NTD) of the spike protein S1-subunit (S1-NTD) are uniquely shared by SARS-CoV-2, Bat-CoV-RaTG13 and most pangolin CoVs (SARS-CoV-2r clade). Three other sarbecoviruses viz. bat-CoVZXC21, -CoVZC45 and -PrC31 (forming CoVZC/PrC31 clade), and a pangolin-CoV_MP789 also contain related CSIs in the same positions. In contrast to the S1-NTD, both SARS and SARS-CoV-2r viruses contain two large CSIs in the S1-C-terminal domain (S1-CTD) that are absent in the CoVZC/PrC31 clade. One of these CSIs, consisting of a 12 aa insert, is also present in the RShSTT clade (Cambodia-CoV strains). Sequence similarity studies show that the S1-NTD of SARS-CoV-2r viruses is most similar to the CoVZC/PrC31 clade, whereas their S1-CTD exhibits highest similarity to the RShSTT- (and the SARS-related) CoVs. Results from the shared presence of CSIs and sequence similarity studies on different CoV lineages support the inference that the SARS-CoV-2r cluster of viruses has originated by a genetic recombination between the S1-NTD of the CoVZC/PrC31 clade of CoVs and the S1-CTD of RShSTT/SARS viruses, respectively. We also present compelling evidence, based on the shared presence of CSIs and sequence similarity studies, that the pangolin-CoV_MP789, whose receptor-binding domain is most similar to the SARS-CoV-2 virus, has resulted from another independent recombination event involving the S1-NTD of the CoVZC/PrC31 CoVs and the S1-CTD of an unidentified SARS-CoV-2r related virus. 
The SARS-CoV-2r related virus involved in this latter recombination event is postulated to be most similar to SARS-CoV-2. Several other CSIs reported here are specific for other clusters of sarbecoviruses including a clade consisting of bat-SARS-CoVs (BM48-31/BGR/2008 and SARS_BtKY72). Structural mapping studies show that the identified CSIs form distinct loops/patches on the surface of the spike protein. It is hypothesized that these novel loops/patches on the spike protein, through their interactions with other host components, should play important roles in the biology/pathology of SARS-CoV-2 virus. Lastly, the CSIs specific for different clades of sarbecoviruses including SARS-CoV-2r clade provide novel means for the identification of these viruses and other potential applications.


1991 ◽  
Vol 11 (3) ◽  
pp. 1754-1758
Author(s):  
B C Varnum ◽  
Q F Ma ◽  
T H Chi ◽  
B Fletcher ◽  
H R Herschman

The TIS11 primary response gene is rapidly and transiently induced by both 12-O-tetradecanoylphorbol-13-acetate and growth factors. The predicted TIS11 protein contains a 6-amino-acid repeat, YKTELC. We cloned two additional cDNAs, TIS11b and TIS11d, that contain the YKTELC sequence. TIS11, TIS11b, and TIS11d proteins share a 67-amino-acid region of sequence similarity that includes the YKTELC repeat and two cysteine-histidine containing repeats. TIS11 gene family members are not coordinately expressed: (i) unlike TIS11, the TIS11b and TIS11d mRNAs are detectable in quiescent Swiss 3T3 cells and are not dramatically induced by 12-O-tetradecanoylphorbol-13-acetate; (ii) cycloheximide superinduction does not occur for TIS11b and TIS11d; and (iii) unlike TIS11, TIS11b expression is extinguished in PC12 pheochromocytoma cells.


2002 ◽  
Vol 22 (8) ◽  
pp. 2564-2574 ◽  
Author(s):  
Anne Carr-Schmid ◽  
Christine Pfund ◽  
Elizabeth A. Craig ◽  
Terri Goss Kinzy

ABSTRACT G proteins, which bind and hydrolyze GTP, are involved in regulating a variety of critical cellular processes, including the process of protein synthesis. Many members of the subfamily of elongation factor class G proteins interact with the ribosome and function to regulate discrete steps during the process of protein synthesis. Despite sequence similarity to factors involved in translation, a role for the yeast Hbs1 protein has not been defined. In this work we have identified a genetic relationship between genes encoding components of the translational apparatus and HBS1. HBS1, while not essential for viability, is important for efficient growth and protein synthesis under conditions of limiting translation initiation. The identification of an Hbs1p-interacting factor, Dom34p, which shares a similar genetic relationship with components of the translational apparatus, suggests that Hbs1p and Dom34p may function as part of a complex that facilitates gene expression. Dom34p contains an RNA binding motif present in several ribosomal proteins and factors that regulate translation of specific mRNAs. Thus, Hbs1p and Dom34p may function together to help directly or indirectly facilitate the expression either of specific mRNAs or under certain cellular conditions.


1989 ◽  
Vol 35 (1) ◽  
pp. 200-204 ◽  
Author(s):  
Johannes Auer ◽  
Konrad Lechner ◽  
August Bock

Two transcriptional units coding for ribosomal proteins and protein synthesis elongation factors in Methanococcus vannielii have been cloned and analysed in detail. They correspond to the "streptomycin operon" and "spectinomycin operon" of the Escherichia coli chromosome. The following general conclusions can be drawn from comparison of the nucleotide and the derived amino acid sequences of ribosomal proteins from Methanococcus with those from eubacteria and eukaryotes. (i) Ribosomal protein and elongation factor genes in Methanococcus are clustered in transcriptional units corresponding closely to E. coli ribosomal protein operons with respect to both gene composition and organization. (ii) These transcriptional units contain, in addition, a few open reading frames whose putative gene products share sequence similarity with eukaryotic 80S, but not with eubacterial, ribosomal proteins. They may correspond to "additional" ribosomal proteins of the Methanococcus ribosome, there being no functional homologues in the eubacterial ribosome. (iii) Methanococcus ribosomal proteins and elongation factors almost exclusively exhibit a higher sequence similarity to eukaryotic 80S ribosomal proteins than to those of eubacteria. (iv) Many Methanococcus ribosomal proteins have a size intermediate between those of their eukaryotic and eubacterial homologues. These results are discussed in terms of a hypothesis which implies that the recent eubacterial ribosome developed by a "minimization" process from a more complex organelle and that the archaebacterial ribosome has maintained features of this ancestor. Key words: archaebacteria, Methanococcus, transcription factors, clonal analysis.

