conservation patterns
Recently Published Documents


TOTAL DOCUMENTS: 50 (FIVE YEARS: 19)
H-INDEX: 11 (FIVE YEARS: 3)

2022 ◽  
Vol 12 ◽  
Author(s):  
Arangasamy Yazhini ◽  
Narayanaswamy Srinivasan ◽  
Sankaran Sandhya

Multi-protein assemblies are complex molecular systems that perform highly sophisticated biochemical functions in an orchestrated manner. They are subject to changes that are governed by the evolution of individual components. We performed a comparative analysis of the ancient and functionally conserved spliceosomal SF3b complex to identify molecular signatures that contribute to sequence divergence and functional specialization. To this end, we identified homologous sequences of individual SF3b proteins distributed across 10 supergroups of eukaryotes and found all seven protein components of the complex in 578 eukaryotic species. Using sequence and structural analysis, we establish that proteins occurring on the surface of the SF3b complex harbor more sequence variation than the proteins that lie in the core. Further, we show through protein interface conservation patterns that the extent of conservation varies considerably between interacting partners. When we analyze phylogenetic distributions of individual components of the complex, we find that protein partners that are known to form independent subcomplexes share similar profiles, reaffirming the link between differential conservation of interface regions and their inter-dependence. When we extend our analysis to individual protein components of the complex, we find taxa-specific variability in molecular signatures of the proteins. These trends are discussed in the context of the proline-rich motifs of SF3b4 and the functional and drug-binding sites of SF3b1. Further, we report key protein-protein interactions between SF3b1 and SF3b6 whose presence is observed to be lineage-specific across eukaryotes. Together, our studies show the association of protein location within the complex and subcomplex formation patterns with the sequence conservation of SF3b proteins. In addition, our study underscores evolutionarily flexible elements that appear to confer adaptive features on individual components of the multi-protein SF3b complex and may contribute to its functional adaptability.


2021 ◽  
Vol 1 ◽  
Author(s):  
Karl Gemayel ◽  
Alexandre Lomsadze ◽  
Mark Borodovsky

State-of-the-art algorithms of ab initio gene prediction for prokaryotic genomes were shown to be sufficiently accurate. A pair of algorithms would agree on predictions of gene 3′ ends. Nonetheless, predictions of gene starts would not match for 15–25% of genes in a genome. This discrepancy is a serious issue that is difficult to resolve due to the absence of sufficiently large sets of genes with experimentally verified starts. We have introduced StartLink, which infers gene starts from conservation patterns revealed by multiple alignments of homologous nucleotide sequences. We have also introduced StartLink+, which combines ab initio and alignment-based methods. The ability of StartLink to predict the start of a given gene is restricted by the availability of homologs in a database. We observed that StartLink made predictions for 85% of genes per genome on average. The StartLink+ accuracy was shown to be 98–99% on the sets of genes with experimentally verified starts. In comparison with database annotations, we observed that the annotated gene starts deviated from the StartLink+ predictions for ∼5% of genes in AT-rich genomes and for 10–15% of genes in GC-rich genomes on average. The use of StartLink+ has the potential to significantly improve gene start annotation in genomic databases.


2021 ◽  
Vol 31 (4) ◽  
pp. 518-534
Author(s):  
Lin Huang ◽  
Jia Ning ◽  
Ping Zhu ◽  
Yuhan Zheng ◽  
Jun Zhai

2021 ◽  
pp. 074873042199987
Author(s):  
Brian A. Upton ◽  
Nicolás M. Díaz ◽  
Shannon A. Gordon ◽  
Russell N. Van Gelder ◽  
Ethan D. Buhr ◽  
...  

Animals have evolved light-sensitive G protein–coupled receptors, known as opsins, to detect coherent and ambient light for visual and nonvisual functions. These opsins have evolved to satisfy the particular lighting niches of the organisms that express them. While many unique patterns of evolution have been identified in mammals for rod and cone opsins, far less is known about the atypical mammalian opsins. Using genomic data from over 400 mammalian species from 22 orders, unique patterns of evolution were identified for each mammalian opsin, including the photoisomerases, RGR-opsin (RGR) and peropsin (RRH), as well as the atypical opsins, encephalopsin (OPN3), melanopsin (OPN4), and neuropsin (OPN5). The results demonstrate that OPN5 and rhodopsin show extreme conservation across all mammalian lineages. The cone opsins, SWS1 and LWS, and the nonvisual opsins, OPN3 and RRH, demonstrate a moderate degree of sequence conservation relative to other opsins, with some instances of lineage-specific gene loss. Finally, the photoisomerase, RGR, and the best-studied atypical opsin, OPN4, have high sequence diversity within mammals. These conservation patterns are maintained in human populations. Importantly, all mammalian opsins retain key amino acid residues important for conjugation to retinal-based chromophores, permitting light sensitivity. These patterns of evolution are discussed along with known functions of each atypical opsin, such as in circadian or metabolic physiology, to provide insight into the observed patterns of evolutionary constraint.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mahmoud Bayoumi ◽  
Muhammad Munir

The addition of a methyl group to the N6-position of adenosine (m6A) is considered one of the most prevalent internal post-transcriptional modifications and has been linked to virus replication and cell biology. Viral epitranscriptome sequencing analysis has revealed that hemagglutinin (HA) mRNA of H1N1 carries eight m6A sites, which are primarily enriched in the 5′-DRACH-3′ sequence motif. Herein, a large-scale comparative m6A analysis was conducted to investigate the conservation patterns of the DRACH motifs that correspond to the reference m6A sites among influenza A viruses. A total of 70,030 complete HA sequences that comprise all known HA subtypes (H1–18), collected over several years, countries, and affected host species, were analysed on both mRNA and vRNA strands. The bioinformatic analysis revealed the highest degree of DRACH conservation among H1 sequences, where the motifs clustered largely in the middle and in the vicinity of the 3′ end, with at least four DRACH motifs conserved in all mRNA sequences. The major HA-containing subtypes displayed a modest DRACH motif conservation located either in the middle region of the HA transcript (H3) or at the 3′ end (H5), or distributed across the length of the HA sequence (H9). The lowest conservation was demonstrated in HA subtypes that mostly infect wild-type avian species and bats. Interestingly, the total number of DRACH motifs and the number of conserved DRACH motifs in the vRNA were found to be much lower than those observed in the mRNA. Collectively, the identification of putative m6A topology provides a foundation for future interventions against influenza infection, replication, and pathobiology in susceptible hosts.
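
The DRACH scan described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it only assumes the standard IUPAC reading of the motif (D = A/G/U, R = A/G, H = A/C/U) and an ungapped set of equal-length sequences, and the function names `find_drach` and `drach_conservation` are hypothetical.

```python
import re
from collections import Counter

# DRACH in IUPAC terms: D = A/G/U, R = A/G, then A, C, and H = A/C/U.
# The lookahead lets overlapping motif occurrences be reported.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def find_drach(seq):
    """Return 0-based start positions of DRACH motifs in an RNA or DNA string."""
    rna = seq.upper().replace("T", "U")
    return [m.start() for m in DRACH.finditer(rna)]


def drach_conservation(seqs):
    """Fraction of sequences carrying a DRACH motif at each start position.

    Assumption: sequences are ungapped and share the same coordinates.
    """
    counts = Counter()
    for seq in seqs:
        counts.update(find_drach(seq))
    total = len(seqs)
    return {pos: n / total for pos, n in sorted(counts.items())}


if __name__ == "__main__":
    toy = ["AAGGACUAA", "AAGGACUAA", "AAGGUCUAA"]
    print(drach_conservation(toy))
```

A real analysis would also need to map positions through a gapped alignment, which this sketch deliberately leaves out.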


mBio ◽  
2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Russell Y. Neches ◽  
Nikos C. Kyrpides ◽  
Christos A. Ouzounis

ABSTRACT Orf8, one of the most puzzling genes in the SARS lineage of coronaviruses, marks a unique and striking difference in genome organization between SARS-CoV-2 and SARS-CoV-1. Here, using sequence comparisons, we unequivocally reveal the distant sequence similarities between SARS-CoV-2 Orf8 and both its SARS-CoV-1 counterparts and the X4-like genes of coronaviruses, including its highly divergent “paralog” gene Orf7a, whose product is a potential immune antagonist of known structure. Supervised sequence space walks unravel identity levels that drop below 10% and yet exhibit subtle conservation patterns in this novel superfamily, characterized by an immunoglobulin-like beta sandwich topology. We document the high accuracy of the sequence space walk process in detail and characterize the subgroups of the superfamily in sequence space by systematic annotation of gene and taxon groups. While SARS-CoV-1 Orf7a and Orf8 genes are most similar to bat virus sequences, their SARS-CoV-2 counterparts are closer to pangolin virus homologs, reflecting the fine structure of conservation patterns within the SARS-CoV-2 genomes. The divergence between Orf7a and Orf8 is exceptionally idiosyncratic, since Orf7a is more constrained, whereas Orf8 is subject to rampant change, a peculiar feature that may be related to hitherto-unknown viral infection strategies. Despite their common origin, the Orf7a and Orf8 protein families follow different evolutionary trajectories within the coronavirus lineage, which might be partly attributable to their complex interactions with the mammalian host cell, reflected by a multitude of functional associations of Orf8 in SARS-CoV-2 compared to a very small number of interactions discovered for Orf7a.
IMPORTANCE Orf8 is one of the most puzzling genes in the SARS lineage of coronaviruses, including SARS-CoV-2. Using sophisticated sequence comparisons, we confirm its origins from Orf7a, another gene in the lineage that appears more conserved than Orf8. Orf7a is a potential immune antagonist of known structure, while a deletion of Orf8 was shown to decrease the severity of the infection in a cohort study. The subtle sequence similarities imply that Orf8 has the same immunoglobulin-like fold as Orf7a, confirmed by structure determination. We characterize the subgroups of this superfamily and demonstrate the highly idiosyncratic divergence patterns during the evolution of the virus.


Author(s):  
Srdjan Stojanovic ◽  
Zoran Petrovic ◽  
Mario Zlatovic

In this work, we have analyzed the influence of amide-π interactions on stability and properties of superoxide dismutase (SOD) active centers. In the data set of 43 proteins, 5017 amide-π interactions were observed, and every active center forms 117 interactions on average. Most of the interactions belong to the backbone of proteins. The analysis of the geometry of the amide-π interactions revealed two preferred structures, parallel-displaced and T-shaped. The aim of this study was to investigate the energy contribution resulting from amide-π interactions, which were in the lower range of strong hydrogen bonds. The conservation patterns in the present study indicate that more than half of the residues involved in these interactions are evolutionarily conserved. Stabilization centers for these proteins showed that all residues involved in amide-π interactions were important in locating one or more of such centers. The results presented in this work can be very useful for understanding the contribution of amide-π interactions to the stability of SOD active centers.


F1000Research ◽  
2020 ◽  
Vol 9 ◽  
pp. 213 ◽  
Author(s):  
Roc Reguant ◽  
Yevgeniy Antipin ◽  
Rob Sheridan ◽  
Christian Dallago ◽  
Drew Diamantoukos ◽  
...  

AlignmentViewer is a web-based tool to view and analyze multiple sequence alignments of protein families. The particular strengths of AlignmentViewer include flexible visualization at different scales as well as analysis of conservation patterns and of the distribution of proteins in sequence space. The tool is directly accessible in web browsers without the need for software installation. It can handle protein families with tens of thousands of sequences and is particularly suitable for evolutionary coupling analysis, e.g. via EVcouplings.org.
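
As an illustration of what "analysis of conservation patterns" in a multiple sequence alignment can mean, here is a minimal Python sketch that scores each alignment column by Shannon entropy (lower entropy means stronger conservation). This is a generic textbook measure, not a description of how AlignmentViewer computes conservation; the function name `column_entropies` and the choice to ignore gap characters are assumptions made for the example.

```python
import math
from collections import Counter


def column_entropies(alignment, gap="-"):
    """Shannon entropy (bits) of each column of an equal-length alignment.

    Gap characters are ignored; a column made only of gaps scores 0.0.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    entropies = []
    for i in range(length):
        residues = [row[i] for row in alignment if row[i] != gap]
        if not residues:
            entropies.append(0.0)
            continue
        counts = Counter(residues)
        total = len(residues)
        h = sum(-(n / total) * math.log2(n / total) for n in counts.values())
        entropies.append(h)
    return entropies


if __name__ == "__main__":
    toy = ["MKTL", "MKSL", "MRTL", "M-TL"]
    for pos, h in enumerate(column_entropies(toy)):
        print(pos, round(h, 3))
```

Entropy treats all substitutions as equally different; measures that weight substitutions by physicochemical similarity are a common alternative.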


2020 ◽  
Vol 117 (38) ◽  
pp. 23606-23616
Author(s):  
Min-Hyung Cho ◽  
James O. Wrabl ◽  
James Taylor ◽  
Vincent J. Hilser

Phosphorylation sites are hyperabundant in the eukaryotic disordered proteome, suggesting that conformational fluctuations play a major role in determining to what extent a kinase interacts with a particular substrate. In biophysical terms, substrate selectivity may be determined not just by the structural–chemical complementarity between the kinase and its protein substrates but also by the free energy difference between the conformational ensembles that are, or are not, recognized by the kinase. To test this hypothesis, we developed a statistical-thermodynamics-based informatics framework, which allows us to probe for the contribution of equilibrium fluctuations to phosphorylation, as evaluated by the ability to predict Ser/Thr/Tyr phosphorylation sites in the disordered proteome. Essential to this framework is a decomposition of substrate sequence information into two types: vertical information encoding conserved kinase specificity motifs and horizontal information encoding substrate conformational equilibrium that is embedded, but often not apparent, within position-specific conservation patterns. We find not only that conformational fluctuations play a major role but also that they are the dominant contribution to substrate selectivity. In fact, the main substrate classifier distinguishing selectivity is the magnitude of change in local compaction of the disordered chain upon phosphorylation of these mostly singly phosphorylated sites. In addition to providing fundamental insights into the consequences of phosphorylation across the proteome, our approach provides a statistical-thermodynamic strategy for partitioning any sequence-based search into contributions from structural–chemical complementarity and those from changes in conformational equilibrium.


2020 ◽  
Author(s):  
Camila Pontes ◽  
Victoria Ruiz-Serra ◽  
Rosalba Lepore ◽  
Alfonso Valencia

The recent emergence of the novel SARS-CoV-2 in China and its rapid spread in the human population has led to a public health crisis worldwide. As with SARS-CoV, horseshoe bats currently represent the most likely candidate animal source for SARS-CoV-2. Yet, the specific mechanisms of cross-species transmission and adaptation to the human host remain unknown. Here we show that the unsupervised analysis of conservation patterns across the β-CoV spike protein family, using sequence information alone, can provide rich information on the molecular basis of the specificity of β-CoVs to different host cell receptors. More precisely, our results indicate that host cell receptor usage is encoded in the amino acid sequences of different CoV spike proteins in the form of a set of specificity determining positions (SDPs). Furthermore, by integrating structural data, in silico mutagenesis and coevolution analysis we could elucidate the role of SDPs in mediating ACE2 binding across the Sarbecovirus lineage, either by engaging the receptor through direct intermolecular interactions or by affecting the local environment of the receptor binding motif. Finally, by the analysis of coevolving mutations across a paired MSA we were able to identify key intermolecular contacts occurring at the spike-ACE2 interface. These results show that effective mining of the evolutionary records held in the sequence of the spike protein family can help trace the molecular mechanisms behind the evolution and host-receptor adaptation of circulating and future novel β-CoVs.
Significance: Unraveling the molecular basis for host cell receptor usage among β-CoVs is crucial to our understanding of cross-species transmission and adaptation, and for molecular-guided epidemiological monitoring of potential outbreaks. In the present study, we survey the sequence conservation patterns of the β-CoV spike protein family to identify the evolutionary constraints shaping the functional specificity of the protein across the β-CoV lineage. We show that the unsupervised analysis of statistical patterns in an MSA of the spike protein family can help trace the amino acid space encoding the specificity of β-CoVs to their cognate host cell receptors. We argue that the results obtained in this work can provide a framework for monitoring the evolution of SARS-CoV-2 specificity to the hACE2 receptor, as the virus continues spreading in the human population and differential virulence starts to arise.
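
The idea of a specificity determining position can be made concrete with a small Python sketch. Note the swap in technique: the study describes an unsupervised analysis, whereas this toy example assumes the sequences have already been split into two groups (for instance by known receptor usage) and simply reports alignment columns that are invariant within each group but differ between them. The function name `candidate_sdps` and the toy sequences are hypothetical.

```python
def candidate_sdps(group_a, group_b, gap="-"):
    """Return 0-based alignment columns that separate two pre-defined groups.

    A column qualifies when every sequence in group_a shares one residue,
    every sequence in group_b shares one residue, the two residues differ,
    and neither is a gap. This is a supervised toy criterion, not the
    unsupervised method described in the abstract above.
    """
    length = len(group_a[0])
    if any(len(s) != length for s in group_a + group_b):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for i in range(length):
        col_a = {s[i] for s in group_a}
        col_b = {s[i] for s in group_b}
        conserved = len(col_a) == 1 and len(col_b) == 1
        if conserved and col_a != col_b and gap not in col_a | col_b:
            hits.append(i)
    return hits


if __name__ == "__main__":
    receptor_1 = ["MKTL", "MKSL"]  # toy sequences, not real spike data
    receptor_2 = ["MRTL", "MRAL"]
    print(candidate_sdps(receptor_1, receptor_2))
```

Real SDP detection usually tolerates some within-group variation and scores positions statistically rather than requiring strict invariance.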

