Bacterial small membrane proteins: the Swiss army knife of regulators at the lipid bilayer

2021 ◽  
Author(s):  
Srujana S. Yadavalli ◽  
Jing Yuan

Small membrane proteins represent a subset of recently discovered small proteins (≤100 amino acids), which are a ubiquitous class of emerging regulators underlying bacterial adaptation to environmental stressors. Until relatively recently, small open reading frames encoding these proteins were not designated as genes in genome annotations. Therefore, our understanding of small protein biology was primarily limited to a few candidates associated with previously characterized larger partner proteins. Following the first systematic analyses of small proteins in E. coli over a decade ago, numerous small proteins have been uncovered across different bacteria. An estimated one-third of these newly discovered proteins are localized to the cell membrane, where they may interact with distinct groups of membrane proteins such as signal receptors, transporters, and enzymes, and affect their activities. Recently, there has been considerable progress in functionally characterizing small membrane protein regulators aided by innovative tools adapted specifically to study small proteins. Our review covers prototypical proteins that modulate a broad range of cellular processes such as transport, signal transduction, stress response, respiration, cell division, sporulation as well as membrane stability. Thus, small membrane proteins represent a versatile group of regulators of physiology not just at the membrane but the whole cell. Additionally, small membrane proteins have the potential for clinical applications, where some of the proteins may act as antibacterial agents themselves, while others serve as alternative drug targets for the development of novel antimicrobials.

2016 ◽  
Vol 44 (3) ◽  
pp. 790-795 ◽  
Author(s):  
Andrea E. Rawlings

Membrane proteins play crucial roles in cellular processes and are often important pharmacological drug targets. The hydrophobic properties of these proteins make full structural and functional characterization challenging because of the need to use detergents or other solubilizing agents when extracting them from their native lipid membranes. To aid membrane protein research, new methodologies are required to allow these proteins to be expressed and purified cheaply, easily, in high yield and to provide water soluble proteins for subsequent study. This mini review focuses on the relatively new area of water soluble membrane proteins and in particular two innovative approaches: the redesign of membrane proteins to yield water soluble variants and how adding solubilizing fusion proteins can help to overcome these challenges. This review also looks at naturally occurring membrane proteins, which are able to exist as stable, functional, water soluble assemblies with no alteration to their native sequence.


2020 ◽  
Vol 6 (4) ◽  
pp. 41
Author(s):  
Mihnea P. Dragomir ◽  
Ganiraju C. Manyam ◽  
Leonie Florence Ott ◽  
Léa Berland ◽  
Erik Knutsen ◽  
...  

Non-coding RNAs (ncRNAs) are essential players in many cellular processes, from normal development to oncogenic transformation. Initially, ncRNAs were defined as transcripts that lacked an open reading frame (ORF). However, multiple lines of evidence suggest that certain ncRNAs encode small peptides of less than 100 amino acids. The sequences encoding these peptides are known as small open reading frames (smORFs), many initiating with the traditional AUG start codon but terminating with atypical stop codons, suggesting a different biogenesis. The ncRNA-encoded peptides (ncPEPs) are gradually becoming appreciated as a new class of functional molecules that contribute to diverse cellular processes, and are deregulated in different diseases contributing to pathogenesis. As multiple publications have identified unique ncPEPs, we appreciated the need for assembling a new web resource that could gather information about these functional ncPEPs. We developed FuncPEP, a new database of functional ncRNA encoded peptides, containing all experimentally validated and functionally characterized ncPEPs. Currently, FuncPEP includes a comprehensive annotation of 112 functional ncPEPs and specific details regarding the ncRNA transcripts that encode these peptides. We believe that FuncPEP will serve as a platform for further deciphering the biologic significance and medical use of ncPEPs. The link for FuncPEP database can be found at the end of the Introduction Section.


2021 ◽  
Author(s):  
Yanyan Li ◽  
Honghong Zhou ◽  
Xiaomin Chen ◽  
Yu Zheng ◽  
Quan Kang ◽  
...  

Small proteins specifically refer to proteins consisting of fewer than 100 amino acids translated from small open reading frames (sORFs), which were usually missed in previous genome annotations. The significance of small proteins has been revealed in recent years, along with the discovery of their diverse functions. However, systematic annotation of small proteins is still insufficient. SmProt was specially developed to provide valuable information on small proteins for the scientific community. Here we present the update of SmProt, which emphasizes reliability of translated sORFs, genetic variants in translated sORFs, disease-specific sORF translation events or sequences, and significantly increased data volume. More components such as non-AUG translation initiation, function, and new sources are also included. SmProt incorporated 638,958 unique small proteins curated from 3,165,229 primary records, which were computationally predicted from 419 ribosome profiling (Ribo-seq) datasets and collected from the literature and other sources originating from 370 cell lines or tissues in 8 species (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Danio rerio, Saccharomyces cerevisiae, Caenorhabditis elegans, and Escherichia coli). In addition, small protein families identified from human microbiomes were collected. All datasets in SmProt are free to access, and available for browsing, searching, and bulk download at http://bigdata.ibp.ac.cn/SmProt/.


2021 ◽  
Author(s):  
Nikolaos Vakirlis ◽  
Kate M. Duggan ◽  
Aoife McLysaght

We now have a growing understanding that functional short proteins can be translated out of small Open Reading Frames (sORFs). Such "microproteins" can perform crucial biological tasks and can have considerable phenotypic consequences. However, their size makes them less amenable to genomic analysis, and their evolutionary origins and conservation are poorly understood. Given their short length, it is plausible that some of these functional microproteins have recently originated entirely de novo from non-coding sequence. Here we test the possibility that de novo gene birth can produce microproteins that are functional "out-of-the-box". We reconstructed the evolutionary origins of human microproteins previously found to have measurable, statistically significant fitness effects. By tracing the appearance of each ORF and its transcriptional activation, we were able to show that, indeed, novel small proteins with significant phenotypic effects have emerged de novo throughout animal evolution, including many after the human-chimpanzee split. We show that traditional methods for assessing the coding potential of such sequences often fall short, due to the high variability present in the alignments and the absence of telltale evolutionary signatures that are not yet measurable. Thus we provide evidence that the functional potential intrinsic to sORFs can be rapidly and frequently realised through de novo gene birth.


Author(s):  
Christian Jean Michel ◽  
Claudine Mayer ◽  
Olivier Poch ◽  
Julie Dawn Thompson

Background: The COVID-19 infection is caused by the SARS-CoV-2 virus, a novel member of the coronavirus (CoV) family. CoV genomes code for an ORF1a / ORF1ab polyprotein and four structural proteins widely studied as major drug targets. The genomes also contain a variable number of open reading frames (ORFs) coding for accessory proteins that are not essential for virus replication, but appear to have a role in pathogenesis. The accessory proteins have been less well characterized and are difficult to predict by classical bioinformatics methods. Methods: We propose a computational tool, GOFIX, to characterize potential ORFs in virus genomes. In particular, ORF coding potential is estimated by searching for enrichment in motifs of the X circular code, which is known to be over-represented in the reading frames of viral genes. Results: We applied GOFIX to study the SARS-CoV-2 and related genomes including SARS-CoV and SARS-like viruses from bat, civet and pangolin hosts, focusing on the accessory proteins. Our analysis provides evidence supporting the presence of overlapping ORFs 7b, 9b and 9c in all the genomes and thus helps to resolve some differences in current genome annotations. In contrast, we predict that ORF3b is not functional in all genomes. Novel putative ORFs were also predicted, including a truncated form of the ORF10 previously identified in SARS-CoV-2 and a little known ORF overlapping the Spike protein in Civet-CoV and SARS-CoV. Conclusions: Our findings contribute to characterizing sequence properties of accessory genes of SARS coronaviruses, and especially the newly acquired genes making use of overlapping reading frames.


2021 ◽  
Author(s):  
Rick Gelhausen ◽  
Teresa Müller ◽  
Sarah Svensson ◽  
Omer S. Alkhnbashi ◽  
Cynthia M. Sharma ◽  
...  

Small proteins, those encoded by open reading frames with 50 or fewer codons, are emerging as an important class of cellular macromolecules in all kingdoms of life. However, they are recalcitrant to detection by proteomics or in silico methods. Ribosome profiling (Ribo-seq) has revealed widespread translation of sORFs in diverse species, and this has driven the development of ORF detection tools using Ribo-seq read signals. However, only a handful of tools have been designed for bacterial data, and these have not yet been systematically compared. Here, we have performed a comprehensive benchmark of ORF prediction tools which handle bacterial Ribo-seq data. For this, we created a novel Ribo-seq dataset for E. coli, and based on this plus three publicly available datasets for different bacteria, we created a benchmark set by manual labeling of translated ORFs using their Ribo-seq expression profile. This was then used to investigate the predictive performance of four Ribo-seq-based ORF detection tools that we found to be compatible with bacterial data (REPARATION_blast, DeepRibo, Ribo-TISH and SPECtre). The tool IRSOM was also included as a comparison for tools using coding potential and RNA-seq coverage only. DeepRibo and REPARATION_blast robustly predicted translated ORFs, including sORFs, with no significant difference for those inside or outside of operons. However, none of the tools was able to predict a set of recently identified, novel, experimentally-verified sORFs with high sensitivity. Overall, we find there is potential for improving the performance, applicability, usability, and reproducibility of prokaryotic ORF prediction tools that use Ribo-seq as input.
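
As a rough illustration of how predictive performance against a manually labeled benchmark set might be scored, the sketch below computes sensitivity and precision by exact coordinate match. The ORF representation `(start, stop, strand)`, the function name `score_predictions`, and the exact-match criterion are assumptions made for this example; the abstract does not state how the benchmark actually matched predictions to labels.

```python
from typing import Dict, Set, Tuple

# Hypothetical ORF representation: (start, stop, strand)
Orf = Tuple[int, int, str]


def score_predictions(labeled: Set[Orf], predicted: Set[Orf]) -> Dict[str, float]:
    """Score predicted ORFs against labeled ORFs using exact coordinate matches.

    Assumption: a prediction counts as correct only if start, stop and
    strand are all identical to a labeled ORF.
    """
    true_pos = len(labeled & predicted)   # labeled ORFs that were predicted
    false_neg = len(labeled - predicted)  # labeled ORFs that were missed
    false_pos = len(predicted - labeled)  # predictions with no matching label

    sensitivity = true_pos / (true_pos + false_neg) if labeled else 0.0
    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    return {"sensitivity": sensitivity, "precision": precision}


# Toy data, invented purely for illustration
labeled_orfs = {(10, 100, "+"), (200, 260, "-"), (300, 390, "+"), (500, 560, "+")}
predicted_orfs = {(10, 100, "+"), (300, 390, "+"), (700, 760, "-")}
scores = score_predictions(labeled_orfs, predicted_orfs)
```

With this toy input, two of the four labeled ORFs are recovered and one prediction has no matching label. A looser criterion, such as matching on the stop codon only, would change both numbers.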


Processes ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. 629
Author(s):  
José Rodrigues ◽  
Vanessa T. Almeida ◽  
Ana L. Rosário ◽  
Yong Zi Tan ◽  
Brian Kloss ◽  
...  

Studies on membrane proteins can help to develop new drug targets and treatments for a variety of diseases. However, membrane proteins continue to be among the most challenging targets in structural biology. This uphill endeavor can be even harder for membrane proteins from Mycobacterium species, which are notoriously difficult to express in heterologous systems. Arabinofuranosyltransferases are involved in mycobacterial cell wall synthesis and are thus potential targets for antituberculosis drugs. A set of 96 mycobacterial genes coding for arabinofuranosyltransferases was selected, of which 17 were successfully expressed in E. coli and purified by metal-affinity chromatography. We herein present an efficient high-throughput strategy to screen in microplates a large number of targets from mycobacteria and select the best conditions for large-scale protein production to pursue functional and structural studies. This methodology can be applied to other targets, is cost- and time-effective, and can be implemented in common laboratories.


2020 ◽  
Author(s):  
Stephan Fuchs ◽  
Martin Kucklick ◽  
Erik Lehmann ◽  
Alexander Beckmann ◽  
Maya Wilkens ◽  
...  

Small proteins play diverse and essential roles in bacterial physiology and virulence. Despite their importance, automated genome annotation algorithms still cannot accurately annotate all respective small open reading frames (sORFs), as they usually provide insufficient sequence information for domain and homology searches, tend to be species specific and only a few experimentally validated examples are covered in standard proteomics studies. The accuracy and reliability of genome annotations, particularly for sORFs, can be significantly improved by integrating protein evidence from experimental approaches that enrich for small proteins. Here we present a highly optimized and flexible workflow for bacterial proteogenomics, which covers all steps from (i) creation of protein databases, (ii) database searches, (iii) peptide-to-genome mapping to (iv) result interpretation and whose automated execution is supported by two open source tools (SALT & Pepper). We used the workflow to identify high quality peptide spectrum matches (PSMs) for both annotated and unannotated small proteins (≤ 100 aa; SP100) in Staphylococcus aureus Newman. Proteins isolated from cells at the exponential and stationary growth phase were digested with different endopeptidases (trypsin, Lys-C, AspN), the resulting peptides fractionated by gel-based and gel-free methods and measured with highly sensitive mass spectrometers. PSMs or sORF predictions from sORFfinder were stringently filtered allowing us to detect 185 soluble SP100, 69 of which were missing in the used genome annotation. Most interestingly, almost half of the identified SP100 were basic, suggesting a role in binding to more acidic molecules such as nucleic acids or phospholipids. In addition, phage-related functions were proposed for 30 SP100, based on the localization of their coding sequences in the genome.


2021 ◽  
Vol 12 ◽  
Author(s):  
Igor Fijalkowski ◽  
Marlies K. R. Peeters ◽  
Petra Van Damme

With the rapid growth in the number of sequenced genomes, genome annotation efforts have become almost exclusively reliant on automated pipelines. Despite their unquestionable utility, these methods have been shown to underestimate the true complexity of the studied genomes, with small open reading frames (sORFs; ORFs typically considered shorter than 300 nucleotides) and, in consequence, their protein products (sORF encoded polypeptides or SEPs) being the primary example of a poorly annotated and highly underexplored class of genomic elements. With the advent of advanced translatomics such as ribosome profiling, reannotation efforts have progressed a great deal in providing translation evidence for numerous, previously unannotated sORFs. However, proteomics validation of these riboproteogenomics discoveries remains challenging due to their short length and often highly variable physicochemical properties. In this work we evaluate and compare tailored, yet easily adaptable, protein extraction methodologies for their efficacy in the extraction and concomitant proteomics detection of SEPs expressed in the prokaryotic model pathogen Salmonella typhimurium (S. typhimurium). Further, an optimized protocol for the enrichment and efficient detection of SEPs making use of the amphipathic polymer amphipol A8-35 and relying on differential peptide vs. protein solubility was developed and compared with global extraction methods making use of chaotropic agents. Given the versatile biological functions SEPs have been shown to exert, this work provides an accessible protocol for proteomics exploration of this fascinating class of small proteins.


2021 ◽  
Author(s):  
Fengyuan Hu ◽  
Jia Lu ◽  
Manuel D. Munoz ◽  
Alexander Saveliev ◽  
Martin Turner

The annotation of small open reading frames (smORFs) of less than 100 codons (<300 nucleotides) is challenging due to the large number of such sequences in the genome. The recent development of next-generation sequencing and ribosome profiling enables identification of actively translated smORFs. In this study, we developed a computational pipeline, which we have named ORFLine, that stringently identifies smORFs and classifies them according to their position within transcripts. We identified a total of 5744 unique smORFs in datasets from mouse B and T lymphocytes and systematically characterized them using ORFLine. We further searched smORFs for the presence of a signal peptide, which predicted known secreted chemokines as well as novel micropeptides. Five novel micropeptides show evidence of secretion and are therefore candidate mediators of immunoregulatory functions.
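
To make the length criterion concrete, here is a minimal sketch of a naive sequence-only scan for candidate smORFs of fewer than 100 codons. It is not ORFLine and uses no ribosome profiling evidence; the function name `find_smorfs`, the forward-strand-only search, the ATG-only start rule, and counting the start codon but not the stop codon towards the length are all assumptions made for this illustration.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_CODONS = 100  # the abstract describes smORFs as fewer than 100 codons


def find_smorfs(seq: str, max_codons: int = MAX_CODONS) -> List[Tuple[int, int, str]]:
    """Return (start, end, orf_sequence) for forward-strand ATG..stop ORFs.

    Assumptions for this sketch: only ATG starts are considered, only the
    given strand is scanned, and the codon count includes the start codon
    but excludes the stop codon. `end` is exclusive and covers the stop.
    """
    seq = seq.upper()
    hits: List[Tuple[int, int, str]] = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons < max_codons:
                    hits.append((start, i + 3, seq[start:i + 3]))
                start = None  # reset and look for the next ORF in this frame
    return hits


# Toy sequence, invented for illustration: one 3-codon ORF in frame 2
candidates = find_smorfs("CCATGAAATTTTAGGG")
```

A scan like this returns very many candidates on a real genome, which is consistent with the abstract's point that annotation is hard because such sequences are so numerous; translation evidence is what narrows the list.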

