bacterial proteomes
Recently Published Documents


TOTAL DOCUMENTS: 38 (FIVE YEARS 13)

H-INDEX: 11 (FIVE YEARS 3)

2021 ◽  
Vol 12 ◽  
Author(s):  
Anastasis Oulas ◽  
Margarita Zachariou ◽  
Christos T. Chasapis ◽  
Marios Tomazou ◽  
Umer Z. Ijaz ◽  
...  

The predominance of bacterial taxa in the gut was examined in view of the putative antimicrobial peptide sequences (AMPs) within their proteomes. The working assumption was that compatible bacteria would share homology and thus immunity to their putative AMPs, while competing taxa would have dissimilarities in their proteome-hidden AMPs. A network-based method (“Bacterial Wars”) was developed to handle sequence similarities of predicted AMPs among UniProt-derived protein sequences from different bacterial taxa, while a resulting parameter (“Die” score) suggested which taxa would prevail in a defined microbiome. The working hypothesis was examined by correlating the calculated Die scores to the abundance of bacterial taxa from gut microbiomes from different states of health and disease. Eleven publicly available 16S rRNA datasets and a full shotgun metagenomics dataset served for the analysis. The overall conclusion was that AMPs encrypted within bacterial proteomes affected the predominance of bacterial taxa in chemospheres.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Andrew J. Hayes ◽  
Jessica M. Lewis ◽  
Mark R. Davies ◽  
Nichollas E. Scott

Glycosylation is increasingly recognised as a common protein modification within bacterial proteomes. While great strides have been made in identifying species that contain glycosylation systems, our understanding of the proteins and sites targeted by these systems is far more limited. Within this work we explore the conservation of glycoproteins and glycosylation sites across the pan-Burkholderia glycoproteome. Using a multi-protease glycoproteomic approach, we generate high-confidence glycoproteomes in two widely utilized B. cenocepacia strains, K56-2 and H111. This resource reveals glycosylation occurs exclusively at Serine residues and that glycoproteins/glycosylation sites are highly conserved across B. cenocepacia isolates. This preference for glycosylation at Serine residues is observed across at least 9 Burkholderia glycoproteomes, supporting that Serine is the dominant residue targeted by PglL-mediated glycosylation across the Burkholderia genus. Combined, this work demonstrates that PglL enzymes of the Burkholderia genus are Serine-preferring oligosaccharyltransferases that target conserved and shared protein substrates.


2021 ◽  
Vol 118 (21) ◽  
pp. e2020885118
Author(s):  
Mathieu E. Rebeaud ◽  
Saurav Mallik ◽  
Pierre Goloubinoff ◽  
Dan S. Tawfik

Across the Tree of Life (ToL), the complexity of proteomes varies widely. Our systematic analysis depicts that from the simplest archaea to mammals, the total number of proteins per proteome expanded ∼200-fold. Individual proteins also became larger, and multidomain proteins expanded ∼50-fold. Apart from duplication and divergence of existing proteins, completely new proteins were born. Along the ToL, the number of different folds expanded ∼5-fold and fold combinations ∼20-fold. Proteins prone to misfolding and aggregation, such as repeat and beta-rich proteins, proliferated ∼600-fold and, accordingly, proteins predicted as aggregation-prone became 6-fold more frequent in mammalian compared with bacterial proteomes. To control the quality of these expanding proteomes, core chaperones, ranging from heat shock proteins 20 (HSP20s) that prevent aggregation to HSP60, HSP70, HSP90, and HSP100 acting as adenosine triphosphate (ATP)-fueled unfolding and refolding machines, also evolved. However, these core chaperones were already available in prokaryotes, and they comprise ∼0.3% of all genes from archaea to mammals. This challenge—roughly the same number of core chaperones supporting a massive expansion of proteomes—was met by 1) elevation of messenger RNA (mRNA) and protein abundances of the ancient generalist core chaperones in the cell, and 2) continuous emergence of new substrate-binding and nucleotide-exchange factor cochaperones that function cooperatively with core chaperones as a network.


2021 ◽  
Author(s):  
Andrew J. Hayes ◽  
Jessica M. Lewis ◽  
Mark R. Davies ◽  
Nichollas E. Scott

Glycosylation is increasingly recognised as a common protein modification within bacterial proteomes. While great strides have been made in identifying species that contain glycosylation systems, our understanding of the proteins and sites targeted by these enzymes is far more limited. Within this work we explore the conservation of glycoproteins and O-linked glycosylation sites across the pan-Burkholderia glycoproteome. Using a multi-protease glycoproteomic approach, we generate high-confidence glycoproteomes and associated glycosylation sites in two widely utilized B. cenocepacia strains, K56-2 and H111. This resource reveals glycosylation occurs exclusively at Serine residues and that glycoproteins/glycosylation sites are highly conserved across 294 publicly available B. cenocepacia genomes. Consistent with this, we demonstrate that the substitution of Serine for Threonine residues in a model protein results in a dramatic decrease in glycosylation efficiency by the oligosaccharyltransferase pglLBC, even when pglLBC is overexpressed. This preference for glycosylation at Serine residues is observed across at least 9 Burkholderia glycoproteomes, supporting that Serine is the dominant residue targeted by pglL-mediated glycosylation across the Burkholderia genus. Using population genomics, we observe that pglL-targeted glycosylated proteins are common across Burkholderia species. Combined, this work demonstrates that PglL enzymes of the Burkholderia genus are Serine-preferring oligosaccharyltransferases that target conserved and shared protein substrates across the Burkholderia genus.


Author(s):  
G.S. Dotsenko ◽  
A.S. Dotsenko

Mining protein data is a recent promising area of modern bioinformatics. In this work, we suggested a novel approach for mining protein data – conserved peptides recognition by ensemble of neural networks (CPRENN). This approach was applied for mining lytic polysaccharide monooxygenases (LPMOs) in 19 ascomycete, 18 basidiomycete, and 18 bacterial proteomes. LPMOs are recently discovered enzymes and their mining is of high relevance for biotechnology of lignocellulosic materials. CPRENN was compared with two conventional bioinformatic methods for mining protein data – profile hidden Markov models (HMMs) search (HMMER program) and peptide pattern recognition (PPR program combined with Hotpep application). The maximum number of hypothetical LPMO amino acid sequences was discovered by HMMER. Profile HMMs search proved to be more sensitive method for mining LPMOs than conserved peptides recognition. Totally, CPRENN found 76 %, 67 %, and 65 % of hypothetical ascomycete, basidiomycete, and bacterial LPMOs discovered by HMMER, respectively. For AA9, AA10, and AA11 families which contain the major part of all LPMOs in the carbohydrate-active enzymes database (CAZy), CPRENN and PPR + Hotpep found 69–98 % and 62–95 % of amino acid sequences discovered by HMMER, respectively. In contrast with PPR + Hotpep, CPRENN possessed perfect precision and provided more complete mining of basidiomycete and bacterial LPMOs.
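The comparison above, in which one method recovers a percentage of the sequences found by HMMER, can be read as recall against a reference set, with precision as the fraction of predictions that fall inside that set. The sketch below only illustrates that arithmetic. The sequence identifiers are hypothetical, and treating the HMMER hits as the reference set is an assumption, since the abstract does not state how precision was computed.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], reference: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) of `predicted` measured against `reference`."""
    if not predicted or not reference:
        raise ValueError("both sets must be non-empty")
    shared = predicted & reference
    precision = len(shared) / len(predicted)
    recall = len(shared) / len(reference)
    return precision, recall


# Hypothetical identifiers, for illustration only (not data from the study).
reference_hits = {"seq1", "seq2", "seq3", "seq4"}   # e.g. sequences from a profile HMM search
method_hits = {"seq1", "seq2", "seq3"}              # e.g. sequences from another mining method

precision, recall = precision_recall(method_hits, reference_hits)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case every prediction is in the reference set, so precision is perfect while recall is below one, which is the same pattern the abstract reports for CPRENN relative to HMMER.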


2020 ◽  
Author(s):  
Vivian Monzon ◽  
Aleix Lafita ◽  
Alex Bateman

Background: Fibrillar adhesins are long multidomain proteins attached at the cell surface and composed of at least one adhesive domain and multiple tandemly repeated domains, which build an elongated stalk that projects the adhesive domain beyond the bacterial cell surface. They are an important yet understudied class of proteins that mediate interactions of bacteria with their environment. This study aims to characterize fibrillar adhesins in a wide range of bacterial phyla and to identify new fibrillar adhesin-like proteins to improve our understanding of host-bacteria interactions. Results: By careful search for fibrillar adhesins in the literature and by computational analysis we identified 75 stalk domains and 24 adhesive domains. Based on the presence of these domains in the UniProt Reference Proteomes database, we identified and analysed 3,388 fibrillar adhesin-like proteins across species of the most common bacterial phyla. We found that the bacterial proteomes with the highest fraction of fibrillar adhesins include several known pathogens. We further enumerate the adhesive and stalk domain combinations found in nature and demonstrate that fibrillar adhesins have complex and variable domain architectures, which differ across species. By analysing the domain architecture of fibrillar adhesins we show that in Gram positive bacteria adhesive domains are mostly positioned at the N-terminus of the protein with the cell surface anchor at the C-terminus, while their positions are more variable in Gram negative bacteria. We provide an open repository of fibrillar adhesin-like proteins and domains to facilitate downstream studies of this class of bacterial surface proteins. Conclusion: This study provides a domain-based characterization of fibrillar adhesins and demonstrates that they are widely found across the main bacterial phyla. We have discovered numerous novel fibrillar adhesins and improved the understanding of how pathogens might adhere to and subsequently invade into host cells.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Julie E. Hernández-Salmerón ◽  
Gabriel Moreno-Hagelsieb

Background: Finding orthologs remains an important bottleneck in comparative genomics analyses. While the authors of software for the quick comparison of protein sequences evaluate the speed of their software and compare their results against the most usual software for the task, it is not common for them to evaluate their software for more particular uses, such as finding orthologs as reciprocal best hits (RBH). Here we compared RBH results obtained using software that runs faster than blastp, namely lastal, diamond, and MMseqs2. Results: We found that lastal required the least time to produce results. However, it yielded fewer results than any other program when comparing the proteins encoded by evolutionarily distant genomes. The program producing the most similar number of RBH to blastp was diamond run with the “ultra-sensitive” option. However, this option was diamond’s slowest, with the “very-sensitive” option offering the best balance between speed and RBH results. The speeding up of the programs was much more evident when dealing with eukaryotic genomes, which code for more numerous proteins. For example, lastal took a median of approx. 1.5% of the blastp time to run with bacterial proteomes and 0.6% with eukaryotic ones, while diamond with the very-sensitive option took 7.4% and 5.2%, respectively. Though estimated error rates were very similar among the RBH obtained with all programs, RBH obtained with MMseqs2 had the lowest error rates among the programs tested. Conclusions: The fast algorithms for pairwise protein comparison produced results very similar to blast in a fraction of the time, with diamond offering the best compromise in speed, sensitivity and quality, as long as a sensitivity option, other than the default, was chosen.
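The reciprocal best hit idea mentioned above can be sketched in a few lines: a pair of proteins is an RBH when each is the other's top-scoring hit in the opposite search direction. This is a minimal illustration, not the authors' pipeline. The hit tuples, protein identifiers, and the choice of bit score as the ranking key are assumptions, and tie handling (the first hit seen wins) is deliberately simplistic.

```python
from typing import Dict, Iterable, List, Tuple

# (query_id, subject_id, score); layout and identifiers are hypothetical.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first one wins on ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy hit tables for proteome A searched against B, and B against A.
a_vs_b = [("a1", "b1", 250.0), ("a1", "b2", 90.0), ("a2", "b2", 180.0), ("a3", "b1", 120.0)]
b_vs_a = [("b1", "a1", 248.0), ("b2", "a2", 175.0), ("b2", "a1", 88.0), ("b3", "a3", 60.0)]

rbh_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh_pairs)
```

Here `a3` is dropped because its best hit `b1` prefers `a1`. In practice the hit tables would come from parsing the tabular output of whichever aligner was run, usually after filtering on coverage and E-value.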


2020 ◽  
Vol 19 (9) ◽  
pp. 1561-1574 ◽  
Author(s):  
Ameera Raudah Ahmad Izaham ◽  
Nichollas E. Scott

Mass spectrometry has become an indispensable tool for the characterization of glycosylation across biological systems. Our ability to generate rich fragmentation of glycopeptides has dramatically improved over the last decade yet our informatic approaches still lag behind. Although glycoproteomic informatics approaches using glycan databases have attracted considerable attention, database independent approaches have not. This has significantly limited high throughput studies of unusual or atypical glycosylation events such as those observed in bacteria. As such, computational approaches to examine bacterial glycosylation and identify chemically diverse glycans are desperately needed. Here we describe the use of wide-tolerance (up to 2000 Da) open searching as a means to rapidly examine bacterial glycoproteomes. We benchmarked this approach using N-linked glycopeptides of Campylobacter fetus subsp. fetus as well as O-linked glycopeptides of Acinetobacter baumannii and Burkholderia cenocepacia revealing glycopeptides modified with a range of glycans can be readily identified without defining the glycan masses before database searching. Using this approach, we demonstrate how wide tolerance searching can be used to compare glycan use across bacterial species by examining the glycoproteomes of eight Burkholderia species (B. pseudomallei; B. multivorans; B. dolosa; B. humptydooensis; B. ubonensis, B. anthina; B. diffusa; B. pseudomultivorans). Finally, we demonstrate how open searching enables the identification of low frequency glycoforms based on shared modified peptides sequences. Combined, these results show that open searching is a robust computational approach for the determination of glycan diversity within bacterial proteomes.
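The core arithmetic behind wide-tolerance open searching can be sketched as follows: the unmodified peptide mass is computed from its sequence, and any difference to the observed precursor mass within the tolerance window is kept as a candidate modification (for example a glycan) mass. This is a simplified illustration under stated assumptions, not the search engine used in the study. Real open searches also score fragment ions, and the peptide and observed masses below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water is added for the free peptide termini


def peptide_mass(sequence: str) -> float:
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(MONO[residue] for residue in sequence) + WATER


def open_search_delta(sequence: str, observed_mass: float,
                      tolerance_da: float = 2000.0) -> Optional[float]:
    """Return observed minus theoretical mass, or None if outside the window."""
    delta = observed_mass - peptide_mass(sequence)
    return delta if abs(delta) <= tolerance_da else None


# Hypothetical (peptide, observed neutral mass) pairs, for illustration only.
observations = [("SAAK", 375.2118), ("SAAK", 578.2912), ("SAAK", 3000.0)]

# Grouping deltas by peptide sequence shows which mass shifts share a backbone.
deltas_by_peptide: Dict[str, List[float]] = defaultdict(list)
for peptide, observed in observations:
    delta = open_search_delta(peptide, observed)
    if delta is not None:
        deltas_by_peptide[peptide].append(round(delta, 2))

print(dict(deltas_by_peptide))
```

Collecting the accepted deltas per peptide is one simple way to see several mass shifts on the same sequence, which mirrors the idea of finding low-frequency glycoforms through shared modified peptides.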


Author(s):  
Mathieu E. Rebeaud ◽  
Saurav Mallik ◽  
Pierre Goloubinoff ◽  
Dan S. Tawfik

Across the Tree of Life (ToL), the complexity of proteomes varies widely. Our systematic analysis depicts that from the simplest archaea to mammals, the total number of proteins per proteome expanded ~200-fold. Individual proteins also became larger, and multi-domain proteins expanded ~50-fold. Apart from duplication and divergence of existing proteins, completely new proteins were born. Along the ToL, the number of different folds expanded ~5-fold and fold-combinations ~20-fold. Proteins prone to misfolding and aggregation, such as repeat and beta-rich proteins, proliferated ~600-fold, and accordingly, proteins predicted as aggregation-prone became 6-fold more frequent in mammalian compared to bacterial proteomes. To control the quality of these expanding proteomes, core-chaperones, ranging from HSP20s that prevent aggregation to HSP60, HSP70, HSP90, and HSP100 acting as ATP-fueled unfolding and refolding machines, also evolved. However, these core-chaperones were already available in prokaryotes, and they comprise ~0.3% of all genes from archaea to mammals. This challenge, roughly the same number of core-chaperones supporting a massive expansion of proteomes, was met by (i) higher cellular abundances of the ancient generalist core-chaperones, and (ii) continuous emergence of new substrate-binding and nucleotide-exchange factor co-chaperones that function cooperatively with core-chaperones, as a network.


Author(s):  
Ameera Raudah Ahmad Izaham ◽  
Nichollas E. Scott

Mass spectrometry has become an indispensable tool for the characterisation of glycosylation across biological systems. Our ability to generate rich fragmentation of glycopeptides has dramatically improved over the last decade yet our informatic approaches still lag behind. While glycoproteomic informatics approaches using glycan databases have attracted considerable attention, database independent approaches have not. This has significantly limited high throughput studies of unusual or atypical glycosylation events such as those observed in bacteria. As such, computational approaches to examine bacterial glycosylation and identify chemically diverse glycans are desperately needed. Here we describe the use of wide-tolerance (up to 2000 Da) open searching as a means to rapidly examine bacterial glycoproteomes. We benchmarked this approach using N-linked glycopeptides of Campylobacter fetus subsp. fetus as well as O-linked glycopeptides of Acinetobacter baumannii and Burkholderia cenocepacia revealing glycopeptides modified with a range of glycans can be readily identified without defining the glycan masses prior to database searching. Utilising this approach, we demonstrate how wide tolerance searching can be used to compare glycan utilisation across bacterial species by examining the glycoproteomes of eight Burkholderia species (B. pseudomallei; B. multivorans; B. dolosa; B. humptydooensis; B. ubonensis, B. anthina; B. diffusa; B. pseudomultivorans). Finally, we demonstrate how open searching enables the identification of low frequency glycoforms based on shared modified peptides sequences. Combined, these results show that open searching is a robust computational approach for the determination of glycan diversity within bacterial proteomes.

