Sensitive identification of known and unknown protease activities by unsupervised linear motif deconvolution

2021 ◽  
Author(s):  
Anuli C Uzozie ◽  
Theodore G Smith ◽  
Siyuan Chen ◽  
Philipp F Lange

The cleavage-site specificities of many proteases are not well understood, restricting the utility of supervised classification methods. We present an algorithm and web interface to overcome this limitation through the unsupervised detection of overrepresented patterns in protein sequence data, providing insight into the mixture of protease activities contributing to a complex system. Here, we apply the RObust LInear Motif Deconvolution (RoLiM) algorithm to confidently detect substrate cleavage patterns for SARS-CoV-2 Mpro protease in N-terminome data of an infected human cell line. Using mass spectrometry-based peptide data from a case-control comparison of 341 primary urothelial bladder cancer cases and 110 controls, we identified distinct sequence motifs indicative of increased MMP activity in urine from cancer patients. Evaluation of N-terminal peptides from patient plasma post-chemotherapy detected novel Granzyme B/Corin activity. RoLiM will enhance unbiased investigation of peptide sequences to establish the composition of known and uncharacterized protease activities in biological systems.
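The idea of detecting overrepresented patterns without prior knowledge of protease specificity can be illustrated with a minimal sketch. This is not the RoLiM algorithm; it is a simple per-position enrichment ratio against a background set, and the function names, the pseudocount, the `min_ratio` threshold, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def position_frequencies(seqs):
    """Per-position residue frequencies for equal-length, aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        freqs.append({aa: n / len(seqs) for aa, n in counts.items()})
    return freqs


def enriched_positions(foreground, background, min_ratio=2.0, pseudo=0.01):
    """List (position, residue, ratio) where the foreground frequency
    exceeds the background frequency by at least min_ratio.

    The pseudocount avoids division by zero for residues that never
    occur at a position in the background (hypothetical choice).
    """
    fg = position_frequencies(foreground)
    bg = position_frequencies(background)
    hits = []
    for i, column in enumerate(fg):
        for aa, f in column.items():
            ratio = (f + pseudo) / (bg[i].get(aa, 0.0) + pseudo)
            if ratio >= min_ratio:
                hits.append((i, aa, ratio))
    # Strongest enrichment first
    return sorted(hits, key=lambda h: -h[2])


# Toy aligned windows (invented data, not from the study)
foreground = ["AVLQSG", "TVLQAG", "KVLQSA", "AALQSG"]
background = ["AVKDSG", "TGLEAG", "KVPQSA", "MALKTG", "QSLQAD", "GTREWP"]

hits = enriched_positions(foreground, background)
for pos, aa, ratio in hits:
    print(f"position {pos}: {aa} enriched {ratio:.2f}x")
```

A real deconvolution method would also need statistical testing and a way to separate overlapping motifs from different proteases, which this sketch does not attempt.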

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Maxim S. Svetlov ◽  
Timm O. Koller ◽  
Sezen Meydan ◽  
Vaishnavi Shankar ◽  
Dorota Klepacki ◽  
...  

Abstract: Macrolide antibiotics bind in the nascent peptide exit tunnel of the bacterial ribosome and prevent polymerization of specific amino acid sequences, selectively inhibiting translation of a subset of proteins. Because preventing translation of individual proteins could be beneficial for the treatment of human diseases, we asked whether macrolides, if bound to the eukaryotic ribosome, would retain their context- and protein-specific action. By introducing a single mutation in rRNA, we rendered yeast Saccharomyces cerevisiae cells sensitive to macrolides. Cryo-EM structural analysis showed that the macrolide telithromycin binds in the tunnel of the engineered eukaryotic ribosome. Genome-wide analysis of cellular translation and biochemical studies demonstrated that the drug inhibits eukaryotic translation by preferentially stalling ribosomes at distinct sequence motifs. Context-specific action markedly depends on the macrolide structure. Eliminating macrolide-arrest motifs from a protein renders its translation macrolide-tolerant. Our data illuminate the prospects of adapting macrolides for protein-selective translation inhibition in eukaryotic cells.


Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 530
Author(s):  
Milton Silva ◽  
Diogo Pratas ◽  
Armando J. Pinho

Recently, the scientific community has witnessed a substantial increase in the generation of protein sequence data, triggering emergent challenges of increasing importance, namely efficient storage and improved data analysis. For both applications, data compression is a straightforward solution. However, in the literature, the number of specific protein sequence compressors is relatively low. Moreover, these specialized compressors marginally improve the compression ratio over the best general-purpose compressors. In this paper, we present AC2, a new lossless data compressor for protein (or amino acid) sequences. AC2 uses a neural network to mix experts with a stacked generalization approach and individual cache-hash memory models to the highest-context orders. Compared to the previous compressor (AC), we show gains of 2–9% and 6–7% in reference-free and reference-based modes, respectively. These gains come at the cost of three times slower computations. AC2 also improves memory usage against AC, with requirements about seven times lower, without being affected by the sequences’ input size. As an analysis application, we use AC2 to measure the similarity between each SARS-CoV-2 protein sequence with each viral protein sequence from the whole UniProt database. The results consistently show higher similarity to the pangolin coronavirus, followed by the bat and human coronaviruses, contributing with critical results to a current controversial subject. AC2 is available for free download under GPLv3 license.
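The use of compression as a similarity measure can be sketched with a general-purpose compressor. The example below does not use AC2 or its measure; it swaps in the standard-library `zlib` compressor and the normalized compression distance (NCD), and the sequences are invented for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (maximum compression level)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two sequences.

    Lower values suggest the sequences share more structure. zlib is a
    stand-in here; a specialized protein compressor would be expected
    to give sharper estimates.
    """
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Invented amino acid sequences (not real proteins from the study)
query = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVKALPDAQFEVV"
near = "MKTAYIAKQRQISFVKSHFSRQLEDRLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVRALPDAQFEVV"
far = "GWPHCNDYTFMWCHPNGYDWTCMHFNPWGYCDHTMNWPFGCYHDWNTMPCGFHYWDNCTPMGHFWYNCDP"

print(f"query vs itself: {ncd(query, query):.3f}")
print(f"query vs near:   {ncd(query, near):.3f}")
print(f"query vs far:    {ncd(query, far):.3f}")
```

On sequences this short the fixed overhead of the compressor is noticeable, so only the ordering of the distances is meaningful, not their absolute values.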


1980 ◽  
Vol 187 (1) ◽  
pp. 65-74 ◽  
Author(s):  
D Penny ◽  
M D Hendy ◽  
L R Foulds

We have recently reported a method to identify the shortest possible phylogenetic tree for a set of protein sequences [Foulds, Hendy & Penny (1979) J. Mol. Evol. 13, 127–150; Foulds, Penny & Hendy (1979) J. Mol. Evol. 13, 151–166]. The present paper discusses issues that arise during the construction of minimal phylogenetic trees from protein-sequence data. The conversion of the data from amino acid sequences into nucleotide sequences is shown to be advantageous. A new variation of a method for constructing a minimal tree is presented. Our previous methods have involved first constructing a tree and then either proving that it is minimal or transforming it into a minimal tree. The approach presented in the present paper progressively builds up a tree, taxon by taxon. We illustrate this approach by using it to construct a minimal tree for ten mammalian haemoglobin alpha-chain sequences. Finally, we define a measure of the complexity of the data and illustrate a method to derive a directed phylogenetic tree from the minimal tree.
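What "shortest" means for a tree can be made concrete by counting the minimum number of character changes a given tree requires. The sketch below is not the authors' tree-building procedure; it uses Fitch's small-parsimony counting on a fixed rooted binary tree, and the nested-tuple tree encoding and the toy alignment are assumptions made for illustration.

```python
def fitch_site(tree, states):
    """Return (candidate_states, changes) for one site.

    tree: a leaf name (str) or a (left, right) tuple of subtrees.
    states: mapping from leaf name to the character at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed
        return common, left_cost + right_cost
    # Children disagree: one change is required on this branch pair
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Total minimum number of changes over all sites for a fixed tree."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        states = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, states)[1]
    return total


# Toy nucleotide alignment for four taxa (invented data)
alignment = {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCGT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(parsimony_length(tree_1, alignment))  # 3
print(parsimony_length(tree_2, alignment))  # 4
```

A search for a minimal tree would compare such lengths across candidate topologies; for more than a handful of taxa the number of topologies grows too quickly for exhaustive comparison, which is why incremental construction strategies matter.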


Parasitology ◽  
2007 ◽  
Vol 134 (10) ◽  
pp. 1465-1476 ◽  
Author(s):  
I. BEVERIDGE ◽  
S. SHAMSI ◽  
M. HU ◽  
N. B. CHILTON ◽  
R. B. GASSER

SUMMARY: Genetic variation was examined in the anoplocephalid cestode Progamotaenia festiva, from Australian marsupials, in order to test the hypothesis that P. festiva is a complex of sibling species and to assess the extent of host switching reported previously based on multilocus enzyme electrophoresis (MEE). Polymerase chain reaction (PCR)-based single-strand conformational polymorphism (SSCP) was used for the analysis of sequence variation in the cytochrome c oxidase subunit 1 (cox1) gene among 179 specimens of P. festiva (identified based on morphology and predilection site in the host) from 13 different host species, followed by selective DNA sequencing. Fifty-three distinct sequence types (haplotypes) representing all specimens were defined. Phylogenetic analyses of these sequence data (utilizing maximum parsimony and neighbour-joining methods) revealed 12 distinct clades. Other heterologous species, P. ewersi and P. macropodis, were used as outgroups and the remaining bile-duct inhabiting species, P. diaphana and P. effigia, were included in the analysis for comparative purposes. The latter 2 species were nested within the clades representing P. festiva. Most clades of P. festiva identified were restricted to a single host species; one clade primarily in Macropus robustus was also found in the related host species M. antilopinus in an area of host sympatry; another clade occurring primarily in M. robustus occurred also in additional kangaroo species, M. rufus and M. dorsalis. High levels of genetic divergence, the existence of distinct clades and their occurrence in sympatry provide support for the hypothesis that P. festiva represents a complex of numerous species, most of which, but not all, are host specific. Three distinct clades of cestodes were found within a single host, M. robustus, but there was no evidence of within-host speciation.


2021 ◽  
Author(s):  
Ruth E Timme ◽  
Maria Balkey ◽  
Robyn Randolph ◽  
Julie Haendiges ◽  
Sai Laxmi Gubbala Venkata ◽  
...  

PURPOSE: Step-by-step instructions for submitting pathogen whole genome sequence data to NCBI and to the NCBI Pathogen Detection portal. This protocol covers the steps needed to establish a new NCBI submission environment for your laboratory, including the creation of new BioProject(s) and submission groups. Once these are set up, the protocol then walks through the process for submitting raw reads to SRA and sample metadata to BioSample through the Submission portal. SCOPE: For use by any laboratory submitting WGS data for species under active surveillance within NCBI’s Pathogen Detection. (This includes US laboratories in GenomeTrakr, NARMS, Vet-LIRN, PulseNet, and other non-US networks and submitters). For new submitters, there's quite a bit of groundwork that needs to be established before a laboratory can start its first data submission. We recommend that one person in the laboratory take a few days to get everything set up in advance of when you expect to do your first data submission. If you need a pipeline for frequent or large volume submissions, follow Step 1 to get your NCBI submission environment established, then contact [email protected] to set up an account for submitting through the API. This protocol covers submission using NCBI's Submission Portal web interface. Version history: V5: Linking directly to the metadata template guidance instead of including duplicate copies of the files in this protocol. Updated screenshot for choosing the pathogen template to reflect changes at NCBI. V4: Updated screenshots to reflect NCBI submission portal changes. Updated custom BioSample template.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Tobias Paczian ◽  
William L. Trimble ◽  
Wolfgang Gerlach ◽  
Travis Harrison ◽  
Andreas Wilke ◽  
...  

Abstract Background: The MG-RAST API provides search capabilities and delivers organism and function data as well as raw or annotated sequence data via the web interface and its RESTful API. For casual users, however, RESTful APIs are hard to learn and work with. Results: We created the graphical MG-RAST API explorer to help researchers more easily build and export API queries; understand the data abstractions and indices available in MG-RAST; and use the results presented in-browser for exploration, development, and debugging. Conclusions: The API explorer lowers the barrier to entry for occasional or first-time MG-RAST API users.


2002 ◽  
Vol 48 (12) ◽  
pp. 2208-2216 ◽  
Author(s):  
Jari Leinonen ◽  
Ping Wu ◽  
Ulf-Håkan Stenman

Abstract Background: Prostate-specific antigen (PSA) is the most important marker for prostate cancer, but PSA concentrations determined by various assays can differ significantly because of differences in specificity of the antibodies used. To identify epitopes recognized by various monoclonal antibodies (MAbs) to PSA, we have isolated peptides that react with the paratopes of these. Methods: Six anti-PSA MAbs representing three major epitope groups were screened with five cyclic phage display peptide libraries. After selection, the peptide sequences were determined by sequencing of the relevant part of viral DNA. Binding of the phage peptides to the MAbs was monitored by immunoassay. Results: For each MAb, several paratope-binding peptides with distinct sequence motifs were identified, but only ∼10% showed similarity with the PSA sequence. Some of these correctly predicted the location of the epitopes. By sequential panning of the library with two closely related MAbs, we identified peptides reacting equally with both MAbs. When analyzed against a large panel of PSA MAbs, the peptides generally showed restricted specificity toward the MAb used for selection, but some peptides bound to several related MAbs. Conclusions: Most of the cyclic peptides selected with PSA MAbs are specific for the MAb used for selection and do not resemble any sequence on the antigen. Peptides reactive with two MAbs recognizing the same epitope can be obtained by sequential panning. This method can be used to predict the location of some epitopes, but additional methods are needed to confirm the result.

