Sequence-Only Based Prediction of β-Turn Location and Type Using Collocation of Amino Acid Pairs

2008 ◽  
Vol 2 (1) ◽  
pp. 37-49 ◽  
Author(s):  
Kevin Campbell ◽  
Lukasz Kurgan

Accurate methods for predicting β-turn (beta-turn) types would contribute to the prediction of tertiary protein structure and would provide useful inputs for fold recognition and drug design. Only one existing sequence-only method predicts beta-turn types (type I and II only) for entire protein chains, whereas the proposed method predicts type I, II, IV, VIII, and non-specific (NS) beta-turns, filling this gap. The proposed predictor, which is based solely on protein sequence, is shown to perform similarly to other sequence-only methods for prediction of beta-turns and beta-turn types. Its main advantage is the simplicity and interpretability of the underlying model. We developed novel sequence-based features that allow identifying beta-turn types and differentiating them from non-beta-turns. The features are based on tetrapeptides (entire beta-turns) rather than on a window centered over the predicted residues, as in recent competing methods, and thus provide a more biologically sound model. They include 12 features based on collocation of amino acid pairs, focusing on amino acids (Gly, Asp, and Asn) that are known to be predisposed to form beta-turns. The model also includes features geared towards exclusion of non-beta-turns, which are based on amino acids known to be strongly detrimental to formation of beta-turns (Met, Ile, Leu, and Val).
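
The tetrapeptide-based feature idea can be illustrated with a minimal Python sketch. It is not the authors' feature set: the abstract does not list the 12 collocation features, so the function name `tetrapeptide_features`, the pair encoding, and the residue counts are assumptions made for illustration. For each 4-residue window, the sketch records which position pairs are both occupied by turn-favouring residues (Gly, Asp, Asn) and counts the residues described as detrimental (Met, Ile, Leu, Val).

```python
from itertools import combinations

FAVOURABLE = set("GDN")    # Gly, Asp, Asn: predisposed to form beta-turns
DETRIMENTAL = set("MILV")  # Met, Ile, Leu, Val: detrimental to beta-turns


def tetrapeptide_features(sequence):
    """Yield (start_index, tetrapeptide, features) for every 4-residue window.

    Hypothetical encoding, for illustration only:
      - 'pairs': collocated favourable residues as (pos_i, pos_j, aa_i, aa_j)
      - 'n_favourable' / 'n_detrimental': simple residue counts
    """
    seq = sequence.upper()
    for start in range(len(seq) - 3):
        tetra = seq[start:start + 4]
        pairs = [
            (i, j, tetra[i], tetra[j])
            for i, j in combinations(range(4), 2)
            if tetra[i] in FAVOURABLE and tetra[j] in FAVOURABLE
        ]
        yield start, tetra, {
            "pairs": pairs,
            "n_favourable": sum(aa in FAVOURABLE for aa in tetra),
            "n_detrimental": sum(aa in DETRIMENTAL for aa in tetra),
        }


if __name__ == "__main__":
    # Made-up sequence, for illustration only.
    for start, tetra, feats in tetrapeptide_features("AVDGNLK"):
        print(start, tetra, feats)
```

Feature dictionaries of this kind could then be passed to any standard classifier; which classifier the authors used is not stated in the abstract.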

2018 ◽  
Author(s):  
Maxim Shapovalov ◽  
Slobodan Vucetic ◽  
Roland L. Dunbrack

Abstract Protein loops connect regular secondary structures and contain 4-residue beta turns which represent 63% of the residues in loops. The commonly used classification of beta turns (Type I, I’, II, II’, VIa1, VIa2, VIb, and VIII) was developed in the 1970s and 1980s from analysis of a small number of proteins of average resolution, and represents only two thirds of beta turns observed in proteins (with a generic class Type IV representing the rest). We present a new clustering of beta turn conformations from a set of 13,030 turns from 1078 ultra-high resolution protein structures (≤1.2 Å). Our clustering is derived from applying the DBSCAN and k-medoids algorithms to this data set with a metric commonly used in directional statistics applied to the set of dihedral angles from the second and third residues of each turn. We define 18 turn types compared to the 8 classical turn types in common use. We propose a new 2-letter nomenclature for all 18 beta-turn types using Ramachandran region names for the two central residues (e.g., ‘A’ and ‘D’ for alpha regions on the left side of the Ramachandran map and ‘a’ and ‘d’ for equivalent regions on the right-hand side; classical Type I turns are ‘AD’ turns and Type I’ turns are ‘ad’). We identify 11 new types of beta turn, 5 of which are sub-types of classical beta turn types. Up-to-date statistics, probability densities of conformations, and sequence profiles of beta turns in loops were collected and analyzed. A library of turn types, BetaTurnLib18, and cross-platform software, BetaTurnTool18, which identifies turns in an input protein structure, are freely available and redistributable from dunbrack.fccc.edu/betaturn and github.com/sh-maxim/BetaTurn18. Given the ubiquitous nature of beta turns, this comprehensive study updates understanding of beta turns and should also provide useful tools for protein structure determination, refinement, and prediction programs.
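
As a rough illustration of clustering on dihedral angles, the sketch below computes a distance between two turns from the φ/ψ angles of their second and third residues, using the measure 2(1 − cos Δθ) that is common in directional statistics, and then picks a medoid. The exact metric, averaging, and clustering parameters used by the authors are not given in the abstract, so the formula, the function names `angular_distance` and `medoid`, and the example angles are all assumptions.

```python
import math


def angular_distance(turn_a, turn_b):
    """Mean of 2*(1 - cos(delta)) over corresponding dihedral angles (degrees).

    Each turn is a tuple such as (phi2, psi2, phi3, psi3). The measure is
    periodic, so -179 and +179 degrees are treated as close. This is an
    assumed form, not necessarily the authors' exact metric.
    """
    terms = [
        2.0 * (1.0 - math.cos(math.radians(a - b)))
        for a, b in zip(turn_a, turn_b)
    ]
    return sum(terms) / len(terms)


def medoid(turns):
    """Return the turn with the smallest total distance to all turns."""
    return min(turns, key=lambda t: sum(angular_distance(t, u) for u in turns))


if __name__ == "__main__":
    # Made-up dihedral angles, for illustration only.
    turns = [
        (-60.0, -30.0, -90.0, 0.0),
        (-65.0, -25.0, -95.0, 5.0),
        (-58.0, -35.0, -85.0, -5.0),
        (60.0, 30.0, 90.0, 0.0),
    ]
    print(medoid(turns))
    print(angular_distance((179.0, 0.0, 0.0, 0.0), (-179.0, 0.0, 0.0, 0.0)))
```

A matrix of such pairwise distances is the kind of input that a density-based method like DBSCAN or a k-medoids procedure can work from; using the cosine form avoids the discontinuity at ±180° that a plain difference of angles would introduce.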


2018 ◽  
Author(s):  
Antara Sengupta ◽  
Pabitra Pal Choudhury

Abstract The aim of this paper is to quantitatively analyze the properties that are carried from a DNA sequence, through its primary protein sequence, to the properties of the protein structure. The theory presented is applicable to any arbitrary DNA sequence, whether it comes from different species, from mutated data, or from a group of genes responsible for a particular function. Irrespective of gene family, species, or wild-type or mutated status, the paper gives a standard model that defines a mapping between the physicochemical properties of an arbitrary DNA sequence and the physicochemical properties of its amino acid sequence. Experiments carried out with the PPCA protein family and its four homologs PPC(B-E) establish that a DNA sequence keeps its signature even after translation into the corresponding amino acid sequence.


2019 ◽  
Author(s):  
Mark Chonofsky ◽  
Saulo H. P. de Oliveira ◽  
Konrad Krawczyk ◽  
Charlotte M. Deane

Abstract Over the last few years, the field of protein structure prediction has been transformed by increasingly accurate contact prediction software. These methods are based on the detection of coevolutionary relationships between residues from multiple sequence alignments. However, despite speculation, there is little evidence of a link between contact prediction and the physico-chemical interactions which drive amino-acid coevolution. Furthermore, existing protocols predict only a fraction of all protein contacts and it is not clear why some contacts are favoured over others. Using a dataset of 863 protein domains, we assessed the physico-chemical interactions of contacts predicted by CCMpred, MetaPSICOV, and DNCON2, as examples of direct coupling analysis, meta-prediction, and deep learning, respectively. To further investigate what sets these predicted contacts apart, we considered correctly predicted contacts and compared their properties against the protein contacts that were not predicted. We found that predicted contacts tend to form more bonds than non-predicted contacts, which suggests these contacts may be more important. Comparing the contacts predicted by each method, we found that MetaPSICOV and DNCON2 favour accuracy whereas CCMpred detects contacts with more bonds. This suggests that the push for higher accuracy may lead to a loss of physico-chemically important contacts. These results underscore the connection between protein physico-chemistry and the coevolutionary couplings that can be derived from multiple sequence alignments. This relationship is likely to be relevant to protein structure prediction and functional analysis of protein structure and may be key to understanding their utility for different problems in structural biology.
Author summary
Accurate contact prediction has allowed scientists to predict protein structures with unprecedented levels of accuracy. The success of contact prediction methods, which are based on inferring correlations between amino acids in protein multiple sequence alignments, has prompted a great deal of work to improve the quality of contact prediction, leading to the development of several different methods for detecting amino acids in proximity. In this paper, we investigate the properties of these contact prediction methods. We find that contacts which are predicted differ from the other contacts in the protein: in particular, they have more physico-chemical bonds, and the predicted contacts are more strongly conserved than other contacts across protein families. We also compared the properties of different contact prediction methods and found that the characteristics of the predicted sets depend on the prediction method used. Our results point to a link between physico-chemical bonding interactions and the evolutionary history of proteins, a connection which is reflected in their amino acid sequences.


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Xinnan Xu ◽  
Rui Kong ◽  
Xiaoqing Liu ◽  
Pingan He ◽  
Qi Dai

The human papillomavirus (HPV) type plays an important role in the early diagnosis of cervical cancer. Most prediction methods use protein sequence and structure information, but reduced amino acid modes have not been used until now. In this paper, we introduce reduced amino acid modes to predict high-risk HPV. We first reduced the 20 amino acids into several nonoverlapping groups and calculated their structure and physicochemical modes for high-risk HPV prediction; the method was tested and compared with existing methods on 68 samples of known HPV types. The experimental results indicate that the proposed method achieved better performance, with an accuracy of 96.49%, suggesting that reduced amino acid modes might be used to improve the prediction of high-risk HPV types.
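
The idea of a reduced amino acid alphabet can be sketched in Python as follows. The five-group partition below is a hypothetical example chosen for illustration; the abstract does not specify which groupings or modes the authors used, and `REDUCED_GROUPS`, `reduce_sequence`, and `group_composition` are names introduced here rather than taken from the paper.

```python
from collections import Counter

# Hypothetical reduction of the 20 standard amino acids into 5
# nonoverlapping groups (an assumption, not the authors' scheme).
REDUCED_GROUPS = {
    "a": "AVLIMC",  # aliphatic / sulfur-containing
    "r": "FWYH",    # aromatic
    "p": "STNQ",    # polar uncharged
    "c": "KRDE",    # charged
    "s": "GP",      # special backbone conformation
}
AA_TO_GROUP = {aa: g for g, members in REDUCED_GROUPS.items() for aa in members}


def reduce_sequence(sequence):
    """Map each residue to its group letter; unknown residues become 'x'."""
    return "".join(AA_TO_GROUP.get(aa, "x") for aa in sequence.upper())


def group_composition(sequence):
    """Return the fraction of residues that fall in each reduced group."""
    reduced = reduce_sequence(sequence)
    if not reduced:
        return {}
    counts = Counter(reduced)
    return {g: counts[g] / len(reduced) for g in REDUCED_GROUPS}


if __name__ == "__main__":
    # Made-up peptide, for illustration only.
    print(reduce_sequence("MKVLG"))
    print(group_composition("MKVLG"))
```

Composition vectors like this give a fixed-length description of a protein regardless of its length, which is one reason reduced alphabets are convenient as classifier inputs.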


1998 ◽  
Vol 123 (4) ◽  
pp. 493-499 ◽  
Author(s):  
Kyu H. Chung ◽  
Dennis E. Buetow ◽  
Schuyler S. Korban

A nuclear gene, Lhcb1*Pp1, encoding a light-harvesting chlorophyll a/b-binding protein of photosystem II has been isolated from peach [Prunus persica (L.) Batsch. 'Stark Earliglo'] leaf genomic DNA, cloned, and sequenced. This gene encodes a precursor polypeptide of 267 amino acids with a transit peptide of 34 and a type I mature protein of 233 amino acids. The amino acid sequence of the mature polypeptide is 89% to 94% and 80% to 94% similar to those encoded by type I Lhcb genes of annual and other woody plants, respectively. In contrast, the amino acid sequence of the peach transit peptide is less conserved, being 47% to 69% similar to those of annual plants and only 17% to 22% similar to those of other woody plants. The peach gene was used as a probe for Lhcb gene expression. Lhcb mRNA is detected in leaves of field-grown trees during June to October. Lhcb mRNA is detected at a high level in leaves of peach shoots grown in tissue culture in the light, but only at a trace level in leaves grown in the dark. Some Lhcb genes appear to be light-modulated in stems. Lhcb1*Pp1 contains four potential polyadenylation sites. S1 nuclease analysis detected transcripts of the sizes expected from each of the four polyadenylation sites. All four are found in leaves of light-grown shoots and of field-grown trees throughout the growing season. In contrast, only three are detected in stems of light-grown shoots.


Author(s):  
D. Filimonov ◽  
A. Lagunin

To construct classification models that predict the pathogenic effects of amino acid (AMA) substitutions based on MNA descriptors, it is advisable to use the chemical structures of peptides containing the AMA substitution together with the corresponding sections of the protein sequence without the mutation.


Reproduction ◽  
2005 ◽  
Vol 130 (5) ◽  
pp. 655-668 ◽  
Author(s):  
Paul J Booth ◽  
Peter G Humpherson ◽  
Terry J Watson ◽  
Henry J Leese

Preimplantation embryos can consume and produce amino acids in a manner dependent upon the stage of development that may be predictive of subsequent viability. In order to examine these relationships in the pig, patterns of net depletion and appearance of amino acids by in vitro produced porcine preimplantation embryos were examined. Cumulus oocyte complexes derived from slaughterhouse pre-pubertal pig ovaries were matured for 40 h in defined TCM-199 medium (containing PVA) before being fertilised (Day 0) with frozen-thawed semen in Tris-based medium. After 6 h, presumptive zygotes were denuded and cultured in groups of 20, in NCSU-23 medium modified to contain 0.1 mM glutamine plus a mixture of 19 amino acids (aa) at low concentrations (0.02–0.11 mM) (NCSU-23aa). Groups of 2–20 embryos were removed (dependent on stage) on Day 0 (1 cell), Day 1 (two- and four-cells), Day 4 (compact morulae) and Day 6 (blastocysts) and placed in 4 μl NCSU-23aa for 24 h. After incubation, the embryos were removed and the spent medium was analysed by HPLC. The net rate of amino acid depletion or appearance varied according to amino acid (P < 0.001) and, apart from serine and histidine, stage of development (P < 0.014). Glycine, isoleucine, valine, phenylalanine, tryptophan, methionine, asparagine, lysine, glutamate and aspartate consistently appeared, whereas threonine, glutamine and arginine were consistently depleted. Five types of stage-dependent trends could be observed: Type I: amino acids having high rates of net appearance on Day 0 that reached a nadir on Day 1 or 4 but subsequently increased by Day 6 (glycine, glutamate); Type II: those that exhibited lower rates of net appearance on Days 0 and 6 compared with the intermediate Days 1 and 4 (isoleucine, valine, phenylalanine, methionine, arginine); Type III: amino acids which showed a continuous fall in net appearance (asparagine, aspartate); Type IV: those that exhibited a steady fall in net depletion from Day 0 to Day 6 (glutamine, threonine); Type V: those following no discernible trend. Analysis of further embryo types indicated that presumptive polyspermic embryos on Day 0 had increased (P < 0.05) net rates of leucine, isoleucine, valine and glutamate appearance, and reduced (P < 0.05) net rates of threonine and glutamine depletion compared with normally inseminated oocytes. These data suggest that the net rates of depletion and appearance of amino acids by pig embryos vary between a) amino acids, b) the day of embryo development and c) the type of embryos present at a given stage of development. The results also suggested that the net depletion and appearance rates of amino acids by early pig embryos might be more similar to those of the human than those of the mouse and cow.


Author(s):  
Toshio Iwasaki ◽  
Yoshiharu Miyajima-Nakano ◽  
Risako Fukazawa ◽  
Myat T Lin ◽  
Shin-Ichi Matsushita ◽  
...  

Abstract A set of C43(DE3) and BL21(DE3) Escherichia coli host strains that are auxotrophic for various amino acids is briefly reviewed. These strains require the addition of a defined set of one or more amino acids in the growth medium, and have been specifically designed for overproduction of membrane or water-soluble proteins selectively labeled with stable isotopes such as 2H, 13C and 15N. The strains described here are available for use and have been deposited into public strain banks. Although they cannot fully eliminate the possibility of isotope dilution and mixing, metabolic scrambling of the different amino acid types can be minimized through a careful consideration of the bacterial metabolic pathways. The use of a suitable auxotrophic expression host strain with an appropriately isotopically labeled growth medium ensures high levels of isotope labeling efficiency as well as selectivity for providing deeper insight into protein structure-function relationships.


1998 ◽  
Vol 331 (2) ◽  
pp. 417-422 ◽  
Author(s):  
David C. RISHIKOF ◽  
Ping-Ping KUANG ◽  
Christine POLIKS ◽  
Ronald H. GOLDSTEIN

The steady-state level of α1(I) collagen mRNA is regulated by amino acid availability in human lung fibroblasts. Depletion of amino acids decreases α1(I) collagen mRNA levels and repletion of amino acids induces rapid re-expression of α1(I) mRNA. In these studies, we examined the requirements for individual amino acids on the regulation of α1(I) collagen mRNA. We found that re-expression of α1(I) collagen mRNA was critically dependent on cystine but not on other amino acids. However, the addition of cystine alone did not result in re-expression of α1(I) collagen mRNA. Following amino acid depletion, the addition of cystine with selective amino acids increased α1(I) collagen mRNA levels. The combination of glutamine and cystine increased α1(I) collagen mRNA levels 6.3-fold. Methionine or a branched-chain amino acid (leucine, isoleucine or valine) also acted in combination with cystine to increase α1(I) collagen mRNA expression, whereas other amino acids were not effective. The prolonged absence of cystine lowered steady-state levels of α1(I) collagen mRNA through a mechanism involving decreases in both the rate of gene transcription as assessed by nuclear run-on experiments and mRNA stability as assessed by half-life determination in the presence of actinomycin D. The effect of cystine was not mediated via alterations in the level of glutathione, the major redox buffer in cells, as determined by the addition of buthionine sulphoximine, an inhibitor of γ-glutamylcysteine synthetase. These data suggest that cystine directly affects the regulation of α1(I) collagen mRNA.


1999 ◽  
Vol 13 (4) ◽  
pp. 578-586 ◽  
Author(s):  
Stéphane A. Laporte ◽  
Antony A. Boucard ◽  
Guy Servant ◽  
Gaétan Guillemette ◽  
Richard Leduc ◽  
...  

Abstract To identify ligand-binding domains of Angiotensin II (AngII) type 1 receptor (AT1), two different radiolabeled photoreactive AngII analogs were prepared by replacing either the first or the last amino acid of the octapeptide by p-benzoyl-l-phenylalanine (Bpa). High yield, specific labeling of the AT1 receptor was obtained with the 125I-[Sar1,Bpa8]AngII analog. Digestion of the covalent 125I-[Sar1,Bpa8]AngII-AT1 complex with V8 protease generated two major fragments of 15.8 kDa and 17.8 kDa, as determined by SDS-PAGE. Treatment of the [Sar1,Bpa8]AngII-AT1 complex with cyanogen bromide produced a major fragment of 7.5 kDa which, upon further digestion with endoproteinase Lys-C, generated a fragment of 3.6 kDa. Since the 7.5-kDa fragment was sensitive to hydrolysis by 2-nitro-5-thiocyanobenzoic acid, we circumscribed the labeling site of 125I-[Sar1,Bpa8]AngII within amino acids 285 and 295 of the AT1 receptor. When the AT1 receptor was photolabeled with 125I-[Bpa1]AngII, a poor incorporation yield was obtained. Cleavage of the labeled receptor with endoproteinase Lys-C produced a glycopeptide of 31 kDa, which upon deglycosylation showed an apparent molecular mass of 7.5 kDa, delimiting the labeling site of 125I-[Bpa1]AngII within amino acids 147 and 199 of the AT1 receptor. CNBr digestion of the hAT1 I165M mutant receptor narrowed down the labeling site to the fragment 166–199. Taken together, these results indicate that the seventh transmembrane domain of the AT1 receptor interacts strongly with the C-terminal amino acid of [Sar1,Bpa8]AngII, whereas the N-terminal amino acid of [Bpa1]AngII interacts with the second extracellular loop of the AT1 receptor.

