SUMOhunt: Combining Spatial Staging between Lysine and SUMO with Random Forests to Predict SUMOylation

2013 ◽  
Vol 2013 ◽  
pp. 1-11 ◽  
Author(s):  
Amna Ijaz

Modification with SUMO protein has many key roles in eukaryotic systems, which makes the identification of its target proteins and sites of considerable importance. Information regarding the SUMOylation of a protein may tell us about its subcellular localization, function, and spatial orientation. This modification occurs only at particular lysine residues in a given protein, not at all of them. In competition with biochemical means of modified-site recognition, computational methods are strong contenders for predicting SUMOylation sites on proteins. In this research, physicochemical properties of amino acids retrieved from AAIndex, especially those involved in docking of modifier and target proteins and optimal presentation of the target lysine, have been combined with sequence information and the random forest-based classifier provided in WEKA to develop a prediction model, SUMOhunt, with statistics significantly better than all previous predictors. The model achieves 97.56% accuracy, 100% sensitivity, 94% specificity, and an MCC of 0.95, which shows that the proposed amino acid properties have a significant role in SUMO attachment. SUMOhunt will hence bring great reliability and efficiency to SUMOylation prediction.
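The four statistics quoted in this abstract are all derived from a binary confusion matrix. The following is a minimal sketch of how they are usually computed; the function name and the counts are hypothetical and are not taken from the SUMOhunt evaluation.

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from binary confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Sensitivity: fraction of true modification sites that were predicted.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-sites that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Hypothetical counts chosen only for illustration.
metrics = confusion_metrics(tp=50, tn=47, fp=3, fn=0)
```

MCC uses all four cells of the confusion matrix, so it is generally considered more informative than accuracy alone when positive and negative sites are unevenly represented.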

2011 ◽  
Vol 8 (3) ◽  
pp. 158-175
Author(s):  
Gualberto Asencio Cortés ◽  
Jesús A. Aguilar-Ruiz

Summary: The prediction of protein structures is a current issue of great significance in structural bioinformatics. More specifically, the prediction of the tertiary structure of a protein consists in determining its three-dimensional conformation based solely on its amino acid sequence. This study proposes a method in which protein fragments are assembled according to their physicochemical similarities, using information extracted from known protein structures. Many approaches cited in the literature use the physicochemical properties of amino acids, generally hydrophobicity, polarity and charge, to predict structure. In our method, implemented with parallel multithreading, we used a set of 30 physicochemical amino acid properties selected from the AAindex database. Several protein tertiary structure prediction methods produce a contact map. Our proposed method produces a distance map, which provides more information about the structure of a protein than a contact map. We performed several preliminary analyses of the protein physicochemical data distributions using 3D surfaces. Three main pattern types were found in the 3D surfaces, so it is possible to extract rules to predict distances between amino acids according to their physicochemical properties. We performed an experimental validation of our method using five non-homologous protein sets, and we showed the generality of this method and its prediction quality using the amino acid properties considered. Finally, we included a study of the algorithm's efficiency according to the number of most similar fragments considered, and we notably improved the precision with the studied protein sets.
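The statement that a distance map carries more information than a contact map can be illustrated with a small sketch: a contact map is recoverable from a distance map by thresholding, but not the other way round. The coordinates below are invented, and the 8 Å cutoff is an assumed convention rather than a value given in the abstract.

```python
import math

def distance_map(coords):
    """Pairwise Euclidean distances between residue coordinates (in Angstroms)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def contact_map(dmap, threshold=8.0):
    """Binary map: True where two residues are within the threshold distance."""
    return [[d <= threshold for d in row] for row in dmap]

# Hypothetical coordinates for four residues placed on a straight line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
dmap = distance_map(coords)
cmap = contact_map(dmap)
```

Thresholding discards the actual separations: residues 0 and 2 and residues 0 and 1 are both marked as contacts, although their distances differ by a factor of two.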


2019 ◽  
Vol 24 (34) ◽  
pp. 4013-4022 ◽  
Author(s):  
Xiang Cheng ◽  
Xuan Xiao ◽  
Kuo-Chen Chou

Knowledge of protein subcellular localization is vitally important for both basic research and drug development. With the avalanche of protein sequences emerging in the post-genomic age, it is highly desirable to develop computational tools for timely and effective identification of their subcellular localization based on sequence information alone. Recently, a predictor called “pLoc-mPlant” was developed for identifying the subcellular localization of plant proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more effort is needed to improve it further, because pLoc-mPlant was trained on an extremely skewed dataset in which some subsets (i.e., the protein numbers for some subcellular locations) were more than 10 times larger than the others. Accordingly, it cannot avoid the bias caused by such an uneven training dataset. To overcome this bias, we have developed a new and bias-free predictor called pLoc_bal-mPlant by balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mPlant, the existing state-of-the-art predictor for identifying the subcellular localization of plant proteins. To maximize convenience for the majority of experimental scientists, a user-friendly web server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mPlant/, through which users can easily obtain their desired results without needing to go through the detailed mathematics.
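The abstract does not say how the training dataset was balanced. As a rough illustration of the general idea only, the sketch below uses plain random oversampling on a single-label toy dataset; this is a swapped-in technique, not the authors' procedure, and the function name, sample identifiers and class labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes match the largest."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw with replacement from the class to fill the gap to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Hypothetical skewed dataset: one location has 10 times more proteins than the other.
samples = [f"protein_{i}" for i in range(22)]
labels = ["nucleus"] * 20 + ["vacuole"] * 2
balanced_samples, balanced_labels = oversample(samples, labels)
class_counts = Counter(balanced_labels)
```

Real multi-label data, where one protein can belong to several locations, needs more care than this single-label simplification, because duplicating a multiplex protein changes the counts of every location it belongs to.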


2019 ◽  
Vol 15 (5) ◽  
pp. 472-485 ◽  
Author(s):  
Kuo-Chen Chou ◽  
Xiang Cheng ◽  
Xuan Xiao

Background/Objective: Information about protein subcellular localization is crucially important for both basic research and drug development. With the explosive growth of protein sequences discovered in the post-genomic age, there is high demand for powerful bioinformatics tools that identify their subcellular localization in a timely and effective manner based on sequence information alone. Recently, a predictor called “pLoc-mEuk” was developed for identifying the subcellular localization of eukaryotic proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems where many proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more effort is needed to improve it further, because pLoc-mEuk was trained on an extremely skewed dataset in which one subset was about 200 times the size of the other subsets. Accordingly, it cannot avoid the bias caused by such an uneven training dataset.
Methods: To alleviate this bias, we have developed a new predictor called pLoc_bal-mEuk by quasi-balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mEuk, the existing state-of-the-art predictor for identifying the subcellular localization of eukaryotic proteins. It has not escaped our notice that the quasi-balancing treatment can also be used to deal with many other biological systems.
Results: To maximize convenience for most experimental scientists, a user-friendly web server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mEuk/.
Conclusion: It is anticipated that the pLoc_bal-mEuk predictor holds very high potential to become a useful high-throughput tool for identifying the subcellular localization of eukaryotic proteins, particularly for finding multi-target drugs, which is currently a very hot trend in drug development.


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Chuandong Song ◽  
Haifeng Wang

Emerging evidence demonstrates that post-translational modification plays an important role in several complex human diseases. Nevertheless, given the inherently high cost and time consumption of classical in vitro experiments, increasing attention has been paid to the development of efficient and accessible computational tools to identify potential modification sites at the protein level. In this work, we propose a machine learning-based model called CirBiTree for identifying potential citrullination sites. More specifically, we first use the biprofile Bayesian approach to extract peptide sequence information. Then, a flexible neural tree and a fuzzy neural network are employed as the classification model. Finally, the most suitable length of identified peptides has been selected in this model. To evaluate the performance of the proposed method, some state-of-the-art methods have been employed for comparison. The experimental results demonstrate that the proposed method is better than the other methods. CirBiTree achieves 83.07% sensitivity (Sn), 80.50% specificity (Sp), an F1 score of 0.8201, and an MCC of 0.6359.
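The abstract gives no details of the biprofile Bayesian feature extraction. One common reading of the general technique is that each peptide is encoded by the frequency of its residue at every position, estimated once from positive training peptides and once from negative ones. The sketch below follows that assumed reading; the function names and the three-residue peptides are hypothetical and are not CirBiTree's data or code.

```python
from collections import Counter

def position_profile(peptides):
    """Per-position amino acid frequencies estimated from equal-length peptides."""
    length = len(peptides[0])
    profile = []
    for i in range(length):
        counts = Counter(peptide[i] for peptide in peptides)
        profile.append({aa: c / len(peptides) for aa, c in counts.items()})
    return profile

def bpb_encode(peptide, pos_profile, neg_profile):
    """Concatenate the residue frequencies under the positive and negative profiles."""
    pos = [pos_profile[i].get(aa, 0.0) for i, aa in enumerate(peptide)]
    neg = [neg_profile[i].get(aa, 0.0) for i, aa in enumerate(peptide)]
    return pos + neg

# Hypothetical toy peptides centred on an arginine (R).
positives = ["GRG", "ARG", "GRA"]
negatives = ["LRK", "LRE", "VRK"]
features = bpb_encode("GRK", position_profile(positives), position_profile(negatives))
```

The resulting vector has twice the peptide length, which is why the choice of peptide window length directly sets the dimensionality passed to the classifier.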


2013 ◽  
Vol 804 ◽  
pp. 70-75 ◽  
Author(s):  
Jian-Hua Huang ◽  
Hua-Lin Xie ◽  
Jun Yan ◽  
Hong-Mei Lu ◽  
Qing-Song Xu ◽  
...  

Author(s):  
Bindia Sahu ◽  
Jaya Prakash Alla ◽  
Gladstone Christopher Jayakumar

Leather tanning is a stabilisation process of skin fibers. This is achieved by the interaction of collagen amino acids with tanning agents to stabilise skin against putrefaction. Tanning of collagen with oil is a special class of tanning known as chamois tanning. Chemically, oil tanning involves oxidation of the unsaturation present in the oil, which is generally achieved by exposing oil-treated skins to air. In this study, benzoyl peroxide has been used as an accelerating agent for oxidation of the unsaturated bonds present in linseed oil during the oil tanning process. Results show a remarkable reduction in tanning duration from fifteen days to two days. The chamois leathers prepared using the oxidation accelerant (benzoyl peroxide) have been evaluated for physical properties such as water absorption (611%), tensile strength (18 N/mm²) and percentage of elongation (66%), which are found to be better than those of control leathers.


1994 ◽  
Vol 14 (3) ◽  
pp. 2201-2212 ◽  
Author(s):  
Z Yang ◽  
L Gu ◽  
P H Romeo ◽  
D Bories ◽  
H Motohashi ◽  
...  

GATA-3 is a zinc finger transcription factor which is expressed in a highly restricted and strongly conserved tissue distribution pattern in vertebrate organisms, specifically, in a subset of hematopoietic cells, in cells within the central and peripheral nervous systems, in the kidney, and in placental trophoblasts. Tissue-specific cellular genes regulated by GATA-3 have been identified in T lymphocytes and the placenta, while GATA-3-regulated genes in the nervous system and kidney have not yet been defined. We prepared monoclonal antibodies with which we could dissect the biochemical and functional properties of human GATA-3. The results of these experiments show some anticipated phenotypes, for example, the definition of discrete domains required for specific DNA-binding site recognition (amino acids 303 to 348) and trans activation (amino acids 30 to 74). The signaling sequence for nuclear localization of human GATA-3 is a property conferred by sequences within and surrounding the amino finger (amino acids 249 to 311) of the protein, thereby assigning a function to this domain and thus explaining the curious observation that this zinc finger is dispensable for DNA binding by the GATA family of transcription factors.


2021 ◽  
pp. 1-8
Author(s):  
Adeyeye EI ◽  
Idowu OT

This article reports the amino acid composition of the Nigerian local cheese called ‘wara’. ‘Wara’ is made by boiling cow milk with some added coagulant to curdle the milk protein, resulting in coagulated milk protein and whey. ‘Wara’ used to be an excellent source of nutrients such as proteins, fats, minerals and vitamins. Samples were purchased in Ado-Ekiti, Nigeria. Amino acid values were high (g/100g crude protein) in Leu, Asp, Glu, Pro, Phe and Arg, with a total value of 97.7. The quality parameters of the amino acids were: TEAA (42.6g/100g and 43.6%) whereas TNEAA (55.1g/100g and 56.4%); TArAA (12.8g/100g and 13.1%); TBAA (14.2g/100g and 14.5%); TSAA (3.10g/100g and 3.17%); %Cys in TSAA (51.4); Leu/Ile ratio (1.74); P-PER1 (2.65); P-PER2 (2.48); P-PER3 (2.41); EAAI1 (soybean standard) (1.29) and EAAI2 (egg standard) (99.9); BV (97.2) and Lys/Trp ratio (3.62). In the statistical analysis, TEAA and TNEAA were not significantly different at r=0.01. On the amino acid scores, Met was limiting (0.459) in the egg comparison, and Lys was limiting for both the FAO/WHO [24] and preschool EAA requirements, with respective values of 0.966 and 0.97. Estimates of essential amino acid requirements at ages 10-12 years (mg/kg/day) showed the ‘wara’ sample to be better than the standard by 3.72-330%, with Lys (3.72%) showing the smallest margin and Trp (330%) the largest. The results showed that ‘wara’ is protein-condensed and can be eaten as raw cheese, flavoured snack, sandwich filling or fried cake.
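The percentage figures quoted beside each amino acid group can be checked as the group value divided by the total of 97.7 g/100g crude protein. The short sketch below does that arithmetic with the values from the abstract; the variable names are illustrative.

```python
# Values in g/100g crude protein, as quoted in the abstract.
total_aa = 97.7
groups = {
    "TEAA": 42.6,   # total essential amino acids
    "TNEAA": 55.1,  # total non-essential amino acids
    "TArAA": 12.8,  # total aromatic amino acids
    "TBAA": 14.2,   # total basic amino acids
    "TSAA": 3.10,   # total sulphur amino acids
}

# Share of each group in the total amino acid content, in percent.
percentages = {name: round(100 * value / total_aa, 2) for name, value in groups.items()}

# Essential and non-essential fractions should add up to the total.
essential_plus_nonessential = groups["TEAA"] + groups["TNEAA"]
```

The computed shares agree with the quoted 43.6%, 56.4%, 13.1%, 14.5% and 3.17% after rounding, and TEAA plus TNEAA reproduces the stated total of 97.7.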

