Positive-Unlabeled Learning for Pupylation Sites Prediction

2016 ◽  
Vol 2016 ◽  
pp. 1-5 ◽  
Author(s):  
Ming Jiang ◽  
Jun-Zhe Cao

Pupylation, a crucial posttranslational modification in prokaryotes, plays a key role in regulating various protein functions. To understand the molecular mechanism of pupylation, it is important to identify pupylation substrates and sites accurately. Several computational methods have been developed to identify pupylation sites because traditional experimental methods are time-consuming and labor-intensive. Existing computational methods use the experimentally annotated pupylation sites as the positive training set and the remaining nonannotated lysine residues as the negative training set to build classifiers that predict new pupylation sites in unknown proteins. However, the remaining nonannotated lysine residues may contain pupylation sites that have not yet been experimentally validated. Unlike previous methods, this study used the experimentally annotated pupylation sites as the positive training set and the remaining nonannotated lysine residues as the unlabeled training set. A novel method named PUL-PUP was proposed to predict pupylation sites using a positive-unlabeled learning technique. Our experimental results indicated that PUL-PUP significantly outperforms the other methods for the prediction of pupylation sites. As an application, PUL-PUP was also used to predict the most likely pupylation sites among nonannotated lysine sites.
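The positive-unlabeled setting described above can be illustrated with a minimal sketch. It is not the PUL-PUP algorithm, whose details the abstract does not give. It assumes the widely used Elkan-Noto style correction, in which a classifier is first trained to separate labeled from unlabeled examples and its scores are then rescaled by an estimated label frequency c, under the assumption that annotated sites are a random sample of all true sites. The function names, site identifiers, and scores are hypothetical.

```python
from statistics import mean


def estimate_label_frequency(scores_on_known_positives):
    """Estimate c = P(labeled | truly positive).

    The input is the labeled-vs-unlabeled classifier score on held-out
    annotated (known positive) sites; their mean estimates c.
    """
    return mean(scores_on_known_positives)


def corrected_probability(score, c):
    """Rescale a labeled-vs-unlabeled score into an estimate of P(positive | x)."""
    return min(1.0, score / c)


def rank_unlabeled(unlabeled_scores, c):
    """Rank unlabeled sites from most to least likely to be true positives."""
    corrected = [
        (site, corrected_probability(score, c))
        for site, score in unlabeled_scores.items()
    ]
    return sorted(corrected, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores from some previously trained classifier.
c = estimate_label_frequency([0.25, 0.75, 0.5])
ranking = rank_unlabeled({"K10": 0.1, "K42": 0.4}, c)
```

The ranking step mirrors the application mentioned in the abstract, where nonannotated lysine sites are ordered by how likely they are to be pupylation sites.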

2020 ◽  
Vol 26 ◽  
Author(s):  
Pengmian Feng ◽  
Lijing Feng ◽  
Chaohui Tang

Background and Purpose: N6-methyladenosine (m6A) plays critical roles in a broad set of biological processes. Knowledge of the precise location of m6A sites in the transcriptome is vital for deciphering its biological functions. Although experimental techniques have made substantial contributions to identifying m6A, they are still labor-intensive and time-consuming. As good complements to experimental methods, a series of computational approaches have been proposed in the past few years to identify m6A sites. Methods: To help researchers select appropriate methods for identifying m6A sites, it is necessary to give a comprehensive review and comparison of existing methods. Results: Since research on m6A in Saccharomyces cerevisiae is relatively well understood, in this review we summarized recent progress on the computational prediction of m6A sites in S. cerevisiae and assessed the performance of existing computational methods. Finally, future directions for computationally identifying m6A sites were presented. Conclusion: Taken together, we anticipate that this review will provide important guidance for the computational analysis of m6A modifications.


2019 ◽  
Vol 20 (5) ◽  
pp. 565-578 ◽  
Author(s):  
Lidong Wang ◽  
Ruijun Zhang

Ubiquitination is an important post-translational modification (PTM) process for the regulation of protein functions, which is associated with cancer, cardiovascular and other diseases. Recent initiatives have focused on the detection of potential ubiquitination sites with the aid of physicochemical test approaches in conjunction with the application of computational methods. The identification of ubiquitination sites using laboratory tests is especially susceptible to the temporality and reversibility of the ubiquitination processes, and is also costly and time-consuming. It has been demonstrated that computational methods are effective in extracting potential rules or inferences from biological sequence collections. Up to the present, the computational strategy has been one of the critical research approaches that have been applied for the identification of ubiquitination sites, and currently, there are numerous state-of-the-art computational methods that have been developed from machine learning and statistical analysis to undertake such work. In the present study, the construction of benchmark datasets is summarized, together with feature representation methods, feature selection approaches and the classifiers involved in several previous publications. In an attempt to explore pertinent development trends for the identification of ubiquitination sites, an independent test dataset was constructed and the predicting results obtained from five prediction tools are reported here, together with some related discussions.


2016 ◽  
Vol 397 (2) ◽  
pp. 135-145 ◽  
Author(s):  
Miriam Olombrada ◽  
Lucía García-Ortega ◽  
Javier Lacadena ◽  
Mercedes Oñaderra ◽  
José G. Gavilanes ◽  
...  

Ribotoxins are cytotoxic members of the family of fungal extracellular ribonucleases best represented by RNase T1. They share a high degree of sequence identity and a common structural fold, including the geometric arrangement of their active sites. However, ribotoxins are larger, with a well-defined N-terminal β-hairpin, and display longer and positively charged unstructured loops. These structural differences account for their cytotoxic properties. Unexpectedly, the discovery of hirsutellin A (HtA), a ribotoxin produced by the invertebrate pathogen Hirsutella thompsonii, showed how it was possible to accommodate these features into a shorter amino acid sequence. Examination of HtA N-terminal β-hairpin reveals differences in terms of length, charge, and spatial distribution. Consequently, four different HtA mutants were prepared and characterized. One of them was the result of deleting this hairpin [Δ(8-15)] while the other three affected single Lys residues in its close spatial proximity (K115E, K118E, and K123E). The results obtained support the general conclusion that HtA active site would show a high degree of plasticity, being able to accommodate electrostatic and structural changes not suitable for the other previously known larger ribotoxins, as the variants described here only presented small differences in terms of ribonucleolytic activity and cytotoxicity against cultured insect cells.


2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Hudson Fernandes Golino ◽  
Liliany Souza de Brito Amaral ◽  
Stenio Fernando Pimentel Duarte ◽  
Cristiano Mauro Assis Gomes ◽  
Telma de Jesus Soares ◽  
...  

The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist circumference (WC), hip circumference (HC), and waist-hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The results show that for women, BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42), the lowest misclassification (.19), and the highest pseudo-R2 (.43). This model presented a sensitivity of 80.86% and a specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men, BMI, WC, HC, and WHR showed the best prediction, with the lowest deviance (57.25), the lowest misclassification (.16), and the highest pseudo-R2 (.46). This model had a sensitivity of 72% and a specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power.
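For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how sensitivity and specificity are computed from confusion-matrix counts. The counts in the usage lines are hypothetical and are not taken from the study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual positive cases that the model flags as positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of actual negative cases that the model flags as negative."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts: 10 cases with increased blood pressure, 10 without.
train_sensitivity = sensitivity(true_positives=8, false_negatives=2)
train_specificity = specificity(true_negatives=9, false_positives=1)
```

Computing both metrics separately on the training and test samples, as the abstract does, shows how much performance drops on unseen data.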


2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
Huaping Guo ◽  
Xiaoyu Diao ◽  
Hongbing Liu

Rotation Forest is an ensemble learning approach that achieves better performance than Bagging and Boosting by building accurate and diverse classifiers in rotated feature spaces. However, like other conventional classifiers, Rotation Forest does not work well on imbalanced data, which are characterized by far fewer examples of one class (the minority class) than of the other (the majority class), and in which misclassifying minority class examples is often much more costly than the reverse. This paper proposes a novel method called Embedding Undersampling Rotation Forest (EURF) to handle this problem by (1) sampling subsets from the majority class and learning a projection matrix from each subset and (2) obtaining training sets by projecting re-undersampled subsets of the original data set to new spaces defined by the matrices and constructing an individual classifier from each training set. In the first step, undersampling forces the rotation matrix to better capture the features of the minority class without harming the diversity between individual classifiers. In the second step, undersampling aims to improve the performance of individual classifiers on the minority class. The experimental results show that EURF achieves significantly better performance than other state-of-the-art methods.
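The undersampling idea can be sketched in a few lines. This is only an illustration of drawing several balanced subsets from imbalanced data. It does not implement the projection (rotation) matrices or the classifiers of EURF, and the function names and the choice of a 1:1 class ratio are assumptions.

```python
import random


def undersample(majority, minority, rng):
    """Return a balanced subset: all minority examples plus an equally
    sized random sample (without replacement) of majority examples."""
    sampled_majority = rng.sample(majority, k=len(minority))
    return sampled_majority + list(minority)


def balanced_subsets(majority, minority, n_subsets, seed=0):
    """Draw several independent balanced subsets, one per ensemble member.

    Assumes the majority class has at least as many examples as the
    minority class.
    """
    rng = random.Random(seed)
    return [undersample(majority, minority, rng) for _ in range(n_subsets)]


# Hypothetical data: 100 majority examples (0..99), 3 minority examples.
subsets = balanced_subsets(list(range(100)), [-1, -2, -3], n_subsets=4)
```

Because each subset draws a different random part of the majority class, classifiers trained on them differ from one another, which is the source of diversity that ensemble methods rely on.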


Author(s):  
Martin L. Rennie ◽  
Kimon Lemonidis ◽  
Connor Arkinson ◽  
Viduth K. Chaugule ◽  
Mairi Clarke ◽  
...  

The Fanconi Anemia (FA) pathway is a dedicated pathway for the repair of DNA interstrand crosslinks, which is additionally activated in response to other forms of replication stress. A key step in the activation of the FA pathway is the monoubiquitination of each of the two subunits (FANCI and FANCD2) of the ID2 complex on specific lysine residues. However, the molecular function of these modifications has been unknown for nearly two decades. Here we find that ubiquitination of FANCD2 acts to increase ID2’s affinity for double stranded DNA via promoting/stabilizing a large-scale conformational change in the complex, resulting in a secondary “Arm” ID2 interface encircling DNA. Ubiquitination of FANCI, on the other hand, largely protects the ubiquitin on FANCD2 from USP1/UAF deubiquitination, with key hydrophobic residues of FANCI’s ubiquitin being important for this protection. In effect, both of these post-translational modifications function to stabilise a conformation in which the ID2 complex encircles DNA.


2013 ◽  
Vol 45 (3) ◽  
pp. 379-383 ◽  
Author(s):  
A. Cias

Conventional sintering techniques for structural steels have been developed principally for Cu and Ni containing alloys. Applying these to Cr and Mn steels (successful products of traditional metallurgy) encounters the problem of the high affinity of these elements for oxygen. A solution is to employ a microatmosphere in a semiclosed container, which favours reduction reactions. This has already proved successful on a laboratory scale, especially with nitrogen as the furnace gas. Further modifications to the system, now described, include the use of two sintering boxes, one inside the other. Superior mechanical properties, even using air as the furnace gas, are attainable.


2020 ◽  
Vol 21 (1) ◽  
pp. 3-10 ◽  
Author(s):  
Jianwei Li ◽  
Yan Huang ◽  
Yuan Zhou

RNA 5-methylcytosine (m5C) is one of the pillars of post-transcriptional modification (PTCM). A growing body of evidence suggests that m5C plays a vital role in RNA metabolism. Accurate localization of RNA m5C sites in tissue cells is the premise and basis for the in-depth understanding of the functions of m5C. However, the main experimental methods of detecting m5C sites are limited to varying degrees. Establishing a computational model to predict modification sites is an excellent complement to wet experiments for identifying m5C sites. In this review, we summarized some available m5C predictors and discussed the characteristics of these methods.


Author(s):  
Kun Wei ◽  
Cheng Deng ◽  
Xu Yang

Zero-Shot Learning (ZSL) handles the problem that some testing classes never appear in the training set. Existing ZSL methods are designed for learning from a fixed training set and cannot capture and accumulate knowledge from multiple training sets, which makes them infeasible for many real-world applications. In this paper, we propose a new ZSL setting, named Lifelong Zero-Shot Learning (LZSL), which aims to accumulate knowledge while learning from multiple datasets and to recognize the unseen classes of all trained datasets. In addition, a novel method is proposed to realize LZSL, which effectively alleviates Catastrophic Forgetting in the continuous training process. Specifically, since the datasets contain different semantic embeddings, we utilize a Variational Auto-Encoder to obtain unified semantic representations. Then, we leverage a selective retraining strategy to preserve the trained weights of previous tasks and avoid negative transfer when fine-tuning the entire model. Finally, knowledge distillation is employed to transfer knowledge from previous training stages to the current stage. We also design the LZSL evaluation protocol and challenging benchmarks. Extensive experiments on these benchmarks indicate that our method tackles the LZSL problem effectively, while existing ZSL methods fail.
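The knowledge distillation step can be illustrated with a minimal sketch. The abstract does not specify the loss used, so this assumes the common formulation in which the outputs of the previous-stage model (the teacher) and the current model (the student) are softened with a temperature and compared by KL divergence. The function names and the temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the maximum for numerical stability."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)
    exps = [math.exp(z - largest) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the student's.

    The loss is zero when both models produce the same softened outputs
    and grows as the student drifts away from the teacher.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0)


# Hypothetical logits from a previous-stage model and the current model.
loss = distillation_loss([2.0, 0.5, -1.0], [1.0, 1.0, 0.0])
```

Adding such a term to the training objective penalizes the current model for departing from what earlier stages learned, which is how distillation helps against forgetting.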


Author(s):  
Nesrin Sarigul-Klijn ◽  
Anthony White

This article details a novel method for the determination of safe flight paths dynamically following an in-flight distress event. The method is based on probabilistic safety metrics which also include the touchdown and evacuation/rescue phases after landing. Two case studies simulating in-flight distress events, one from the west and the other from the east coast are presented using these formulations for a quantitative analysis. It is found that the nearest landing sites are not always the safest ones showing the benefits of the newly developed safety metrics. Finally, the path safety levels are plotted as a function of mission safety probability values using innovative polar plots that provide useful information to pilots.

