Recent Development of Machine Learning Methods in Microbial Phosphorylation Sites

2020 ◽  
Vol 21 (3) ◽  
pp. 194-203 ◽  
Author(s):  
Md. Mamunur Rashid ◽  
Swakkhar Shatabda ◽  
Md. Mehedi Hasan ◽  
Hiroyuki Kurata

A variety of protein post-translational modifications have been identified that control many cellular functions. Phosphorylation studies in mycobacterial organisms have shown its critical importance in diverse biological processes, such as intercellular communication and cell division. Recent technical advances in high-precision mass spectrometry have identified a large number of microbial phosphorylated proteins and phosphorylation sites through proteome analysis. Identification of phosphorylated proteins with specific modified residues through experimentation is often labor-intensive, costly and time-consuming. These limitations could be overcome through the application of machine learning (ML) approaches. However, only a limited number of computational phosphorylation site prediction tools have been developed so far. This work aims to present a complete survey of the existing ML predictors for microbial phosphorylation. We cover a variety of important aspects for developing a successful predictor, including operating ML algorithms, feature selection methods, window size, and software utility. First, we review the currently available phosphorylation site databases of the microbiome, the state-of-the-art ML approaches, their working principles, and their performances. Lastly, we discuss the limitations and future directions of the computational ML methods for the prediction of phosphorylation.
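
The window size mentioned in the abstract above usually refers to the stretch of residues taken around each candidate site before features are computed. The following is a minimal sketch of that step, not the procedure of any surveyed tool: the function name `extract_windows`, the padding character `X`, and the default residue set `ST` are assumptions made for illustration.

```python
from typing import List, Tuple


def extract_windows(sequence: str, half_width: int,
                    residues: str = "ST", pad: str = "X") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every candidate residue.

    Hypothetical helper: each window holds `half_width` residues on either
    side of the candidate site, padded with `pad` at the sequence ends.
    """
    padded = pad * half_width + sequence + pad * half_width
    windows = []
    for i, aa in enumerate(sequence):
        if aa in residues:
            # In the padded string the residue sits at index i + half_width,
            # so a window centred on it starts at index i.
            windows.append((i + 1, padded[i:i + 2 * half_width + 1]))
    return windows


# Toy sequence, not real data.
print(extract_windows("MSTA", half_width=2))
```

Each window has length `2 * half_width + 1`, so the candidate residue is always the central character, which is what lets a predictor compare sites on equal footing.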

2021 ◽  
Vol 12 ◽  
Author(s):  
Yu-Jian Shao ◽  
Qiao-Yun Zhu ◽  
Zi-Wei Yao ◽  
Jian-Xiang Liu

Plants rapidly adapt to elevated ambient temperature by adjusting their growth and developmental programs. To date, a number of experiments have been carried out to understand how plants sense and respond to warm temperatures. However, how warm temperature signals are relayed from thermosensors to transcriptional regulators is largely unknown. To identify new early regulators of plant thermo-responsiveness, we performed phosphoproteomic analysis using TMT (Tandem Mass Tags) labeling and phosphopeptide enrichment with Arabidopsis etiolated seedlings treated with or without 3h of warm temperatures (29°C). In total, we identified 13,160 phosphopeptides in 5,125 proteins with 10,700 quantifiable phosphorylation sites. Among them, 200 sites (180 proteins) were upregulated, while 120 sites (87 proteins) were downregulated by elevated temperature. GO (Gene Ontology) analysis indicated that phosphorelay-related molecular function was enriched among the differentially phosphorylated proteins. We selected ATL6 (ARABIDOPSIS TOXICOS EN LEVADURA 6) from them and expressed its native and phosphorylation-site mutated (S343A S357A) forms in Arabidopsis and found that the mutated form of ATL6 was less stable than that of the native form both in vivo and in cell-free degradation assays. Taken together, our data revealed extensive protein phosphorylation during thermo-responsiveness, providing new candidate proteins/genes for studying plant thermomorphogenesis in the future.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Niraj Thapa ◽  
Meenal Chaudhari ◽  
Anthony A. Iannetta ◽  
Clarence White ◽  
Kaushik Roy ◽  
...  

Protein phosphorylation, which is one of the most important post-translational modifications (PTMs), is involved in regulating myriad cellular processes. Herein, we present a novel deep learning based approach for organism-specific protein phosphorylation site prediction in Chlamydomonas reinhardtii, a model algal phototroph. An ensemble model combining convolutional neural networks and long short-term memory (LSTM) achieves the best performance in predicting phosphorylation sites in C. reinhardtii. Deemed Chlamy-EnPhosSite, the measured best AUC and MCC are 0.90 and 0.64 respectively for a combined dataset of serine (S) and threonine (T) in independent testing, higher than those measures for other predictors. When applied to the entire C. reinhardtii proteome (totaling 1,809,304 S and T sites), Chlamy-EnPhosSite yielded 499,411 phosphorylated sites with a cut-off value of 0.5 and 237,949 phosphorylated sites with a cut-off value of 0.7. These predictions were compared to an experimental dataset of phosphosites identified by liquid chromatography-tandem mass spectrometry (LC–MS/MS) in a blinded study and approximately 89.69% of 2,663 C. reinhardtii S and T phosphorylation sites were successfully predicted by Chlamy-EnPhosSite at a probability cut-off of 0.5 and 76.83% of sites were successfully identified at a more stringent 0.7 cut-off. Interestingly, Chlamy-EnPhosSite also successfully predicted experimentally confirmed phosphorylation sites in a protein sequence (e.g., RPS6 S245) which did not appear in the training dataset, highlighting prediction accuracy and the power of leveraging predictions to identify biologically relevant PTM sites. These results demonstrate that our method represents a robust and complementary technique for high-throughput phosphorylation site prediction in C. reinhardtii. It has potential to serve as a useful tool to the community. Chlamy-EnPhosSite will contribute to the understanding of how protein phosphorylation influences various biological processes in this important model microalga.
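
The abstract above reports two kinds of numbers: the Matthews correlation coefficient (MCC) and the share of experimentally confirmed sites recovered at a probability cut-off. A minimal sketch of both calculations follows. It is not the authors' code; the function names `mcc` and `recovered_fraction` and the toy scores are assumptions for illustration only.

```python
import math
from typing import Sequence


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def recovered_fraction(scores: Sequence[float], cutoff: float) -> float:
    """Fraction of known phosphosites whose predicted score meets the cut-off.

    `scores` holds hypothetical predicted probabilities for sites that were
    confirmed experimentally.
    """
    if not scores:
        return 0.0
    return sum(s >= cutoff for s in scores) / len(scores)


# Toy scores, not data from the study.
toy_scores = [0.9, 0.6, 0.4, 0.8]
print(recovered_fraction(toy_scores, 0.5))  # looser cut-off recovers more sites
print(recovered_fraction(toy_scores, 0.7))  # stricter cut-off recovers fewer
```

Raising the cut-off can only keep the recovered fraction the same or lower it, which matches the drop from 89.69% to 76.83% reported between the 0.5 and 0.7 cut-offs.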


2007 ◽  
Vol 36 (Database) ◽  
pp. D1015-D1021 ◽  
Author(s):  
J. L. Heazlewood ◽  
P. Durek ◽  
J. Hummel ◽  
J. Selbig ◽  
W. Weckwerth ◽  
...  

2004 ◽  
Vol 18 (3) ◽  
pp. 441-451
Author(s):  
Melissa D. Zolodz ◽  
Karl V. Wood

Proteomic analysis is becoming a popular field in science. Analysis of protein modifications is useful in deciphering cellular functions and errors in pathways that can result in disease. There has been increased interest in the phosphotyrosine proteome. Because it is difficult to locate the tyrosine phosphorylation site in a tyrosine-phosphorylated peptide, or even to verify that the parent protein is a phosphotyrosyl-protein, new analytical tools are being developed. The phosphotyrosine immonium ion can be produced via skimmer CID for detection via ion trap mass spectrometry and is a useful marker for the presence of a phosphotyrosine residue. Skimmer CID analysis can also be used to differentiate phosphotyrosine-containing peptides from other phosphorylated peptides. In this study, phosphotyrosine-containing peptides were analyzed by skimmer CID in an ion trap mass spectrometer. The factors affecting the signal abundance of the phosphotyrosine immonium ion were investigated.


1991 ◽  
Vol 279 (3) ◽  
pp. 727-732 ◽  
Author(s):  
G B Sala-Newby ◽  
A K Campbell

cDNA coding for the luciferase in the firefly Photinus pyralis was amplified in vitro to generate cyclic AMP-dependent protein kinase phosphorylation sites. The DNA was transcribed and translated to generate light-emitting protein. A valine at position 217 was mutated to arginine to generate a site RRFS and the heptapeptide kemptide, the phosphorylation site of the porcine pyruvate kinase, was added at the N- or C-terminus of the luciferase. The proteins carrying phosphorylation sites were characterized for their specific activity, pI, effect of pH on the colour of the light emitted and effect of the catalytic subunit of protein kinase A in the presence of ATP. Only one of the recombinant proteins (RRFS) was significantly different from wild-type luciferase. The RRFS mutant had a lower specific activity, lower pH optimum, emitted greener light at low pH and when phosphorylated it decreased its activity by up to 80%. This latter effect was reversed by phosphatase. This recombinant protein is a good candidate to measure for the first time cyclic AMP-dependent phosphorylation in live cells.


2009 ◽  
Vol 8 (7) ◽  
pp. 922-932 ◽  
Author(s):  
Jens Boesger ◽  
Volker Wagner ◽  
Wolfram Weisheit ◽  
Maria Mittag

Cilia and flagella are cell organelles that are highly conserved throughout evolution. For many years, the green biflagellate alga Chlamydomonas reinhardtii has served as a model for examination of the structure and function of its flagella, which are similar to certain mammalian cilia. Proteome analysis revealed the presence of several kinases and protein phosphatases in these organelles. Reversible protein phosphorylation can control ciliary beating, motility, signaling, length, and assembly. Despite the importance of this posttranslational modification, the identities of many ciliary phosphoproteins and knowledge about their in vivo phosphorylation sites are still missing. Here we used immobilized metal affinity chromatography to enrich phosphopeptides from purified flagella and analyzed them by mass spectrometry. One hundred forty-one phosphorylated peptides were identified, belonging to 32 flagellar proteins. Thereby, 126 in vivo phosphorylation sites were determined. The flagellar phosphoproteome includes different structural and motor proteins, kinases, proteins with protein interaction domains, and many proteins whose functions are still unknown. In several cases, a dynamic phosphorylation pattern and clustering of phosphorylation sites were found, indicating a complex physiological status and specific control by reversible protein phosphorylation in the flagellum.


2020 ◽  
Vol 219 (9) ◽  
Author(s):  
Manuel Chiusa ◽  
Wen Hu ◽  
Jozef Zienkiewicz ◽  
Xiwu Chen ◽  
Ming-Zhi Zhang ◽  
...  

Excessive accumulation of collagen leads to fibrosis. Integrin α1β1 (Itgα1β1) prevents kidney fibrosis by reducing collagen production through inhibition of the EGF receptor (EGFR) that phosphorylates cytoplasmic and nuclear proteins. To elucidate how the Itgα1β1/EGFR axis controls collagen synthesis, we analyzed the levels of nuclear tyrosine phosphorylated proteins in WT and Itgα1-null kidney cells. We show that the phosphorylation of the RNA-DNA binding protein fused in sarcoma (FUS) is higher in Itgα1-null cells. FUS contains EGFR-targeted phosphorylation sites and, in Itgα1-null cells, activated EGFR promotes FUS phosphorylation and nuclear translocation. Nuclear FUS binds to the collagen IV promoter, commencing gene transcription that is reduced by inhibiting EGFR, down-regulating FUS, or expressing FUS mutated in the EGFR-targeted phosphorylation sites. Finally, a cell-penetrating peptide that inhibits FUS nuclear translocation reduces FUS nuclear content and collagen IV transcription. Thus, EGFR-mediated FUS phosphorylation regulates FUS nuclear translocation and transcription of a major profibrotic collagen gene. Targeting FUS nuclear translocation offers a new antifibrotic therapy.


Author(s):  
Bethany Percha

Electronic health records (EHRs) are becoming a vital source of data for healthcare quality improvement, research, and operations. However, much of the most valuable information contained in EHRs remains buried in unstructured text. The field of clinical text mining has advanced rapidly in recent years, transitioning from rule-based approaches to machine learning and, more recently, deep learning. With new methods come new challenges, however, especially for those new to the field. This review provides an overview of clinical text mining for those who are encountering it for the first time (e.g., physician researchers, operational analytics teams, machine learning scientists from other domains). While not a comprehensive survey, this review describes the state of the art, with a particular focus on new tasks and methods developed over the past few years. It also identifies key barriers between these remarkable technical advances and the practical realities of implementation in health systems and in industry. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2021 ◽  
Author(s):  
Gábor Csizmadia ◽  
Krisztina Liszkai-Peres ◽  
Bence Ferdinandy ◽  
Ádám Miklósi ◽  
Veronika Konok

Human activity recognition (HAR) using machine learning (ML) methods is a relatively new method for collecting and analyzing large amounts of human behavioral data using special wearable sensors. Our main goal was to find a reliable method which could automatically detect various playful and daily routine activities in children. We defined 40 activities for ML recognition, and we collected activity motion data by means of wearable smartwatches with special SensKid software. We analyzed the data of 34 children (19 girls, 15 boys; age range: 6.59 – 8.38; median age = 7.47). All children were typically developing first graders from three elementary schools. The activity recognition was a binary classification task which was evaluated with a Light Gradient Boosted Machine (LGBM) learning algorithm, a decision tree-based method, with 3-fold cross-validation. We used the sliding window technique during the signal processing, and we aimed at finding the best window size for the analysis of each behavior element to achieve the most effective settings. Seventeen activities out of 40 were successfully recognized with AUC values above 0.8. The window size had no significant effect. The overall accuracy was 0.95, which is in the top segment of previously published similar HAR data. In summary, the LGBM is a very promising solution for HAR. In line with previous findings, our results provide a firm basis for a more precise and effective recognition system that can make human behavioral analysis faster and more objective.
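
The sliding window technique named in the abstract above cuts a continuous sensor recording into fixed-length segments that a classifier can score one at a time. Here is a minimal sketch under stated assumptions: the function name `sliding_windows`, the `step` parameter, and the toy signal are hypothetical, and the study's actual window sizes and overlap are not given in the abstract.

```python
from typing import Iterator, List, Sequence


def sliding_windows(signal: Sequence[float], size: int,
                    step: int) -> Iterator[List[float]]:
    """Yield consecutive fixed-length segments of a one-dimensional signal.

    Hypothetical helper: `size` is the window length in samples and `step`
    is how far the window advances each time (step < size gives overlap).
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    # Stop once a full window no longer fits; trailing samples are dropped.
    for start in range(0, len(signal) - size + 1, step):
        yield list(signal[start:start + size])


# Toy accelerometer-like values, not data from the study.
samples = [1, 2, 3, 4, 5]
print(list(sliding_windows(samples, size=3, step=1)))
print(list(sliding_windows(samples, size=3, step=2)))
```

Trying several values of `size` on the same recording is one simple way to compare window sizes, which is the kind of comparison the abstract describes.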

