Predicting protein-membrane interfaces of peripheral membrane proteins using ensemble machine learning

2021 ◽  
Author(s):  
Alexios Chatzigoulas ◽  
Zoe Cournia

Motivation: Abnormal protein-membrane attachment is involved in deregulated cellular pathways and in disease. Modulating protein-membrane interactions therefore represents a promising new therapeutic strategy for peripheral membrane proteins that have so far been considered undruggable. A major obstacle to this drug design strategy is that the membrane-binding domains of peripheral membrane proteins are usually not known. Fast and efficient algorithms that predict the protein-membrane interface would shed light on the accessibility of membrane-protein interfaces to drug-like molecules. Results: Herein, we describe an ensemble machine learning methodology and algorithm for predicting membrane-penetrating residues. We use experimental data available in the literature to train 21 machine learning classifiers and a voting classifier. On an independent test set of previously unseen proteins, the ensemble classifier achieved a macro-averaged F1 score of 0.92 and an MCC of 0.84 for predicting membrane-penetrating residues. Availability and implementation: The Python code for predicting protein-membrane interfaces of peripheral membrane proteins is available at https://github.com/zoecournia/DREAMM.
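
The two metrics reported in the abstract above, the macro-averaged F1 score and the Matthews correlation coefficient (MCC), can be computed from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the DREAMM code; the function names and the toy labels (1 = membrane-penetrating residue, 0 = other residue) are hypothetical and chosen only for illustration.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores of the two classes."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    f1_pos = f1_from_counts(tp, fp, fn)
    # For the negative class the roles swap: TN acts as TP, FN as FP, FP as FN.
    f1_neg = f1_from_counts(tn, fn, fp)
    return (f1_pos + f1_neg) / 2


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when any marginal count is empty."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy labels, not data from the paper.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
toy_macro_f1 = macro_f1(toy_true, toy_pred)
toy_mcc = mcc(toy_true, toy_pred)
```

Macro averaging weights both classes equally, which matters when membrane-penetrating residues are a small minority of all residues; MCC likewise uses all four cells of the confusion matrix rather than only the positive class.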

2000 ◽  
Vol 11 (4) ◽  
pp. 1421-1432 ◽  
Author(s):  
Ozlem Ugur ◽  
Teresa L. Z. Jones

XLαs is a splice variant of the heterotrimeric G protein, Gαs, found on Golgi membranes in cells with regulated and constitutive secretion. We examined the role of the alternatively spliced amino terminus of XLαs for Golgi targeting with the use of subcellular fractionation and fluorescence microscopy. XLαs incorporated [3H]palmitate, and mutation of cysteines in a cysteine-rich region inhibited this incorporation and lessened membrane attachment. Deletion of a proline-rich region abolished Golgi localization of XLαs without changing its membrane attachment. The proline-rich and cysteine-rich regions together were sufficient to target the green fluorescent protein, a cytosolic protein, to Golgi membranes. The membrane attachment and Golgi targeting of the fusion protein required the putative palmitoylation sites, the cysteine residues in the cysteine-rich region. Several peripheral membrane proteins found at the Golgi have proline-rich regions, including a Gαi2 splice variant, dynamin II, βIII spectrin, comitin, and a Golgi SNARE, GS32. Our results suggest that proline-rich regions can be a Golgi-targeting signal for G protein α subunits and possibly for other peripheral membrane proteins as well.


Author(s):  
Yae Won Park ◽  
Jihwan Eom ◽  
Sooyon Kim ◽  
Hwiyoung Kim ◽  
Sung Soo Ahn ◽  
...  

Context: Early identification of the response of prolactinoma patients to dopamine agonists (DA) is crucial in treatment planning. Objective: To develop a radiomics model using an ensemble machine learning classifier with conventional magnetic resonance images (MRIs) to predict the DA response in prolactinoma patients. Design: Retrospective study. Setting: Severance Hospital. Patients: A total of 177 prolactinoma patients who underwent baseline MRI (109 DA responders and 68 DA non-responders) were allocated to the training (n = 141) and test (n = 36) sets. Radiomic features (n = 107) were extracted from coronal T2-weighted MRIs. After feature selection, single models (random forest, light gradient boosting machine, extra-trees, quadratic discriminant analysis, and linear discriminant analysis) with oversampling methods were trained to predict the DA response. A soft voting ensemble classifier produced the final predictions, and its performance was validated in the test set. Results: The ensemble classifier showed an area under the curve (AUC) of 0.81 (95% confidence interval [CI], 0.74–0.87) in the training set. In the test set, the ensemble classifier showed an AUC, accuracy, sensitivity, and specificity of 0.81 (95% CI, 0.67–0.96), 77.8%, 78.6%, and 77.3%, respectively. The ensemble classifier achieved the highest performance among all the individual models in the test set. Conclusions: Radiomic features may be useful biomarkers to predict the DA response in prolactinoma patients.
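
Soft voting, as named in the abstract above, averages the class probabilities predicted by the individual models and thresholds the mean, whereas hard voting counts thresholded votes. The sketch below illustrates the difference under stated assumptions: the three models, their probabilities, and the 0.5 threshold are hypothetical and are not taken from the study.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average per-model positive-class probabilities, then threshold the mean."""
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    mean_probs = [
        sum(model[i] for model in prob_lists) / n_models for i in range(n_samples)
    ]
    labels = [1 if p >= threshold else 0 for p in mean_probs]
    return mean_probs, labels


def hard_vote(prob_lists, threshold=0.5):
    """Threshold each model separately, then take the strict majority of votes."""
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    labels = []
    for i in range(n_samples):
        votes = sum(1 for model in prob_lists if model[i] >= threshold)
        labels.append(1 if 2 * votes > n_models else 0)
    return labels


# Hypothetical positive-class probabilities for three patients from three models.
model_a = [0.90, 0.95, 0.20]
model_b = [0.80, 0.45, 0.10]
model_c = [0.70, 0.45, 0.30]

mean_probs, soft_labels = soft_vote([model_a, model_b, model_c])
hard_labels = hard_vote([model_a, model_b, model_c])
```

For the second hypothetical patient, one very confident model outweighs two mildly negative ones under soft voting, while hard voting sides with the two-model majority. This sensitivity to confidence is the usual reason for preferring soft voting when the base models output reasonably calibrated probabilities.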


2020 ◽  
Author(s):  
Sriraksha Srinivasan ◽  
Valeria Zoni ◽  
Stefano Vanni

Peripheral membrane proteins play a major role in numerous biological processes by transiently associating with cellular membranes, often with extreme membrane specificity. Because of the short-lived nature of these interactions,...


2019 ◽  
Vol 2019 ◽  
pp. 1-13 ◽  
Author(s):  
Emine Yaman ◽  
Abdulhamit Subasi

Neuromuscular disorders are diagnosed using electromyographic (EMG) signals, and machine learning algorithms are employed as decision support systems for this diagnosis. This paper compares bagging and boosting ensemble learning methods for automatically classifying EMG signals. Although the efficacy of ensemble classifiers on real-life problems has been presented in numerous studies, almost none focus on the feasibility of bagging and boosting ensemble classifiers for diagnosing neuromuscular disorders. The purpose of this paper is therefore to assess that feasibility using EMG signals. The method has three steps. First, the wavelet packet coefficients (WPC) are calculated for every type of EMG signal. Second, statistical values of the WPC are calculated to represent the distribution of the wavelet coefficients. Third, an ensemble classifier takes the extracted features as input to diagnose the neuromuscular disorders. Experimental results showed that the ensemble classifiers achieved better performance in diagnosing neuromuscular disorders: AdaBoost with random forest achieved an accuracy of 99.08%, an F-measure of 0.99, an AUC of 1, and a kappa statistic of 0.99.
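
The first two steps described above (wavelet packet coefficients, then statistics of those coefficients) can be sketched as follows. The abstract does not say which wavelet, decomposition depth, or statistics were used, so the Haar wavelet, the two-level depth, and the two statistics here (mean absolute value and standard deviation) are assumptions made for illustration, and the sample signal is hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def haar_step(x):
    """Split a sequence into Haar approximation and detail coefficients."""
    s = sqrt(2)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet(signal, levels):
    """Full packet tree: every node is split at each level (2**levels leaves)."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def packet_features(signal, levels=2):
    """Per-leaf mean absolute value and standard deviation, concatenated."""
    features = []
    for node in wavelet_packet(signal, levels):
        features.append(mean(abs(c) for c in node))
        features.append(pstdev(node))
    return features


# Hypothetical eight-sample signal standing in for an EMG segment.
toy_signal = [4, 2, 6, 8, 3, 1, 5, 7]
toy_nodes = wavelet_packet(toy_signal, levels=2)
toy_features = packet_features(toy_signal, levels=2)
```

Unlike the ordinary discrete wavelet transform, which splits only the approximation branch, the packet transform also splits the detail branches, giving a uniform tiling of the frequency axis. The resulting feature vector (here eight values) is what an ensemble classifier would take as input in the third step.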


Author(s):  
Ishrat Nazeer ◽  
Mamoon Rashid ◽  
Sachin Kumar Gupta ◽  
Abhishek Kumar

Twitter is a platform where people express their opinions and post regular updates. It has become a data source for many organizations, which extract tweets and analyze them for sentiment. Many machine learning algorithms are available for automatically predicting the sentiment of tweets. However, several challenges prevent machine learning classifiers from achieving better classification results. In this chapter, the authors propose a novel feature generation technique to provide the desired features for model training. They then propose a novel ensemble classification system that identifies sentiment in tweets through a weighted majority rule ensemble classifier. The ensemble combines several commonly used statistical models, such as naive Bayes, random forest, and logistic regression, each weighted separately according to its performance on historical data.
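
A weighted majority rule of the kind described above can be sketched in a few lines: each model casts a vote for a label, and the votes are summed using per-model weights. The weights and predictions below are hypothetical; the chapter states only that weights are derived from each model's performance on historical data, not how they are computed.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight across models.

    predictions: dict mapping model name -> predicted label
    weights:     dict mapping model name -> non-negative weight
    Ties go to the label that was encountered first.
    """
    scores = defaultdict(float)
    for model, label in predictions.items():
        scores[label] += weights[model]
    return max(scores, key=scores.get)


# Hypothetical per-model weights (for example, derived from past accuracy).
toy_weights = {
    "naive_bayes": 0.20,
    "random_forest": 0.50,
    "logistic_regression": 0.25,
}

# Hypothetical predictions for a single tweet.
toy_predictions = {
    "naive_bayes": "negative",
    "random_forest": "positive",
    "logistic_regression": "negative",
}

toy_label = weighted_majority_vote(toy_predictions, toy_weights)
```

In this toy case two of the three models predict "negative", but their combined weight (0.45) is less than the weight of the single model predicting "positive" (0.50), so the weighted rule returns "positive" where an unweighted majority would not.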


2019 ◽  
Author(s):  
Katerina C. Nastou ◽  
Georgios N. Tsaousis ◽  
Stavros J. Hamodrakas ◽  
Vassiliki A. Iconomidou

The majority of all proteins in cells interact with membranes either permanently or temporarily. Peripheral membrane proteins form transient complexes with membrane proteins and/or lipids via non-covalent interactions and are of utmost importance because of the numerous cellular functions in which they participate. To collect data on this heterogeneous group of proteins, we designed and constructed a database called PerMemDB. PerMemDB is currently the most complete and comprehensive repository of data for eukaryotic peripheral membrane proteins deposited in UniProt or predicted with MBPpred, a computational method that specializes in detecting proteins that interact non-covalently with membrane lipids via membrane-binding domains. The first version of the database contains 241173 peripheral membrane proteins from 1216 organisms. All entries have cross-references to other databases, literature references, and annotation regarding their interactions with other proteins. In addition, the application of MBPpred provides sequence annotation of the characteristic domains that allow these proteins to interact with membranes. Through the web interface of PerMemDB, users can browse the contents of the database and submit advanced text searches and BLAST queries against the protein sequences deposited in PerMemDB. We expect this repository to serve as a source of information for the development of prediction algorithms for peripheral membrane proteins, in addition to proteome-wide analyses. Availability: http://bioinformatics.biol.uoa.gr/[email protected] Supplementary information: http://83.212.109.111:8085/assets/Nastou_Supplement.xlsx

