XGBoost and Network Analysis for Prediction of Proteins Affecting Insulin based on Protein Protein Interactions

Protein-protein interaction (PPI) analysis can be used to identify proteins that support the function of a main protein, especially during synthesis. Insulin is synthesized by proteins that share the same molecular function while covering different but mutually supportive roles. To identify this function, Gene Ontology (GO) annotation assigns certain characteristics to each protein. This study aims to predict proteins that interact with insulin, using the centrality method as a feature extractor and extreme gradient boosting (XGBoost) as the classification algorithm. Characterizing each protein with the centrality method yields features that describe its central function. Classification results are evaluated using precision, recall and ROC scores. Optimizing the model by finding the right parameters produces an accuracy of and a ROC score of . The XGBoost prediction model performs above the average of the other machine learning methods.
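
As a rough illustration of centrality used as a feature extractor, the sketch below computes degree and closeness centrality for each node of a toy undirected network in plain Python. The protein names, the edge list and the choice of these two centrality measures are assumptions made for the example; the abstract does not state which measures or data the study used, and the XGBoost classification step is not shown.

```python
from collections import deque

# Toy undirected PPI network; names and edges are hypothetical placeholders.
edges = [("INS", "P1"), ("INS", "P2"), ("INS", "P3"), ("P1", "P2"), ("P3", "P4")]


def build_adjacency(edge_list):
    """Build an adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for a, b in edge_list:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """Reachable-node count divided by the summed shortest-path distances.

    Distances come from a breadth-first search; the toy graph is connected,
    so this equals (n - 1) / sum of distances.
    """
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


adj = build_adjacency(edges)
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)

# One feature vector per protein: [degree centrality, closeness centrality].
features = {p: [degree[p], closeness[p]] for p in sorted(adj)}
```

Each row of `features` could then be passed, together with a label, to a classifier such as XGBoost.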

Predicting Protein-Protein Interactions based on Biological Information using Extreme Gradient Boosting

2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB) ◽

10.1109/cibcb.2019.8791241 ◽

2019 ◽

Cited By ~ 1

Author(s):

Jerome Cary Beltran ◽

Paolo Valdez ◽

Prospero Naval

Keyword(s):

Protein Interactions ◽

Biological Information ◽

Gradient Boosting ◽

Protein Protein Interactions ◽

Extreme Gradient Boosting

A Deep Learning and XGBoost-Based Method for Predicting Protein-Protein Interaction Sites

Frontiers in Genetics ◽

10.3389/fgene.2021.752732 ◽

2021 ◽

Vol 12 ◽

Author(s):

Pan Wang ◽

Guiyang Zhang ◽

Zu-Guo Yu ◽

Guohua Huang

Keyword(s):

Deep Learning ◽

Protein Interaction ◽

Protein Interactions ◽

Gradient Boosting ◽

Protein Protein Interactions ◽

Global Features ◽

Protein Protein Interaction ◽

Interaction Sites ◽

Extreme Gradient Boosting ◽

Protein Interaction Sites

Knowledge about protein-protein interactions is beneficial in understanding cellular mechanisms. Protein-protein interactions are usually determined according to their protein-protein interaction sites. Due to the limitations of current techniques, it is still a challenging task to detect protein-protein interaction sites. In this article, we presented a method based on deep learning and XGBoost (called DeepPPISP-XGB) for predicting protein-protein interaction sites. The deep learning model served as a feature extractor to remove redundant information from protein sequences. The Extreme Gradient Boosting algorithm was used to construct a classifier for predicting protein-protein interaction sites. The DeepPPISP-XGB achieved the following results: area under the receiver operating characteristic curve of 0.681, a recall of 0.624, and area under the precision-recall curve of 0.339, being competitive with the state-of-the-art methods. We also validated the positive role of global features in predicting protein-protein interaction sites.
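
For readers unfamiliar with the reported metrics, the following plain-Python sketch shows how recall and the area under the ROC curve can be computed from labels and predicted scores. The labels, scores and the 0.5 decision threshold are made-up values for illustration and are unrelated to the DeepPPISP-XGB results.

```python
def recall(y_true, y_pred):
    """True positives divided by all actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def auroc(y_true, scores):
    """Area under the ROC curve via the pairwise-ranking formulation.

    It is the probability that a randomly chosen positive receives a higher
    score than a randomly chosen negative, with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical residue labels (1 = interaction site) and classifier scores.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

site_recall = recall(y_true, y_pred)
site_auroc = auroc(y_true, scores)
```

The pairwise loop is quadratic in the number of samples, which is fine for a demonstration but would normally be replaced by a sort-based computation on real data.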

Classification of Hot Spots using XGBoost and LightGBM Algorithms

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.e9459.069520 ◽

2020 ◽

Vol 9 (5) ◽

pp. 722-724

Keyword(s):

Computational Methods ◽

Protein Interactions ◽

Hot Spots ◽

Cell Metabolism ◽

Pearson Correlation ◽

Classification Performance ◽

Gradient Boosting ◽

Support Vector ◽

Extreme Gradient Boosting ◽

Hub Proteins

Protein-protein interactions (PPIs) play a significant role in biological functions such as cell metabolism, immune response and signal transduction. Hot spots are small fractions of interface residues that provide substantial binding energy in PPIs. Identifying hot spots is therefore important for discovering and analyzing molecular medicines and diseases. The current strategy, alanine scanning, is not suited to large-scale applications because the technique is costly and time-consuming. The existing computational methods are poor in both classification performance and prediction accuracy; they are concerned with the topological structure and gene expression of hub proteins. The proposed system focuses on the hot spots of hub proteins, eliminating redundant and highly correlated features using the Pearson correlation coefficient and support vector machine based feature elimination. Extreme Gradient Boosting and LightGBM algorithms are used to combine a set of weak classifiers into a strong classifier. The proposed system shows better accuracy than the existing computational methods. The model can also be used to predict accurate molecular inhibitors for specific PPIs.
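
The correlation-based part of the feature elimination described above can be sketched in plain Python as follows. The feature names, values, the 0.9 threshold and the greedy keep-first strategy are assumptions for illustration; the abstract does not give these details, and the support vector machine based elimination step is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (sd_x * sd_y)


def drop_correlated(columns, threshold=0.9):
    """Greedily keep a feature only if |r| with every kept feature is below threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical residue features; f2 is nearly a multiple of f1.
columns = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 3.9, 6.2, 8.0, 9.9],
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_correlated(columns)
```

Because the procedure is greedy, which member of a correlated pair survives depends on column order; a real pipeline might instead keep the feature more strongly related to the label.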

Network Protein Interaction in Parkinson's Disease and Periodontitis Interplay: A Bioinformatic Analysis

10.20944/preprints202009.0050.v1 ◽

2020 ◽

Author(s):

João Botelho ◽

Paulo Mascarenhas ◽

José João Mendes ◽

Vanessa Machado

Keyword(s):

Protein Interaction ◽

Protein Interactions ◽

Molecular Mechanisms ◽

Interaction Analysis ◽

Bioinformatic Analysis ◽

Comprehensive Analysis ◽

Protein Network ◽

Protein Protein Interactions ◽

Network Interaction ◽

Genes Encoding

Recent studies have supported a clinical association between Parkinson’s Disease (PD) and periodontitis. Hence, investigating possible protein interactions between these two conditions is of interest. In this study, we conducted a protein-protein interaction network analysis using recognized genes encoding proteins for PD and periodontitis. Genes of interest were collected via the GWAS database. We then conducted a protein interaction analysis using the STRING database, with a highest-confidence cut-off of 0.9. Our protein network provided a comprehensive analysis of potential protein-protein interactions between PD and periodontitis. This analysis may provide valuable information on new candidate molecular mechanisms linking PD and periodontitis and may offer new potential targets for research purposes. These results should be interpreted carefully given the limitations of this approach.
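
The confidence cut-off step can be pictured with a small plain-Python sketch that keeps only interactions scoring at least 0.9 and then selects those linking the two gene sets. All gene names and scores below are hypothetical placeholders, and the scores are assumed to be already scaled to the range 0 to 1; this is not a call to the STRING service.

```python
# Hypothetical (protein_a, protein_b, combined_score) records, scores in [0, 1].
interactions = [
    ("PD_A", "PERIO_X", 0.95),
    ("PD_A", "PD_B", 0.99),
    ("PD_B", "PERIO_Y", 0.72),
    ("PERIO_X", "PERIO_Y", 0.91),
]
pd_genes = {"PD_A", "PD_B"}            # placeholder PD-associated genes
perio_genes = {"PERIO_X", "PERIO_Y"}   # placeholder periodontitis genes


def filter_by_confidence(edges, cutoff=0.9):
    """Keep only interactions whose score meets the confidence cut-off."""
    return [(a, b, s) for a, b, s in edges if s >= cutoff]


def cross_condition(edges, set_a, set_b):
    """Keep only interactions with one partner in each gene set."""
    return [
        (a, b, s)
        for a, b, s in edges
        if (a in set_a and b in set_b) or (a in set_b and b in set_a)
    ]


high_conf = filter_by_confidence(interactions)
bridging = cross_condition(high_conf, pd_genes, perio_genes)
```

In this toy data only one high-confidence interaction bridges the two conditions; such bridging edges are the kind of candidate link the analysis looks for.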

The prediction of molecule atomization energy using neural network and extreme gradient boosting

Journal of Physics Conference Series ◽

10.1088/1742-6596/2072/1/012005 ◽

2021 ◽

Vol 2072 (1) ◽

pp. 012005

Author(s):

M Sumanto ◽

M A Martoprawiro ◽

A L Ivansyah

Keyword(s):

Neural Network ◽

Machine Learning ◽

Atomization Energy ◽

Gradient Boosting ◽

Intelligence System ◽

The Neural Network ◽

Extreme Gradient Boosting ◽

Boosting Method ◽

The Right ◽

Parameter Values

Machine learning is a form of artificial intelligence in which a system learns automatically from experience without being explicitly programmed. The learning process starts by observing the data and then looking for patterns in it, with the aim of making computers learn automatically. In this study, we use machine learning to predict molecular atomization energy. Of the various machine learning methods, we use two: a neural network and Extreme Gradient Boosting. Both methods have several parameters that must be adjusted so that the predicted atomization energy of a molecule has the lowest possible error, and we try to find the right parameter values for both. For the neural network, finding the right parameter values is quite difficult because training the model takes a long time before one can tell whether it is good or bad. For Extreme Gradient Boosting, training takes less time, so finding the right parameter values is comparatively easy. This study also examined the effects of modifying the dataset: transforming the output by normalization and standardization, removing molecules containing Br atoms, and setting an entry of the Coulomb matrix to 0 if the distance between the corresponding atoms exceeds 2 angstrom.
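
The Coulomb-matrix modification mentioned at the end of the abstract can be sketched as below. The sketch uses the commonly cited definition (diagonal entries 0.5·Z^2.4, off-diagonal entries Z_i·Z_j divided by the interatomic distance) and zeroes any off-diagonal entry whose distance exceeds a cut-off. The three-atom geometry is invented for illustration, and no unit conversion is performed: the standard definition is usually stated in atomic units, so treating the coordinates and the cut-off as being in the same unit is an assumption of this example.

```python
import math


def coulomb_matrix(charges, coords, cutoff=None):
    """Coulomb matrix with an optional distance cut-off on off-diagonal entries.

    charges: nuclear charges Z_i
    coords:  (x, y, z) positions, all in the same length unit as cutoff
    cutoff:  if given, entries for atom pairs farther apart than this are 0
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                d = math.dist(coords[i], coords[j])
                if cutoff is None or d <= cutoff:
                    matrix[i][j] = charges[i] * charges[j] / d
    return matrix


# Invented linear arrangement: pair distances are 1, 2 and 3 length units.
charges = [8, 1, 1]
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

full = coulomb_matrix(charges, coords)
truncated = coulomb_matrix(charges, coords, cutoff=2.0)
```

With the cut-off applied, only the entry for the pair separated by 3 units is set to zero; the pair at exactly 2 units is kept because the abstract speaks of distances that exceed the threshold.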

Explainable Artificial Intelligence for Sarcasm Detection in Dialogues

Wireless Communications and Mobile Computing ◽

10.1155/2021/2939334 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Akshi Kumar ◽

Shubham Dikshit ◽

Victor Hugo C. Albuquerque

Keyword(s):

Language Processing ◽

Learning Algorithm ◽

Real Life ◽

Decision Makers ◽

Gradient Boosting ◽

Trained Classifier ◽

Extreme Gradient Boosting ◽

Interpretable Model ◽

The Right ◽

Post Hoc

Sarcasm detection in dialogues has been gaining popularity among natural language processing (NLP) researchers with the increased use of conversational threads on social media. Capturing the knowledge of the domain of discourse, context propagation during the course of dialogue, and situational context and tone of the speaker are some important features to train the machine learning models for detecting sarcasm in real time. As situational comedies vibrantly represent human mannerism and behaviour in everyday real-life situations, this research demonstrates the use of an ensemble supervised learning algorithm to detect sarcasm in the benchmark dialogue dataset, MUStARD. The punch-line utterance and its associated context are taken as features to train the eXtreme Gradient Boosting (XGBoost) method. The primary goal is to predict sarcasm in each utterance of the speaker using the chronological nature of a scene. Further, it is vital to prevent model bias and help decision makers understand how to use the models in the right way. Therefore, as a twin goal of this research, we make the learning model used for conversational sarcasm detection interpretable. This is done using two post hoc interpretability approaches, Local Interpretable Model-agnostic Explanations (LIME) and Shapley Additive exPlanations (SHAP), to generate explanations for the output of a trained classifier. The classification results clearly depict the importance of capturing the intersentence context to detect sarcasm in conversational threads. The interpretability methods show the words (features) that influence the decision of the model the most and help the user understand how the model is making the decision for detecting sarcasm in dialogues.
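
To give a feel for what a Shapley-value explanation assigns to each word, the sketch below computes exact Shapley values by enumerating all feature subsets for a tiny hand-written scoring function. The words, the scoring function and its weights are hypothetical; this is neither the SHAP library nor the paper's XGBoost model, only the underlying attribution idea on a toy case.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every subset of the other features.

    value_fn maps a frozenset of present features to a model score.
    The cost is exponential in the number of features, so this is only
    practical for toy inputs.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_score(present):
    """Hypothetical sarcasm score: two word effects plus one interaction."""
    score = 0.0
    if "oh" in present:
        score += 0.1
    if "great" in present:
        score += 0.3
    if "oh" in present and "great" in present:
        score += 0.4  # the pair together is more sarcastic than either alone
    return score


words = ["oh", "great", "monday"]
phi = shapley_values(toy_score, words)
```

The interaction term is split evenly between the two words that produce it, the unused word receives zero, and the attributions sum to the difference between the score with all words present and the score with none.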

An Explanation Framework for Interpretable Credit Scoring

International Journal of Artificial Intelligence & Applications ◽

10.5121/ijaia.2021.12102 ◽

2021 ◽

Vol 12 (1) ◽

pp. 19-38

Author(s):

Lara Marie Demajo ◽

Vince Vella ◽

Alexiei Dingli

Keyword(s):

Credit Scoring ◽

Imbalanced Data ◽

Scoring Systems ◽

Gradient Boosting ◽

Home Equity ◽

General Data Protection Regulation ◽

Box Models ◽

Extreme Gradient Boosting ◽

Feature Based ◽

The Right

With the recent surge of enthusiasm for Artificial Intelligence (AI) and Financial Technology (FinTech), applications such as credit scoring have gained substantial academic interest. However, despite the ever-growing achievements, the biggest obstacle in most AI systems is their lack of interpretability. This lack of transparency limits their application in different domains, including credit scoring. Credit scoring systems help financial experts make better decisions regarding whether or not to accept a loan application so that loans with a high probability of default are not accepted. Apart from the noisy and highly imbalanced data challenges faced by such credit scoring models, recent regulations such as the 'right to explanation' introduced by the General Data Protection Regulation (GDPR) and the Equal Credit Opportunity Act (ECOA) have added the need for model interpretability to ensure that algorithmic decisions are understandable and coherent. A recently introduced concept is eXplainable AI (XAI), which focuses on making black-box models more interpretable. In this work, we present a credit scoring model that is both accurate and interpretable. For classification, state-of-the-art performance on the Home Equity Line of Credit (HELOC) and Lending Club (LC) Datasets is achieved using the Extreme Gradient Boosting (XGBoost) model. The model is then further enhanced with a 360-degree explanation framework, which provides different explanations (i.e. global, local feature-based and local instance-based) that are required by different people in different situations. Evaluation through the use of functionally-grounded, application-grounded and human-grounded analysis shows that the explanations provided are simple and consistent as well as correct, effective, easy to understand, sufficiently detailed and trustworthy.

Protein interactions with nitric oxide synthases: controlling the right time, the right place, and the right amount of nitric oxide

AJP Renal Physiology ◽

10.1152/ajprenal.00048.2003 ◽

2003 ◽

Vol 285 (2) ◽

pp. F178-F190 ◽

Cited By ~ 179

Author(s):

Bruce C. Kone ◽

Teresa Kuncewicz ◽

Wenzheng Zhang ◽

Zhi-Yuan Yu

Keyword(s):

Nitric Oxide ◽

Protein Interactions ◽

Biological Effects ◽

Pdz Domain ◽

Membrane Receptors ◽

Heterologous Proteins ◽

Protein Protein Interactions ◽

Potential Implication ◽

The Right ◽

Nos Isoforms

Nitric oxide (NO) is a potent cell-signaling, effector, and vasodilator molecule that plays important roles in diverse biological effects in the kidney, vasculature, and many other tissues. Because of its high biological reactivity and diffusibility, multiple tiers of regulation, ranging from transcriptional to posttranslational controls, tightly control NO biosynthesis. Interactions of each of the major NO synthase (NOS) isoforms with heterologous proteins have emerged as a mechanism by which the activity, spatial distribution, and proximity of the NOS isoforms to regulatory proteins and intended targets are governed. Dimerization of the NOS isozymes, required for their activity, exhibits distinguishing features among these proteins and may serve as a regulated process and target for therapeutic intervention. An increasingly wide array of proteins, ranging from scaffolding proteins to membrane receptors, has been shown to function as NOS-binding partners. Neuronal NOS interacts via its PDZ domain with several PDZ-domain proteins. Several resident and recruited proteins of plasmalemmal caveolae, including caveolins, anchoring proteins, G protein-coupled receptors, kinases, and molecular chaperones, modulate the activity and trafficking of endothelial NOS in the endothelium. Inducible NOS (iNOS) interacts with the inhibitory molecules kalirin and NOS-associated protein 110 kDa, as well as activator proteins, the Rac GTPases. In addition, protein-protein interactions of proteins governing iNOS transcription function to specify activation or suppression of iNOS induction by cytokines. The calpain and ubiquitin-proteasome pathways are the major proteolytic systems responsible for the regulated degradation of NOS isozymes. The experimental basis for these protein-protein interactions, their functional importance, and potential implication for renal and vascular physiology and pathophysiology is reviewed.

Interaction Analysis through Proteomic Phage Display

BioMed Research International ◽

10.1155/2014/176172 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 28

Author(s):

Gustav N. Sundell ◽

Ylva Ivarsson

Keyword(s):

Phage Display ◽

Protein Interactions ◽

Interaction Analysis ◽

Peptide Binding ◽

Open Reading Frames ◽

Protein Protein Interactions ◽

Biological Relevance ◽

Binding Domains ◽

Genome Wide ◽

Phage Libraries

Phage display is a powerful technique for profiling specificities of peptide binding domains. The method is suited for the identification of high-affinity ligands with inhibitor potential when using highly diverse combinatorial peptide phage libraries. Such experiments further provide consensus motifs for genome-wide scanning of ligands of potential biological relevance. A complementary but considerably less explored approach is to display expression products of genomic DNA, cDNA, open reading frames (ORFs), or oligonucleotide libraries designed to encode defined regions of a target proteome on phage particles. One of the main applications of such proteomic libraries has been the elucidation of antibody epitopes. This review is focused on the use of proteomic phage display to uncover protein-protein interactions of potential relevance for cellular function. The method is particularly suited for the discovery of interactions between peptide binding domains and their targets. We discuss the largely unexplored potential of this method in the discovery of domain-motif interactions of potential biological relevance.

Explainable AI for Interpretable Credit Scoring

10.5121/csit.2020.101516 ◽

2020 ◽

Author(s):

Lara Marie Demajo ◽

Vince Vella ◽

Alexiei Dingli

Keyword(s):

Credit Scoring ◽

Imbalanced Data ◽

Gradient Boosting ◽

Home Equity ◽

General Data Protection Regulation ◽

Box Models ◽

Extreme Gradient Boosting ◽

Explainable Ai ◽

Feature Based ◽

The Right

With the ever-growing achievements in Artificial Intelligence (AI) and the recent surge of enthusiasm for Financial Technology (FinTech), applications such as credit scoring have gained substantial academic interest. Credit scoring helps financial experts make better decisions regarding whether or not to accept a loan application, such that loans with a high probability of default are not accepted. Apart from the noisy and highly imbalanced data challenges faced by such credit scoring models, recent regulations such as the 'right to explanation' introduced by the General Data Protection Regulation (GDPR) and the Equal Credit Opportunity Act (ECOA) have added the need for model interpretability to ensure that algorithmic decisions are understandable and coherent. An interesting concept that has been recently introduced is eXplainable AI (XAI), which focuses on making black-box models more interpretable. In this work, we present a credit scoring model that is both accurate and interpretable. For classification, state-of-the-art performance on the Home Equity Line of Credit (HELOC) and Lending Club (LC) Datasets is achieved using the Extreme Gradient Boosting (XGBoost) model. The model is then further enhanced with a 360-degree explanation framework, which provides different explanations (i.e. global, local feature-based and local instance-based) that are required by different people in different situations. Evaluation through the use of functionally-grounded, application-grounded and human-grounded analysis shows that the explanations provided are simple and consistent, and that they satisfy the six predetermined hypotheses testing for correctness, effectiveness, easy understanding, detail sufficiency and trustworthiness.
