Pathogenicity Prediction of Single Amino Acid Variants with Machine Learning Model Based on Protein Structural Energies

Abstract: The predictive performance of a machine learning model depends heavily on its hyper-parameter settings, so hyper-parameter tuning is often indispensable. Normally, such tuning requires the model to be trained and evaluated on centralized data to obtain a performance estimate. However, in a distributed machine learning scenario, it is not always possible to collect the data from all nodes because of privacy concerns or storage limitations. Moreover, transferring data over low-bandwidth connections reduces the time available for tuning. Model-Based Optimization (MBO) is a state-of-the-art method for tuning hyper-parameters, but its application to distributed machine learning models or federated learning has received little research attention. This work proposes MODES, a framework that allows MBO to be deployed on resource-constrained distributed embedded systems. Each node trains an individual model on its local data, and the goal is to optimize the combined prediction accuracy. The framework offers two optimization modes: (1) MODES-B treats the whole ensemble as a single black box and optimizes the hyper-parameters of the individual models jointly, and (2) MODES-I treats all models as clones of the same black box, which allows the optimization to be parallelized efficiently in a distributed setting. We evaluate MODES in experiments on optimizing the hyper-parameters of a random forest and a multi-layer perceptron. The experimental results demonstrate that MODES outperforms the baseline, i.e., carrying out tuning with MBO on each node individually with its local sub-data set, improving mean accuracy (MODES-B), run-time efficiency (MODES-I), and statistical stability (both modes).
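
The "combined prediction accuracy" of an ensemble of per-node models can be illustrated with a minimal sketch. The abstract does not say how the node predictions are combined, so majority voting is an assumption here, and the function names and toy labels are hypothetical.

```python
from collections import Counter


def majority_vote(node_predictions):
    """Combine per-node predictions by majority vote (assumed aggregation rule).

    node_predictions: one list of predicted labels per node, all of equal length.
    """
    combined = []
    for votes in zip(*node_predictions):
        # Pick the label predicted by the most nodes for this sample.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy data: three nodes, four samples, binary labels (illustrative only).
node_predictions = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
true_labels = [1, 0, 1, 0]

ensemble = majority_vote(node_predictions)
ensemble_accuracy = accuracy(ensemble, true_labels)
```

In a setting like the one described, a tuner would treat `ensemble_accuracy` as the objective value returned for a given set of hyper-parameters.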

OAB-056: A machine learning model based on tumor and immune biomarkers to predict undetectable measurable residual disease (MRD) in transplant-eligible multiple myeloma (MM)

Clinical Lymphoma Myeloma & Leukemia ◽

10.1016/s2152-2650(21)02128-5 ◽

2021 ◽

Vol 21 ◽

pp. S35

Author(s):

Camila Guerrero ◽

Noemi Puig ◽

María Teresa Cedena ◽

Ibai Goicoechea ◽

Cristina Pérez ◽

...

Keyword(s):

Machine Learning ◽

Multiple Myeloma ◽

Residual Disease ◽

Learning Model ◽

Model Based ◽

Immune Biomarkers ◽

Machine Learning Model ◽

Measurable Residual Disease

Predicting the 7th Day Efficacy of Acupoint Application of Chinese Herbs (Xiao Zhong Zhi Tong Tie) in Patients with Diarrhea – A Machine-Learning Model Based on XGBoost Algorithm

World Journal of Traditional Chinese Medicine ◽

10.4103/2311-8571.326076 ◽

2021 ◽

Vol 0 (0) ◽

pp. 0

Author(s):

Feng-Qin Xu ◽

Song Sheng ◽

Rui Li ◽

Xing Wang ◽

Hong-Yang Gao ◽

...

Keyword(s):

Machine Learning ◽

Learning Model ◽

Chinese Herbs ◽

Model Based ◽

Machine Learning Model

Developing a machine learning model to identify protein–protein interaction hotspots to facilitate drug discovery

PeerJ ◽

10.7717/peerj.10381 ◽

2020 ◽

Vol 8 ◽

pp. e10381

Author(s):

Rohit Nandakumar ◽

Valentin Dinu

Keyword(s):

Machine Learning ◽

Amino Acid ◽

Drug Discovery ◽

Structural Information ◽

Learning Model ◽

Protein Protein Interaction ◽

Drug Molecules ◽

Machine Learning Model ◽

Disease Associations ◽

History Of

Throughout the history of drug discovery, an enzymatic-based approach for identifying new drug molecules has been primarily utilized. Recently, protein–protein interfaces that can be disrupted to identify small molecules that could be viable targets for certain diseases, such as cancer and the human immunodeficiency virus, have been identified. Existing studies computationally identify hotspots on these interfaces, with most models attaining accuracies of ~70%. Many studies do not effectively integrate information relating to amino acid chains and other structural information relating to the complex. Herein, (1) a machine learning model has been created and (2) its ability to integrate multiple features, such as those associated with amino-acid chains, has been evaluated to enhance the ability to predict protein–protein interface hotspots. Virtual drug screening analysis of a set of hotspots determined on the EphB2-ephrinB2 complex has also been performed. The model achieves an AUROC of 0.842, a sensitivity/recall of 0.833, and a specificity of 0.850. Virtual screening of a set of hotspots identified by the machine learning model developed in this study has identified potential medications to treat diseases caused by the overexpression of the EphB2-ephrinB2 complex, including prostate, gastric, colorectal, and melanoma cancers, which are linked to EphB2 mutations. The efficacy of this model has been demonstrated through its ability to predict drug-disease associations previously identified in the literature, including cimetidine, idarubicin, and pralatrexate for these conditions. In addition, nadolol, a beta blocker, has been identified in this study as binding to the EphB2-ephrinB2 complex, and the possibility of this drug treating multiple cancers is still relatively unexplored.
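
For readers unfamiliar with the reported metrics, sensitivity and specificity can be computed from confusion-matrix counts as in the sketch below. The counts are hypothetical, chosen only because they reproduce the reported ratios; the abstract does not give the actual counts.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity (recall): share of actual hotspots that are predicted as hotspots."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Specificity: share of actual non-hotspots that are predicted as non-hotspots."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts, NOT taken from the paper: 5 of 6 hotspots found,
# 17 of 20 non-hotspots correctly rejected.
example_sensitivity = round(sensitivity(5, 1), 3)
example_specificity = round(specificity(17, 3), 3)
```

AUROC, by contrast, depends on the ranking of predicted scores across all thresholds and cannot be recovered from a single confusion matrix.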

CAE Performance Prediction Using Machine Learning Model Based On Historical Data

10.4271/2021-26-0401 ◽

2021 ◽

Author(s):

Srinivas Patro Tangudu ◽

Praveen Rongali

Keyword(s):

Machine Learning ◽

Performance Prediction ◽

Historical Data ◽

Learning Model ◽

Model Based ◽

Machine Learning Model

Machine Learning Model Based Expert System for Pig Disease Diagnosis

Communications in Computer and Information Science - Recent Trends in Image Processing and Pattern Recognition ◽

10.1007/978-981-16-0493-5_27 ◽

2021 ◽

pp. 302-312

Author(s):

Khumukcham Robindro ◽

Ksh. Nilakanta Singh ◽

Leishangthem Sashikumar Singh

Keyword(s):

Machine Learning ◽

Expert System ◽

Disease Diagnosis ◽

Learning Model ◽

Model Based ◽

Machine Learning Model

Severity Detection for the Coronavirus Disease 2019 (COVID-19) Patients Using a Machine Learning Model Based on the Blood and Urine Tests

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2020.00683 ◽

2020 ◽

Vol 8 ◽

Cited By ~ 3

Author(s):

Haochen Yao ◽

Nan Zhang ◽

Ruochi Zhang ◽

Meiyu Duan ◽

Tianqi Xie ◽

...

Keyword(s):

Machine Learning ◽

Learning Model ◽

Model Based ◽

Machine Learning Model

Machine Learning Model - based Prediction of Flight Delay

2020 Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) ◽

10.1109/i-smac49090.2020.9243339 ◽

2020 ◽

Author(s):

N Lakshmi Kalyani ◽

G. Jeshmitha ◽

Bindu Sri Sai U. ◽

M. Samanvitha ◽

J. Mahesh ◽

...

Keyword(s):

Machine Learning ◽

Learning Model ◽

Model Based ◽

Machine Learning Model ◽

Flight Delay

Gastrointestinal Spatiotemporal mRNA Expression of Ghrelin vs Growth Hormone Receptor and New Growth Yield Machine Learning Model Based on Perturbation Theory

Scientific Reports ◽

10.1038/srep30174 ◽

2016 ◽

Vol 6 (1) ◽

Cited By ~ 7

Author(s):

Tao Ran ◽

Yong Liu ◽

Hengzhi Li ◽

Shaoxun Tang ◽

Zhixiong He ◽

...

Keyword(s):

Machine Learning ◽

Growth Hormone ◽

Perturbation Theory ◽

mRNA Expression ◽

Hormone Receptor ◽

Growth Hormone Receptor ◽

Growth Yield ◽

Learning Model ◽

Model Based ◽

Machine Learning Model

An Intelligent Rockburst Prediction Model Based on Scorecard Methodology

Minerals ◽

10.3390/min11111294 ◽

2021 ◽

Vol 11 (11) ◽

pp. 1294

Author(s):

Honglei Wang ◽

Zhenlei Li ◽

Dazhao Song ◽

Xueqiu He ◽

Aleksei Sobolev ◽

...

Keyword(s):

Machine Learning ◽

Prediction Model ◽

False Alarm ◽

False Alarm Rate ◽

Learning Model ◽

Underground Engineering ◽

Risk Levels ◽

Model Based ◽

Machine Learning Model ◽

Sample Category

Rockburst is a serious hazard in underground engineering, and accurate prediction of rockburst risk is challenging. To construct an intelligent prediction model of rockburst risk with interpretability and high accuracy, three binary scorecards predicting different risk levels of rockburst were constructed using ChiMerge, evidence weight theory, and the logistic regression algorithm. An intelligent rockburst prediction model based on scorecard methodology (IRPSC) was obtained by integrating the three scorecards. The effects of hazard sample category weights on the missed alarm rate, false alarm rate, and accuracy of the IRPSC were analyzed. Results show that the accuracy, false alarm rate, and missed alarm rate of the IRPSC for rockburst prediction in riverside hydropower stations are 75%, 12.5%, and 12.5%, respectively. Setting higher hazard sample category weights can reduce the missed alarm rate of the IRPSC, but it leads to a higher false alarm rate. The IRPSC can adaptively adjust the threshold and weight value of each indicator and converts the abstract machine learning model into a tabular form, which overcomes the common black-box problem of machine learning models and is of great significance to the application of machine learning in rockburst risk prediction.
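
The "evidence weight" step in scorecard construction is commonly known as weight of evidence (WoE). The sketch below shows one common convention, assuming the indicator has already been binned (for example by ChiMerge); the sign convention, the function name, and the toy counts are assumptions and are not taken from the paper.

```python
import math


def weight_of_evidence(bins):
    """Compute the weight of evidence for each bin of one indicator.

    bins: list of (event_count, non_event_count) pairs, one pair per bin.
    Convention assumed here: WoE = ln(share of non-events / share of events),
    so a negative value marks a bin where events are over-represented.
    """
    total_events = sum(events for events, _ in bins)
    total_non_events = sum(non_events for _, non_events in bins)
    values = []
    for events, non_events in bins:
        if events == 0 or non_events == 0:
            # ln(0) is undefined; such bins would need merging or smoothing.
            raise ValueError("each bin needs at least one event and one non-event")
        event_share = events / total_events
        non_event_share = non_events / total_non_events
        values.append(math.log(non_event_share / event_share))
    return values


# Toy counts per bin: (rockburst samples, non-rockburst samples). Illustrative only.
toy_bins = [(10, 30), (20, 20), (30, 10)]
woe_values = weight_of_evidence(toy_bins)
```

In a typical scorecard, each raw indicator value is replaced by the WoE of its bin before logistic regression is fitted, and the fitted coefficients are then scaled into tabulated points.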
