Machine-Learning-Enabled Virtual Screening for Inhibitors of Lysine-Specific Histone Demethylase 1

A machine learning approach has been applied to virtual screening for lysine-specific demethylase 1 (LSD1) inhibitors. LSD1 is an important anti-cancer target. Machine learning models to predict activity were constructed using Morgan molecular fingerprints. The dataset, consisting of 931 molecules with LSD1 inhibition activity, was obtained from the ChEMBL database. An evaluation of several candidate algorithms on the main dataset revealed that the support vector regressor gave the best model, with a coefficient of determination (R²) of 0.703. Virtual screening using this model identified five predicted potent inhibitors from the ZINC database comprising more than 300,000 molecules. The virtual screening recovered a known inhibitor, RN1, as well as four compounds whose activity against LSD1 had not previously been suggested. Thus, we performed a machine-learning-enabled virtual screening of LSD1 inhibitors using only the structural information of the molecules.
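The coefficient of determination quoted in this abstract can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The following is a minimal sketch in plain Python; the function name `r_squared` is an illustrative assumption, and the abstract does not say which data split the reported value was computed on.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot
```

A model that always predicts the mean of the true values scores 0, and a perfect model scores 1, so a value of 0.703 sits between those two reference points.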

Machine Learning for Sensorless Temperature Estimation of a BLDC Motor

Sensors ◽

10.3390/s21144655 ◽

2021 ◽

Vol 21 (14) ◽

pp. 4655

Author(s):

Dariusz Czerwinski ◽

Jakub Gęca ◽

Krzysztof Kolano

Keyword(s):

Machine Learning ◽

Temperature Measurement ◽

Stochastic Gradient Descent ◽

Estimation Accuracy ◽

Coefficient Of Determination ◽

Percentage Error ◽

Support Vector ◽

Bldc Motor ◽

Temperature Estimation ◽

Motor Operation

In this article, the authors propose two models for BLDC motor winding temperature estimation using machine learning methods. For the purposes of the research, measurements covering more than 160 h of motor operation were collected and then preprocessed. The algorithms of linear regression, ElasticNet, stochastic gradient descent regressor, support vector machines, decision trees, and AdaBoost were used for predictive modeling. The models' ability to generalize was obtained through hyperparameter tuning with cross-validation. The research yielded promising winding temperature estimation accuracy. In the case of sensorless temperature prediction (model 1), the mean absolute percentage error (MAPE) was below 4.5% and the coefficient of determination (R²) was above 0.909. In addition, extending the model with a temperature measurement on the casing (model 2) reduced the error to about 1% and increased R² to 0.990. The results obtained for the first proposed model show that the overheating protection of the motor can be ensured without direct temperature measurement. In addition, the introduction of a simple casing temperature measurement system allows for an estimation with accuracy suitable for compensating the motor output torque changes related to temperature.
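The mean absolute percentage error reported here averages the absolute relative error over all samples and expresses it as a percentage. A minimal sketch follows; the function name `mape` is an assumption, and the abstract does not state how temperatures at or near zero were handled, so this version simply rejects them.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # The relative error is undefined for a zero reference value.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

Because each error is divided by the measured value, the result depends on the temperature scale used: the same absolute error gives a different MAPE in kelvin than in degrees Celsius.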

Prediction of Healing Performance of Autogenous Healing Concrete Using Machine Learning

Materials ◽

10.3390/ma14154068 ◽

2021 ◽

Vol 14 (15) ◽

pp. 4068

Author(s):

Xu Huang ◽

Mirna Wasouf ◽

Jessada Sresakoolchai ◽

Sakdirat Kaewunruen

Keyword(s):

Machine Learning ◽

Search Algorithm ◽

Weather Conditions ◽

Prediction Performance ◽

Machine Learning Algorithms ◽

Coefficient Of Determination ◽

Gradient Boosting ◽

Support Vector ◽

Self Healing ◽

Artificial Neural Network Ann

Cracks typically develop in concrete due to shrinkage, loading actions, and weather conditions, and may occur at any time in its life span. Autogenous healing concrete is a type of self-healing concrete that can automatically heal cracks based on physical or chemical reactions in the concrete matrix. It is imperative to investigate the healing performance that autogenous healing concrete possesses, to assess the extent of the cracking and to predict the extent of healing. In the research of self-healing concrete, testing the healing performance of concrete in a laboratory is costly, and a large number of instances may be needed to explore reliable concrete design. This study is thus the world’s first to establish six types of machine learning algorithms capable of predicting the healing performance (HP) of self-healing concrete. These algorithms are an artificial neural network (ANN), k-nearest neighbours (kNN), gradient boosting regression (GBR), decision tree regression (DTR), support vector regression (SVR), and random forest (RF). Parameters of these algorithms are tuned using a grid search algorithm (GSA) and a genetic algorithm (GA). The prediction performance of these algorithms, measured by the coefficient of determination (R²) and root mean square error (RMSE), is evaluated on 1417 data sets from the open literature. The results show that GSA-GBR achieves higher prediction performance (R² = 0.958 for GSA-GBR) and stronger robustness (RMSE = 0.202 for GSA-GBR) than the other five types of algorithms employed to predict the healing performance of autogenous healing concrete. Therefore, reliable prediction accuracy of the healing performance and efficient assistance on the design of autogenous healing concrete can be achieved.
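Grid search, one of the two tuning strategies named in this abstract, evaluates every combination of candidate hyperparameter values and keeps the one with the best validation score. The sketch below is only an illustration under stated assumptions: it tunes the neighbour count `k` of a toy one-dimensional k-nearest-neighbours regressor by validation RMSE, and all data, names, and candidate values are hypothetical, not taken from the study. The genetic algorithm variant is not shown.

```python
import itertools
import math

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

def grid_search(param_grid, evaluate):
    """Try every combination in param_grid; return the lowest-scoring one."""
    names = sorted(param_grid)
    best_params, best_score = None, math.inf
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical toy data: the target equals the input.
TRAIN_X = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
TRAIN_Y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
VAL_X = [0.9, 2.1, 4.0]
VAL_Y = [0.9, 2.1, 4.0]

def validation_rmse(k):
    """Score one hyperparameter setting on the held-out validation points."""
    preds = [knn_predict(TRAIN_X, TRAIN_Y, x, k) for x in VAL_X]
    return rmse(VAL_Y, preds)

best_params, best_score = grid_search({"k": [1, 2, 5]}, validation_rmse)
```

The cost of grid search grows with the product of the candidate list lengths, which is one common reason to pair it with a stochastic search such as a genetic algorithm.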

How can machine-learning methods assist in virtual screening for hyperuricemia? A healthcare machine-learning approach

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2016.09.012 ◽

2016 ◽

Vol 64 ◽

pp. 20-24 ◽

Cited By ~ 26

Author(s):

Daisuke Ichikawa ◽

Toki Saito ◽

Waka Ujita ◽

Hiroshi Oyama

Keyword(s):

Machine Learning ◽

Virtual Screening ◽

Learning Approach ◽

Learning Methods ◽

Machine Learning Methods ◽

Machine Learning Approach

Distribution Grids Fault Location employing ST based Optimized Machine Learning Approach

Energies ◽

10.3390/en11092328 ◽

2018 ◽

Vol 11 (9) ◽

pp. 2328 ◽

Cited By ~ 12

Author(s):

Md Shafiullah ◽

M. Abido ◽

Taher Abdel-Fattah

Keyword(s):

Machine Learning ◽

Fault Location ◽

Percentage Error ◽

Support Vector ◽

Learning Approach ◽

Efficiency Coefficient ◽

Learning Tools ◽

Performance Indices ◽

Machine Learning Approach ◽

Distribution Grids

Precise fault location information plays a vital role in expediting the restoration process after any kind of fault in power distribution grids. This paper proposed a Stockwell transform (ST) based optimized machine learning approach to locate faults and identify the faulty sections in distribution grids. This research employed the ST to extract useful features from the recorded three-phase current signals and fed them as inputs to different machine learning tools (MLT), including multilayer perceptron neural networks (MLP-NN), support vector machines (SVM), and extreme learning machines (ELM). The proposed approach employed the constriction-factor particle swarm optimization (CF-PSO) technique to optimize the parameters of the SVM and ELM for better generalization performance. It then compared the results obtained on the test datasets in terms of selected statistical performance indices, including the root mean squared error (RMSE), mean absolute percentage error (MAPE), percent bias (PBIAS), RMSE-observations to standard deviation ratio (RSR), coefficient of determination (R²), Willmott’s index of agreement (WIA), and Nash–Sutcliffe model efficiency coefficient (NSEC), to confirm the effectiveness of the developed fault location scheme. The satisfactory values of the statistical performance indices indicated the superiority of the optimized machine learning tools over the non-optimized tools in locating faults. In addition, this research confirmed the efficacy of the faulty section identification scheme based on overall accuracy. Furthermore, the presented results validated the robustness of the developed approach against measurement noise and uncertainties associated with pre-fault loading condition, fault resistance, and inception angle.
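Three of the performance indices listed in this abstract can be sketched in a few lines of plain Python using their commonly cited definitions. The function names are assumptions, and the sign convention for PBIAS varies between sources: the version here uses observed minus predicted, which may differ from the paper's choice.

```python
import math

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SS_res / SS_obs (1 is a perfect fit)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_obs

def rsr(observed, predicted):
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    return math.sqrt(ss_res) / math.sqrt(ss_obs)

def pbias(observed, predicted):
    """Percent bias, assuming the (observed - predicted) sign convention."""
    return 100.0 * sum(o - p for o, p in zip(observed, predicted)) / sum(observed)
```

With these definitions the square of RSR equals one minus NSE, so the two indices carry the same information on different scales; PBIAS adds the direction of the average error, which neither of them captures.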

Hybrid Machine Learning Approach for Skin Disease Detection Using Optimal Support Vector Machine

Intelligent Data Communication Technologies and Internet of Things - Lecture Notes on Data Engineering and Communications Technologies ◽

10.1007/978-3-030-34080-3_73 ◽

2019 ◽

pp. 647-658

Author(s):

K. Melbin ◽

Y. Jacob Vetha Raj

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Skin Disease ◽

Support Vector ◽

Disease Detection ◽

Learning Approach ◽

Machine Learning Approach ◽

Hybrid Machine

A machine learning approach to predict pancreatic islet grafts rejection versus tolerance

PLoS ONE ◽

10.1371/journal.pone.0241925 ◽

2020 ◽

Vol 15 (11) ◽

pp. e0241925

Author(s):

Gerardo A. Ceballos ◽

Luis F. Hernandez ◽

Daniel Paredes ◽

Luis R. Betancourt ◽

Midhat H. Abdulreda

Keyword(s):

Machine Learning ◽

Pancreatic Islet ◽

Support Vector ◽

Medical Diagnoses ◽

Laser Induced Fluorescence Detection ◽

Classification Score ◽

New Information ◽

Islet Allografts ◽

Machine Learning Approach ◽

Positive Classification

The application of artificial intelligence (AI) and machine learning (ML) in biomedical research promises to unlock new information from the vast amounts of data being generated through the delivery of healthcare and the expanding high-throughput research applications. Such information can aid medical diagnoses and reveal various unique patterns of biochemical and immune features that can serve as early disease biomarkers. In this report, we demonstrate the feasibility of using an AI/ML approach in a relatively small dataset to discriminate among three categories of samples obtained from mice that either rejected or tolerated their pancreatic islet allografts following transplant in the anterior chamber of the eye, and from naïve controls. We created locked software based on a support vector machine (SVM) technique for pattern recognition in electropherograms (EPGs) generated by micellar electrokinetic chromatography and laser-induced fluorescence detection (MEKC-LIFD). Predictions were made based only on the aligned EPGs obtained in microliter-size aqueous humor samples representative of the immediate local microenvironment of the islet allografts. The analysis identified discriminative peaks in the EPGs of the three sample categories. Our classifier software was tested with targeted and untargeted peaks. Working with the patterns of untargeted peaks (i.e., based on the whole pattern of EPGs), it achieved a 21 out of 22 positive classification score, corresponding to a 95.45% prediction accuracy among the three sample categories, and 100% accuracy between the rejecting and tolerant recipients. These findings demonstrate the feasibility of AI/ML approaches to classify small numbers of samples, and they warrant further studies to identify the analytes/biochemicals corresponding to discriminative features as potential biomarkers of islet allograft immune rejection and tolerance.
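The 95.45% figure follows directly from the 21 out of 22 correct classifications stated in the abstract. A short check, with `accuracy_percent` as an illustrative helper name:

```python
def accuracy_percent(correct, total):
    """Share of correctly classified samples, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total

# 21 correct out of 22 samples, as reported in the abstract.
reported = f"{accuracy_percent(21, 22):.2f}%"  # "95.45%"
```

With only 22 samples, a single misclassification shifts the accuracy by about 4.5 percentage points, which is worth keeping in mind when reading the headline number.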

Detecting Controversial Articles on Citizen Journalism

Jurnal Ilmu Komputer dan Informasi ◽

10.21609/jiki.v11i1.478 ◽

2018 ◽

Vol 11 (1) ◽

pp. 34

Author(s):

Alfan Farizki Wicaksono ◽

Sharon Raissa Herdiyana ◽

Mirna Adriani

Keyword(s):

Machine Learning ◽

Structural Information ◽

Structural Features ◽

The Body ◽

Supervised Machine Learning ◽

Citizen Journalism ◽

Learning Approach ◽

Daily News ◽

Machine Learning Approach ◽

Controversial Topic

A person's understanding of and stance on a particular controversial topic can be influenced by the news or articles they consume every day. Unfortunately, readers usually do not realize that they are reading controversial articles. In this paper, we address the problem of automatically detecting controversial articles from citizen journalism media. To solve the problem, we employ a supervised machine learning approach with several hand-crafted features that exploit linguistic information, metadata of an article, structural information in the commentary section, and sentiment expressed inside the body of an article. The experimental results show that our proposed method performs the addressed task effectively. The best performance so far is achieved when we use all proposed features with Logistic Regression as our model (82.89% accuracy). Moreover, we found that information from the commentary section (structural features) contributes most to the classification task.

Arabic English Cross-Lingual Plagiarism Detection Based on Keyphrases Extraction, Monolingual and Machine Learning Approach

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2018/v2i330075 ◽

2019 ◽

pp. 1-12

Author(s):

Mokhtar Al-Suhaiqi ◽

Muneer A. S. Hazaa ◽

Mohammed Albared

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Detection Methods ◽

Support Vector ◽

Svm Classifier ◽

Learning Approach ◽

Plagiarism Detection ◽

Machine Learning Approach ◽

Cross Lingual ◽

Cross Language

Due to the rapid growth of research articles in various languages, the cross-lingual plagiarism detection problem has received increasing interest in recent years. Cross-lingual plagiarism detection is a more challenging task than monolingual plagiarism detection. This paper addresses the problem of cross-lingual plagiarism detection (CLPD) by proposing a method that combines keyphrase extraction, monolingual detection methods, and a machine learning approach. The research methodology used in this study made it possible to design, develop, and implement an efficient Arabic-English cross-lingual plagiarism detection method. This paper empirically evaluates five different monolingual plagiarism detection methods, namely i) N-Grams Similarity, ii) Longest Common Subsequence, iii) Dice Coefficient, iv) Fingerprint-based Jaccard Similarity, and v) Fingerprint-based Containment Similarity. In addition, three machine learning classifiers, namely i) naïve Bayes, ii) Support Vector Machine, and iii) linear logistic regression, are used for Arabic-English cross-language plagiarism detection. Several experiments are conducted to evaluate the performance of the keyphrase extraction methods. In addition, several experiments are conducted to investigate the performance of the machine learning techniques and find the best method for Arabic-English cross-language plagiarism detection. In these experiments, the highest result was obtained using the SVM classifier, with a 92% F-measure. In addition, all classifiers achieved their highest results when most of the monolingual plagiarism detection methods were used.
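The monolingual similarity measures named in this abstract have standard set-based and sequence-based forms, sketched below in plain Python. This is an illustration under assumptions: word n-gram sets stand in for the paper's fingerprints, the function names are hypothetical, and the paper's exact tokenization and fingerprinting scheme are not given in the abstract.

```python
def ngrams(tokens, n):
    """Set of word n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def jaccard(a, b):
    """Size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def dice(a, b):
    """Twice the intersection size over the sum of the set sizes."""
    return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 0.0

def containment(a, b):
    """Fraction of a's elements that also appear in b (asymmetric)."""
    return len(a & b) / len(a) if a else 0.0

def lcs_length(x, y):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            curr.append(prev[j - 1] + 1 if xi == yj else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

# Hypothetical example sentences differing in one word.
suspicious = "the cat sat on the mat".split()
source = "the cat lay on the mat".split()
bigrams_a, bigrams_b = ngrams(suspicious, 2), ngrams(source, 2)
```

Scores of this kind can serve as input features for a classifier such as an SVM, which matches the abstract's description of combining monolingual methods with machine learning.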

Investigating the Applications of Machine Learning Techniques to Predict the Rock Brittleness Index

Applied Sciences ◽

10.3390/app10051691 ◽

2020 ◽

Vol 10 (5) ◽

pp. 1691 ◽

Cited By ~ 5

Author(s):

Deliang Sun ◽

Mahshid Lonbani ◽

Behnam Askarian ◽

Danial Jahed Armaghani ◽

Reza Tarinejad ◽

...

Keyword(s):

Machine Learning ◽

Point Load ◽

P Wave ◽

Machine Learning Techniques ◽

Brittleness Index ◽

Coefficient Of Determination ◽

Support Vector ◽

Ann Model ◽

Learning Techniques ◽

Rock Brittleness

Despite the vast usage of machine learning techniques to solve engineering problems, a very limited number of studies on the rock brittleness index (BI) have used these techniques to analyze issues in this field. The present study developed five well-known machine learning techniques and compared their performance in predicting the brittleness index of rock samples. The comparison of the models’ performance was conducted through a ranking system. These techniques included Chi-square automatic interaction detector (CHAID), random forest (RF), support vector machine (SVM), K-nearest neighbors (KNN), and artificial neural network (ANN). This study used a dataset from a water transfer tunneling project in Malaysia. Results of simple rock index tests, i.e., Schmidt hammer, p-wave velocity, point load, and density, were considered as model inputs. The results of this study indicated that while the RF model had the best performance for training (ranking = 25), the ANN outperformed other models for testing (ranking = 22). However, the KNN model achieved the highest cumulative ranking, which was 37. The KNN model showed desirable stability for both training and testing. However, the results of the validation stage indicated that the RF model, with a coefficient of determination (R²) of 0.971, provides higher performance capacity for prediction of the rock BI than the KNN model (R² of 0.807) and the ANN model (R² of 0.860). The results of this study suggest a practical use of machine learning models in solving problems related to rock mechanics, especially the rock brittleness index.
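A ranking system of the kind mentioned in this abstract is often implemented by scoring each model per metric (the best model gets the highest score) and summing the scores. The sketch below assumes that scheme; the abstract does not spell out the exact rules, and the model labels and metric values are placeholders, not results from the study.

```python
def rank_scores(values, higher_is_better=True):
    """Score each model for one metric: worst gets 1, best gets len(values)."""
    # Sort from worst to best so the position in the list is the score.
    worst_first = sorted(values, key=values.get, reverse=not higher_is_better)
    return {model: position + 1 for position, model in enumerate(worst_first)}

def total_ranking(metrics):
    """Sum per-metric scores; metrics is a list of (values, higher_is_better)."""
    totals = {}
    for values, higher_is_better in metrics:
        for model, score in rank_scores(values, higher_is_better).items():
            totals[model] = totals.get(model, 0) + score
    return totals

# Placeholder metric values for three hypothetical models.
r2_values = {"model_a": 0.9, "model_b": 0.8, "model_c": 0.7}
rmse_values = {"model_a": 0.2, "model_b": 0.1, "model_c": 0.3}
totals = total_ranking([(r2_values, True), (rmse_values, False)])
```

This simple version breaks ties by insertion order, and rank sums discard how large the gaps between models are, which may help explain why a model with the top cumulative ranking can still trail on a single metric such as validation R².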

Driver Stress State Evaluation by Means of Thermal Imaging: A Supervised Machine Learning Approach Based on ECG Signal

Applied Sciences ◽

10.3390/app10165673 ◽

2020 ◽

Vol 10 (16) ◽

pp. 5673 ◽

Cited By ~ 2

Author(s):

Daniela Cardone ◽

David Perpetuini ◽

Chiara Filippini ◽

Edoardo Spadolini ◽

Lorenza Mancini ◽

...

Keyword(s):

Machine Learning ◽

Stress State ◽

Thermal Imaging ◽

Driving Simulator ◽

Supervised Machine Learning ◽

Support Vector ◽

Learning Approach ◽

Machine Learning Approach ◽

Thermal Features ◽

Driver Stress

Traffic accidents cause a large number of injuries, sometimes fatal, every year. Among other factors affecting a driver’s performance, an important role is played by stress, which can decrease decision-making capabilities and situational awareness. From this perspective, it would be beneficial to develop a non-invasive driver stress monitoring system able to recognize the driver’s altered state. In this study, a contactless procedure for drivers’ stress state assessment by means of thermal infrared imaging was investigated. Thermal imaging was acquired during an experiment on a driving simulator, and thermal features of stress were investigated in comparison with a gold-standard metric (i.e., the stress index, SI) extracted from contact electrocardiography (ECG). A data-driven multivariate machine learning approach based on a non-linear support vector regression (SVR) was employed to estimate the SI through thermal features extracted from facial regions of interest (i.e., nose tip, nostrils, glabella). The predicted SI showed a good correlation with the real SI (r = 0.61, p ≈ 0). A two-level classification of the stress state (STRESS, SI ≥ 150, versus NO STRESS, SI < 150) was then performed based on the predicted SI. The ROC analysis showed a good classification performance with an AUC of 0.80, a sensitivity of 77%, and a specificity of 78%.
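The two-level classification described here thresholds the stress index at 150, and sensitivity and specificity then follow from the resulting confusion counts. A minimal sketch, assuming the reference label comes from the ECG-derived SI and the predicted label from the SVR output; the function names and the sample values in the usage lines are hypothetical.

```python
STRESS_THRESHOLD = 150  # SI >= 150 is labelled STRESS, per the abstract

def is_stress(si):
    """Two-level label derived from a stress index value."""
    return si >= STRESS_THRESHOLD

def sensitivity_specificity(reference_si, predicted_si):
    """Return (sensitivity, specificity) after thresholding both series."""
    tp = fn = tn = fp = 0
    for ref, pred in zip(reference_si, predicted_si):
        actual, guessed = is_stress(ref), is_stress(pred)
        if actual and guessed:
            tp += 1
        elif actual:
            fn += 1
        elif guessed:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the reference data")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical values: two stressed and two non-stressed reference samples.
sens, spec = sensitivity_specificity([160, 170, 140, 130], [155, 145, 120, 152])
```

The AUC reported in the abstract summarizes this trade-off over all possible thresholds on the predicted SI, whereas the sensitivity and specificity values correspond to one operating point.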
