Effectiveness analysis of machine learning classification models for predicting personalized context-aware smartphone usage

Abstract: The development of selective inhibitors of the clinically relevant human Carbonic Anhydrase (hCA) isoforms IX and XII has become a major topic in drug research, due to their deregulation in several types of cancer. Indeed, the selective inhibition of these two isoforms, especially with respect to the homeostatic isoform II, holds great promise for developing anticancer drugs with limited side effects. Therefore, the development of in silico models able to predict the activity and selectivity against the desired isoform(s) is of central interest. In this work, we have developed a series of machine learning classification models, trained on high-confidence data extracted from ChEMBL, able to predict the activity and selectivity profiles of ligands for human Carbonic Anhydrase isoforms II, IX and XII. The training datasets were built with a procedure that used flexible bioactivity thresholds to obtain well-balanced active and inactive classes. We used multiple algorithms and sampling sizes to select activity models able to classify active or inactive molecules with excellent performance. Remarkably, the results reported here were better than those obtained by models built with the classic approach of selecting an a priori activity threshold. The sequential application of these validated models enables faster and more reliable virtual screening to predict the activity and selectivity profiles against the investigated isoforms.
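
The abstract does not say how the flexible bioactivity thresholds were chosen. As a minimal sketch, assuming the simplest possible rule (cut at the median so that the active and inactive classes come out balanced), the idea could look like the following. The function names and the pKi-like values are hypothetical and are not taken from the paper or from ChEMBL.

```python
from statistics import median


def balanced_threshold(activities):
    """Return the median activity, used here as a data-driven class cutoff.

    Assumption: a median cutoff stands in for the paper's unspecified
    'flexible threshold' procedure.
    """
    return median(activities)


def label_compounds(activities, threshold):
    """Label a compound active (1) if its activity meets the threshold, else inactive (0)."""
    return [1 if a >= threshold else 0 for a in activities]


# Hypothetical pKi-like values for one isoform (higher means more potent).
pki_values = [5.1, 6.3, 7.8, 8.2, 6.9, 4.7, 7.1, 9.0]

threshold = balanced_threshold(pki_values)
labels = label_compounds(pki_values, threshold)

# For comparison, a fixed a priori cutoff (hypothetical value) on the same data.
fixed_labels = label_compounds(pki_values, 8.0)

print(threshold, sum(labels), sum(fixed_labels))
```

On this toy data the median cutoff yields four actives and four inactives, while the fixed cutoff of 8.0 yields only two actives, which illustrates why a data-driven threshold can give better-balanced training classes.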

A Comparative Study on Machine Learning Classification Models for Activity Recognition

Journal of Information Technology & Software Engineering ◽

10.4172/2165-7866.1000209 ◽

2017 ◽

Vol 07 (04) ◽

Cited By ~ 7

Author(s):

Mohsen Nabian

Keyword(s):

Machine Learning ◽

Comparative Study ◽

Activity Recognition ◽

Classification Models ◽

Machine Learning Classification

Sensing fermentation degree of cocoa (Theobroma cacao L.) beans by machine learning classification models based electronic nose system

Journal of Food Process Engineering ◽

10.1111/jfpe.13175 ◽

2019 ◽

Vol 42 (6) ◽

Cited By ~ 5

Author(s):

Juzhong Tan ◽

Balu Balasubramanian ◽

Darin Sukha ◽

Saila Ramkissoon ◽

Pathmanathan Umaharan

Keyword(s):

Machine Learning ◽

Electronic Nose ◽

Theobroma Cacao ◽

Classification Models ◽

Machine Learning Classification ◽

Electronic Nose System ◽

Theobroma Cacao L

Machine Learning Classification Models with SPD/ED Dataset: Comparative Study of Abstract Versus Full Article Approach

Lecture Notes in Computer Science - The Impact of Digital Technologies on Public Health in Developed and Developing Countries ◽

10.1007/978-3-030-51517-1_31 ◽

2020 ◽

pp. 348-356

Author(s):

Mayara Khadhraoui ◽

Hatem Bellaaj ◽

Mehdi Ben Ammar ◽

Habib Hamam ◽

Mohamed Jmaiel

Keyword(s):

Machine Learning ◽

Full Article ◽

Comparative Study ◽

Classification Models ◽

Machine Learning Classification

ContextPCA: Predicting Context-Aware Smartphone Apps Usage Based On Machine Learning Techniques

Symmetry ◽

10.3390/sym12040499 ◽

2020 ◽

Vol 12 (4) ◽

pp. 499 ◽

Cited By ~ 8

Author(s):

Iqbal H. Sarker ◽

Yoosef B. Abushark ◽

Asif Irshad Khan

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Real Life ◽

Machine Learning Techniques ◽

Model Complexity ◽

Context Aware ◽

Smartphone Apps ◽

Data Set ◽

Machine Learning Classification ◽

Learning Techniques

This paper formulates the problem of predicting context-aware smartphone apps usage based on machine learning techniques. In the real world, people use various kinds of smartphone apps differently in different contexts, which include both user-centric and device-centric contexts. In the area of artificial intelligence and machine learning, the decision tree model is one of the most popular approaches for predicting context-aware smartphone usage. However, real-life smartphone apps usage data may contain high-dimensional contexts, which can increase model complexity, cause over-fitting, and consequently decrease the prediction accuracy of the context-aware model. To address these issues, we present an effective principal component analysis (PCA) based context-aware smartphone apps prediction model, “ContextPCA”, which uses the decision tree machine learning classification technique. PCA is an unsupervised machine learning technique that can be used to separate symmetric and asymmetric components, and it has been adopted in our “ContextPCA” model to reduce the context dimensions of the original data set. The experimental results on smartphone apps usage datasets show that the “ContextPCA” model effectively predicts context-aware smartphone apps usage in terms of precision, recall, f-score and ROC values in various test cases.
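
The dimension-reduction step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the context features are taken to be already numeric, only the first principal component is kept, and it is found by power iteration instead of a library routine. The feature values and function names are hypothetical and do not come from the ContextPCA paper; the decision tree that would consume the reduced features is omitted.

```python
import math
from statistics import variance


def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered_rows):
    """Sample covariance matrix of rows that are already mean-centered."""
    n, d = len(centered_rows), len(centered_rows[0])
    return [[sum(r[i] * r[j] for r in centered_rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered_rows, component):
    """Project each centered row onto the given unit vector."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered_rows]


# Hypothetical numeric context features (three context dimensions per record).
contexts = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.4],
    [3.0, 6.2, 0.6],
    [4.0, 7.9, 0.5],
    [5.0, 10.1, 0.4],
    [6.0, 12.0, 0.6],
]

centered = center(contexts)
component = first_component(covariance(centered))
reduced = project(centered, component)  # one value per record instead of three

reduced_variance = variance(reduced)
column_variances = [variance(col) for col in zip(*contexts)]
```

The single projected feature carries at least as much variance as any one original column, which is the property that makes it a reasonable low-dimensional input for a downstream classifier such as a decision tree.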

Prediction of P-glycoprotein inhibitors with machine learning classification models and 3D-RISM-KH theory based solvation energy descriptors

Journal of Computer-Aided Molecular Design ◽

10.1007/s10822-019-00253-5 ◽

2019 ◽

Vol 33 (11) ◽

pp. 965-971 ◽

Cited By ~ 1

Author(s):

Vijaya Kumar Hinge ◽

Dipankar Roy ◽

Andriy Kovalenko

Keyword(s):

Machine Learning ◽

Solvation Energy ◽

Classification Models ◽

Machine Learning Classification ◽

P Glycoprotein ◽

Glycoprotein Inhibitors

Comparing Statistical and Machine Learning Classifiers: Alternatives for Predictive Modeling in Human Factors Research

Human Factors The Journal of the Human Factors and Ergonomics Society ◽

10.1518/hfes.45.3.408.27248 ◽

2003 ◽

Vol 45 (3) ◽

pp. 408-423 ◽

Cited By ~ 6

Author(s):

Brian Carnahan ◽

Gérard Meyer ◽

Lois-Ann Kuntz

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Discriminant Analysis ◽

Human Factors ◽

Predictive Accuracy ◽

Performance Outcomes ◽

Learning Approaches ◽

Classification Models ◽

Machine Learning Classification ◽

Human Factors Research

Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches - genetic programming and decision tree induction - were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.

Credit Risk Analysis Applying Machine Learning Classification Models

Advances in Intelligent Systems and Computing - Intelligent Computing ◽

10.1007/978-3-030-22871-2_57 ◽

2019 ◽

pp. 804-814

Author(s):

Roy Melendez

Keyword(s):

Machine Learning ◽

Risk Analysis ◽

Credit Risk ◽

Classification Models ◽

Machine Learning Classification ◽

Credit Risk Analysis

Automatically Extracting Disease-Disease Association from Literature with a Large Margin Context-Aware Convolutional Neural Network (Preprint)

10.2196/preprints.14129 ◽

2019 ◽

Author(s):

Po-Ting Lai ◽

Wei-Liang Lu ◽

Ting-Rung Kuo ◽

Chia-Ru Chung ◽

Jen-Chieh Han ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Convolutional Neural Network ◽

Disease Association ◽

Supervised Machine Learning ◽

Support Vector ◽

Context Aware ◽

Large Margin ◽

Machine Learning Classification ◽

Output Layer

BACKGROUND Research on disease-disease association, such as comorbidity and complication, provides important insights into disease treatment and drug discovery, and a large body of literature has been published in the field. However, with current search tools it is not easy for researchers to retrieve information on the latest disease association findings. First, comorbidity and complication keywords pull up large numbers of PubMed studies. Second, diseases are not highlighted in search results. Third, disease-disease association (DDA) is not identified, as no DDA extraction dataset or tools are currently available. OBJECTIVE Since there are no available disease-disease association extraction (DDAE) datasets or tools, we aim to develop (1) a DDAE dataset and (2) a neural network model for extracting DDAs from literature. METHODS In this study, we formulate DDAE as a supervised machine learning classification problem. To develop the system, we first build a DDAE dataset. We then employ two machine-learning models, support vector machine (SVM) and convolutional neural network (CNN), to extract DDAs. Furthermore, we evaluate the effect of using the output layer of the CNN as features of the SVM-based model. Finally, we implement a large margin context-aware convolutional neural network (LC-CNN) architecture to integrate context features and CNN through the large margin function. RESULTS Our DDAE dataset consists of 521 PubMed abstracts. Experiment results show that the SVM-based approach achieves an F1-measure of 80.32%, which is higher than that of the CNN-based approach (73.32%). Using the output layer of CNN as a feature for SVM does not further improve the performance of SVM. However, our LC-CNN achieves the highest F1-measure of 84.18%, and demonstrates that combining the hinge loss function of SVM with CNN into a single NN architecture outperforms other approaches. CONCLUSIONS To facilitate the development of text-mining research for DDAE, we develop the first publicly available DDAE dataset consisting of disease mentions, MeSH IDs and relation annotations. We develop different conventional ML models and NN architectures, and evaluate their effects on our DDAE dataset. To further improve DDAE performance, we propose an LC-CNN model for DDAE that outperforms other approaches.
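
The abstract mentions combining the SVM hinge loss with a CNN but does not give the exact formulation. As a minimal sketch, assuming the standard binary hinge loss with labels in {+1, -1} and a margin of 1, the large-margin objective can be computed as below. The scores stand in for the raw outputs of any model and are hypothetical values, not results from the paper.

```python
def hinge_loss(scores, labels, margin=1.0):
    """Mean binary hinge loss.

    scores: raw (unsquashed) model outputs.
    labels: +1 for the positive class, -1 for the negative class.
    An example contributes zero loss only when it is on the correct side
    of the decision boundary by at least `margin`.
    """
    losses = [max(0.0, margin - y * s) for s, y in zip(scores, labels)]
    return sum(losses) / len(losses)


# Hypothetical raw scores for four candidate disease pairs.
scores = [2.0, 0.5, -1.5, 0.2]
labels = [1, 1, -1, -1]

loss = hinge_loss(scores, labels)
print(loss)
```

Here the second example is classified correctly but still penalized because it falls inside the margin, and the fourth is penalized more heavily because it is on the wrong side of the boundary.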

Cultural differences in music features across Taiwanese, Japanese and American markets

PeerJ Computer Science ◽

10.7717/peerj-cs.642 ◽

2021 ◽

Vol 7 ◽

pp. e642

Author(s):

Kongmeng Liew ◽

Yukiko Uchida ◽

Igor de Almeida

Keyword(s):

Machine Learning ◽

Popular Music ◽

Cultural Differences ◽

Binary Classification ◽

High Energy ◽

Japanese American ◽

Classification Models ◽

Machine Learning Classification ◽

Follow Up Study

Background Preferences for music can be represented through music features. The widespread prevalence of music streaming has allowed for music feature information to be consolidated by service providers like Spotify. In this paper, we demonstrate that machine learning classification on cultural market membership (Taiwanese, Japanese, American) by music features reveals variations in popular music across these markets. Methods We present an exploratory analysis of 1.08 million songs centred on Taiwanese, Japanese and American markets. We use both multiclass classification models (Gradient Boosted Decision Trees (GBDT) and Multilayer Perceptron (MLP)) and binary classification models, and interpret their results using variable importance measures and Partial Dependence Plots. To ensure the reliability of our interpretations, we conducted a follow-up study comparing Top-50 playlists from Taiwan, Japan, and the US on identified variables of importance. Results The multiclass models achieved moderate classification accuracy (GBDT = 0.69, MLP = 0.66). Accuracy scores for binary classification models ranged from 0.71 to 0.81. Model interpretation revealed the music features of greatest importance: overall, popular music in Taiwan was characterised by high acousticness, American music by high speechiness, and Japanese music by high energy. A follow-up study using Top-50 charts found similarly significant differences between cultures for these three features. Conclusion We demonstrate that machine learning can reveal both the magnitude of differences in music preference across Taiwanese, Japanese, and American markets, and where these preferences differ. While this paper is limited to Spotify data, it underscores the potential contribution of machine learning in exploratory approaches to research on cultural differences.
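
A Partial Dependence Plot, as used above for model interpretation, shows the average prediction when one feature is forced to each value on a grid while the other features keep their observed values. The following is a minimal sketch of that computation. The scoring function is a hypothetical stand-in and is not the GBDT or MLP from the paper, and the song feature values are invented for illustration.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction over all rows with one feature fixed to each grid value."""
    curve = []
    for value in grid:
        # Copy every row, overwriting only the feature of interest.
        modified = [row[:feature_index] + [value] + row[feature_index + 1:]
                    for row in rows]
        curve.append(sum(predict(r) for r in modified) / len(modified))
    return curve


def toy_model(row):
    """Hypothetical score for membership in one market, given [energy, acousticness]."""
    energy, acousticness = row
    return 0.6 * energy + 0.2 * (1.0 - acousticness)


# Hypothetical songs described by [energy, acousticness], both in [0, 1].
songs = [[0.2, 0.9], [0.5, 0.4], [0.8, 0.1]]
grid = [0.0, 0.5, 1.0]

curve = partial_dependence(toy_model, songs, 0, grid)
print(curve)
```

For this toy model the curve rises steadily with energy, which is the kind of pattern that would be read as high energy being associated with the market in question.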
