Index split decision tree and compositional deep neural  network for text categorization

Text categorization with machine learning algorithms generally reckons to possess horizontal set of classes. Several advanced machine learning algorithms have been designed in the past few decades. With the growing research work for text categorization, it has become important to categorize the research outcome and provide the learners with an effective machine learning method, a framework called, Hierarchical Decision Tree and Deep Neural Network (HDT-DNN).It investigates machine learning algorithms to create horizontal set of classes and it is used for classification of text. With this objective, a novel and efficient text categorization framework based on decision tree model is used in order to categorize text according to superior and subordinate level. The text to be categorized is presented in the form of a tree with parent text category being superior to all. The intermediate level represents the text that is both superior and subordinate. Then Deep Neural Network model is presented initiating compositional model, where the text has to be categorized, as a layered integration of primitives from the constructed decision tree model. The extra layers enable composition of features from lower layers, potentially modeling complex text with fewer units than a similarly carried out shallow network producing hierarchical classification. The significance of the impact of HDT-DNN framework is evaluated through empirical study. Extensive experiments are carried out and the performance of HDT-DNN framework is evaluated and compared with existing state-of-art methods using parameters such as precision, classification accuracy, classification time, with respect to varied number of features and document size.

Download Full-text

Intelligent approach to build a Deep Neural Network based IDS for cloud environment using combination of machine learning algorithms

Computers & Security ◽

10.1016/j.cose.2019.06.013 ◽

2019 ◽

Vol 86 ◽

pp. 291-317 ◽

Cited By ~ 15

Author(s):

Zouhair Chiba ◽

Noreddine Abghour ◽

Khalid Moussaid ◽

Amina El omri ◽

Mohamed Rida

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Neural Network ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Cloud Environment ◽

Intelligent Approach

Download Full-text

An Attempt to Use Machine Learning Algorithms to Estimate the Rockburst Hazard in Underground Excavations of Hard Coal Mine

Energies ◽

10.3390/en14216928 ◽

2021 ◽

Vol 14 (21) ◽

pp. 6928

Author(s):

Łukasz Wojtecki ◽

Sebastian Iwaszenko ◽

Derek B. Apel ◽

Tomasz Cichy

Keyword(s):

Neural Network ◽

Machine Learning ◽

Artificial Neural Network ◽

Rock Mass ◽

Decision Tree ◽

Risk Prediction ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Hard Coal ◽

Underground Excavations

Rockburst is a dynamic rock mass failure occurring during underground mining under unfavorable stress conditions. The rockburst phenomenon concerns openings in different rocks and is generally correlated with high stress in the rock mass. As a result of rockburst, underground excavations lose their functionality, the infrastructure is damaged, and the working conditions become unsafe. Assessing rockburst hazards in underground excavations becomes particularly important with the increasing mining depth and the mining-induced stresses. Nowadays, rockburst risk prediction is based mainly on various indicators. However, some attempts have been made to apply machine learning algorithms for this purpose. For this article, we employed an extensive range of machine learning algorithms, e.g., an artificial neural network, decision tree, random forest, and gradient boosting, to estimate the rockburst risk in galleries in one of the deep hard coal mines in the Upper Silesian Coal Basin, Poland. With the use of these algorithms, we proposed rockburst risk prediction models. Neural network and decision tree models were most effective in assessing whether a rockburst occurred in an analyzed case, taking into account the average value of the recall parameter. In three randomly selected datasets, the artificial neural network models were able to identify all of the rockbursts.

Download Full-text

Survey of Machine Learning Algorithms to Detect Malware in Consumer Internet of Things Devices

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213021500202 ◽

2021 ◽

Vol 30 (04) ◽

pp. 2150020

Author(s):

Luke Holbrook ◽

Miltiadis Alamaniotis

Keyword(s):

Neural Network ◽

Machine Learning ◽

Internet Of Things ◽

Deep Neural Network ◽

Learning Algorithms ◽

Cyber Attacks ◽

Machine Learning Algorithms ◽

Coefficient Of Determination ◽

Support Vector ◽

Iot Devices

With the increase of cyber-attacks on millions of Internet of Things (IoT) devices, the poor network security measures on those devices are the main source of the problem. This article aims to study a number of these machine learning algorithms available for their effectiveness in detecting malware in consumer internet of things devices. In particular, the Support Vector Machines (SVM), Random Forest, and Deep Neural Network (DNN) algorithms are utilized for a benchmark with a set of test data and compared as tools in safeguarding the deployment for IoT security. Test results on a set of 4 IoT devices exhibited that all three tested algorithms presented here detect the network anomalies with high accuracy. However, the deep neural network provides the highest coefficient of determination R2, and hence, it is identified as the most precise among the tested algorithms concerning the security of IoT devices based on the data sets we have undertaken.

Download Full-text

A new ML-based approach to enhance student engagement in online environment

PLoS ONE ◽

10.1371/journal.pone.0258788 ◽

2021 ◽

Vol 16 (11) ◽

pp. e0258788

Author(s):

Sarra Ayouni ◽

Fahima Hajjej ◽

Mohamed Maddeh ◽

Shaha Al-Otaibi

Keyword(s):

Neural Network ◽

Machine Learning ◽

Artificial Neural Network ◽

Support Vector Machine ◽

Student Engagement ◽

Decision Tree ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Online Environment

The educational research is increasingly emphasizing the potential of student engagement and its impact on performance, retention and persistence. This construct has emerged as an important paradigm in the higher education field for many decades. However, evaluating and predicting the student’s engagement level in an online environment remains a challenge. The purpose of this study is to suggest an intelligent predictive system that predicts the student’s engagement level and then provides the students with feedback to enhance their motivation and dedication. Three categories of students are defined depending on their engagement level (Not Engaged, Passively Engaged, and Actively Engaged). We applied three different machine-learning algorithms, namely Decision Tree, Support Vector Machine and Artificial Neural Network, to students’ activities recorded in Learning Management System reports. The results demonstrate that machine learning algorithms could predict the student’s engagement level. In addition, according to the performance metrics of the different algorithms, the Artificial Neural Network has a greater accuracy rate (85%) compared to the Support Vector Machine (80%) and Decision Tree (75%) classification techniques. Based on these results, the intelligent predictive system sends feedback to the students and alerts the instructor once a student’s engagement level decreases. The instructor can identify the students’ difficulties during the course and motivate them through e-mail reminders, course messages, or scheduling an online meeting.

Download Full-text

Application of Data Mining Algorithms for Dementia in People with HIV/AIDS

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/4602465 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Luana Ibiapina Cordeiro Calíope Pinheiro ◽

Maria Lúcia Duarte Pereira ◽

Marcial Porto Fernandez ◽

Francisco Mardônio Vieira Filho ◽

Wilson Jorge Correia Pinto de Abreu ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Data Mining ◽

Logistic Regression ◽

Random Forest ◽

Decision Tree ◽

Learning Algorithms ◽

Principal Component ◽

Machine Learning Algorithms ◽

Hiv Aids

Dementia interferes with the individual’s motor, behavioural, and intellectual functions, causing him to be unable to perform instrumental activities of daily living. This study is aimed at identifying the best performing algorithm and the most relevant characteristics to categorise individuals with HIV/AIDS at high risk of dementia from the application of data mining. Principal component analysis (PCA) algorithm was used and tested comparatively between the following machine learning algorithms: logistic regression, decision tree, neural network, KNN, and random forest. The database used for this study was built from the data collection of 270 individuals infected with HIV/AIDS and followed up at the outpatient clinic of a reference hospital for infectious and parasitic diseases in the State of Ceará, Brazil, from January to April 2019. Also, the performance of the algorithms was analysed for the 104 characteristics available in the database; then, with the reduction of dimensionality, there was an improvement in the quality of the machine learning algorithms and identified that during the tests, even losing about 30% of the variation. Besides, when considering only 23 characteristics, the precision of the algorithms was 86% in random forest, 56% logistic regression, 68% decision tree, 60% KNN, and 59% neural network. The random forest algorithm proved to be more effective than the others, obtaining 84% precision and 86% accuracy.

Download Full-text

Comparative analysis of machine learning algorithms in water extraction

Journal of Physics Conference Series ◽

10.1088/1742-6596/2076/1/012045 ◽

2021 ◽

Vol 2076 (1) ◽

pp. 012045

Author(s):

Aimin Li ◽

Meng Fan ◽

Guangduo Qin

Keyword(s):

Neural Network ◽

Machine Learning ◽

Logistic Regression ◽

Comparative Analysis ◽

Random Forest ◽

Decision Tree ◽

Water Body ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector

Abstract There are many traditional methods available for water body extraction based on remote sensing images, such as normalised difference water index (NDWI), modified NDWI (MNDWI), and the multi-band spectrum method, but the accuracy of these methods is limited. In recent years, machine learning algorithms have developed rapidly and been applied widely. Using Landsat-8 images, models such as decision tree, logistic regression, a random forest, neural network, support vector method (SVM), and Xgboost were adopted in the present research within machine learning algorithms. Based on this, through cross validation and a grid search method, parameters were determined for each model.Moreover, the merits and demerits of several models in water body extraction were discussed and a comparative analysis was performed with three methods for determining thresholds in the traditional NDWI. The results show that the neural network has excellent performances and is a stable model, followed by the SVM and the logistic regression algorithm. Furthermore, the ensemble algorithms including the random forest and Xgboost were affected by sample distribution and the model of the decision tree returned the poorest performance.

Download Full-text

A Prototype to Detect Anomalies Using Machine Learning Algorithms and Deep Neural Network

Computational Vision and Bio Inspired Computing - Lecture Notes in Computational Vision and Biomechanics ◽

10.1007/978-3-319-71767-8_93 ◽

2018 ◽

pp. 1084-1094 ◽

Cited By ~ 1

Author(s):

Malathi A. ◽

Amudha J. ◽

Puneeth Narayana

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Neural Network ◽

Learning Algorithms ◽

Machine Learning Algorithms

Download Full-text

Privacy-preserving Machine Learning as a Service

Proceedings on Privacy Enhancing Technologies ◽

10.1515/popets-2018-0024 ◽

2018 ◽

Vol 2018 (3) ◽

pp. 123-142 ◽

Cited By ~ 29

Author(s):

Ehsan Hesamifard ◽

Hassan Takabi ◽

Mehdi Ghasemi ◽

Rebecca N. Wright

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Deep Neural Network ◽

Learning Algorithms ◽

Privacy Preserving ◽

Machine Learning Algorithms ◽

Cloud Services ◽

Network Algorithms ◽

Encrypted Data

Abstract Machine learning algorithms based on deep Neural Networks (NN) have achieved remarkable results and are being extensively used in different domains. On the other hand, with increasing growth of cloud services, several Machine Learning as a Service (MLaaS) are offered where training and deploying machine learning models are performed on cloud providers’ infrastructure. However, machine learning algorithms require access to the raw data which is often privacy sensitive and can create potential security and privacy risks. To address this issue, we present CryptoDL, a framework that develops new techniques to provide solutions for applying deep neural network algorithms to encrypted data. In this paper, we provide the theoretical foundation for implementing deep neural network algorithms in encrypted domain and develop techniques to adopt neural networks within practical limitations of current homomorphic encryption schemes. We show that it is feasible and practical to train neural networks using encrypted data and to make encrypted predictions, and also return the predictions in an encrypted form. We demonstrate applicability of the proposed CryptoDL using a large number of datasets and evaluate its performance. The empirical results show that it provides accurate privacy-preserving training and classification.

Download Full-text

Predicting Breast Cancer in Chinese Women Using Machine Learning Techniques: Algorithm Development

JMIR Medical Informatics ◽

10.2196/17364 ◽

2020 ◽

Vol 8 (6) ◽

pp. e17364 ◽

Cited By ~ 2

Author(s):

Can Hou ◽

Xiaorong Zhong ◽

Ping He ◽

Bin Xu ◽

Sha Diao ◽

...

Keyword(s):

Breast Cancer ◽

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Deep Neural Network ◽

Chinese Women ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Age At First Birth ◽

Cancer Prediction

Background Risk-based breast cancer screening is a cost-effective intervention for controlling breast cancer in China, but the successful implementation of such intervention requires an accurate breast cancer prediction model for Chinese women. Objective This study aimed to evaluate and compare the performance of four machine learning algorithms on predicting breast cancer among Chinese women using 10 breast cancer risk factors. Methods A dataset consisting of 7127 breast cancer cases and 7127 matched healthy controls was used for model training and testing. We used repeated 5-fold cross-validation and calculated AUC, sensitivity, specificity, and accuracy as the measures of the model performance. Results The three novel machine-learning algorithms (XGBoost, Random Forest and Deep Neural Network) all achieved significantly higher area under the receiver operating characteristic curves (AUCs), sensitivity, and accuracy than logistic regression. Among the three novel machine learning algorithms, XGBoost (AUC 0.742) outperformed deep neural network (AUC 0.728) and random forest (AUC 0.728). Main residence, number of live births, menopause status, age, and age at first birth were considered as top-ranked variables in the three novel machine learning algorithms. Conclusions The novel machine learning algorithms, especially XGBoost, can be used to develop breast cancer prediction models to help identify women at high risk for breast cancer in developing countries.

Download Full-text

Predicting Breast Cancer in Chinese Women Using Machine Learning Techniques: Algorithm Development (Preprint)

10.2196/preprints.17364 ◽

2019 ◽

Author(s):

Can Hou ◽

Xiaorong Zhong ◽

Ping He ◽

Bin Xu ◽

Sha Diao ◽

...

Keyword(s):

Breast Cancer ◽

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Deep Neural Network ◽

Chinese Women ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Age At First Birth ◽

Cancer Prediction

BACKGROUND Risk-based breast cancer screening is a cost-effective intervention for controlling breast cancer in China, but the successful implementation of such intervention requires an accurate breast cancer prediction model for Chinese women. OBJECTIVE This study aimed to evaluate and compare the performance of four machine learning algorithms on predicting breast cancer among Chinese women using 10 breast cancer risk factors. METHODS A dataset consisting of 7127 breast cancer cases and 7127 matched healthy controls was used for model training and testing. We used repeated 5-fold cross-validation and calculated AUC, sensitivity, specificity, and accuracy as the measures of the model performance. RESULTS The three novel machine-learning algorithms (XGBoost, Random Forest and Deep Neural Network) all achieved significantly higher area under the receiver operating characteristic curves (AUCs), sensitivity, and accuracy than logistic regression. Among the three novel machine learning algorithms, XGBoost (AUC 0.742) outperformed deep neural network (AUC 0.728) and random forest (AUC 0.728). Main residence, number of live births, menopause status, age, and age at first birth were considered as top-ranked variables in the three novel machine learning algorithms. CONCLUSIONS The novel machine learning algorithms, especially XGBoost, can be used to develop breast cancer prediction models to help identify women at high risk for breast cancer in developing countries.

Download Full-text