Diagnosis of Problems in Truck Ore Transport Operations in Underground Mines Using Various Machine Learning Models and Data Collected by Internet of Things Systems

Minerals ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1128
Author(s):  
Sebeom Park ◽  
Dahee Jung ◽  
Hoang Nguyen ◽  
Yosoon Choi

This study proposes a method for diagnosing problems in truck ore transport operations in underground mines using four machine learning models (i.e., Gaussian naïve Bayes (GNB), k-nearest neighbor (kNN), support vector machine (SVM), and classification and regression tree (CART)) and data collected by an Internet of Things system. A limestone underground mine with an applied mine production management system (using a tablet computer and Bluetooth beacon) is selected as the research area, and log data related to the truck travel time are collected. The machine learning models are trained and verified using the collected data, and a grid search with 5-fold cross-validation is performed to improve the prediction accuracy of the models. The accuracy of CART is highest (94.1%) when the leaf and split parameters are set to 1 and 4, respectively. In the validation of the machine learning models performed using the validation dataset (1500), the accuracy of CART was 94.6%, and the precision and recall were 93.5% and 95.7%, respectively. In addition, the F1 score reached 94.6%. Through field application and analysis, it is confirmed that the proposed CART model can be utilized as a tool for monitoring and diagnosing the status of truck ore transport operations.
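A minimal sketch of the grid-search step described above, using scikit-learn's CART-style decision tree with 5-fold cross-validation; the synthetic data stands in for the truck travel-time log features, and the parameter ranges are illustrative assumptions rather than the authors' settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Placeholder data standing in for the beacon-derived travel-time features
X, y = make_classification(n_samples=1500, n_features=8, random_state=0)

param_grid = {
    "min_samples_leaf": [1, 2, 4, 8],    # the "leaf" parameter in the abstract
    "min_samples_split": [2, 4, 8, 16],  # the "split" parameter in the abstract
}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),  # scikit-learn's CART implementation
    param_grid,
    cv=5,                  # 5-fold cross-validation, as in the study
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```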

2021 ◽  
Vol 12 ◽  
Author(s):  
Ying Wang ◽  
Feng Yang ◽  
Meijiao Zhu ◽  
Ming Yang

In order to evaluate brain changes in young children with Pierre Robin sequence (PRs) using machine learning based on apparent diffusion coefficient (ADC) features, we retrospectively enrolled a total of 60 cases (42 in the training dataset and 18 in the testing dataset), comprising 30 PRs cases and 30 controls, from the Children's Hospital Affiliated to the Nanjing Medical University between January 2017 and December 2019. There were 21 and nine PRs cases in the training and testing datasets, respectively, with the remainder belonging to the control group in the same age range. A total of 105 ADC features were extracted from magnetic resonance imaging (MRI) data. Features were pruned using least absolute shrinkage and selection operator (LASSO) regression, and seven ADC features were retained as the optimal signature for training the machine learning models. Support vector machine (SVM) achieved an area under the receiver operating characteristic curve (AUC) of 0.99 for the training set and 0.85 for the testing set. The AUCs of multivariable logistic regression (MLR) and AdaBoost for the training and testing datasets were 0.98/0.84 and 0.94/0.69, respectively. Based on the ADC features, the two groups of cases (i.e., the PRs group and the control group) could be well distinguished by the machine learning models, indicating that there is a significant difference in brain development between children with PRs and normal controls.
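A hedged sketch of the LASSO-then-SVM pipeline described above; the 105-feature ADC matrix is simulated, and the regularization, feature-count cutoff, and SVM settings are assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LassoCV
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# 60 simulated cases x 105 ADC features, split 42/18 as in the abstract
X, y = make_classification(n_samples=60, n_features=105, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=18, random_state=0)

# LASSO prunes the feature set; keep the seven features with the largest |coefficients|
lasso = LassoCV(cv=5).fit(X_train, y_train)
selected = np.argsort(np.abs(lasso.coef_))[-7:]

svm = SVC(probability=True).fit(X_train[:, selected], y_train)
scores = svm.predict_proba(X_test[:, selected])[:, 1]
print("test AUC:", round(roc_auc_score(y_test, scores), 3))
```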


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Tuan Vu Dinh ◽  
Nhat-Duc Hoang ◽  
Xuan-Linh Tran

Soil erosion induced by rainfall under prevailing conditions is a prominent problem for farmers in the tropical sloping lands of Northeast Vietnam. This study evaluates the possibility of predicting erosion status with machine learning models, including fuzzy k-nearest neighbor (FKNN), artificial neural network (ANN), support vector machine (SVM), least squares support vector machine (LSSVM), and relevance vector machine (RVM). Model evaluation employed a historical dataset consisting of ten explanatory variables and soil erosion records featuring four different land use managements on hillslopes in Northwest Vietnam. All 236 data samples representing soil erosion/non-erosion events were randomly split (80% for training and 20% for testing) to assess the robustness of the five models. This subsampling process was repeated for 30 rounds to mitigate the randomness of data selection. Classification accuracy rate (CAR) and area under the receiver operating characteristic curve (AUC) were used to evaluate the performance of the five models. Significant differences between the algorithms were verified by the Wilcoxon test. Results showed that the RVM model achieves the best outcomes in both the training (CAR = 92.22% and AUC = 0.98) and testing phases (CAR = 91.94% and AUC = 0.97). The four other learning algorithms also demonstrated good performance, with CAR values surpassing 80% and AUC values greater than 0.9. Hence, these results strongly confirm the efficacy of applying machine learning models to soil erosion prediction.
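An illustrative sketch of the 30-round 80/20 random subsampling protocol and the Wilcoxon comparison described above, shown with two stand-in classifiers on simulated data; none of the settings come from the paper.

```python
import numpy as np
from scipy.stats import wilcoxon
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

# 236 simulated erosion/non-erosion samples with ten explanatory variables
X, y = make_classification(n_samples=236, n_features=10, random_state=0)

def run_rounds(model, rounds=30):
    """Repeat the 80/20 split `rounds` times and collect test accuracies."""
    accs = []
    for r in range(rounds):
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=r)
        model.fit(X_tr, y_tr)
        accs.append(accuracy_score(y_te, model.predict(X_te)))
    return np.array(accs)

acc_svm = run_rounds(SVC())
acc_knn = run_rounds(KNeighborsClassifier())
# Paired Wilcoxon test on the per-round accuracies of the two models
print(wilcoxon(acc_svm, acc_knn))
```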


Author(s):  
Sanghee Moon ◽  
Hyun-Je Song ◽  
Vibhash D. Sharma ◽  
Kelly E. Lyons ◽  
Rajesh Pahwa ◽  
...  

Abstract Background Parkinson’s disease (PD) and essential tremor (ET) are movement disorders that can have similar clinical characteristics, including tremor and gait difficulty. These disorders can be misdiagnosed, leading to delays in appropriate treatment. The aim of the study was to determine whether balance and gait variables obtained with wearable inertial motion sensors can be utilized to differentiate between PD and ET using machine learning. Additionally, we compared the classification performance of several machine learning models. Methods This retrospective study included balance and gait variables collected during the instrumented stand and walk test from people with PD (n = 524) and with ET (n = 43). The performance of several machine learning techniques, including neural networks, support vector machine, k-nearest neighbor, decision tree, random forest, and gradient boosting, was compared with a dummy model and logistic regression using F1-scores. Results Machine learning models classified PD and ET based on balance and gait characteristics better than the dummy model (F1-score = 0.48) or logistic regression (F1-score = 0.53). The highest F1-score was 0.61 for the neural network, followed by 0.59 for gradient boosting, 0.56 for random forest, 0.55 for the support vector machine, 0.53 for the decision tree, and 0.49 for k-nearest neighbor. Conclusions This study demonstrated the utility of machine learning models to classify different movement disorders based on balance and gait characteristics collected from wearable sensors. Future studies using a well-balanced dataset are needed to confirm the potential clinical utility of machine learning models to discern between PD and ET.
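A minimal sketch of the model comparison by F1-score against a dummy baseline, on simulated data with a class imbalance mimicking the PD/ET ratio; the model settings and macro-averaged F1 scoring are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Imbalanced classes mimic the PD (n = 524) vs. ET (n = 43) ratio
X, y = make_classification(n_samples=567, weights=[0.92, 0.08], random_state=0)

models = {
    "dummy": DummyClassifier(strategy="stratified"),
    "logistic regression": LogisticRegression(max_iter=1000),
    "neural network": MLPClassifier(max_iter=1000),
    "random forest": RandomForestClassifier(),
    "gradient boosting": GradientBoostingClassifier(),
}
for name, model in models.items():
    f1 = cross_val_score(model, X, y, cv=5, scoring="f1_macro").mean()
    print(f"{name}: F1 = {f1:.2f}")
```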


Author(s):  
S.M. Mohidul Islam ◽  
Suriya Islam Bani ◽  
Rupa Ghosh

Fish species recognition is in increasing demand in fish ecology, the fishing industry, fisheries survey applications, and other related fields. Traditionally, concept-based fish species identification procedures are used, but they have limitations. Content-based classification overcomes these problems. In this paper, a content-based fish recognition system based on the fusion of local features and a global feature is proposed. For local feature extraction from fish images, Local Binary Pattern (LBP), Speeded-Up Robust Feature (SURF), and Scale Invariant Feature Transform (SIFT) are used. To extract a global feature from fish images, the Color Coherence Vector (CCV) is used. Five popular machine learning models (Decision Tree, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Naïve Bayes, and Artificial Neural Network (ANN)) are used for fish species prediction. Finally, the prediction decisions of these models are combined by majority vote to select the final fish class. The experiment is performed on a subset of the ‘QUT_fish_data’ dataset containing 256 fish images from 21 classes. The result (98.46% accuracy) shows that, although the proposed method does not outperform all existing fish classification methods, it outperforms many of them and is therefore a competitive alternative in this field.
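A minimal sketch of the majority-vote fusion of the five classifiers named above, using scikit-learn's hard-voting ensemble; the fused LBP/SURF/SIFT/CCV feature vectors are simulated here, and the individual model settings are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# 256 simulated fused feature vectors over 21 fish classes
X, y = make_classification(n_samples=256, n_features=64, n_informative=20,
                           n_classes=21, n_clusters_per_class=1, random_state=0)

vote = VotingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier()),
        ("knn", KNeighborsClassifier()),
        ("svm", SVC()),
        ("nb", GaussianNB()),
        ("ann", MLPClassifier(max_iter=2000)),
    ],
    voting="hard",  # majority vote on the predicted classes
)
print("cross-validated accuracy:", round(cross_val_score(vote, X, y, cv=5).mean(), 3))
```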


This study aims to analyze the performance of machine learning models for detecting Internet of Things malware using a recent IoT dataset. Experiments on the IoT dataset were conducted with nine well-known machine learning techniques: Logistic Regression (LR), Naive Bayes (NB), Decision Tree (DT), k-Nearest Neighbors (KNN), Support Vector Machines (SVM), Neural Networks (NN), Random Forest (RF), Bagging (BG), and Stacking (ST). The results show that the proposed model attains 100% accuracy in detecting IoT malware for DT, SVM, RF, and BG; about 99.9% for LR, NB, KNN, and NN; and only 28.16% for the ST classifier. This study also shows higher performance than other machine learning models previously evaluated on the same dataset. Therefore, the results of this study can help both researchers and application developers in designing and building intelligent malware detection systems for IoT devices.
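An illustrative comparison loop for the classifiers listed above, using scikit-learn defaults on a synthetic stand-in for the IoT malware feature table; it is not the authors' experimental setup.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier, StackingClassifier

# Synthetic stand-in for benign vs. malicious IoT traffic features
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "NB": GaussianNB(),
    "DT": DecisionTreeClassifier(),
    "SVM": SVC(),
    "RF": RandomForestClassifier(),
    "BG": BaggingClassifier(),
    "ST": StackingClassifier(estimators=[("dt", DecisionTreeClassifier()),
                                         ("nb", GaussianNB())]),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, round(accuracy_score(y_te, model.predict(X_te)), 4))
```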


2021 ◽  
Vol 13 (4) ◽  
pp. 641
Author(s):  
Gopal Ramdas Mahajan ◽  
Bappa Das ◽  
Dayesh Murgaokar ◽  
Ittai Herrmann ◽  
Katja Berger ◽  
...  

Conventional methods of plant nutrient estimation for nutrient management require a large number of leaf or tissue samples and extensive chemical analysis, which is time-consuming and expensive. Remote sensing is a viable tool to estimate a plant’s nutritional status and determine the appropriate amounts of fertilizer inputs. The aim of the study was to use remote sensing to characterize the foliar nutrient status of mango through the development of spectral indices, multivariate analysis, chemometrics, and machine learning modeling of the spectral data. A spectral database of the leaf samples within the 350–1050 nm wavelength range, together with the corresponding leaf nutrient concentrations, was analyzed to develop spectral indices and multivariate models. The normalized difference and ratio spectral indices and the multivariate models (partial least squares regression (PLSR), principal component regression, and support vector regression (SVR)) were ineffective in predicting any of the leaf nutrients. An approach using PLSR-combined machine learning models was found to be the best for predicting most of the nutrients. Based on the independent validation performance and summed ranks, the best performing models were cubist (R2 ≥ 0.91, ratio of performance to deviation (RPD) ≥ 3.3, and ratio of performance to interquartile distance (RPIQ) ≥ 3.71) for nitrogen, phosphorus, potassium, and zinc; SVR (R2 ≥ 0.88, RPD ≥ 2.73, RPIQ ≥ 3.31) for calcium, iron, copper, and boron; and elastic net (R2 ≥ 0.95, RPD ≥ 4.47, RPIQ ≥ 6.11) for magnesium and sulfur. The results of the study revealed the potential of using hyperspectral remote sensing data for non-destructive estimation of mango leaf macro- and micro-nutrients. The developed approach is suggested for use within operational retrieval workflows for precision management of mango orchard nutrients.
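A minimal sketch of the PLSR-combined idea: leaf spectra are compressed into PLS latent scores, which then feed a downstream regressor (SVR here, standing in for the cubist and elastic net models as well); the spectra, nutrient values, and number of components are simulated assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
spectra = rng.random((120, 700))                         # 120 leaf spectra x 700 wavelengths
nitrogen = spectra[:, 50] + 0.1 * rng.normal(size=120)   # placeholder leaf-N values

X_tr, X_te, y_tr, y_te = train_test_split(spectra, nitrogen, test_size=0.3, random_state=0)

# Compress the spectra into PLS latent scores, then fit a downstream regressor on them
pls = PLSRegression(n_components=10).fit(X_tr, y_tr)
svr = SVR().fit(pls.transform(X_tr), y_tr)
pred = svr.predict(pls.transform(X_te))
print("validation R2:", round(r2_score(y_te, pred), 3))
```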


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Moojung Kim ◽  
Young Jae Kim ◽  
Sung Jin Park ◽  
Kwang Gi Kim ◽  
Pyung Chun Oh ◽  
...  

Abstract Background Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially during the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study is to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination. Methods Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" to whether they had received seasonal influenza vaccinations in the past 12 months. The classification process was performed using the logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) machine learning techniques. Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results The accuracy of the machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared; for the ≥ 65 age group, XGB (84.7%) and RF (84.7%) had the best accuracy, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM had the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions The machine learning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.
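A hedged sketch of fitting separate models per age group, as described above, on a synthetic stand-in for the KNHANES V table; the column names are hypothetical, and scikit-learn's gradient boosting stands in for XGBoost to keep the example dependency-free.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 815  # number of CVD patients in the abstract
df = pd.DataFrame(rng.random((n, 16)), columns=[f"x{i}" for i in range(16)])  # 16 predictors
df["age"] = rng.integers(30, 90, n)
df["vaccinated"] = rng.integers(0, 2, n)  # placeholder adherence label

# Separate models per age group, since the >= 65 group receives free immunization
for label, group in [("<65", df[df["age"] < 65]), (">=65", df[df["age"] >= 65])]:
    X = group.drop(columns=["vaccinated", "age"])
    y = group["vaccinated"]
    for name, model in [("RF", RandomForestClassifier()),
                        ("GB", GradientBoostingClassifier())]:  # GB stands in for XGB
        acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
        print(label, name, round(acc, 3))
```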


SLEEP ◽  
2021 ◽  
Vol 44 (Supplement_2) ◽  
pp. A164-A164
Author(s):  
Pahnwat Taweesedt ◽  
JungYoon Kim ◽  
Jaehyun Park ◽  
Jangwoon Park ◽  
Munish Sharma ◽  
...  

Abstract Introduction Obstructive sleep apnea (OSA) is a common sleep-related breathing disorder, with an estimated one billion people affected worldwide. Full-night polysomnography is considered the gold standard for OSA diagnosis. However, it is time-consuming, expensive, and not readily available in many parts of the world. Many screening questionnaires and scores have been proposed for OSA prediction with high sensitivity and low specificity. The present study is intended to develop models with various machine learning techniques to predict the severity of OSA by incorporating features from multiple questionnaires. Methods Subjects who underwent full-night polysomnography at the Torr sleep center, Texas, and completed five OSA screening questionnaires/scores were included. OSA was diagnosed using an Apnea-Hypopnea Index ≥ 5. We trained five different machine learning models: a deep neural network with scaled principal component analysis (DNN-PCA), Random Forest (RF), Adaptive Boosting classifier (ABC), K-Nearest Neighbors classifier (KNC), and Support Vector Machine classifier (SVMC). A training:testing subject ratio of 65:35 was used. All features, including demographic data, body measurements, and snoring and sleepiness history, were obtained from the five OSA screening questionnaires/scores (STOP-BANG questionnaire, Berlin questionnaire, NoSAS score, NAMES score, and No-Apnea score). Performance metrics were used to compare the machine learning models. Results Of the 180 subjects, 51.5% were male, with a mean (SD) age of 53.6 (15.1) years. One hundred and nineteen subjects were diagnosed with OSA. The areas under the receiver operating characteristic curve (AUROC) of DNN-PCA, RF, ABC, KNC, SVMC, the STOP-BANG questionnaire, the Berlin questionnaire, the NoSAS score, the NAMES score, and the No-Apnea score were 0.85, 0.68, 0.52, 0.74, 0.75, 0.61, 0.63, 0.61, 0.58, and 0.58, respectively. DNN-PCA showed the highest AUROC, with a sensitivity of 0.79, specificity of 0.67, positive predictive value of 0.93, F1 score of 0.86, and accuracy of 0.77. Conclusion Our results showed that DNN-PCA outperforms the OSA screening questionnaires, scores, and the other machine learning models.
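A minimal sketch of the DNN-PCA idea: questionnaire-derived features are standardized, projected with PCA, and fed to a small neural network, evaluated by AUROC on a 65:35 split; the data, layer sizes, and number of components are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# 180 simulated subjects with questionnaire-derived features
X, y = make_classification(n_samples=180, n_features=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.35, random_state=0)  # 65:35

dnn_pca = make_pipeline(
    StandardScaler(),                                   # scale before PCA
    PCA(n_components=10),                               # scaled principal components
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000),
)
dnn_pca.fit(X_tr, y_tr)
print("AUROC:", round(roc_auc_score(y_te, dnn_pca.predict_proba(X_te)[:, 1]), 3))
```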


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Prasanna Date ◽  
Davis Arthur ◽  
Lauren Pusey-Nazzaro

Abstract Training machine learning models on classical computers is usually a time- and compute-intensive process. With Moore’s law nearing its inevitable end and an ever-increasing demand for large-scale data analysis using machine learning, we must leverage non-conventional computing paradigms like quantum computing to train machine learning models efficiently. Adiabatic quantum computers can approximately solve NP-hard problems, such as quadratic unconstrained binary optimization (QUBO), faster than classical computers. Since many machine learning problems are also NP-hard, we believe adiabatic quantum computers might be instrumental in training machine learning models efficiently in the post-Moore’s law era. To solve problems on adiabatic quantum computers, however, they must be formulated as QUBO problems, which is very challenging. In this paper, we formulate the training problems of three machine learning models—linear regression, support vector machine (SVM) and balanced k-means clustering—as QUBO problems, making them amenable to training on adiabatic quantum computers. We also analyze the computational complexities of our formulations and compare them to corresponding state-of-the-art classical approaches. We show that the time and space complexities of our formulations are better than (in the case of SVM and balanced k-means clustering) or equivalent to (in the case of linear regression) their classical counterparts.
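A hedged sketch of casting least-squares linear regression as a QUBO, in the spirit of the formulation described above: each real-valued weight is encoded with a small set of binary precision bits, so the squared error becomes a quadratic form over those bits. The precision values, toy data, and brute-force solver (standing in for an annealer) are illustrative assumptions.

```python
import numpy as np
from itertools import product

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # toy design matrix (intercept + slope)
y = np.array([0.0, 1.0, 2.0])

precisions = np.array([-2.0, -1.0, 0.5, 1.0, 2.0])    # candidate bit values per weight
d, K = X.shape[1], len(precisions)
P = np.kron(np.eye(d), precisions)                    # maps d*K binary bits to d real weights

# ||X P b - y||^2 expands to b^T (P^T X^T X P) b - 2 b^T P^T X^T y + const,
# so the QUBO matrix is the quadratic term with the linear term folded into the diagonal.
Q = P.T @ X.T @ X @ P
Q[np.diag_indices_from(Q)] += -2.0 * (P.T @ X.T @ y)

# Brute-force the 2^(d*K) bit strings here; an annealer would minimize the same objective.
best = min(product([0, 1], repeat=d * K),
           key=lambda b: np.array(b) @ Q @ np.array(b))
print("recovered weights:", P @ np.array(best))       # should be close to [0, 1]
```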


Minerals ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 159
Author(s):  
Nan Lin ◽  
Yongliang Chen ◽  
Haiqi Liu ◽  
Hanlin Liu

Selecting internal hyperparameters, which can be set by an automatic search algorithm, is important for improving the generalization performance of machine learning models. In this study, the geological, remote sensing, and geochemical data of the Lalingzaohuo area in Qinghai province were investigated. A multi-source metallogenic information spatial dataset was constructed by calculating the Youden index to select potential evidence layers. The model for mapping mineral prospectivity of the study area was established by combining two swarm intelligence optimization algorithms, namely the bat algorithm (BA) and the firefly algorithm (FA), with different machine learning models. The receiver operating characteristic (ROC) and prediction-area (P-A) curves were used for performance evaluation and showed that the two algorithms had an obvious optimization effect. The BA and FA differed in how much they improved the multilayer perceptron (MLP), AdaBoost, and one-class support vector machine (OCSVM) models; thus, neither optimization algorithm was consistently superior to the other. However, the accuracy of the machine learning models was significantly enhanced after optimizing the hyperparameters. The area under the curve (AUC) values of the ROC curves of the optimized machine learning models were all higher than 0.8, indicating that the hyperparameter optimization was effective. In terms of individual model improvement, the accuracy of the FA-AdaBoost model improved the most, with the AUC value increasing from 0.8173 to 0.9597 and the prediction/area (P/A) value increasing from 3.156 to 10.765; the mineral targets predicted by this model occupy 8.63% of the study area and contain 92.86% of the known mineral deposits. The targets predicted by the improved machine learning models are consistent with the metallogenic geological characteristics, indicating that a swarm intelligence optimization algorithm combined with a machine learning model is an efficient method for mineral prospectivity mapping.
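A heavily simplified, firefly-inspired hyperparameter search, included only to illustrate the swarm-optimizer-plus-ML-model-plus-AUC pattern described above; it is not the paper's BA/FA implementation, and the parameter ranges, move rule, and stand-in model are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the evidence-layer feature set
X, y = make_classification(n_samples=300, n_features=12, random_state=0)
rng = np.random.default_rng(0)

def fitness(pos):
    """Objective: mean cross-validated ROC AUC for an SVC at (log10 C, log10 gamma)."""
    model = SVC(C=10 ** pos[0], gamma=10 ** pos[1])
    return cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()

swarm = rng.uniform(-3, 3, size=(8, 2))          # 8 "fireflies" in log-parameter space
brightness = np.array([fitness(p) for p in swarm])

for _ in range(10):                               # a few attraction-and-jitter iterations
    for i in range(len(swarm)):
        j = int(np.argmax(brightness))            # move toward the brightest firefly
        if brightness[j] > brightness[i]:
            swarm[i] += 0.5 * (swarm[j] - swarm[i]) + 0.1 * rng.normal(size=2)
            brightness[i] = fitness(swarm[i])

best = swarm[np.argmax(brightness)]
print("best C, gamma:", 10 ** best[0], 10 ** best[1], "AUC:", round(brightness.max(), 3))
```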

