Using Machine Learning Algorithms on data residing in SAP ERP Application to predict equipment failures

Asset-intensive organizations have long sought a framework that predicts equipment failure in time. Timely prediction of equipment failure substantially reduces direct and indirect costs, unexpected equipment shutdowns, accidents, and unwarranted emission risk. In this paper, the author proposes a model that predicts equipment failure using data from the SAP Plant Maintenance module. To achieve this, the author applied a data extraction algorithm and numerous data manipulations to prepare a classification data model consisting of maintenance record parameters such as spare parts usage, time elapsed since the last completed maintenance, and the period to the next scheduled maintenance. Using the unsupervised learning technique of clustering, the author observed a class-to-cluster evaluation accuracy of 80%. A classifier model was then trained using various machine learning (ML) algorithms and subsequently tested on mutually exclusive data sets with the objective of predicting equipment breakdown. The classifier model using ML algorithms such as Support Vector Machine (SVM) and Decision Tree (DT) returned an accuracy and true positive rate (TPR) of greater than 95% in predicting equipment failure. The proposed model acts as an Advanced Intelligent Control system contributing to Cyber-Physical Systems for asset-intensive organizations.
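
The abstract does not say how the class-to-cluster evaluation was computed. One common simplified reading is to label each cluster with its majority class and count the records that match. The sketch below assumes that reading; the function name and the toy records are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

def classes_to_clusters_accuracy(cluster_ids, class_labels):
    """Label each cluster with its majority class, then score the match."""
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, class_labels):
        members[cluster].append(label)
    # Each cluster contributes the count of its most frequent class.
    correct = sum(Counter(labels).most_common(1)[0][1] for labels in members.values())
    return correct / len(class_labels)

# Hypothetical toy data: 10 maintenance records grouped into 2 clusters.
clusters = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
labels = ["ok", "ok", "ok", "ok", "fail", "fail", "fail", "fail", "ok", "fail"]
accuracy = classes_to_clusters_accuracy(clusters, labels)
print(accuracy)
```

In this toy case each cluster holds four records of its majority class and one outlier, so eight of the ten records agree with their cluster label.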


Using Machine Learning Methods to Identify Particle Types from Doppler Lidar Measurements in Iceland

Remote Sensing ◽

10.3390/rs13132433 ◽

2021 ◽

Vol 13 (13) ◽

pp. 2433

Author(s):

Shu Yang ◽

Fengchao Peng ◽

Sibylle von Löwis ◽

Guðrún Nína Petersen ◽

David Christian Finger

Keyword(s):

Machine Learning ◽

Weather Conditions ◽

Dust Storms ◽

Machine Learning Algorithms ◽

Lidar Data ◽

Data Sets ◽

Doppler Lidar ◽

Lidar Measurements ◽

Using Data ◽

Filter Noise

Doppler lidars are used worldwide for wind monitoring and recently also for the detection of aerosols. Automatic algorithms that classify the lidar signals retrieved from lidar measurements are very useful for the users. In this study, we explore the value of machine learning to classify backscattered signals from Doppler lidars using data from Iceland. We combined supervised and unsupervised machine learning algorithms with conventional lidar data processing methods and trained two models to filter noise signals and classify Doppler lidar observations into different classes, including clouds, aerosols and rain. The results reveal a high accuracy for noise identification and aerosols and clouds classification. However, precipitation detection is underestimated. The method was tested on data sets from two instruments during different weather conditions, including three dust storms during the summer of 2019. Our results reveal that this method can provide an efficient, accurate and real-time classification of lidar measurements. Accordingly, we conclude that machine learning can open new opportunities for lidar data end-users, such as aviation safety operators, to monitor dust in the vicinity of airports.


Evaluation of Machine Learning Algorithms for Classification of Primary Biological Aerosol using a new UV-LIF spectrometer

10.5194/amt-2016-214 ◽

2016 ◽

Cited By ~ 1

Author(s):

Simon Ruske ◽

David O. Topping ◽

Virginia E. Foot ◽

Paul H. Kaye ◽

Warren R. Stanley ◽

...

Keyword(s):

Supervised Learning ◽

Fungal Spores ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Data Sets ◽

Agglomerative Clustering ◽

Real World Data ◽

Linear Discriminant ◽

Accuracy Of Measurements

Abstract. Characterisation of bio-aerosols has important implications within the Environment and Public Health sectors. Recent developments in Ultra-Violet Light Induced Fluorescence (UV-LIF) detectors such as the Wideband Integrated bio-aerosol Spectrometer (WIBS) and the newly introduced Multiparameter bio-aerosol Spectrometer (MBS) have allowed for the real-time collection of fluorescence, size and morphology measurements for the purpose of discriminating between bacteria, fungal spores and pollen. This new generation of instruments has enabled ever larger data sets to be compiled with the aim of studying more complex environments. In real-world data sets, particularly those from an urban environment, the population may be dominated by non-biological fluorescent interferents, bringing into question the accuracy of measurements of quantities such as concentrations. It is therefore imperative that we validate the performance of different algorithms which can be used for the task of classification. For unsupervised learning we test Hierarchical Agglomerative Clustering with various linkages. For supervised learning, ten methods were tested: decision trees; the ensemble methods Random Forests, Gradient Boosting and AdaBoost; two implementations of support vector machines, libsvm and liblinear; the Gaussian methods Gaussian naïve Bayes, quadratic and linear discriminant analysis; and finally the k-nearest neighbours algorithm. The methods were applied to two different data sets measured using a new Multiparameter bio-aerosol Spectrometer which provides multichannel UV-LIF fluorescence signatures for single airborne biological particles. Clustering in general performs slightly worse than the supervised learning methods, correctly classifying, at best, only 72.7 and 91.1 percent for the two data sets respectively. For supervised learning, the gradient boosting algorithm was found to be the most effective, on average correctly classifying 88.1 and 97.8 percent of the testing data respectively across the two data sets.
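
To make the phrase "Hierarchical Agglomerative Clustering with various linkages" concrete, here is a minimal sketch of the technique. It is a naive implementation for illustration only, not the code used in the study. The function name, the three linkage options shown, and the toy points are assumptions.

```python
import math

def agglomerative(points, n_clusters, linkage="single"):
    """Naive hierarchical agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        dists = [math.dist(points[i], points[j]) for i in a for j in b]
        if linkage == "single":      # closest pair of members
            return min(dists)
        if linkage == "complete":    # farthest pair of members
            return max(dists)
        if linkage == "average":     # mean over all member pairs
            return sum(dists) / len(dists)
        raise ValueError(f"unknown linkage: {linkage}")

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters under the chosen linkage and merge it.
        _, i, j = min(
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical 2-D points forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0), (5.0, 5.2)]
result = agglomerative(points, n_clusters=2, linkage="average")
print(result)
```

The linkage only changes how the distance between two clusters is measured. On well-separated data such as this toy set all three options recover the same two groups, while on overlapping real-world data they can disagree.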


Data Mining Classification Techniques for Diabetes Prediction

Qubahan Academic Journal ◽

10.48161/qaj.v1n2a55 ◽

2021 ◽

Vol 1 (2) ◽

pp. 125-133

Author(s):

Hindreen Rashid Abdulqadir ◽

Adnan Mohsin Abdulazeez ◽

Dilovan Assad Zebari

Keyword(s):

Data Mining ◽

Random Forest ◽

Drug Targets ◽

Data Extraction ◽

Critical Role ◽

Extraction Methods ◽

Machine Learning Algorithms ◽

Significant Feature ◽

Support Vector ◽

Diabetes Prediction

Diabetes may be predicted and prevented by exploring critical diabetes characteristics with computational data extraction methods. This study proposed a systems biology approach to the pathogenic process to identify essential biomarkers as drug targets. Because disease recognition and investigation require many details, data mining plays a critical role in healthcare. This study aims to evaluate the efficiency of classification-based methods. In addition, the researchers highlight the most widely employed techniques and the strategies with the best precision. Many analyses include multiple machine learning algorithms for various disease assessments and predictions. The detection and prediction of diseases is an aspect of classification and prediction. This paper estimates diabetes by its key features and also categorizes the relations between conflicting elements. The recursive random forest removal function provided a significant feature range. A Random Forest (RF) classifier was used for the diabetes estimate. RF offers a precision of 75.7813, greater than that of the Support Vector Machine (SVM), and may assist medical professionals in making care decisions.


Automatic Classification of Locomotion in Sport: A Case Study from Elite Netball.

International Journal of Computer Science in Sport ◽

10.2478/ijcss-2020-0007 ◽

2020 ◽

Vol 19 (2) ◽

pp. 1-20

Author(s):

P.D. Smith ◽

A. Bedford

Keyword(s):

Frequency Domain ◽

Classification Accuracy ◽

Work Load ◽

Machine Learning Algorithms ◽

Support Vector ◽

Test Case ◽

Data Sets ◽

Rotation Rates ◽

Movement Type ◽

Measurement Units

Abstract. In team sport, Human Activity Recognition (HAR) using inertial measurement units (IMUs) has been limited to athletes performing a set routine in a controlled environment, or to identifying a high-intensity event within periods of relatively low work load. The purpose of this study was to automatically classify locomotion in an elite sports match where subjects perform rapid changes in movement type, direction, and intensity. Using netball as a test case, six athletes wore a tri-axial accelerometer and gyroscope. Feature extraction of player acceleration and rotation rates was conducted in the time and frequency domains over a 1 s sliding window. Of the several machine learning algorithms applied, Support Vector Machines (SVM) had the highest classification accuracy (92.0%, Cohen’s kappa κ = 0.88). The highest accuracy was achieved using both accelerometer and gyroscope features mapped to the time and frequency domains. Time and frequency domain data sets achieved identical classification accuracy (91%). Model accuracy was greatest when excluding windows with two or more classes; however, detecting the athlete transitioning between locomotion classes was successful (69%). The proposed method demonstrated that HAR of locomotion is possible in elite sport and is a far more efficient process than traditional video coding methods.
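
As a rough illustration of feature extraction over a 1 s sliding window in both the time and frequency domains, the sketch below computes a mean, a standard deviation, and a dominant frequency per window. The abstract does not list the actual features, the sampling rate, or the window overlap, so the 0.5 s step, the feature names, and the synthetic sine input are all assumptions.

```python
import cmath
import math
import statistics

def window_features(signal, sample_rate_hz, window_s=1.0, step_s=0.5):
    """Yield time- and frequency-domain features for each sliding window."""
    size = int(window_s * sample_rate_hz)
    step = int(step_s * sample_rate_hz)
    for start in range(0, len(signal) - size + 1, step):
        window = signal[start:start + size]
        mean = statistics.fmean(window)
        # Naive DFT magnitudes of the mean-removed window (bins 1 .. size/2).
        mags = [
            abs(sum((x - mean) * cmath.exp(-2j * math.pi * k * n / size)
                    for n, x in enumerate(window)))
            for k in range(1, size // 2 + 1)
        ]
        peak_bin = max(range(len(mags)), key=mags.__getitem__) + 1
        yield {
            "mean": mean,                                  # time domain
            "std": statistics.pstdev(window),              # time domain
            "dominant_hz": peak_bin * sample_rate_hz / size,  # frequency domain
        }

# Hypothetical input: a 3 Hz sine sampled at 20 Hz for 2 seconds.
rate = 20
signal = [math.sin(2 * math.pi * 3 * n / rate) for n in range(2 * rate)]
features = list(window_features(signal, rate))
print(features[0])
```

Each feature dictionary would become one row of the training set, with one such row per axis of the accelerometer and gyroscope in a real pipeline. A production version would use an FFT rather than the quadratic-time DFT shown here.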


Application of Classification Models to Pharyngeal High-Resolution Manometry

Journal of Speech Language and Hearing Research ◽

10.1044/1092-4388(2011/11-0088) ◽

2012 ◽

Vol 55 (3) ◽

pp. 892-902 ◽

Cited By ~ 26

Author(s):

Jason D. Mielens ◽

Matthew R. Hoffman ◽

Michelle R. Ciucci ◽

Timothy M. McCulloch ◽

Jack J. Jiang

Keyword(s):

High Resolution ◽

Data Extraction ◽

Upper Esophageal Sphincter ◽

Support Vector ◽

Data Sets ◽

Classification Models ◽

High Resolution Manometry ◽

Individual Contributions ◽

The Individual ◽

Time Required

Purpose: The authors present 3 methods of performing pattern recognition on spatiotemporal plots produced by pharyngeal high-resolution manometry (HRM). Method: Classification models, including the artificial neural networks (ANNs) multilayer perceptron (MLP) and learning vector quantization (LVQ), as well as support vector machines (SVM), were evaluated for their ability to identify disordered swallowing. Data were collected from 12 control subjects and 13 subjects with swallowing disorders; for this experiment, these subjects swallowed 5-ml water boluses. Following extraction of relevant parameters, a subset of the data was used to train the models, and the remaining swallows were then independently classified by the networks. Results: All methods produced high average classification accuracies, with MLP, SVM, and LVQ achieving accuracies of 96.44%, 91.03%, and 85.39%, respectively. When evaluating the individual contributions of each parameter and groups of parameters to the classification accuracy, parameters pertaining to the upper esophageal sphincter were most valuable. Conclusion: Classification models show high accuracy in segregating HRM data sets and represent 1 method of facilitating application of HRM to the clinical setting by eliminating the time required for some aspects of data extraction and interpretation.


Artificial Intelligence in the Fight Against COVID-19: Scoping Review (Preprint)

10.2196/preprints.20756 ◽

2020 ◽

Author(s):

Alaa Abd-Alrazaq ◽

Mohannad Alajlani ◽

Dari Alhuwail ◽

Jens Schneider ◽

Saif Al-Kuwari ◽

...

Keyword(s):

Artificial Intelligence ◽

Scoping Review ◽

Data Extraction ◽

Length Of Hospital Stay ◽

Support Vector ◽

Data Sets ◽

Daily Lives ◽

Scoping Reviews ◽

Target Disease ◽

Meta Analyses

BACKGROUND: In December 2019, COVID-19 broke out in Wuhan, China, leading to national and international disruptions in health care, business, education, transportation, and nearly every aspect of our daily lives. Artificial intelligence (AI) has been leveraged amid the COVID-19 pandemic; however, little is known about its use for supporting public health efforts. OBJECTIVE: This scoping review aims to explore how AI technology is being used during the COVID-19 pandemic, as reported in the literature. Thus, it is the first review that describes and summarizes features of the identified AI techniques and data sets used for their development and validation. METHODS: A scoping review was conducted following the guidelines of PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews). We searched the most commonly used electronic databases (eg, MEDLINE, EMBASE, and PsycInfo) between April 10 and 12, 2020. These terms were selected based on the target intervention (ie, AI) and the target disease (ie, COVID-19). Two reviewers independently conducted study selection and data extraction. A narrative approach was used to synthesize the extracted data. RESULTS: We considered 82 studies out of the 435 retrieved studies. The most common use of AI was diagnosing COVID-19 cases based on various indicators. AI was also employed in drug and vaccine discovery or repurposing and for assessing their safety. Further, the included studies used AI for forecasting the epidemic development of COVID-19 and predicting its potential hosts and reservoirs. Researchers used AI for patient outcome–related tasks such as assessing the severity of COVID-19, predicting mortality risk, its associated factors, and the length of hospital stay. AI was used for infodemiology to raise awareness to use water, sanitation, and hygiene. The most prominent AI technique used was convolutional neural network, followed by support vector machine.
CONCLUSIONS: The included studies showed that AI has the potential to fight against COVID-19. However, many of the proposed methods are not yet clinically accepted. Thus, the most rewarding research will be on methods promising value beyond COVID-19. More efforts are needed for developing standardized reporting protocols or guidelines for studies on AI.


Detecting Driver’s Fatigue, Distraction and Activity Using a Non-Intrusive Ai-Based Monitoring System

Journal of Artificial Intelligence and Soft Computing Research ◽

10.2478/jaiscr-2019-0007 ◽

2019 ◽

Vol 9 (4) ◽

pp. 247-266 ◽

Cited By ~ 3

Author(s):

Miguel Costa ◽

Daniel Oliveira ◽

Sandro Pinto ◽

Adriano Tavares

Keyword(s):

Road Safety ◽

Data Extraction ◽

Full Range ◽

Autonomous Driving ◽

Driver Distraction ◽

Machine Learning Algorithms ◽

Support Vector ◽

Road Accidents ◽

Vector Machines ◽

Driving Task

Abstract. The lack of attention during the driving task is considered a major risk factor for fatal road accidents around the world. Despite the ever-growing trend towards autonomous driving, which promises to bring greater road-safety benefits, today’s vehicles still feature only partial and conditional automation, demanding frequent driver action. Moreover, the monotony of such a scenario may induce fatigue or distraction, reducing driver awareness and impairing the driver’s ability to regain control of the vehicle. To address this challenge, we introduce a non-intrusive system to monitor the driver in terms of fatigue, distraction, and activity. The proposed system explores state-of-the-art sensors, as well as machine learning algorithms for data extraction and modeling. In the domain of fatigue supervision, we propose a feature set that considers the vehicle’s automation level. In terms of distraction assessment, the contributions concern (i) a holistic system that covers the full range of driver distraction types and (ii) a monitoring unit that predicts the driver activity causing the faulty behavior. Experiments comparing the performance of Support Vector Machines against Decision Trees indicated that our system can predict the driver’s state with an accuracy ranging from 89% to 93%.


Diagnosis of Various Thyroid Ailments using Data Mining Classification Techniques

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit195119 ◽

2019 ◽

pp. 131-136

Author(s):

Umar Sidiq ◽

Syed Mutahar Aaqib ◽

Rafi Ahmad Khan

Keyword(s):

Data Mining ◽

Decision Tree ◽

Research Work ◽

Support Vector ◽

Data Sets ◽

Data Mining Technique ◽

K Nearest Neighbors ◽

Data Set ◽

Classification Techniques ◽

Using Data

Classification is one of the most important supervised learning data mining techniques used to classify predefined data sets. It is mainly used in the healthcare sector for decision making, diagnosis systems and giving better treatment to patients. In this work, the data set is taken from a recognized lab in Kashmir. The entire research work is carried out with ANACONDA3-5.2.0, an open source platform, under a Windows 10 environment. An experimental study is carried out using classification techniques such as k nearest neighbors, support vector machine, decision tree and naïve Bayes. The decision tree obtained the highest accuracy, 98.89%, over the other classification techniques.
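
Of the techniques listed, k nearest neighbors is the simplest to sketch without extra libraries. The example below is a minimal illustration of the classify-and-score workflow the abstract describes, not the study's implementation. The two-feature records, the class names, and the choice of k = 3 are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict by majority vote among the k nearest training points."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Hypothetical two-feature records (not the Kashmir lab data).
train_x = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (4.0, 4.2), (4.1, 3.9), (3.9, 4.0)]
train_y = ["normal", "normal", "normal", "disorder", "disorder", "disorder"]
test_x = [(1.1, 1.0), (4.0, 4.0), (0.9, 0.8)]
test_y = ["normal", "disorder", "normal"]

# Accuracy is the share of held-out records whose prediction matches the label.
predictions = [knn_predict(train_x, train_y, q) for q in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```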


Comparative Study of Various Machine Learning Algorithms for Prediction of Insomnia

Advances in Medical Technologies and Clinical Practice - Advanced Classification Techniques for Healthcare Analysis ◽

10.4018/978-1-5225-7796-6.ch011 ◽

2019 ◽

pp. 234-257 ◽

Cited By ~ 5

Author(s):

Ravinder Ahuja ◽

Vishal Vivek ◽

Manika Chandna ◽

Shivani Virmani ◽

Alisha Banga

Keyword(s):

Machine Learning ◽

Heart Diseases ◽

False Positive Rate ◽

Learning Algorithms ◽

True Positive Rate ◽

Machine Learning Algorithms ◽

Support Vector ◽

Mobility Problem ◽

Positive Rate ◽

F Measure

An early diagnosis of insomnia can prevent further medical problems such as anger issues, heart diseases, anxiety, depression, and hypertension. Fifteen machine learning algorithms have been applied and 14 leading factors have been taken into consideration for predicting insomnia. Seven performance parameters (accuracy, kappa, true positive rate, false positive rate, precision, f-measure, and AUC) are used, and the implementation is written in Python. The support vector machine gives the highest performance of all the algorithms, with an accuracy of 91.6%, an f-measure of 92.13, and a kappa of 0.83. Further, SVM is applied to another dataset of 100 patients, giving an accuracy of 92%. In addition, the variable importance of CART, C5.0, decision tree, random forest, adaptive boost, and XGBoost is analyzed. The analysis shows that insomnia primarily depends on three factors: vision problems, mobility problems, and sleep disorders. This chapter mainly examines the usage and effectiveness of machine learning algorithms in insomnia prediction.
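
Six of the seven performance parameters named above follow directly from a binary confusion matrix. The sketch below shows the standard formulas. The function name and the counts are hypothetical and do not reproduce the chapter's results. AUC is omitted because it needs ranked scores rather than counts.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common classification metrics from a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    tpr = tp / (tp + fn)          # true positive rate (recall)
    fpr = fp / (fp + tn)          # false positive rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tpr / (precision + tpr)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "tpr": tpr,
        "fpr": fpr,
        "precision": precision,
        "f_measure": f_measure,
    }

# Hypothetical counts for 100 patients.
metrics = binary_metrics(tp=45, fp=5, fn=5, tn=45)
print(metrics)
```

With balanced classes, as in these counts, chance agreement is 0.5, so an accuracy of 0.9 corresponds to a kappa of 0.8.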


An Innovative and Implementable Approach for Online Fake News Detection Through Machine Learning

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.8639 ◽

2020 ◽

Vol 17 (1) ◽

pp. 130-135

Author(s):

T. Anushaya Prabha ◽

T. Aisuwariya ◽

M. Vamsee Krishna Kiran ◽

Shriram K. Vasudevan

Keyword(s):

Machine Learning ◽

Language Model ◽

Machine Learning Algorithms ◽

Machine Language ◽

Support Vector ◽

Fake News ◽

Election Cycle ◽

Data Set ◽

The Usa ◽

Identical Number

One should recall that the 2015 and 2016 U.S. presidential election cycle was marked by numerous scandals triggered by forged news articles that spread through social media such as Twitter and Facebook. When it was found that these articles were purposefully uploaded for financial and political gain, it became evident that bogus news has to be identified and removed to prevent the public from being deceived for someone’s personal gain. This study builds a supervised machine learning model to detect the fake news articles published during the 2015 and 2016 U.S. election cycle. The data set contains an identical number of bogus and factual news articles. A standard set of machine learning algorithms (K-Nearest Neighbors, Support Vector Machine, Naive Bayes and Passive Aggressive Classifier) is trained using either the title or the content of the article. The results show that the PAC classifier produces the highest accuracy of 94.63% over the other three classifiers using diagram term frequency.
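
For readers unfamiliar with the Passive Aggressive Classifier, the sketch below shows the basic PA update rule applied to term-frequency features. It illustrates the general technique only, not the study's pipeline. The n-gram size parameter `n`, the whitespace tokenisation, the number of epochs, and the toy headlines are assumptions.

```python
from collections import Counter

def term_frequencies(text, n=1):
    """Count word n-grams in a lower-cased, whitespace-tokenised text."""
    words = text.lower().split()
    return Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))

def train_passive_aggressive(samples, epochs=5):
    """Basic PA update: w += tau * y * x, with tau = hinge_loss / ||x||^2."""
    weights = Counter()
    for _ in range(epochs):
        for features, label in samples:  # label is +1 (factual) or -1 (fake)
            margin = label * sum(weights[t] * v for t, v in features.items())
            loss = max(0.0, 1.0 - margin)
            if loss > 0.0:  # passive when the margin is met, aggressive otherwise
                tau = loss / sum(v * v for v in features.values())
                for term, value in features.items():
                    weights[term] += tau * label * value
    return weights

def predict(weights, features):
    score = sum(weights[t] * v for t, v in features.items())
    return 1 if score >= 0 else -1

# Hypothetical toy headlines; real training would use the article title or body.
samples = [
    (term_frequencies("shocking secret cure doctors hate"), -1),
    (term_frequencies("you will not believe this shocking trick"), -1),
    (term_frequencies("senate passes budget bill after debate"), 1),
    (term_frequencies("court rules on election recount request"), 1),
]
weights = train_passive_aggressive(samples)
fake_guess = predict(weights, term_frequencies("shocking secret trick"))
real_guess = predict(weights, term_frequencies("senate debate on budget"))
print(fake_guess, real_guess)
```

The classifier leaves the weights untouched when an example is already classified with a margin of at least 1, and otherwise makes the smallest change that restores that margin. This keeps training cheap on large text corpora.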
