A Case Study on Machine Learning Applications and Performance Improvement in Learning Algorithm

2016 ◽  
Vol 14 (2) ◽  
pp. 245-258 ◽  
Author(s):  
Hohyun Lee ◽  
Seung-Hyun Chung ◽  
Eun-Jung Choi
Friction ◽  
2021 ◽  
Author(s):  
Vigneashwara Pandiyan ◽  
Josef Prost ◽  
Georg Vorlaufer ◽  
Markus Varga ◽  
Kilian Wasmer

Abstract: Functional surfaces in relative contact and motion are prone to wear and tear, resulting in loss of efficiency and performance of the workpieces/machines. Wear occurs in the form of adhesion, abrasion, scuffing, galling, and scoring between contacts. However, the rate of the wear phenomenon depends primarily on the physical properties and the surrounding environment. Monitoring the integrity of surfaces by offline inspection leads to significant wasted machine time. A potential alternative to the offline inspection currently practiced in industry is the analysis of sensor signatures capable of capturing the wear state and correlating it with the wear phenomenon, followed by in situ classification using a state-of-the-art machine learning (ML) algorithm. Though this technique is better than offline inspection, it has inherent disadvantages for training the ML models. Ideally, supervised training of ML models requires the classes considered for classification to be of equal weight to avoid bias. Collecting such a dataset is very cumbersome and expensive in practice, as in real industrial applications the malfunction period is minimal compared to normal operation. Furthermore, classification models cannot separate new, unfamiliar wear phenomena from the normal regime. As a promising alternative, in this work we propose a methodology able to differentiate the abnormal regimes, i.e., wear phenomenon regimes, from the normal regime. This is carried out by familiarizing the ML algorithm only with the distribution of the acoustic emission (AE) signals, captured using a microphone, related to the normal regime. As a result, the ML algorithm can detect whether a new, unseen signal overlaps with the learnt distribution. To achieve this goal, a generative convolutional neural network (CNN) architecture based on a variational autoencoder (VAE) is built and trained. During validation of the proposed CNN architecture, we were able to identify acoustic signals corresponding to the normal and abnormal wear regimes with accuracies of 97% and 80%, respectively. Hence, our approach shows very promising results for in situ and real-time condition monitoring, or even wear prediction, in tribological applications.
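As a rough illustration of this idea, the sketch below (in Python/PyTorch, which the abstract does not specify) trains a small 1D-convolutional VAE only on windows of normal-regime AE signals and scores new windows by reconstruction error. The window length, layer sizes, and thresholding rule are illustrative assumptions, not the authors' exact architecture.

```python
# Sketch: train a 1D-convolutional VAE on normal-regime AE windows only;
# at inference, windows whose reconstruction error exceeds a threshold
# estimated from normal data are flagged as abnormal wear.
import torch
import torch.nn as nn
import torch.nn.functional as F

WINDOW = 1024   # assumed AE window length (samples)
LATENT = 16     # assumed latent dimensionality

class ConvVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv1d(1, 16, 9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, 9, stride=4, padding=4), nn.ReLU(),
            nn.Flatten(),
        )
        feat = 32 * (WINDOW // 16)
        self.mu = nn.Linear(feat, LATENT)
        self.logvar = nn.Linear(feat, LATENT)
        self.dec_in = nn.Linear(LATENT, feat)
        self.dec = nn.Sequential(
            nn.ConvTranspose1d(32, 16, 8, stride=4, padding=2), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, 8, stride=4, padding=2),
        )

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        h = self.dec_in(z).view(-1, 32, WINDOW // 16)
        return self.dec(h), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction term plus KL divergence to the unit Gaussian prior.
    rec = F.mse_loss(recon, x, reduction="mean")
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

def anomaly_scores(model, windows):
    """Per-window reconstruction error; high scores suggest abnormal wear."""
    model.eval()
    with torch.no_grad():
        recon, _, _ = model(windows)
        return ((recon - windows) ** 2).mean(dim=(1, 2))
```

In use, a decision threshold would be set from the score distribution of held-out normal-regime windows (e.g., a high percentile), so that only signals falling outside the learnt distribution are reported as wear events.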


2017 ◽  
Author(s):  
Aymen A. Elfiky ◽  
Maximilian J. Pany ◽  
Ravi B. Parikh ◽  
Ziad Obermeyer

Abstract
Background: Cancer patients who die soon after starting chemotherapy incur the costs of treatment without its benefits. Accurately predicting mortality risk from chemotherapy is important, but few patient data-driven tools exist. We sought to create and validate a machine learning model predicting mortality for patients starting new chemotherapy.
Methods: We obtained electronic health records for patients treated at a large cancer center (26,946 patients; 51,774 new regimens) over 2004-14, linked to Social Security data for date of death. The model was derived using 2004-11 data, and performance was measured on non-overlapping 2012-14 data.
Findings: 30-day mortality from chemotherapy start was 2.1%. Common cancers included breast (21.1%), colorectal (19.3%), and lung (18.0%). Model predictions were accurate for all patients (AUC 0.94). Predictions for patients starting palliative chemotherapy (46.6% of regimens), for whom prognosis is particularly important, remained highly accurate (AUC 0.92). To illustrate model discrimination, we ranked patients initiating palliative chemotherapy by model-predicted mortality risk and calculated observed mortality by risk decile. 30-day mortality in the highest-risk decile was 22.6%; in the lowest-risk decile, no patients died. Predictions remained accurate across all primary cancers, stages, and chemotherapies, even for clinical trial regimens that first appeared in years after the model was trained (AUC 0.94). The model also performed well for prediction of 180-day mortality (AUC 0.87; mortality 74.8% in the highest-risk decile vs. 0.2% in the lowest). Predictions were more accurate than estimates from randomized trials of individual chemotherapies or from SEER.
Interpretation: A machine learning algorithm accurately predicted short-term mortality in patients starting chemotherapy using EHR data. Further research is necessary to determine generalizability and the feasibility of applying this algorithm in clinical settings.
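The evaluation described above, a temporal derivation/validation split, AUC on the held-out years, and observed mortality by predicted-risk decile, can be sketched as follows. The gradient-boosting learner and the column names (feature_cols, start_year, died_30d) are assumptions for illustration; the abstract does not state which classifier was used.

```python
# Hedged sketch: fit on pre-2012 regimens, score 2012-14 regimens,
# report AUC, and tabulate observed 30-day mortality by risk decile.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

def evaluate_mortality_model(df, feature_cols):
    train = df[df["start_year"] <= 2011]    # derivation cohort (2004-11)
    test = df[df["start_year"] >= 2012]     # non-overlapping validation cohort (2012-14)

    model = GradientBoostingClassifier()
    model.fit(train[feature_cols], train["died_30d"])

    risk = model.predict_proba(test[feature_cols])[:, 1]
    auc = roc_auc_score(test["died_30d"], risk)

    # Observed mortality by decile of predicted risk (model discrimination).
    deciles = pd.qcut(risk, 10, labels=False, duplicates="drop")
    mortality_by_decile = test["died_30d"].groupby(deciles).mean()
    return auc, mortality_by_decile
```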


2020 ◽  
Vol 10 (1) ◽  
pp. 1-12
Author(s):  
Noura A. AlSomaikhi ◽  
Zakarya A. Alzamil

Microblogging platforms such as Twitter have become popular interaction media that are widely used for different daily purposes, such as communication and knowledge sharing. Understanding the behaviors and interests of these platforms' users is a challenge whose solution can help in areas such as recommendation and filtering. In this article, an approach is proposed for classifying Twitter users with respect to their interests based on their Arabic tweets. A Multinomial Naïve Bayes machine learning algorithm is used for this classification. The proposed approach has been developed as a web-based software system that is integrated with Twitter through the Twitter API. An experimental study on Arabic tweets was conducted with the proposed system as a case study.
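A minimal sketch of the classification step, assuming scikit-learn and a simple bag-of-words representation of each user's concatenated Arabic tweets; the toy tweets and interest labels below are placeholders, not the paper's data or preprocessing.

```python
# Sketch: Multinomial Naive Bayes over word counts of a user's Arabic tweets.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training examples: the concatenated tweets of one user plus an assumed interest label.
tweets = [
    "مباراة كرة القدم اليوم",       # toy sports-related text
    "تطبيق جديد للهاتف الذكي",      # toy technology-related text
]
labels = ["sports", "technology"]    # assumed interest categories

classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(tweets, labels)

# Predict the interest category for a new user's tweets fetched via the Twitter API.
predicted_interest = classifier.predict(["نتيجة مباراة كرة القدم"])
print(predicted_interest)  # -> ['sports']
```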


Author(s):  
Anitha Elavarasi S. ◽  
Jayanthi J.

Machine learning enables systems to learn automatically, without human intervention, and to improve their performance with the help of previous experience. Such systems can access data and use it to learn by themselves. Even though many algorithms have been developed to solve machine learning problems, it is difficult to handle all kinds of input data in order to arrive at accurate decisions. Domain knowledge of statistical science, probability, logic, mathematical optimization, reinforcement learning, and control theory plays a major role in developing machine learning based algorithms. The key considerations in selecting a suitable programming language for implementing a machine learning algorithm include performance, concurrency, application development, and learning curve. This chapter deals with a few of the top programming languages used for developing machine learning applications: Python, R, and Java. The top three programming languages preferred by data scientists are (1) Python, used by more than 57%; (2) R, used by more than 31%; and (3) Java, used by 17% of data scientists.


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Peter Appiahene ◽  
Yaw Marfo Missah ◽  
Ussiph Najim

The financial crisis that hit Ghana from 2015 to 2018 has raised various issues with respect to the efficiency of banks and the safety of depositors' funds in the banking industry. As part of measures to improve the banking sector and restore customers' confidence, efficiency and performance analysis in the banking industry has become a pressing issue, because stakeholders need to detect the underlying causes of inefficiencies within the banking industry. Nonparametric methods such as Data Envelopment Analysis (DEA) have been suggested in the literature as a good measure of banks' efficiency and performance. Machine learning algorithms have also been viewed as good tools for estimating various nonparametric and nonlinear problems. This paper combines DEA with three machine learning approaches to evaluate bank efficiency and performance using 444 Ghanaian bank branches as Decision Making Units (DMUs). The results were compared with the corresponding efficiency ratings obtained from the DEA. Finally, the prediction accuracies of the three machine learning models were compared. The results suggested that the decision tree (DT) with the C5.0 algorithm provided the best predictive model: it had 100% accuracy in predicting the 134-branch holdout dataset (30% of the branches) with a P value of 0.00. The DT was followed closely by the random forest algorithm, with a predictive accuracy of 98.5% and a P value of 0.00, and finally the neural network (86.6% accuracy) with a P value of 0.66. The study concluded that banks in Ghana can use the results of this study to predict their respective efficiencies. All experiments were performed within a simulation environment and conducted in RStudio using R code.
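The second stage of such a pipeline, predicting a DEA-derived efficiency label from branch inputs and outputs on a 30% holdout, might look like the following sketch. scikit-learn's CART decision tree stands in for the C5.0 algorithm the study ran in R, and the feature and label column names are assumptions.

```python
# Sketch: decision tree predicting a DEA-derived efficiency label on a 30% holdout.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

def predict_dea_efficiency(branches: pd.DataFrame) -> float:
    # Assumed branch-level DEA inputs/outputs used as predictive features.
    features = ["deposits", "operating_cost", "staff", "loans", "income"]

    X_train, X_test, y_train, y_test = train_test_split(
        branches[features],
        branches["dea_efficient"],          # label derived from DEA efficiency scores
        test_size=0.30, random_state=0,
        stratify=branches["dea_efficient"],
    )

    tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
    return accuracy_score(y_test, tree.predict(X_test))
```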


2019 ◽  
Vol 9 (15) ◽  
pp. 3037 ◽  
Author(s):  
Isaac Machorro-Cano ◽  
Giner Alor-Hernández ◽  
Mario Andrés Paredes-Valverde ◽  
Uriel Ramos-Deonati ◽  
José Luis Sánchez-Cervantes ◽  
...  

Overweight and obesity are affecting productivity and quality of life worldwide. The Internet of Things (IoT) makes it possible to interconnect, detect, identify, and process data between objects or services to fulfill a common objective. The main advantages of IoT in healthcare are the monitoring, analysis, diagnosis, and control of conditions such as overweight and obesity and the generation of recommendations to prevent them. However, IoT devices have limited resources, so other alternatives, such as machine learning, must be considered to analyze the data generated by monitoring, analysis, diagnosis, control, and recommendation generation. This work presents PISIoT: a machine learning and IoT-based smart health platform for the prevention, detection, treatment, and control of overweight and obesity and other associated conditions or health problems. The Weka API and the J48 machine learning algorithm were used to identify critical variables and classify patients, while Apache Mahout and RuleML were used to generate medical recommendations. Finally, to validate the PISIoT platform, we present a case study on the prevention of myocardial infarction in elderly patients with obesity by monitoring biomedical variables.
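As an illustration of the classification step PISIoT performs with Weka's J48, the sketch below uses scikit-learn's CART decision tree as a Python stand-in to classify patients from monitored biomedical variables and to rank the critical variables; the variable names and risk labels are assumptions.

```python
# Sketch: decision tree classifying patients and ranking critical biomedical variables.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

def classify_patients(records: pd.DataFrame):
    # Assumed biomedical variables collected by the IoT monitoring layer.
    variables = ["bmi", "heart_rate", "systolic_bp", "glucose", "age"]

    tree = DecisionTreeClassifier(max_depth=4, random_state=0)
    tree.fit(records[variables], records["risk_level"])   # e.g. normal / overweight / obese

    # Rank the critical variables by their contribution to the tree's splits.
    importance = pd.Series(tree.feature_importances_, index=variables)
    return tree, importance.sort_values(ascending=False)
```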


Author(s):  
John O’Donnell ◽  
Hwan-Sik Yoon

Abstract: In recent years, there has been a growing interest in the connectivity of vehicles. This connectivity allows for the monitoring and analysis of large amounts of sensor data from vehicles during their normal operation. In this paper, an approach is proposed for analyzing such data to determine a vehicle component's remaining useful life, or time-to-failure (TTF). The collected data are first used to determine the type of performance degradation and then to train a regression model that predicts the health condition and performance degradation rate of the component using a machine learning algorithm. When new data are later collected for the same component in a different system, the trained model can be used to estimate the component's time-to-failure based on the predicted health condition and performance degradation rate. To validate the proposed approach, a quarter-car model is simulated, and a machine learning algorithm is applied to determine the time-to-failure of a failing shock absorber. The results show that a tapped-delay nonlinear autoregressive network with exogenous input (NARX) can accurately predict the health condition and degradation rate of the shock absorber and can estimate the component's time-to-failure. To the best of the authors' knowledge, this research is the first attempt to determine a component's time-to-failure using a machine learning algorithm.
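A minimal sketch of the idea, not the authors' exact NARX network: tapped-delay copies of an exogenous sensor signal and of the health index feed a small neural regressor that predicts the current health condition, and the estimated degradation rate is extrapolated to a failure threshold to obtain a TTF. The lag count, failure threshold, and signal names are assumptions.

```python
# Sketch: NARX-style regression with lagged inputs, followed by TTF extrapolation.
import numpy as np
from sklearn.neural_network import MLPRegressor

LAGS = 5                   # assumed tapped-delay length
FAILURE_THRESHOLD = 0.2    # assumed health index at which the component is considered failed

def make_lagged(exog: np.ndarray, health: np.ndarray):
    """Stack LAGS past values of the exogenous signal and the health index."""
    rows = []
    for t in range(LAGS, len(health)):
        rows.append(np.concatenate([exog[t - LAGS:t], health[t - LAGS:t]]))
    return np.asarray(rows), health[LAGS:]

def estimate_ttf(exog: np.ndarray, health: np.ndarray, dt: float) -> float:
    X, y = make_lagged(exog, health)
    model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
    model.fit(X, y)

    current = model.predict(X[-1:])[0]                  # predicted health condition now
    rate = (y[-1] - y[0]) / ((len(y) - 1) * dt)         # average degradation rate
    # Time until the health index reaches the failure threshold (only if degrading).
    return (FAILURE_THRESHOLD - current) / rate if rate < 0 else np.inf
```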

