Optimal Model Selection of Support Vector Classifiers for Rolling Element Bearings Fault Detection Using Statistical Time-Domain Features

Author(s):  
Z. Hameed ◽  
Y. S. Hong ◽  
Y. M. Cho ◽  
S. H. Ahn ◽  
C. K. Song

Support Vector Machines (SVMs) are now used extensively in pattern recognition and regression analysis and have become a popular choice for both supervised and unsupervised machine learning. An SVM maps the data into a higher-dimensional feature space using a kernel function and then maximizes the margin of the separating hyperplane, so that the hyperplane classifies the data into normal and faulty states. With large amounts of raw input data, however, obtaining the desired results from an SVM in a reasonable time is computationally cumbersome. To overcome this difficulty, in this work we employ statistical time-domain features, namely Root Mean Square (RMS), variance, skewness, and kurtosis, as pre-processors for the raw input data. Various combinations of these time-domain signals and features are then used as inputs, their effects on optimal model selection are investigated thoroughly, and an optimal combination is suggested. The procedure presented here is computationally far less expensive than processing the raw input data directly for model selection. The proposed method is straightforward to implement, and with it an impending fault or abnormal machine behavior can be detected in advance.
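The four pre-processing features named in the abstract can be sketched in a few lines. This is a minimal illustration assuming numpy/scipy; the function name and the synthetic vibration segment are illustrative, not from the paper's dataset.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def time_domain_features(signal):
    """Compute the four statistical time-domain features used as pre-processors."""
    return {
        "rms": np.sqrt(np.mean(signal ** 2)),   # Root Mean Square
        "variance": np.var(signal),
        "skewness": skew(signal),               # asymmetry of the amplitude distribution
        "kurtosis": kurtosis(signal),           # "peakedness"; sensitive to bearing impulses
    }

# Example: a synthetic stand-in for one vibration segment
rng = np.random.default_rng(0)
segment = rng.normal(0.0, 1.0, 2048)
features = time_domain_features(segment)
```

Feeding these four scalars per segment into the SVM, instead of the raw samples, is what reduces the input dimensionality and hence the training cost.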

2021 ◽  
Author(s):  
S. H. Al Gharbi ◽  
A. A. Al-Majed ◽  
A. Abdulraheem ◽  
S. Patil ◽  
S. M. Elkatatny

Abstract Due to high demand for energy, oil and gas companies started to drill wells in remote areas and unconventional environments. This raised the complexity of drilling operations, which were already challenging and complex. To adapt, drilling companies expanded their use of the real-time operation center (RTOC) concept, in which real-time drilling data are transmitted from remote sites to companies’ headquarters. In an RTOC, groups of subject matter experts monitor the drilling live and provide real-time advice to improve operations. With the increase in drilling operations, processing the volume of generated data is beyond human capability, limiting the RTOC's impact on certain components of drilling operations. To overcome this limitation, artificial intelligence and machine learning (AI/ML) technologies were introduced to monitor and analyze the real-time drilling data, discover hidden patterns, and provide fast decision-support responses. AI/ML technologies are data-driven, and their quality relies on the quality of the input data: if the quality of the input data is good, the generated output will be good; if not, the generated output will be bad. Unfortunately, due to the harsh environments of drilling sites and the transmission setups, not all of the drilling data is good, which negatively affects the AI/ML results. The objective of this paper is to utilize AI/ML technologies to improve the quality of real-time drilling data. The paper fed a large real-time drilling dataset, consisting of over 150,000 raw data points, into Artificial Neural Network (ANN), Support Vector Machine (SVM) and Decision Tree (DT) models. The models were trained to distinguish valid from invalid data points. The confusion matrix was used to evaluate the different AI/ML models, including different internal architectures. Despite its slowness, the ANN achieved the best result, with an accuracy of 78%, compared to 73% and 41% for the DT and SVM, respectively.
The paper concludes by presenting a process for using AI technology to improve real-time drilling data quality. To the authors' knowledge, based on the literature in the public domain, this paper is among the first to compare multiple AI/ML techniques for quality improvement of real-time drilling data. The paper provides a guide for improving the quality of real-time drilling data.
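The train-and-score loop over classifiers with a confusion matrix, as the abstract describes, can be sketched with scikit-learn. The synthetic features and the valid/invalid labeling rule below are illustrative stand-ins for the paper's 150,000-point drilling dataset.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Synthetic stand-in for real-time drilling channels (e.g. hook load, RPM, torque)
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # 1 = valid, 0 = invalid (toy rule)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for model in (DecisionTreeClassifier(random_state=0), SVC()):
    model.fit(X_tr, y_tr)                 # train on labeled valid/invalid points
    pred = model.predict(X_te)
    cm = confusion_matrix(y_te, pred)     # rows: true class, columns: predicted class
    acc = accuracy_score(y_te, pred)
```

The confusion matrix is what exposes asymmetric errors (e.g. invalid points passed through as valid) that a single accuracy number hides.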


2020 ◽  
Vol 10 (20) ◽  
pp. 7153 ◽  
Author(s):  
Ehsan Harirchian ◽  
Vandana Kumari ◽  
Kirti Jadhav ◽  
Rohan Raj Das ◽  
Shahla Rasulzade ◽  
...  

Although averting a seismic disturbance and its physical, social, and economic disruption is practically impossible, advances in computational science and numerical modeling can equip humanity to predict its severity, understand its outcomes, and prepare for post-disaster management. Many buildings in developed metropolitan areas are aged and still in service. These buildings were also designed before national seismic codes were established or without the introduction of construction regulations. In that case, risk reduction is significant for developing alternatives and designing suitable models to enhance the existing structures’ performance. Such models can classify risks and casualties related to possible earthquakes for emergency preparation. Thus, it is crucial to recognize structures that are susceptible to earthquake vibrations and need to be prioritized for retrofitting. However, studying each building’s behavior under seismic actions through full structural analysis is unrealistic because of the rigorous computations, long duration, and substantial expenditure involved. This calls for a simple, reliable, and accurate process known as Rapid Visual Screening (RVS), which serves as a primary screening platform based on an optimum number of seismic parameters and predetermined damage conditions for structures. In this study, the damage classification technique was studied, and the efficacy of the Machine Learning (ML) method in damage prediction via a Support Vector Machine (SVM) model was explored. The ML model was trained and tested separately on damage data from four different earthquakes, namely those in Ecuador, Haiti, Nepal, and South Korea. Each dataset consists of a varying number of input records and eight performance modifiers.
Based on the study and the results, the SVM-based ML model classifies the given input data into the corresponding damage classes and performs well in the hazard safety evaluation of buildings.


2019 ◽  
Vol 33 (25) ◽  
pp. 1950303 ◽  
Author(s):  
Bagesh Kumar ◽  
O. P. Vyas ◽  
Ranjana Vyas

Machine learning (ML) represents the automated extraction of models (or patterns) from data. All ML techniques start with data. These data describe the desired relationship between the ML model's inputs and outputs, the latter of which may be implicit for unsupervised approaches. Equivalently, these data encode the requirements we wish to embody in our ML model. Thereafter, model selection comes into play to choose an efficient ML model. In this paper, we focus on various ML models that are extensions of the well-known Support Vector Machine (SVM). The main objective of this paper is to compare existing ML models with the variants of SVM. Limitations of the existing techniques, including the variants of SVM, are then discussed. Finally, future directions are presented.


2020 ◽  
Vol 15 (1) ◽  
Author(s):  
Julia Schaefer ◽  
Moritz Lehne ◽  
Josef Schepers ◽  
Fabian Prasser ◽  
Sylvia Thun

Abstract Background Emerging machine learning technologies are beginning to transform medicine and healthcare and could also improve the diagnosis and treatment of rare diseases. Currently, there are no systematic reviews that investigate, from a general perspective, how machine learning is used in a rare disease context. This scoping review aims to address this gap and explores the use of machine learning in rare diseases, investigating, for example, in which rare diseases machine learning is applied, which types of algorithms and input data are used, or which medical applications (e.g., diagnosis, prognosis or treatment) are studied. Methods Using a complex search string including generic search terms and 381 individual disease names, studies from the past 10 years (2010–2019) that applied machine learning in a rare disease context were identified on PubMed. To systematically map the research activity, eligible studies were categorized along different dimensions (e.g., rare disease group, type of algorithm, input data), and the number of studies within these categories was analyzed. Results Two hundred eleven studies from 32 countries investigating 74 different rare diseases were identified. Diseases with a higher prevalence appeared more often in the studies than diseases with a lower prevalence. Moreover, some rare disease groups were investigated more frequently than expected (e.g., rare neurologic diseases and rare systemic or rheumatologic diseases), and others less frequently (e.g., rare inborn errors of metabolism and rare skin diseases). Ensemble methods (36.0%), support vector machines (32.2%) and artificial neural networks (31.8%) were the algorithms most commonly applied in the studies. Only a small proportion of studies evaluated their algorithms on an external data set (11.8%) or against a human expert (2.4%). As input data, images (32.2%), demographic data (27.0%) and “omics” data (26.5%) were used most frequently.
Most studies used machine learning for diagnosis (40.8%) or prognosis (38.4%), whereas studies aiming to improve treatment were relatively scarce (4.7%). Patient numbers in the studies were small, typically ranging from 20 to 99 (35.5%). Conclusion Our review provides an overview of the use of machine learning in rare diseases. Mapping the current research activity, it can guide future work and help to facilitate the successful application of machine learning in rare diseases.


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3155
Author(s):  
Olivia Vargas-Lopez ◽  
Carlos A. Perez-Ramirez ◽  
Martin Valtierra-Rodriguez ◽  
Jesus J. Yanez-Borjas ◽  
Juan P. Amezquita-Sanchez

The economic and personal consequences that car accidents generate for society have been increasing in recent years. One of the causes of car accidents is the driver's stress level; consequently, detecting stress events is a highly desirable task. This article investigates the efficacy of statistical time features (STFs), such as root mean square, mean, variance, and standard deviation, among others, in detecting stress events from drivers' electromyographic signals, since these features can capture subtle changes in a signal. The obtained results show that the variance and standard deviation, coupled with a support vector machine classifier with a cubic kernel, are effective for detecting stress events, reaching an AUC of 0.97. In this sense, since an SVM can be trained with different kernels, the kernels are compared to find out which one performs best using the STFs as feature inputs and a common training strategy; thus, information about model explainability can be determined. The explainability of the machine learning algorithm allows a deeper understanding of the model's efficacy and of which model should be selected depending on the features used in its development.
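A cubic-kernel SVM scored with ROC AUC, as described above, corresponds to scikit-learn's polynomial kernel with degree 3. The two-feature synthetic data below stands in for the paper's (variance, standard deviation) EMG features and is illustrative only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
# Stand-in for (variance, standard deviation) features per EMG window
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
y = np.array([0] * 200 + [1] * 200)  # 0 = calm, 1 = stress event
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# kernel="poly", degree=3 is the cubic kernel; coef0=1 makes it inhomogeneous
clf = SVC(kernel="poly", degree=3, coef0=1.0, random_state=0)
clf.fit(X_tr, y_tr)
# AUC is computed from the signed distance to the decision boundary
auc = roc_auc_score(y_te, clf.decision_function(X_te))
```

Swapping the `kernel` argument ("linear", "rbf", "sigmoid") and re-scoring is how the kernel comparison in the abstract would be carried out.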


2020 ◽  
Vol 3 (2) ◽  
pp. 196-206
Author(s):  
Mausumi Das Nath ◽  
◽  
Tapalina Bhattasali

Due to the enormous usage of the Internet, users share resources and exchange voluminous amounts of data. This increases the risk of data theft and other types of attacks. Network security plays a vital role in protecting the electronic exchange of data and attempts to avoid disruption of finances or services due to unknown proliferations in the network. Intrusion Detection Systems (IDS) are commonly used to detect such unknown attacks and unauthorized access in a network. Researchers have put forward many approaches, ranging from traditional statistical methods to Artificial Intelligence (AI) based techniques, that showed satisfactory results in intrusion detection. AI-based techniques have gained an edge over statistical techniques in the research community due to their enormous benefits: procedures can be designed to display behavior learned from previous experience. Machine learning algorithms are used to analyze abnormal instances in a particular network, and supervised learning is essential for training on and analyzing abnormal behavior in a network. In this paper, we propose a model combining Naïve Bayes and a Support Vector Machine (SVM) to detect anomalies, together with an ensemble approach that addresses the weaknesses of the individual classifiers and removes their poor detection results.
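Combining Naïve Bayes and an SVM into an ensemble can be sketched with a soft-voting classifier; this is one plausible reading of the proposal, not the paper's exact architecture, and the synthetic connection records are illustrative.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)
# Stand-in for labeled network connection records (0 = normal, 1 = intrusion)
X = np.vstack([rng.normal(0, 1, (300, 4)), rng.normal(1.5, 1, (300, 4))])
y = np.array([0] * 300 + [1] * 300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Soft voting averages the two classifiers' predicted probabilities,
# so one model's weak detections can be outvoted by the other's.
ensemble = VotingClassifier(
    estimators=[("nb", GaussianNB()), ("svm", SVC(probability=True))],
    voting="soft",
)
ensemble.fit(X_tr, y_tr)
acc = accuracy_score(y_te, ensemble.predict(X_te))
```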


2021 ◽  
Vol 79 (2) ◽  
pp. 125-135
Author(s):  
Binghua Cao ◽  
Enze Cai ◽  
Mengbao Fan

Internal discontinuities are critical factors that can lead to premature failure of thermal barrier coatings (TBCs). This paper proposes a technique that combines terahertz (THz) time-domain spectroscopy and machine learning classifiers to identify discontinuities in TBCs. First, the finite-difference time-domain method was used to build a theoretical model of THz signals due to discontinuities in TBCs. Then, simulations were carried out to compute THz waveforms of different discontinuities in TBCs. Further, six machine learning classifiers were employed to classify these different discontinuities. Principal component analysis (PCA) was used for dimensionality reduction, and the Grid Search method was utilized to optimize the hyperparameters of the designed machine learning classifiers. Accuracy and running time were used to characterize their performances. The results show that the support vector machine (SVM) has a better performance than the others in TBC discontinuity classification. Using PCA, the average accuracy of the SVM classifier is 94.3%, and the running time is 65.6 ms.
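The PCA-plus-Grid-Search pipeline described above maps directly onto scikit-learn's `Pipeline` and `GridSearchCV`. The synthetic waveforms, the three discontinuity classes, and the parameter grid below are illustrative stand-ins for the simulated THz data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(4)
# Stand-in for sampled THz waveforms, one of three discontinuity classes each
X = np.vstack([rng.normal(m, 1, (60, 50)) for m in (0.0, 1.0, 2.0)])
y = np.repeat([0, 1, 2], 60)

# PCA for dimensionality reduction, then an SVM classifier
pipe = Pipeline([("pca", PCA(n_components=5)), ("svm", SVC())])
# Grid Search over SVM hyperparameters with 3-fold cross-validation
grid = GridSearchCV(
    pipe, {"svm__C": [0.1, 1, 10], "svm__kernel": ["rbf", "linear"]}, cv=3
)
grid.fit(X, y)
best = grid.best_params_
```

`grid.best_estimator_` then holds the tuned PCA+SVM pipeline whose accuracy and running time would be compared against the other five classifiers.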


2020 ◽  
Author(s):  
Xiaoyu Dai ◽  
Siqi Dai ◽  
Xi Yang ◽  
Jing Zhuang ◽  
Jin Liu ◽  
...  

Abstract Background: Colorectal cancer (CRC) is the third most common malignancy in the world, and metastasis is responsible for a major proportion of cancer-related deaths in CRC patients. Aims: To construct machine learning models for predicting lymph node and distant metastases in colorectal cancer and to analyze the biological functions of metastasis-related genes. Methods: RNA-seq and miRNA-seq data, as well as corresponding clinical data, for colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ) were obtained from The Cancer Genome Atlas (TCGA) database. The differentially expressed RNAs (DE-RNAs) between non-LNM (N0) and LNM (N1/N2) samples, as well as between non-distant-metastasis (M0) and distant-metastasis (M1) samples, were analyzed. Six machine learning models, including logistic regression (LR), random forest (RF), support vector machine (SVM), CatBoost, gradient boosting decision tree (GBDT), and artificial neural network (NN), were constructed to predict cancer metastasis, and the feature genes of the optimal model were further analyzed by functional enrichment, protein-protein interaction (PPI) network, and drug-target analyses. Results: Differential RNA expression profiles of LNM vs. non-LNM as well as M0 vs. M1 were observed in both COAD and READ samples. The NN model was determined to be the optimal model for predicting distant metastases, while the CatBoost and LR models were the optimal models for predicting LNM in COAD and READ samples, respectively. PPI analysis indicated that KIR2DL4, the chemokine-related genes CXCL9/10/11/13 and CCL25, and the gamma-aminobutyric acid (GABA) receptor genes (GABRR1, GABRB2 and GABRA3) were key genes in metastasis. In addition, atorvastatin and eszopiclone were identified as potential therapeutic agents, as they target these genes. Conclusions: We constructed six machine learning models for predicting colorectal cancer metastases, identified the optimal models, and analyzed the biological functions of metastasis-related RNAs in colorectal cancer.
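Selecting the optimal model among several candidates, as done above, typically reduces to cross-validating each one on the same data and ranking the scores. The sketch below compares three of the six model families on a synthetic expression matrix; the data and model set are illustrative, not the TCGA cohort.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
# Stand-in for a DE-RNA expression matrix (samples x genes); 0 = M0, 1 = M1
X = np.vstack([rng.normal(0, 1, (80, 20)), rng.normal(0.8, 1, (80, 20))])
y = np.array([0] * 80 + [1] * 80)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(random_state=0),
    "SVM": SVC(),
}
# 5-fold cross-validated mean accuracy per model; the best one wins
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
best_model = max(scores, key=scores.get)
```

With the real data, the winning model's feature importances (or coefficients) are what feed the downstream enrichment and PPI analyses.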

