GK BASED FUZZY CLUSTERING FOR THE DIAGNOSIS OF CARDIAC ARRHYTHMIA

Author(s):  
AHMED M. MEHDI ◽  
ALADIN ZAYEGH ◽  
REZAUL BEGG ◽  
RUBBIYA ALI

Abstract: Cardiac arrhythmia is one of the major causes of human death, and in most cases it cannot be predicted sufficiently far in advance. Computational intelligence algorithms can help in extracting the hidden patterns of biological datasets. This paper explores the use of advanced and intelligent computational algorithms for automated detection, classification and clustering of cardiac arrhythmia (CA). Application of the Fuzzy C-Means and Extended Fuzzy C-Means methods to the arrhythmia dataset (165 normal healthy and 138 with CA) demonstrated their good CA classification capabilities. The Fuzzy C-Means algorithm was able to classify the two groups in the data set with an overall accuracy of 97.2% (sensitivity 96.4%, specificity 98.12% and area under the receiver operating characteristic curve, AUC-ROC, of 0.963). The classification accuracy improved significantly when the GK-based extended fuzzy method was employed, and an overall accuracy of 99.14% was achieved (sensitivity 97.11%, specificity 99.18% and AUC-ROC = 0.995). These accuracy results were 19.02%, 7%, 9.14% and 11.06% higher than those of the multi-input single layer perceptron (SLP), feed forward back propagation (FFBP), self organizing maps (SOM) and support vector machine (SVM), respectively. The performance measures of the fuzzy techniques improved when Principal Component Analysis (PCA) was used to preprocess the arrhythmia datasets.
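To make the clustering step in the abstract above more concrete, here is a minimal sketch of plain fuzzy c-means with Euclidean distance. It is not the authors' implementation and it is not the GK-based extension (which adapts a covariance-based distance per cluster). The function name `fuzzy_c_means`, the fuzzifier `m = 2.0`, the iteration count, and the toy points are all assumptions chosen for illustration.

```python
import math

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Plain fuzzy c-means (Euclidean distance); hypothetical helper, not the GK variant."""
    centers = [list(c) for c in centers]
    n_clusters = len(centers)
    u = []
    for _ in range(iters):
        # Membership update: u[i][k] is the degree to which point i belongs to cluster k.
        u = []
        for p in points:
            d = [max(math.dist(p, c), 1e-12) for c in centers]  # avoid division by zero
            u.append([
                1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0)) for j in range(n_clusters))
                for k in range(n_clusters)
            ])
        # Centre update: mean of the points weighted by membership raised to the power m.
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(len(points))]
            total = sum(w)
            centers[k] = [
                sum(w[i] * points[i][dim] for i in range(len(points))) / total
                for dim in range(len(points[0]))
            ]
    return centers, u

# Toy data (assumed): two well separated groups of 2-D points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centers, toy_memberships = fuzzy_c_means(toy_points, [(0.0, 0.0), (10.0, 10.0)])
```

Each row of the returned membership matrix sums to one, so a hard class label can be obtained by taking the cluster with the largest membership for each sample.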

2021 ◽  
Vol 11 (10) ◽  
pp. 978
Author(s):  
Siti Fairuz Mat Radzi ◽  
Muhammad Khalis Abdul Karim ◽  
M Iqbal Saripan ◽  
Mohd Amiruddin Abdul Rahman ◽  
Iza Nurzawani Che Isa ◽  
...  

Automated machine learning (AutoML) has been recognized as a powerful tool for building systems that automate the design of machine learning (ML) pipelines and optimize their model selection. In this study, we present a tree-based pipeline optimization tool (TPOT) as a method for determining ML models with significant performance and less complex breast cancer diagnostic pipelines. Pre-processors and ML models are represented as expression trees, and optimal pipelines are searched for with genetic programming (GP), a stochastic search method. Radiomics features served as a guide for the TPOT-based ML pipeline selection on the breast cancer data set. Breast cancer data were used in a comparative analysis of the TPOT-generated ML pipelines with the selected ML classifiers, optimized by a grid search approach. The principal component analysis (PCA) random forest (RF) classification was proven to be the most reliable pipeline with the lowest complexity. The TPOT model selection technique exceeded the performance of grid search (GS) optimization. The RF classifier showed an outstanding outcome amongst the models in combination with only two pre-processors, with a precision of 0.83. The grid search optimized for support vector machine (SVM) classifiers generated a difference of 12% in comparison, while the other two classifiers, naïve Bayes (NB) and artificial neural network-multilayer perceptron (ANN-MLP), generated a difference of almost 39%. The method's performance was assessed in terms of sensitivity, specificity, accuracy, precision, and receiver operating characteristic (ROC) curve analysis.
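The evaluation measures named in the abstract above (sensitivity, specificity, accuracy, precision) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the example counts are assumptions for illustration and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate (recall)
        "specificity": tn / (tn + fp),            # true negative rate
        "precision": tp / (tp + fp),              # positive predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts, chosen only to exercise the formulas.
example = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

With these made-up counts the precision is 40 / 50 = 0.8 and the accuracy is 85 / 100 = 0.85.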


2020 ◽  
Vol 16 (8) ◽  
pp. 1088-1105
Author(s):  
Nafiseh Vahedi ◽  
Majid Mohammadhosseini ◽  
Mehdi Nekoei

Background: The poly(ADP-ribose) polymerases (PARPs) are a nuclear enzyme superfamily present in eukaryotes. Methods: In the present report, some efficient linear and non-linear methods including multiple linear regression (MLR), support vector machine (SVM) and artificial neural networks (ANN) were successfully used to develop and establish quantitative structure-activity relationship (QSAR) models capable of predicting pEC50 values of tetrahydropyridopyridazinone derivatives as effective PARP inhibitors. Principal component analysis (PCA) was used for a rational division of the whole data set and selection of the training and test sets. A genetic algorithm (GA) variable selection method was employed to select, from the large pool of calculated descriptors, the optimal subset of descriptors that contribute most significantly to the overall inhibitory activity. Results: The accuracy and predictability of the proposed models were further confirmed using cross-validation, validation through an external test set and Y-randomization (chance correlations) approaches. Moreover, an exhaustive statistical comparison was performed on the outputs of the proposed models. The results revealed that non-linear modeling approaches, including SVM and ANN, provided considerably better predictive capability. Conclusion: Among the constructed models and in terms of root mean square error of predictions (RMSEP), cross-validation coefficients (Q² LOO and Q² LGO), as well as R² and F-statistical value for the training set, the predictive power of the GA-SVM approach was better. However, compared with MLR and SVM, the statistical parameters for the test set were more favorable using the GA-ANN model.
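As an illustration of the leave-one-out cross-validation coefficient mentioned above, the following sketch computes Q² LOO for a one-descriptor linear model. It assumes the common definition Q² = 1 - PRESS / SS_tot, with SS_tot taken around the mean of all responses; the authors' exact formula, descriptors and data are not given in the abstract, so the function names and toy values here are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x

def q2_loo(xs, ys):
    """Leave-one-out Q^2: refit without sample i, then predict sample i."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot

# Toy descriptor/response values (assumed), roughly linear with some scatter.
toy_q2 = q2_loo([0.0, 1.0, 2.0, 3.0, 4.0], [1.1, 2.9, 5.2, 6.8, 9.1])
```

A Q² close to 1 means the held-out predictions are nearly as good as the fitted ones, while a value well below the training R² points to overfitting.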


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3003
Author(s):  
Ting Pan ◽  
Haibo Wang ◽  
Haiqing Si ◽  
Yao Li ◽  
Lei Shang

Fatigue is an important factor affecting modern flight safety. It can easily lead to a decline in pilots’ operational ability, misjudgments, and flight illusions, and it can even trigger serious flight accidents. In this paper, a wearable wireless physiological device was used to obtain pilots’ electrocardiogram (ECG) data in a simulated flight experiment, and 1440 effective samples were determined. The Friedman test was adopted to select the characteristic indexes that reflect the fatigue state of the pilot from the time domain, frequency domain, and non-linear characteristics of the effective samples. Furthermore, the variation rules of the characteristic indexes were analyzed. Principal component analysis (PCA) was utilized to extract features from the selected characteristic indexes, and the feature parameter set representing the fatigue state of the pilot was established. For pilots’ fatigue state identification, the feature parameter set was used as the input of the learning vector quantization (LVQ) algorithm to train the identification model. Results show that the recognition accuracy of the LVQ model reached 81.94%, which is 12.84% and 9.02% higher than those of the traditional back propagation neural network (BPNN) and support vector machine (SVM) models, respectively. The identification model based on the LVQ established in this paper is suitable for identifying pilots’ fatigue states. This is of great practical significance for reducing flight accidents caused by pilot fatigue, and it provides a theoretical foundation for pilot fatigue risk management and the development of intelligent aircraft autopilot systems.
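The learning vector quantization step described above can be sketched with the basic LVQ1 update rule: the prototype nearest to a training sample is pulled toward it when their labels agree and pushed away otherwise. The abstract does not say which LVQ variant, learning rate, or number of prototypes was used, so everything below (function names, `lr`, `epochs`, toy samples) is an assumption for illustration.

```python
import math

def nearest(prototypes, x):
    """Index of the prototype closest to x (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda k: math.dist(prototypes[k][0], x))

def train_lvq1(samples, prototypes, lr=0.1, epochs=20):
    """LVQ1: attract the winning prototype on a label match, repel it on a mismatch."""
    protos = [(list(vec), label) for vec, label in prototypes]
    for _ in range(epochs):
        for x, label in samples:
            vec, proto_label = protos[nearest(protos, x)]
            sign = 1.0 if proto_label == label else -1.0
            for d in range(len(vec)):
                vec[d] += sign * lr * (x[d] - vec[d])
    return protos

def predict(protos, x):
    """Label of the prototype nearest to x."""
    return protos[nearest(protos, x)][1]

# Toy two-class feature vectors (assumed); 0 and 1 stand in for two fatigue states.
toy_samples = [((0.0, 0.0), 0), ((0.2, 0.1), 0), ((1.0, 1.0), 1), ((0.9, 1.1), 1)]
toy_protos = train_lvq1(toy_samples, [((0.4, 0.4), 0), ((0.6, 0.6), 1)])
```

In a setting like the one in the abstract, the inputs would be the PCA-derived feature parameters rather than raw 2-D points.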


2013 ◽  
Vol 2013 ◽  
pp. 1-6 ◽  
Author(s):  
Ersen Yılmaz

A two-stage expert system is proposed for cardiac arrhythmia diagnosis. In the first stage, the Fisher score is used for feature selection to reduce the dimension of the feature space of a data set. In the second stage, a least squares support vector machine classifier is applied to the feature subset selected in the first stage to diagnose cardiac arrhythmia. The performance of the proposed expert system is evaluated on an arrhythmia data set taken from the UCI machine learning repository.
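The first stage described above can be illustrated with one common form of the Fisher score: for each feature, the between-class scatter of the class means divided by the pooled within-class variance. The abstract does not state the exact formula or threshold used, so the function name `fisher_scores`, the zero-variance handling, and the toy data below are assumptions.

```python
def fisher_scores(X, y):
    """Per-feature Fisher score: between-class scatter / within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(column, y) if label == c]
            mean_c = sum(vals) / len(vals)
            var_c = sum((v - mean_c) ** 2 for v in vals) / len(vals)
            between += len(vals) * (mean_c - overall_mean) ** 2
            within += len(vals) * var_c
        if within == 0.0:
            # Degenerate case (assumed handling): no within-class spread at all.
            scores.append(float("inf") if between > 0.0 else 0.0)
        else:
            scores.append(between / within)
    return scores

# Toy data (assumed): feature 0 separates the classes, feature 1 does not.
toy_X = [[0.0, 5.0], [0.2, 1.0], [1.0, 5.2], [1.2, 0.8]]
toy_y = [0, 0, 1, 1]
toy_ranking = sorted(range(2), key=lambda j: fisher_scores(toy_X, toy_y)[j], reverse=True)
```

Features would then be ranked by score and only the top-ranked subset passed on to the classifier in the second stage.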


Author(s):  
Zhixian Chen ◽  
Jialin Tang ◽  
Xueyuan Gong ◽  
Qinglang Su

To improve the low accuracy of face recognition methods in e-health settings, this paper proposes a novel face recognition approach based on a convolutional neural network (CNN). In detail, by decomposing the convolutional kernel and using the rectified linear unit (ReLU) activation function, dropout, and batch normalization, the approach reduces the number of parameters of the CNN model, improves its non-linearity, and alleviates its overfitting. In these ways, the accuracy of face recognition is increased. In the experiments, the proposed approach is compared with principal component analysis (PCA) and support vector machine (SVM) on the ORL, Cohn-Kanade, and extended Yale-B face recognition data sets, and the results indicate that this approach is promising.


Symmetry ◽  
2019 ◽  
Vol 11 (3) ◽  
pp. 380 ◽  
Author(s):  
Kai Ye

When the key features of a network intrusion signal are identified with the GA-RBF algorithm (which uses a genetic algorithm to optimize the radial basis function), the pre-processing of the network intrusion signal data is neglected, which increases the noise in the network signal data and reduces the accuracy of key feature recognition. Therefore, a key feature recognition algorithm for network intrusion signals based on a neural network and a support vector machine is proposed. The principal component neural network (PCNN) is used to extract the characteristics of the network intrusion signal, and a support vector machine (SVM) multi-classifier is constructed. The feature extraction result is input into the SVM classifier. By combining the PCNN and SVM algorithms, the key features of network intrusion signals are identified. The experimental results show that the algorithm offers high precision and a low false positive rate, and that the recognition time for key features of the R2L data set (R2L is a common type of network intrusion attack) is only 3.18 ms.


2011 ◽  
Vol 11 (04) ◽  
pp. 897-915 ◽  
Author(s):  
ROSHAN JOY MARTIS ◽  
CHANDAN CHAKRABORTY

This work presents a methodology for electrocardiogram (ECG)-based arrhythmia disease detection using genetic algorithm (GA)-optimized k-means clustering. The open-source ECG data from the MIT-BIH arrhythmia database and the MIT-BIH normal sinus rhythm database are subjected to a sequence of steps including segmentation using R-point detection, extraction of features using principal component analysis (PCA), and pattern classification. Here, the classical classifiers, viz., k-means clustering, error back propagation neural network (EBPNN), and support vector machine (SVM), are attempted first, and subsequently m-fold (m = 3) cross validation is used to reduce the bias during training of the classifier. The average classification accuracy is computed as the average over all three folds. It is observed that EBPNN and SVM with polynomial kernels of different orders provide significantly higher accuracies than k-means. This is because the parameters (centroids) of the k-means algorithm are only locally optimized by minimizing its objective function. To overcome this limitation, a global optimization technique, viz., GA, is suggested here and implemented to find more robust parameters for k-means clustering. Finally, it is shown that the GA-optimized k-means algorithm raises its accuracy to a level comparable with those of the other classifiers. The results are discussed and compared. It is concluded that the GA-optimized k-means algorithm is an alternative approach for classification whose accuracy is close to that of supervised classifiers (viz., EBPNN and SVM).
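The point about local optimization can be made concrete with the k-means objective, the sum of squared distances from each sample to its nearest centroid. A global optimizer such as a GA could use this quantity as its fitness function when searching over centroid positions. The sketch below only evaluates the objective for two candidate centroid sets; it does not implement the GA itself, and the function name and toy points are assumptions.

```python
def kmeans_objective(points, centroids):
    """Sum over samples of the squared distance to the nearest centroid."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total

# Toy 1-D points (assumed) forming two obvious groups.
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = kmeans_objective(toy_points, [(0.5,), (10.5,)])   # one centroid per group
poor = kmeans_objective(toy_points, [(5.5,), (20.0,)])   # one centroid serves both groups
```

A candidate whose centroids sit inside the two groups gives a far lower objective than one where a single centroid has to cover both, which is the kind of difference a global search is meant to find when a poor initialization traps standard k-means.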


Author(s):  
T. Zh. Mazakov ◽  
D. N. Narynbekovna

Security is a major concern today, and face recognition techniques, which rely on the extraction of facial features, are being studied worldwide. An analysis has been done of the commonly used face recognition techniques. This paper presents a face recognition system for identification and verification purposes that uses Principal Component Analysis (PCA) with Back Propagation Neural Networks (BPNN). The neural network is used to produce an output pattern from an input pattern. The system is implemented in MATLAB using the neural networks toolbox. A back propagation neural network is a multi-layered network in which the weights are adjusted on the basis of a sigmoidal function. Back propagation is a learning algorithm trained on input and output data sets. It also calculates how the error changes when weights are increased or decreased. This paper covers the background and future perspectives of face recognition techniques and how these techniques can be improved.


2013 ◽  
Vol 14 (1) ◽  
pp. 10-17

Artificial neural networks (ANNs) are being used increasingly to predict water variables. This study offers an alternative approach to quantify the relationship between time of chlorination in potable water (due to the conventional treatment procedure) and chlorination by-products concentration (expressed as carbon and bromine) with an ANN model, i.e., capturing non-linear relationships among the water quality variables. Thus, carbon and bromine concentrations in potable water (the latter chosen because of the toxicity of brominated trihalomethanes, THMs) were predicted using ANNs based mainly on the multi-layer perceptron (MLP) architecture. The chlorination (detention) time, as much as 58 hours in the Athens distribution network, comprised the input variable to the ANN models. Moreover, to develop an ANN model for estimating carbon and bromine, the available data set was partitioned into training, validation and test sets. In order to reach an optimum number of hidden layers or nodes, different architectures were tested. The quality of the ANN simulations was evaluated in terms of the error in the validation sample set for the proper interpretation of the results. The calculated sum-squared errors for the training, validation and test sets were 0.056, 0.039 and 0.060 respectively for the best model selected. Comparison of the results showed that a two-layer feed-forward back propagation ANN model could be used as an acceptable model for predicting carbon and bromine contained in potable water THMs.


Energies ◽  
2019 ◽  
Vol 12 (2) ◽  
pp. 218 ◽  
Author(s):  
Nan Wei ◽  
Changjun Li ◽  
Jiehao Duan ◽  
Jinyuan Liu ◽  
Fanhua Zeng

Forecasting daily natural gas load accurately is difficult because it is affected by various factors. A large number of redundant factors in the original dataset will increase computational complexity and decrease the accuracy of forecasting models. This study aims to provide accurate forecasting of natural gas load using a deep learning (DL)-based hybrid model, which combines principal component correlation analysis (PCCA) and a long short-term memory (LSTM) network. PCCA is an improved principal component analysis (PCA) and is first proposed in this paper. By considering the correlation between components in the eigenspace, PCCA can not only extract the components that affect natural gas load but also remove the redundant components. LSTM is a well-known DL network, and it was used to predict daily natural gas load in our work. The proposed model was validated using recent natural gas load data from Xi’an (China) and Athens (Greece). Additionally, 14 weather factors were introduced into the input dataset of the forecasting model. The results showed that PCCA–LSTM demonstrated better performance compared with LSTM, PCA–LSTM, back propagation neural network (BPNN), and support vector regression (SVR). The lowest mean absolute percentage errors of PCCA–LSTM were 3.22% and 7.29% for Xi’an and Athens, respectively. On this basis, the proposed model can be regarded as an accurate and robust model for daily natural gas load forecasting.
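The error measure reported above, mean absolute percentage error (MAPE), is the average of |actual - forecast| / |actual| expressed as a percentage. A minimal sketch follows; the function name `mape` and the example loads are assumptions and are unrelated to the Xi’an or Athens data.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical daily loads and forecasts, for illustration only.
example_mape = mape([100.0, 200.0], [110.0, 180.0])
```

Both made-up forecasts are off by 10% of the actual value, so the MAPE here is about 10%.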

