Detecting Abnormalities in Heart Sounds

Author(s):  
Muhammed Telceken ◽  
Yakup Kutlu

Heart sounds are important data that reflect the state of the heart, and early diagnosis of abnormalities in them can prevent larger problems later. Therefore, this study addresses the detection of abnormalities in heart sounds. For this purpose, the heartbeat-sounds data set, freely available from the kaggle.com website, was examined. Mel frequency cepstral coefficients (MFCCs) were used to characterize the sounds. Parameters such as the number of filters applied for the MFCCs and the number of attributes extracted were each examined over a range of values. The classification performance of the K-nearest neighbor algorithm on feature matrices extracted with different MFCC parameters was investigated, and the different feature extractions were compared to determine the best configuration. The two recordings that make up the data set were first examined separately as normal and abnormal; then the new data set obtained by combining the two recordings was examined in the same way.
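The MFCC-plus-kNN pipeline described above can be sketched minimally as follows. The feature vectors here are toy stand-ins for mean MFCC vectors (real extraction would use an audio library such as librosa), and all values and labels are illustrative, not from the study:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training vectors (Euclidean distance)."""
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy "mean MFCC" vectors standing in for extracted heart-sound features.
train = [
    ((0.1, 0.2, 0.1), "normal"),
    ((0.2, 0.1, 0.2), "normal"),
    ((0.9, 0.8, 0.9), "abnormal"),
    ((0.8, 0.9, 0.8), "abnormal"),
]
print(knn_predict(train, (0.15, 0.15, 0.15)))  # → normal
```

Varying `k` and the dimensionality of the feature vectors corresponds to the parameter sweep the study performs over MFCC settings.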

2021 ◽  
Vol 14 (1) ◽  
pp. 24-29
Author(s):  
Gabriel Popan ◽  
Lorena Muscar ◽  
Lacrimioara Grama

Abstract The goal of this paper is to create a security system that identifies a specific person who wants to access private information or enter a building using their voice. To build this system, we assembled a database containing the audio files of the users who will be able to authenticate with it. Several steps were performed sequentially to extract Mel Frequency Cepstral Coefficient features from the audio files. A training model was then created using the k-Nearest Neighbor algorithm with a Euclidean distance and 4 neighbors. The experimental results show in two ways, using a confusion matrix and a scatter plot, that the overall voice fingerprint recognition rate is 100% for this particular configuration.
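The confusion-matrix evaluation mentioned above can be sketched in a few lines; the speaker names and predictions below are illustrative placeholders, not the paper's data:

```python
from collections import defaultdict

def confusion_matrix(y_true, y_pred):
    """Count (true, predicted) label pairs."""
    cm = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        cm[(t, p)] += 1
    return dict(cm)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy speaker-recognition results: all predictions correct,
# mirroring the 100% recognition rate reported.
y_true = ["alice", "alice", "bob", "bob"]
y_pred = ["alice", "alice", "bob", "bob"]
print(confusion_matrix(y_true, y_pred))
print(accuracy(y_true, y_pred))  # → 1.0
```

A perfect run puts all counts on the matrix diagonal; any off-diagonal entry would indicate a misidentified speaker.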


2020 ◽  
Vol 3 (1) ◽  
pp. 27-41
Author(s):  
Achmad Saiful Rizal ◽  
Moch. Lutfi

Elections in Indonesia have undergone changes from period to period. Elected legislative candidates were determined not by the voters but by the authority of the political elite, according to the order of the list of legislative candidates and their sequence numbers. One way to predict such outcomes is data mining, which can be applied in the political sphere, for example, to predict the results of legislative elections. The k-nearest neighbor algorithm is a data mining algorithm that classifies an object based on the learned objects closest to it. Election-related research has been done with the k-nearest neighbor algorithm, but the accuracy obtained by that method alone is still too low, so an additional algorithm is needed to improve it. This study proposes the k-nearest neighbor method combined with backward elimination for feature selection. The dataset used in the study comes from the KPU Sidoarjo and has 1 special attribute and 13 regular attributes. From the analysis and computation, two conclusions can be drawn about the combined method. First, of the 14 attributes in the dataset, the 8 most influential attributes were retained. Second, the best accuracy is 96.03%, obtained when k = 2 and tested by 10-fold cross-validation.
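Backward elimination, the feature-selection wrapper used above, can be sketched as a greedy loop. The scoring function here is a toy stand-in (in the study it would be kNN accuracy under cross-validation), and the "informative" feature indices are illustrative assumptions:

```python
def backward_elimination(features, evaluate):
    """Greedy wrapper: repeatedly drop the feature whose removal keeps
    or improves the score; stop when no removal helps."""
    selected = list(features)
    best = evaluate(selected)
    improved = True
    while improved and len(selected) > 1:
        improved = False
        for f in list(selected):
            trial = [x for x in selected if x != f]
            score = evaluate(trial)
            if score >= best:  # >=: prefer the smaller set on ties
                best, selected, improved = score, trial, True
                break
    return selected, best

# Toy evaluator: features 1 and 3 are "informative"; others add noise.
informative = {1, 3}
def toy_score(subset):
    return len(informative & set(subset)) - 0.1 * len(set(subset) - informative)

subset, score = backward_elimination(range(1, 6), toy_score)
print(subset)  # → [1, 3]
```

The same loop, with kNN cross-validated accuracy as `evaluate`, is how a 14-attribute dataset can be reduced to its 8 most influential attributes.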


2020 ◽  
Vol 4 (2) ◽  
pp. 39-47
Author(s):  
Junta Zeniarja ◽  
Anisatawalanita Ukhifahdhina ◽  
Abu Salam

The heart is one of the essential organs and plays a significant part in the human body; however, heart disease is also a major cause of death. World Health Organization (WHO) data from 2012 showed that of all deaths from cardiovascular (vascular) disease, 7.4 million (42.3%) were caused by heart disease. The increasing number of heart disease cases calls for early prevention efforts through early diagnosis of the disease. In this research, early diagnosis of heart disease is performed using a data mining process in the form of classification. The algorithm used is the K-Nearest Neighbor algorithm with the Forward Selection method. The K-Nearest Neighbor algorithm is used for classification to obtain a decision from the diagnosis of heart disease, while forward selection is used as feature selection to increase the accuracy value. Forward selection works by removing attributes that are irrelevant to the classification process. In this research, the accuracy of heart disease diagnosis with the K-Nearest Neighbor algorithm alone is 73.44%, while the accuracy with the feature selection method is 78.66%. It is clear that combining the K-Nearest Neighbor algorithm with the forward selection method improved the accuracy. Keywords - K-Nearest Neighbor, Classification, Heart Disease, Forward Selection, Data Mining
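Forward selection is the mirror image of backward elimination: start from an empty set and greedily add the feature that most improves the score. The sketch below uses a toy evaluator and hypothetical attribute names (`age`, `chest_pain`, etc.); in the study the evaluator would be kNN diagnostic accuracy:

```python
def forward_selection(features, evaluate):
    """Greedy wrapper: add the feature that most improves the score;
    stop when no addition helps."""
    remaining, selected = list(features), []
    best = float("-inf")
    while remaining:
        score, f = max((evaluate(selected + [x]), x) for x in remaining)
        if score <= best:
            break
        best, selected = score, selected + [f]
        remaining.remove(f)
    return selected, best

# Toy evaluator: two attributes are "relevant"; others only add noise.
relevant = {"age", "chest_pain"}
def toy_score(subset):
    return len(relevant & set(subset)) - 0.1 * len(set(subset) - relevant)

subset, score = forward_selection(["age", "sex", "chest_pain", "bp"], toy_score)
print(sorted(subset))  # → ['age', 'chest_pain']
```

Irrelevant attributes never earn their way into the set, which is how the wrapper effectively excludes them from the classification process.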


2020 ◽  
Vol 2020 ◽  
pp. 1-6 ◽  
Author(s):  
Luo GuangJun ◽  
Shah Nazir ◽  
Habib Ullah Khan ◽  
Amin Ul Haq

Spam detection is a major issue in mobile message communication, leaving such communication insecure. To tackle this problem, an accurate and precise method is needed to detect spam in mobile messages. We propose the application of machine learning-based spam detection for accurate detection. In this technique, machine learning classifiers such as logistic regression (LR), K-nearest neighbor (K-NN), and decision tree (DT) are used to classify ham and spam messages in mobile device communication. The SMS spam collection data set, split into training and testing subsets, is used to evaluate the method. The experimental results demonstrate that the classification performance of LR is higher than that of K-NN and DT, with LR achieving a high accuracy of 99%. Additionally, the proposed method performs well compared with existing state-of-the-art methods.
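A minimal sketch of the K-NN branch of this pipeline is shown below, using word-set (Jaccard) distance between messages instead of a learned feature space; the messages and labels are invented examples, not taken from the SMS spam collection data set:

```python
def jaccard(a, b):
    """Distance between two messages: 1 minus token-set overlap ratio."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return 1 - len(a & b) / len(a | b)

def knn_label(train, message, k=3):
    """Label a message by majority vote of its k nearest training messages."""
    dists = sorted((jaccard(text, message), label) for text, label in train)
    top = [label for _, label in dists[:k]]
    return max(set(top), key=top.count)

# Toy labelled SMS messages (illustrative, not from the real data set).
train = [
    ("win a free prize now", "spam"),
    ("claim your free cash prize", "spam"),
    ("are we still meeting today", "ham"),
    ("see you at lunch today", "ham"),
]
print(knn_label(train, "free prize waiting claim now"))  # → spam
```

LR and DT would be trained on the same train/test split; the abstract reports that LR wins this comparison at 99% accuracy.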


2021 ◽  
Vol 25 (2) ◽  
pp. 321-338
Author(s):  
Leandro A. Silva ◽  
Bruno P. de Vasconcelos ◽  
Emilio Del-Moral-Hernandez

Due to the high accuracy of the K nearest neighbor (KNN) algorithm in different problems, KNN is one of the most important classifiers used in data mining applications and is recognized in the literature as a benchmark algorithm. Despite its high accuracy, KNN has some weaknesses, such as the time taken by the classification process, which is a disadvantage in many problems, particularly those that involve a large dataset. The literature presents some approaches that reduce the classification time of KNN by selecting only the most important dataset examples. One of these methods is called Prototype Generation (PG), and the idea is to represent the dataset examples by prototypes. The classification process then occurs in two steps: the first is based on the prototypes and the second on the examples represented by the nearest prototypes. The main problem of this approach is the lack of a definition of the ideal number of prototypes. This study proposes a model that estimates the best grid dimension of Self-Organizing Maps and the ideal number of prototypes using the number of dataset examples as a parameter. The approach is contrasted with other PG methods from the literature based on artificial intelligence that propose to define the number of prototypes automatically. The main advantage of the proposed method, tested here using eighteen public datasets, is that it provides a better trade-off between a reduced number of prototypes and accuracy, yielding a number sufficient not to degrade KNN classification performance.
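The first, prototype-based classification step can be sketched as follows. This is not the paper's SOM model: the prototypes here are simple per-class means standing in for SOM units, and the map-sizing function uses the common 5·√N rule of thumb for the number of SOM units, which is an assumption, not the estimate the paper derives:

```python
import math

def heuristic_units(n_examples):
    """A common SOM sizing rule of thumb: about 5 * sqrt(N) map units
    (a stand-in for the paper's grid-dimension estimate)."""
    return max(1, round(5 * math.sqrt(n_examples)))

def make_prototypes(train):
    """Collapse each class to its mean vector -- a minimal stand-in
    for SOM-generated prototypes."""
    by_class = {}
    for x, label in train:
        by_class.setdefault(label, []).append(x)
    return {
        label: tuple(sum(col) / len(xs) for col in zip(*xs))
        for label, xs in by_class.items()
    }

def nearest_prototype(protos, query):
    """Step one of the two-step scheme: match against prototypes only."""
    return min(protos, key=lambda c: math.dist(protos[c], query))

train = [((0, 0), "a"), ((1, 1), "a"), ((8, 8), "b"), ((9, 9), "b")]
protos = make_prototypes(train)
print(nearest_prototype(protos, (1, 2)))  # → a
```

With only a handful of prototypes to scan, step one is far cheaper than full KNN; step two would then run KNN only among the examples the winning prototype represents.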


2021 ◽  
Vol 32 (2) ◽  
pp. 20-25
Author(s):  
Efraim Kurniawan Dairo Kette

In pattern recognition, the k-Nearest Neighbor (kNN) algorithm is the simplest non-parametric algorithm. Due to this simplicity, its classification performance is usually influenced by the model cases and by the quality of the training data itself. Therefore, this article proposes a sparse correlation weight model combined with the Training Data Set Cleaning (TDC) method by Classification Ability Ranking (CAR), called the CAR classification method based on Coefficient-Weighted kNN (CAR-CWKNN), to improve kNN classifier performance. Correlation weighting in Sparse Representation (SR) has been proven to increase classification accuracy. SR can reveal the 'neighborhood' structure of the data, which is why it is very suitable for Nearest Neighbor-based classification. The Classification Ability (CA) function is applied to rank the best training samples in the cleaning stage. The Leave One Out (LV1) concept in the CA works by removing data that is likely to produce wrong classification results from the original training data, thereby reducing the influence of training sample quality on kNN classification performance. Experiments with four public UCI data sets related to classification problems show that the CAR-CWKNN method provides better performance in terms of accuracy.
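The general idea of weighting neighbors rather than counting them equally can be illustrated with a much simpler scheme than the paper's sparse correlation weights: inverse-distance weighting, shown below with invented points and labels. This is only a sketch of weighted voting, not the CAR-CWKNN method itself:

```python
import math
from collections import defaultdict

def weighted_knn(train, query, k=3, eps=1e-9):
    """kNN where each neighbor's vote is weighted by inverse distance,
    so closer neighbors count more (a simple stand-in for
    correlation-based coefficient weighting)."""
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d + eps)
    return max(votes, key=votes.get)

train = [((0.0, 0.0), "x"), ((0.2, 0.0), "x"),
         ((1.0, 1.0), "y"), ((1.1, 0.9), "y")]
print(weighted_knn(train, (0.9, 0.9)))  # → y
```

A cleaning stage in the spirit of TDC would run such a classifier leave-one-out over the training set and drop samples it consistently misclassifies before final training.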


2015 ◽  
Vol 4 (1) ◽  
pp. 61-81
Author(s):  
Mohammad Masoud Javidi

Multi-label classification is an extension of conventional classification in which a single instance can be associated with multiple labels. Problems of this type are ubiquitous in everyday life; for example, a movie can be categorized as action, crime, and thriller. Most multi-label classification learning algorithms are designed for balanced data and do not work well on imbalanced data, yet in real applications most datasets are imbalanced. We therefore focus on improving multi-label classification performance on imbalanced datasets. In this paper, a state-of-the-art multi-label classification algorithm called IBLR_ML is employed. This algorithm is produced from a combination of the k-nearest neighbor and logistic regression algorithms. The logistic regression part of the algorithm is combined with two ensemble learning algorithms, Bagging and Boosting; the resulting approach is called IB-ELR. In this paper, for the first time, the ensemble bagging method with a stable learner as the base learner and imbalanced data sets as the training data is examined. Finally, to evaluate the proposed methods, they are implemented in the Java language. Experimental results show the effectiveness of the proposed methods. Keywords: Multi-label classification, Imbalanced data set, Ensemble learning, Stable algorithm, Logistic regression, Bagging, Boosting
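The bagging idea underlying IB-ELR can be sketched independently of the multi-label setting: train each base learner on a bootstrap resample and combine them by majority vote. The sketch below (in Python rather than the paper's Java, with a 1-NN base learner standing in for logistic regression, and invented data) shows only the bagging mechanism:

```python
import math
import random
from collections import Counter

def nn1(train, query):
    """1-nearest-neighbor base learner."""
    return min(train, key=lambda t: math.dist(t[0], query))[1]

def bagging_predict(train, query, n_models=15, seed=0):
    """Train each base learner on a bootstrap resample of the
    training set and combine predictions by majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]  # sample with replacement
        votes[nn1(sample, query)] += 1
    return votes.most_common(1)[0][0]

train = [((0, 0), "neg"), ((0, 1), "neg"), ((1, 0), "neg"),
         ((5, 5), "pos"), ((5, 6), "pos"), ((6, 5), "pos")]
print(bagging_predict(train, (5, 5)))  # → pos
```

Bagging mainly helps unstable learners; the paper's contribution is to examine what happens when the base learner is stable and the data imbalanced, which this sketch does not attempt to reproduce.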


Author(s):  
M. Jeyanthi ◽  
C. Velayutham

BCI plays a vital role in science and technology research. Classification is a data mining technique used to predict group membership for data instances. Analyzing BCI data is challenging because feature extraction and classification are more difficult for these data than for raw data. In this paper, we extract statistical Haralick features from the raw EEG data. The features are then normalized, and binning is used to improve the accuracy of the predictive models by reducing noise and eliminating some irrelevant attributes; classification is then performed on a BCI dataset using different techniques such as Naïve Bayes, the k-nearest neighbor classifier, and the SVM classifier. Finally, we propose the SVM classification algorithm for the BCI data set.
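The normalization-then-binning preprocessing step can be sketched as min-max scaling followed by equal-width binning; the signal values and the choice of 4 bins below are illustrative assumptions, not the paper's settings:

```python
def normalize(values):
    """Min-max scale a feature column to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def bin_values(values, n_bins=4):
    """Equal-width binning of normalized values: returns bin indices,
    smoothing away small fluctuations (a simple noise-reduction step)."""
    return [min(int(v * n_bins), n_bins - 1) for v in values]

# Toy feature column standing in for one normalized Haralick feature.
signal = [0.0, 1.2, 2.4, 4.8, 9.6]
norm = normalize(signal)
print(bin_values(norm))  # → [0, 0, 1, 2, 3]
```

After this discretization, the coarse bin indices (rather than raw feature values) would be fed to the Naïve Bayes, kNN, and SVM classifiers being compared.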

