A New Approach of Rough Set Theory for Feature Selection and Bayes Net Classifier Applied on Heart Disease Dataset

2017 ◽  
Vol 26 (2) ◽  
pp. 15-26
Author(s):  
Eman S. Al-Shamery ◽  
Ali A.Rahoomi Al-Obaidi

In this paper, a new approach to rough set feature selection is proposed. Feature selection is performed for several reasons: (a) it decreases prediction time, (b) a feature may be unavailable at prediction time, and (c) the presence of a feature may lead to poor prediction. Rough set theory is used to select the most significant features, and the proposed approach is applied to a heart disease dataset. The main problem is to predict whether a patient has heart disease based on the given features; the problem is challenging because the decision cannot be determined directly. The rough set method is modified to obtain predictive attributes by discarding unnecessary and harmful features. A Bayes net is used as the classification method, and 10-fold cross validation is used for evaluation. The Correctly Classified Instances were 82.17, 83.49, and 74.58 when using the full attribute set, 12 attributes, and 7 attributes, respectively. The traditional rough set method was also applied; its minimum Correctly Classified Instances were 58.41 and 81.51 when using attribute sets of length 2.
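
The abstract includes no code, but a minimal sketch can illustrate the general pipeline it describes: a rough-set dependency measure drives a greedy attribute selection, and the selected attributes feed a Bayes-style classifier evaluated with 10-fold cross validation. The file name, column names, the greedy QuickReduct-style search, and the use of GaussianNB as a stand-in for a full Bayes net are assumptions for illustration, not the paper's exact modified method.

```python
# Hedged sketch: rough-set dependency as a selection criterion, then a
# Bayes classifier with 10-fold cross validation. Not the authors' algorithm.
import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def dependency(df, attrs, decision):
    """Fraction of objects whose equivalence class (w.r.t. attrs) is
    consistent on the decision attribute (|positive region| / |U|)."""
    if not attrs:
        return 0.0
    consistent = df.groupby(list(attrs))[decision].transform("nunique") == 1
    return consistent.sum() / len(df)

def quick_reduct(df, decision):
    """Greedily add the attribute that most increases the dependency."""
    remaining = [c for c in df.columns if c != decision]
    target = dependency(df, remaining, decision)
    reduct, best = [], 0.0
    while best < target and remaining:
        gains = {a: dependency(df, reduct + [a], decision) for a in remaining}
        a = max(gains, key=gains.get)
        if gains[a] <= best:
            break
        reduct.append(a)
        remaining.remove(a)
        best = gains[a]
    return reduct

df = pd.read_csv("heart.csv")            # assumed discretised heart-disease table
reduct = quick_reduct(df, "target")      # "target" column name is an assumption
scores = cross_val_score(GaussianNB(), df[reduct], df["target"], cv=10)
print(reduct, round(scores.mean(), 4))
```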

2019 ◽  
Vol 1 (2) ◽  
pp. 23-35
Author(s):  
Dwi Normawati ◽  
Dewi Pramudi Ismi

Coronary heart disease, a frequent cause of death, occurs when atherosclerosis blocks blood flow to the heart muscle in the coronary arteries. The reference method physicians use to diagnose coronary heart disease is coronary angiography, but it is invasive, high-risk and expensive. The purpose of this study is to analyze the effect of applying k-Fold Cross Validation (CV) to rule-based feature selection for diagnosing coronary heart disease, using the Cleveland heart disease dataset. The study performed feature selection with a medical-expert-based method (MFS) and a computer-based method, the Variable Precision Rough Set (VPRS), an extension of rough set theory. Classification performance was evaluated with 10-Fold, 5-Fold and 3-Fold cross validation. The results show that the number of selected attributes differs in each fold for both the VPRS and MFS methods; accuracy was taken as the average over the 10-Fold, 5-Fold and 3-Fold runs. The highest accuracy for the VPRS method was 76.34% with k = 5, while the MFS method reached 71.281% with k = 3. Thus, the k-fold implementation is less effective in this case, because the data split still follows the original record order within each fold and the test partitions are too small and too structured; this hurts accuracy because the mined rules are not thoroughly represented in the test data.
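
As a rough illustration of the VPRS idea used above, the sketch below counts an equivalence class toward the positive region when its majority decision reaches a precision threshold β (β = 1 recovers the classical model), and repeats the selection inside each training fold. The file name, column names, the crude β-based relevance filter, and the unshuffled KFold splits (mirroring the "structured" splits the study criticises) are assumptions, not the study's exact procedure.

```python
# Hypothetical VPRS-style selection repeated per fold; illustrative only.
import pandas as pd
from sklearn.model_selection import KFold

def beta_dependency(df, attrs, decision, beta):
    """Share of objects in equivalence classes whose majority decision
    reaches the precision threshold beta (VPRS beta-positive region)."""
    if not attrs:
        return 0.0
    purity = df.groupby(list(attrs))[decision].transform(
        lambda s: s.value_counts(normalize=True).iloc[0])
    return (purity >= beta).sum() / len(df)

def vprs_relevant_attrs(df, attrs, decision, beta=0.8):
    """Keep attributes whose removal lowers the beta-dependency (crude filter)."""
    base = beta_dependency(df, attrs, decision, beta)
    return [a for a in attrs
            if beta_dependency(df, [c for c in attrs if c != a], decision, beta) < base]

df = pd.read_csv("cleveland.csv")               # assumed discretised Cleveland table
attrs = [c for c in df.columns if c != "num"]   # "num" as decision is an assumption
for k in (10, 5, 3):                            # mirror the study's 10/5/3-fold setup
    sizes = []
    for train_idx, _ in KFold(n_splits=k).split(df):
        sizes.append(len(vprs_relevant_attrs(df.iloc[train_idx], attrs, "num")))
    print(k, sizes)                             # selected-subset size varies per fold
```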


2013 ◽  
Vol 3 (1) ◽  
Author(s):  
Suresh Satapathy ◽  
Anima Naik ◽  
K. Parvathi

Rough set theory has been one of the most successful methods used for feature selection. However, this method alone is still not able to find optimal subsets; it can be made optimal using different optimization techniques. This paper proposes a new feature selection method based on rough set theory with Teaching-Learning-Based Optimization (TLBO). The proposed method is experimentally compared with other hybrid rough set methods based on the Genetic Algorithm (GA), Particle Swarm Optimization (PSO) and Differential Evolution (DE), and the empirical results reveal that the proposed approach is well suited for feature selection, as it performs better at finding optimal features and does so more quickly.
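
A compact sketch of how TLBO can search binary feature masks is given below; the fitness function is left as a parameter. The continuous encoding thresholded at 0.5, the population size, and the greedy acceptance steps are assumptions for the sketch, not the paper's exact hybrid.

```python
# Illustrative TLBO over binary feature masks; fitness supplied by the caller.
import numpy as np

def tlbo_feature_select(fitness, n_features, pop=20, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.random((pop, n_features))         # continuous learner positions in [0, 1]
    score = lambda x: fitness(x > 0.5)        # evaluate the binary mask
    F = np.array([score(x) for x in X])
    for _ in range(iters):
        # Teacher phase: pull each learner toward the current best solution.
        teacher, mean = X[F.argmax()], X.mean(axis=0)
        for i in range(pop):
            TF = rng.integers(1, 3)           # teaching factor, 1 or 2
            cand = np.clip(X[i] + rng.random(n_features) * (teacher - TF * mean), 0, 1)
            if (s := score(cand)) > F[i]:
                X[i], F[i] = cand, s
        # Learner phase: learn from a randomly chosen peer.
        for i in range(pop):
            j = rng.integers(pop)
            direction = X[j] - X[i] if F[j] > F[i] else X[i] - X[j]
            cand = np.clip(X[i] + rng.random(n_features) * direction, 0, 1)
            if (s := score(cand)) > F[i]:
                X[i], F[i] = cand, s
    return X[F.argmax()] > 0.5                # best binary mask found
```

Under these assumptions, `fitness` could, for example, return the rough-set dependency of the selected attributes minus a small penalty proportional to the subset size, so that smaller subsets with equal dependency are preferred.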


2021 ◽  
pp. 107993
Author(s):  
Peng Zhou ◽  
Peipei Li ◽  
Shu Zhao ◽  
Yanping Zhang

2014 ◽  
Vol 1 (1) ◽  
pp. 1-14 ◽  
Author(s):  
Sharmistha Bhattacharya Halder

The concept of the rough set was first developed by Pawlak (1982). Since then it has been successfully applied in many research fields, such as pattern recognition, machine learning, knowledge acquisition, economic forecasting and data mining. However, the original rough set model cannot effectively deal with noisy data sets, and latent useful knowledge in the boundary region may not be fully captured. To overcome these limitations, several extended rough set models have been put forward that combine it with other available soft computing technologies. Many researchers have been motivated to investigate probabilistic approaches to rough set theory. The variable precision rough set model (VPRSM) is one of the most important extensions. The Bayesian rough set model (BRSM) (Slezak & Ziarko, 2002), a hybrid development between rough set theory and Bayesian reasoning, can deal with many practical problems that could not be effectively handled by the original rough set model. Based on the Bayesian decision procedure with minimum risk, Yao (1990) put forward a new model called the decision theoretic rough set model (DTRSM), which brings new insights into probabilistic approaches to rough set theory. Throughout this paper, the concept of the decision theoretic rough set is studied and a new concept of the Bayesian decision theoretic rough set is introduced. Lastly, a comparative study is made between the Bayesian decision theoretic rough set and the rough set defined by Pawlak (1982).
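
The decision theoretic rough set derives its acceptance and rejection thresholds from a loss matrix via the Bayesian minimum-risk procedure. The sketch below computes the standard thresholds α and β and partitions equivalence classes into positive, boundary and negative regions; the concrete probabilities and loss values are hypothetical examples, not taken from the paper.

```python
# Minimal DTRS three-way partition. lam[a][s] is the cost of action a
# (P = accept, B = defer, N = reject) when the object's true state is s
# ("in" the concept C or "out" of it). Loss values below are hypothetical.
def dtrs_regions(conditional_probs, lam):
    alpha = (lam["P"]["out"] - lam["B"]["out"]) / (
        (lam["P"]["out"] - lam["B"]["out"]) + (lam["B"]["in"] - lam["P"]["in"]))
    beta = (lam["B"]["out"] - lam["N"]["out"]) / (
        (lam["B"]["out"] - lam["N"]["out"]) + (lam["N"]["in"] - lam["B"]["in"]))
    pos = [e for e, p in conditional_probs.items() if p >= alpha]
    bnd = [e for e, p in conditional_probs.items() if beta < p < alpha]
    neg = [e for e, p in conditional_probs.items() if p <= beta]
    return alpha, beta, pos, bnd, neg

# P(C | [x]) for each equivalence class, plus hypothetical misclassification costs.
probs = {"E1": 0.90, "E2": 0.55, "E3": 0.10}
losses = {"P": {"in": 0, "out": 8}, "B": {"in": 2, "out": 2}, "N": {"in": 10, "out": 0}}
print(dtrs_regions(probs, losses))   # alpha = 0.75, beta = 0.20 for these costs
```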


Information ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 207
Author(s):  
Asma Baccouche ◽  
Begonya Garcia-Zapirain ◽  
Cristian Castillo Olea ◽  
Adel Elmaghraby

Heart diseases rank highly among the leading causes of mortality in the world. They have various types, including vascular, ischemic, and hypertensive heart disease. A large number of medical features are reported for patients in Electronic Health Records (EHR), allowing physicians to diagnose and monitor heart disease. We collected a dataset from Medica Norte Hospital in Mexico that includes 800 records and 141 indicators such as age, weight, glucose, blood pressure rate, and clinical symptoms. The distribution of the collected records is highly unbalanced across the different types of heart disease: 17% of records have hypertensive heart disease, 16% ischemic heart disease, 7% mixed heart disease, and 8% valvular heart disease. Herein, we propose an ensemble-learning framework of different neural network models combined with an aggregated random under-sampling method. To improve the performance of the classification algorithms, we implement a data preprocessing step with feature selection. Experiments were conducted with unidirectional and bidirectional neural network models, and the results showed that an ensemble classifier combining a BiLSTM or BiGRU model with a CNN model had the best classification performance, with accuracy and F1-score between 91% and 96% across the different types of heart disease. These results are competitive and promising for heart disease datasets. We showed that an ensemble-learning framework based on deep models can overcome the problem of classifying an unbalanced heart disease dataset. Our proposed framework can lead to highly accurate models that are suited to real clinical data and diagnostic use.
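
A hedged Keras sketch of one ensemble member along the lines described (a bidirectional LSTM branch combined with a CNN branch) is shown below. The layer sizes, the treatment of the 141 indicators as a one-channel sequence, and the number of output classes are assumptions for illustration, not the authors' exact architecture.

```python
# Sketch of a BiLSTM + CNN member of an ensemble over EHR feature vectors.
import tensorflow as tf

def bilstm_cnn_model(n_features=141, n_classes=5):
    inp = tf.keras.Input(shape=(n_features, 1))          # features as a 1-D sequence
    rnn = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64))(inp)
    cnn = tf.keras.layers.Conv1D(32, kernel_size=3, activation="relu")(inp)
    cnn = tf.keras.layers.GlobalMaxPooling1D()(cnn)
    merged = tf.keras.layers.concatenate([rnn, cnn])     # fuse the two branches
    out = tf.keras.layers.Dense(n_classes, activation="softmax")(merged)
    model = tf.keras.Model(inp, out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Several such models, each trained on a randomly under-sampled subset of the majority classes, could then have their predicted probabilities averaged to form the ensemble, which is one plausible reading of the aggregation step described above.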


Author(s):  
Qinrong Feng ◽  
Duoqian Miao ◽  
Ruizhi Wang

Decision rule mining is an important technique in machine learning and data mining, and it has been studied intensively during the past few years. However, most existing algorithms are based on flat data tables, from which the sets of decision rules mined may be very large for massive data sets. Such rule sets are neither easily understandable nor really useful for users, and too many rules may lead to over-fitting. Thus, this chapter provides a method of mining decision rules from different abstraction levels, which aims to improve the efficiency of decision rule mining by combining the hierarchical structure of a multidimensional model with techniques from rough set theory. The algorithm follows the so-called separate-and-conquer strategy: certain rules are mined beginning at the most abstract level, the supporting sets of those certain rules are removed from the universe, and the algorithm then drills down to the next level to recursively mine further certain rules whose supporting sets lie within the remaining objects, until no objects remain in the universe or the primitive level is reached. The algorithm can therefore output generalized rules with different degrees of generalization.
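
The separate-and-conquer strategy described above can be sketched as follows: at each abstraction level, value combinations whose objects all share one decision yield certain rules, their supporting objects are removed, and the remainder is passed to the next, more specific level. The tabular representation of the concept hierarchy (one DataFrame per level, sharing an index) is an assumption for the sketch.

```python
# Illustrative separate-and-conquer mining over a list of abstraction levels,
# ordered from most abstract to most specific.
import pandas as pd

def mine_certain_rules(levels, decision):
    """levels: list of DataFrames sharing an index (the universe) and a
    decision column; returns (level depth, condition dict, decision) rules."""
    universe = levels[0].index
    rules = []
    for depth, table in enumerate(levels):
        if universe.empty:
            break                              # everything already covered
        t = table.loc[universe]
        attrs = [c for c in t.columns if c != decision]
        groups = t.groupby(attrs)[decision]
        for key, decisions in groups:
            if decisions.nunique() == 1:       # consistent group -> certain rule
                cond = dict(zip(attrs, key if isinstance(key, tuple) else (key,)))
                rules.append((depth, cond, decisions.iloc[0]))
        covered = groups.transform("nunique") == 1
        universe = t.index[~covered]           # conquer the rest at a finer level
    return rules
```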


Author(s):  
Qing-Hua Zhang ◽  
Long-Yang Yao ◽  
Guan-Sheng Zhang ◽  
Yu-Ke Xin

In this paper, a new incremental knowledge acquisition method is proposed based on rough set theory, decision trees and granular computing. To process dynamic data effectively, the data are first described with rough set theory, and the computation of equivalence classes and of the positive region with a hash algorithm is analyzed. Then, attribute reduction, value reduction and the extraction of the rule set with the hash algorithm are completed efficiently. Finally, for each newly added data item, the incremental knowledge acquisition method is used to update the original rules. Both the algorithm analysis and the experiments show that, for processing dynamic information systems, the time complexity of the proposed algorithm is lower than that of traditional algorithms and of incremental knowledge acquisition algorithms based on granular computing, owing to the efficiency of the hash algorithm, and that the algorithm is more effective when dealing with huge data sets.
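
A minimal sketch of the hash-based grouping described above: hashing each object's conditional-attribute values into a dictionary key builds the equivalence classes in one pass, from which the positive region follows. The incremental update shown for a new record is an assumed illustration of how such a step could look, not the paper's exact algorithm.

```python
# Equivalence classes and positive region via hashing; illustrative only.
from collections import defaultdict

def build_classes(objects, cond_attrs, decision):
    """objects: dict of object id -> attribute dict. One pass over the data."""
    classes = defaultdict(list)                     # hash key -> list of object ids
    for oid, obj in objects.items():
        classes[tuple(obj[a] for a in cond_attrs)].append(oid)
    pos = {oid for members in classes.values()
           if len({objects[o][decision] for o in members}) == 1
           for oid in members}                      # consistent classes only
    return classes, pos

def add_object(classes, objects, oid, obj, cond_attrs):
    """Incremental update: only the new object's equivalence class is touched;
    the positive region and rules would then be revised for that class alone."""
    objects[oid] = obj
    classes[tuple(obj[a] for a in cond_attrs)].append(oid)
```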

