Improve the Accuracy of Heart Disease Predictions Using Machine Learning and Feature Selection Techniques

Author(s):  
Abdelmegeid Amin Ali ◽  
Hassan Shaban Hassan ◽  
Eman M. Anwar
Author(s):  
Md Arafatur Rahman ◽  
A. Taufiq Asyhari ◽  
Ong Wei Wen ◽  
Husnul Ajra ◽  
Yussuf Ahmed ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4821
Author(s):  
Rami Ahmad ◽  
Raniyah Wazirali ◽  
Qusay Bsoul ◽  
Tarik Abu-Ain ◽  
Waleed Abu-Ain

Wireless Sensor Networks (WSNs) continue to face two major challenges: energy and security. As a consequence, one of the WSN-related security tasks is to protect them from Denial of Service (DoS) and Distributed DoS (DDoS) attacks. Machine learning-based systems are the only viable option for these types of attacks, as traditional packet deep scan systems depend on open field inspection in transport layer security packets and the open field encryption trend. Moreover, network data traffic will become more complex due to increases in the amount of data transmitted between WSN nodes as a result of increasing usage in the future. Therefore, there is a need to use feature selection techniques with machine learning in order to determine which data in the DoS detection process are most important. This paper examined techniques for improving DoS anomalies detection along with power reservation in WSNs to balance them. A new clustering technique was introduced, called the CH_Rotations algorithm, to improve anomaly detection efficiency over a WSN’s lifetime. Furthermore, the use of feature selection techniques with machine learning algorithms in examining WSN node traffic and the effect of these techniques on the lifetime of WSNs was evaluated. The evaluation results showed that the Water Cycle (WC) feature selection displayed the best average performance accuracy of 2%, 5%, 3%, and 3% greater than Particle Swarm Optimization (PSO), Simulated Annealing (SA), Harmony Search (HS), and Genetic Algorithm (GA), respectively. Moreover, the WC with Decision Tree (DT) classifier showed 100% accuracy with only one feature. In addition, the CH_Rotations algorithm improved network lifetime by 30% compared to the standard LEACH protocol. Network lifetime using the WC + DT technique was reduced by 5% compared to other WC + DT-free scenarios.


2021 ◽  
Author(s):  
Tammo P.A. Beishuizen ◽  
Joaquin Vanschoren ◽  
Peter A.J. Hilbers ◽  
Dragan Bošnački

Abstract Background: Automated machine learning aims to automate the building of accurate predictive models, including the creation of complex data preprocessing pipelines. Although successful in many fields, they struggle to produce good results on biomedical datasets, especially given the high dimensionality of the data. Result: In this paper, we explore the automation of feature selection in these scenarios. We analyze which feature selection techniques are ideally included in an automated system, determine how to efficiently find the ones that best fit a given dataset, integrate this into an existing AutoML tool (TPOT), and evaluate it on four very different yet representative types of biomedical data: microarray, mass spectrometry, clinical and survey datasets. We focus on feature selection rather than latent feature generation since we often want to explain the model predictions in terms of the intrinsic features of the data. Conclusion: Our experiments show that for none of these datasets we need more than 200 features to accurately explain the output. Additional features did not increase the quality significantly. We also find that the automated machine learning results are significantly improved after adding additional feature selection methods and prior knowledge on how to select and tune them.


Author(s):  
Amalu Michael ◽  
Deepa S S

Diabetic retinopathy is one of the common forms of diabetic eye disease. DR occurs due to a high ratio of glucose in the blood, which causes alterations in the retinal vessels. Machine learning may be a broad multidisciplinary field that has its roots in statistics, algebra, data processing, and information analytics, etc. Machine learning is used to discover patterns from medical data and provide an efficient way to predict diseases.ML is an application of artificial intelligence it collects information from training data. There are several machine learning techniques are used for the diagnosis of diabetic retinopathy. This paper mainly focuses on the survey of such techniques and also various feature selection mechanisms. This study provides the basic categorization of feature selection techniques and discussing their use.


2021 ◽  
Vol 13 (3) ◽  
pp. 901-913
Author(s):  
S. Gupta ◽  
R. R. Sedamkar

Enhancing the diagnostic ability of Machine Learning models for acceptable prediction in the healthcare community is still a concern. There are critical care disease datasets available online on which researchers have experimented with a different number of instances and features for similar disease prediction. Further, different Machine Learning (ML) models have different preprocessing requirements. Framingham heart disease data is multicollinear and has missing values. Thus, the proposed model aims to explore the differential preprocessing needs of ML models followed by feature selection in consensus with domain experts and feature extraction to resolve multicollinearity issues. Missing values have been imputed differently for each feature. The work also identifies optimal train set size by plotting a learning curve that provides a minimum generalization gap. When testing is done on this hyperparameter tuned model, performance is enhanced with respect to the F score weighted by support and stratification since the data is imbalanced. Experimental results demonstrate improvement in performance metrics, i.e., weighted F score, precision, recall, accuracy up to 3 %, and F1 score by 8 % for Logistic Regression Classifier with the proposed model. Further, the time required for hyperparameter tuning is reduced by 50% for tree-based models, particularly Classification and Regression Tree (CART).


Sign in / Sign up

Export Citation Format

Share Document