DMP_MI: An Effective Diabetes Mellitus Classification Algorithm on Imbalanced Data With Missing Values

IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 102232-102238 ◽  
Author(s):  
Qian Wang ◽  
Weijia Cao ◽  
Jiawei Guo ◽  
Jiadong Ren ◽  
Yongqiang Cheng ◽  
...  
Author(s):  
Bin Li ◽  
Xiaofeng Gong ◽  
Chen Wang ◽  
Ruijuan Wu ◽  
Tong Bian ◽  
...  

Author(s):  
Shwet Ketu ◽  
Pramod Kumar Mishra

Abstract: Over the last decade, air pollution levels have changed drastically, and air pollution has become a critical environmental issue. It must be handled carefully when developing healthcare solutions. Reducing the impact of air pollution on human health is possible only if the data are correctly classified. Many classification problems suffer from class imbalance. Learning from imbalanced data remains a challenging task, and researchers have proposed a range of solutions over time. This paper focuses on handling imbalanced class distributions so that the classification algorithm does not lose performance. The proposed algorithm is based on the adjusting kernel scaling (AKS) method and targets multi-class imbalanced datasets. The choice of kernel function is evaluated using weighting criteria and the chi-square test. All experiments were performed on the sensor-based Indian Central Pollution Control Board (CPCB) dataset. The proposed algorithm achieves the highest accuracy, 99.66%, compared with AdaBoost (59.72%), Multi-Layer Perceptron (95.71%), GaussianNB (80.87%), and SVM (96.92%). The results are also better than those of existing methods in the literature. These results indicate that the proposed algorithm handles class imbalance efficiently while improving performance. Accurate air quality classification with the proposed algorithm can therefore help improve existing preventive policies and strengthen emergency response capabilities in severe pollution episodes.
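The abstract reports plain accuracy, which can hide poor minority-class performance on imbalanced data. The following is a minimal sketch of that effect, not the AKS method itself: the helper names (`accuracy`, `per_class_recall`, `balanced_accuracy`) and the toy labels are hypothetical and are not taken from the paper or the CPCB dataset.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

# Hypothetical, illustrative labels only: 90 majority and 10 minority samples.
y_true = ["good"] * 90 + ["moderate"] * 8 + ["severe"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["good"] * 100

print(accuracy(y_true, y_pred))
print(per_class_recall(y_true, y_pred))
print(balanced_accuracy(y_true, y_pred))
```

In this toy case the always-majority classifier scores high on accuracy while its recall on both minority classes is zero, which is why per-class metrics are usually reported alongside accuracy for imbalanced problems.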


2018 ◽  
Vol 12 (3) ◽  
pp. 341-347 ◽  
Author(s):  
Feng Wang ◽  
Shaojiang Liu ◽  
Weichuan Ni ◽  
Zhiming Xu ◽  
Zemin Qiu ◽  
...  

BMJ Open ◽  
2018 ◽  
Vol 8 (10) ◽  
pp. e022947
Author(s):  
Susanne Stolpe ◽  
Mary Ouma ◽  
Volker Winkler ◽  
Christa Meisinger ◽  
Heiko Becher ◽  
...  

Objectives: To assess the status and change in self-rated health among Aussiedler, ethnic German immigrants from the former Soviet Union, as a predictor for premature death 10 years after first assessment. Moreover, to identify subgroups which are particularly at risk of anticipated severe health impairment.
Design: Cross-sectional questionnaire.
Setting: The study was conducted in 2011/2012 in the catchment area of Augsburg, a city in southern Bavaria, Germany, that has a large community of Aussiedler.
Participants: 595 Aussiedler (231 male, 364 female, mean age 55 years), most of whom migrated to Germany between 1990 and 1999.
Outcome: Primary outcome: self-rated health (very good/good/not so good/bad) and its association with demographic, social and morbidity related variables.
Methods: Self-rated health was dichotomised as ‘very good’ and ‘good’ versus ‘not so good’ and ‘bad’. Multivariable logistic models were created. Missing values with regard to pain were addressed by a second analysis.
Results: Although low response suggests a healthier sample, the findings are alarming. Altogether 47% of the Aussiedler perceived their health as less than good, which is worse compared with the first assessment in 2000 (25% compared with 20% of the general public). High blood pressure was present in 52% of Aussiedler, 34.5% were obese, 40.7% suffered from frequent pain and 13.1% had diabetes mellitus. According to the multivariable models, individuals suffering from pain, limited mobility, diabetes mellitus and high blood pressure are particularly in jeopardy.
Conclusions: 10 years after the first assessment of self-rated health among Aussiedler, their situation has deteriorated. Tailored risk factor counselling by general practitioners is highly recommended.
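The dichotomisation step described in the methods can be sketched as follows. This is only an illustration under stated assumptions: the study fitted multivariable logistic models, whereas the sketch computes a single unadjusted odds ratio from a 2x2 table as a simpler stand-in. The function names and the eight records are hypothetical and are not study data.

```python
def dichotomise(rating):
    """Map the four-level rating to 1 (less than good) or 0 (good or better)."""
    levels = {"very good": 0, "good": 0, "not so good": 1, "bad": 1}
    return levels[rating]

def odds_ratio(exposure, outcome):
    """Unadjusted odds ratio from paired 0/1 exposure and outcome lists."""
    pairs = list(zip(exposure, outcome))
    a = sum(1 for e, o in pairs if e == 1 and o == 1)  # exposed, poor health
    b = sum(1 for e, o in pairs if e == 1 and o == 0)  # exposed, good health
    c = sum(1 for e, o in pairs if e == 0 and o == 1)  # unexposed, poor health
    d = sum(1 for e, o in pairs if e == 0 and o == 0)  # unexposed, good health
    return (a * d) / (b * c)

# Hypothetical records for illustration only (not taken from the study).
ratings = ["bad", "not so good", "good", "very good",
           "not so good", "good", "good", "bad"]
frequent_pain = [1, 1, 0, 0, 1, 1, 0, 0]

poor_health = [dichotomise(r) for r in ratings]
print(poor_health)
print(odds_ratio(frequent_pain, poor_health))
```

A multivariable model would additionally adjust each association for the other covariates, which a single 2x2 table cannot do.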


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Fei Yang ◽  
Jiazhi Du ◽  
Jiying Lang ◽  
Weigang Lu ◽  
Lei Liu ◽  
...  

Electrocardiogram (ECG) signals are critical to classifying cardiac arrhythmia with machine learning methods. In practice, ECG datasets usually contain multiple missing values due to faults or distortion. Unfortunately, many established classification algorithms require a fully complete matrix as input. It is therefore necessary to impute the missing data to increase the effectiveness of classification for datasets with a few missing values. In this paper, we compare the main methods for estimating the missing values in electrocardiogram data, e.g., the “Zero method”, “Mean method”, “PCA-based method”, and “RPCA-based method”, and then propose a novel KNN-based classification algorithm, i.e., a modified kernel Difference-Weighted KNN classifier (MKDF-WKNN), which is suited to classifying imbalanced datasets. The experimental results on the UCI database indicate that the “RPCA-based method” can successfully handle missing values in the arrhythmia dataset no matter how many values are missing, and that our proposed classification algorithm, MKDF-WKNN, is superior to other state-of-the-art algorithms such as KNN, DS-WKNN, DF-WKNN, and KDF-WKNN on uneven datasets, where the imbalance affects classification accuracy.
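The two simplest baselines named above, the “Zero method” and the “Mean method”, might look roughly like the sketch below. It assumes that missing entries are encoded as NaN and that the mean is taken per column over the observed values. The function names and the small matrix are hypothetical, and the PCA-based and RPCA-based methods are not shown.

```python
import math

def impute_zero(rows):
    """Replace every missing (NaN) entry with 0.0."""
    return [[0.0 if math.isnan(v) else v for v in row] for row in rows]

def impute_mean(rows):
    """Replace every missing (NaN) entry with the mean of its column's observed values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        # Assumption: fall back to 0.0 when a column has no observed value.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(row)]
            for row in rows]

# Hypothetical feature matrix with one missing value per row.
nan = math.nan
X = [
    [1.0, nan, 3.0],
    [2.0, 4.0, nan],
    [nan, 6.0, 9.0],
]

print(impute_zero(X))
print(impute_mean(X))
```

Both functions return a new matrix and leave the input untouched, so the two imputations can be compared on the same data before a classifier is trained.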

