Mortality Prediction of ICU Patient Based on Imbalanced Data Classification Model

A Novel Model for Imbalanced Data Classification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6145 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6680-6687

Author(s):

Jian Yin ◽

Chunjing Gan ◽

Kaiqi Zhao ◽

Xuan Lin ◽

Zhe Quan ◽

...

Keyword(s):

Imbalanced Data ◽

Data Classification ◽

Classification Performance ◽

Classification Model ◽

Proposed Model ◽

Imbalanced Data Classification ◽

Public Datasets ◽

Distribution Cost ◽

Novel Model ◽

Learning Data

Imbalanced data classification has recently received much attention because of its wide range of applications. Existing research has attempted to improve classification performance by considering factors such as the imbalanced distribution, cost-sensitive learning, data space improvement, and ensemble learning. Nevertheless, most existing methods address only some of these factors. In this work, we propose a novel imbalanced data classification model that considers all of them. To evaluate the proposed model, we conducted experiments on 14 public datasets. The results show that our model outperforms state-of-the-art methods in terms of recall, G-mean, F-measure, and AUC.
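For readers unfamiliar with the four metrics named in this abstract, the following is a minimal sketch of how they are commonly computed for a binary task. It is not taken from the paper; the function names `imbalance_metrics` and `auc` and the toy labels are assumptions made for illustration, and the minority class is assumed to be labeled 1.

```python
import math


def imbalance_metrics(y_true, y_pred, positive=1):
    """Recall, G-mean and F-measure for a binary task.

    `positive` is the label of the minority class (an assumption of this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    recall = tp / (tp + fn) if tp + fn else 0.0        # minority-class hit rate
    specificity = tn / (tn + fp) if tn + fp else 0.0   # majority-class hit rate
    precision = tp / (tp + fp) if tp + fp else 0.0

    # G-mean is the geometric mean of the two per-class hit rates.
    g_mean = math.sqrt(recall * specificity)
    # F-measure (F1) is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "g_mean": g_mean, "f_measure": f_measure}


def auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample from each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 2 minority samples among 10, one of them missed by the classifier.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = imbalance_metrics(y_true, y_pred)
```

On this toy data plain accuracy is 0.8 although half of the minority class is missed, which is why recall, G-mean, and F-measure are usually reported alongside AUC for imbalanced problems.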

Imbalanced Data Classification Algorithm Based on Clustering and SVM

Journal of Circuits, Systems and Computers ◽

10.1142/s0218126621500365 ◽

2020 ◽

pp. 2150036

Author(s):

Bo Huang ◽

Yimin Zhu ◽

Zhongzhen Wang ◽

Zhijun Fang

Keyword(s):

Class Imbalance ◽

Imbalanced Data ◽

Data Classification ◽

Classification Algorithm ◽

Classification Model ◽

Imbalance Data ◽

Imbalance Problem ◽

Imbalanced Data Classification ◽

Under Sampling ◽

Feature Dimension

Class-imbalance learning is one of the most significant research topics in data mining and machine learning. The imbalance problem means that one class has many more samples than the others. To deal with the issues of low classification accuracy and high time complexity, this paper proposes a novel imbalanced data classification algorithm based on clustering and SVM. The algorithm under-samples the majority class based on the distribution characteristics of the minority samples. First, specific clusters are detected by cluster analysis on the minority class. Second, a cluster boundary strategy is proposed to eliminate the adverse influence of noise samples. To construct a balanced dataset from imbalanced data, this paper proposes three principles for under-sampling the majority samples according to the characteristics of the samples in each cluster. Finally, the optimal classification model is obtained from the linear combination of hybrid-kernel SVMs. Experiments on datasets from the UCI and KEEL databases show that the algorithm effectively decreases the interference of noise samples. Compared with SMOTE and Fast-CBUS, the proposed algorithm not only reduces the feature dimension but also generally improves the precision on the minority classes under different labeled sample rates.
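The general idea of under-sampling the majority class according to where the minority samples lie can be sketched as below. This is a deliberately simplified stand-in, not the authors' algorithm: the abstract does not spell out the three under-sampling principles or the cluster boundary strategy, so the sketch substitutes a plain nearest-minority-distance heuristic. The function name `undersample_near_minority` and the toy points are hypothetical.

```python
import math


def undersample_near_minority(majority, minority, keep):
    """Keep the `keep` majority points that lie closest to any minority point.

    Simplified heuristic (an assumption of this sketch): majority samples near
    the minority region are treated as the most informative ones to retain.
    """
    if not minority:
        raise ValueError("minority set must not be empty")

    def nearest_minority_distance(point):
        # Euclidean distance from a majority point to its nearest minority point.
        return min(math.dist(point, m) for m in minority)

    ranked = sorted(majority, key=nearest_minority_distance)
    return ranked[:keep]


# Toy 2-D data: 2 minority points and 4 majority points.
minority = [(0.0, 0.0), (1.0, 1.0)]
majority = [(0.5, 0.5), (5.0, 5.0), (2.0, 1.0), (9.0, 9.0)]

# Retain as many majority points as there are minority points.
balanced_majority = undersample_near_minority(majority, minority, keep=len(minority))
```

Setting `keep` to the minority count yields a 1:1 class ratio. A cluster-based method such as the one described would first group the minority samples and filter noise before selecting majority samples, which this sketch omits.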

A novel imbalanced data classification approach for suicidal ideation detection on social media

Computing ◽

10.1007/s00607-021-00984-0 ◽

2021 ◽

Author(s):

Mohamed Ali Ben Hassine ◽

Safa Abdellatif ◽

Sadok Ben Yahia

Keyword(s):

Social Media ◽

Suicidal Ideation ◽

Imbalanced Data ◽

Data Classification ◽

Classification Approach ◽

Imbalanced Data Classification

Radial-Based Undersampling for imbalanced data classification

Pattern Recognition ◽

10.1016/j.patcog.2020.107262 ◽

2020 ◽

Vol 102 ◽

pp. 107262 ◽

Cited By ~ 7

Author(s):

Michał Koziarski

Keyword(s):

Imbalanced Data ◽

Data Classification ◽

Imbalanced Data Classification

Research of Medical High-Dimensional Imbalanced Data Classification Ensemble Feature Selection Algorithm with Random Forest

2017 International Conference on Smart Grid and Electrical Automation (ICSGEA) ◽

10.1109/icsgea.2017.158 ◽

2017 ◽

Cited By ~ 2

Author(s):

Min Zhu ◽

Bo Su ◽

Gangmin Ning

Keyword(s):

Feature Selection ◽

Random Forest ◽

Imbalanced Data ◽

Data Classification ◽

High Dimensional ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Imbalanced Data Classification

Data reduction and stacking for imbalanced data classification

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-179335 ◽

2019 ◽

Vol 37 (6) ◽

pp. 7239-7249

Author(s):

Ireneusz Czarnowski ◽

Piotr Jędrzejowicz

Keyword(s):

Data Reduction ◽

Imbalanced Data ◽

Data Classification ◽

Imbalanced Data Classification

An Under-Sampling Method with Support Vectors in Multi-class Imbalanced Data Classification

2019 13th International Conference on Software, Knowledge, Information Management and Applications (SKIMA) ◽

10.1109/skima47702.2019.8982391 ◽

2019 ◽

Cited By ~ 1

Author(s):

Md. Yasir Arafat ◽

Sabera Hoque ◽

Shuxiang Xu ◽

Dewan Md. Farid

Keyword(s):

Sampling Method ◽

Imbalanced Data ◽

Data Classification ◽

Support Vectors ◽

Imbalanced Data Classification ◽

Under Sampling

Imbalanced data classification using complementary fuzzy support vector machine techniques and SMOTE

2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC) ◽

10.1109/smc.2017.8122737 ◽

2017 ◽

Cited By ~ 5

Author(s):

Ratchakoon Pruengkarn ◽

Kok Wai Wong ◽

Chun Che Fung

Keyword(s):

Support Vector Machine ◽

Imbalanced Data ◽

Data Classification ◽

Support Vector ◽

Fuzzy Support Vector Machine ◽

Imbalanced Data Classification

Imbalanced data classification algorithm based on boosting and cascade model

2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC) ◽

10.1109/icsmc.2012.6378183 ◽

2012 ◽

Author(s):

Xiaolong Zhang ◽

Chao Cheng

Keyword(s):

Imbalanced Data ◽

Data Classification ◽

Classification Algorithm ◽

Cascade Model ◽

Imbalanced Data Classification

UFFDFR: Undersampling framework with denoising, fuzzy c-means clustering, and representative sample selection for imbalanced data classification

Information Sciences ◽

10.1016/j.ins.2021.07.053 ◽

2021 ◽

Author(s):

Ming Zheng ◽

Tong Li ◽

Xiaoyao Zheng ◽

Qingying Yu ◽

Chuanming Chen ◽

...

Keyword(s):

Representative Sample ◽

Sample Selection ◽

Imbalanced Data ◽

Data Classification ◽

Fuzzy C Means ◽

Imbalanced Data Classification ◽

Fuzzy C Means Clustering ◽

Selection For
