class imbalance problem Latest Research Papers

HCBST: An Efficient Hybrid Sampling Technique for Class Imbalance Problems

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3488280 ◽

2022 ◽

Vol 16 (3) ◽

pp. 1-37

Author(s):

Robert A. Sowah ◽

Bernard Kuditchar ◽

Godfrey A. Mills ◽

Amevi Acakpovi ◽

Raphael A. Twum ◽

...

Keyword(s):

Geometric Mean ◽

Class Imbalance ◽

Sampling Technique ◽

Data Repository ◽

Support Vector ◽

Classification Algorithms ◽

Class Imbalance Problem ◽

Imbalance Problem ◽

High Degree ◽

Hybrid Sampling

The class imbalance problem is prevalent in many real-world domains and has become an active area of research. In binary classification, imbalanced learning refers to learning from a dataset that is heavily skewed toward the negative class. This skew causes classification algorithms to perform poorly when predicting the positive class on new examples. Data resampling, which manipulates the training data before standard classification techniques are applied, is among the most commonly used ways of dealing with the class imbalance problem. This article presents a new hybrid sampling technique that significantly improves the overall performance of classification algorithms on the class imbalance problem. The proposed method, called the Hybrid Cluster-Based Undersampling Technique (HCBST), combines a cluster undersampling technique, which undersamples the majority instances, with an oversampling technique derived from Sigma Nearest Oversampling based on Convex Combination, which oversamples the minority instances, to solve the class imbalance problem with a high degree of accuracy and reliability. The proposed algorithm was tested on 11 datasets with varying degrees of imbalance from the National Aeronautics and Space Administration Metric Data Program data repository and the University of California Irvine Machine Learning data repository. Results were compared with classification algorithms such as K-nearest neighbours, support vector machines, decision tree, random forest, neural network, AdaBoost, naïve Bayes, and quadratic discriminant analysis. Test results revealed that, for the same datasets, the HCBST performed better, with average scores of 0.73, 0.67, and 0.35 for area under the curve, geometric mean, and Matthews Correlation Coefficient, respectively, across all the classifiers used in this study. The HCBST has the potential to improve performance on the class imbalance problem and, by extension, the various applications that rely on it for a solution.
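The abstract reports geometric mean and Matthews Correlation Coefficient rather than plain accuracy. The sketch below is not taken from the paper; it uses the standard textbook definitions of both measures, computed from confusion-matrix counts, to illustrate why they suit skewed data. The function names and the toy labels are assumptions made for illustration.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def geometric_mean(tp, fp, tn, fn):
    """G-mean: square root of sensitivity times specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

def matthews_corrcoef(tp, fp, tn, fn):
    """MCC; returns 0.0 when the denominator is zero (a common convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy data (illustrative only): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10  # a classifier that ignores the minority class

counts = confusion_counts(y_true, always_negative)
accuracy = (counts[0] + counts[2]) / len(y_true)
print(accuracy)                    # 0.8, which looks acceptable
print(geometric_mean(*counts))     # 0.0, because no positive is ever found
print(matthews_corrcoef(*counts))  # 0.0
```

A classifier that never predicts the minority class still reaches 80% accuracy here, while both G-mean and MCC drop to zero.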

Novel regularization method for the class imbalance problem

Expert Systems with Applications ◽

10.1016/j.eswa.2021.115974 ◽

2022 ◽

Vol 188 ◽

pp. 115974

Author(s):

Bosung Kim ◽

Youngjoong Ko ◽

Jungyun Seo

Keyword(s):

Regularization Method ◽

Class Imbalance ◽

Class Imbalance Problem ◽

Imbalance Problem

A classification method to classify bone marrow cells with class imbalance problem

Biomedical Signal Processing and Control ◽

10.1016/j.bspc.2021.103296 ◽

2022 ◽

Vol 72 ◽

pp. 103296

Author(s):

Liang Guo ◽

Peiduo Huang ◽

Dehao Huang ◽

Zilan Li ◽

Chenglong She ◽

...

Keyword(s):

Bone Marrow ◽

Bone Marrow Cells ◽

Class Imbalance ◽

Classification Method ◽

Class Imbalance Problem ◽

Imbalance Problem ◽

Marrow Cells

A Framework for Pedestrian Attribute Recognition Using Deep Learning

Applied Sciences ◽

10.3390/app12020622 ◽

2022 ◽

Vol 12 (2) ◽

pp. 622

Author(s):

Saadman Sakib ◽

Kaushik Deb ◽

Pranab Kumar Dhar ◽

Oh-Jin Kwon

Keyword(s):

Deep Learning ◽

Transfer Learning ◽

Class Imbalance ◽

Recognition Task ◽

Fine Tuning ◽

Class Imbalance Problem ◽

Imbalance Problem ◽

Technological Advances ◽

Learning Techniques ◽

Attribute Recognition

The pedestrian attribute recognition task is growing in popularity because of its significant role in surveillance scenarios. With recent technological advances, deep learning has come to the forefront of computer vision. Previous works applied deep learning in different ways to recognize pedestrian attributes. The results are satisfactory, but there is still scope for improvement. Transfer learning is becoming more popular because of its strong performance in reducing computation cost and mitigating data scarcity. This paper proposes a framework that can work in surveillance scenarios to recognize pedestrian attributes. The Mask R-CNN object detector extracts the pedestrians. Additionally, we applied transfer learning techniques on different CNN architectures, i.e., Inception ResNet v2, Xception, ResNet 101 v2, and ResNet 152 v2. The main contribution of this paper is fine-tuning the ResNet 152 v2 architecture, which is performed by freezing layers in several configurations: last 4, 8, 12, 14, 20, none, and all. Moreover, a data balancing technique, oversampling, is applied to resolve the class imbalance problem of the dataset, and the usefulness of this technique is analyzed. Our proposed framework outperforms state-of-the-art methods, achieving 93.41% mA and 89.24% mA on the RAP v2 and PARSE100K datasets, respectively.
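The abstract does not say which oversampling variant was used. As a minimal sketch, assuming the simplest form (random oversampling with replacement, applied to a single binary attribute), balancing could look like the following. The function name `random_oversample` and the toy image identifiers are hypothetical.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Toy data (illustrative only): four images without the attribute, one with it.
images = ["img_a", "img_b", "img_c", "img_d", "img_e"]
has_attribute = [0, 0, 0, 0, 1]
balanced_images, balanced_labels = random_oversample(images, has_attribute)
print(Counter(balanced_labels))  # both classes now have 4 samples
```

Because the copies are exact duplicates, this kind of balancing should be applied to the training split only, so that duplicated samples do not leak into validation or test data.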

Credit Card Fraud Detection: An Exploration of Different Sampling Methods to Solve the Class Imbalance Problem

Algorithms for Intelligent Systems - Proceedings of the International Conference on Paradigms of Communication, Computing and Data Sciences ◽

10.1007/978-981-16-5747-4_71 ◽

2022 ◽

pp. 825-837

Author(s):

Mythili Krishnan ◽

Madhan Kumar Srinivasan

Keyword(s):

Credit Card ◽

Sampling Methods ◽

Class Imbalance ◽

Fraud Detection ◽

Class Imbalance Problem ◽

Credit Card Fraud ◽

Imbalance Problem

Class Imbalance Learning to Heterogeneous Cross Software Projects Defect Prediction

International Journal of Software Innovation ◽

10.4018/ijsi.292021 ◽

2022 ◽

Vol 10 (1) ◽

pp. 0-0

Keyword(s):

Research Work ◽

Class Imbalance ◽

Training Dataset ◽

Software Projects ◽

Class Imbalance Problem ◽

Software Application ◽

Imbalance Problem ◽

Under Sampling ◽

Imbalance Learning ◽

Class Imbalance Learning

Heterogeneous CPDP (HCPDP) attempts to forecast defects in a software application that has insufficient previous defect data. From a Class Imbalance Problem (CIP) perspective, however, one should have a clear view of the data distribution in the training dataset; otherwise, the trained model will produce biased classification results. Class Imbalance Learning (CIL) is the method of achieving an equilibrium ratio between the two classes in imbalanced datasets. There is a range of effective solutions for managing CIP, such as the resampling techniques Over-Sampling (OS) and Under-Sampling (US). The proposed research work employs the Synthetic Minority Oversampling TEchnique (SMOTE) and the Random Under Sampling (RUS) technique to handle CIP. In addition, the paper proposes a novel four-phase HCPDP model and contrasts the efficiency of the basic HCPDP model with CIP against its efficiency after handling CIP using SMOTE and RUS, with three prediction pairs. Results show that training performance with SMOTE is substantially improved, but RUS displays variations in relation to HCPDP for all three prediction pairs.
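The two resampling techniques named in the abstract can be sketched in a few lines. This is a simplified illustration of the general SMOTE idea (interpolating between a minority point and one of its nearest minority neighbours) and of random undersampling. It is not the paper's pipeline, and the parameter names and toy points are assumptions.

```python
import math
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority points by interpolating between a random
    minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

def random_undersample(majority, n_keep, seed=0):
    """Keep a random subset of the majority class, without replacement."""
    rng = random.Random(seed)
    return rng.sample(majority, n_keep)

# Toy data (illustrative only): 3 defective modules, 100 clean ones.
defective = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
clean = [[float(i), float(i % 7)] for i in range(100)]

new_defective = smote(defective, n_synthetic=5, k=2)
kept_clean = random_undersample(clean, n_keep=8)
print(len(defective) + len(new_defective), len(kept_clean))  # 8 8
```

SMOTE adds new points inside the region spanned by existing minority samples, whereas RUS discards majority samples and can lose information, which is one plausible reason the two behave differently in practice.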

Improving Detection of False Data Injection Attacks Using Machine Learning with Feature Selection and Oversampling

Energies ◽

10.3390/en15010212 ◽

2021 ◽

Vol 15 (1) ◽

pp. 212

Author(s):

Ajit Kumar ◽

Neetesh Saxena ◽

Souhwan Jung ◽

Bong Jun Choi

Keyword(s):

Machine Learning ◽

Class Imbalance ◽

Machine Learning Algorithms ◽

Skewed Distribution ◽

Critical Infrastructures ◽

Detection Accuracy ◽

Class Imbalance Problem ◽

False Data Injection ◽

Injection Attacks ◽

Imbalance Problem

Critical infrastructures have recently been integrated with digital controls to support intelligent decision making. Although this integration provides various benefits and improvements, it also exposes the system to new cyberattacks. In particular, the injection of false data and commands into communication is one of the most common and fatal cyberattacks in critical infrastructures. Hence, in this paper, we investigate the effectiveness of machine-learning algorithms in detecting False Data Injection Attacks (FDIAs). In particular, we focus on two of the most widely used critical infrastructures, namely power systems and water treatment plants. This study focuses on tackling two key technical issues: (1) finding the set of best features under a different combination of techniques and (2) resolving the class imbalance problem using oversampling methods. We evaluate the performance of each algorithm in terms of time complexity and detection accuracy to meet the time-critical requirements of critical infrastructures. Moreover, we address the inherent skewed distribution problem and the data imbalance problem commonly found in many critical infrastructure datasets. Our results show that the considered minority oversampling techniques can improve the Area Under Curve (AUC) of GradientBoosting, AdaBoost, and kNN by 10–12%.
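The reported gains are stated in terms of AUC. As a hedged illustration of what that number measures, the sketch below computes the ROC AUC through its rank interpretation: the probability that a randomly chosen attack sample receives a higher score than a randomly chosen normal sample, with ties counted as half. The function name and the toy scores are assumptions, not data from the study.

```python
def roc_auc(y_true, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Toy data (illustrative only): 1 = attack, 0 = normal traffic.
labels = [0, 0, 1, 1]
detector_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, detector_scores))  # 0.75
```

Since the value depends only on how positives rank against negatives, it is not inflated by a large majority class the way accuracy is. The pairwise loop is quadratic, so a sort-based formulation would be preferable for large datasets.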

Inductive Power Transmission System for Electric Car Charging Phase: Modeling plus Frequency Analysis

World Electric Vehicle Journal ◽

10.3390/wevj12040267 ◽

2021 ◽

Vol 12 (4) ◽

pp. 267

Author(s):

Naoui Mohamed ◽

Flah Aymen ◽

Mohammed Alqarni

Keyword(s):

Power Transmission ◽

Class Imbalance ◽

Frequency Variation ◽

Class Imbalance Problem ◽

Inductive Power Transfer ◽

Imbalance Problem ◽

Summary Graph ◽

Inductive Power Transmission ◽

The Relationship ◽

Inductive Power

The effectiveness of inductive power transfer (IPT) presents a serious challenge for improving the overall performance of the recharge system. Electric vehicles (EVs) need to be charged rapidly and at maximum power when they are charged wirelessly. According to various studies, the performance of this recharge system depends on several factors, and the resonance frequency is one of the parameters that can influence it. In this paper, we explore the relationship between the obtained power and the input signal frequency for charging a lithium battery, solve the class imbalance problem, and determine the maximum allowed frequency. To obtain the results, a mathematical model was first created to demonstrate the relationship; the dynamic model was then validated and tested using the Matlab Simulink platform. The performance of the worldwide wireless recharging system in terms of frequency variation is depicted in a summary graph.

Attention-based Graph ResNet with focal loss for epileptic seizure detection

Journal of Ambient Intelligence and Smart Environments ◽

10.3233/ais-210086 ◽

2021 ◽

pp. 1-13

Author(s):

Changxu Dong ◽

Yanna Zhao ◽

Gaobo Zhang ◽

Mingrui Xue ◽

Dengyu Chu ◽

...

Keyword(s):

Class Imbalance ◽

Seizure Detection ◽

Brain Regions ◽

Class Imbalance Problem ◽

Imbalance Problem ◽

Topological Relationships ◽

Average Accuracy ◽

Proposed Model ◽

Eeg Data ◽

The Central Nervous System

Epilepsy is a chronic brain disease resulting from lesions of the central nervous system, which leads to repeated seizures in patients. Automatic seizure detection with electroencephalogram (EEG) has witnessed great progress. However, existing methods have paid little attention to the topological relationships between different EEG electrodes. Recent neuroscience research has demonstrated the connectivity between different brain regions. In addition, class imbalance is a common problem in EEG-based seizure detection: the duration of epileptic EEG signals is much shorter than that of normal signals. To deal with these two challenges, we propose to model the multi-channel EEG data using the Attention-based Graph ResNet (AGRN). In particular, each channel of the EEG signal represents a node of the graph, and the inter-channel relations are modeled via the adjacency matrix of the graph. The loss function of the AGRN model is redesigned using focal loss to cope with the class-imbalance problem. The proposed AGRN model with focal loss can learn discriminative features from the raw EEG data. Experiments are carried out on the CHB-MIT dataset. The proposed model achieves an average accuracy of 98.70%, a sensitivity of 97.94%, a specificity of 98.66% and a precision of 98.62%. The Area Under the ROC Curve (AUC) is 98.69%.
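The abstract does not give the exact loss formulation or its hyperparameters. The sketch below therefore uses the commonly cited binary form of focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), with gamma = 2 and alpha = 0.25 as assumed defaults, to show how it down-weights easy examples relative to hard ones. The function name and sample probabilities are illustrative.

```python
import math

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for one example.

    p: predicted probability of the positive (seizure) class
    y: true label, 1 for seizure and 0 for normal
    gamma, alpha: assumed defaults, not values taken from the paper
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)               # keep log() finite
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# A seizure window the model already classifies confidently...
easy = binary_focal_loss(0.9, 1)
# ...versus one it gets badly wrong.
hard = binary_focal_loss(0.1, 1)
print(easy < hard)  # True: the hard example dominates the loss

# With gamma = 0 the modulating factor vanishes and the loss reduces to
# alpha-weighted cross-entropy.
print(binary_focal_loss(0.9, 1, gamma=0.0, alpha=0.5))
```

The factor (1 - p_t)^gamma shrinks the contribution of well-classified windows, so the abundant, easily recognised normal segments contribute less to the total loss than they would under plain cross-entropy.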

A Novel Oversampling Technique to Solve Class Imbalance Problem: A Case Study of Students’ Grades Evaluation

10.1109/contesa52813.2021.9657151 ◽

2021 ◽

Author(s):

Dilshad Jahin ◽

Israt Jahan Emu ◽

Subrina Akter ◽

Muhammed J.A. Patwary ◽

Mohammad Arif Sobhan Bhuiyan ◽

...

Keyword(s):

Class Imbalance ◽

Class Imbalance Problem ◽

Imbalance Problem
