Instance Selection Methods and Resampling Techniques for Dissimilarity Representation with Imbalanced Data Sets

Pattern Recognition - Applications and Methods - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-642-36530-0_12 ◽

2013 ◽

pp. 149-160 ◽

Author(s):

M. Millán-Giraldo ◽

V. García ◽

J. S. Sánchez

Keyword(s):

Imbalanced Data ◽

Instance Selection ◽

Data Sets ◽

Selection Methods ◽

Imbalanced Data Sets ◽

Dissimilarity Representation

OligoIS: Scalable Instance Selection for Class-Imbalanced Data Sets

IEEE Transactions on Cybernetics ◽

10.1109/tsmcb.2012.2206381 ◽

2013 ◽

Vol 43 (1) ◽

pp. 332-346 ◽

Author(s):

Nicolas Garcia-Pedrajas ◽

Javier Pérez-Rodríguez ◽

Aida de Haro-García

Keyword(s):

Imbalanced Data ◽

Instance Selection ◽

Data Sets ◽

Imbalanced Data Sets ◽

An Instance Selection Algorithm Based on ReliefF

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213019500015 ◽

2019 ◽

Vol 28 (01) ◽

pp. 1950001 ◽

Author(s):

Zeinab Abbasi ◽

Mohsen Rahmani

Keyword(s):

Missing Values ◽

Imbalanced Data ◽

Jaccard Index ◽

Instance Selection ◽

Data Sets ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Data Set ◽

Imbalanced Data Sets ◽

As data volumes grow, many methods have been proposed to extract useful data and remove noisy data. Instance selection is one such method: it retains some instances of a data set and removes the others. This paper proposes a new instance selection algorithm based on ReliefF, a feature selection algorithm. In the proposed algorithm, the nearest instances from each class are found for each instance using the Jaccard index. The weight of each instance is then calculated from its set of nearest neighbors. Finally, only the instances with the highest weights are selected. The algorithm can reduce data at a specified rate and can process instances in parallel. It works on a variety of data sets containing nominal and numeric data with missing values, and it is also suitable for imbalanced data sets. The proposed algorithm is tested on three data sets. The results show that it can reduce the volume of data without a significant change in classification accuracy on these data sets.
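The three steps named in this abstract (Jaccard-based neighbor search per class, per-instance weighting, selection of the highest-weighted instances) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weight formula, so the ReliefF-style rule used here (mean distance to the nearest instances of other classes minus mean distance to the nearest instances of the same class) is an assumption, as are the names `jaccard_distance`, `instance_weights`, `select_instances`, and the parameters `k` and `keep_fraction`. Instances are assumed to be encoded as sets of (feature, value) pairs, with missing values simply left out.

```python
from typing import Dict, Hashable, List, Sequence, Set


def jaccard_distance(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def _mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def instance_weights(
    X: Sequence[Set[Hashable]], y: Sequence[Hashable], k: int = 1
) -> List[float]:
    """Assumed ReliefF-style weight for every instance.

    weight = (average over the other classes of the mean distance to the
    k nearest instances of that class) - (mean distance to the k nearest
    instances of the same class).
    """
    weights = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Collect distances to all other instances, grouped by class label.
        by_class: Dict[Hashable, List[float]] = {}
        for j, (xj, yj) in enumerate(zip(X, y)):
            if j != i:
                by_class.setdefault(yj, []).append(jaccard_distance(xi, xj))
        hit_mean = _mean(sorted(by_class.get(yi, []))[:k])
        miss_means = [
            _mean(sorted(dists)[:k])
            for label, dists in by_class.items()
            if label != yi
        ]
        weights.append(_mean(miss_means) - hit_mean)
    return weights


def select_instances(
    X: Sequence[Set[Hashable]],
    y: Sequence[Hashable],
    keep_fraction: float = 0.5,
    k: int = 1,
) -> List[int]:
    """Return the sorted indices of the highest-weighted instances."""
    weights = instance_weights(X, y, k)
    n_keep = max(1, round(len(X) * keep_fraction))
    # Highest weight first; ties are broken by the lower index.
    ranked = sorted(range(len(X)), key=lambda i: (-weights[i], i))
    return sorted(ranked[:n_keep])


# Toy data: each instance is a set of (feature, value) pairs.
X = [
    {("color", "red"), ("shape", "round")},
    {("color", "red"), ("shape", "round")},
    {("color", "red"), ("shape", "square")},
    {("color", "blue"), ("shape", "square")},
    {("color", "blue"), ("shape", "flat")},
]
y = ["a", "a", "a", "b", "b"]
kept = select_instances(X, y, keep_fraction=0.6)
```

Because each instance's weight depends only on its own neighbor search, the loop in `instance_weights` could be distributed across workers, which is consistent with the parallelism the abstract mentions. Numeric features would need discretizing or a different distance before this set encoding applies.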

Imbalanced Data Detection Kernel Method in Closed Systems

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.756-759.3652 ◽

2013 ◽

Vol 756-759 ◽

pp. 3652-3658

Author(s):

You Li Lu ◽

Jun Luo

Keyword(s):

Kernel Methods ◽

Kernel Method ◽

Imbalanced Data ◽

Data Detection ◽

Data Sets ◽

System Call ◽

Data Set ◽

Imbalanced Data Sets ◽

Lower Complexity ◽

Building on the study of kernel methods, this paper puts forward two improved algorithms, R-SVM and I-SVDD, to cope with imbalanced data sets in closed systems. R-SVM uses the K-means algorithm to cluster the sample space, while I-SVDD improves the performance of the original SVDD through imbalanced sample training. Experiments on two system call data sets show that the two algorithms are more effective and that R-SVM has lower complexity.

Automatic Annotation of Protein Functional Class from Sparse and Imbalanced Data Sets

Data Mining and Bioinformatics - Lecture Notes in Computer Science ◽

10.1007/11960669_7 ◽

2006 ◽

pp. 65-77 ◽

Author(s):

Jaehee Jung ◽

Michael R. Thon

Keyword(s):

Imbalanced Data ◽

Functional Class ◽

Data Sets ◽

Automatic Annotation ◽

Imbalanced Data Sets

An Improved Algorithm for SVMs Classification of Imbalanced Data Sets

Engineering Applications of Neural Networks - Communications in Computer and Information Science ◽

10.1007/978-3-642-03969-0_11 ◽

2009 ◽

pp. 108-118 ◽

Author(s):

Cristiano Leite Castro ◽

Mateus Araujo Carvalho ◽

Antônio Padua Braga

Keyword(s):

Imbalanced Data ◽

Data Sets ◽

Imbalanced Data Sets ◽

Improved Algorithm

Multi-class Imbalanced Data-Sets with Linguistic Fuzzy Rule Based Classification Systems Based on Pairwise Learning

Computational Intelligence for Knowledge-Based Systems Design - Lecture Notes in Computer Science ◽

10.1007/978-3-642-14049-5_10 ◽

2010 ◽

pp. 89-98 ◽

Author(s):

Alberto Fernández ◽

María José del Jesus ◽

Francisco Herrera

Keyword(s):

Imbalanced Data ◽

Classification Systems ◽

Data Sets ◽

Imbalanced Data Sets ◽

Pairwise Learning

An Optimized Random Forest Classification Method for Processing Imbalanced Data Sets of Alzheimer's Disease

10.1109/ccdc52312.2021.9602177 ◽

2021 ◽

Author(s):

Haijing Sun ◽

Anna Wang ◽

Yun Feng ◽

Chen Liu

Keyword(s):

Alzheimer's Disease ◽

Random Forest ◽

Imbalanced Data ◽

Classification Method ◽

Data Sets ◽

Imbalanced Data Sets ◽

Random Forest Classification ◽

Forest Classification

Minority–Majority Mix mean Oversampling Technique: An Efficient Technique to Improve Classification of Imbalanced Data Sets

Advances in Intelligent Systems and Computing - Computing in Engineering and Technology ◽

10.1007/978-981-32-9515-5_48 ◽

2019 ◽

pp. 501-509

Author(s):

Sachin Patil ◽

Shefali Sonavane

Keyword(s):

Imbalanced Data ◽

Efficient Technique ◽

Data Sets ◽

Imbalanced Data Sets

A Novel Clustering Based Undersampling Algorithm for Imbalanced Data Sets Using Artificial Bee Colony Algorithm

Advances in Intelligent Systems and Computing - Innovations in Bio-Inspired Computing and Applications ◽

10.1007/978-3-030-73603-3_3 ◽

2021 ◽

pp. 32-42

Author(s):

O. A. Ajilisa ◽

V. P. Jagathyraj ◽

M. K. Sabu

Keyword(s):

Artificial Bee Colony Algorithm ◽

Artificial Bee Colony ◽

Imbalanced Data ◽

Data Sets ◽

Imbalanced Data Sets

HYBS: A Novel Hybrid Sampling Method for Learning from Imbalanced Data Sets

International Journal of Advancements in Computing Technology ◽

10.4156/ijact.vol4.issue10.33 ◽

2012 ◽

Vol 4 (10) ◽

pp. 281-288

Author(s):

Zhiyong Liu ◽

Hualong Yu

Keyword(s):

Sampling Method ◽

Imbalanced Data ◽

Data Sets ◽

Imbalanced Data Sets ◽

Hybrid Sampling
