Supervised Feature Selection With Orthogonal Regression and Feature Weighting

Abstract. Comparison of Weighted Criteria and Selection Criteria for Employee Performance Grouping with Fuzzy C-Means. The development of information technology makes many tasks easier for companies and affects how they operate. Employees are one of the factors that influence a company's development. Employee performance is assessed on four criteria: discipline, honesty, cooperation, and work quality. The purpose of this study is to group employees by performance using fuzzy c-means (FCM). Two kinds of clustering are examined: clustering with feature weighting and clustering with feature selection. With feature weights of 25%, 30%, 25%, and 20% for work discipline, honesty, cooperation, and work quality, respectively, clustering with feature weighting gives an accuracy of 0.8462. With feature selection, fuzzy c-means gives an accuracy of 1, with work discipline and honesty identified as the critical features. Comparing the two approaches, the study concludes that honesty is the most important feature for clustering employees by performance.

Keywords: clustering, employees, fuzzy c-means, feature weighting, feature selection
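As a rough illustration of clustering with feature weighting, the sketch below runs fuzzy c-means with a weighted squared Euclidean distance. Only the weights 25%, 30%, 25%, and 20% come from the abstract. The employee scores, the function name `weighted_fcm`, the fuzzifier `m = 2`, and the choice to apply the weights inside the distance are assumptions made for this example, and the paper's own procedure may differ.

```python
import random

def weighted_fcm(data, weights, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Fuzzy c-means with a feature-weighted squared Euclidean distance."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])

    def dist2(x, c):
        # Weighted squared distance; the small constant avoids division by zero.
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, c)) + 1e-12

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-6 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    power = 1.0 / (m - 1.0)
    for _ in range(n_iter):
        # Centres are means weighted by the memberships raised to the power m.
        centers = []
        for k in range(n_clusters):
            um = [u[i][k] ** m for i in range(n)]
            denom = sum(um)
            centers.append([sum(um[i] * data[i][j] for i in range(n)) / denom
                            for j in range(d)])
        # Memberships are proportional to inverse weighted distances.
        for i in range(n):
            inv = [(1.0 / dist2(data[i], c)) ** power for c in centers]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return u, centers

# Hypothetical scores: discipline, honesty, cooperation, work quality.
scores = [
    [90, 85, 88, 92], [88, 90, 85, 90], [92, 87, 90, 89],
    [60, 55, 58, 62], [58, 60, 62, 60], [62, 57, 60, 65],
]
weights = [0.25, 0.30, 0.25, 0.20]  # weights quoted in the abstract
u, centers = weighted_fcm(scores, weights)
labels = [row.index(max(row)) for row in u]  # hard label = largest membership
```

Each row of `u` holds one employee's membership degrees, and `labels` assigns each employee to the cluster with the largest degree.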

Evolutionary Computation for Feature Manipulation in Salient Object Detection

10.26686/wgtn.17145578.v1 ◽

2021 ◽

Author(s):

Shima Afzali Vahed Moghaddam

Keyword(s):

Feature Selection ◽

Feature Space ◽

Salient Object Detection ◽

Feature Weighting ◽

Feature Combination ◽

Salient Object ◽

Combination Process ◽

Input Feature ◽

And Performance ◽

High Level

The human visual system copes efficiently with complex natural scenes containing objects at different scales by using the visual attention mechanism. Salient object detection (SOD) aims to simulate the human visual system's ability to prioritize objects for high-level processing. SOD identifies and localizes the most attention-grabbing object(s) in a scene and separates the whole extent of the object(s) from the scene. Significant SOD research has been dedicated to designing and introducing new features to the domain. The existing saliency feature space suffers from several difficulties: it is high-dimensional, its features are not equally important, some features are irrelevant, and the original features are not informative enough. These difficulties can lead to various performance limitations. Feature manipulation is the process of improving the input feature space to enhance learning quality and performance. Evolutionary computation (EC) techniques have been employed in a wide range of tasks because of their powerful search abilities. Genetic programming (GP) and particle swarm optimization (PSO) are well-known EC techniques that have been used for feature manipulation. The overall goal of this thesis is to develop feature manipulation methods, including feature weighting, feature selection, and feature construction, using EC techniques to improve the input feature set for SOD.

This thesis proposes a feature weighting method that uses PSO to explore the relative contribution of each saliency feature in the feature combination process. Saliency features are features extracted from different levels of an image (e.g., pixel, segmentation) to compute saliency values over the entire image. The experimental results show that different datasets favour different weights for the employed features. The results also reveal that, by considering the importance of each feature in the combination process, the proposed method achieves better performance than the competing methods.

This thesis proposes a new bottom-up SOD method that detects salient objects by constructing two new informative saliency features and designing a new feature combination framework. The method aims to develop features that target different regions of the image, and it strikes a good balance between computational time and performance.

This thesis proposes a GP-based method to automatically construct foreground and background saliency features. The automatically constructed features do not require domain knowledge and are more informative than the manually constructed features. The results show that GP is robust to changes in the input feature set (e.g., adding more features) and improves performance by introducing more informative features to the SOD domain.

This thesis proposes a GP-based SOD method that automatically produces saliency maps (2-D maps containing saliency values) for different types of images. This method applies feature selection and feature combination during the learning process. GP's built-in feature selection process selects informative features from the original set and combines them to produce the final saliency map. The results show that GP can explore a large search space and find a good way to combine different input features.

This thesis is the first to use GP to construct high-level saliency features from low-level features for SOD, with the aim of improving performance, particularly on challenging and complex SOD tasks. The proposed method constructs fewer features that achieve better saliency performance than the original full feature set.
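The PSO-based feature weighting idea can be sketched on toy data as follows. This is not the thesis's method: the function name `pso_feature_weights`, the mean-squared-error fitness, the synthetic features and target, and the PSO constants (`inertia`, `c1`, `c2`) are all assumptions chosen to keep the example small.

```python
import random

def pso_feature_weights(features, target, n_particles=20, n_iter=60, seed=0,
                        inertia=0.7, c1=1.5, c2=1.5):
    """Search for feature weights in [0, 1] with a basic global-best PSO."""
    rng = random.Random(seed)
    d = len(features[0])

    def loss(w):
        # Mean squared error of the weighted feature combination.
        return sum((sum(wj * fj for wj, fj in zip(w, f)) - t) ** 2
                   for f, t in zip(features, target)) / len(target)

    pos = [[rng.random() for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_loss = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_loss[i])
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]
    initial_best = gbest_loss

    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(d):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends inertia, personal best, and global best.
                vel[i][j] = (inertia * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                # Clamp each weight to the interval [0, 1].
                pos[i][j] = min(1.0, max(0.0, pos[i][j] + vel[i][j]))
            cur = loss(pos[i])
            if cur < pbest_loss[i]:
                pbest[i], pbest_loss[i] = pos[i][:], cur
                if cur < gbest_loss:
                    gbest, gbest_loss = pos[i][:], cur
    return gbest, gbest_loss, initial_best

# Synthetic data: the target is a fixed weighted sum of three toy features.
data_rng = random.Random(1)
features = [[data_rng.random() for _ in range(3)] for _ in range(50)]
target = [0.6 * f[0] + 0.3 * f[1] + 0.1 * f[2] for f in features]
best_w, final_loss, initial_loss = pso_feature_weights(features, target)
```

In a real SOD setting, the fitness would compare a combined saliency map against ground-truth masks rather than a synthetic target.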

Improved Stability of Feature Selection by Combining Instance and Feature Weighting

Research and Development in Intelligent Systems XXXI ◽

10.1007/978-3-319-12069-0_3 ◽

2014 ◽

pp. 35-49

Author(s):

Gabriel Prat ◽

Lluís A. Belanche

Keyword(s):

Feature Selection ◽

Feature Weighting ◽

Improved Stability

Feature Selection Under Orthogonal Regression with Redundancy Minimizing

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9053249 ◽

2020 ◽

Author(s):

Xueyuan Xu ◽

Xia Wu

Keyword(s):

Feature Selection ◽

Orthogonal Regression

A General Framework for Feature Selection under Orthogonal Regression with Global Redundancy Minimization

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2021.3059523 ◽

2021 ◽

pp. 1-1

Author(s):

Xueyuan Xu ◽

Xia Wu ◽

Fulin Wei ◽

Wei Zhong ◽

Feiping Nie

Keyword(s):

Feature Selection ◽

General Framework ◽

Orthogonal Regression

A comparative analysis of meta-heuristic optimization algorithms for feature selection and feature weighting in neural networks

Evolutionary Intelligence ◽

10.1007/s12065-021-00634-6 ◽

2021 ◽

Author(s):

P. M. Diaz ◽

M. Julie Emerald Jiju

Keyword(s):

Neural Networks ◽

Feature Selection ◽

Comparative Analysis ◽

Optimization Algorithms ◽

Feature Weighting ◽

Heuristic Optimization

Feature Selection on Elite Hybrid Binary Cuckoo Search in Binary Label Classification

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/5588385 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Maoxian Zhao ◽

Yue Qin

Keyword(s):

Feature Selection ◽

Search Algorithm ◽

Binary Classification ◽

Cuckoo Search ◽

Cuckoo Search Algorithm ◽

Feature Weighting ◽

Svm Classifier ◽

Binary Particle Swarm Optimization ◽

Elite Strategy ◽

Low Dimensional

To address the low optimization accuracy of the cuckoo search algorithm, a new search algorithm, the Elite Hybrid Binary Cuckoo Search (EHBCS) algorithm, is proposed, which improves on cuckoo search through feature weighting and an elite strategy. The EHBCS algorithm is designed for feature selection and is evaluated with an SVM classifier on a series of binary classification datasets, including low-dimensional and high-dimensional samples. The experimental results show that the EHBCS algorithm achieves better classification performance than the binary genetic algorithm and the binary particle swarm optimization algorithm. In addition, we explain its superiority in terms of standard deviation, sensitivity, specificity, precision, and F-measure.
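For reference, the evaluation measures named above can be computed from a binary confusion matrix as in the sketch below. These are the standard definitions. The helper name `binary_metrics` and the counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    # F-measure is the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }

# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```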

Improve the Accuracy of Support Vector Machine Using Chi Square Statistic and Term Frequency Inverse Document Frequency on Movie Review Sentiment Analysis

Scientific Journal of Informatics ◽

10.15294/sji.v6i1.14244 ◽

2019 ◽

Vol 6 (1) ◽

pp. 138-149

Author(s):

Ukhti Ikhsani Larasati ◽

Much Aziz Muslim ◽

Riza Arifudin ◽

Alamsyah Alamsyah

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Text Mining ◽

Sentiment Analysis ◽

Feature Weighting ◽

Support Vector ◽

Chi Square ◽

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency

Text data can be processed with text mining techniques. Processing large volumes of text requires an automated method for exploring opinions, including whether they are positive or negative. Sentiment analysis is a process that applies text mining methods to determine whether the content of a text dataset is positive or negative. The support vector machine is one of the classification algorithms that can be used for sentiment analysis. However, the support vector machine works less well on large datasets. In addition, text mining is constrained by the number of attributes used: a large number of attributes reduces the performance of the classifier and results in low accuracy. The purpose of this research is to increase the accuracy of the support vector machine by applying feature selection and feature weighting. Feature selection removes a large number of irrelevant attributes. In this study, the top K = 500 features are selected. Feature weighting is then applied to calculate the weight of each selected attribute. The feature selection method is the chi-square statistic, and the feature weighting method is Term Frequency Inverse Document Frequency (TFIDF). In experiments using Matlab R2017b with 10-fold cross-validation, integrating the support vector machine with the chi-square statistic and TFIDF increases accuracy by 11.5 percentage points: the support vector machine achieves an accuracy of 68.7% without the chi-square statistic and TFIDF and 80.2% with them.
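The two preprocessing steps described above, chi-square feature selection followed by TF-IDF weighting, can be sketched in plain Python. The paper used Matlab with K = 500. The four toy reviews, K = 3, the 2x2 contingency form of the chi-square statistic, and the `tf * log(N / df)` variant of TF-IDF are all assumptions for this illustration, and the SVM training step is not shown.

```python
import math
from collections import Counter

# Hypothetical labelled reviews for illustration only.
docs = [
    ("great acting and a great story", "pos"),
    ("wonderful film with great cast", "pos"),
    ("boring plot and terrible acting", "neg"),
    ("terrible film with a boring story", "neg"),
]
tokenized = [(text.split(), label) for text, label in docs]
vocab = sorted({t for tokens, _ in tokenized for t in tokens})

def chi_square(term, positive="pos"):
    """Chi-square score of a term from its 2x2 term/class contingency table."""
    a = sum(1 for toks, y in tokenized if term in toks and y == positive)
    b = sum(1 for toks, y in tokenized if term in toks and y != positive)
    c = sum(1 for toks, y in tokenized if term not in toks and y == positive)
    d = sum(1 for toks, y in tokenized if term not in toks and y != positive)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

# Keep the K terms with the highest chi-square score (the paper uses K = 500).
K = 3
selected = sorted(vocab, key=chi_square, reverse=True)[:K]

def tf_idf(tokens):
    """TF-IDF weights of the selected terms for one tokenized document."""
    counts = Counter(tokens)
    n_docs = len(tokenized)
    weights = {}
    for term in selected:
        df = sum(1 for toks, _ in tokenized if term in toks)
        weights[term] = counts[term] * math.log(n_docs / df)
    return weights

# One weighted feature vector per document, ready for a classifier.
tfidf = [tf_idf(toks) for toks, _ in tokenized]
```

Terms that appear equally often in both classes, such as "acting" here, receive a chi-square score of zero and are dropped before weighting.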

Feature Selection by Ordered Rough Set Based Feature Weighting

Lecture Notes in Computer Science - Database and Expert Systems Applications ◽

10.1007/11546924_11 ◽

2005 ◽

pp. 105-112

Author(s):

Qasem A. Al-Radaideh ◽

Md Nasir Sulaiman ◽

Mohd Hasan Selamat ◽

Hamidah Ibrahim

Keyword(s):

Feature Selection ◽

Rough Set ◽

Feature Weighting
