Learning Target Class Feature Subspace (LTC-FS) Using Eigenspace Analysis and N-ary Search-Based Autonomous Hyperparameter Tuning for OCSVM

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001421510150 ◽

2021 ◽

Author(s):

Sanjay Kumar Sonbhadra ◽

Sonali Agarwal ◽

P. Nagabhushan

Keyword(s):

Principal Component ◽

Feature Space ◽

Support Vector ◽

Feature Subset ◽

Target Class ◽

Significant Information ◽

Feature Extraction Method ◽

Specificity And Sensitivity ◽

Feature Subspace ◽

Novel Target

Existing dimensionality reduction (DR) techniques such as principal component analysis (PCA) and its variants are not suitable for target class mining because they neglect the unique statistical properties of class-of-interest (CoI) samples. Conventionally, these approaches use the higher or lower eigenvalued principal components (PCs) for data transformation, but the higher eigenvalued PCs may split the target class, whereas the lower eigenvalued PCs do not contribute significant information, and a wrong selection of PCs degrades performance. Considering these facts, the present research offers a novel target class-guided feature extraction method. In this approach, eigendecomposition is first performed on the variance–covariance matrix of only the target class samples; the eigenvectors with the highest and lowest eigenvalues are rejected via statistical analysis, and the selected eigenvectors are used to extract the most promising feature subspace. The extracted feature subset gives a tighter description of the CoI, with enhanced associativity among target class samples, and ensures strong separation from nontarget class samples. A one-class support vector machine (OCSVM) is used to validate the performance of the learned features. To obtain optimized values of the OCSVM hyperparameters, a novel N-ary search-based autonomous method is also proposed. Exhaustive experiments with a wide variety of datasets are performed in feature space (original and reduced) and eigenspace (obtained from original and reduced features) to validate the performance of the proposed approach in terms of accuracy, precision, specificity and sensitivity.
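The abstract does not spell out the N-ary search procedure, so the following is only a minimal sketch of a generic N-ary interval search, not the authors' algorithm. It assumes a single hyperparameter, a score that is unimodal on the search interval, and a hypothetical caller-supplied `score` function (for example, validation accuracy of an OCSVM trained with the candidate value). The names `n_ary_search`, `score`, `low`, `high`, `n` and `tol` are all illustrative.

```python
from typing import Callable


def n_ary_search(score: Callable[[float], float], low: float, high: float,
                 n: int = 5, tol: float = 1e-3, max_iter: int = 100) -> float:
    """Return a value in [low, high] that approximately maximizes `score`.

    Assumption: `score` is unimodal on [low, high]. Each round splits the
    interval into n equal cells, evaluates `score` at the n + 1 grid
    points, and keeps only the cells adjacent to the best grid point.
    """
    if n < 3:
        # With n = 2 the two cells around the midpoint are the whole
        # interval, so the search would never shrink.
        raise ValueError("n must be at least 3")
    for _ in range(max_iter):
        if high - low <= tol:
            break
        step = (high - low) / n
        points = [low + i * step for i in range(n + 1)]
        best = max(range(n + 1), key=lambda i: score(points[i]))
        # Clamp at the ends so the new interval stays inside the old one.
        low = points[max(best - 1, 0)]
        high = points[min(best + 1, n)]
    return (low + high) / 2


# Toy stand-in for a validation score; a real run would train and
# evaluate a one-class model for each candidate value instead.
candidate = n_ary_search(lambda x: -(x - 0.3) ** 2, 0.0, 1.0)
```

Each round costs n + 1 score evaluations and shrinks the interval to at most 2/n of its width, so a larger n trades more evaluations per round for fewer rounds.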


A Method Based on Support Vector Machine for Feature Selection of Latent Semantic Features

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.181-182.830 ◽

2011 ◽

Vol 181-182 ◽

pp. 830-835

Author(s):

Min Song Li

Keyword(s):

Support Vector Machine ◽

Text Categorization ◽

Latent Semantic Indexing ◽

Classification Performance ◽

Compact Representation ◽

Support Vector ◽

Semantic Features ◽

Semantic Indexing ◽

Feature Extraction Method ◽

Feature Subspace

Latent Semantic Indexing (LSI) is an effective feature extraction method that can capture the underlying latent semantic structure between words in documents. However, using it to select a feature subspace is probably not the most appropriate choice for text categorization, since the method orders extracted features according to their variance, not their classification power. We propose a method based on support vector machines to extract features and select the latent semantic features that are suited to classification. Experimental results indicate that the method improves classification performance with a more compact representation.


An Expert System Based on Fisher Score and LS-SVM for Cardiac Arrhythmia Diagnosis

Computational and Mathematical Methods in Medicine ◽

10.1155/2013/849674 ◽

2013 ◽

Vol 2013 ◽

pp. 1-6 ◽

Cited By ~ 19

Author(s):

Ersen Yılmaz

Keyword(s):

Expert System ◽

Cardiac Arrhythmia ◽

Feature Space ◽

Support Vector ◽

Feature Subset ◽

Fisher Score ◽

Data Set ◽

Second Stage ◽

Vector Machines ◽

Two Stages

A two-stage expert system is proposed for cardiac arrhythmia diagnosis. In the first stage, the Fisher score is used for feature selection to reduce the feature space dimension of a data set. The second stage is the classification stage, in which a least squares support vector machine (LS-SVM) classifier uses the feature subset selected in the first stage to diagnose cardiac arrhythmia. The performance of the proposed expert system is evaluated on an arrhythmia data set taken from the UCI machine learning repository.
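As an illustration of the first stage, here is a minimal sketch of Fisher score ranking using one common definition: for each feature, the class-size-weighted squared distance of the class means from the overall mean, divided by the class-size-weighted within-class variance. The abstract does not state which variant the paper uses, so the formula and the helper names `fisher_scores` and `top_k_features` are assumptions.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature column of X (a list of rows)."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        overall_mean = fmean(row[j] for row in X)
        between = 0.0  # spread of class means around the overall mean
        within = 0.0   # spread of samples inside each class
        for rows in rows_by_class.values():
            values = [row[j] for row in rows]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly separating feature,
            # or a constant feature that carries no information.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k_features(X, y, k):
    """Return the indices of the k highest-scoring features."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
```

The selected column indices would then be used to build the reduced data set passed to the second-stage classifier.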


DETERMINATION OF OPTIMUM CLASSIFICATION SYSTEM FOR HYPERSPECTRAL IMAGERY AND LIDAR DATA BASED ON BEES ALGORITHM

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsarchives-xl-1-w5-651-2015 ◽

2015 ◽

Vol XL-1-W5 ◽

pp. 651-656

Author(s):

F. Samadzadega ◽

H. Hasani

Keyword(s):

Urban Area ◽

Hyperspectral Imagery ◽

Feature Space ◽

Classification Performance ◽

Feature Subset Selection ◽

Bees Algorithm ◽

Support Vector ◽

Svm Classifier ◽

Lidar Data ◽

Feature Subset

Hyperspectral imagery is a rich source of spectral information and plays a very important role in discriminating similar land-cover classes. In the past, several approaches have been investigated to improve hyperspectral imagery classification. Recently, interest in the joint use of LiDAR data and hyperspectral imagery has increased remarkably, because LiDAR provides structural information about the scene while hyperspectral imagery provides spectral and spatial information. The complementary information of LiDAR and hyperspectral data may greatly improve classification performance, especially in complex urban areas. In this paper, feature-level fusion of hyperspectral and LiDAR data is proposed: spectral and structural features are extracted from both datasets, and a hybrid feature space is then generated by feature stacking. A Support Vector Machine (SVM) classifier is applied to the hybrid feature space to classify the urban area. To optimize classification performance, two issues should be considered: determination of the SVM parameter values and feature subset selection. The Bees Algorithm (BA) is a powerful meta-heuristic optimization algorithm that is applied to determine the optimum SVM parameters and select the optimum feature subset simultaneously. The results show that the proposed method can improve classification accuracy while significantly reducing the dimension of the feature space.


Weed recognition by SVM texture feature classification in outdoor vegetable crops images

Ingeniería e Investigación ◽

10.15446/ing.investig.v37n1.54703 ◽

2017 ◽

Vol 37 (1) ◽

pp. 68 ◽

Cited By ~ 13

Author(s):

Camilo Pulido Rojas ◽

Leonardo Solaque Guzmán ◽

Nelson Velasco Toledo

Keyword(s):

Scale Parameter ◽

Texture Feature ◽

Principal Component ◽

Feature Space ◽

Support Vector ◽

Gray Level ◽

Vegetable Crops ◽

Nonlinear Case ◽

Classifier Performance ◽

Weed Recognition

This paper presents a classification system for weeds and vegetables in outdoor crop images. The classifier is a support vector machine (SVM) extended to the nonlinear case with a radial basis function (RBF) kernel, whose scale parameter σ is optimized to smooth the decision boundary. The feature space is obtained by principal component analysis (PCA) of 10 texture measurements calculated from gray level co-occurrence matrices (GLCM). The results indicate that classifier performance is above 90%, validated with specificity, sensitivity and precision calculations.
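To make the role of σ concrete, the sketch below evaluates an RBF kernel in the common parameterization k(x, y) = exp(-||x - y||² / (2σ²)). The abstract does not give the exact form used in the paper, so this parameterization and the function name `rbf_kernel` are assumptions.

```python
import math


def rbf_kernel(x, y, sigma):
    """RBF similarity between two equal-length feature vectors."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# The same pair of points looks more similar under a larger sigma,
# which is what makes the resulting decision boundary smoother.
narrow = rbf_kernel([0.0, 0.0], [3.0, 4.0], sigma=1.0)
wide = rbf_kernel([0.0, 0.0], [3.0, 4.0], sigma=5.0)
```

A small σ lets each training sample influence only its immediate neighbourhood, which can produce a jagged boundary; a large σ averages over more samples.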


An empirical analysis of machine learning models for automated essay grading

10.7287/peerj.preprints.3518 ◽

2018 ◽

Author(s):

Deva Surya Vivek Madala ◽

Ayushree Gangal ◽

Shreyash Krishna ◽

Anjali Goyal ◽

Ashish Sureka

Keyword(s):

Machine Learning ◽

Feature Subset Selection ◽

Support Vector ◽

Feature Subset ◽

Features Selection ◽

Target Class ◽

Automated Essay Scoring ◽

Essay Grading ◽

Feature Values ◽

Context Specific

Background. Automated Essay Scoring (AES) is an area at the intersection of computing and linguistics. AES systems conduct a linguistic analysis of a given essay or prose and then estimate the writing skill or essay quality in the form of a numeric score or a letter grade. AES systems help schools, universities and testing companies scale the task of grading a large number of essays efficiently and effectively. Methods. We propose an approach for automatically grading a given essay based on 9 surface level and deep linguistic features, 2 feature selection and ranking techniques and 4 text classification algorithms. We conduct a series of experiments on publicly available, manually graded and annotated essay data and demonstrate the effectiveness of our approach. We investigate the performance of two different feature selection techniques, (1) RELIEF and (2) Correlation-based Feature Subset Selection (CFS), with three different machine learning classifiers (kNN, SVM and Linear Regression). We also apply feature normalization and scaling. Results. Our results indicate that features like word count with respect to the word limit, appropriate use of vocabulary, relevance of the terms in the essay to the given topic and coherency between sentences and paragraphs are good predictors of essay score. Our analysis reveals that not all features are equally important and that a few features are more relevant and better correlated with the target class. We conduct experiments with k-nearest neighbour, logistic regression and support vector machine based classifiers. Our results on 4075 essays across multiple topics and grade score ranges are encouraging, with an accuracy of 73% to 93%. Discussion. Our experiments and approach are based on Grade 7 to Grade 10 essays and can be generalized to essays from other grades and levels after context specific customization. A few features are more relevant and important than others, and it is the interplay or combination of multiple feature values that determines the final score. We observe that different classifiers yield different accuracy.


Biometric authenticator algorithm based on multiresolution analysis

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v20.i3.pp1332-1341 ◽

2020 ◽

Vol 20 (3) ◽

pp. 1332

Author(s):

Soumia Kerrache ◽

Beladgham Mohammed ◽

Hamza Aymen ◽

Kadri Ibrahim

Keyword(s):

Feature Extraction ◽

Multiresolution Analysis ◽

Nearest Neighbor ◽

Curvelet Transform ◽

Principal Component ◽

Image Features ◽

Support Vector ◽

K Nearest Neighbor ◽

Feature Extraction Method ◽

Fusion Approach

Feature extraction is an essential process in biometric person identification because the effectiveness of the system depends on it. Multiresolution analysis can be used successfully in person identification and pattern recognition systems. In this paper, we present a feature extraction method for two-dimensional face and iris authentication. Our approach combines principal component analysis (PCA) and the curvelet transform as an improved fusion approach for feature extraction. The proposed fusion approach involves image denoising using the 2D curvelet transform to achieve compact representations of curve singularities. This is followed by the application of PCA as a fusion rule to improve the spatial resolution. The limitations of PCA used alone are poor recognition speed and a heavy computational load; to reduce these limitations, we apply the curvelet transform. To assess the performance of the presented method, we employed three classification techniques: neural networks (NN), K-Nearest Neighbor (KNN) and support vector machines (SVM). The results reveal that the extraction of image features is more efficient using Curvelet/PCA.


Analysis of Wheat Samples Using the Calculation of Multifractal Spectrum

Computer Tools in Education ◽

10.32603/2071-2340-2021-1-5-20 ◽

2021 ◽

pp. 5-20

Author(s):

Ivan Murenin ◽

Natalia Ampilova ◽

Keyword(s):

Random Forest ◽

Local Density ◽

Principal Component ◽

Feature Space ◽

Multifractal Spectrum ◽

Support Vector ◽

Clustering Methods ◽

Wheat Varieties ◽

Crystallization With Additives ◽

Classification

The computational analysis of wheat images to identify wheat varieties and quality has wide applications in agriculture and production. This paper presents an approach to the analysis and classification of images of wheat samples obtained by the method of crystallization with additives. In the tests, 3 concentrations and 4 times for each concentration were used, so that each type of wheat was characterized by 12 images. We used the images obtained for 5 classes. All the images have similar visual characteristics, which makes it difficult to use statistical methods of analysis. The multifractal spectrum obtained by calculating the local density function was used as a classifying feature. The classification was performed on a set of 60 wheat images corresponding to 5 different samples (classes) by various machine learning methods such as linear regression, naive Bayesian classifier, support vector machine, and random forest. In some cases, the method of principal components was applied to reduce the dimension of the feature space. To identify the relationships between wheat samples obtained at different concentrations, 3 different clustering methods were used. The classification results showed that using the multifractal spectrum as the classifying feature together with the random forest method in combination with principal component analysis allows identifying wheat samples obtained by crystallization with additives; the highest average classification accuracy is 74%.


Performance Evaluation of MadBoost on Face Detection

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.892.200 ◽

2019 ◽

Vol 892 ◽

pp. 200-209

Author(s):

Rayner Pailus ◽

Rayner Alfred

Keyword(s):

Feature Extraction ◽

Face Detection ◽

Recognition Performance ◽

Principal Component ◽

Haar Wavelet ◽

Extraction Methods ◽

Face Image ◽

Support Vector ◽

Feature Extraction Method ◽

Face Images

The Adaboost Viola-Jones method is a significant advance in face detection, mainly because it is fast, lightweight and one of the easiest face detection methods to use. Viola-Jones uses Haar wavelet filters to detect face images and achieves almost 80% face detection accuracy. This paper discusses a proposed methodology and algorithms that involve a larger library of filters, used to create more discriminating features among the images by processing the proposed 15 Haar rectangular features (an extension of the 4 Haar wavelet filters of Viola-Jones) and using them in a multiple adaptive ensemble process for detecting face images. After face detection, the process continues with normalization, applying feature extraction such as PCA combined with LDA or LPP to extract our weak learners' wavelets for more classification features. After feature extraction, feature selection is proposed to index the extracted data. The extracted vectors are used for training and creating MADBoost (Multiple Adaptive Diversified Boost), an improvement of Adaboost that uses multiple feature extraction methods combined with multiple classifiers and is able to capture, recognize and distinguish face images faster. MADBoost applies the ensemble approach with better weights for classification to produce better face recognition results. Three experiments were conducted to compare the performance of the proposed MADBoost with three other classifiers, Neural Network (NN), Support Vector Machines (SVM) and Adaboost, using Principal Component Analysis (PCA) as the feature extraction method. These experiments were tested against the obstacles of POIES (Pose, Obstruction, Illumination, Expression, Sizes). Based on the results obtained, MADBoost is found to improve recognition performance in terms of matching failures, incorrect matches and matching success percentages, with an acceptable time taken to perform the classification task.


Arrhythmia Classification Based on Multiple Features Fusion and Random Forest Using ECG

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2019.2798 ◽

2019 ◽

Vol 9 (8) ◽

pp. 1645-1654

Author(s):

Zhizhong Wang ◽

Hongyi Li ◽

Chuang Han ◽

Songwei Wang ◽

Li Shi

Keyword(s):

Random Forest ◽

Wavelet Packet ◽

Back Propagation ◽

Principal Component ◽

Support Vector ◽

Features Fusion ◽

Specificity And Sensitivity ◽

Average Accuracy ◽

Skewness Coefficient ◽

Novel Method

Cardiovascular diseases have become increasingly prominent in recent years and have proven to be a major threat to people's health. Accurate detection of arrhythmia in patients has important implications for clinical treatment. The aim of this study was to propose a novel automatic classification method for arrhythmia in order to improve classification accuracy. The electrocardiogram (ECG) signal was first denoised using a wavelet transform. Then, the local and global characteristics of the beat were extracted and fused: RR interval features in accordance with the clinical diagnosis criterion, morphology features based on wavelet packet decomposition, and statistical features including the kurtosis coefficient, skewness coefficient and variance. Meanwhile, the dimensionality of the wavelet packet coefficients was reduced via principal component analysis (PCA). Finally, these features were used as the input to train the random forest classifier, which was then compared with support vector machine (SVM) and back propagation (BP) neural network classifiers. Based on 100,647 beats from the MIT-BIH database, the proposed method achieved an average accuracy, specificity and sensitivity of 99.08%, 99.00% and 89.31%, respectively, using the intra-patient beats, and 92.31%, 89.98% and 37.47%, respectively, using the inter-patient beats. Moreover, two classification schemes, namely the inter-patient and intra-patient schemes, were validated. Compared with the other methods referred to in this paper, the novel method yielded better results.
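The reported metrics follow from confusion-matrix counts, and a short sketch shows how high accuracy can coexist with low sensitivity when the positive class is rare. The function name `binary_metrics` and the counts in the usage lines are illustrative assumptions; they are not taken from the paper, and multi-class arrhythmia results would typically be computed per class in a one-vs-rest fashion.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from raw counts."""
    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "precision": ratio(tp, tp + fp),
    }


# Made-up counts for an imbalanced problem: 80 positives, 920 negatives.
metrics = binary_metrics(tp=30, fp=20, tn=900, fn=50)
```

With these illustrative counts most samples are negatives that are classified correctly, so accuracy stays high even though fewer than half of the positives are detected.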


Somatic Cells Recognition by Application of Gabor Feature-Based (2D)²PCA

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001417570099 ◽

2017 ◽

Vol 31 (12) ◽

pp. 1757009 ◽

Cited By ~ 2

Author(s):

Xiaojing Gao ◽

Heru Xue ◽

Xin Pan ◽

Xinhua Jiang ◽

Yanqing Zhou ◽

...

Keyword(s):

Gabor Filter ◽

Somatic Cells ◽

Bovine Mastitis ◽

Principal Component ◽

Feature Space ◽

Support Vector ◽

Large Set ◽

Novel Approach ◽

Gabor Feature ◽

Feature Based

In this paper, we propose a novel Gabor feature-based approach using bi-directional two-dimensional principal component analysis ((2D)²PCA) for somatic cell recognition. First, Gabor features of different orientations and scales are extracted by convolution with a Gabor filter bank. Second, the dimensionality of the feature space is reduced by applying (2D)²PCA in both the row and column directions. Finally, a Support Vector Machine (SVM) classifier performs the recognition. The experimental results are obtained using a large set of images from different sources. The proposed method is not only efficient in accuracy and speed, but also robust to illumination in bovine mastitis via optical microscopy.
