Aspect term extraction for sentiment analysis in large movie reviews using Gini Index feature selection method and SVM classifier

In microarray data, high classification accuracy is difficult to achieve because of high dimensionality and irrelevant, noisy data. Such data also contain many gene expression values but few samples. To increase the classification accuracy and the processing speed of the model, an optimal number of features needs to be extracted, which can be achieved by applying a feature selection method. In this paper, we propose a hybrid ensemble feature selection method. The proposed method has two phases, a filter phase and a wrapper phase. In the filter phase, an ensemble technique aggregates the feature ranks of the Relief, minimum Redundancy Maximum Relevance (mRMR), and Feature Correlation (FC) filter feature selection methods. This paper uses Fuzzy Gaussian membership function ordering to aggregate the ranks. In the wrapper phase, Improved Binary Particle Swarm Optimization (IBPSO) is used to select the optimal features, and an RBF kernel-based Support Vector Machine (SVM) classifier is used as the evaluator. The performance of the proposed model is compared with state-of-the-art feature selection methods on five benchmark datasets. Various performance metrics such as Accuracy, Recall, Precision, and F1-Score are used for evaluation. The experimental results show that the proposed method outperforms the other feature selection methods.
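The filter-phase rank aggregation can be illustrated with a minimal sketch. It assumes each filter returns one relevance score per feature (higher is better) and, as a deliberately simpler stand-in for the Fuzzy Gaussian membership function ordering named in the abstract, averages the per-filter ranks. The gene names, scores, and function names are hypothetical.

```python
from statistics import mean


def ranks_from_scores(scores):
    """Map feature -> rank (1 = best) from feature -> score, higher score is better."""
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: position for position, feature in enumerate(ordered, start=1)}


def aggregate_ranks(score_tables):
    """Order features by their mean rank across several filter methods.

    Mean-rank aggregation is an assumption made for illustration; it is not
    the fuzzy ordering used in the paper.
    """
    per_filter = [ranks_from_scores(table) for table in score_tables]
    features = list(per_filter[0])
    mean_rank = {f: mean(ranks[f] for ranks in per_filter) for f in features}
    return sorted(features, key=lambda f: (mean_rank[f], f))


# Hypothetical scores from three filters for four genes.
relief = {"g1": 0.9, "g2": 0.2, "g3": 0.5, "g4": 0.1}
mrmr = {"g1": 0.7, "g2": 0.1, "g3": 0.8, "g4": 0.3}
fc = {"g1": 0.6, "g2": 0.4, "g3": 0.5, "g4": 0.2}

aggregated = aggregate_ranks([relief, mrmr, fc])
print(aggregated)
```

In a two-phase scheme like the one described, the top of this aggregated list would be the candidate pool handed to the wrapper search.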


Feature Selection Method Based on Mutual Information and Support Vector Machine

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s021800142150021x ◽

2021 ◽

pp. 2150021

Author(s):

Gang Liu ◽

Chunlei Yang ◽

Sen Liu ◽

Chunbao Xiao ◽

Bin Song

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Mutual Information ◽

Classification Accuracy ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Standard Data ◽

Feature Dimension

A feature selection method based on mutual information and support vector machine (SVM) is proposed in order to eliminate redundant features and improve classification accuracy. First, the local correlation between features and the overall correlation are calculated using mutual information. The correlation reflects the information inclusion relationship between features, so the features are evaluated and redundant features are eliminated by analyzing the correlation. Subsequently, the concept of mean impact value (MIV) is defined, and the degree of influence of the input variables on the output variables of the SVM network is calculated based on MIV. The importance weights of the features described by MIV are sorted in descending order. Finally, the SVM classifier is used to implement feature selection according to the classification accuracy of feature combinations, taking the MIV order of the features as a reference. Simulation experiments are carried out on three standard UCI data sets, and the results show that this method can not only effectively reduce the feature dimension and achieve high classification accuracy, but also ensure good robustness.
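As a rough illustration of the mutual information measure this abstract relies on, the sketch below computes it from empirical frequencies. It assumes discrete feature values and labels; the function name and toy vectors are illustrative and do not come from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), summed over observed pairs
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]   # identical to the labels
unrelated = [0, 1, 0, 1]     # statistically independent of the labels

mi_informative = mutual_information(informative, labels)
mi_unrelated = mutual_information(unrelated, labels)
print(mi_informative, mi_unrelated)
```

The same function applied to two feature columns gives a feature-to-feature score, which is the kind of quantity a redundancy check would compare.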


NICFS: A novel feature selection method applied to lexicon based sentiment analysis

Intelligent Decision Technologies ◽

10.3233/idt-190361 ◽

2019 ◽

Vol 13 (1) ◽

pp. 41-48

Author(s):

Poornima Mehta ◽

Satish Chandra

Keyword(s):

Feature Selection ◽

Sentiment Analysis ◽

Feature Selection Method ◽

Selection Method


Improved SVM-RFE feature selection method for multi-SVM classifier

2011 International Conference on Electrical and Control Engineering ◽

10.1109/iceceng.2011.6058060 ◽

2011 ◽

Cited By ~ 2

Author(s):

Jianchen Wang ◽

Ganlin Shan ◽

Xiusheng Duan ◽

Bo Wen

Keyword(s):

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Svm Classifier


QER: a new feature selection method for sentiment analysis

Human-centric Computing and Information Sciences ◽

10.1186/s13673-018-0135-8 ◽

2018 ◽

Vol 8 (1) ◽

Cited By ~ 6

Author(s):

Tuba Parlar ◽

Selma Ayşe Özel ◽

Fei Song

Keyword(s):

Feature Selection ◽

Sentiment Analysis ◽

Feature Selection Method ◽

Selection Method ◽

New Feature


Classification of Crop Residue Cover in High-Resolution RGB Images Using Machine Learning

Journal of the ASABE ◽

10.13031/ja.14572 ◽

2022 ◽

Vol 65 (1) ◽

pp. 75-86

Author(s):

Parth C. Upadhyay ◽

John A. Lory ◽

Guilherme N. DeSouza ◽

Timotius A. P. Lagaunne ◽

Christine M. Spinka

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Feature Selection Method ◽

Texture Features ◽

Ground Truth ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Aerial Vehicle ◽

Rgb Images

Highlights

A machine learning framework estimated residue cover in RGB images taken at three resolutions from 88 locations.

The best results primarily used texture features, the RFE-SVM feature selection method, and the SVM classifier.

Accounting for shadows and plants plus modifying and optimizing the texture features may improve performance.

An automated system developed using machine learning is a viable strategy to estimate residue cover from RGB images obtained with handheld or UAV platforms.

Abstract. Maintaining plant residue on the soil surface contributes to sustainable cultivation of arable land. Applying machine learning methods to RGB images of residue could overcome the subjectivity of manual methods. The objectives of this study were to use supervised machine learning while identifying the best feature selection method, the best classifier, and the most effective image feature types for classifying residue levels in RGB imagery. Imagery was collected from 88 locations in 40 row-crop fields in five Missouri counties between early May and late June in 2018 and 2019 using a tripod-mounted camera (0.014 cm pixel⁻¹ ground sampling distance, GSD) and an unmanned aerial vehicle (UAV, 0.05 and 0.14 GSD). At each field location, 50 contiguous 0.3 × 0.2 m region of interest (ROI) images were extracted from the imagery, resulting in a dataset of 4,400 ROI images at each GSD. Residue percentages for ground truth were estimated using a bullseye grid method (n = 100 points) based on the 0.014 GSD images. Representative color, texture, and shape features were extracted and evaluated using four feature selection methods and two classifiers. Recursive feature elimination using support vector machine (RFE-SVM) was the best feature selection method, and the SVM classifier performed best for classifying the amount of residue as a three-class problem. The best features for this application were associated with texture, with local binary pattern (LBP) features being the most prevalent for all three GSDs. Shape features were irrelevant. The three residue classes were correctly identified with 88%, 84%, and 81% 10-fold cross-validation scores for the 2018 training data and 81%, 69%, and 65% accuracy for the 2019 testing data in decreasing resolution order. Converting image-wise data (0.014 GSD) to location residue estimates using a Bayesian model showed good agreement with the location-based ground truth (r² = 0.90). This initial assessment documents the use of RGB images to match other methods of estimating residue, with potential to replace or be used as a quality control for line-transect assessments.

Keywords: Feature selection, Soil erosion, Support vector machine, Texture features, Unmanned aerial vehicle.
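Recursive feature elimination with an SVM, the selection method this abstract reports as best, can be sketched as follows. This is not the authors' pipeline: the small subgradient-descent linear SVM, its hyperparameters, and the toy data are assumptions chosen to keep the example dependency-free. Each round refits the model on the surviving features and drops the one with the smallest squared weight.

```python
def train_linear_svm(rows, labels, epochs=50, lr=0.1, reg=0.01):
    """Fit a linear SVM (labels in {-1, +1}) by full-batch subgradient descent on the hinge loss."""
    n = len(rows)
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [reg * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, y in zip(rows, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:  # this sample violates the margin
                for j, xj in enumerate(x):
                    grad_w[j] -= y * xj / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rfe_svm(rows, labels, n_keep):
    """Return the indices of the n_keep features that survive elimination."""
    remaining = list(range(len(rows[0])))
    while len(remaining) > n_keep:
        subset = [[row[j] for j in remaining] for row in rows]
        w, _ = train_linear_svm(subset, labels)
        # Ranking criterion: squared weight of each remaining feature.
        weakest = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        del remaining[weakest]
    return remaining


# Toy data (hypothetical): feature 0 separates the classes,
# feature 1 is constant, feature 2 is small label-independent noise.
rows = [[2.0, 0.0, 0.1], [2.0, 0.0, -0.1], [-2.0, 0.0, 0.1], [-2.0, 0.0, -0.1]]
labels = [1, 1, -1, -1]

selected = rfe_svm(rows, labels, n_keep=1)
print(selected)
```

Refitting after every removal is what distinguishes this from a one-shot ranking: the weights of the surviving features can change once a correlated feature is gone.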


Feature Selection Scheme Based on Multi-time-scales for Analyzing Congestive Heart Failure

10.21203/rs.3.rs-338866/v1 ◽

2021 ◽

Author(s):

Chunyuan Wang ◽

Yatao Zhang ◽

Xinge Jiang ◽

Feifei Liu ◽

Zhimin Zhang ◽

...

Keyword(s):

Heart Failure ◽

Congestive Heart Failure ◽

Feature Selection ◽

Time Scales ◽

Feature Selection Method ◽

Selection Method ◽

Multiple Time Scales ◽

Support Vector ◽

Svm Classifier ◽

Selection Scheme

Abstract. This paper proposes a feature selection method that combines multi-time-scale analysis with heart rate variability (HRV) analysis for early- and middle-stage diagnosis of congestive heart failure (CHF). In previous studies on the diagnosis of CHF, researchers have tended either to increase the variety of HRV features by searching for new ones or to use different machine learning algorithms to optimize the classification of CHF and normal sinus rhythm (NSR) subjects. In fact, making full use of traditional HRV features can also improve classification accuracy. The proposed method constructs a multi-time-scale feature matrix from traditional HRV features that are stable across multiple time scales yet differ between time scales. The multi-time-scale features yield better performance than the traditional single-time-scale features when fed into a support vector machine (SVM) classifier, and the SVM classifier achieves a sensitivity, specificity, and accuracy of 99.52%, 100.00%, and 99.83%, respectively. These results indicate that the proposed feature selection method can effectively reduce redundant features and computational load when used for automatic diagnosis of CHF.
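The three figures reported above follow from confusion-matrix counts. A minimal sketch, assuming binary labels with 1 standing for CHF and 0 for NSR; the function name and the toy predictions are illustrative and are not data from the paper.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, specificity, and accuracy for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "sensitivity": tp / (tp + fn),   # share of true positives detected
        "specificity": tn / (tn + fp),   # share of true negatives detected
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical predictions: one positive case is missed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]

metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```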


Modify Random Forest Algorithm Using Hybrid Feature Selection Method

International Journal on Perceptive and Cognitive Computing ◽

10.31436/ijpcc.v4i2.59 ◽

2018 ◽

Vol 4 (2) ◽

pp. 1-6

Author(s):

Ahmed T. Sadiq ◽

Karrar Shareef Musawi

Keyword(s):

Feature Selection ◽

Random Forest ◽

Gini Index ◽

Feature Selection Method ◽

Selection Method ◽

Random Selection ◽

Experimental Results ◽

Random Forest Algorithm ◽

Selection For

Random Forest (RF) is one of the most powerful decision-tree-based machine learning methods. The proposed hybrid feature selection for Random Forest depends on two measures, Information Gain and Gini Index, combined in varying percentages based on a weight. In this paper, we propose a modified Random Forest algorithm, named Random Forest algorithm using hybrid feature selection, that uses hybrid feature selection instead of a single feature selection measure. The main idea is to compute the Information Gain for all randomly selected features and then search for the best split point in the node, that is, the one that gives the best value for a hybrid equation with the Gini Index. The experimental results on the dataset showed that the proposed modification is better than the classic Random Forest: compared to the standard static Random Forest, the hybrid feature selection Random Forest shows a significant improvement in accuracy.
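The abstract does not give the hybrid equation, so the sketch below only illustrates the general idea under a stated assumption: the split score is a weighted sum of the Information Gain and the Gini impurity decrease, with a hypothetical `weight` parameter controlling the mix.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a label list."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(impurity, parent, left, right):
    """Impurity decrease obtained by splitting parent into left and right."""
    n = len(parent)
    weighted_children = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted_children


def hybrid_score(parent, left, right, weight=0.5):
    """Assumed hybrid: weight * Information Gain + (1 - weight) * Gini decrease."""
    info_gain = split_gain(entropy, parent, left, right)
    gini_gain = split_gain(gini, parent, left, right)
    return weight * info_gain + (1 - weight) * gini_gain


# A perfect split of a balanced two-class node.
parent = [0, 0, 1, 1]
score = hybrid_score(parent, left=[0, 0], right=[1, 1], weight=0.5)
print(score)
```

A tree builder would evaluate this score for each candidate split point among the randomly selected features and keep the highest-scoring one.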


Aspect-Based Sentiment Analysis of Arabic Tweets in the Education Sector Using a Hybrid Feature Selection Method

2020 14th International Conference on Innovations in Information Technology (IIT) ◽

10.1109/iit50501.2020.9299026 ◽

2020 ◽

Author(s):

Manar Alassaf ◽

Ali Mustafa Qamar

Keyword(s):

Feature Selection ◽

Sentiment Analysis ◽

Feature Selection Method ◽

Selection Method ◽

Education Sector


Sentiment Analysis of Chinese Micro Blog Using Machine Learning and an Improved Feature Selection Method

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.631-632.1219 ◽

2014 ◽

Vol 631-632 ◽

pp. 1219-1223

Author(s):

Jia Hao Chen ◽

Jian Hua Wu

Keyword(s):

Machine Learning ◽

Social Media ◽

Feature Selection ◽

Sentiment Analysis ◽

Rapid Development ◽

Feature Selection Method ◽

Selection Method ◽

Media Services ◽

Social Media Service ◽

Better Than

With the rapid development of the Internet and the emergence of social media services, many users are becoming creators of social information. However, ordinary manual work cannot deal with such a large number of subjective messages. As a new kind of social media service, the micro blog has been widely accepted and can be used for sentiment analysis. This paper compares the performance of three machine learning methods on sentiment analysis of Chinese micro blogs. We also propose an improved feature selection method that increases classification accuracy. Experimental results show that SVM performs close to Naïve Bayes and that both are better than logistic regression in most cases.
