A Novel Feature Selection Technique to Better Predict Climate Change Stage of Change

2021 ◽  
Vol 14 (1) ◽  
pp. 40
Author(s):  
Hamed Naseri ◽  
E. Owen D. Waygood ◽  
Bobin Wang ◽  
Zachary Patterson ◽  
Ricardo A. Daziano

Indications of people’s environmental concern are linked to transport decisions and can provide great support for policymaking on climate change. This study aims to better predict individual climate change stage of change (CC-SoC) based on different features of transport-related behavior, General Ecological Behavior, New Environmental Paradigm, and socio-demographic characteristics. Together these sources yield over 100 possible features that indicate someone’s level of environmental concern. Such a large number of features may create several analytical problems, such as overfitting, reduced accuracy, and high computational costs. To address these problems, a new feature selection technique, named the Coyote Optimization Algorithm-Quadratic Discriminant Analysis (COA-QDA), is proposed to find the optimal features to predict CC-SoC with the highest accuracy. Conventional feature selection methods (Lasso, Elastic Net, Random Forest Feature Selection, Extra Trees, and Principal Component Analysis Feature Selection) are employed for comparison with the COA-QDA. Afterward, eight classification techniques are applied to solve the prediction problem. Finally, a sensitivity analysis is performed to determine the most important features affecting the prediction of CC-SoC. The results indicate that COA-QDA outperforms the conventional feature selection methods, increasing average accuracy on the testing data by 0.7 to 5.6%. Logistic Regression achieves the highest prediction accuracy among the classifiers.
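The abstract describes a wrapper approach, in which candidate feature subsets are scored by the accuracy of a classifier trained on them. The sketch below illustrates only that general idea. It is not the authors' method: a greedy forward search stands in for the Coyote Optimization Algorithm, a nearest-centroid classifier stands in for Quadratic Discriminant Analysis, and all function names and the validation split are assumptions made for illustration.

```python
from statistics import mean


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val, features):
    """Validation accuracy of a nearest-centroid classifier using only `features`."""
    # One centroid per class, computed on the selected feature columns.
    centroids = {}
    for label in set(y_train):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [mean(r[f] for r in rows) for f in features]

    def predict(x):
        # Assign the class whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda label: sum(
                (x[f] - c) ** 2 for f, c in zip(features, centroids[label])
            ),
        )

    hits = sum(predict(x) == y for x, y in zip(X_val, y_val))
    return hits / len(y_val)


def forward_select(X_train, y_train, X_val, y_val):
    """Greedily add the feature that most improves validation accuracy.

    Stand-in for a metaheuristic search; stops when no feature helps.
    """
    remaining = set(range(len(X_train[0])))
    selected, best = [], 0.0
    while remaining:
        scored = [
            (nearest_centroid_accuracy(X_train, y_train, X_val, y_val, selected + [f]), f)
            for f in sorted(remaining)
        ]
        # Highest score wins; ties go to the lowest feature index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best
```

A metaheuristic such as the one named in the abstract would explore subsets more broadly than this greedy loop, but the fitness function (classifier accuracy on held-out data) plays the same role.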

Author(s):  
Norsyela Muhammad Noor Mathivanan ◽  
Nor Azura Md.Ghani ◽  
Roziah Mohd Janor

The curse of dimensionality and the empty space phenomenon have emerged as critical problems in text classification. One way of dealing with them is to apply a feature selection technique before fitting a classification model. This helps to reduce time complexity and sometimes increases classification accuracy. This study introduces a feature selection technique using K-Means clustering to overcome the weaknesses of traditional feature selection techniques such as principal component analysis (PCA), which requires a lot of time to transform all the input data. The proposed technique decides which features to retain based on the significance value of each feature in a cluster. The study found that K-Means clustering helps to increase the efficiency of the KNN model for a large data set, while the KNN model without feature selection is suitable for a small data set. A comparison between K-Means clustering and PCA as feature selection techniques shows that the proposed technique is better than PCA, especially in terms of computation time. Hence, K-Means clustering is found to be helpful in reducing data dimensionality with less time complexity than PCA, without affecting the accuracy of the KNN model for high-frequency data.


2016 ◽  
Vol 6 (4) ◽  
pp. 243-253 ◽  
Author(s):  
Christina Brester ◽  
Eugene Semenkin ◽  
Maxim Sidorov

If conventional feature selection methods do not show sufficient effectiveness, alternative algorithmic schemes might be used. In this paper we propose an evolutionary feature selection technique based on the two-criterion optimization model. To diminish the drawbacks of genetic algorithms, which are applied as optimizers, we design a parallel multicriteria heuristic procedure based on an island model. The performance of the proposed approach was investigated on the Speech-based Emotion Recognition Problem, which reflects one of the most essential points in the sphere of human-machine communications. A number of multilingual corpora (German, English and Japanese) were involved in the experiments. According to the results obtained, a high level of emotion recognition was achieved (up to a 12.97% relative improvement compared with the best F-score value on the full set of attributes).
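In a two-criterion formulation, candidate feature subsets are compared by Pareto dominance rather than by a single score. The sketch below shows that comparison on its own, without the genetic algorithm or island model. The two criteria used here (classification error and subset size, both minimized) are assumptions for illustration, since the abstract does not name the criteria.

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (all criteria minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates.

    `candidates` maps a subset name to its objective tuple,
    here assumed to be (error, number_of_features).
    """
    return sorted(
        name
        for name, objectives in candidates.items()
        if not any(
            dominates(other, objectives)
            for other_name, other in candidates.items()
            if other_name != name
        )
    )
```

For example, given subsets `{"A": (0.20, 10), "B": (0.25, 4), "C": (0.30, 12)}`, subset C is dominated by A, while A and B remain as trade-offs between error and size.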


An ad hoc network is a wireless connection established directly between computers, without a Wi-Fi access point or router. However, the ad hoc approach offers only marginal security and reduces the data transfer rate, which makes it easy for an attacker to connect to the network. A robust and reliable intrusion detection system (IDS) is therefore a necessity in today’s information security domain. An IDS plays a vital role in monitoring the threats encountered in a network by detecting changes in the normal profile caused by attacks. Recently, IDSs have been equipped with machine learning algorithms to attain better accuracy and faster detection. Most IDSs use a variety of network features, but an enormous number of features complicates detection and prevention. The IDS presented in this paper employs random forest and principal component analysis to minimize the number of features for a network IDS for wireless ad hoc networks. A one-class SVM is used to detect wormhole attacks with and without feature selection. The performance of these approaches is compared with various existing techniques in terms of false positive rate (FPR), accuracy, and detection rate. Accuracy improves and the false positive rate falls when intrusions are detected with feature selection. This paper discusses the performance of the one-class SVM classifier in the wireless ad hoc network IDS with random forest feature selection, with principal component analysis feature selection, and without feature selection in the detection of wormhole attacks. The one-class SVM IDS performs better in detecting wormhole attacks when it is implemented with principal component analysis feature selection.
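The three evaluation measures named above follow directly from the confusion matrix. A minimal sketch is given below, assuming labels of 1 for attack traffic and 0 for normal traffic; the function name and label encoding are illustrative and not taken from the paper.

```python
def ids_metrics(y_true, y_pred):
    """Return accuracy, detection rate and false positive rate.

    Assumes 1 = attack and 0 = normal in both label sequences.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal traffic passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks

    def ratio(num, den):
        # Guard against empty classes.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "detection_rate": ratio(tp, tp + fn),        # also called recall
        "false_positive_rate": ratio(fp, fp + tn),
    }
```

Detection rate and false positive rate are reported separately because an IDS can reach high accuracy on imbalanced traffic while still missing most attacks.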


2015 ◽  
Vol 72 (5) ◽  
Author(s):  
Tony Antonio ◽  
Adi Suryaputra Paramita

Feature selection plays an important role in internet traffic classification. It yields more accurate data and more accurate traffic classification, which in turn provides precise information for bandwidth optimization. An important consideration in feature selection is how to choose the features that deliver better and more precise classification results. This research compares feature selection algorithms where the internet traffic has the same correlation that could fit into the same class. An internet traffic dataset is collected, formatted, classified, and analyzed using a Naïve Bayes classifier. Previously, Correlation Feature Selection (CFS) was used to find the best subsets of the existing data, but without the discriminant and principal features of the dataset. We plan to use Principal Component Analysis to find discriminant and principal features for internet traffic classification. This paper also studies the process of fitting the features. The results show that internet traffic classification using Naïve Bayes with CFS achieves more than 90% accuracy, while classification accuracy reaches 75% with feature selection using Principal Component Analysis (PCA).
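Correlation-based feature selection is commonly described through a merit score that rewards subsets whose features correlate with the class but not with each other: merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where k is the subset size, r_cf the mean feature-class correlation, and r_ff the mean feature-feature correlation. The sketch below computes that score. It assumes absolute Pearson correlation as the correlation measure and numeric class labels, which may differ from the configuration used in the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def cfs_merit(columns, target):
    """CFS merit of a feature subset given as a list of feature columns."""
    k = len(columns)
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(c, target)) for c in columns) / k
    # Mean absolute feature-feature correlation (0 for a single feature).
    pairs = list(combinations(columns, 2))
    r_ff = sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)
```

Adding a feature that is uncorrelated with the class lowers the merit, and adding an exact duplicate of a selected feature does not raise it, which is how the score discourages irrelevant and redundant features.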

