Hyperspectral Classification of Plants: A Review of Waveband Selection Generalisability

2020 ◽  
Vol 12 (1) ◽  
pp. 113 ◽  
Author(s):  
Andrew Hennessy ◽  
Kenneth Clarke ◽  
Megan Lewis

Hyperspectral sensing, measuring reflectance over visible to shortwave infrared wavelengths, has enabled the classification and mapping of vegetation at a range of taxonomic scales, often down to the species level. Classification with hyperspectral measurements, acquired by narrow band spectroradiometers or imaging sensors, has generally required some form of spectral feature selection to reduce the dimensionality of the data to a level suitable for the construction of a classification model. Despite the large number of hyperspectral plant classification studies, an in-depth review of feature selection methods and resultant waveband selections has not yet been performed. Here, we present a review of the last 22 years of hyperspectral vegetation classification literature that evaluates the overall waveband selection frequency, waveband selection frequency variation by taxonomic, structural, or functional group, and the influence of feature selection choice by comparing such methods as stepwise discriminant analysis (SDA), support vector machines (SVM), and random forests (RF). This review determined that all characteristics of hyperspectral plant studies influence the wavebands selected for classification. This includes the taxonomic, structural, and functional groups of the target samples, the methods and scale at which hyperspectral measurements are recorded, as well as the feature selection method used. Furthermore, these influences do not appear to be consistent. Moreover, the considerable variability in waveband selection caused by the feature selectors effectively masks the analysis of any variability between studies related to plant groupings. Additionally, questions are raised about the suitability of SDA as a feature selection method, as it produces waveband selections at odds with the other feature selectors. Caution is recommended when choosing a feature selector for hyperspectral plant classification: we recommend that multiple methods be performed. The resultant sets of selected spectral features can either be evaluated individually by multiple classification models or combined as an ensemble for evaluation by a single classifier. Additionally, we suggest caution when relying upon waveband recommendations from the literature to guide waveband selections or classifications for new plant discrimination applications, as such recommendations appear to be weakly generalizable between studies.
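
The idea of running several feature selectors and combining their outputs as an ensemble can be sketched as a simple vote over the selected wavebands. This is only an illustration, not the review's procedure: the function name `combine_selections`, the `min_votes` threshold, and the example wavelengths are all hypothetical.

```python
from collections import Counter

def combine_selections(selections, min_votes=2):
    """Return wavebands chosen by at least `min_votes` selectors, sorted.

    `selections` maps a selector name to the wavebands (nm) it picked.
    The voting rule is an assumption made for illustration.
    """
    votes = Counter()
    for bands in selections.values():
        votes.update(set(bands))  # each selector votes at most once per band
    return sorted(band for band, n in votes.items() if n >= min_votes)

# Hypothetical outputs from three feature selectors (wavelengths in nm).
selections = {
    "SDA": [550, 680, 720, 1450],
    "SVM": [680, 720, 970, 2200],
    "RF": [550, 720, 970, 1650],
}
ensemble = combine_selections(selections, min_votes=2)
print(ensemble)
```

Raising `min_votes` keeps only the wavebands on which the selectors agree most strongly; the resulting set could then be passed to a single classifier.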

Symmetry ◽  
2020 ◽  
Vol 12 (2) ◽  
pp. 301 ◽  
Author(s):  
Jie Cao ◽  
Da Wang ◽  
Zhaoyang Qu ◽  
Hongyu Sun ◽  
Bin Li ◽  
...  

Network traffic classification based on machine learning is an important branch of pattern recognition in computer science. It is a key technology for dynamic intelligent network management and enhanced network controllability. However, traffic classification methods still face severe challenges: the optimal set of features is difficult to determine, and the classification method is highly dependent on an effective combination of features. Meanwhile, it is also important to balance the empirical risk and generalization ability of the classifier. In this paper, an improved network traffic classification model based on a support vector machine is proposed. First, a filter-wrapper hybrid feature selection method is proposed to solve the false deletion of combined features caused by traditional feature selection methods. Second, to balance the empirical risk and generalization ability of the support vector machine (SVM) traffic classification model, an improved parameter optimization algorithm is proposed. The algorithm can dynamically adjust the quadratic search area, reduce the density of quadratic mesh generation, improve the search efficiency of the algorithm, and prevent over-fitting while optimizing the parameters. The experiments show that the improved traffic classification model achieves higher classification accuracy, lower dimensionality, and shorter elapsed time, and performs significantly better than the traditional SVM and three other typical supervised ML algorithms.
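
The general idea of searching SVM parameters on a coarse mesh and then refining around the best point can be sketched as a two-stage grid search. This is a plain coarse-to-fine search, not the paper's improved algorithm; the helper names, the step sizes, and the toy objective standing in for cross-validated accuracy are all assumptions.

```python
def grid_search(score, c_values, g_values):
    """Return (best_score, best_c, best_g) over a rectangular grid."""
    best = None
    for c in c_values:
        for g in g_values:
            s = score(c, g)
            if best is None or s > best[0]:
                best = (s, c, g)
    return best

def frange(center, half_width, step):
    """Evenly spaced values in [center - half_width, center + half_width]."""
    n = int(round(2 * half_width / step))
    return [center - half_width + i * step for i in range(n + 1)]

def coarse_to_fine(score, lo=-8, hi=8, coarse_step=2.0, fine_step=0.25):
    """Two-stage search over log2(C) and log2(gamma)."""
    coarse = frange((lo + hi) / 2, (hi - lo) / 2, coarse_step)
    _, c0, g0 = grid_search(score, coarse, coarse)
    # Second pass: a finer mesh restricted to the neighbourhood of the best coarse point.
    fine_c = frange(c0, coarse_step, fine_step)
    fine_g = frange(g0, coarse_step, fine_step)
    return grid_search(score, fine_c, fine_g)

# Toy stand-in for cross-validated accuracy, peaked at log2(C)=3.25, log2(gamma)=-4.5.
def toy_score(log2_c, log2_g):
    return 1.0 - 0.01 * ((log2_c - 3.25) ** 2 + (log2_g + 4.5) ** 2)

best_score, best_c, best_g = coarse_to_fine(toy_score)
print(best_c, best_g)
```

Restricting the fine mesh to a small window is what saves evaluations compared with a uniformly dense grid; in practice `toy_score` would be replaced by a cross-validation routine.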


2017 ◽  
Vol 72 ◽  
pp. 314-326 ◽  
Author(s):  
Saúl Solorio-Fernández ◽  
José Fco. Martínez-Trinidad ◽  
J. Ariel Carrasco-Ochoa

Author(s):  
B. Venkatesh ◽  
J. Anuradha

In microarray data, high classification accuracy is difficult to achieve because of high dimensionality and irrelevant, noisy data; such datasets also contain many gene expression features but few samples. To increase the classification accuracy and the processing speed of the model, an optimal number of features needs to be extracted, which can be achieved by applying a feature selection method. In this paper, we propose a hybrid ensemble feature selection method. The proposed method has two phases: a filter phase and a wrapper phase. In the filter phase, an ensemble technique is used to aggregate the feature ranks of the Relief, minimum Redundancy Maximum Relevance (mRMR), and Feature Correlation (FC) filter feature selection methods. This paper uses fuzzy Gaussian membership function ordering to aggregate the ranks. In the wrapper phase, Improved Binary Particle Swarm Optimization (IBPSO) is used to select the optimal features, and an RBF kernel-based Support Vector Machine (SVM) classifier is used as the evaluator. The performance of the proposed model is compared with state-of-the-art feature selection methods using five benchmark datasets. For evaluation, performance metrics such as accuracy, recall, precision, and F1-score are used. The experimental results show that the proposed method outperforms the other feature selection methods.
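
The filter-phase step of aggregating ranks from several filter methods can be illustrated with a mean-rank aggregation. Note that this swaps in plain mean ranks for the fuzzy Gaussian membership ordering described in the abstract, whose details are not given here; the function name and the gene labels are hypothetical, and every feature is assumed to appear in every ranking.

```python
def aggregate_ranks(rankings):
    """Order features by mean rank across several filter rankings.

    `rankings` maps a filter name to a list of features, best first.
    Mean-rank aggregation is a stand-in chosen for illustration.
    """
    features = sorted(set().union(*rankings.values()))
    mean_rank = {}
    for f in features:
        ranks = [order.index(f) + 1 for order in rankings.values()]
        mean_rank[f] = sum(ranks) / len(ranks)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(features, key=lambda f: (mean_rank[f], f))

# Hypothetical rankings of four genes from three filter methods.
rankings = {
    "Relief": ["g3", "g1", "g4", "g2"],
    "mRMR": ["g1", "g3", "g2", "g4"],
    "FC": ["g3", "g4", "g1", "g2"],
}
order = aggregate_ranks(rankings)
print(order)
```

The top of the aggregated ordering would then be handed to the wrapper phase as the candidate pool.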


Author(s):  
Gang Liu ◽  
Chunlei Yang ◽  
Sen Liu ◽  
Chunbao Xiao ◽  
Bin Song

A feature selection method based on mutual information and support vector machine (SVM) is proposed in order to eliminate redundant features and improve classification accuracy. First, the local correlation between features and the overall correlation are calculated using mutual information. The correlation reflects the information inclusion relationship between features, so the features are evaluated and redundant features are eliminated by analyzing the correlation. Subsequently, the concept of mean impact value (MIV) is defined, and the degree of influence of the input variables on the output variables of the SVM network is calculated based on MIV. The importance weights of the features described by MIV are sorted in descending order. Finally, the SVM classifier is used to implement feature selection according to the classification accuracy of feature combinations, taking the MIV order of the features as a reference. Simulation experiments are carried out with three standard UCI data sets, and the results show that this method can not only effectively reduce the feature dimension and achieve high classification accuracy, but also ensure good robustness.
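
How mutual information exposes redundant features can be shown with a small calculation on discrete features. This is a minimal sketch of the standard definition of mutual information, not the paper's local and overall correlation measures; the function name and the toy feature vectors are assumptions.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Mutual information I(X; Y) in bits for two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )

# Toy binary features (hypothetical data).
f1 = [0, 0, 1, 1, 0, 0, 1, 1]
f2 = [0, 0, 1, 1, 0, 0, 1, 1]  # exact copy of f1, hence fully redundant
f3 = [0, 1, 0, 1, 0, 1, 0, 1]  # statistically independent of f1

mi_dup = mutual_information(f1, f2)
mi_ind = mutual_information(f1, f3)
print(mi_dup, mi_ind)
```

A feature whose mutual information with an already selected feature is close to that feature's entropy carries little new information and is a candidate for removal.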


2010 ◽  
Vol 44-47 ◽  
pp. 1130-1134
Author(s):  
Sheng Li ◽  
Pei Lin Zhang ◽  
Bing Li

Feature selection is a key step in hydraulic system fault diagnosis. Some of the collected features are unrelated to the classification model, and some are highly correlated with other features. These features are harmful to establishing the classification model. To solve this problem, genetic algorithm-partial least squares (GA-PLS) is proposed for selecting representative and optimal features. The K nearest neighbor algorithm (KNN) is used for diagnosing and classifying hydraulic system faults. To demonstrate the performance of GA-PLS, the original data of a model engineering hydraulic system are used, and the results of GA-PLS are compared with those obtained using all features and using GA. The experimental results show that the proposed feature selection method can diagnose and classify hydraulic system faults more efficiently while using fewer features.
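
The classification step, in which KNN assigns a fault class from the selected features, can be sketched in a few lines. This covers only a generic K nearest neighbor vote, not the GA-PLS selection; the feature values, the labels, and the function name are hypothetical.

```python
from collections import Counter
from math import dist

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbours = sorted(zip(train_x, train_y), key=lambda p: dist(p[0], query))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

# Hypothetical two-feature samples after feature selection.
train_x = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
           (0.90, 0.80), (0.80, 0.90), (0.85, 0.75)]
train_y = ["normal", "normal", "normal", "fault", "fault", "fault"]

pred_a = knn_predict(train_x, train_y, (0.20, 0.20))
pred_b = knn_predict(train_x, train_y, (0.80, 0.80))
print(pred_a, pred_b)
```

Because KNN relies on Euclidean distance, irrelevant or strongly correlated features distort the neighbourhoods, which is one reason feature selection matters for this classifier.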


Author(s):  
Jian-Wu Xu ◽  
Kenji Suzuki

One of the major challenges in current Computer-Aided Detection (CADe) of polyps in CT Colonography (CTC) is to improve the specificity without sacrificing the sensitivity. If a large number of False Positive (FP) detections of polyps are produced by the scheme, radiologists might lose their confidence in the use of CADe. In this chapter, the authors used a nonlinear regression model operating on image voxels and a nonlinear classification model with extracted image features based on Support Vector Machines (SVMs). They investigated the feasibility of a Support Vector Regression (SVR) in the massive-training framework, and the authors developed a Massive-Training SVR (MTSVR) in order to reduce the long training time associated with the Massive-Training Artificial Neural Network (MTANN) for reduction of FPs in CADe of polyps in CTC. In addition, the authors proposed a feature selection method directly coupled with an SVM classifier to maximize the CADe system performance. They compared the proposed feature selection method with the conventional stepwise feature selection based on Wilks’ lambda with a linear discriminant analysis classifier. The FP reduction system based on the proposed feature selection method was able to achieve a 96.0% by-polyp sensitivity with an FP rate of 4.1 per patient. The performance is better than that of the stepwise feature selection based on Wilks’ lambda (which yielded the same sensitivity with 18.0 FPs/patient). To test the performance of the proposed MTSVR, the authors compared it with the original MTANN in the distinction between actual polyps and various types of FPs in terms of the training time reduction and FP reduction performance. The authors’ CTC database consisted of 240 CTC datasets obtained from 120 patients in the supine and prone positions. With MTSVR, they reduced the training time by a factor of 190, while achieving a performance (by-polyp sensitivity of 94.7% with 2.5 FPs/patient) comparable to that of the original MTANN (which has the same sensitivity with 2.6 FPs/patient).
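
The two figures of merit quoted above, by-polyp sensitivity and FPs per patient, are simple ratios. The sketch below shows how they would be computed; the function names and the counts are hypothetical and are not taken from the study.

```python
def by_polyp_sensitivity(detected_polyps, total_polyps):
    """Fraction of true polyps that the scheme detects."""
    return detected_polyps / total_polyps

def fp_per_patient(false_positives, n_patients):
    """Average number of false-positive detections per patient."""
    return false_positives / n_patients

# Hypothetical counts, not taken from the study.
sensitivity = by_polyp_sensitivity(detected_polyps=19, total_polyps=20)
fp_rate = fp_per_patient(false_positives=50, n_patients=20)
print(f"{sensitivity:.1%} sensitivity at {fp_rate:.1f} FPs/patient")
```

Comparing two schemes at the same sensitivity, as the abstract does, isolates the difference in FP rate.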

