Classification of Chinese Herbal Medicine Based on Improved LDA Algorithm Using Machine Olfaction

2012 ◽  
Vol 239-240 ◽  
pp. 1532-1536 ◽  
Author(s):  
De Han Luo ◽  
Ya Wen Shao

Linear discriminant analysis (LDA) is a popular pattern recognition method in machine olfaction. However, the "small sample size" (SSS) problem occurs when LDA is used with the traditional Fisher criterion and the within-class scatter matrix is singular. In this paper, the maximum scatter difference (MSD) criterion is combined with LDA to solve the SSS problem, so that three kinds of Chinese herbal medicine from different growing areas can be accurately classified, and the classification result is improved at the same time. Only a few samples of Anhui Atractylodes are classified incorrectly, and the overall classification rate reaches 97.8%.
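The combination described here, an MSD criterion replacing the Fisher ratio so that the within-class scatter matrix never needs to be inverted, can be sketched roughly as follows. The balance constant c, the toy data, and the function name are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of LDA with a maximum scatter difference (MSD) criterion.
import numpy as np

def msd_lda(X, y, n_components=2, c=1.0):
    """Project X by maximizing w^T (Sb - c*Sw) w; no inversion of Sw is needed."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += len(Xc) * diff @ diff.T
    # Eigen-decomposition of the (symmetric) scatter difference matrix.
    evals, evecs = np.linalg.eigh(Sb - c * Sw)
    W = evecs[:, np.argsort(evals)[::-1][:n_components]]  # top eigenvectors
    return X @ W

# Toy usage: 3 "growing areas", few samples, high dimension (an SSS setting).
rng = np.random.default_rng(0)
X = rng.normal(size=(15, 64)) + np.repeat(np.arange(3), 5)[:, None]
y = np.repeat(np.arange(3), 5)
print(msd_lda(X, y).shape)  # (15, 2)
```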

2016 ◽  
Vol 2016 ◽  
pp. 1-10
Author(s):  
Zhicheng Lu ◽  
Zhizheng Liang

Linear discriminant analysis has been widely studied in data mining and pattern recognition. However, when the eigen-decomposition is performed on the matrix pair (within-class scatter matrix and between-class scatter matrix), degenerate eigenvalues may occur, making the information in the eigen-subspaces corresponding to those eigenvalues indistinguishable. To address this problem, we revisit linear discriminant analysis in this paper and propose a stable and effective algorithm based on an optimization criterion. By examining the properties of this criterion, we find that the eigenvectors in an eigen-subspace may be indistinguishable when a degenerate eigenvalue occurs. Inspired by the maximum margin criterion (MMC), we embed MMC into the eigen-subspace corresponding to the degenerate eigenvalue to recover the discriminability of its eigenvectors. Since the proposed algorithm can deal with the degenerate case of eigenvalues, it not only handles the small-sample-size problem but also allows projection vectors to be selected from the null space of the between-class scatter matrix. Extensive experiments on several face image and microarray data sets evaluate the proposed algorithm in terms of classification performance, and the results show that our method has smaller standard deviations than other methods in most cases.
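The core idea of re-ranking eigenvectors inside a degenerate eigen-subspace with MMC might look roughly like the sketch below. It assumes the within-class scatter matrix is invertible for the generalized eigenproblem, and the tolerance and grouping rule are illustrative choices, not the paper's exact scheme.

```python
# Rough sketch: group (numerically) equal eigenvalues of the pair (Sb, Sw),
# then order the eigenvectors inside each degenerate block by their MMC score.
import numpy as np
from scipy.linalg import eigh

def rank_with_mmc(Sb, Sw, tol=1e-8):
    evals, evecs = eigh(Sb, Sw)                    # generalized eigenpairs of (Sb, Sw)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    ranked = []
    i = 0
    while i < len(evals):
        # Collect an eigen-subspace of (numerically) equal eigenvalues.
        j = i
        while j + 1 < len(evals) and abs(evals[j + 1] - evals[i]) < tol:
            j += 1
        block = evecs[:, i:j + 1]
        # Inside the degenerate block, rank vectors by v^T (Sb - Sw) v.
        scores = np.einsum('ij,jk,ki->i', block.T, Sb - Sw, block)
        ranked.append(block[:, np.argsort(scores)[::-1]])
        i = j + 1
    return evals, np.hstack(ranked)
```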


Author(s):  
XIPENG QIU ◽  
LIDE WU

Linear Discriminant Analysis (LDA) is a popular feature extraction technique in statistical pattern recognition. However, it often suffers from the small sample size problem when dealing with high-dimensional data. Moreover, while LDA is guaranteed to find the best directions when each class has a Gaussian density with a common covariance matrix, it can fail if the class densities are more general. In this paper, a novel nonparametric linear feature extraction method, nearest neighbor discriminant analysis (NNDA), is proposed from the viewpoint of nearest neighbor classification. NNDA finds the important discriminant directions without assuming that the class densities belong to any particular parametric family, and it does not depend on the nonsingularity of the within-class scatter matrix either. We then give an approximate approach to optimizing NNDA and an extension to k-NN. We apply NNDA to simulated and real-world data; the results demonstrate that NNDA outperforms existing variants of LDA.
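A loose sketch of a nearest-neighbor flavoured discriminant direction is given below: scatter matrices are built from each sample's nearest same-class and nearest different-class neighbour, and their difference is maximised. The 1-NN rule, the scatter-difference objective, and all names are assumptions for illustration, not the exact NNDA formulation.

```python
import numpy as np

def nn_discriminant(X, y, n_components=1):
    n, d = X.shape
    Sw = np.zeros((d, d))   # scatter of intra-class nearest-neighbour differences
    Sb = np.zeros((d, d))   # scatter of extra-class nearest-neighbour differences
    dist = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    np.fill_diagonal(dist, np.inf)                 # exclude each sample itself
    for i in range(n):
        same = np.where(y == y[i])[0]
        same = same[same != i]
        other = np.where(y != y[i])[0]
        dw = (X[i] - X[same[np.argmin(dist[i, same])]]).reshape(-1, 1)
        db = (X[i] - X[other[np.argmin(dist[i, other])]]).reshape(-1, 1)
        Sw += dw @ dw.T
        Sb += db @ db.T
    evals, evecs = np.linalg.eigh(Sb - Sw)         # no inverse of Sw required
    return evecs[:, np.argsort(evals)[::-1][:n_components]]
```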


Author(s):  
WEN-SHENG CHEN ◽  
PONG C. YUEN ◽  
JIAN HUANG

This paper presents a new regularization technique to deal with the small sample size (S3) problem in linear discriminant analysis (LDA) based face recognition. Regularization of the within-class scatter matrix Sw has been shown to be a good direction for solving the S3 problem because the solution is found in the full space instead of a subspace. The main limitation of regularization is that determining the optimal parameters is computationally very expensive. In view of this limitation, this paper re-defines the three-parameter regularization of the within-class scatter matrix in a form suitable for parameter reduction. Based on this new definition, we derive an explicit expression that determines all three parameters from a single parameter t, and thus develop a one-parameter regularization of the within-class scatter matrix. A simple and efficient method is developed to determine the value of t. It is also proven that the regularized within-class scatter matrix approaches the original within-class scatter matrix Sw as the single parameter tends to zero. A novel one-parameter regularized linear discriminant analysis (1PRLDA) algorithm is then developed. The proposed 1PRLDA method for face recognition has been evaluated on two publicly available databases, namely the ORL and FERET databases. The average recognition accuracies over 50 runs for the ORL and FERET databases are 96.65% and 94.00%, respectively. Compared with existing LDA-based methods for solving the S3 problem, the proposed 1PRLDA method gives the best performance.
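The abstract does not reproduce the paper's specific three-to-one parameter reduction formula, so the sketch below uses a generic ridge-style surrogate Sw(t) = Sw + t*I instead; it only shares the stated limiting behaviour (Sw(t) returns to Sw as t tends to zero) and should not be read as the 1PRLDA regularizer itself.

```python
# Generic full-space regularized-LDA sketch (surrogate for the paper's formula).
import numpy as np

def regularized_lda(X, y, t=1e-3, n_components=None):
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += len(Xc) * diff @ diff.T
    Sw_t = Sw + t * np.eye(d)                  # never singular for t > 0
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw_t, Sb))
    order = np.argsort(evals.real)[::-1]
    k = n_components or (len(np.unique(y)) - 1)
    return evecs.real[:, order[:k]]
```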


Author(s):  
WEN-SHENG CHEN ◽  
JIAN HUANG ◽  
JIN ZOU ◽  
BIN FANG

Linear Discriminant Analysis (LDA) is a popular statistical method for both feature extraction and dimensionality reduction in face recognition. The major drawback of LDA is the so-called small sample size (3S) problem, which occurs whenever the total number of training samples is smaller than the dimension of the feature space. In this situation, the within-class scatter matrix Sw becomes singular and the LDA approach cannot be applied directly. To overcome the 3S problem, this paper proposes a novel wavelet-face based subspace LDA algorithm. Wavelet-face feature extraction and dimensionality reduction are based on a two-level D4-filter wavelet transform and on discarding the null space of the total scatter matrix St. It is shown that the resulting projection matrix satisfies the uncorrelated constraint conditions; hence, in the sense of statistical uncorrelation, this projection matrix is optimal. The proposed method for face recognition has been evaluated on two publicly available databases, namely the ORL and FERET databases. Compared with existing LDA-based methods for solving the 3S problem, our method gives the best performance.
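A sketch of the two-stage pipeline described above follows: a two-level wavelet decomposition to obtain a small "wavelet-face", followed by discarding the null space of the total scatter matrix St before any LDA step. PyWavelets' 'db2' (a 4-tap Daubechies filter) is used here as a stand-in for the D4 filter, and the paper's uncorrelated-constraint construction is not reproduced; both are assumptions of this sketch.

```python
import numpy as np
import pywt

def wavelet_face(img, level=2):
    # Two-level 2D wavelet decomposition; keep only the low-frequency subimage.
    coeffs = pywt.wavedec2(img, 'db2', level=level)
    return coeffs[0].ravel()

def range_space_of_St(X, tol=1e-10):
    # Basis of the range of St (i.e. drop its null space), via SVD of centered data.
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[s > tol].T

# faces: (n_images, h, w) array; labels: (n_images,)
# X = np.array([wavelet_face(f) for f in faces])
# P = range_space_of_St(X)          # project out the null space of St
# X_reduced = X @ P                 # then apply an LDA variant on X_reduced
```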


2011 ◽  
Vol 317-319 ◽  
pp. 150-153
Author(s):  
Wan Li Feng ◽  
Shang Bing Gao

In this paper, a reformative scatter difference discriminant criterion (SDDC) based on fuzzy set theory is studied. Using the difference between the between-class and within-class scatter as the discriminant criterion effectively overcomes the singularity of the within-class scatter matrix caused by the small sample size problem in classical Fisher discriminant analysis. However, the conventional SDDC assumes that every sample is equally relevant to its class. A fuzzy maximum scatter difference analysis (FMSDA) algorithm is therefore proposed, in which the fuzzy k-nearest neighbor (FKNN) rule is used to obtain the distribution information of the original samples; this information is then used to redefine the corresponding scatter matrices, which differ from those of the conventional SDDC and are effective for extracting discriminative features from overlapping (outlier) samples. Experiments conducted on the FERET face database demonstrate the effectiveness of the proposed method.
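The FKNN step can be illustrated as below: soft class memberships are computed from each training sample's k nearest neighbours (following the common Keller-style rule), and a fuzzy scatter-difference criterion can then use them to weight class means and scatter matrices. The value of k and the 0.51/0.49 constants follow the usual FKNN convention and are assumptions here, not values quoted from the paper.

```python
import numpy as np

def fknn_memberships(X, y, k=3):
    classes = np.unique(y)
    n = len(X)
    dist = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    np.fill_diagonal(dist, np.inf)                 # a sample is not its own neighbour
    U = np.zeros((n, len(classes)))
    for i in range(n):
        nn = np.argsort(dist[i])[:k]               # indices of the k nearest neighbours
        for j, c in enumerate(classes):
            frac = np.mean(y[nn] == c)             # fraction of neighbours in class c
            U[i, j] = 0.51 + 0.49 * frac if y[i] == c else 0.49 * frac
    return U   # U[i, j]: degree to which sample i belongs to class j

# Fuzzy class mean weighted by membership (one ingredient of FMSDA-style scatters):
# mu_c = (U[:, j:j+1] * X).sum(axis=0) / U[:, j].sum()
```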


Author(s):  
WEIXIANG LIU ◽  
KEHONG YUAN ◽  
JIAN WU ◽  
DATIAN YE ◽  
ZHEN JI ◽  
...  

Classification of gene expression samples is a core task in microarray data analysis. How to reduce thousands of genes and how to select a suitable classifier are two key issues in gene expression data classification. This paper introduces a framework that combines feature extraction and classification. Considering the non-negativity, high dimensionality, and small sample size of the data, we apply a discriminative mixture model designed for non-negative gene expression data, with non-negative matrix factorization (NMF) used for dimension reduction. To enhance the sparseness of the training data and speed up learning of the mixture model, a generalized NMF is also adopted. Experimental results on several real gene expression datasets show that the generalized method significantly improves classification accuracy, stability, and decision quality, and that the proposed method outperforms previously reported results on the same datasets.
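A minimal sketch of the overall recipe is shown below: NMF reduces the non-negative expression matrix to a low-dimensional encoding, and a classifier is trained on the encodings. scikit-learn's plain NMF and a k-NN classifier stand in for the paper's generalized (sparse) NMF and discriminative mixture model; the rank and neighbour count are illustrative.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.neighbors import KNeighborsClassifier

def nmf_classify(X_train, y_train, X_test, rank=10):
    # Factorize the non-negative training matrix and encode both splits.
    nmf = NMF(n_components=rank, init='nndsvda', max_iter=500)
    H_train = nmf.fit_transform(X_train)       # low-dimensional encodings
    H_test = nmf.transform(X_test)
    # Any classifier can be trained on the encodings; k-NN is used here.
    clf = KNeighborsClassifier(n_neighbors=3).fit(H_train, y_train)
    return clf.predict(H_test)
```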

