A Confidence-Based Hierarchical Feature Clustering Algorithm for Text Classification

A Fuzzy Self-Constructing Feature Clustering Algorithm for Text Classification

IOSR Journal of Engineering ◽

10.9790/3021-02943644 ◽

2012 ◽

Vol 02 (09) ◽

pp. 36-44

Author(s):

A. Kavitha

Keyword(s):

Text Classification ◽

Clustering Algorithm ◽

Feature Clustering

A Fuzzy Self-Constructing Feature Clustering Algorithm for Text Classification

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2010.122 ◽

2011 ◽

Vol 23 (3) ◽

pp. 335-349 ◽

Cited By ~ 88

Author(s):

Jung-Yi Jiang ◽

Ren-Jia Liou ◽

Shie-Jue Lee

Keyword(s):

Text Classification ◽

Clustering Algorithm ◽

Feature Clustering

An Improved Weighted-Feature Clustering Algorithm for K-anonymity

2009 Fifth International Conference on Information Assurance and Security ◽

10.1109/ias.2009.311 ◽

2009 ◽

Cited By ~ 2

Author(s):

Lijian Lu ◽

Xiaojun Ye

Keyword(s):

Clustering Algorithm ◽

Feature Clustering

An enhanced fuzzy similarity based concept mining model for text classification using feature clustering

2012 Students Conference on Engineering and Systems ◽

10.1109/sces.2012.6199126 ◽

2012 ◽

Cited By ~ 6

Author(s):

Shalini Puri ◽

Sona Kaushik

Keyword(s):

Text Classification ◽

Feature Clustering ◽

Fuzzy Similarity ◽

Concept Mining ◽

Mining Model

Salient Object Detection Based on Background Feature Clustering

Advances in Multimedia ◽

10.1155/2017/4183986 ◽

2017 ◽

Vol 2017 ◽

pp. 1-9 ◽

Cited By ~ 3

Author(s):

Kan Huang ◽

Yong Zhang ◽

Bo Lv ◽

Yongbiao Shi

Keyword(s):

Object Detection ◽

Clustering Algorithm ◽

Geodesic Distance ◽

Saliency Map ◽

Salient Object Detection ◽

Salient Object ◽

Edge Preserving ◽

Feature Clustering ◽

Background Distribution ◽

Extensive Evaluation

Automatic estimation of salient objects without any prior knowledge can greatly benefit many computer vision tasks. This paper proposes a novel bottom-up framework for salient object detection that first models the background and then separates salient objects from it. We model the background distribution with a feature clustering algorithm, which allows the statistical and structural information of the background to be fully exploited. A coarse saliency map is then generated according to the background distribution. To be more discriminative, the coarse saliency map is enhanced by a two-step refinement composed of edge-preserving element-level filtering and upsampling based on geodesic distance. We provide an extensive evaluation and show that the proposed method performs favorably against other outstanding methods on two of the most commonly used datasets. Most importantly, the proposed approach is shown to highlight the salient object more uniformly and to be robust to background noise.

Method of Feature Reduction in Short Text Classification Based on Feature Clustering

Applied Sciences ◽

10.3390/app9081578 ◽

2019 ◽

Vol 9 (8) ◽

pp. 1578 ◽

Cited By ~ 2

Author(s):

Li ◽

Yin ◽

Shi ◽

Mao ◽

Shi

Keyword(s):

Text Classification ◽

Spectral Clustering ◽

Reduction Method ◽

Feature Reduction ◽

Vector Spaces ◽

Feature Clustering ◽

Short Text ◽

Cluster Feature ◽

Original Feature ◽

Traversal Algorithm

A key problem in short text classification is the severe curse of dimensionality (the "dimensional disaster") that arises when a statistics-based approach is used to construct vector spaces. Here, a feature reduction method based on two-stage feature clustering (TSFC) is proposed and applied to short text classification. Features are first semi-loosely clustered by combining spectral clustering with a graph traversal algorithm. Next, intra-cluster feature screening rules are designed to remove outlier feature words, which improves the effectiveness of the similar-feature clusters. Short texts are then classified using the corresponding similar-feature clusters instead of the original feature words; because clusters replace individual words, the dimension of the vector space is significantly reduced. Several classifiers are used to evaluate the effectiveness of the method. The results show that the method largely resolves the dimensional disaster and can significantly improve the accuracy of short text classification.
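The final step described in this abstract, replacing feature words with their clusters so that the vector space shrinks, can be illustrated with a minimal Python sketch. This is not the authors' TSFC implementation: the word-to-cluster assignment below is a hand-written, hypothetical stand-in for the output of the two-stage clustering, and the names `WORD_TO_CLUSTER`, `cluster_features`, and `to_vector` are assumptions introduced only for illustration.

```python
from collections import Counter

# Hypothetical word-to-cluster assignment. In the paper this mapping would be
# produced by spectral clustering plus graph traversal and then screened for
# outliers; here it is simply written by hand.
WORD_TO_CLUSTER = {
    "cheap": "c_price", "discount": "c_price", "sale": "c_price",
    "goal": "c_sport", "match": "c_sport", "score": "c_sport",
}

# Fixed, sorted order of cluster ids defines the reduced vector space.
CLUSTER_IDS = sorted(set(WORD_TO_CLUSTER.values()))


def cluster_features(tokens, word_to_cluster):
    """Count cluster occurrences; words with no cluster are dropped."""
    return Counter(word_to_cluster[t] for t in tokens if t in word_to_cluster)


def to_vector(counts, cluster_ids):
    """Turn cluster counts into a dense vector with one slot per cluster."""
    return [counts.get(c, 0) for c in cluster_ids]


tokens = "big discount sale on match day".split()
vector = to_vector(cluster_features(tokens, WORD_TO_CLUSTER), CLUSTER_IDS)
```

In this toy setting six feature words collapse into two cluster dimensions, so the example text is represented by a two-element vector rather than one slot per word.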

A Methodology for Text Classification based on Feature Clustering

International Conference on Automatic Control and Artificial Intelligence (ACAI 2012) ◽

10.1049/cp.2012.0935 ◽

2012 ◽

Author(s):

Yang Song ◽

Lisha Hou

Keyword(s):

Text Classification ◽

Feature Clustering

Distributed Facial Feature Clustering Algorithm Based on Spatiotemporal Locality

Innovative Mobile and Internet Services in Ubiquitous Computing - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-030-50399-4_38 ◽

2020 ◽

pp. 394-403

Author(s):

Qiutong Lin ◽

Bihua Zhuo ◽

Lili Jiao ◽

Li Liao ◽

Jiangtao Guo

Keyword(s):

Clustering Algorithm ◽

Facial Feature ◽

Feature Clustering

Feature Clustering and Ensemble Learning Based Approach for Software Defect Prediction

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999201109201259 ◽

2020 ◽

Vol 13 ◽

Author(s):

R. Srivastava ◽

Aman Kumar Jain

Keyword(s):

Clustering Algorithm ◽

Class Imbalance ◽

Software Defect Prediction ◽

Class Imbalance Problem ◽

Feature Clustering ◽

Significance Level ◽

Software Products ◽

Imbalance Problem ◽

Software Defect ◽

Software Modules

Objective: Defects in delivered software products not only have financial implications but also damage the reputation of the organisation and waste time and human resources. This paper aims to detect defects in software modules. Methods: Our approach sequentially combines the SMOTE algorithm to deal with the class imbalance problem, the K-means clustering algorithm to obtain a set of key features based on inter-class and intra-class coefficients of correlation, and ensemble modelling to predict defects in software modules. After careful examination, an ensemble framework of XGBoost, Decision Tree and Random Forest is used for prediction of software defects, owing to the numerous merits of the ensembling approach. Results: We used five open-source datasets from the NASA Promise Repository for Software Engineering. The results obtained with our approach were compared with those of the individual algorithms used in the ensemble. Confidence intervals for the performance of our approach on the evaluation metrics Accuracy, Precision, Recall, F1 score and AUC score were also constructed at a significance level of 0.01. Conclusion: The results are presented graphically.

A semi-supervised feature clustering algorithm with application to word sense disambiguation

10.3115/1220575.1220689 ◽

2005 ◽

Cited By ~ 4

Author(s):

Zheng-Yu Niu ◽

Dong-Hong Ji ◽

Chew Lim Tan

Keyword(s):

Clustering Algorithm ◽

Word Sense Disambiguation ◽

Word Sense ◽

Feature Clustering ◽

Sense Disambiguation