An ABC Solution of TSP with Cluster Feature

2021 ◽  
Vol 11 (04) ◽  
pp. 1001-1007
Author(s):  
建兵 林
Keyword(s):  
Author(s):  
Jie Zhou ◽  
◽  
Bicheng Li ◽  
Yongwang Tang ◽  

Person name clustering disambiguation is the process that partitions name mentions according to corresponding target person entities in reality. The existed methods can not realize effective identification of important features to disambiguate person names. This paper presents a method of Chinese person name disambiguation based on two-stage clustering. This method adopts a stage-by-stage processing model to identify and utilize different types of important features. Firstly, we extract three kinds of core evidences namely direct social relation, indirect social relation and common description prefix, recognize document-pairs referring to the same person entity, and realize initial clustering of person names with high precision. Then, we take the result of initial clustering as new initial input, utilize the statistical properties of multi-documents to recognize and evaluate important features, and build a double-vector representation of clusters (cluster feature vector and important feature vector). Based on the processes above, the final clustering of person names is generated, and the recall of clustering is improved effectively. The experiments have been conducted on the dataset of CLP2010 Chinese person names disambiguation, and experimental results show that this method has good performance in person name clustering disambiguation.


2019 ◽  
Vol 9 (8) ◽  
pp. 1578 ◽  
Author(s):  
Li ◽  
Yin ◽  
Shi ◽  
Mao ◽  
Shi

One decisive problem of short text classification is the serious dimensional disaster when utilizing a statistics-based approach to construct vector spaces. Here, a feature reduction method is proposed that is based on two-stage feature clustering (TSFC), which is applied to short text classification. Features are semi-loosely clustered by combining spectral clustering with a graph traversal algorithm. Next, intra-cluster feature screening rules are designed to remove outlier feature words, which improves the effect of similar feature clusters. We classify short texts with corresponding similar feature clusters instead of original feature words. Similar feature clusters replace feature words, and the dimension of vector space is significantly reduced. Several classifiers are utilized to evaluate the effectiveness of this method. The results show that the method largely resolves the dimensional disaster and it can significantly improve the accuracy of short text classification.


2018 ◽  
Vol 07 (01) ◽  
pp. 1750015
Author(s):  
Bingqing Lin ◽  
Zhen Pang ◽  
Qihua Wang

This paper concerns with variable screening when highly correlated variables exist in high-dimensional linear models. We propose a novel cluster feature selection (CFS) procedure based on the elastic net and linear correlation variable screening to enjoy the benefits of the two methods. When calculating the correlation between the predictor and the response, we consider highly correlated groups of predictors instead of the individual ones. This is in contrast to the usual linear correlation variable screening. Within each correlated group, we apply the elastic net to select variables and estimate their parameters. This avoids the drawback of mistakenly eliminating true relevant variables when they are highly correlated like LASSO [R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B 58 (1996) 268–288] does. After applying the CFS procedure, the maximum absolute correlation coefficient between clusters becomes smaller and any common model selection methods like sure independence screening (SIS) [J. Fan and J. Lv, Sure independence screening for ultrahigh dimensional feature space, J. R. Stat. Soc. Ser. B 70 (2008) 849–911] or LASSO can be applied to improve the results. Extensive numerical examples including pure simulation examples and semi-real examples are conducted to show the good performances of our procedure.


Sign in / Sign up

Export Citation Format

Share Document