Finding author similarity by clustering probabilistic LSA factors in Indian English authors poetry

2018 ◽  
Vol 7 (2.7) ◽  
pp. 1096
Author(s):  
K Praveen kumar ◽  
Venkata Naresh Mandhala ◽  
Sudheshna Vempati ◽  
Dr Subba Rao Peram

High dimensionality and sparseness are major challenges for data scientists trying to discover similarity among documents. In unsupervised learning the data are unlabeled, and there is no clear distance measure for discovering clusters in the data. In this paper we cluster poems by Indian English authors using Probabilistic Latent Semantic Analysis (PLSA) and use the resulting clusters to analyze author similarity. We compare the clustering results with those of Latent Semantic Analysis (LSA), a word-occurrence method. The results show that the probabilistic method produces better clusters than the word-occurrence method.
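
The abstract does not give implementation details, so the following is only a minimal sketch of how PLSA factors could be fitted and used to group documents, not the authors' actual procedure. It assumes the standard EM formulation of PLSA, a toy hypothetical word-count matrix (not data from the paper), hard clustering by the most probable latent factor, and cosine similarity between the P(z|d) vectors as one possible similarity measure. The names `plsa`, `cosine`, `counts`, and `clusters` are illustrative.

```python
import math
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a document-term count matrix (list of lists).

    Returns (p_z_d, p_w_z, history), where p_z_d[d][z] = P(z|d),
    p_w_z[z][w] = P(w|z), and history holds the log-likelihood
    evaluated at the start of each iteration.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        if total == 0:  # guard: fall back to a uniform distribution
            return [1.0 / len(row)] * len(row)
        return [x / total for x in row]

    # Random initialization of both conditional distributions.
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    history = []

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        loglik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                p_w_d = sum(joint)
                loglik += n * math.log(p_w_d)
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    r = n * joint[z] / p_w_d
                    new_z_d[d][z] += r
                    new_w_z[z][w] += r
        # M-step: renormalize the expected counts.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
        history.append(loglik)
    return p_z_d, p_w_z, history


def cosine(u, v):
    """Cosine similarity between two factor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))


# Toy, hypothetical counts: 4 documents (rows) over a 6-word vocabulary.
counts = [
    [4, 3, 2, 0, 0, 1],
    [3, 4, 1, 0, 1, 0],
    [0, 1, 0, 4, 3, 2],
    [1, 0, 0, 3, 4, 3],
]

p_z_d, p_w_z, history = plsa(counts, n_topics=2)

# Hard cluster assignment: the most probable latent factor per document.
clusters = [max(range(2), key=lambda z: row[z]) for row in p_z_d]
print(clusters)
print(cosine(p_z_d[0], p_z_d[1]), cosine(p_z_d[0], p_z_d[2]))
```

In practice each row would be a poem (or all poems by one author) over a much larger vocabulary, and the P(z|d) vectors could be passed to any standard clustering algorithm instead of the simple argmax used here.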
