Intention-Based Clustering of Relevant Reviews Using Content Similarity
This work addresses the problem of finding related reviews posted on various online forums. Conventional methods for matching related documents compute content similarity over the entire review rather than partitioning it into segments that reveal different intentions. In this work, intention-based similarity clustering is introduced to measure the relatedness of two documents. The method forms document clusters based on the similarity of segments that share an intention. Segmentation points are identified using a number of text features that indicate where a segment boundary should be placed. Finally, the document clusters are formed by grouping segments with similar intentions into the same cluster and then computing the similarities among the segments that share an intention. The proposed model is trained and evaluated on the TripAdvisor and Yelp Open Review datasets, and the evaluation shows that it mines documents related to the user’s interest more precisely.
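To make the idea of comparing only segments that share an intention concrete, the following is a minimal sketch and not the authors' implementation. It assumes each review has already been split into segments and that each segment carries an intention label; the labels used here ("complaint", "praise"), the function names, and the choice of bag-of-words cosine similarity averaged over shared intentions are all illustrative assumptions. The feature-based detection of segmentation points described above is not modeled.

```python
import math
from collections import Counter, defaultdict


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two token lists using raw term counts."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_by_intention(segments):
    """Pool a review's (intention, text) segments into one token list per intention."""
    groups = defaultdict(list)
    for intention, text in segments:
        groups[intention].extend(text.lower().split())
    return groups


def intention_similarity(review_a, review_b):
    """Average the per-intention cosine similarity over intentions both reviews share.

    Assumption: averaging is one simple way to combine per-intention scores;
    the source does not specify how they are combined.
    """
    groups_a = group_by_intention(review_a)
    groups_b = group_by_intention(review_b)
    shared = groups_a.keys() & groups_b.keys()
    if not shared:
        return 0.0
    return sum(cosine(groups_a[i], groups_b[i]) for i in shared) / len(shared)


# Hypothetical pre-segmented reviews: (intention label, segment text).
review_a = [("complaint", "room was dirty"), ("praise", "staff friendly")]
review_b = [("complaint", "room was dirty"), ("praise", "great pool")]
score = intention_similarity(review_a, review_b)
```

In this toy input the complaint segments are identical and the praise segments share no words, so the score falls halfway between no match and a full match. A whole-review comparison would blend both intentions into a single vector instead of scoring them separately.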