D-Pattern Evolving and Inner Pattern Evolving for High Performance Text Mining

Author(s): B. Vignani, Suresh Chandra Satapathy
Author(s): Hans Vandierendonck, Karen Murphy, Mahwish Arif, Dimitrios S. Nikolopoulos

Author(s): Shanshan Yu, Jindian Su, Pengfei Li, Hao Wang

As a typical unsupervised learning method, the TextRank algorithm performs well for large-scale text mining, especially for automatic summarization and keyword extraction. However, during automatic summarization TextRank considers only the similarities between sentences and neglects information about text structure and context. To overcome these shortcomings, the authors propose an improved, highly scalable method called iTextRank. When building the TextRank graph in their new method, the authors compute sentence similarities and adjust the weights of nodes by considering statistical and linguistic features, such as similarity to the title, paragraph structure, special sentences, and sentence position and length. Their analysis shows that the time complexity of iTextRank is comparable to that of TextRank. More importantly, two experiments show that iTextRank has a higher accuracy and lower recall rate than TextRank, and that it is as effective as several popular online automatic summarization systems.
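
The idea of a sentence graph whose node weights are adjusted by extra features can be illustrated with a minimal Python sketch. This is not the authors' iTextRank: the abstract does not give their formulas, so the sketch assumes the word-overlap similarity commonly used with TextRank and folds two hypothetical features (overlap with the title and sentence position) into a personalized-PageRank-style prior. The names `rank_sentences`, `overlap_similarity`, and the way the features are combined are all assumptions made for illustration.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def overlap_similarity(a, b):
    """Word-overlap similarity normalized by log sentence lengths."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # avoids a zero denominator for very short sentences
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def rank_sentences(sentences, title="", damping=0.85, iterations=50):
    """Score sentences with a feature-biased, TextRank-like iteration.

    The prior below (title overlap plus inverse position) is a
    hypothetical stand-in for the statistical and linguistic features
    mentioned in the abstract, not the published weighting scheme.
    """
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    title_tokens = tokenize(title)

    # Edge weights: pairwise similarity between distinct sentences.
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sim[i][j] = overlap_similarity(tokens[i], tokens[j])
    out_sum = [sum(row) for row in sim]

    # Assumed node prior: favor title-related and early sentences.
    prior = []
    for i, toks in enumerate(tokens):
        title_score = (
            len(toks & title_tokens) / len(title_tokens) if title_tokens else 0.0
        )
        position_score = 1.0 / (i + 1)
        prior.append(1.0 + title_score + position_score)
    total = sum(prior)
    prior = [p / total for p in prior]

    # Power iteration with the prior replacing the uniform jump term.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(
                sim[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) * prior[i] + damping * incoming)
        scores = updated
    return scores
```

A summary would then take the highest-scoring sentences, for example `sorted(range(len(s)), key=lambda i: -scores[i])[:k]`. Building the similarity matrix is quadratic in the number of sentences with or without the prior, and the prior itself costs only one pass over the sentences, which is one plausible reading of why a feature-adjusted variant can keep a time complexity comparable to plain TextRank.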

