Text document summarization using word embedding

2020 ◽  
Vol 143 ◽  
pp. 112958 ◽  
Author(s):  
Mudasir Mohd ◽  
Rafiya Jan ◽  
Muzaffar Shah

2021 ◽  
Author(s):  
Jingyi You ◽  
Chenlong Hu ◽  
Hidetaka Kamigaito ◽  
Hiroya Takamura ◽  
...  

Author(s):  
Gabriel Silva ◽  
Rafael Ferreira ◽  
Rafael Dueire Lins ◽  
Luciano Cabral ◽  
Hilário Oliveira ◽  
...  

Author(s):  
Sandhya P. ◽  
Mahek Laxmikant Kantesaria

Named entity recognition (NER) is a subtask of information extraction. An NER system reads text and highlights the entities in it, separating them into categories that depend on the project. NER is a two-step process: detecting names and classifying them. The first step further involves segmentation. The second step consists of choosing an ontology that organizes the detected items into categories. Document summarization, also called automatic summarization, is a process in which software creates a summary of a text document by selecting the important points of the original text. In this chapter, the authors explain how document summarization is performed using named entity recognition. They discuss the different types of summarization techniques, how NER works, and its applications. The libraries available for NER-based information extraction are explained. Finally, they explain how NER is applied to document summarization.
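
The two NER steps described in this abstract (detection, then classification) and their use for sentence selection can be illustrated with a minimal sketch. This is not the chapter's method: the capitalised-word regex stands in for a trained detector, and the GAZETTEER dictionary, the function names recognise and summarise, and the "count entities per sentence" scoring rule are all hypothetical choices made for illustration only.

```python
import re

# Hypothetical gazetteer: a tiny stand-in for the ontology used in the
# classification step. A real system would use a trained model instead.
GAZETTEER = {
    "Paris": "LOCATION",
    "France": "LOCATION",
    "Marie Curie": "PERSON",
}

# Detection step: runs of capitalised words (a crude heuristic, an assumption).
SPAN = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")


def recognise(sentence):
    """Return (text, label) pairs: detect candidate spans, then classify them."""
    entities = []
    for match in SPAN.finditer(sentence):
        text = match.group()
        label = GAZETTEER.get(text)  # classification by dictionary lookup
        if label is not None:
            entities.append((text, label))
    return entities


def summarise(sentences, k):
    """Keep the k sentences with the most recognised entities, in original order."""
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: len(recognise(sentences[i])),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]
```

The heuristic has obvious limits: a capitalised sentence-initial word directly followed by a name is merged into a single span and then fails the lookup. It is meant only to make the detect-then-classify split concrete.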


Author(s):  
Chandra Yadav ◽  
Aditi Sharan

Automatic text document summarization is an active research area in the text mining field. In this article, the authors propose two new approaches (three models) for sentence selection and a new entropy-based summary evaluation criterion. The first approach is based on an algebraic model, Singular Value Decomposition (SVD), i.e. Latent Semantic Analysis (LSA), and the model is termed proposed_model-1. The second approach is based on entropy and is further divided into proposed_model-2 and proposed_model-3. The first proposed model uses the right singular matrix, and the second and third proposed models are based on Shannon entropy. The advantages of these models are that they are not length-dominating, give better results, and have low redundancy. Along with these three new models, an entropy-based summary evaluation criterion is proposed and tested. The authors also show that their entropy-based models are statistically closer to DUC-2002's standard/gold summary. The dataset used in this article is taken from the Document Understanding Conference 2002 (DUC-2002).
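
To make the idea of entropy-based sentence selection concrete, here is a minimal sketch. The abstract does not specify how proposed_model-2 and proposed_model-3 compute their scores, so everything below is an assumption: word probabilities are estimated from the whole document, and a sentence's score is the sum of the Shannon terms -p log2 p over its distinct words. The names tokenize, sentence_entropy_scores, and select_sentences are hypothetical, and this simple score tends to favour longer sentences, so it does not reproduce the non-length-dominating property the authors report.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sentence_entropy_scores(sentences):
    """Score each sentence by the summed Shannon terms of its distinct words.

    Word probabilities are relative frequencies over all sentences together.
    This scoring rule is an illustrative assumption, not the authors' model.
    """
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    total = sum(counts.values())
    scores = []
    for s in sentences:
        words = set(tokenize(s))
        score = -sum(
            (counts[w] / total) * math.log2(counts[w] / total) for w in words
        )
        scores.append(score)
    return scores


def select_sentences(sentences, k):
    """Return the k highest-scoring sentences in their original order."""
    scores = sentence_entropy_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

A sentence containing every distinct word of the document receives the full Shannon entropy of the document's word distribution, which gives a quick sanity check for the scoring function.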

