A Query-Oriented Summarization System for XML Elements

Author(s):  
Dexi Liu ◽  
Shihan Wu


2021 ◽
Vol ahead-of-print (ahead-of-print) ◽
Author(s):  
Farnoush Bayatmakou ◽  
Azadeh Mohebi ◽  
Abbas Ahmadi

Purpose: Query-based summarization approaches might not be able to provide summaries compatible with the user's information need, as they mostly rely on a limited source of information, usually a single query supplied by the user. This issue becomes even more challenging with scientific documents, which contain more specific subject-related terms, while the user may not be able to express his/her specific information need in a query with only a few terms. This study proposes an interactive multi-document text summarization approach that generates a summary more compatible with the user's information need and allows the user to interactively specify the composition of a multi-document summary.

Design/methodology/approach: The approach exploits the user's opinion in two stages. First, the initial query is refined by user-selected keywords/keyphrases and complete sentences extracted from the set of retrieved documents. This is followed by a novel method for sentence expansion using a genetic algorithm, and ranking of the final set of sentences using the maximal marginal relevance method. For implementation, the Web of Science data set in the artificial intelligence (AI) category is used.

Findings: The proposed approach receives feedback from the user in the form of favored keywords and sentences, and this feedback ultimately improves the summary. To assess the performance of the proposed system, 45 users, all graduate students in the field of AI, were asked to fill out a questionnaire. The quality of the final summary was also evaluated in terms of the user's perspective and information redundancy. The results show that the proposed approach leads to higher user satisfaction than approaches with no interaction or with only a single interaction step.

Originality/value: The interactive summarization approach goes beyond the user's initial query by incorporating the user's preferred keywords/keyphrases and sentences through a systematic interaction. Through these interactions, the user gains a clearer idea of the information he/she is looking for, and the final result is adjusted to the ultimate information need. Such interaction allows the summarization system to achieve a comprehensive understanding of the user's information needs while expanding context-based knowledge and guiding the user through his/her information journey.
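The ranking step above relies on the maximal marginal relevance (MMR) criterion. The following is a minimal sketch of MMR re-ranking over candidate sentences, not the authors' implementation; the use of scikit-learn TF-IDF vectors, the trade-off weight lam and the function name mmr_rank are illustrative assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def mmr_rank(query, sentences, k=5, lam=0.7):
    """Greedy MMR: trade off relevance to the (refined) query against
    redundancy with sentences already selected for the summary."""
    vec = TfidfVectorizer().fit([query] + sentences)
    q = vec.transform([query])
    s = vec.transform(sentences)
    rel = cosine_similarity(s, q).ravel()   # relevance of each sentence to the query
    red = cosine_similarity(s)              # pairwise similarity used as the redundancy term

    selected, remaining = [], list(range(len(sentences)))
    while remaining and len(selected) < k:
        def mmr(i):
            penalty = max(red[i][j] for j in selected) if selected else 0.0
            return lam * rel[i] - (1 - lam) * penalty
        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return [sentences[i] for i in selected]
```

Calling mmr_rank(refined_query, candidate_sentences) returns the k sentences that balance relevance to the refined query against redundancy with what has already been picked.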


Author(s):  
Mahsa Afsharizadeh ◽  
Hossein Ebrahimpour-Komleh ◽  
Ayoub Bagheri

Purpose: The COVID-19 pandemic has created an emergency for the medical community. Researchers need to study the scientific literature extensively in order to discover drugs and vaccines. In a situation where every minute is valuable for saving lives, a quick understanding of scientific articles helps the medical community, and automatic text summarization makes this possible.

Materials and Methods: In this study, a recurrent neural network-based extractive summarization method is proposed. The extractive method identifies the informative parts of the text, and recurrent neural networks are well suited to analyzing sequences such as text. The proposed method has three phases: sentence encoding, sentence ranking, and summary generation. To improve the performance of the summarization system, a coreference resolution procedure is used. Coreference resolution identifies the mentions in the text that refer to the same real-world entity; this procedure helps the summarization process by discovering the central subject of the text.

Results: The proposed method is evaluated on COVID-19 research articles extracted from the CORD-19 dataset. The results show that combining a recurrent neural network with coreference resolution embedding vectors improves the performance of the summarization system. By achieving a ROUGE-1 recall of 0.53, the proposed method demonstrates that coreference resolution embedding vectors improve summarization performance in the RNN-based system.

Conclusion: In this study, coreference information is stored in the form of coreference embedding vectors. The joint use of a recurrent neural network and coreference resolution results in an efficient summarization system.
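To make the three phases more concrete, here is a minimal sketch, not the authors' code, of an RNN sentence scorer that concatenates word embeddings with coreference embedding vectors before ranking sentences. The use of PyTorch, all dimensions, and the assumption that each token is tagged with the id of its coreference chain (0 for tokens outside any chain) are illustrative.

```python
import torch
import torch.nn as nn

class CorefSentenceScorer(nn.Module):
    """Bidirectional GRU over [word embedding ; coreference embedding],
    followed by a linear layer that produces one salience score per sentence."""
    def __init__(self, vocab_size, n_chains, word_dim=100, coref_dim=20, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.coref_emb = nn.Embedding(n_chains, coref_dim)   # coreference embedding vectors
        self.encoder = nn.GRU(word_dim + coref_dim, hidden,
                              batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, token_ids, chain_ids):
        # token_ids, chain_ids: (n_sentences, seq_len)
        x = torch.cat([self.word_emb(token_ids), self.coref_emb(chain_ids)], dim=-1)
        _, h = self.encoder(x)                               # h: (2, n_sentences, hidden)
        sent_repr = torch.cat([h[0], h[1]], dim=-1)          # join forward/backward final states
        return self.score(sent_repr).squeeze(-1)             # one score per sentence

# Summary generation would then keep the top-k sentences by score.
scorer = CorefSentenceScorer(vocab_size=30000, n_chains=50)
scores = scorer(torch.randint(0, 30000, (4, 25)), torch.randint(0, 50, (4, 25)))
```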


2012 ◽  
Vol 21 (04) ◽  
pp. 383-403 ◽  
Author(s):  
ELENA FILATOVA

Wikipedia is used as a training corpus for many information selection tasks: summarization, question answering, etc. The information presented in Wikipedia articles, as well as the order in which it is presented, is treated as the gold standard and is used for improving the quality of information selection systems. However, the Wikipedia articles corresponding to the same entry (person, location, event, etc.) written in different languages differ substantially in what information they include. In this paper we analyze the regularities of information overlap among the articles about the same Wikipedia entry written in different languages: some information facts are covered in the Wikipedia articles in many languages, while others are covered in only a few. We introduce the hypothesis that the structure of this information overlap is similar to the information overlap structure (pyramid model) used in summarization evaluation, as well as to the information overlap/repetition structure used to identify important information for multidocument summarization. We verify this hypothesis by building a summarization system based on the presented information overlap idea. This system summarizes English Wikipedia articles given the articles about the same Wikipedia entries written in other languages. To evaluate the quality of the created summaries, we use Amazon Mechanical Turk as the source of human subjects who can reliably judge the quality of the created text. We also compare the summaries generated according to the information overlap hypothesis against the lead-line baseline, which is considered to be the most reliable way to generate summaries of Wikipedia articles. The summarization experiment confirms the introduced multilingual Wikipedia information overlap hypothesis.
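A minimal sketch of the overlap-based scoring idea described above, not the paper's system: assuming the other-language articles have already been mapped to English content terms (e.g. via translation or interlanguage links), each English sentence is scored by how many language versions also cover its terms, and the top-scoring sentences form the summary. The function name and scoring details are illustrative.

```python
from collections import Counter

def overlap_summary(english_sentences, other_language_term_sets, k=5):
    """Score each English sentence by how widely its content words are
    repeated across the other-language versions of the same entry."""
    support = Counter()
    for terms in other_language_term_sets:      # one set of (translated) terms per language
        for term in set(terms):
            support[term] += 1                  # number of languages covering this term

    def score(sentence):
        words = [w.lower().strip(".,;:()") for w in sentence.split()]
        return sum(support[w] for w in words) / max(len(words), 1)

    return sorted(english_sentences, key=score, reverse=True)[:k]
```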


2002 ◽  
Vol 28 (4) ◽  
pp. 447-485 ◽  
Author(s):  
Klaus Zechner

Automatic summarization of open-domain spoken dialogues is a relatively new research area. This article introduces the task and the challenges involved and motivates and presents an approach for obtaining automatic-extract summaries for human transcripts of multiparty dialogues of four different genres, without any restriction on domain. We address the following issues, which are intrinsic to spoken-dialogue summarization and typically can be ignored when summarizing written text such as news wire data: (1) detection and removal of speech disfluencies; (2) detection and insertion of sentence boundaries; and (3) detection and linking of cross-speaker information units (question-answer pairs). A system evaluation is performed using a corpus of 23 dialogue excerpts with an average duration of about 10 minutes, comprising 80 topical segments and about 47,000 words total. The corpus was manually annotated for relevant text spans by six human annotators. The global evaluation shows that for the two more informal genres, our summarization system using dialogue-specific components significantly outperforms two baselines: (1) a maximum-marginal-relevance ranking algorithm using TF*IDF term weighting, and (2) a LEAD baseline that extracts the first n words from a text.
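As a toy illustration only, not the article's components, the sketch below shows two of the dialogue-specific steps named above in their simplest form: stripping filler-word disfluencies from transcript turns and producing a LEAD baseline that keeps the first n words. The filler list, the default n, and the function names are assumptions.

```python
import re

# A few common fillers; a real disfluency detector also handles repairs and false starts.
FILLERS = re.compile(r"\b(uh|um|you know|i mean)\b[, ]*", flags=re.IGNORECASE)

def remove_fillers(turn: str) -> str:
    """Remove simple filler words and collapse the leftover whitespace."""
    return re.sub(r"\s{2,}", " ", FILLERS.sub("", turn)).strip()

def lead_baseline(transcript_turns, n_words=100):
    """LEAD baseline: the first n_words words of the cleaned transcript."""
    words = " ".join(remove_fillers(t) for t in transcript_turns).split()
    return " ".join(words[:n_words])
```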


Author(s):  
Pedro Paulo Balage Filho ◽  
Vinícius Rodrigues de Uzêda ◽  
Thiago Alexandre Salgueiro Pardo ◽  
Maria das Graças Volpe Nunes
