A Novel Approach to Multi-document Summarization

Author(s):  
Li-Qing Qiu ◽  
Bin Pang ◽  
Sai-Qun Lin ◽  
Peng Chen
Author(s):  
Uri Mirchev ◽  
Mark Last

Automatic multi-document summarization aims to recognize important text content in a collection of topic-related documents and represent it in the form of a short abstract or extract. This chapter presents a novel approach to the multi-document summarization problem, focusing on the generic summarization task. The proposed SentRel (Sentence Relations) multi-document summarization algorithm assigns importance scores to documents and sentences in a collection based on two aspects: static and dynamic. In the static aspect, the significance score is recursively inferred from a novel, tripartite graph representation of the text corpus. In the dynamic aspect, the significance score is continuously refined with respect to the current summary content. The resulting summary is generated in the form of complete sentences exactly as they appear in the summarized documents, ensuring the summary's grammatical correctness. The proposed algorithm is evaluated on the TAC 2011 dataset using DUC 2001 for training and DUC 2004 for parameter tuning. The SentRel ROUGE-1 and ROUGE-2 scores are comparable to those of state-of-the-art summarization systems, which require a different set of textual entities.
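The interplay between a fixed ("static") score and a score that is refined against the current summary can be illustrated with a minimal sketch. This is not the SentRel algorithm: the tripartite graph is not reproduced here, the static scores are simply passed in as numbers, and the word-overlap discount, the `penalty` parameter, and the function names are all assumptions made for illustration.

```python
from typing import List, Set


def tokenize(sentence: str) -> Set[str]:
    """Lowercase, split on whitespace, and strip punctuation from token edges."""
    return {w.strip(".,;:!?\"'()").lower() for w in sentence.split()} - {""}


def greedy_summary(
    sentences: List[str],
    static_scores: List[float],
    max_sentences: int,
    penalty: float = 1.0,  # hypothetical weight of the overlap discount
) -> List[str]:
    """Pick whole sentences greedily, re-scoring after every pick.

    The dynamic score is the static score discounted by the share of a
    sentence's words that the summary already covers (an assumed rule).
    """
    selected: List[int] = []
    covered: Set[str] = set()
    remaining = list(range(len(sentences)))

    def dynamic_score(i: int) -> float:
        words = tokenize(sentences[i])
        if not words:
            return float("-inf")
        overlap = len(words & covered) / len(words)
        return static_scores[i] * (1.0 - penalty * overlap)

    while remaining and len(selected) < max_sentences:
        best = max(remaining, key=dynamic_score)
        selected.append(best)
        covered |= tokenize(sentences[best])
        remaining.remove(best)

    # Emit the chosen sentences verbatim, in their original order.
    return [sentences[i] for i in sorted(selected)]


docs = [
    "The cat sat on the mat.",
    "The cat sat on the mat today.",
    "Dogs bark loudly at night.",
]
summary = greedy_summary(docs, static_scores=[0.9, 0.8, 0.5], max_sentences=2)
```

In this toy input the second sentence has the second-highest static score, but it is skipped because nearly all of its words are already covered once the first sentence is selected.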


2016 ◽  
Vol 25 (01) ◽  
pp. 1660002 ◽  
Author(s):  
Guangbing Yang

Oft-decried information overload is a serious problem that negatively impacts the comprehension of information in the digital age. Text summarization is a helpful process that can be used to alleviate this problem. To enhance the performance of multi-document summarization, this study proposes a novel approach based on a mixture model, consisting of a contextual topic model from a Bayesian hierarchical topic modeling family for selecting candidate summary sentences, and a machine-learning regression model for generating the summary. By investigating hierarchical topics and their correlations with respect to the lexical co-occurrences of words, the proposed contextual topic model can determine the relevance of sentences more effectively, recognize latent topics, and arrange them hierarchically. The quantitative evaluation results from a practical application demonstrate that a system implementing this model can significantly improve the performance of summarization and make it comparable to state-of-the-art summarization systems.


Author(s):  
Nada A. Dief ◽  
Ali E. Al-Desouky ◽  
Amr Aly Eldin ◽  
Asmaa M. El-Said

Due to the increasing accessibility of online data and the availability of thousands of documents on the Internet, it has become very difficult for a human to review and analyze each document manually. The sheer size of such documents and data presents a significant challenge for users. Providing automatic summaries of specific topics helps users to overcome this problem. Most current extractive multi-document summarization systems can successfully extract summary sentences; however, many limitations remain, including redundancy, inaccurate extraction of important sentences, low coverage and poor coherence among the selected sentences. This paper introduces an adaptive extractive multi-document generic (EMDG) methodology for automatic text summarization. The framework of this methodology relies on a novel sentence similarity measure, a discriminative sentence selection method for sentence scoring and a reordering technique for the extracted sentences after the redundant ones are removed. Extensive experiments are conducted on the summarization benchmark datasets DUC2005, DUC2006 and DUC2007. ROUGE evaluation for automatic summarization is used to validate the proposed EMDG methodology, and the experimental results show that it is more effective than current extractive multi-document summarization systems and outperforms the baseline techniques, with generated summaries characterized by high coverage and cohesion.
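For readers unfamiliar with the metric, ROUGE-N recall is the fraction of reference n-grams that also occur in the candidate summary, with counts clipped so that a repeated n-gram is not credited more often than it appears in the candidate. The following is a minimal single-reference sketch; the whitespace tokenization and the function names are assumptions, and the official ROUGE toolkit additionally supports options such as stemming, stop-word removal, and multiple references.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter returns 0 for n-grams missing from the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


r1 = rouge_n_recall("the cat sat", "the cat sat on the mat", n=1)
r2 = rouge_n_recall("the cat sat", "the cat sat on the mat", n=2)
```

Here `r1` is 3/6 (the reference contains "the" twice but the candidate only once) and `r2` is 2/5.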


2005 ◽  
Vol 11 (1) ◽  
pp. 67-86 ◽  
Author(s):  
RIE ANDO ◽  
BRANIMIR BOGURAEV ◽  
ROY BYRD ◽  
MARY NEFF

This paper describes a novel approach to multi-document summarization, which explicitly addresses the problem of detecting, and retaining for the summary, multiple themes in document collections. We place equal emphasis on the processes of theme identification and theme presentation. For the former, we apply Iterative Residual Rescaling (IRR); for the latter, we argue for graphical display elements. IRR is an algorithm designed to account for correlations between words and to construct multi-dimensional topical space indicative of relationships among linguistic objects (documents, phrases, and sentences). Summaries are composed of objects with certain properties, derived by exploiting the many-to-many relationships in such a space. Given their inherent complexity, our multi-faceted summaries benefit from a visualization environment. We discuss some essential features of such an environment.


2019 ◽  
Vol 476 (24) ◽  
pp. 3705-3719 ◽  
Author(s):  
Avani Vyas ◽  
Umamaheswar Duvvuri ◽  
Kirill Kiselyov

Platinum-containing drugs such as cisplatin and carboplatin are routinely used for the treatment of many solid tumors including squamous cell carcinoma of the head and neck (SCCHN). However, SCCHN resistance to platinum compounds is well documented. The resistance to platinum has been linked to the activity of divalent transporter ATP7B, which pumps platinum from the cytoplasm into lysosomes, decreasing its concentration in the cytoplasm. Several cancer models show increased expression of ATP7B; however, the reason for such an increase is not known. Here we show a strong positive correlation between mRNA levels of TMEM16A and ATP7B in human SCCHN tumors. TMEM16A overexpression and depletion in SCCHN cell lines caused parallel changes in the ATP7B mRNA levels. The ATP7B increase in TMEM16A-overexpressing cells was reversed by suppression of NADPH oxidase 2 (NOX2), by the antioxidant N-Acetyl-Cysteine (NAC) and by copper chelation using cuprizone and bathocuproine sulphonate (BCS). Pretreatment with either chelator significantly increased sensitivity to cisplatin, particularly in the context of TMEM16A overexpression. We propose that increased oxidative stress in TMEM16A-overexpressing cells liberates the chelated copper in the cytoplasm, leading to the transcriptional activation of ATP7B expression. This, in turn, decreases the efficacy of platinum compounds by promoting their vesicular sequestration. We think that this new explanation of the mechanism of platinum resistance in SCCHN tumors identifies a novel approach to treating these tumors.


2020 ◽  
Vol 51 (3) ◽  
pp. 544-560 ◽  
Author(s):  
Kimberly A. Murphy ◽  
Emily A. Diehm

Purpose: Morphological interventions promote gains in morphological knowledge and in other oral and written language skills (e.g., phonological awareness, vocabulary, reading, and spelling), yet we have a limited understanding of critical intervention features. In this clinical focus article, we describe a relatively novel approach to teaching morphology that considers its role as the key organizing principle of English orthography. We also present a clinical example of such an intervention delivered during a summer camp at a university speech and hearing clinic. Method: Graduate speech-language pathology students provided a 6-week morphology-focused orthographic intervention to children in first through fourth grade (n = 10) who demonstrated word-level reading and spelling difficulties. The intervention focused children's attention on morphological families, teaching how morphology is interrelated with phonology and etymology in English orthography. Results: Comparing pre- and posttest scores, children demonstrated improvement in reading and/or spelling abilities, with the largest gains observed in spelling affixes within polymorphemic words. Children and their caregivers reacted positively to the intervention. Therefore, data from the camp offer preliminary support for teaching morphology within the context of written words, and the intervention appears to be a feasible approach for simultaneously increasing morphological knowledge, reading, and spelling. Conclusion: Children with word-level reading and spelling difficulties may benefit from a morphology-focused orthographic intervention, such as the one described here. Research on the approach is warranted, and clinicians are encouraged to explore its possible effectiveness in their practice. Supplemental Material: https://doi.org/10.23641/asha.12290687


2015 ◽  
Vol 21 ◽  
pp. 128
Author(s):  
Kaniksha Desai ◽  
Halis Akturk ◽  
Ana Maria Chindris ◽  
Shon Meek ◽  
Robert Smallridge ◽  
...  