Graph-based biomedical text summarization: An itemset mining and sentence clustering approach

The growing volume of text-based biomedical information has prompted efforts to mine valuable or interesting frequent patterns (words/terms) from very large collections of content, which remains a challenging task. In this chapter, the authors devise a practical text-mining methodology based on frequent itemsets. The chapter presents a graph-based summarization strategy that uses itemset mining to summarize biomedical literature. The authors address the difficulty of recognizing the important topics or concepts in a biomedical document and display the relations between the strings, choosing the most relevant sentences from the biomedical literature using the Apriori itemset mining algorithm. The method uses essential criteria to identify the significant concepts and events, for example the main topics of the input document. The selected sentences are judged highly informative and relevant and are used to create the final summary.
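To make the idea of Apriori-driven sentence selection concrete, here is a minimal Python sketch. It is not the chapter's implementation: the whitespace tokenizer, the scoring rule (count of frequent itemsets of size two or more that a sentence contains), and the names `apriori` and `rank_sentences` are all assumptions made for illustration, and the graph construction step is omitted.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        candidates = set()
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


def rank_sentences(sentences, min_support=2, top_n=2):
    """Pick the top_n sentences covering the most frequent itemsets.

    Assumption: each sentence is one transaction of lowercase,
    whitespace-separated terms; only itemsets of size >= 2 are scored.
    """
    transactions = [frozenset(s.lower().split()) for s in sentences]
    itemsets = apriori(transactions, min_support)
    scores = [sum(1 for s in itemsets if len(s) >= 2 and s <= t)
              for t in transactions]
    order = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    # Return the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(order[:top_n])]
```

Calling `rank_sentences(list_of_sentences, min_support=2, top_n=3)` returns the selected sentences in document order. A real system would likely add stop-word removal and concept mapping before mining.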


Word Embedding-Based Biomedical Text Summarization

Advances in Intelligent Systems and Computing - Emerging Trends in Intelligent Computing and Informatics ◽

10.1007/978-3-030-33582-3_28 ◽

2019 ◽

pp. 288-297

Author(s):

Oussama Rouane ◽

Hacene Belhadef ◽

Mustapha Bouakkaz

Keyword(s):

Text Summarization ◽

Word Embedding ◽

Biomedical Text


Clinical Context–Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation (Preprint)

10.2196/preprints.19810 ◽

2020 ◽

Author(s):

Muhammad Afzal ◽

Fakhare Alam ◽

Khalid Mahmood Malik ◽

Ghaus M Malik

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Neural Network ◽

Text Summarization ◽

Biomedical Literature ◽

Biomedical Text ◽

Context Aware ◽

Clinical Context ◽

Jaccard Similarity ◽

Recognition Model

BACKGROUND: Automatic text summarization (ATS) enables users to retrieve meaningful evidence from big data of biomedical repositories to make complex clinical decisions. Deep neural and recurrent networks outperform traditional machine-learning techniques in areas of natural language processing and computer vision; however, they are yet to be explored in the ATS domain, particularly for medical text summarization.

OBJECTIVE: Traditional approaches in ATS for biomedical text suffer from fundamental issues such as an inability to capture clinical context, quality of evidence, and purpose-driven selection of passages for the summary. We aimed to circumvent these limitations through achieving precise, succinct, and coherent information extraction from credible published biomedical resources, and to construct a simplified summary containing the most informative content that can offer a review particular to clinical needs.

METHODS: In our proposed approach, we introduce a novel framework, termed Biomed-Summarizer, that provides quality-aware Patient/Problem, Intervention, Comparison, and Outcome (PICO)-based intelligent and context-enabled summarization of biomedical text. Biomed-Summarizer integrates the prognosis quality recognition model with a clinical context–aware model to locate text sequences in the body of a biomedical article for use in the final summary. First, we developed a deep neural network binary classifier for quality recognition to acquire scientifically sound studies and filter out others. Second, we developed a bidirectional long short-term memory recurrent neural network as a clinical context–aware classifier, which was trained on semantically enriched features generated using a word-embedding tokenizer for identification of meaningful sentences representing PICO text sequences. Third, we calculated the similarity between query and PICO text sequences using Jaccard similarity with semantic enrichments, where the semantic enrichments are obtained using medical ontologies. Last, we generated a representative summary from the high-scoring PICO sequences aggregated by study type, publication credibility, and freshness score.

RESULTS: Evaluation of the prognosis quality recognition model using a large dataset of biomedical literature related to intracranial aneurysm showed an accuracy of 95.41% (2562/2686) in terms of recognizing quality articles. The clinical context–aware multiclass classifier outperformed the traditional machine-learning algorithms, including support vector machine, gradient boosted tree, linear regression, K-nearest neighbor, and naïve Bayes, by achieving 93% (16127/17341) accuracy for classifying five categories: aim, population, intervention, results, and outcome. The semantic similarity algorithm achieved a significant Pearson correlation coefficient of 0.61 (0-1 scale) on a well-known BIOSSES dataset (with 100 pair sentences) after semantic enrichment, representing an improvement of 8.9% over baseline Jaccard similarity. Finally, we found a highly positive correlation among the evaluations performed by three domain experts concerning different metrics, suggesting that the automated summarization is satisfactory.

CONCLUSIONS: By employing the proposed method Biomed-Summarizer, high accuracy in ATS was achieved, enabling seamless curation of research evidence from the biomedical literature to use for clinical decision-making.
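The similarity step in the abstract can be illustrated with a short Python sketch of Jaccard similarity, with and without a simple form of semantic enrichment. This is only an approximation of the idea: the abstract says the enrichments come from medical ontologies, whereas the `TOY_SYNONYMS` table, the `enrich` helper, and the whitespace tokenization below are hypothetical stand-ins invented for this example.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two token collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty inputs
    return len(a & b) / len(a | b)


# Hypothetical synonym table standing in for a medical ontology lookup.
TOY_SYNONYMS = {
    "surgery": {"operation"},
    "operation": {"surgery"},
}


def enrich(tokens, synonyms):
    """Expand a token collection with every synonym listed for its tokens."""
    expanded = set(tokens)
    for token in tokens:
        expanded |= synonyms.get(token, set())
    return expanded


def enriched_jaccard(query, sentence, synonyms=TOY_SYNONYMS):
    """Jaccard similarity computed after enriching both sides."""
    q = enrich(query.lower().split(), synonyms)
    s = enrich(sentence.lower().split(), synonyms)
    return jaccard(q, s)
```

For the query "aneurysm surgery outcome" and the sentence "aneurysm operation outcome", plain Jaccard on the token sets shares two of four distinct tokens, while the enriched version treats "surgery" and "operation" as equivalent and the two sets become identical.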


Biomedical Text Summarization: A Graph-Based Ranking Approach

Advances in Intelligent Systems and Computing - Applied Information Processing Systems ◽

10.1007/978-981-16-2008-9_14 ◽

2021 ◽

pp. 147-156

Author(s):

Supriya Gupta ◽

Aakanksha Sharaff ◽

Naresh Kumar Nagwani

Keyword(s):

Text Summarization ◽

Biomedical Text


Enhancing Biomedical Text Summarization Using Semantic Relation Extraction

PLoS ONE ◽

10.1371/journal.pone.0023862 ◽

2011 ◽

Vol 6 (8) ◽

pp. e23862 ◽

Cited By ~ 17

Author(s):

Yue Shang ◽

Yanpeng Li ◽

Hongfei Lin ◽

Zhihao Yang

Keyword(s):

Relation Extraction ◽

Semantic Relation ◽

Text Summarization ◽

Biomedical Text


Deep contextualized embeddings for quantifying the informative content in biomedical text summarization

Computer Methods and Programs in Biomedicine ◽

10.1016/j.cmpb.2019.105117 ◽

2020 ◽

Vol 184 ◽

pp. 105117 ◽

Cited By ~ 4

Author(s):

Milad Moradi ◽

Georg Dorffner ◽

Matthias Samwald

Keyword(s):

Text Summarization ◽

Biomedical Text ◽

Informative Content


Extractive Based Single Document Text Summarization Using Clustering Approach

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v3.i2.pp73-78 ◽

2014 ◽

Vol 3 (2) ◽

pp. 73-78

Author(s):

Pankaj Kailas Bhole ◽

A. J. Agrawal

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Computational Intelligence ◽

Clustering Algorithm ◽

Text Summarization ◽

Large Set ◽

Meaningful Information ◽

Time Reading ◽

Clustering Approach

Text summarization is a long-standing challenge in text mining that still needs researchers' attention in computational intelligence, machine learning, and natural language processing. We extract a set of features from each sentence that help identify its importance in the document. Reading the full text every time is time-consuming. A clustering approach is useful for deciding what type of data is present in a document. In this paper we introduce k-means clustering for natural language processing of text for word matching, and we adopt data mining document clustering algorithms to extract meaningful information from a large set of offline documents.
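As a rough illustration of clustering-based extractive summarization, the following Python sketch groups sentences with k-means and keeps one representative per cluster. It is not the paper's method: the bag-of-words features, the evenly spaced initial centroids, the rule of keeping the sentence nearest each centroid, and all function names are assumptions made to keep the example small and deterministic.

```python
import math
from collections import Counter


def vectorize(sentences):
    """Bag-of-words count vectors over the shared vocabulary (assumption)."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    vectors = []
    for s in sentences:
        counts = Counter(s.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; initial centroids are evenly spaced input vectors."""
    n = len(vectors)
    centroids = [list(vectors[i * n // k]) for i in range(k)]
    assignment = None
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: distance(v, centroids[j]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # converged: no vector changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignment, centroids


def summarize(sentences, k):
    """Keep, per cluster, the sentence closest to the cluster centroid."""
    vectors = vectorize(sentences)
    assignment, centroids = kmeans(vectors, k)
    chosen = []
    for j in range(k):
        members = [i for i, a in enumerate(assignment) if a == j]
        if members:
            chosen.append(min(members,
                              key=lambda i: distance(vectors[i], centroids[j])))
    # Present the selected sentences in original document order.
    return [sentences[i] for i in sorted(chosen)]
```

With `k` set to the desired summary length, `summarize(sentences, k)` returns at most `k` sentences. TF-IDF weighting or sentence-level features such as position and length could replace the raw counts used here.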


Clinical Context–Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation

Journal of Medical Internet Research ◽

10.2196/19810 ◽

2020 ◽

Vol 22 (10) ◽

pp. e19810

Author(s):

Muhammad Afzal ◽

Fakhare Alam ◽

Khalid Mahmood Malik ◽

Ghaus M Malik

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Neural Network ◽

Text Summarization ◽

Biomedical Literature ◽

Biomedical Text ◽

Context Aware ◽

Clinical Context ◽

Jaccard Similarity ◽

Recognition Model

