Automatic Text Summarization from Unstructured Text using Natural Language Processing

Automatic text summarization is one of the most complex problems in the field of natural language processing. In this dissertation, we present an abstraction-based summarization approach that paraphrases the original text and generates new sentences. Creating new formulations, completely different from the original text, is similar to how humans summarize texts. To achieve this, we propose a deep learning method using a sequence-to-sequence architecture with an attention mechanism. The goal is to create a model for the Polish language, using a dataset of over 200,000 articles from Polish websites, split into text and summary parts. The outcomes look promising, with decent results on the standard metrics for this type of task. Based on the review of prior research done during the experiments, this is the very first attempt to apply abstractive text summarization techniques to the Polish language.
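
The abstract above does not say which attention variant the dissertation uses, so the following is only a minimal sketch of the general idea, assuming plain dot-product scoring. At each decoding step, the decoder state is compared with every encoder state, the scores are normalized with a softmax, and the weighted sum of encoder states becomes the context vector. The function names and the toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_product_attention(decoder_state, encoder_states):
    """Return (weights, context) for a single decoder step.

    decoder_state:  list of floats, the current decoder hidden state
    encoder_states: list of equally sized lists, one per source token
    """
    # One similarity score per source token (assumption: dot-product scoring).
    scores = [sum(d * e for d, e in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    # Context vector: attention-weighted sum of the encoder states.
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Toy example with made-up 2-dimensional states for three source tokens.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]
weights, context = dot_product_attention(decoder_state, encoder_states)
```

In a full sequence-to-sequence model the context vector would be combined with the decoder state to predict the next summary token; that part is omitted here.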

A Review Paper on Automatic Text Summarization in Indonesia Language

International Journal of Emerging Technology and Advanced Engineering ◽

10.46338/ijetae0821_11 ◽

2021 ◽

Vol 11 (8) ◽

pp. 89-96

Author(s):

Nurul Khotimah ◽

Adi Wibowo P ◽

Bryan Andreas ◽

Abba Suganda Girsang

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Hybrid Approach ◽

Algebraic Approach ◽

Text Summarization ◽

Automatic Text Summarization ◽

Machine Learning Approach ◽

Fuzzy Logic Approach ◽

Logic Approach

Text summarization is a problem in natural language processing in which a brief version of the original document is generated. This research area has attracted the attention of researchers over the last decade and is growing fast, including for the Indonesian language. This paper aims to recap text summarization research, especially for the Indonesian language. As usual, this paper discusses two summarization approaches, extractive and abstractive. In fact, there is more research on extractive than on abstractive summarization. This paper investigates methods such as the Statistical Based Approach, Graph Based Approach, Machine Learning Approach, Fuzzy Logic Approach, Algebraic Approach, and Hybrid Approach. It presents details of some of these methods and summarizes their results. Keywords: Text summarization, extractive summary, abstractive summary, natural language processing
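
To make the extractive, statistical-based idea concrete, here is a minimal sketch that scores each sentence by the average frequency of its words and keeps the top-scoring ones. It is not taken from the review paper; the sentence splitter, the scoring rule, and the sample text are simplifying assumptions, and a real system would add stop-word removal and language-specific stemming.

```python
import re
from collections import Counter

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase word tokens; no stop-word removal or stemming."""
    return re.findall(r"\w+", sentence.lower())

def summarize(text, num_sentences=1):
    """Pick the highest-scoring sentences and keep their original order."""
    sentences = split_sentences(text)
    freq = Counter(word for s in sentences for word in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so length alone does not favour a sentence.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])  # restore document order
    return " ".join(sentences[i] for i in chosen)

sample = "Cats sleep a lot. Cats chase mice. Dogs bark."
summary = summarize(sample, num_sentences=1)
```

Because the output reuses sentences verbatim, this is extractive; an abstractive system would instead generate new wording.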

An Exhaustive Survey on Automatic Text Summarization Using Machine Learning Approaches

Webology ◽

10.14704/web/v18si05/web18299 ◽

2021 ◽

Vol 18 (05) ◽

pp. 1184-1190

Author(s):

Abinaya N ◽

Anand R ◽

Arunkumar T ◽

Sameema Begam S

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Language Processing ◽

Document Analysis ◽

Text Summarization ◽

Learning Approaches ◽

Automatic Text Summarization ◽

Grammatical Errors ◽

E Learning ◽

Automatic Text

Automatic Text Summarization (ATS) is a key challenge in the area of Natural Language Processing (NLP). It deals with generating a summary from a given text without losing the vital information. This is a contemporary area because of the exponential growth of content on the internet, and it is applied to summarizing content in books, newsletters, internal document analysis, patent research, e-learning, etc. Various machine learning approaches are used in order to match the performance of human-generated summaries. Such systems still fail in a few areas, such as checking grammatical errors and paraphrasing sentences after the summary is created. This work provides a brief overview of the methods and approaches used in ATS.

Natural Language Processing (NLP) based Text Summarization - A Survey

2021 6th International Conference on Inventive Computation Technologies (ICICT) ◽

10.1109/icict50816.2021.9358703 ◽

2021 ◽

Author(s):

Ishitva Awasthi ◽

Kuntal Gupta ◽

Prabjot Singh Bhogal ◽

Sahejpreet Singh Anand ◽

Piyush Kumar Soni

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Summarization

Prediction and Analysis of Extracting Relations using Spacy Model

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f8524.038620 ◽

2020 ◽

Vol 8 (6) ◽

pp. 3281-3287

Keyword(s):

Natural Language ◽

Information Extraction ◽

Performance Measures ◽

Text Summarization ◽

Language Understanding ◽

Language Generation ◽

Automatic Text Summarization ◽

Structured Information ◽

Automatic Text ◽

F Measure

Text is an extremely rich resource of information. Every second and every minute, people send or receive hundreds of millions of pieces of data. The tasks involved in NLP include machine learning, information extraction, information retrieval, automatic text summarization, question answering, parsing, sentiment analysis, natural language understanding, and natural language generation. Information extraction is an important task used to find structured information in unstructured or semi-structured text. The paper presents a methodology for extracting relations between biomedical entities using spaCy. The framework consists of the following phases: data creation, loading and converting the data into spaCy objects, preprocessing, defining the patterns, and extracting the relations. The dataset is downloaded from the NCBI database and contains only sentences. The resulting model is evaluated with performance measures such as precision, recall, and F-measure. The model achieved 87% accuracy in retrieving entity relations.
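
The performance measures named above can be sketched as follows, assuming extracted relations are compared with a gold standard as exact-match triples. The function name and the placeholder triples are hypothetical and are not data from the paper.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted relations against gold relations (exact match)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted relations that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold relations that were found.
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure (F1): harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Placeholder (entity, relation, entity) triples, invented for illustration.
gold = {("entityA", "activates", "entityB"),
        ("entityC", "inhibits", "entityD"),
        ("entityE", "binds", "entityF")}
predicted = {("entityA", "activates", "entityB"),
             ("entityC", "inhibits", "entityD"),
             ("entityA", "binds", "entityD")}
precision, recall, f1 = precision_recall_f1(predicted, gold)
```

Here two of the three predicted triples are correct and two of the three gold triples are found, so precision, recall, and F-measure all equal 2/3.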

Implementation of a Cohort Retrieval System for Clinical Data Repositories Using the Observational Medical Outcomes Partnership Common Data Model: Proof-of-Concept System Validation (Preprint)

10.2196/preprints.17376 ◽

2019 ◽

Cited By ~ 1

Author(s):

Sijia Liu ◽

Yanshan Wang ◽

Andrew Wen ◽

Liwei Wang ◽

Na Hong ◽

...

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Data Model ◽

Structured Data ◽

Common Data Model ◽

Concept System ◽

Unstructured Text ◽

Electronic Health

BACKGROUND: Widespread adoption of electronic health records has enabled the secondary use of electronic health record data for clinical research and health care delivery. Natural language processing techniques have shown promise in their capability to extract the information embedded in unstructured clinical data, and information retrieval techniques provide flexible and scalable solutions that can augment natural language processing systems for retrieving and ranking relevant records.

OBJECTIVE: In this paper, we present the implementation of a cohort retrieval system that can execute textual cohort selection queries on both structured data and unstructured text: Cohort Retrieval Enhanced by Analysis of Text from Electronic Health Records (CREATE).

METHODS: CREATE is a proof-of-concept system that leverages a combination of structured queries and information retrieval techniques on natural language processing results to improve cohort retrieval performance using the Observational Medical Outcomes Partnership Common Data Model to enhance model portability. The natural language processing component was used to extract common data model concepts from textual queries. We designed a hierarchical index to support the common data model concept search utilizing information retrieval techniques and frameworks.

RESULTS: Our case study on 5 cohort identification queries, evaluated using the precision at 5 information retrieval metric at both the patient-level and document-level, demonstrates that CREATE achieves a mean precision at 5 of 0.90, which outperforms systems using only structured data or only unstructured text with mean precision at 5 values of 0.54 and 0.74, respectively.

CONCLUSIONS: The implementation and evaluation of Mayo Clinic Biobank data demonstrated that CREATE outperforms cohort retrieval systems that only use one of either structured data or unstructured text in complex textual cohort queries.
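
The precision at 5 metric reported above can be illustrated with a short sketch: for each query, it is the fraction of the five top-ranked results that are relevant, and the mean is taken over all queries. The function names, the identifiers, and the relevance judgments below are made up for illustration and are unrelated to the study's data.

```python
def precision_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the top-k ranked results that are relevant.

    Divides by k even if fewer than k results are returned,
    which is the usual convention for this metric.
    """
    top_k = ranked_ids[:k]
    hits = sum(1 for item in top_k if item in relevant_ids)
    return hits / k

def mean_precision_at_k(runs, k=5):
    """Average precision at k over (ranked_ids, relevant_ids) pairs."""
    return sum(precision_at_k(ranked, rel, k) for ranked, rel in runs) / len(runs)

# Two hypothetical queries with made-up patient identifiers.
runs = [
    (["p1", "p2", "p3", "p4", "p5", "p6"], {"p1", "p2", "p3", "p5"}),
    (["p7", "p8", "p9", "p10", "p11"], {"p7", "p9", "p10", "p11", "p12"}),
]
mean_p5 = mean_precision_at_k(runs)
```

Each toy query has four relevant results among its top five, so both score 0.8 and the mean is 0.8.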

Leveraging Natural Language Processing Applications Using Machine Learning

Handbook of Research on Emerging Trends and Applications of Machine Learning - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-5225-9643-1.ch016 ◽

2020 ◽

pp. 338-360

Author(s):

Janjanam Prabhudas ◽

C. H. Pradeep Reddy

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Summarization ◽

Feature Representation ◽

Learning Models ◽

Primary Focus ◽

And Performance

The enormous increase in information, along with the computational abilities of machines, has created innovative applications in natural language processing that rely on machine learning models. This chapter outlines trends in natural language processing that employ machine learning and its models in the context of text summarization. It is organized to help the researcher understand the technical perspectives on feature representation and the corresponding models to consider before applying them to language-oriented tasks. Further, the chapter reviews the primary deep learning models, their applications, and their performance in the context of language processing. The primary focus of this chapter is to illustrate the technical research findings and gaps in text summarization (TS) based on deep learning, along with state-of-the-art deep learning models for TS.

Text Summarization Using Natural Language Processing

Information and Communication Technology for Competitive Strategies (ICTCS 2020) - Lecture Notes in Networks and Systems ◽

10.1007/978-981-16-0739-4_62 ◽

2021 ◽

pp. 653-663

Author(s):

G. Sreenivasulu ◽

N. Thulasi Chitra ◽

B. Sujatha ◽

K. Venu Madhav

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Summarization
