Automatic Text Summarization and Keyword Extraction using Natural Language Processing

Automatic summary of texts in Polish

Automatic text summarization is one of the most complex problems in natural language processing. In this dissertation, we present an abstraction-based summarization approach, which paraphrases the original text and generates new sentences. Creating new formulations that differ completely from the original text is similar to how humans summarize texts. To achieve this, we propose a deep learning method using a Sequence to Sequence architecture with an attention mechanism. The goal is to create a model for the Polish language using a dataset of over 200,000 articles from Polish websites, each split into text and summary parts. The outcomes look promising, with decent results on the standard metrics for this type of task. Based on a review of prior research done during the experiments, this is the very first attempt to apply abstractive text summarization techniques to the Polish language.
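
The abstract above names an attention mechanism but does not say which variant is used. As a minimal sketch, assuming plain dot-product attention and toy two-dimensional states (the names `attend`, `encoder_states`, and `decoder_state` are hypothetical), one decoding step could look like this:

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(decoder_state, encoder_states):
    """Return (weights, context) for one step of dot-product attention."""
    # Score every encoder state against the current decoder state.
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    # The context vector is the weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Toy values, chosen only for illustration.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

In a trained Sequence to Sequence model the states would come from the encoder and decoder networks; here they are fixed by hand so that the weighting step is visible on its own.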

A Review Paper on Automatic Text Summarization in Indonesia Language

International Journal of Emerging Technology and Advanced Engineering ◽ 10.46338/ijetae0821_11 ◽ 2021 ◽ Vol 11 (8) ◽ pp. 89-96

Author(s): Nurul Khotimah ◽ Adi Wibowo P ◽ Bryan Andreas ◽ Abba Suganda Girsang

Keyword(s): Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Hybrid Approach ◽ Algebraic Approach ◽ Text Summarization ◽ Automatic Text Summarization ◽ Machine Learning Approach ◽ Fuzzy Logic Approach ◽ Logic Approach

Text summarization is a problem in natural language processing that generates a brief version of the original document. It has attracted the attention of researchers over the last decade and is growing fast, including for the Indonesian language. This paper aims to recap text summarization research, especially for the Indonesian language. As is usual, it discusses two summarization approaches, extractive and abstractive; extractive research outnumbers abstractive research. The paper investigates methods such as the Statistical Based Approach, Graph Based Approach, Machine Learning Approach, Fuzzy Logic Approach, Algebraic Approach, and Hybrid Approach, gives details of some of these methods, and summarizes the results.

Keywords: text summarization, extractive summary, abstractive summary, natural language processing
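
To make the extractive, statistical-based family mentioned above concrete, here is a minimal sketch that scores each sentence by the average document frequency of its words and keeps the top-scoring ones. It is not taken from the reviewed paper; the stopword list, the tokenizer, and the names `summarize` and `STOPWORDS` are illustrative assumptions, and the sample text is in English for readability.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a real resource).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "and", "to", "it"}

def split_sentences(text):
    # Naive split on whitespace that follows sentence-ending punctuation.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]

def summarize(text, n_sentences=1):
    sentences = split_sentences(text)
    freqs = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favored.
        return sum(freqs[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(ranked[:n_sentences])]

text = ("Summarization shortens a document. "
        "Extractive summarization selects sentences from the document. "
        "The weather was nice.")
summary = summarize(text, n_sentences=1)
```

An abstractive system would instead generate new sentences, which is why it needs a generation model rather than a scoring function.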

An Exhaustive Survey on Automatic Text Summarization Using Machine Learning Approches

Webology ◽ 10.14704/web/v18si05/web18299 ◽ 2021 ◽ Vol 18 (05) ◽ pp. 1184-1190

Author(s): Abinaya N ◽ Anand R ◽ Arunkumar T ◽ Sameema Begam S

Keyword(s): Machine Learning ◽ Natural Language Processing ◽ Language Processing ◽ Document Analysis ◽ Text Summarization ◽ Learning Approaches ◽ Automatic Text Summarization ◽ Grammatical Errors ◽ E Learning ◽ Automatic Text

Automatic Text Summarization (ATS) is a key challenge in the area of Natural Language Processing (NLP). It deals with generating a summary from a given text without losing the vital information. This is a contemporary area because of the exponential growth of content on the internet, and it is applied to summarizing the content available in books, newsletters, internal document analysis, patent research, e-learning, etc. Various machine learning approaches are used in order to match the performance of human-generated summaries. These systems still fall short in a few areas, such as checking grammatical errors and paraphrasing sentences after the summary is created. This work provides a brief view of the methods and approaches used in ATS.

Keyword extraction method for machine reading comprehension based on natural language processing

Journal of Physics Conference Series ◽ 10.1088/1742-6596/1955/1/012072 ◽ 2021 ◽ Vol 1955 (1) ◽ pp. 012072

Author(s): Ruiheng Li ◽ Xuan Zhang ◽ Chengdong Li ◽ Zhongju Zheng ◽ Zihang Zhou ◽ ...

Keyword(s): Reading Comprehension ◽ Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Extraction Method ◽ Keyword Extraction ◽ Machine Reading

Natural Language Processing (NLP) based Text Summarization - A Survey

2021 6th International Conference on Inventive Computation Technologies (ICICT) ◽ 10.1109/icict50816.2021.9358703 ◽ 2021

Author(s): Ishitva Awasthi ◽ Kuntal Gupta ◽ Prabjot Singh Bhogal ◽ Sahejpreet Singh Anand ◽ Piyush Kumar Soni

Keyword(s): Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Text Summarization

Development of algorithm for classification smoking status from unstructured bilingual electronic health records based on natural language processing (Preprint)

10.2196/preprints.26978 ◽ 2021

Author(s): Ye Seul Bae ◽ Kyung Hwan Kim ◽ Han Kyul Kim ◽ Sae Won Choi ◽ Taehoon Ko ◽ ...

Keyword(s): Natural Language Processing ◽ Electronic Health Records ◽ Natural Language ◽ Language Processing ◽ Smoking Status ◽ Svm Classifier ◽ Keyword Extraction ◽ Health Records ◽ Clinical Notes ◽ Electronic Health

BACKGROUND: Smoking is a major risk factor and an important variable for clinical research, but there are few studies on automatically obtaining smoking classification from unstructured bilingual electronic health records (EHRs).

OBJECTIVE: We aim to develop an algorithm to classify smoking status based on unstructured EHRs using natural language processing (NLP).

METHODS: With acronym replacement and the Python package Soynlp, we normalize 4,711 bilingual clinical notes. Each EHR note was classified into 4 categories: current smokers, past smokers, never smokers, and unknown. Subsequently, SPPMI (Shifted Positive Pointwise Mutual Information) is used to vectorize the words in the notes. By calculating the cosine similarity between these word vectors, keywords denoting the same smoking status are identified.

RESULTS: Compared to other keyword extraction methods (word co-occurrence-, PMI-, and NPMI-based methods), our proposed approach improves keyword extraction precision by as much as 20.0%. These extracted keywords are used in classifying 4 smoking statuses from our bilingual clinical notes. Given an identical SVM classifier, the extracted keywords improve the F1 score by as much as 1.8% compared to those of the unigram and bigram Bag of Words.

CONCLUSIONS: Our study shows the potential of SPPMI in classifying smoking status from bilingual, unstructured EHRs. Our current findings show how smoking information can be easily acquired and used for clinical practice and research.
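
The SPPMI step described in the methods can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: co-occurrence is counted once per note (the abstract does not give the context window), the shift constant `shift_k` and the toy English notes are hypothetical, and no Soynlp preprocessing is shown. Each word vector holds max(PMI(w, c) - log k, 0) for every context word c, and keywords are then compared by cosine similarity.

```python
import math
from collections import Counter
from itertools import combinations

def sppmi_vectors(docs, shift_k=1):
    """Build sparse SPPMI word vectors from tokenized notes."""
    pair_counts = Counter()
    for tokens in docs:
        # Count each unordered word pair once per note, in both directions.
        for a, b in combinations(sorted(set(tokens)), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    word_totals = Counter()
    for (w, _), n in pair_counts.items():
        word_totals[w] += n
    total = sum(pair_counts.values())
    vectors = {w: {} for w in word_totals}
    for (w, c), n in pair_counts.items():
        pmi = math.log(n * total / (word_totals[w] * word_totals[c]))
        value = pmi - math.log(shift_k)  # shift by log k
        if value > 0:                    # keep only the positive part
            vectors[w][c] = value
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    num = sum(u[k] * v[k] for k in set(u) & set(v))
    den = (math.sqrt(sum(x * x for x in u.values()))
           * math.sqrt(sum(x * x for x in v.values())))
    return num / den if den else 0.0

# Hypothetical, already-normalized notes used only for illustration.
notes = [
    ["patient", "smoker", "cigarettes"],
    ["patient", "smokes", "cigarettes"],
    ["patient", "never", "smoked"],
    ["patient", "denies", "tobacco"],
]
vectors = sppmi_vectors(notes)
same_status = cosine(vectors["smoker"], vectors["smokes"])
different_status = cosine(vectors["smoker"], vectors["never"])
```

With `shift_k=1` the shift is zero and the vectors reduce to plain positive PMI; a larger `shift_k` prunes weak associations. In this toy data, `smoker` and `smokes` share the same contexts, so they score as more similar to each other than `smoker` and `never`.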

Prediction and Analysis of Extracting Relations using Spacy Model

International Journal of Recent Technology and Engineering - 2 ◽ 10.35940/ijrte.f8524.038620 ◽ 2020 ◽ Vol 8 (6) ◽ pp. 3281-3287

Keyword(s): Natural Language ◽ Information Extraction ◽ Performance Measures ◽ Text Summarization ◽ Language Understanding ◽ Language Generation ◽ Automatic Text Summarization ◽ Structured Information ◽ Automatic Text ◽ F Measure

Text is an extremely rich source of information. Every second, people send or receive hundreds of millions of pieces of data. The tasks involved in NLP include machine learning, information extraction, information retrieval, automatic text summarization, question answering, parsing, sentiment analysis, natural language understanding, and natural language generation. Information extraction is an important task used to find structured information in unstructured or semi-structured text. The paper presents a methodology for extracting relations between biomedical entities using spaCy. The framework consists of the following phases: data creation, loading and converting the data into spaCy objects, preprocessing, defining the pattern, and extracting the relations. The dataset is downloaded from the NCBI database and contains only sentences. The resulting model is evaluated with performance measures such as precision, recall, and F-measure. The model achieved 87% accuracy in retrieving entity relations.
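
The evaluation measures named above can be computed as in the sketch below. The relation triples are made up for illustration and do not come from the paper or the NCBI data; the function name `precision_recall_f1` is likewise an assumption.

```python
def precision_recall_f1(predicted, gold):
    """Compare a set of predicted relations against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical (entity, relation, entity) triples.
gold = {
    ("aspirin", "treats", "headache"),
    ("insulin", "treats", "diabetes"),
    ("BRCA1", "associated_with", "breast cancer"),
}
predicted = {
    ("aspirin", "treats", "headache"),
    ("insulin", "treats", "diabetes"),
    ("aspirin", "treats", "diabetes"),
    ("BRCA1", "causes", "breast cancer"),
}
precision, recall, f1 = precision_recall_f1(predicted, gold)
```

Here two of the four predicted triples are correct and two of the three gold triples are found, so precision is 0.5, recall is 2/3, and the F-measure is 4/7.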

Leveraging Natural Language Processing Applications Using Machine Learning

Handbook of Research on Emerging Trends and Applications of Machine Learning - Advances in Computational Intelligence and Robotics ◽ 10.4018/978-1-5225-9643-1.ch016 ◽ 2020 ◽ pp. 338-360

Author(s): Janjanam Prabhudas ◽ C. H. Pradeep Reddy

Keyword(s): Machine Learning ◽ Deep Learning ◽ Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Text Summarization ◽ Feature Representation ◽ Learning Models ◽ Primary Focus ◽ And Performance

The enormous increase in information, along with the computational abilities of machines, has created innovative applications in natural language processing that invoke machine learning models. This chapter presents the trends in natural language processing that employ machine learning and its models in the context of text summarization. It is organized to help the researcher understand the technical perspectives on feature representation and the associated models to consider before applying them to language-oriented tasks. Further, the chapter reviews the details of the primary deep learning models, their applications, and their performance in the context of language processing. The primary focus of the chapter is to illustrate the technical research findings and gaps in text summarization based on deep learning, along with state-of-the-art deep learning models for text summarization (TS).
