Tools for automatic text summarization in Polish. The state of scientific research and implementation work

e-mentor ◽  
2021 ◽  
Vol 89 (2) ◽  
pp. 67-77
Author(s):  
Piotr Glenc ◽  

The goal of the publication is to present the state of research and work carried out in Poland on automatic text summarization. The author describes the principal theoretical and methodological issues related to automatic summary generation, followed by an outline of selected works on the automatic abstracting of Polish texts. The author also provides three examples of IT tools that generate summaries of texts in Polish (Summarize, Resoomer, and NICOLAS) and their characteristics derived from the conducted experiment, which included quality assessment of the generated summaries using ROUGE-N metrics. The results of both actions showed a deficiency of tools that automatically create summaries of Polish texts, especially in the abstractive approach. Most of the proposed solutions are based on the extractive method, which uses parts of the original text to create its abstract. There is also a shortage of tools that generate one common summary of many text documents and of specialized tools that generate summaries of documents related to specific subject areas. Moreover, it is necessary to intensify work on creating corpora of Polish-language text summaries, which computer scientists could use to evaluate their newly developed tools.


2021 ◽  
Author(s):  
Sakdipat Ontoum ◽  
Jonathan H. Chan

By identifying and extracting relevant information from articles, automated text summarization helps the scientific and medical sectors. Automatic text summarization is a way of compressing text documents so that users can find the important information in the original text in less time. We first review some recent works in the field of summarization that use deep learning approaches, and then we discuss the summarization of "COVID-19" research papers. The ease with which a reader can grasp written text is referred to as readability. In natural language processing, the substance of a text determines its readability. We constructed word clouds from the words most commonly used in the abstracts. We report the mean of three measurements: "ROUGE-1", "ROUGE-2", and "ROUGE-L". On these measures, "Distilbart-mnli-12-6" and "GPT2-large" outperform the other models.


Information ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 78 ◽  
Author(s):  
Tulu Tilahun Hailu ◽  
Junqing Yu ◽  
Tessfu Geteye Fantaye

Text summarization is the process of producing a concise version of a text (a summary) from one or more information sources. If the generated summary preserves the meaning of the original text, it helps users make fast and effective decisions. However, how much of the source text's meaning is preserved is becoming harder to evaluate. The most commonly used automatic evaluation metrics, such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE), rely strictly on the overlapping n-gram units between reference and candidate summaries, which makes them unsuitable for measuring the quality of abstractive summaries. Another major challenge in evaluating text summarization systems is the lack of consistent ideal reference summaries. Studies show that human summarizers can produce variable reference summaries of the same source, which can significantly affect the automatic evaluation scores of summarization systems. Humans are biased by circumstances when producing a summary; even the same person may produce substantially different summaries of the same source at different times. This paper proposes a word-embedding-based automatic text summarization and evaluation framework, which can successfully determine the salient top-n sentences of a source text as a reference summary and evaluate the quality of system summaries against it. Extensive experimental results demonstrate that the proposed framework is effective and able to outperform several baseline methods with regard to both text summarization systems and automatic evaluation metrics when tested on a publicly available dataset.
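
The n-gram overlap that ROUGE relies on can be illustrated with a minimal sketch. This is not the official ROUGE toolkit and not the framework proposed in the paper; the function names `ngrams` and `rouge_n`, the whitespace tokenization, and the example sentences are assumptions made for illustration only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N (hypothetical helper): recall, precision, and F1.

    Assumes lowercased whitespace tokenization and a single reference.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


reference = "the cat sat on the mat"
paraphrase = "a feline rested upon the rug"

# A paraphrase with the same meaning shares almost no n-grams with the reference.
print(rouge_n(paraphrase, reference, n=1))
print(rouge_n(paraphrase, reference, n=2))
```

The paraphrase conveys roughly the same content as the reference yet shares only one unigram and no bigrams with it, which is the weakness for abstractive summaries that the abstract describes.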


2020 ◽  
Vol 9 (2) ◽  
pp. 342
Author(s):  
Amal Alkhudari

Due to the widespread availability of information and the diversity of its sources, there is a need to produce accurate text summaries with the least time and effort. Such a summary must preserve the key information content and overall meaning of the original text. Text summarization is one of the most important applications of Natural Language Processing (NLP). The goal of automatic text summarization is to create summaries that are similar to human-created ones. However, in many cases the readability of the created summaries is not satisfactory, because the summaries do not consider the meaning of the words and do not cover all the semantically relevant aspects of the data. In this paper we use syntactic and semantic analysis to propose an automatic system for Arabic text summarization. This system is capable of understanding the meaning of information and retrieves only the relevant part. The effectiveness of the proposed work is evaluated on the EASC corpus using the ROUGE measure. The generated summaries are compared against those produced by humans and by previous research.


2020 ◽  
Vol 17 (9) ◽  
pp. 4368-4374
Author(s):  
Perpetua F. Noronha ◽  
Madhu Bhan

Huge amounts of digital data are being generated continuously, at an unprecedented and exponential rate. In this digital era, in which the internet is the prime source of information, it is vital to develop better means of mining the available information quickly and effectively. Manually extracting the salient information from large input text documents is time-consuming and inefficient. In this fast-moving world, it is difficult to read all the text content and derive insights from it, so automatic methods are required. Searching for relevant documents among the large number of available sources and extracting the appropriate information from them is a challenging task and the need of the hour. Automatic text summarization techniques can be used to produce relevant, high-quality information in less time. Text summarization condenses the source text into a brief summary while maintaining its salient information and readability. Automatic summary generation is in great demand as a way to cope with the growing amount of text data available online, to pick out the significant information, and to consume it faster. Text summarization is becoming extremely popular with advances in Natural Language Processing (NLP) and deep learning methods. The most important benefit of automatic text summarization is that it reduces analysis time. In this paper we focus on key approaches to automatic text summarization and on their efficiency and limitations.


Author(s):  
Mohamed Amine Boudia ◽  
Reda Mohamed Hamou ◽  
Abdelmalek Amine ◽  
Amine Rahmani

In this paper, the authors propose a new approach to automatic extractive text summarization based on a Saving Energy Function. The first step uses two extraction techniques: phrase scoring, and similarity, which aims to eliminate redundant phrases without losing the theme of the text. The second step aims to optimize the results of the previous layer with a metaheuristic based on the Bee Algorithm. The objective function of the optimization is to maximize the sum of similarities between phrases of the candidate summary, in order to keep the theme of the text, and to minimize the sum of scores, in order to increase the summarization rate. This optimization also yields candidate summaries in which the order of the phrases changes compared to the original text. The third and final layer aims to choose the best summary from the candidate summaries generated by bee optimization; the authors opted for voting with a simple majority.


2022 ◽  
Vol 15 (1) ◽  
pp. 1-18
Author(s):  
Krishnaveni P. ◽  
Balasundaram S. R.

The day-to-day growth of online information necessitates intensive research in automatic text summarization (ATS). ATS software produces a summary by extracting important information from the original text. With the help of summaries, users can easily read and understand the documents of interest. Most ATS approaches have used only local properties of the text. Moreover, the large number of properties makes sentence selection difficult and complicated. This article therefore uses graph-based summarization to exploit structural and global properties of the text. It introduces the maximal clique based sentence selection (MCBSS) algorithm to select important and non-redundant sentences that cover all concepts of the input text for the summary. The MCBSS algorithm finds novel information using maximal cliques (MCs). Experimental results using Recall-Oriented Understudy for Gisting Evaluation (ROUGE) on the Timeline dataset show that the proposed work outperforms the existing graph algorithms Bushy Path (BP), Aggregate Similarity (AS), and TextRank (TR).
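
The general idea of using maximal cliques in a sentence graph can be sketched as follows. This is not the MCBSS algorithm from the article, whose details are not given in the abstract. It is a minimal illustration under stated assumptions: sentences are linked when their Jaccard word overlap reaches a hypothetical `threshold`, maximal cliques are enumerated with the basic Bron-Kerbosch procedure, and one representative sentence is kept per clique to limit redundancy. All function names are hypothetical.

```python
def jaccard(a, b):
    """Word-set overlap between two sentences (assumed similarity measure)."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def build_graph(sentences, threshold=0.3):
    """Link sentence indices whose similarity reaches the threshold."""
    adj = {i: set() for i in range(len(sentences))}
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if jaccard(sentences[i], sentences[j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in sorted(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def select_sentences(sentences, threshold=0.3):
    """Keep one representative per clique of mutually similar sentences."""
    adj = build_graph(sentences, threshold)
    chosen = set()
    for clique in maximal_cliques(adj):
        if not chosen.intersection(clique):
            # Prefer the best-connected sentence; break ties by earliest position.
            chosen.add(max(clique, key=lambda v: (len(adj[v]), -v)))
    return [sentences[i] for i in sorted(chosen)]


sentences = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "stock prices rose sharply today",
    "stock prices rose today",
]
print(select_sentences(sentences))
```

In this toy input the two near-duplicate pairs form two cliques, so one sentence from each pair is kept. A real system would use a stronger similarity measure and a more careful selection rule.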


2021 ◽  
Vol 11 (22) ◽  
pp. 10511
Author(s):  
Muhammad Mohsin ◽  
Shazad Latif ◽  
Muhammad Haneef ◽  
Usman Tariq ◽  
Muhammad Attique Khan ◽  
...  

Automatic Text Summarization (ATS) is gaining attention because a large volume of data is being generated at an exponential rate. Because internet access is easily available globally, a large amount of data is being generated by social networking websites, news websites, and blogs. Manual summarization is time-consuming, and it is difficult to read and summarize a large amount of content. Automatic text summarization is a solution to this problem. This study proposes two automatic text summarization models: Genetic Algorithm with Hierarchical Clustering (GA-HC) and Particle Swarm Optimization with Hierarchical Clustering (PSO-HC). The proposed models use a word embedding model with a hierarchical clustering algorithm to group sentences that convey almost the same meaning. Modified GA and adaptive PSO based sentence ranking models are proposed for summarizing news text documents. Simulations are conducted and compared with the other algorithms under study to evaluate the performance of the proposed methodology. The simulation results validate the superior performance of the proposed methodology.


To read and search information quickly, documents need to be reduced in size without changing their content. Automatic text summarization addresses this problem: it generates summaries by condensing large input documents into smaller ones without losing their meaning or their relevance to the original document. Text summarization means shortening a text into accurate, meaningful sentences. The paper presents an implementation that summarizes the original document by scoring each sentence using a term frequency and inverse document frequency matrix. The entire record was compressed so that only the relevant sentences in the document were retained. This technique is applicable in various applications, such as automated processing of text documents and quicker understanding of documents through summarization.
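
Sentence scoring with term frequency and inverse document frequency can be sketched roughly as below. This is not the paper's implementation. It assumes that each sentence is treated as its own "document" when computing inverse document frequency, uses a simple regex-based sentence splitter and tokenizer, and scores a sentence by its length-normalized sum of TF-IDF weights. The function names and the parameter `k` are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_summary(text, k=2):
    """Return the k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Document frequency: number of sentences in which each term appears.
    df = Counter(term for toks in tokens for term in set(toks))
    scores = []
    for toks in tokens:
        tf = Counter(toks)
        score = sum(
            (count / len(toks)) * math.log(n / df[term])
            for term, count in tf.items()
        )
        scores.append(score)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


text = (
    "Summarization shortens text. Summarization shortens text. "
    "Term frequency weights rare informative words highly."
)
print(tfidf_summary(text, k=1))
```

With this weighting, terms that occur in every sentence contribute nothing, so sentences made up of rarer terms rank higher. Whether that matches the paper's exact scoring scheme cannot be determined from the abstract.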

