Automated Kannada Text Summarization using Sentence features

Demand for text summarization is growing because of the difficulty of managing the exponential increase in information accessible on the World Wide Web. Text summarization condenses the contents of an original text into a shorter form that still provides the important information to the user. The summarizer presented in this paper produces extractive summaries of Kannada text documents. The proposed system considers five features to determine the important sentences in a document: Term Frequency, Term Frequency-Inverse Sentence Frequency, Keywords, Sentence length and Sentence position. The value of each feature is computed, and the score of each sentence is the average of its feature scores. The sentences with the top scores are selected for inclusion in the extractive summary. The results of the proposed model are evaluated with the ROUGE toolkit, measuring performance by the F-score of the generated summaries. Experimental studies on a custom-built dataset of 50 Kannada text documents show significantly better performance in producing extractive summaries as compared to human summaries.
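The scoring scheme described above (five feature values per sentence, averaged, top sentences kept) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the feature formulas, so the max-normalization, the whitespace tokenizer, the choice of the most frequent terms as keywords, and the names `score_sentences` and `summarize` are all assumptions.

```python
import math
from collections import Counter

def tokenize(sentence):
    # Assumed tokenizer: whitespace split with surrounding punctuation stripped
    # (no stemming or stop-word removal).
    words = (w.strip(".,!?;:\"'()") for w in sentence.lower().split())
    return [w for w in words if w]

def normalize(values):
    # Scale a feature to [0, 1] by its maximum so the five features are comparable.
    top = max(values)
    return [v / top if top > 0 else 0.0 for v in values]

def score_sentences(sentences, num_keywords=5):
    if not sentences:
        return []
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    doc_tf = Counter(w for t in tokens for w in t)            # term frequency in the document
    sent_freq = Counter(w for t in tokens for w in set(t))    # number of sentences containing a term
    keywords = {w for w, _ in doc_tf.most_common(num_keywords)}  # assumed keyword definition

    tf = normalize([sum(doc_tf[w] for w in t) for t in tokens])
    tf_isf = normalize([sum(math.log(n / sent_freq[w]) for w in t) for t in tokens])
    keyword = normalize([sum(1 for w in t if w in keywords) for t in tokens])
    length = normalize([len(t) for t in tokens])
    position = [(n - i) / n for i in range(n)]                # earlier sentences score higher (assumed)

    # Sentence score = plain average of the five feature values.
    return [sum(f) / 5 for f in zip(tf, tf_isf, keyword, length, position)]

def summarize(sentences, k):
    scores = score_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]                # keep original document order
```

Calling `summarize(sentences, k)` returns the `k` highest-scoring sentences in their original order. Kannada text would need a language-aware sentence splitter and tokenizer in place of the simple one assumed here.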

2015 ◽  
Vol 5 (1) ◽  
pp. 36-47 ◽  
Author(s):  
Rasmita Rautray ◽  
Rakesh Chandra Balabantaray ◽  
Anisha Bhardwaj

Owing to the exponential growth of information available electronically, there is an increasing demand for text summarization. Text summarization is the process of extracting the contents of the original text in a shorter form that provides useful information to the user. This paper presents a summarizer that produces summaries while reducing redundant information and maximizing summary relevancy. The proposed model takes several features into account, including title feature, sentence weight, term weight, sentence position, inter-sentence similarity, proper noun, thematic word and numerical data. The score of each feature for the model can be obtained from the document sets. The results of such models are evaluated to measure their performance based on the F-score of extracted sentences at a 20% compression rate on the C-50 data corpus. In experimental studies on the C-50 data corpus, the PSO summarizer shows significantly better performance than the other summarizers.
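The evaluation described above, an F-score over extracted sentences at a fixed compression rate, can be illustrated with a short sketch. The helper names `summary_size` and `f_score`, the rounding rule, and the representation of summaries as sets of sentence indices are assumptions made for illustration; the abstract does not specify them.

```python
def summary_size(num_sentences, compression_rate=0.2):
    # Number of sentences to keep at the given compression rate.
    # Rounding to the nearest integer (and keeping at least one) is an assumption.
    return max(1, round(num_sentences * compression_rate))

def f_score(extracted, reference):
    # extracted / reference: iterables of sentence indices chosen by the
    # system and by the reference (human) summary respectively.
    extracted, reference = set(extracted), set(reference)
    overlap = len(extracted & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(extracted)   # share of extracted sentences that are correct
    recall = overlap / len(reference)      # share of reference sentences that were found
    return 2 * precision * recall / (precision + recall)
```

For a 10-sentence document, `summary_size(10)` keeps 2 sentences; if one of the two extracted sentences also appears in a two-sentence reference, precision and recall are both 0.5 and so is the F-score.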


To read and search information quickly, documents need to be reduced in size without changing their content. Automatic text summarization addresses this need: it generates summaries by condensing large input documents into smaller ones without losing their meaning or their relevance to the original document. Text summarization means shortening a text into accurate, meaningful sentences. The paper shows an implementation that summarizes the original document by scoring each sentence using a term frequency and inverse document frequency matrix. The entire record was compressed so that only the relevant sentences in the document were retained. This technique is applicable in various applications, such as automating text documents and quicker understanding of documents through summarization.
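A term frequency and inverse document frequency matrix of the kind mentioned above can be sketched as follows. This is an illustrative sketch only: treating each sentence as a "document" for the IDF term, the length-normalized term frequency, and the function names `tfidf_matrix` and `rank_sentences` are assumptions, since the abstract does not describe how the matrix is built.

```python
import math
from collections import Counter

def tfidf_matrix(sentences):
    # Rows are sentences, columns are vocabulary terms (sorted alphabetically).
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({w for t in tokenized for w in t})
    n = len(tokenized)
    # Assumption: each sentence plays the role of a document when counting
    # how many "documents" contain a term.
    df = Counter(w for t in tokenized for w in set(t))
    matrix = []
    for t in tokenized:
        counts = Counter(t)
        row = [(counts[w] / len(t)) * math.log(n / df[w]) if t else 0.0
               for w in vocab]
        matrix.append(row)
    return vocab, matrix

def rank_sentences(sentences):
    # Sentence score = sum of its TF-IDF row; indices are returned best first.
    _, matrix = tfidf_matrix(sentences)
    scores = [sum(row) for row in matrix]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
```

A term that appears in every sentence gets an IDF of zero and so contributes nothing to any score, which is what pushes sentences with distinctive vocabulary to the top of the ranking.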


2020 ◽  
Vol 10 (16) ◽  
pp. 5422 ◽  
Author(s):  
So-Eon Kim ◽  
Nazira Kaibalina ◽  
Seong-Bae Park

The advent of the sequence-to-sequence model and the attention mechanism has increased the comprehension and readability of automatically generated summaries. However, most previous studies on text summarization have focused on generating or extracting sentences only from an original text, even though every text has a latent topic category. That is, even if a topic category helps improve the summarization quality, there have been no efforts to utilize such information in text summarization. Therefore, this paper proposes a novel topical category-aware neural text summarizer which is differentiated from legacy neural summarizers in that it reflects the topic category of an original text into generating a summary. The proposed summarizer adopts the class activation map (CAM) as topical influence of the words in the original text. Since the CAM excerpts the words relevant to a specific category from the text, it allows the attention mechanism to be influenced by the topic category. As a result, the proposed neural summarizer reflects the topical information of a text as well as the content information into a summary by combining the attention mechanism and CAM. The experiments on The New York Times Annotated Corpus show that the proposed model outperforms the legacy attention-based sequence-to-sequence model, which proves that it is effective at reflecting a topic category into automatic summarization.


Author(s):  
Kishore Kumar Mamidala et al.

Extracting or abstracting a condensed form of an original text document while retaining its information and complete meaning is known as text summarization. The creation of manual summaries from large text documents is difficult and time-consuming for humans. Text summarization has become an important and challenging area in natural language processing. This paper presents a heuristic approach to extract a summary of e-news articles in the Telugu language. The method proposes new lexical parameter-based information extraction (IE) rules for scoring the sentences. Event Score and Named Entity Score are novel components of the sentence scoring, used to identify the essential information in the text. Sentences are selected for the summary depending on the frequency of occurrence of events and named entities in the sentence and in the document. For the experiments, data is collected from online news sources (i.e., Eenadu, Sakshi, Andhra Jyothi, Namaste Telangana). The proposed method is compared with other techniques developed for Telugu text summarization. Evaluation metrics such as precision, recall, and F1 score are used to measure the proposed method's performance. An extensive statistical and qualitative evaluation of the system's summaries has been conducted using Recall-Oriented Understudy for Gisting Evaluation (ROUGE), a standard summary evaluation tool. The results showed improved performance compared to other methods.
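The precision, recall, and F1 figures mentioned above can be illustrated with a simplified unigram-overlap measure in the spirit of ROUGE-1. This is a sketch under assumptions, not the ROUGE toolkit itself: the function name `rouge_1` and the whitespace tokenization are illustrative choices, and the real toolkit offers further variants and options not modeled here.

```python
from collections import Counter

def rouge_1(candidate, reference):
    # Unigram counts for the system summary and the reference summary.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1
```

For the candidate "the cat sat on the mat" against the reference "the cat lay on the mat", five of the six unigrams overlap, so precision, recall, and F1 all equal 5/6.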


2022 ◽  
Vol 19 (1) ◽  
pp. 1719
Author(s):  
Saravanan Arumugam ◽  
Sathya Bama Subramani

With the increase in the amount of data and documents on the web, text summarization has become one of the significant fields which cannot be avoided in today’s digital era. Automatic text summarization provides a quick summary to the user based on the information presented in the text documents. This paper presents automated single-document summarization by constructing similitude graphs from the extracted text segments. After the text segments are extracted, feature values are computed for all segments by comparing them with the title and the entire document and by computing segment significance using the information gain ratio. Based on the computed features, the similarity between the segments is evaluated to construct a graph in which the vertices are the segments and the edges specify the similarity between them. The segments are ranked for inclusion in the extractive summary by computing the graph score and the sentence segment score. The experimental analysis has been performed using ROUGE metrics, and the results are analyzed for the proposed model. The proposed model has been compared with various existing models on 4 different datasets, where it ranked in the top 2 positions by average rank computed over metrics such as precision, recall and F-score. Highlights: The paper presents automated single-document summarization by constructing similitude graphs from the extracted text segments. It utilizes information gain ratio, graph construction, graph score and sentence segment score computation. Results analysis has been performed using ROUGE metrics with 4 popular datasets in the document summarization domain. The model ranked in the top 2 positions by average rank computed over metrics such as precision, recall and F-score.
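The graph construction described above, with segments as vertices and similarity-weighted edges, can be sketched as follows. This is a rough illustration under stated assumptions: cosine similarity over bag-of-words counts stands in for the paper's feature-based similarity (title comparison and information gain ratio are not reproduced), the graph score is taken to be the sum of a vertex's edge weights, and the names `build_graph`, `graph_scores` and the `threshold` parameter are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def build_graph(segments, threshold=0.1):
    # Vertices are segment indices; an undirected weighted edge joins two
    # segments whose similarity reaches the (assumed) threshold.
    vectors = [Counter(s.lower().split()) for s in segments]
    n = len(segments)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine(vectors[i], vectors[j])
            if sim >= threshold:
                graph[i][j] = sim
                graph[j][i] = sim
    return graph

def graph_scores(graph):
    # Assumed graph score: weighted degree, i.e. the sum of a vertex's edge weights.
    return {node: sum(edges.values()) for node, edges in graph.items()}
```

Segments that share vocabulary with many other segments accumulate a higher score, while a segment with no sufficiently similar neighbor stays isolated with a score of zero.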


2020 ◽  
pp. 276-289
Author(s):  
Mobina Fathi ◽  
Kimia Vakili ◽  
Niloofar Deravi

Around the end of December 2019, a new beta-coronavirus from Wuhan City, Hubei Province, China began to spread rapidly. The new virus, called SARS-CoV-2, which could be transmitted through respiratory droplets, caused a range of mild to severe symptoms, from a simple cold in some cases to death in others. The disease caused by SARS-CoV-2 was named COVID-19 by WHO and has so far killed more people than SARS and MERS. Following the widespread global outbreak of COVID-19, with more than 132758 confirmed cases and 4955 deaths worldwide, the World Health Organization declared COVID-19 a pandemic in March 2020. Earlier studies on viral pneumonia epidemics have shown that pregnant women are at greater risk than others. During pregnancy, women are more prone to infectious diseases. Research on both SARS-CoV and MERS-CoV, which are pathologically similar to SARS-CoV-2, has shown that being infected with these viruses during pregnancy increases the risk of maternal death, stillbirth, intrauterine growth retardation, and preterm delivery. With the exponential increase in cases of COVID-19 throughout the world, there is a need to understand the effects of SARS-CoV-2 on the health of pregnant women through extrapolation of earlier studies conducted on pregnant women infected with SARS-CoV and MERS-CoV. There is an urgent need to understand the chance of vertical transmission of SARS-CoV-2 from mother to fetus and the possibility of the virus crossing the placental barrier. Additionally, since some viral diseases and antiviral drugs may have a negative impact on the mother and fetus, pregnant women need special attention in the prevention, diagnosis, and treatment of COVID-19.


2014 ◽  
Vol 6 (1) ◽  
pp. 1032-1035 ◽  
Author(s):  
Ramzi Suleiman

The research on quasi-luminal neutrinos has sparked several experimental studies for testing the "speed of light limit" hypothesis. To date, the overall evidence favors the "null" hypothesis, stating that there is no significant difference between the observed velocities of light and neutrinos. Despite numerous theoretical models proposed to explain the neutrinos' behavior, no attempt has been undertaken to predict the experimentally produced results. This paper presents a simple novel extension of Newton's mechanics to the domain of relativistic velocities. For a typical neutrino-velocity experiment, the proposed model is utilized to derive a general expression for . Comparison of the model's prediction with the results of six neutrino-velocity experiments, conducted by five collaborations, reveals that the model predicts all the reported results with striking accuracy. Because the direction of the neutrino flight matters in the proposed model, the model's impressive success in accounting for all the tested data indicates a complete collapse of the Lorentz symmetry principle in situations involving quasi-luminal particles moving in two opposite directions. This conclusion is supported by previous findings showing that a Sagnac effect identical to the one documented for radial motion also occurs in linear motion.


Author(s):  
Samuel Richardson

‘Pamela under the Notion of being a Virtuous Modest Girl will be introduced into all Families, and when she gets there, what Scenes does she represent? Why a fine young Gentleman endeavouring to debauch a beautiful young Girl of Sixteen.’ (Pamela Censured, 1741) One of the most spectacular successes of the burgeoning literary marketplace of eighteenth-century London, Pamela also marked a defining moment in the emergence of the modern novel. In the words of one contemporary, it divided the world ‘into two different Parties, Pamelists and Antipamelists’, even eclipsing the sensational factional politics of the day. Preached up for its morality, and denounced as pornography in disguise, it vividly describes a young servant’s long resistance to the attempts of her predatory master to seduce her. Written in the voice of its low-born heroine, but by a printer who fifteen years earlier had narrowly escaped imprisonment for the seditious output of his press, Pamela is not only a work of pioneering psychological complexity, but also a compelling and provocative study of power and its abuse. Based on the original text of 1740, from which Richardson later retreated in a series of defensive revisions, this edition makes available the version of Pamela that aroused such widespread controversy on its first appearance.


Author(s):  
Irfan Ullah Khan ◽  
Nida Aslam ◽  
Malak Aljabri ◽  
Sumayh S. Aljameel ◽  
Mariam Moataz Aly Kamaleldin ◽  
...  

The COVID-19 outbreak is currently one of the biggest challenges facing countries around the world. Millions of people have lost their lives due to COVID-19. Therefore, the accurate early detection and identification of severe COVID-19 cases can reduce the mortality rate and the likelihood of further complications. Machine Learning (ML) and Deep Learning (DL) models have been shown to be effective in the detection and diagnosis of several diseases, including COVID-19. This study used ML algorithms, such as Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and K-Nearest Neighbor (KNN), and a DL model (containing six layers with ReLU and an output layer with sigmoid activation) to predict the mortality rate in COVID-19 cases. Models were trained using data on confirmed COVID-19 patients from 146 countries. Comparative analysis was performed among the ML and DL models using a reduced feature set. The best results were achieved using the proposed DL model, with an accuracy of 0.97. Experimental results reveal the significance of the proposed model over the baseline study in the literature with the reduced feature set.

