sentence extraction
Recently Published Documents

TOTAL DOCUMENTS: 108 (five years: 23)
H-INDEX: 14 (five years: 1)

2021
Author(s): G. Vijay Kumar, Arvind Yadav, B. Vishnupriya, M. Naga Lahari, J. Smriti, ...

In this digitalized era, a large amount of digital data is available on the internet for different purposes, and it is very hard to summarize this data manually. Automatic Text Summarization (ATS) addresses this problem by producing a short version of the source text that preserves its content and overall meaning. Although the concept of ATS dates back to the 1950s, the field is still struggling to produce the best and most efficient summaries. ATS follows two methods, extractive and abstractive summarization, each with its own process for improving the summarization technique. Text summarization can be implemented with the NLP packages and methods available in Python. Different approaches exist for summarizing text, along with several algorithms with which it can be implemented. TextRank is an unsupervised learning algorithm for extractive text summarization; it operates on undirected, weighted graphs and supports both keyword extraction and sentence extraction. So, in this paper, a model is built to obtain better text summarization results using the Gensim NLP library. This method preserves the overall meaning of the text, so the person reading the summary can understand it better.
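For illustration, a minimal sketch of TextRank-based extractive summarization with the Gensim library, assuming gensim < 4.0 (where the summarization module is still available); the sample text, summary ratio, and keyword count are placeholders rather than the paper's configuration.

# TextRank-style extractive summarization and keyword extraction with Gensim.
# Requires gensim < 4.0, where gensim.summarization is still shipped.
from gensim.summarization import summarize, keywords

document = (
    "Automatic text summarization condenses a source document into a shorter version. "
    "Extractive methods select the most important sentences from the original text. "
    "TextRank builds a weighted, undirected graph of sentences and ranks them, "
    "similar to how PageRank ranks web pages. "
    "The highest-ranked sentences are then returned as the summary."
)

# Return roughly 40% of the original sentences as the extractive summary.
print(summarize(document, ratio=0.4))

# The same graph-based ranking can also be applied to individual words.
print(keywords(document, words=5))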


2021
Author(s): Qianying Wang, Jing Liao, Mirella Lapata, Malcolm Macleod

Objective: We sought to apply natural language processing to the task of automatic risk of bias assessment in preclinical literature, which could speed the process of systematic review, provide information to guide research improvement activity, and support translation from preclinical to clinical research. Materials and Methods: We use 7,840 full-text publications describing animal experiments with yes/no annotations for five risk of bias items. We implement a series of models including baselines (support vector machine, logistic regression, random forest), neural models (convolutional neural network, recurrent neural network with attention, hierarchical neural network) and models using BERT with two strategies (document chunk pooling and sentence extraction). We tune hyperparameters to obtain the highest F1 scores for each risk of bias item on the validation set and compare evaluation results on the test set to our previous regular expression approach. Results: The F1 scores of the best models on the test set are 82.0% for random allocation, 81.6% for blinded assessment of outcome, 82.6% for conflict of interests, 91.4% for compliance with animal welfare regulations and 46.6% for reporting animals excluded from analysis. Our models significantly outperform regular expressions for four risk of bias items. Conclusion: For random allocation, blinded assessment of outcome, conflict of interests and animal exclusions, neural models achieve good performance, and for animal welfare regulations, the BERT model with the sentence extraction strategy works better.
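A rough sketch of the document chunk pooling idea for classifying long full-text papers with BERT, using the Hugging Face transformers library: the document is split into fixed-size token chunks, each chunk is encoded, the [CLS] embeddings are averaged, and a linear head predicts the yes/no label. The model name, chunk size, and untrained classifier head are illustrative assumptions, not the authors' exact configuration.

# Document chunk pooling for long-document yes/no classification (illustrative sketch).
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
classifier = torch.nn.Linear(encoder.config.hidden_size, 2)  # one yes/no risk-of-bias item

def classify_document(text, chunk_tokens=510):
    # Tokenize the whole document once, then slice the token ids into fixed-size chunks.
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    cls_id, sep_id = tokenizer.cls_token_id, tokenizer.sep_token_id
    cls_vectors = []
    with torch.no_grad():
        for i in range(0, len(ids), chunk_tokens):
            chunk = [cls_id] + ids[i:i + chunk_tokens] + [sep_id]
            input_ids = torch.tensor([chunk])              # shape (1, chunk_len)
            attention_mask = torch.ones_like(input_ids)
            outputs = encoder(input_ids=input_ids, attention_mask=attention_mask)
            cls_vectors.append(outputs.last_hidden_state[:, 0, :])   # [CLS] embedding
    pooled = torch.mean(torch.cat(cls_vectors, dim=0), dim=0)        # average over chunks
    return torch.softmax(classifier(pooled), dim=-1)                 # [P(no), P(yes)]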


2021
pp. 016555152199275
Author(s): Juryong Cheon, Youngjoong Ko

Translation language resources, such as bilingual word lists and parallel corpora, are important factors affecting the effectiveness of cross-language information retrieval (CLIR) systems. In particular, when large domain-appropriate parallel corpora are not available, developing an effective CLIR system is especially difficult. Furthermore, creating a large parallel corpus is costly and requires considerable effort. Therefore, we here demonstrate the construction of parallel corpora from Wikipedia as well as improved query translation, wherein the queries are used for a CLIR system. To do so, we first constructed a bilingual dictionary, termed WikiDic. Then, we evaluated individual language resources and combinations of them in terms of their ability to extract parallel sentences; the combination of our proposed WikiDic with the translation probability from the Web’s bilingual example sentence pairs was found to be best suited to parallel sentence extraction. Finally, to evaluate the parallel corpus generated from this best combination of language resources, we compared its performance in query translation for CLIR to that of a manually created English–Korean parallel corpus. As a result, the corpus generated by our proposed method achieved a better performance than did the manually created corpus, thus demonstrating the effectiveness of the proposed method for automatic parallel corpus extraction. Not only can the method demonstrated herein be used to inform the construction of other parallel corpora from language resources that are readily available, but also the parallel sentence extraction method will naturally improve as Wikipedia continues to be used and its content develops.
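A minimal sketch of how a bilingual dictionary such as WikiDic might be used to score candidate sentence pairs from linked article pairs: each English sentence is matched to the Korean sentence whose words cover the most dictionary translations, and pairs above a threshold are kept. The coverage score, whitespace tokenization, and threshold are illustrative assumptions, not the paper's actual extraction method.

# Dictionary-based parallel sentence extraction from comparable articles (illustrative sketch).
def translation_coverage(en_sentence, ko_sentence, dictionary):
    """Fraction of English tokens with at least one dictionary translation in ko_sentence."""
    en_tokens = en_sentence.lower().split()
    ko_tokens = set(ko_sentence.split())
    covered = sum(
        1 for tok in en_tokens
        if any(trans in ko_tokens for trans in dictionary.get(tok, []))
    )
    return covered / len(en_tokens) if en_tokens else 0.0

def extract_parallel_sentences(en_sentences, ko_sentences, dictionary, threshold=0.5):
    """Greedily pick the best-scoring Korean sentence for each English sentence."""
    pairs = []
    for en in en_sentences:
        best = max(ko_sentences, default=None,
                   key=lambda ko: translation_coverage(en, ko, dictionary))
        if best is not None and translation_coverage(en, best, dictionary) >= threshold:
            pairs.append((en, best))
    return pairs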


Information
2021
Vol 12 (1)
pp. 41
Author(s): K. Manju, S. David Peter, Sumam Idicula

Automatic extractive text summarization retrieves a subset of data that represents the most notable sentences in the entire document. In the era of digital explosion, where data is mostly unstructured text, users need to understand huge amounts of text in a short time; this creates the need for an automatic text summarizer. From a summary, users get an idea of the entire content of a document and can decide whether or not to read it in full. This work mainly focuses on generating a summary from multiple news documents. In this case, the summary helps to reduce the redundant news across different newspapers. A multi-document summary is more challenging than a single-document summary, since it has to solve the problem of overlapping information among sentences from different documents. Extractive text summarization yields the most salient parts of the document by neglecting irrelevant and redundant sentences. In this paper, we propose a framework for extracting a summary from multiple documents in the Malayalam language. Also, since the multi-document summarization data set is sparse, methods based on deep learning are difficult to apply. The proposed work discusses the performance of existing standard algorithms in multi-document summarization of the Malayalam language. We propose a sentence extraction algorithm that selects the top-ranked sentences with maximum diversity. The system is found to perform well in terms of precision, recall, and F-measure on multiple input documents.
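One common way to realize "top-ranked sentences with maximum diversity" is maximal marginal relevance (MMR). The following is a minimal sketch over TF-IDF sentence vectors; the relevance measure, scikit-learn dependency, and lambda weight are illustrative assumptions rather than the paper's exact algorithm.

# MMR-style sentence selection: repeatedly pick the sentence most relevant to the document
# that is least similar to the sentences already chosen (illustrative sketch).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def select_diverse_sentences(sentences, num_sentences=3, lam=0.7):
    vectorizer = TfidfVectorizer()
    vectors = vectorizer.fit_transform(sentences)                # one TF-IDF row per sentence
    doc_vector = np.asarray(vectors.mean(axis=0))                # centroid of the document
    relevance = cosine_similarity(vectors, doc_vector).ravel()   # sentence vs. whole document
    pairwise = cosine_similarity(vectors)                        # sentence vs. sentence

    selected, candidates = [], list(range(len(sentences)))
    while candidates and len(selected) < num_sentences:
        def mmr(i):
            redundancy = max(pairwise[i][j] for j in selected) if selected else 0.0
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]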


Author(s): Chantana Chantrapornchai, Aphisit Tunsakul

In this paper, we present two methodologies for extracting particular information from the full text returned by a search engine in order to facilitate the users. The approaches are based on three tasks: named entity recognition (NER), text classification, and text summarization. The first step is building the training data and data cleansing. We consider the tourism domain, with restaurant, hotel, shopping, and tourism data sets crawled from websites. First, the tourism data are gathered and the vocabularies are built. Several minor steps, including sentence extraction and relation and named entity extraction for tagging purposes, are needed to create proper training data. Then, the recognition model for a given entity type can be built. In the experiments, given review texts, we demonstrate how to build models that extract the desired entities, i.e., name, location, and facility, as well as the relation type, and that classify or summarize the reviews. Two tools, SpaCy and BERT, are used to compare the performance of these tasks.
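A minimal sketch of the NER step with spaCy's pretrained English pipeline; the model name and example review text are illustrative, and a tourism-specific model would normally be trained on the tagged restaurant/hotel data described above.

# Named entity recognition over a review sentence with a pretrained spaCy pipeline.
import spacy

nlp = spacy.load("en_core_web_sm")  # small pretrained English pipeline (assumed installed)

review = "We stayed at the Grand Palace Hotel in Bangkok and had dinner at Blue Elephant."
doc = nlp(review)

# Print each recognized entity with its label, e.g. ORG, GPE, FAC.
for ent in doc.ents:
    print(ent.text, ent.label_)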


Author(s): Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, ...
