text documents Latest Research Papers

Differentiating geographic movement described in text documents

Transactions in GIS ◽

10.1111/tgis.12893 ◽

2022 ◽

Author(s):

Scott Pezanowski ◽

Alan M. MacEachren ◽

Prasenjit Mitra

Keyword(s):

Text Documents

A Reconstruction Method for Cross-Cut Shredded Documents Based on the Extreme Learning Machine Algorithm

10.21203/rs.3.rs-1143560/v1 ◽

2022 ◽

Author(s):

Zhenghui Zhang ◽

Juan Zou ◽

Jinhua Zheng ◽

Shengxiang Yang ◽

Dunwei Gong ◽

...

Keyword(s):

Extreme Learning Machine ◽

Intelligent Systems ◽

Measurement Model ◽

Computer Assisted ◽

Reconstruction Method ◽

Matching Accuracy ◽

Text Documents ◽

Evidence Collection ◽

Learning Machine ◽

Consensus Information

Abstract Reconstruction of cross-cut shredded text documents (RCCSTD) has important applications for information security and judicial evidence collection. The traditional method of manual construction is a very time-consuming task, so the use of computer-assisted efficient reconstruction is a crucial research topic. Fragment consensus information extraction and fragment pair compatibility measurement are two fundamental processes in RCCSTD. Due to the limitations of the existing classical methods of these two steps, only documents with specific structures or characteristics can be spliced, and pairing error is larger when the cutting is more fine-grained. In order to reconstruct the fragments more effectively, this paper improves the extraction method for consensus information and constructs a new global pairwise compatibility measurement model based on the extreme learning machine algorithm. The purpose of the algorithm's design is to exploit all available information and computationally suggest matches to increase the algorithm's ability to discriminate between data in various complex situations, then find the best neighbor of each fragment for splicing according to pairwise compatibility. The overall performance of our approach in several practical experiments is illustrated. The results indicate that the matching accuracy of the proposed algorithm is better than that of the previously published classical algorithms and still ensures a higher matching accuracy in the noisy datasets, which can provide a feasible method for RCCSTD intelligent systems in real scenarios.

Similitude Based Segment Graph Construction and Segment Ranking for Automatic Summarization of Text Document

Trends in Sciences ◽

10.48048/tis.2022.1719 ◽

2022 ◽

Vol 19 (1) ◽

pp. 1719

Author(s):

Saravanan Arumugam ◽

Sathya Bama Subramani

Keyword(s):

Information Gain ◽

Text Summarization ◽

Text Documents ◽

Document Summarization ◽

Gain Ratio ◽

Average Rank ◽

Text Document ◽

Proposed Model ◽

Feature Values ◽

Information Gain Ratio

With the increase in the amount of data and documents on the web, text summarization has become one of the significant fields which cannot be avoided in today’s digital era. Automatic text summarization provides a quick summary to the user based on the information presented in the text documents. This paper presents the automated single document summarization by constructing similitude graphs from the extracted text segments. On extracting the text segments, the feature values are computed for all the segments by comparing them with the title and the entire document and by computing segment significance using the information gain ratio. Based on the computed features, the similarity between the segments is evaluated to construct the graph in which the vertices are the segments and the edges specify the similarity between them. The segments are ranked for including them in the extractive summary by computing the graph score and the sentence segment score. The experimental analysis has been performed using ROUGE metrics and the results are analyzed for the proposed model. The proposed model has been compared with the various existing models using 4 different datasets in which the proposed model acquired top 2 positions with the average rank computed on various metrics such as precision, recall, F-score. HIGHLIGHTS Paper presents the automated single document summarization by constructing similitude graphs from the extracted text segments It utilizes information gain ratio, graph construction, graph score and the sentence segment score computation Results analysis has been performed using ROUGE metrics with 4 popular datasets in the document summarization domain The model acquired top 2 positions with the average rank computed on various metrics such as precision, recall, F-score GRAPHICAL ABSTRACT

Automatic Text Summaration of COVID-19 Scientific Research Topics Using Pre-trained Model from HuggingFace®

10.36227/techrxiv.17693702.v1 ◽

2021 ◽

Author(s):

Sakdipat Ontoum ◽

Jonathan H. Chan

Keyword(s):

Language Processing ◽

Relevant Information ◽

Original Text ◽

Learning Approaches ◽

Text Documents ◽

Written Text ◽

Automatic Text Summarization ◽

Word Clouds ◽

Readability Test ◽

Automatic Text

By identifying and extracting relevant information from articles, automated text summarizing helps the scientific and medical sectors. Automatic text summarization is a way of compressing text documents so that users may find important information in the original text in less time. We will first review some new works in the field of summarizing that use deep learning approaches, and then we will explain the "COVID-19" summarization research papers. The ease with which a reader can grasp written text is referred to as the readability test. The substance of text determines its readability in natural language processing. We constructed word clouds using the abstract's most commonly used text. By looking at those three measurements, we can determine the mean of "ROUGE-1", "ROUGE-2", and "ROUGE-L". As a consequence, "Distilbart-mnli-12-6" and "GPT2-large" are outperform than other. <br>

Automatic Text Summaration of COVID-19 Scientific Research Topics Using Pre-trained Model from HuggingFace®

10.36227/techrxiv.17693702 ◽

2021 ◽

Author(s):

Sakdipat Ontoum ◽

Jonathan H. Chan

Keyword(s):

Language Processing ◽

Relevant Information ◽

Original Text ◽

Learning Approaches ◽

Text Documents ◽

Written Text ◽

Automatic Text Summarization ◽

Word Clouds ◽

Readability Test ◽

Automatic Text

By identifying and extracting relevant information from articles, automated text summarizing helps the scientific and medical sectors. Automatic text summarization is a way of compressing text documents so that users may find important information in the original text in less time. We will first review some new works in the field of summarizing that use deep learning approaches, and then we will explain the "COVID-19" summarization research papers. The ease with which a reader can grasp written text is referred to as the readability test. The substance of text determines its readability in natural language processing. We constructed word clouds using the abstract's most commonly used text. By looking at those three measurements, we can determine the mean of "ROUGE-1", "ROUGE-2", and "ROUGE-L". As a consequence, "Distilbart-mnli-12-6" and "GPT2-large" are outperform than other. <br>

A Survey on Aspect-Based Sentiment Classification

ACM Computing Surveys ◽

10.1145/3503044 ◽

2021 ◽

Author(s):

Gianni Brauwers ◽

Flavius Frasincar

Keyword(s):

State Of The Art ◽

Knowledge Bases ◽

Sentiment Classification ◽

Automatic Extraction ◽

Learning Models ◽

Text Documents ◽

Fine Grained ◽

Knowledge Based ◽

Transformer Model ◽

The Web

With the constantly growing number of reviews and other sentiment-bearing texts on the Web, the demand for automatic sentiment analysis algorithms continues to expand. Aspect-based sentiment classification (ABSC) allows for the automatic extraction of highly fine-grained sentiment information from text documents or sentences. In this survey, the rapidly evolving state of the research on ABSC is reviewed. A novel taxonomy is proposed that categorizes the ABSC models into three major categories: knowledge-based, machine learning, and hybrid models. This taxonomy is accompanied with summarizing overviews of the reported model performances, and both technical and intuitive explanations of the various ABSC models. State-of-the-art ABSC models are discussed, such as models based on the transformer model, and hybrid deep learning models that incorporate knowledge bases. Additionally, various techniques for representing the model inputs and evaluating the model outputs are reviewed. Furthermore, trends in the research on ABSC are identified and a discussion is provided on the ways in which the field of ABSC can be advanced in the future.

EDUCOLOGICAL ASPECTS OF ACTUALITING OF REGULATORY DOCUMENTATION IN MECHANICAL ENGINEERING

Collection of scholarly papers of Dniprovsk State Technical University (Technical Sciences) ◽

10.31319/2519-2884.39.2021.4 ◽

2021 ◽

Vol 2 (39) ◽

pp. 33-41

Author(s):

V. Samokhval ◽

О. Maksimenko ◽

О. Nikulin

Keyword(s):

Higher Education ◽

Mechanical Engineering ◽

Educational Process ◽

Practical Skills ◽

Technical Documentation ◽

Education Students ◽

Text Documents ◽

Professional Programs ◽

Technological Documentation

The main competencies of specialists in the field of knowledge 13 Mechanical engineering appertain the ability to draw up technical documentation, which includes both the design of text documents and drawings of parts, units, assemblies. Over the last decade, there has been a significant update of the standards of both the Unified System of Design Documentation and the Unified System of Technological Documentation. Suitable, according to the principle of education, to ensure the quality of the educational process, the task of implementations modern regulations into the educational process and providing students with practical skills in the design of educational documentation, which is as approximate as possible to the technical future practice. The purpose of the work is to develop proposals for the implementations of relevant regulations in the regulatory and methodological documentation of the educational process to ensure that higher education students acquire practical skills in applying this knowledge in the design of educational documentation with the closest possible approximation to real technical documentation. From the analysis of the requirements of the acting standards that make up the system of design and technological documentation, it follows that their requirements are quite diverse and allow some flexibility of choice, depending on the needs of developers or consumers. Corresponding, according to the provisions of education, the current methodological and regulatory documentation of higher education institutions should provide information to students about the existing system of standards for technical documentation and their practical skills in applying such knowledge. Based on the standards of the Unified System of Technological Documentation and DSTU EN ISO 7200: 2005, proposals have been developed for the design of educational documentation by students of mechanical engineering specialties according to educational and professional programs. The proposals concern the determine of the institution of higher education as the owner of the document, the numbering of documents of the educational process, the designation of sections of the document, page numbering, design of the main inscription of drawings and the title page of text documents. In particular, a numbering system has been developed, which allows to mark the whole set of documents of the educational process of a particular applicant, individual documents by type, separate sections of text documents and drawings. A single form of the main inscription has been developed, which can be used both for text documents and for drawings. The proposed innovations relate mainly to the specialties of mechanical engineering for educational and professional programs. But they can be used, in whole or in part, for other specialties and fields of knowledge.

SUPPORT OF INFORMAL CARERS FOR PEOPLE AFTER A STROKE WITH CROWDSOURCING AND NATURAL LANGUAGE PROCESSING

Acta Electrotechnica et Informatica ◽

10.15546/aeei-2021-0013 ◽

2021 ◽

Vol 21 (3) ◽

pp. 3-10

Author(s):

Petr ŠALOUN ◽

◽

Barbora CIGÁNKOVÁ ◽

David ANDREŠIČ ◽

Lenka KRHUTOVÁ ◽

...

Keyword(s):

Language Processing ◽

Text Documents ◽

Data Set ◽

Text Document ◽

Long Time ◽

Informal Carers ◽

Effective Visualization ◽

Text Document Classification ◽

Lay Public

For a long time, both professionals and the lay public showed little interest in informal carers. Yet these people deals with multiple and common issues in their everyday lives. As the population is aging we can observe a change of this attitude. And thanks to the advances in computer science, we can offer them some effective assistance and support by providing necessary information and connecting them with both professional and lay public community. In this work we describe a project called “Research and development of support networks and information systems for informal carers for persons after stroke” producing an information system visible to public as a web portal. It does not provide just simple a set of information but using means of artificial intelligence, text document classification and crowdsourcing further improving its accuracy, it also provides means of effective visualization and navigation over the content made by most by the community itself and personalized on a level of informal carer’s phase of the care-taking timeline. In can be beneficial for informal carers as it allows to find a content specific to their current situation. This work describes our approach to classification of text documents and its improvement through crowdsourcing. Its goal is to test text documents classifier based on documents similarity measured by N-grams method and to design evaluation and crowdsourcing-based classification improvement mechanism. Interface for crowdsourcing was created using CMS WordPress. In addition to data collection, the purpose of interface is to evaluate classification accuracy, which leads to extension of classifier test data set, thus the classification is more successful.

The Principle of Proportionality, Article 8 of the ECHR and Swedish Care Orders in Cases of Neglect. A Sketch of a Theoretical Framework

Review of European Administrative Law ◽

10.7590/187479821x16364535488019 ◽

2021 ◽

Vol 14 (4) ◽

pp. 7-22

Author(s):

Katarina Alexius

Keyword(s):

Social Sciences ◽

Child Welfare ◽

Interdisciplinary Approach ◽

Theoretical Framework ◽

Equal Treatment ◽

Legal Certainty ◽

Text Documents ◽

Advantages And Disadvantages ◽

Abuse And Neglect ◽

Principle Of Proportionality

This study conducts an analysis of the rights in article 8 of the ECHR and the application of the proportionality principle when Swedish care orders may be regarded as a necessary interference in family life. The study has been based on an interdisciplinary approach. Text documents were studied through socio-legal methods and perspectives, by combining knowledge from legal sources and social sciences research through a content analysis derived from formal and substantive legal certainty. The article concludes that reasoning in Swedish administrative courts should routinely consider proportionality in cases of neglect, and sets out to sketch a theoretical framework for the principle of proportionality in decisions on care orders. The results show that, since decisions in child welfare cases cannot be made completely uniform and predictable, the focus of decisions in social child welfare work must be to satisfy the objectives and values of substantive legal certainty, instead of unrealistically striving for formal legal certainty through equal treatment and predictability. The results also show that, by requiring those who exercise public authority to present their assessments based on proportionality, new demands are made for the quality and efficiency of involuntary out-of-home placements. Child welfare investigations should nowadays include impact assessments that clarify the advantages and disadvantages of the care in relation to the risk of harm from the original home conditions. Abuse and neglect in out-of-home placements will therefore be of growing importance in decisions on care orders in the future.

Semantic-Preserving Adversarial Text Attacks

10.36227/techrxiv.17102927.v1 ◽

2021 ◽

Author(s):

Xinghao Yang ◽

Yongshun Gong ◽

Weifeng Liu ◽

JAMES BAILEY ◽

Tianqing Zhu ◽

...

Keyword(s):

Deep Learning ◽

Potential Candidate ◽

Success Rates ◽

Learning Models ◽

Text Documents ◽

Word Level ◽

Sentence Level ◽

Adversarial Attack ◽

Semantic Preservation ◽

Adversarial Example

Deep learning models are known immensely brittle to adversarial image examples, yet their vulnerability in text classification is insufficiently explored. Existing text adversarial attack strategies can be roughly divided into three categories, i.e., character-level attack, word-level attack, and sentence-level attack. Despite the success brought by recent text attack methods, how to induce misclassification with the minimal text modifications while keeping the lexical correctness, syntactic soundness, and semantic consistency simultaneously is still a challenge. To examine the vulnerability of deep models, we devise a Bigram and Unigram based adaptive Semantic Preservation Optimization (BU-SPO) approach which attacks text documents not only at a unigram word level but also at a bigram level to avoid generating meaningless sentences. We also present a hybrid attack strategy that collects substitution words from both synonyms and sememe candidates, to enrich the potential candidate set. Besides, a Semantic Preservation Optimization (SPO) method is devised to determine the word substitution priority and reduce the perturbation cost. Furthermore, we constraint the SPO with a semantic Filter (dubbed SPOF) to improve the semantic similarity between the input text and the adversarial example. To estimate the effectiveness of our proposed methods, BU-SPO and BU-SPOF, we attack four victim deep learning models trained on three real-world text datasets. Experimental results demonstrate that our approaches accomplish the highest semantics consistency and attack success rates by making the minimal word modifications compared with competitive methods.

text documents
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Differentiating geographic movement described in text documents

A Reconstruction Method for Cross-Cut Shredded Documents Based on the Extreme Learning Machine Algorithm

Similitude Based Segment Graph Construction and Segment Ranking for Automatic Summarization of Text Document

Automatic Text Summaration of COVID-19 Scientific Research Topics Using Pre-trained Model from HuggingFace®

Automatic Text Summaration of COVID-19 Scientific Research Topics Using Pre-trained Model from HuggingFace®

A Survey on Aspect-Based Sentiment Classification

EDUCOLOGICAL ASPECTS OF ACTUALITING OF REGULATORY DOCUMENTATION IN MECHANICAL ENGINEERING

SUPPORT OF INFORMAL CARERS FOR PEOPLE AFTER A STROKE WITH CROWDSOURCING AND NATURAL LANGUAGE PROCESSING

The Principle of Proportionality, Article 8 of the ECHR and Swedish Care Orders in Cases of Neglect. A Sketch of a Theoretical Framework

Semantic-Preserving Adversarial Text Attacks

Export Citation Format

text documentsRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Differentiating geographic movement described in text documents

A Reconstruction Method for Cross-Cut Shredded Documents Based on the Extreme Learning Machine Algorithm

Similitude Based Segment Graph Construction and Segment Ranking for Automatic Summarization of Text Document

Automatic Text Summaration of COVID-19 Scientific Research Topics Using Pre-trained Model from HuggingFace®

Automatic Text Summaration of COVID-19 Scientific Research Topics Using Pre-trained Model from HuggingFace®

A Survey on Aspect-Based Sentiment Classification

EDUCOLOGICAL ASPECTS OF ACTUALITING OF REGULATORY DOCUMENTATION IN MECHANICAL ENGINEERING

SUPPORT OF INFORMAL CARERS FOR PEOPLE AFTER A STROKE WITH CROWDSOURCING AND NATURAL LANGUAGE PROCESSING

The Principle of Proportionality, Article 8 of the ECHR and Swedish Care Orders in Cases of Neglect. A Sketch of a Theoretical Framework

Semantic-Preserving Adversarial Text Attacks

text documents
Recently Published Documents