The application of automated text processing techniques to legal text management

Research on distance learning and computer-aided grading has been developed in parallel. Little work has been done in the past to join the two areas to solve the problem of automated learning assessment in virtual classrooms. This paper presents a model for learning assessment using an automated text processing technique to analyze class messages with an emphasis on course topics produced in an online class. It is suggested that students should be evaluated on many dimensions, including the learning artifacts such as course work submitted and class participation. Taking all these grading criteria into consideration, we design a model which combines three grading factors: the quality of course work, the quantity of efforts, and the activeness of participation, for evaluating the performance of students in the class. These three main items are measured on the basis of keyword contribution, message length, and message count, and a score is derived from the class messages to evaluate students’ performance. An assessment model is then constructed from these three measures to compute a performance indicator score for each student. The experiment shows that there is a high correlation between the performance indicator scores and the actual grades assigned by instructors. The rank orders of students by performance indicator scores and by the actual grades are highly correlated as well. Evidence from the experiment shows that the computer grader can be a great supplementary teaching and grading tool for distance learning instructors.

Download Full-text

Automated Travel History Extraction From Clinical Notes for Informing the Detection of Emergent Infectious Disease Events: Algorithm Development and Validation

JMIR Public Health and Surveillance ◽

10.2196/26719 ◽

2021 ◽

Vol 7 (3) ◽

pp. e26719

Author(s):

Kelly S Peterson ◽

Julia Lewis ◽

Olga V Patterson ◽

Alec B Chapman ◽

Daniel W Denhalter ◽

...

Keyword(s):

Public Health ◽

Infectious Disease ◽

United States ◽

Text Processing ◽

The United States ◽

Language Models ◽

Travel History ◽

Patient Travel ◽

Electronic Health ◽

Automated Text Processing

Background Patient travel history can be crucial in evaluating evolving infectious disease events. Such information can be challenging to acquire in electronic health records, as it is often available only in unstructured text. Objective This study aims to assess the feasibility of annotating and automatically extracting travel history mentions from unstructured clinical documents in the Department of Veterans Affairs across disparate health care facilities and among millions of patients. Information about travel exposure augments existing surveillance applications for increased preparedness in responding quickly to public health threats. Methods Clinical documents related to arboviral disease were annotated following selection using a semiautomated bootstrapping process. Using annotated instances as training data, models were developed to extract from unstructured clinical text any mention of affirmed travel locations outside of the continental United States. Automated text processing models were evaluated, involving machine learning and neural language models for extraction accuracy. Results Among 4584 annotated instances, 2659 (58%) contained an affirmed mention of travel history, while 347 (7.6%) were negated. Interannotator agreement resulted in a document-level Cohen kappa of 0.776. Automated text processing accuracy (F1 85.6, 95% CI 82.5-87.9) and computational burden were acceptable such that the system can provide a rapid screen for public health events. Conclusions Automated extraction of patient travel history from clinical documents is feasible for enhanced passive surveillance public health systems. Without such a system, it would usually be necessary to manually review charts to identify recent travel or lack of travel, use an electronic health record that enforces travel history documentation, or ignore this potential source of information altogether. The development of this tool was initially motivated by emergent arboviral diseases. More recently, this system was used in the early phases of response to COVID-19 in the United States, although its utility was limited to a relatively brief window due to the rapid domestic spread of the virus. Such systems may aid future efforts to prevent and contain the spread of infectious diseases.

Download Full-text

RESEARCHING METHODS FOR PROCESSING TEXT INFORMATION AND REVIEWING THE STAGES OF AN ARTIFICIAL INTELLIGENCE MODEL CREATION AT PRODUCING CHATBOTS

Automation and modeling in design and management of ◽

10.30987/2658-6436-2021-2-19-23 ◽

2021 ◽

Vol 2021 (2) ◽

pp. 19-23

Author(s):

Anastasiya Ivanova ◽

Aleksandr Kuz'menko ◽

Rodion Filippov ◽

Lyudmila Filippova ◽

Anna Sazonova ◽

...

Keyword(s):

Neural Network ◽

Artificial Intelligence ◽

Machine Learning ◽

Text Processing ◽

Text Format ◽

Machine Learning Methods ◽

Methods And Techniques ◽

Text Information ◽

Processing Techniques ◽

Machine Processing

The task of producing a chatbot based on a neural network supposes machine processing of the text, which in turn involves using various methods and techniques for analyzing phrases and sentences. The article considers the most popular solutions and models for data analysis in the text format: methods of lemmatization, vectorization, as well as machine learning methods. Particular attention is paid to the text processing techniques, after their analyzing the best method was identified and tested.

Download Full-text

Automated News Classification using N-gram Model and Key Features of Nepali Language

SCITECH Nepal ◽

10.3126/scitech.v13i1.23504 ◽

2018 ◽

Vol 13 (1) ◽

pp. 64-69

Author(s):

Dinesh Dangol ◽

Rupesh Dahi Shrestha ◽

Arun Timalsina

Keyword(s):

Text Classification ◽

English Language ◽

Promising Result ◽

Text Processing ◽

Automatic Text Classification ◽

Key Features ◽

Research Experiment ◽

N Gram ◽

Automated Text Processing ◽

Automatic Text

With an increasing trend of publishing news online on website, automatic text processing becomes more and more important. Automatic text classification has been a focus of many researchers in different languages for decades. There is a huge amount of research repository on features of English language and their uses on automated text processing. This research implements Nepali language key features for automatic text classification of Nepali news. In particular, the study on impact of Nepali language based features, which are extremely different than English language is more challenging because of the higher level of complexity to be resolved. The research experiment using vector space model, n-gram model and key feature based processing specific to Nepali language shows promising result compared to bag-of-words model for the task of automated Nepali news classification.

Download Full-text

Using text processing techniques to automatically enrich a domain ontology

Proceedings of the international conference on Formal Ontology in Information Systems - FOIS '01 ◽

10.1145/505168.505194 ◽

2001 ◽

Cited By ~ 68

Author(s):

Paola Velardi ◽

Paolo Fabriani ◽

Michele Missikoff

Keyword(s):

Text Processing ◽

Domain Ontology ◽

Processing Techniques

Download Full-text

A common architecture for different text processing techniques in an information retrieval environment

Proceedings of the 9th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '86 ◽

10.1145/253168.253198 ◽

1986 ◽

Cited By ~ 1

Author(s):

G. Thurmair

Keyword(s):

Information Retrieval ◽

Text Processing ◽

Processing Techniques ◽

Common Architecture

Download Full-text

Lost in Space: Geolocation in Event Data

Political Science Research and Methods ◽

10.1017/psrm.2018.23 ◽

2018 ◽

Vol 7 (04) ◽

pp. 871-888 ◽

Cited By ~ 6

Author(s):

Sophie J. Lee ◽

Howard Liu ◽

Michael D. Ward

Keyword(s):

Learning Algorithm ◽

Text Processing ◽

Contextual Information ◽

Training Data ◽

Supervised Machine Learning ◽

Model Parameters ◽

Event Data ◽

Data Set ◽

N Gram ◽

Automated Text Processing

Improving geolocation accuracy in text data has long been a goal of automated text processing. We depart from the conventional method and introduce a two-stage supervised machine-learning algorithm that evaluates each location mention to be either correct or incorrect. We extract contextual information from texts, i.e., N-gram patterns for location words, mention frequency, and the context of sentences containing location words. We then estimate model parameters using a training data set and use this model to predict whether a location word in the test data set accurately represents the location of an event. We demonstrate these steps by constructing customized geolocation event data at the subnational level using news articles collected from around the world. The results show that the proposed algorithm outperforms existing geocoders even in a case added post hoc to test the generality of the developed algorithm.

Download Full-text

AUTOMATED TEXT PROCESSING: TOPIC SEGMENTATION OF EDUCATIONAL TEXTS

Vestnik of Samara State Technical University. Psychological and Pedagogical Sciences ◽

10.17673/vsgtu-pps.2019.3.13 ◽

2019 ◽

pp. 158-173

Author(s):

Marina Solnyshkina ◽

◽

Iskander Yarmakeev ◽

Elzara Gafiyatova ◽

Farida Ismaeva

Keyword(s):

Text Processing ◽

Topic Segmentation ◽

Automated Text Processing

Download Full-text

Detection of Nunation Vowelization Types in The Quran Diacritical Marks Using Automated Text- Processing Algorithms: اكتشاف تنوين التركيب وتنوين التتابع في الضبط القرآني باستخدام خوارزميات المعالجة النصية الآلية

Journal of engineering sciences and information technology - مجلة العلوم الهندسية و تكنولوجيا المعلومات ◽

10.26389/ajsrp.r170620 ◽

2020 ◽

Vol 4 (3) ◽

Author(s):

Amir Adel Mabrouk Eldeib, Moulay Ibrahim El- Khalil Ghembaza

Keyword(s):

Text Processing ◽

Arabic Language ◽

Software Applications ◽

Part Of Speech Tagging ◽

Part Of Speech ◽

Holy Quran ◽

The Holy Quran ◽

Processing Algorithms ◽

Training Examples ◽

Automated Text Processing

The science of diacritical marks is closely related to the Holy Quran, as it was used in the Quran to remove confusion and error from the pronunciation of the reader, so the introduction of any technique in the process of processing Quranic texts will have an effect on facilitating the tasks of researchers in the field of Quranic studies, whether on the reader of the Quran, to help him read accurate and correct recitation, or on the tutor to help him compile a number of examples appropriate for training. The importance of this research lies in employing automated text- processing algorithms to determine the locations of the Nunation vowelization types in the Holy Quran, and the possibility of their computerizing in order to facilitate the accurate recitation of the Holy Quran and, at the same time, to collect training examples in a database or building a corpus for future use in many research and software applications for the Holy Quran and its sciences. This research aims to present a new idea through the proposition of a framework architecture that identifies and discover automatically the locations and types of the Nunation in the Holy Quran based on the part- of- speech tagging algorithm for Arabic language so as to determine the type of words, and then by using a knowledge base to discover the appropriate Nunation words and their locations, and finally discovering the type of Nunation so as to determine the vowelization of the last letter of each Nunation word according to the Quran diacritical marks science. Furthermore, another benefit is to link searching processes with Quranic texts towards extracting the composition Nunation and the sequence Nunations in the Holy Quran emerges from the science of Quran diacritical marks; and display them as data according to a set of options selected by the user through suitable applications interfaces. The basic elements that the results of searching Quranic texts should display are highlighted, in order to extract the positions and types of Nunation vowelizations. As well as, a template for the results of searching all types of Nunation in a specific Quranic Chapter is given, with several possible options to retrieve all data in detail.

Download Full-text

Text Processing Techniques in Approaches for Automated Composition of Domain Models

Proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering ◽

10.5220/0010533904890500 ◽

2021 ◽

Author(s):

Viktorija Gribermane ◽

Erika Nazaruka

Keyword(s):

Text Processing ◽

Domain Models ◽

Automated Composition ◽

Processing Techniques

Download Full-text