Automatic Error Detection and Correction of Text: The State of the Art

2018 ◽
Vol 111 (1) ◽
pp. 97-112 ◽
Author(s):
Zhang Yang ◽
Zhao Xiaobing

Author(s):  
Tim vor der Brück

Abstract Rule-based natural language generation denotes the process of converting a semantic input structure into a surface representation by means of a grammar. In the following, we assume that this grammar is handcrafted and not created automatically, for instance by a deep neural network. Such a grammar may comprise a large set of rules. A single error in these rules can already have a large impact on the quality of the generated sentences, potentially even causing a complete failure of the entire generation process. Searching for errors in these rules can be quite tedious and time-consuming due to potentially complex and recursive dependencies. This work proposes a statistical approach to recognizing errors and providing suggestions for correcting certain kinds of errors by cross-checking the grammar with the semantic input structure. The basic assumption is the correctness of the latter, which is usually a valid hypothesis because these input structures are often created automatically. Our evaluation reveals that in many cases an automatic error detection and correction is indeed possible.
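
As a rough illustration of the cross-checking idea, the following Python sketch uses a hypothetical data model (each rule is a set of semantic feature names it consumes; each input structure is a set of attested feature names): rule features that never occur in any input structure are flagged as likely errors, and the closest attested name is suggested as a correction.

from collections import Counter
from difflib import get_close_matches

def check_rules(rules, inputs):
    """rules: {rule_name: set of semantic feature names the rule consumes};
    inputs: list of semantic input structures, each a set of feature names."""
    observed = Counter(f for struct in inputs for f in struct)
    report = {}
    for name, features in rules.items():
        for feat in features:
            if observed[feat] == 0:  # never attested in any input: suspicious
                match = get_close_matches(feat, list(observed), n=1)
                report[(name, feat)] = match[0] if match else None
    return report

# 'AGNT' is likely a typo for the attested feature 'AGENT'.
rules = {"s_rule": {"AGNT", "THEME"}}
inputs = [{"AGENT", "THEME"}, {"AGENT", "GOAL"}]
print(check_rules(rules, inputs))  # {('s_rule', 'AGNT'): 'AGENT'}

This mirrors the paper's basic assumption: the input structures are trusted, so a rule feature that contradicts all of them is the likely culprit.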


Electronics ◽  
2018 ◽  
Vol 7 (10) ◽  
pp. 258 ◽  
Author(s):  
Abdus Hassan ◽  
Umar Afzaal ◽  
Tooba Arifeen ◽  
Jeong Lee

Recently, concurrent error detection enabled through invariant relationships (implications) between different wires in a circuit has been proposed. Because a circuit contains many such implications, selection strategies have been developed to choose the most valuable implications for inclusion in the checker hardware so that a sufficiently high probability of error detection (P_detection) is achieved. These algorithms, however, cannot guarantee a lossless P_detection due to their heuristic nature. In this paper, we develop a new input-aware implication selection algorithm, aided by ATPG, which minimizes the loss in P_detection. In our algorithm, the detectability of errors for each candidate implication is carefully evaluated using error-prone vectors. The evaluation results are then used to select the most efficient candidates for achieving optimal P_detection. The experimental results on 15 representative combinational benchmark circuits from the MCNC benchmark suite show that the implications selected by our algorithm achieve better P_detection in comparison to the state of the art. The proposed method also offers better performance, by up to 41.10%, in terms of the proposed impact-level metric, which is the ratio of achieved P_detection to the implication count.
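
The selection step can be abstracted as follows (a minimal sketch with hypothetical data; the ATPG-based evaluation is condensed into a precomputed map from each candidate implication to the set of error-prone vectors it detects). A greedy pass then picks the implications that add the most newly detected errors per checker.

def select_implications(candidates, budget):
    """candidates: {implication: set of ids of error-prone vectors it detects}"""
    selected, covered = [], set()
    for _ in range(budget):
        best = max(candidates, key=lambda c: len(candidates[c] - covered))
        gain = candidates[best] - covered
        if not gain:  # no remaining candidate detects anything new
            break
        selected.append(best)
        covered |= gain
    return selected, covered

candidates = {
    "a=1 => b=1": {1, 2, 3},   # hypothetical implications and the
    "c=0 => d=1": {3, 4},      # error-prone vectors each one detects
    "e=1 => f=0": {5},
}
sel, cov = select_implications(candidates, budget=2)
print(sel, len(cov))          # ['a=1 => b=1', 'c=0 => d=1'] 4
impact = len(cov) / len(sel)  # rough analogue of the impact-level metric

The ratio computed on the last line mirrors the paper's impact-level metric: detection achieved per implication included in the checker.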


Author(s):  
CHUEN-MIN HUANG ◽  
MEI-CHEN WU ◽  
CHING-CHE CHANG

Misspellings and misconceptions resulting from similar pronunciations appear frequently in Chinese texts. Without careful double-checking, the situation worsens even with the help of a Chinese input editor. The quality of Chinese writing would be enhanced if an effective automatic error detection and correction mechanism were embedded in the text editor, relieving the burden of manual proofreading. Until recently, research on automatic error detection and correction of Chinese text has faced many challenges and suffered from poor performance compared with that of Western text. In view of this prominent problem in Chinese writing, this study proposes a learning model based on Chinese phonemic alphabets. The experimental results demonstrate that this model is effective in finding misspellings and further improves the detection and correction rates.
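
A minimal sketch of the general phoneme-based approach (not the study's actual learning model): characters sharing a pronunciation form a confusion set, and a toy bigram frequency model picks the most plausible character for each slot. Both tables below are hypothetical stand-ins.

confusions = {"再": {"再", "在"}, "在": {"再", "在"}}  # same pronunciation zai4
freq = {("正", "在"): 9, ("正", "再"): 1}              # toy bigram counts

def correct(text):
    chars = list(text)
    for i in range(1, len(chars)):
        cands = confusions.get(chars[i], {chars[i]})
        # choose the candidate with the highest bigram frequency in context
        chars[i] = max(cands, key=lambda c: freq.get((chars[i - 1], c), 0))
    return "".join(chars)

print(correct("正再"))  # -> "正在", fixing the same-pronunciation misspelling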


2019 ◽  
Vol 26 (3) ◽  
pp. 211-218 ◽  
Author(s):  
Chris J Lu ◽  
Alan R Aronson ◽  
Sonya E Shooshan ◽  
Dina Demner-Fushman

Abstract Objective Automated understanding of consumer health inquiries might be hindered by misspellings. To detect and correct various types of spelling errors in consumer health questions, we developed a distributable spell-checking tool, CSpell, that handles nonword errors, real-word errors, word boundary infractions, punctuation errors, and combinations of the above. Methods We developed a novel approach of using dual embedding within Word2vec for context-dependent corrections. This technique was used in combination with dictionary-based corrections in a 2-stage ranking system. We also developed various splitters and handlers to correct word boundary infractions. All correction approaches are integrated to handle errors in consumer health questions. Results Our approach achieves F1 scores of 80.93% and 69.17% for spelling error detection and correction, respectively. Discussion The dual-embedding model shows a significant improvement (9.13%) in F1 score compared with the general practice of using cosine similarity with word vectors in Word2vec for context ranking. Our 2-stage ranking system shows a 4.94% improvement in F1 score compared with the best 1-stage ranking system. Conclusion CSpell improves over the state of the art and provides near-real-time automatic misspelling detection and correction in consumer health questions. The software and the CSpell test set are available at https://umlslex.nlm.nih.gov/cSpell.
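
The dual-embedding idea can be sketched as follows: context words around a misspelling are looked up in Word2vec's input (IN) matrix, each dictionary-generated candidate in the output (OUT) matrix, and candidates are ranked by IN-OUT cosine similarity. The matrices and vocabulary below are random stand-ins, and CSpell's second ranking stage is omitted.

import numpy as np

rng = np.random.default_rng(0)
vocab = {w: i for i, w in enumerate(["sore", "throat", "store", "shore"])}
IN = rng.normal(size=(len(vocab), 50))   # input embedding matrix (toy)
OUT = rng.normal(size=(len(vocab), 50))  # output embedding matrix (toy)

def in_out_score(candidate, context):
    """Cosine similarity between the mean IN vector of the context
    and the OUT vector of the correction candidate."""
    out_vec = OUT[vocab[candidate]]
    ctx = np.mean([IN[vocab[w]] for w in context], axis=0)
    return float(ctx @ out_vec / (np.linalg.norm(ctx) * np.linalg.norm(out_vec)))

context = ["throat"]                     # words around the misspelling "sroe"
candidates = ["sore", "store", "shore"]  # dictionary-based suggestions
print(max(candidates, key=lambda c: in_out_score(c, context)))

Using the OUT matrix for the candidate rather than comparing IN vectors on both sides is what distinguishes dual-embedding scoring from the plain cosine-similarity practice the abstract compares against.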


2009 ◽  
pp. 110-144
Author(s):  
Piotr Augustyniak ◽  
Ryszard Tadeusiewicz

This chapter defines the set of standard diagnostic parameters and metadata expected from cardiac examinations. Rest ECG, exercise ECG, and long-term recording techniques are compared with regard to method-appropriate hierarchies of diagnostic results. This summary leads to the idea that high redundancy in the dataset influences data transmission and database operations. In the era of paper records, this redundant data was useful for validating and correcting human errors. Nowadays, automatic error detection and correction codes are widely applied in systems for the storage and transmission of digital data. Basic issues of DICOM and HL7, two widespread medical information interchange standards, are presented thereafter. These general-purpose systems integrate multi-modal medical data and offer specialized tools for the storage, retrieval, and management of data. Both standards originate from efforts to standardize the description of the widest possible range of patient-oriented digital data in the form of electronic health records. Certain aspects of data security are also considered here.
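
As a brief illustration of the error detection and correction codes mentioned above (not taken from the chapter), a Hamming(7,4) code encodes 4 data bits with 3 parity bits and corrects any single-bit error by locating it through the syndrome.

import numpy as np

G = np.array([[1,0,0,0,1,1,0],   # generator matrix: data bits + parity
              [0,1,0,0,1,0,1],
              [0,0,1,0,0,1,1],
              [0,0,0,1,1,1,1]])
H = np.array([[1,1,0,1,1,0,0],   # parity-check matrix
              [1,0,1,1,0,1,0],
              [0,1,1,1,0,0,1]])

def encode(data):                 # data: 4 bits
    return (np.array(data) @ G) % 2

def decode(word):                 # word: 7 received bits
    syndrome = (H @ word) % 2
    if syndrome.any():            # nonzero syndrome: find and flip the bad bit
        idx = next(i for i in range(7) if (H[:, i] == syndrome).all())
        word = word.copy()
        word[idx] ^= 1
    return word[:4]               # first 4 positions carry the data

codeword = encode([1, 0, 1, 1])
corrupted = codeword.copy()
corrupted[2] ^= 1                 # flip one bit "in transit"
print(decode(corrupted))          # -> [1 0 1 1], the original data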

