USING DIGITAL TECHNOLOGIES TO CORRECT GRAMMATICAL ERRORS: SYNTACTIC N-GRAMS AND DEEP LEARNING METHODS

Мова ◽

10.18524/2307-4558.2021.35.237789 ◽

2021 ◽

pp. 237-241

Author(s):

Олена Олександрівна ПОЖАРИЦЬКА ◽

Кирило Володимирович ТРОЇЦЬКИЙ

Keyword(s):

Error Correction ◽

Grammatical Error

The object of the article is automated grammatical error correction as a branch of linguistics. The subject of the article is the variety of methods and technologies used in grammatical error correction, together with their possible applications and their evaluation. The article reviews the most productive methods used to detect and correct grammatical errors in computational linguistics. Its aim is to demonstrate the effectiveness of computer programs for detecting grammatical errors in English-language text. The research methods used are data analysis, description of abstract computer models, and observation of their performance. The article examines a computer model for detecting and identifying grammatical errors that is based on syntactic n-grams: it defines the model and describes how it is implemented and which data preprocessing steps it requires. The specific error types this model can detect are subject-verb agreement errors, preposition choice errors, noun number errors, and some types of article errors. The article also analyzes another model, GECToR (Grammatical Error Correction: Tag, Not Rewrite), which is based on the transformer architecture. This deep learning model targets the detection and correction of far more complex errors, including those tied to extralinguistic realities. It is also quite useful because, unlike other models that simply correct wrong words without explanation, GECToR assigns tags that can be further interpreted for teaching purposes. The analysis concludes with the advantages and disadvantages of the models and methods considered, as revealed by their practical implementation.
When the performance of these models was evaluated on the BEA 2019 shared task, the following results were obtained: the model based on syntactic n-grams achieved an F0.5 score of 7.6%, while GECToR achieved an F0.5 score of 66.7%. These figures indicate an almost ninefold advantage of deep learning methods (such as GECToR) over rule-based methods (such as the syntactic n-gram method).
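The F0.5 metric and the "almost ninefold" comparison can be checked with a short sketch. It assumes the standard F-beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). The abstract does not report the models' precision and recall, so the values passed to `f_beta` below are hypothetical and serve only to show that beta = 0.5 weights precision more heavily than recall. Only the two F0.5 scores come from the abstract.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favors precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical precision/recall pairs (not from the article), mirrored
# to show the asymmetry of F0.5.
high_precision = f_beta(precision=0.8, recall=0.4)
high_recall = f_beta(precision=0.4, recall=0.8)
print(f"F0.5 with P=0.8, R=0.4: {high_precision:.3f}")
print(f"F0.5 with P=0.4, R=0.8: {high_recall:.3f}")

# F0.5 scores reported in the abstract, in percent.
ngram_f05 = 7.6
gector_f05 = 66.7
ratio = gector_f05 / ngram_f05
print(f"GECToR / n-gram F0.5 ratio: {ratio:.2f}")
```

The ratio comes out slightly below nine, which is consistent with the abstract's wording.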


Generating artificial errors for grammatical error correction

10.3115/v1/e14-3013 ◽

2014 ◽

Cited By ~ 4

Author(s):

Mariano Felice ◽

Zheng Yuan

Keyword(s):

Error Correction ◽

Grammatical Error


Deep Context Model for Grammatical Error Correction

10.21437/slate.2017-29 ◽

2017 ◽

Cited By ~ 1

Author(s):

Chuan Wang ◽

Ruobing Li ◽

Hui Lin

Keyword(s):

Error Correction ◽

Context Model ◽

Grammatical Error


An efficient system for grammatical error correction on mobile devices

2021 IEEE 15th International Conference on Semantic Computing (ICSC) ◽

10.1109/icsc50631.2021.00034 ◽

2021 ◽

Author(s):

Sourabh Vasant Gothe ◽

Sushant Dogra ◽

Mritunjai Chandra ◽

Chandramouli Sanchi ◽

Barath Raj Kandur Raja

Keyword(s):

Error Correction ◽

Mobile Devices ◽

Efficient System ◽

Grammatical Error


A Comprehensive Survey of Grammatical Error Correction

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3474840 ◽

2021 ◽

Vol 12 (5) ◽

pp. 1-51

Author(s):

Yu Wang ◽

Yuelin Wang ◽

Kai Dang ◽

Jie Liu ◽

Zhuo Liu

Keyword(s):

Error Correction ◽

Machine Translation ◽

Language Processing ◽

Data Augmentation ◽

Intelligent System ◽

Statistical Machine Translation ◽

Error Type ◽

Data Annotation ◽

Depth Analysis ◽

Grammatical Error

Grammatical error correction (GEC) is an important application of natural language processing techniques, and GEC systems are an important kind of intelligent system that has long been explored in both academic and industrial communities. The past decade has witnessed significant progress in GEC, owing to the increasing popularity of machine learning and deep learning. However, there is no survey that untangles the large body of research and progress in this field. We present the first survey of GEC, offering a comprehensive retrospective of the literature in this area. We first define the GEC task and introduce the public datasets and data annotation schema. After that, we discuss six kinds of basic approaches, six commonly applied performance-boosting techniques for GEC systems, and three data augmentation methods. Since GEC is typically viewed as a sister task of Machine Translation (MT), we put more emphasis on the statistical machine translation (SMT)-based and neural machine translation (NMT)-based approaches because of their importance. Similarly, some performance-boosting techniques are adapted from MT and successfully combined with GEC systems to improve final performance. More importantly, after introducing evaluation in GEC, we present an in-depth analysis of GEC approaches and GEC systems based on empirical results, giving a clearer picture of progress in GEC, with error type analysis and system recapitulation. Finally, we discuss five prospective directions for future GEC research.


Phrase-based Machine Translation is State-of-the-Art for Automatic Grammatical Error Correction

10.18653/v1/d16-1161 ◽

2016 ◽

Cited By ~ 10

Author(s):

Marcin Junczys-Dowmunt ◽

Roman Grundkiewicz

Keyword(s):

Error Correction ◽

Machine Translation ◽

State Of The Art ◽

Grammatical Error


Automatic Metric Validation for Grammatical Error Correction

10.18653/v1/p18-1127 ◽

2018 ◽

Cited By ~ 1

Author(s):

Leshem Choshen ◽

Omri Abend

Keyword(s):

Error Correction ◽

Grammatical Error


Do Grammatical Error Correction Models Realize Grammatical Generalization?

Journal of Natural Language Processing ◽

10.5715/jnlp.28.1331 ◽

2021 ◽

Vol 28 (4) ◽

pp. 1331-1335

Author(s):

Masato Mita

Keyword(s):

Error Correction ◽

Error Correction Models ◽

Grammatical Error


GECToR – Grammatical Error Correction: Tag, Not Rewrite

10.18653/v1/2020.bea-1.16 ◽

2020 ◽

Cited By ~ 1

Author(s):

Kostiantyn Omelianchuk ◽

Vitaliy Atrasevych ◽

Artem Chernodub ◽

Oleksandr Skurzhanskyi

Keyword(s):

Error Correction ◽

Grammatical Error


Efficient Grammatical Error Correction with Hierarchical Error Detections and Correction

10.1109/icws53863.2021.00073 ◽

2021 ◽

Author(s):

Fayu Pan ◽

Bin Cao

Keyword(s):

Error Correction ◽

Grammatical Error


Towards Minimal Supervision BERT-Based Grammar Error Correction (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i10.7202 ◽

2020 ◽

Vol 34 (10) ◽

pp. 13859-13860

Author(s):

Yiyuan Li ◽

Antonios Anastasopoulos ◽

Alan W. Black

Keyword(s):

Error Correction ◽

Contextual Information ◽

Language Model ◽

Sequence Generation ◽

Strong Potential ◽

Grammatical Error

Current grammatical error correction (GEC) models typically treat the task as sequence generation, which requires large amounts of annotated data and limits their application in data-limited settings. We try to incorporate contextual information from a pre-trained language model to leverage annotation and benefit multilingual scenarios. Results show the strong potential of Bidirectional Encoder Representations from Transformers (BERT) for the grammatical error correction task.
