Korean Grammatical Error Correction Based on Transformer with Copying Mechanisms and Grammatical Noise Implantation Methods

Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2658
Author(s):  
Myunghoon Lee ◽  
Hyeonho Shin ◽  
Dabin Lee ◽  
Sung-Pil Choi

Grammatical Error Correction (GEC) is the task of detecting and correcting grammatical errors in text. Previous approaches to GEC have used rules, statistics, and combinations of the two. Recently, the performance of English GEC has improved drastically owing to the vigorous application of deep neural networks and pretrained language models. Following the promising results on English GEC tasks, we apply the Transformer with a copying mechanism to the Korean GEC task and introduce novel and effective noising methods for constructing Korean GEC datasets. Our comparative experiments showed that the proposed system outperforms two commercial grammar checkers and other NMT-based models.
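The abstract does not spell out how the copying mechanism is wired in. As a rough illustration only, one common formulation mixes the decoder's generation distribution with a copy distribution over the source tokens, weighted by a balancing factor. The function name `mix_copy_distribution`, the parameter `alpha_copy`, and the toy values below are hypothetical and are not taken from the paper.

```python
def mix_copy_distribution(p_gen, copy_attention, source_tokens, alpha_copy):
    """Blend a generation distribution with a copy distribution.

    p_gen:          dict token -> probability from the decoder's softmax
    copy_attention: attention weights over source positions (sums to 1)
    source_tokens:  the source sentence, one token per attention weight
    alpha_copy:     weight in [0, 1] given to copying (assumed to be
                    predicted by the model at each decoding step)
    """
    # Sum the attention mass of repeated source tokens into one entry.
    p_copy = {}
    for token, weight in zip(source_tokens, copy_attention):
        p_copy[token] = p_copy.get(token, 0.0) + weight

    # Final probability: (1 - alpha) * generate + alpha * copy.
    vocab = set(p_gen) | set(p_copy)
    return {
        token: (1 - alpha_copy) * p_gen.get(token, 0.0)
        + alpha_copy * p_copy.get(token, 0.0)
        for token in vocab
    }


# Toy example: "c" is out of the generation vocabulary but can be copied.
mixed = mix_copy_distribution(
    p_gen={"a": 0.5, "b": 0.5},
    copy_attention=[0.25, 0.75],
    source_tokens=["a", "c"],
    alpha_copy=0.5,
)
```

Because most tokens in a GEC source sentence are already correct, a copy path of this kind lets the model reproduce them directly and reserve generation for the tokens that need to change.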

2020 ◽  
Author(s):  
Masahiro Kaneko ◽  
Masato Mita ◽  
Shun Kiyono ◽  
Jun Suzuki ◽  
Kentaro Inui

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 158702-158711
Author(s):  
Muhammad Salman Ali ◽  
Tauhid Bin Iqbal ◽  
Kang-Ho Lee ◽  
Abdul Muqeet ◽  
Seunghyun Lee ◽  
...  

Author(s):  
Qiqing Wang ◽  
Cunbin Li

The surge of renewable energy systems can lead to a growing number of incidents that negatively impact the economy and society, making incident detection essential for understanding the mechanism and range of those impacts. In this paper, a deep learning framework is proposed to detect renewable energy incidents from news articles that report accidents in various renewable energy systems. Pre-trained language models such as Bidirectional Encoder Representations from Transformers (BERT) and word2vec are used to represent the textual inputs, on which Text Convolutional Neural Networks (TCNNs) and Text Recurrent Neural Networks are trained. Two types of classifiers for incident detection are trained and tested: a binary classifier for detecting the existence of an incident, and a multi-label classifier for identifying incident attributes such as causal effects and consequences. The proposed incident detection framework is applied to a hand-annotated dataset with 5,190 records. The results show that the proposed framework performs well on both the incident existence detection task (F1-score 91.4%) and the incident attribute identification task (micro F1-score 81.7%). It is also shown that the BERT-based TCNNs are effective and robust in detecting renewable energy incidents from large-scale textual materials.
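For readers unfamiliar with the micro F1-score reported for the multi-label task, the sketch below shows the standard micro-averaged computation: true positives, false positives, and false negatives are pooled across all records and labels before F1 is computed. The function name `micro_f1` and the attribute labels in the example are illustrative assumptions, not the paper's code or label set.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label predictions.

    y_true, y_pred: sequences of label collections, one per record.
    """
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        true_set, pred_set = set(true_labels), set(pred_labels)
        tp += len(true_set & pred_set)   # labels correctly predicted
        fp += len(pred_set - true_set)   # labels predicted but not present
        fn += len(true_set - pred_set)   # labels present but missed
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical attribute labels for two news records.
gold = [{"cause", "consequence"}, {"cause"}]
predicted = [{"cause"}, {"cause", "consequence"}]
score = micro_f1(gold, predicted)  # 2 TP, 1 FP, 1 FN -> 4 / 6
```

Micro-averaging weights every label decision equally, so frequent attributes influence the score more than rare ones.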


2020 ◽  
Vol 35 (12) ◽  
pp. 1987-2008 ◽  
Author(s):  
Han Wang ◽  
Haixian Zhang ◽  
Junjie Hu ◽  
Ying Song ◽  
Sen Bai ◽  
...  

2020 ◽  
Vol 34 (01) ◽  
pp. 1226-1233
Author(s):  
Zewei Zhao ◽  
Houfeng Wang

Grammatical error correction (GEC) is a promising natural language processing (NLP) application whose goal is to turn sentences with grammatical errors into correct ones. Neural machine translation (NMT) approaches have been widely applied to this translation-like task. However, such methods need a fairly large parallel corpus of error-annotated sentence pairs, which is not easy to obtain, especially for Chinese grammatical error correction. In this paper, we propose a simple yet effective method to improve NMT-based GEC models by dynamic masking. By adding random masks to the original source sentences dynamically during training, more diverse error-corrected sentence pairs are generated, which enhances the generalization ability of the grammatical error correction model without additional data. Experiments on NLPCC 2018 Task 2 show that our MaskGEC model improves the performance of neural GEC models. Moreover, our single model for Chinese GEC outperforms the current state-of-the-art ensemble system on NLPCC 2018 Task 2 without any extra knowledge.
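The dynamic masking idea described above can be sketched roughly as follows: each time a training pair is visited, some source tokens are replaced by a mask symbol while the target is left untouched, so every epoch sees a different noisy source for the same correction. This is a minimal sketch under stated assumptions; the names `dynamic_mask` and `training_pairs`, the `<mask>` symbol, and the default masking probability of 0.3 are hypothetical and do not come from the paper.

```python
import random


def dynamic_mask(tokens, mask_prob=0.3, mask_token="<mask>", rng=None):
    """Return a copy of tokens with each one masked with probability mask_prob."""
    rng = rng or random.Random()
    return [mask_token if rng.random() < mask_prob else token for token in tokens]


def training_pairs(pairs, epochs, mask_prob=0.3, seed=0):
    """Yield (epoch, masked_source, target) with masks re-sampled on every pass."""
    rng = random.Random(seed)
    for epoch in range(epochs):
        for source, target in pairs:
            # The target is never masked; only the source side is perturbed.
            yield epoch, dynamic_mask(source, mask_prob, rng=rng), target


# Hypothetical tokenized sentence pair (source with an error, corrected target).
corpus = [(["他", "明天", "去", "了", "学校"], ["他", "明天", "去", "学校"])]
for epoch, masked_source, target in training_pairs(corpus, epochs=3):
    print(epoch, masked_source, target)
```

Because the masks are sampled inside the training loop rather than once during preprocessing, the amount of stored data stays the same while the model sees a new variant of each source sentence in every epoch.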

