A Framework for Indonesian Grammar Error Correction

Author(s):  
Nankai Lin ◽  
Boyu Chen ◽  
Xiaotian Lin ◽  
Kanoksak Wattanachote ◽  
Shengyi Jiang

Grammatical Error Correction (GEC) is a challenging task in Natural Language Processing research. Although many researchers have focused on GEC for widely studied languages such as English or Chinese, few studies address Indonesian, a low-resource language. In this article, we propose a GEC framework with the potential to serve as a baseline method for Indonesian GEC tasks. The framework treats GEC as a multi-class classification task, integrating different language embedding models and deep learning models to correct 10 types of Part-of-Speech (POS) errors in Indonesian text. In addition, we constructed an Indonesian corpus that can be used as an evaluation dataset for Indonesian GEC research, and we evaluated our framework on it. Results showed that the Long Short-Term Memory model based on word embeddings achieved the best performance, with an overall macro-averaged F0.5 of 0.551 across the 10 POS error types. Results also showed that the framework can be trained on a low-resource dataset.
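The multi-class formulation described above can be sketched as per-token classification followed by label-driven correction. This is an illustrative sketch only, not the authors' implementation: the label set and the fix table are hypothetical stand-ins for the classifier's output.

```python
# Hypothetical label set: one class per POS error type, plus CORRECT.
ERROR_LABELS = ["CORRECT", "NOUN_ERR", "VERB_ERR", "ADJ_ERR"]

def correct_tokens(tokens, labels, fixes):
    """Apply a label-specific fix to every token whose predicted
    label is not CORRECT; leave correct tokens unchanged."""
    out = []
    for tok, lab in zip(tokens, labels):
        out.append(fixes[lab](tok) if lab != "CORRECT" else tok)
    return out

# Toy "fixes" keyed by error label; in the real framework the labels
# come from the trained LSTM classifier, not hand-written rules.
fixes = {
    "VERB_ERR": lambda t: "mem" + t,  # e.g. restore a missing verb prefix
    "NOUN_ERR": lambda t: t,          # placeholder
    "ADJ_ERR":  lambda t: t,          # placeholder
}

tokens = ["saya", "baca", "buku"]
labels = ["CORRECT", "VERB_ERR", "CORRECT"]
print(correct_tokens(tokens, labels, fixes))  # ['saya', 'membaca', 'buku']
```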

Author(s):  
Bo Wang ◽  
Kaoru Hirota ◽  
Chang Liu ◽  
Yaping Dai ◽  
...  

An approach to N-best hypothesis re-ranking using a sequence-labeling model is applied to resolve the data-deficiency problem in Grammatical Error Correction (GEC). Multiple candidate sentences are generated using a Neural Machine Translation (NMT) model; these candidates are then re-ranked by a stacked Transformer followed by a Bidirectional Long Short-Term Memory (BiLSTM) network with a Conditional Random Field (CRF). Correlations within the sentences are extracted by the Transformer-based sequence-labeling model, which is particularly suitable for long sentences, while knowledge from a large amount of unlabeled data is acquired through the pre-trained structure. Completely revised sentences are thus adopted instead of partially modified ones. Compared with conventional NMT, experiments on the NUCLE and FCE datasets demonstrate that the model improves the F0.5 score by 8.22% and 2.09%, respectively. As an additional advantage, the proposed re-ranking method requires only a small set of easily computed features and no linguistic inputs.
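The re-ranking step above reduces to scoring each candidate and keeping the best. A minimal sketch, where the toy scorer is a stand-in for the stacked Transformer + BiLSTM-CRF labeler in the paper (here, tokens flagged with `*` mark hypothetical labeler-detected errors):

```python
def rerank(candidates, score):
    """Return candidates sorted best-first by the re-ranker's score."""
    return sorted(candidates, key=score, reverse=True)

def toy_score(sentence):
    # Hypothetical feature: penalize each token the labeler flagged.
    flagged = sum(1 for w in sentence.split() if w.endswith("*"))
    return -flagged

nbest = ["he go to school*", "he goes to school", "he going* to school*"]
print(rerank(nbest, toy_score)[0])  # "he goes to school"
```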


Author(s):  
Casper Shikali Shivachi ◽  
Refuoe Mokhosi ◽  
Zhou Shijie ◽  
Liu Qihe

The need to capture intra-word information in natural language processing (NLP) tasks has inspired research into learning word representations at the word, character, or morpheme level, but little attention has been given to syllables from a syllabic alphabet. Motivated by the success of compositional models on morphologically rich languages, we present a convolutional long short-term memory (Conv-LSTM) model for constructing Swahili word representation vectors from syllables. The unified architecture addresses the agglutinative and polysemous nature of Swahili by extracting high-level syllable features with a convolutional neural network (CNN) and then composing quality word embeddings with a long short-term memory (LSTM) network. The word embeddings are validated on a syllable-aware language model (31.267) and a part-of-speech (POS) tagging task (98.78), both yielding results very competitive with state-of-the-art models in their respective domains. We further validate the language model on Xhosa and Shona, which are also syllabic-based languages. The novelty of the study lies in constructing quality word embeddings from syllables with a hybrid model that does not use the max pooling common in CNNs, and in exploiting these embeddings for POS tagging. The study therefore plays a crucial role in the processing of agglutinative and syllabic-based languages, contributing quality word embeddings built from syllable embeddings and a robust Conv-LSTM model that learns syllables not only for language modeling and POS tagging but also for other downstream NLP tasks.
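The syllable-level input this model relies on can be approximated with a simple segmenter. A sketch under an assumption, not the authors' exact tokenizer: Swahili syllables are predominantly (C)(C)V, so a greedy "consonants-then-vowel" pattern recovers most syllables.

```python
import re

def syllabify(word):
    """Greedy (C)(C)V segmentation: each syllable is zero or more
    consonants followed by one vowel; trailing consonants (rare in
    Swahili) form a final chunk."""
    return re.findall(r"[^aeiou]*[aeiou]|[^aeiou]+$", word.lower())

print(syllabify("matunda"))     # ['ma', 'tu', 'nda']
print(syllabify("kufundisha"))  # ['ku', 'fu', 'ndi', 'sha']
```

In the paper's pipeline, each such syllable would be mapped to an embedding, passed through the CNN for feature extraction, and composed into a word vector by the LSTM.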


2021 ◽  
pp. 1-10
Author(s):  
Hye-Jeong Song ◽  
Tak-Sung Heo ◽  
Jong-Dae Kim ◽  
Chan-Young Park ◽  
Yu-Seop Kim

Sentence similarity evaluation is a significant task used in machine translation, classification, and information extraction in the field of natural language processing. Given two sentences, an accurate judgment should be made as to whether their meanings are equivalent even when their words and contexts differ. To this end, existing studies have measured sentence similarity by analyzing words, morphemes, and letters. To measure sentence similarity, this study uses Sent2Vec sentence embeddings as well as morpheme-level word embeddings. Vectors representing words are input to a one-dimensional convolutional neural network (1D-CNN) with kernels of various sizes and to a bidirectional long short-term memory (Bi-LSTM) network. Self-attention is applied to the features produced by the Bi-LSTM. The vectors output by the 1D-CNN and by self-attention are then reduced through global max pooling and global average pooling, respectively, to extract representative values. The vectors generated by this process are concatenated with the vector generated by Sent2Vec into a single representation. This vector is input to a softmax layer, which finally determines the similarity between the two sentences. The proposed model improves accuracy by up to 5.42 percentage points compared with conventional sentence similarity estimation models.
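The two pooling operations named above each collapse a sequence of per-timestep feature vectors into a single vector. A minimal sketch of just that step (plain lists stand in for the tensors the real model would use):

```python
def global_max_pool(features):
    """features: list of T vectors of equal length d.
    Returns the element-wise max over the time axis (length-d vector)."""
    return [max(col) for col in zip(*features)]

def global_avg_pool(features):
    """Element-wise mean over the time axis."""
    return [sum(col) / len(col) for col in zip(*features)]

feats = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.3]]  # T=3 timesteps, d=2
print(global_max_pool(feats))  # [0.7, 0.9]
print(global_avg_pool(feats))
```

In the paper's model, the max-pooled and average-pooled vectors are concatenated with the Sent2Vec embedding before the softmax layer.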


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Kazi Nabiul Alam ◽  
Md Shakib Khan ◽  
Abdur Rab Dhruba ◽  
Mohammad Monirujjaman Khan ◽  
Jehad F. Al-Amri ◽  
...  

The COVID-19 pandemic has had a devastating effect on many people, creating severe anxiety, fear, and complicated feelings or emotions. Since vaccinations against the coronavirus began, people's feelings have become more diverse and complex. Our aim in this research is to understand and unravel these sentiments using deep learning techniques. Social media is currently the best way to express feelings and emotions, and with the help of Twitter, one can get a better idea of what is trending and what is on people's minds. Our motivation for this research was to understand people's diverse sentiments regarding the vaccination process. The collected tweets span the period from December 21 to July 21 and contain information about the most common vaccines recently available across the world. The sentiments of people regarding vaccines of all sorts were assessed using the natural language processing (NLP) tool Valence Aware Dictionary for sEntiment Reasoner (VADER). Grouping the obtained sentiment polarities into three classes (positive, negative, and neutral) helped us visualize the overall scenario; our findings comprised 33.96% positive, 17.55% negative, and 48.49% neutral responses. We also analyzed the timeline of the tweets, as sentiments fluctuated over time. A recurrent neural network (RNN)-oriented architecture, including long short-term memory (LSTM) and bidirectional LSTM (Bi-LSTM), was used to assess the performance of the predictive models, with LSTM achieving an accuracy of 90.59% and Bi-LSTM 90.83%. Other performance metrics such as precision, F1-score, and a confusion matrix were also used to validate our models and findings more effectively. This study improves understanding of public opinion on COVID-19 vaccines and supports the aim of eradicating the coronavirus.
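The three-way polarity grouping applied to VADER output can be sketched as thresholding the compound score. The ±0.05 cutoffs below are VADER's conventional defaults; whether the authors used exactly these values is an assumption.

```python
def polarity(compound, eps=0.05):
    """Map a VADER compound score in [-1, 1] to one of three classes,
    using the conventional +/-0.05 neutrality band."""
    if compound >= eps:
        return "positive"
    if compound <= -eps:
        return "negative"
    return "neutral"

scores = [0.62, -0.3, 0.0, 0.04]
print([polarity(c) for c in scores])
# ['positive', 'negative', 'neutral', 'neutral']
```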


2018 ◽  
Vol 10 (11) ◽  
pp. 113 ◽  
Author(s):  
Yue Li ◽  
Xutao Wang ◽  
Pengjian Xu

Text classification is important in natural language processing, as massive amounts of text containing valuable information need to be sorted into different categories for further use. To better classify text, our paper builds a deep learning model that achieves better classification results on Chinese text than other researchers' models. After comparing different methods, long short-term memory (LSTM) and convolutional neural network (CNN) methods were selected as the deep learning components for classifying Chinese text. LSTM is a special kind of recurrent neural network (RNN) capable of processing serialized information through its recurrent structure; CNN, by contrast, has proven its ability to extract features from visual imagery. Therefore, two layers of LSTM and one layer of CNN were integrated into our new model, BLSTM-C (BLSTM stands for bi-directional long short-term memory, while C stands for CNN). The BLSTM obtains a sequence output based on past and future contexts, which is then fed to the convolutional layer for feature extraction. In our experiments, the proposed BLSTM-C model was evaluated in several ways and exhibited remarkable performance in text classification, especially on Chinese texts.


Symmetry ◽  
2019 ◽  
Vol 11 (10) ◽  
pp. 1290 ◽  
Author(s):  
Rahman ◽  
Siddiqui

Abstractive text summarization, which generates a summary by paraphrasing a long text, remains a significant open problem in natural language processing. In this paper, we present an abstractive text summarization model, the multi-layered attentional peephole convolutional LSTM (long short-term memory), MAPCoL, which automatically generates a summary from a long text. We optimize the parameters of MAPCoL using central composite design (CCD) in combination with response surface methodology (RSM), which yields the highest accuracy in terms of summary generation. We record the accuracy of MAPCoL on the CNN/DailyMail dataset and perform a comparative analysis against state-of-the-art models in different experimental settings. MAPCoL also outperforms traditional LSTM-based models with respect to the semantic coherence of the output summary.


Electronics ◽  
2019 ◽  
Vol 8 (6) ◽  
pp. 681 ◽  
Author(s):  
Praveen Edward James ◽  
Hou Kit Mun ◽  
Chockalingam Aravind Vaithilingam

The purpose of this work is to develop a spoken language processing system for smart-device troubleshooting using human-machine interaction. The system combines a software Bidirectional Long Short-Term Memory (BLSTM)-based speech recognizer with a hardware LSTM-based language processor for Natural Language Processing (NLP), connected over a serial RS232 interface. Mel-Frequency Cepstral Coefficient (MFCC) feature vectors from the speech signal are input directly into the BLSTM network, with a dropout layer added to the BLSTM layer to reduce overfitting and improve robustness. The speech recognition component combines an acoustic modeler, a pronunciation dictionary, and the BLSTM network to generate query text, and executes in real time with an 81.5% Word Error Rate (WER) and an average training time of 45 s. The language processor comprises a vectorizer, a lookup dictionary, a key encoder, a Long Short-Term Memory (LSTM)-based training and prediction network, and a dialogue manager; it transforms query intent into response text with a processing time of 0.59 s, 5% hardware utilization, and an F1 score of 95.2%. The proposed system shows a 4.17% decrease in accuracy compared with existing systems, which rely on parallel processing and high-speed cache memories to perform additional training and thereby improve accuracy. However, the language processor achieves a 36.7% decrease in processing time and a 50% decrease in hardware utilization, making it suitable for troubleshooting smart devices.
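The Word Error Rate quoted for the recognizer is a standard metric: WER = (substitutions + insertions + deletions) / reference length, computed with edit-distance dynamic programming. A self-contained sketch (not the authors' evaluation code):

```python
def wer(reference, hypothesis):
    """Word Error Rate between two space-separated strings."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(r)][len(h)] / len(r)

print(wer("turn on the light", "turn of the light"))  # 0.25
```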


Author(s):  
Tao Gui ◽  
Qi Zhang ◽  
Lujun Zhao ◽  
Yaosong Lin ◽  
Minlong Peng ◽  
...  

In recent years, long short-term memory (LSTM) has been successfully used to model sequential data of variable length. However, LSTM can still have difficulty capturing long-term dependencies. In this work, we alleviate this problem by introducing a dynamic skip connection, which learns to directly connect two dependent words. Since the training data contain no dependency information, we propose a novel reinforcement learning-based method to model the dependency relationships and connect dependent words. The proposed model computes its recurrent transition functions based on the skip connections, giving it a dynamic skipping advantage over RNNs that always process entire sentences sequentially. Experimental results on three natural language processing tasks demonstrate that the proposed method achieves better performance than existing methods. In the number prediction experiment, the proposed model outperformed LSTM in accuracy by nearly 20%.
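The skip mechanism can be illustrated with a toy recurrence: at each step the cell may read a hidden state from several steps back instead of only the immediately preceding one. In this sketch the skip distances are supplied directly, standing in for the RL-trained policy in the paper, and `combine` stands in for the LSTM transition.

```python
def run_with_skips(inputs, skips, combine):
    """Unroll a toy recurrence where step t reads the hidden state
    from skips[t] steps back (1 = ordinary recurrence)."""
    hidden = [0.0]  # initial state h[-1]
    for t, x in enumerate(inputs):
        back = skips[t]
        h_prev = hidden[max(0, len(hidden) - back)]
        hidden.append(combine(h_prev, x))
    return hidden[1:]

# With combine = addition, a skip of 3 at the last step lets it reuse
# the state from step 0 and bypass the intermediate accumulation.
out = run_with_skips([1, 2, 3, 4], skips=[1, 1, 1, 3],
                     combine=lambda h, x: h + x)
print(out)  # [1, 3, 6, 5]
```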

