semantic errors
Recently Published Documents


TOTAL DOCUMENTS: 153 (FIVE YEARS: 38)
H-INDEX: 22 (FIVE YEARS: 1)

Author(s): Mieradilijiang Maimaiti, Yang Liu, Huanbo Luan, Zegao Pan, Maosong Sun

Data augmentation is a widely used approach in several text generation tasks. In the machine translation paradigm, particularly in low-resource language scenarios, many data augmentation methods have been proposed. The most common approaches for generating pseudo data rely on word omission, random sampling, or replacing some words in the text. However, these previous methods barely guarantee the quality of the augmented data. In this work, we build augmented data using paraphrase embeddings and POS tagging. Specifically, we generate a pseudo monolingual corpus by replacing words bearing the four main POS labels (noun, adjective, adverb, and verb), based on both a paraphrase table and word similarity. We select the larger word-level paraphrase table, obtain the embedding of each word in the table, and then compute the cosine similarity between these words and the tagged words in the original sequence. In addition, we exploit a ranking algorithm to choose highly similar words, which reduces semantic errors, and we restrict replacements to matching POS tags, which mitigates syntactic errors to some extent. Experimental results show that our augmentation method consistently outperforms all previous SOTA methods on seven low-resource language pairs from four corpora, by 1.16 to 2.39 BLEU points.
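The replacement step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paraphrase table, embedding vectors, and tag names below are toy assumptions standing in for a large paraphrase database and pretrained embeddings.

```python
import numpy as np

# Hypothetical toy data: a word-level paraphrase table and word embeddings.
PARAPHRASES = {"big": ["large", "huge", "major"]}
EMBEDDINGS = {
    "big":   np.array([0.9, 0.1, 0.0]),
    "large": np.array([0.85, 0.15, 0.05]),
    "huge":  np.array([0.6, 0.4, 0.2]),
    "major": np.array([0.2, 0.8, 0.3]),
}
REPLACEABLE_TAGS = {"NOUN", "ADJ", "ADV", "VERB"}  # the four main POS labels

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def augment(tagged_sentence):
    """Replace content words with their most similar paraphrase."""
    out = []
    for word, tag in tagged_sentence:
        candidates = PARAPHRASES.get(word, [])
        if tag in REPLACEABLE_TAGS and candidates and word in EMBEDDINGS:
            # Rank paraphrase candidates by cosine similarity to the original
            # word and keep the top one, reducing semantic errors.
            ranked = sorted(
                (c for c in candidates if c in EMBEDDINGS),
                key=lambda c: cosine(EMBEDDINGS[word], EMBEDDINGS[c]),
                reverse=True,
            )
            out.append(ranked[0] if ranked else word)
        else:
            out.append(word)
    return out

print(augment([("the", "DET"), ("big", "ADJ"), ("house", "NOUN")]))
# → ['the', 'large', 'house']
```

Because only same-POS paraphrases replace tagged words, the sentence's syntactic frame is preserved while the surface form varies.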


2021 · Vol ahead-of-print (ahead-of-print)
Author(s): Samir Al-Janabi, Ryszard Janicki

Purpose — Data quality is a major challenge in data management. For organizations, the cleanliness of data is a significant problem that affects many business activities. Errors in data occur for different reasons, such as violations of business rules. Because of the huge amount of data, however, manual cleaning alone is infeasible: methods are required to detect and repair dirty data automatically. The purpose of this work is to extend the density-based data cleaning approach with conditional functional dependencies to achieve better data repair.
Design/methodology/approach — A set of conditional functional dependencies is introduced as an input to the density-based data cleaning algorithm, which uses this set to repair inconsistent data.
Findings — The new approach was evaluated through experiments on real-world as well as synthetic datasets, with repair quality measured by the F-measure. The results showed that both the quality and the scalability of the density-based data cleaning approach improved when conditional functional dependencies were introduced.
Originality/value — Conditional functional dependencies capture semantic errors among data values. This work demonstrates that the density-based data cleaning approach can be improved in terms of repairing inconsistent data by using conditional functional dependencies.
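To make the role of conditional functional dependencies (CFDs) concrete, the sketch below detects CFD violations in a toy relation. The representation (a left-hand attribute set, a right-hand attribute, and a constant pattern with "_" as wildcard) and the example data are illustrative assumptions, not the paper's actual algorithm or datasets.

```python
def matches(row, pattern):
    # A row is selected by the pattern tableau if every constrained
    # attribute either is a wildcard or equals the required constant.
    return all(v == "_" or row[a] == v for a, v in pattern.items())

def cfd_violations(rows, lhs, rhs, pattern):
    """Return index pairs of rows that violate the CFD lhs -> rhs
    on the subset of rows selected by the pattern tableau."""
    selected = [(i, r) for i, r in enumerate(rows) if matches(r, pattern)]
    violations = []
    for a in range(len(selected)):
        for b in range(a + 1, len(selected)):
            i, r1 = selected[a]
            j, r2 = selected[b]
            # Equal left-hand-side values but different right-hand-side
            # values means the dependency is violated.
            if all(r1[x] == r2[x] for x in lhs) and r1[rhs] != r2[rhs]:
                violations.append((i, j))
    return violations

# Example CFD: within country "UK", the area code determines the city.
rows = [
    {"country": "UK", "area": "20", "city": "London"},
    {"country": "UK", "area": "20", "city": "Leeds"},     # inconsistent
    {"country": "NL", "area": "20", "city": "Amsterdam"}, # outside pattern
]
print(cfd_violations(rows, ["area"], "city", {"country": "UK"}))  # → [(0, 1)]
```

A repair step would then change one of the conflicting right-hand-side values (here, the city of row 0 or row 1); choosing which, e.g. by density of supporting tuples, is what the paper's algorithm addresses.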


K@ta Kita · 2021 · Vol 9 (2) · pp. 180-186
Author(s): Ellara Yusea Ananda, Henny Putri Saking Wijaya

This study was conducted to find out the types of lexical errors that high- and low-proficiency learners produced in their writing, and to identify the differences and similarities between the two groups' production. To answer the research questions, the writers used James's (2013) classification of lexical errors. This was a qualitative study, and the data sources were the lexical errors found in 32 writing drafts. The writers divided the students into two groups: eight students at the high proficiency level and eight students at the low proficiency level. The findings showed that the high-proficiency learners produced five types of formal errors and four types of semantic errors, while the low-proficiency learners produced two types of formal errors and six types of semantic errors. In conclusion, both groups' lexical errors stem from the learners' lack of knowledge of sense relations and collocation, and from their incorrect production of near-synonyms.
Keywords: lexical errors, high proficiency learners, low proficiency learners, error analysis


2021
Author(s): Zahra VaraminyBahnemiry, Jessie Galasso, Khalid Belharbi, Houari Sahraoui

Author(s): Yuli Yana, Hendi Mustofa, Lina Dwi Safitri

In this study, the researchers analyze language errors in the field of semantics. The subject of the research is the speeches of Nadiem Makarim (Minister of Education and Culture): the first speech concerns National Teacher's Day 2020, and the second concerns the commemoration of National Education Day 2021. The method used in this analysis is qualitative, meaning the analysis is based only on the facts contained in the minister's speeches. The data were collected by observing the linguistic elements in the speeches and systematically recording the language errors they contain. In the Mendikbud's speech commemorating National Teacher's Day 2020, there were seven semantic errors, namely in the words chain, laboratory, jump, stakeholder, walking, painter, and high. Meanwhile, in Nadiem Makarim's speech commemorating National Education Day 2021, there were also seven semantic errors, namely in the words sheet, grasping, breakthroughs, walking in place, leaps, fields, and stretched.


Author(s): Karima Boussaha, Farid Mokhati, Amira Hanneche

This article introduces a new learner self-assessment environment, a CEHL that allows learners' programs to be compared with those elaborated by the teacher. The underlying idea is to compare programs indirectly through graphical representations described by ontologies. The developed CEHL, called S_Onto_ALPPWA, compares learners' productions with the teacher's. The tool essentially (1) generates two ontologies, one from the learner's program and one from the teacher's, (2) applies matching algorithms to measure the degree of similarity and dissimilarity between the learner's program and the teacher's, and (3) assesses the learners by giving them a list of semantic and syntactic errors detected in their programs. The present work extends the authors' previous work, which did not take semantic errors into account; here, both syntactic and semantic errors are detected using ontologies. To demonstrate the effectiveness of the system, two prospective experiments were conducted, and the results obtained were very encouraging.
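Step (2) above can be illustrated with a very simple similarity measure between the concept sets of two ontologies. Real ontology matchers are far richer; Jaccard overlap of concept names is used here purely as an illustration under assumed concept names, not as the paper's actual matching algorithm.

```python
def jaccard_similarity(concepts_a, concepts_b):
    """Ratio of shared concepts to all concepts in either ontology."""
    a, b = set(concepts_a), set(concepts_b)
    if not a and not b:
        return 1.0  # two empty ontologies are trivially identical
    return len(a & b) / len(a | b)

# Hypothetical concept names extracted from a learner's program ontology
# and a teacher's program ontology.
learner = {"Loop", "Variable", "PrintCall", "Condition"}
teacher = {"Loop", "Variable", "PrintCall", "FunctionDef"}

similarity = jaccard_similarity(learner, teacher)
dissimilarity = 1.0 - similarity
print(similarity)  # 3 shared concepts / 5 total = 0.6
```

Concepts present in the teacher's ontology but absent from the learner's (here, the hypothetical `FunctionDef`) would be reported back to the learner as candidate errors in step (3).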


2021 · Vol 15
Author(s): Effrosyni Ntemou, Ann-Katrin Ohlerth, Sebastian Ille, Sandro M. Krieg, Roelien Bastiaanse, ...

Navigated Transcranial Magnetic Stimulation (nTMS) is used to understand the cortical organization of language in preparation for the surgical removal of a brain tumor. Action naming with finite verbs can be employed for that purpose, providing information additional to object naming. However, little research has focused on the properties of the verbs used in action naming tasks, such as their status as transitive (taking an object; e.g., to read) or intransitive (not taking an object; e.g., to wink). Previous neuroimaging data show higher activation for transitive than for intransitive verbs in posterior perisylvian regions bilaterally. In the present study, we employed nTMS and production of finite verbs to investigate the cortical underpinnings of transitivity. Twenty neurologically healthy native speakers of German participated in the study and underwent language mapping of both hemispheres with nTMS. The action naming task with finite verbs consisted of transitive (e.g., The man reads the book) and intransitive verbs (e.g., The woman winks) and was controlled for relevant psycholinguistic variables. Errors were classified into four categories (non-linguistic errors, grammatical errors, lexico-semantic errors, and errors at the sound level) and analyzed quantitatively. We found more nTMS-positive points in the left hemisphere, particularly in the left parietal lobe, for the production of transitive compared to intransitive verbs. These positive points most commonly corresponded to lexico-semantic errors. Our findings are in line with previous aphasia and neuroimaging studies, suggesting that a more widespread network is used for the production of verbs with a larger number of arguments (i.e., transitives). The higher number of lexico-semantic errors with transitive compared to intransitive verbs in the left parietal lobe supports previous claims for the role of left posterior areas in the retrieval of argument structure information.


2021 · Vol 2021 · pp. 1-10
Author(s): Yanning Chen

In recent years, most business communication settings have relied on manual simultaneous interpretation of business English or on electronic translation devices. In a context of diverse cultures, the way English is used, and its grammar, vary from country to country. Faced with this situation, how to optimize business English translation technology and improve the accuracy of business communication is a research focus for scholars worldwide. This paper first introduces the purpose of business English translation and the gap between business English translation and general English translation. Second, a genetic algorithm is used to optimize the structure of a BP (backpropagation) neural network; the combination of the two improves the ability of translation search. The paper compares the influence of the traditional BP algorithm and the genetically optimized BP algorithm on the construction of a business English translation model. The results show that the BP neural network optimized by the genetic algorithm can improve the speed of business English text translation, reduce the impact of semantic errors on the accuracy of the translation model, and improve the efficiency of translation.


2021 · Vol 102 (2) · pp. 141-149
Author(s): G. Akhmetova, A. Bizhkenova
The present research paper discusses the identification of common lexical and semantic mistakes made by Kazakh pre-intermediate EFL learners studying in homogeneous groups at university. Words are powerful tools: used correctly, they can evoke different feelings and emotions and prompt various actions. It is therefore important to learn how to spot difficult words, correct them in a timely manner, and build lexical competence by teaching correct word use. The data were collected by the learners' EFL teacher from their final essays; thirty-one essays were used as the study instrument to obtain authentic language from the participants. The authors hope that the results of the research will contribute to the understanding of lexical and semantic errors in English language teaching, helping teachers to elaborate differentiated tasks and ways of explaining new vocabulary that prevent students' misunderstanding. Furthermore, the results can serve as guidance and be used in compiling EFL textbooks for Kazakhstani students. The researchers classified the lexical and semantic errors, identified the frequent ones, and described their causes. The findings illustrate that the participants most often make errors of word choice and incorrect collocations; incorrect use of prepositions and literal translation from L1 are also among the frequent mistakes.


Logopedija · 2021 · Vol 10 (2) · pp. 78-83
Author(s): Katarina Pavičić Dokoza

Speed, accuracy, and types of errors during word processing in children with developmental language disorder (DLD) have often been the focus of lexical studies, and the results of these studies are uniform: children with DLD show slower and less accurate processing. Less is known about the speed and accuracy of verb processing. Therefore, the aim of this study is to explore whether the speed and accuracy of verb processing differ between children with DLD, their typically developing chronological-age peers (TDC), and younger, language-age-matched peers (TDC-y), with special attention to the types of errors produced. The participants in this pilot study were 30 children between the ages of 7;11 and 11 years: the average age of the children with DLD was 10;2, of the TDC children 9;9, and of the TDC-y children 8;1. In the research procedure, a stimulus word was presented auditorily, and the children's task was to choose which of three pictures presented on a computer screen represented the verb they had just heard. The results showed no statistically significant differences in speed or accuracy between the groups. The difference in the proportion of errors in the picture-selection task reached statistical significance neither for phonological nor for semantic errors. However, the proportion of phonological errors tended to be highest in the group of children with DLD, while the proportion of semantic errors was highest in the TDC-y group. According to these findings, it seems important to emphasize phonological exercises alongside vocabulary-focused exercises in work with children with DLD. The number of exposures to a new word can play a significant role in the processing speed of children with DLD, but it can also lead to overlearning that affects research outcomes; the children with DLD who participated in this study had been enrolled in speech and language therapy for several years, and future studies should, among other things, control for this variable.

