Creating a grammar checker for CALL by constraint relaxation: a feasibility study

ReCALL ◽  
2001 ◽  
Vol 13 (1) ◽  
pp. 110-120 ◽  
Author(s):  
ANNE VANDEVENTER

Intelligent feedback on learners’ full written sentence productions requires the use of Natural Language Processing (NLP) tools and, in particular, of a diagnosis system. Most syntactic parsers, on which grammar checkers are based, are designed to parse grammatical sentences and/or native speaker productions. They are therefore not necessarily suitable for language learners. In this paper, we concentrate on the transformation of a French syntactic parser into a grammar checker geared towards intermediate to advanced learners of French. Several techniques are envisaged to allow the parser to handle ill-formed input, including constraint relaxation. By the very nature of this technique, parsers can generate complete analyses for ungrammatical sentences. Proper labelling of the points where the analysis was able to proceed thanks to a specific constraint relaxation forms the basis of the error diagnosis. Parsers with relaxed constraints tend to produce more complete, although incorrect, analyses for grammatical sentences, and several complete analyses for ungrammatical sentences. This increased number of analyses per sentence has one major drawback: it slows down the system and requires more memory. An experiment was conducted to observe the behaviour of our parser in the context of constraint relaxation. Three specific constraints, agreement in number, gender, and person, were selected and relaxed in different combinations. A learner corpus was parsed with each combination. The evolution of the number of correct diagnoses and of parsing speed, among other factors, was monitored. We then evaluated, by comparing the results, whether large-scale constraint relaxation is a viable option for transforming our syntactic parser into an efficient grammar checker for CALL.
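As an illustration of the general mechanism only (not the authors' actual French parser or grammar formalism), a minimal sketch of constraint relaxation over agreement features might look like the following; the feature structures, constraint names, and diagnosis strings are hypothetical.

```python
# Illustrative sketch of constraint relaxation for agreement checking.
# The feature structures and constraint names are hypothetical; the paper's
# parser and grammar formalism are not reproduced here.

AGREEMENT_FEATURES = ("number", "gender", "person")

def combine(head, dependent, relax=frozenset()):
    """Try to combine two constituents under agreement constraints.

    Constraints listed in `relax` may be violated; each violation is
    recorded as a diagnosis instead of blocking the analysis.
    """
    diagnoses = []
    for feat in AGREEMENT_FEATURES:
        h, d = head.get(feat), dependent.get(feat)
        if h is not None and d is not None and h != d:
            if feat in relax:
                diagnoses.append(f"{feat} agreement violated: {h} vs {d}")
            else:
                return None  # hard failure: no analysis at all
    return {"analysis": (head, dependent), "diagnoses": diagnoses}

# "les petit chats": the determiner is plural, the adjective wrongly singular.
det = {"number": "plural", "gender": "masculine"}
adj = {"number": "singular", "gender": "masculine"}

print(combine(det, adj))                    # None: strict parsing fails
print(combine(det, adj, relax={"number"}))  # complete analysis + number diagnosis
```

Relaxing a constraint thus trades strict rejection for a complete analysis annotated with the violations that made it possible, which is precisely the information an error diagnosis needs.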

2012 ◽  
Vol 2 (3) ◽  
pp. 95-124 ◽  
Author(s):  
Bor HODOŠČEK ◽  
Kikuko NISHINA

In this report, we introduce the Hinoki project, which set out to develop web-based Computer-Assisted Language Learning (CALL) systems for Japanese language learners more than a decade ago. Utilizing Natural Language Processing technologies and other linguistic resources, the project has come to encompass three systems, two corpora and many other resources. Beginning with the reading assistance system Asunaro, we describe the construction of Asunaro's multilingual dictionary and its dependency grammar-based approach to reading assistance. The second system, Natsume, is a writing assistance system that uses large-scale corpora to provide an easy-to-use collocation search feature, notable for its inclusion of the concept of genre. The final system, Nutmeg, is an extension of Natsume and the Natane learner corpus. It provides automatic correction of learners' errors in compositions, drawing on Natsume for its large corpus and genre-aware collocation data and on Natane for its data on learner errors.


2021 ◽  
Author(s):  
Xinxu Shen ◽  
Troy Houser ◽  
David Victor Smith ◽  
Vishnu P. Murty

The use of naturalistic stimuli, such as narrative movies, is gaining popularity in many fields to characterize memory, affect, and decision-making. Narrative recall paradigms are often used to capture the complexity and richness of memory for naturalistic events. However, scoring narrative recalls is time-consuming and prone to human biases. Here, we show the validity and reliability of using a natural language processing tool, the Universal Sentence Encoder (USE), to automatically score narrative recall. We compared the reliability of scoring between two independent raters (i.e., hand-scored) and between our automated algorithm and individual raters (i.e., automated) on trial-unique video clips of magic tricks. Study 1 showed that our automated segmentation approaches yielded high reliability and reflected measures yielded by hand-scoring, and further that results using USE outperformed another popular natural language processing tool, GloVe. In Study 2, we tested whether our automated approach remained valid when testing individuals varying on clinically relevant dimensions that influence episodic memory: age and anxiety. We found that our automated approach was equally reliable across both age groups and anxiety groups, demonstrating the efficacy of our approach for assessing narrative recall in large-scale individual difference analyses. In sum, these findings suggest that machine learning approaches implementing USE are a promising tool for scoring large-scale narrative recalls and performing individual difference analyses in research using naturalistic stimuli.
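As a rough sketch of how USE-based scoring can work (assuming the publicly available TensorFlow Hub module, not necessarily the exact model configuration or scoring pipeline used in the study), a recall transcript can be embedded alongside a reference description of the clip and compared with cosine similarity; the texts below are invented.

```python
# Sketch: score a narrative recall against a reference description with the
# Universal Sentence Encoder (public TF-Hub module; illustrative texts only).
import numpy as np
import tensorflow_hub as hub

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

reference = "The magician places a coin under a cup and it vanishes."
recall = "He hid a coin under the cup, and when he lifted it the coin was gone."

vectors = embed([reference, recall]).numpy()
cosine = float(
    np.dot(vectors[0], vectors[1])
    / (np.linalg.norm(vectors[0]) * np.linalg.norm(vectors[1]))
)
print(f"Recall similarity score: {cosine:.3f}")
```

Such similarity scores can then be correlated with hand-scored ratings to estimate reliability.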


10.29007/pc58 ◽  
2018 ◽  
Author(s):  
Julia Lavid ◽  
Marta Carretero ◽  
Juan Rafael Zamorano

In this paper we set forth an annotation model for dynamic modality in English and Spanish, given its relevance not only for contrastive linguistic purposes but also for practical annotation tasks in the Natural Language Processing (NLP) community. An annotation scheme is proposed which captures both the functional-semantic meanings and the language-specific realisations of dynamic meanings in both languages. The scheme is validated through a reliability study performed on a randomly selected set of one hundred and twenty sentences from the MULTINOT corpus, resulting in a high degree of inter-annotator agreement. We discuss our main findings and pay special attention to the difficult cases, which are currently being used to develop detailed guidelines for the large-scale annotation of dynamic modality in English and Spanish.
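A short sketch of the kind of reliability computation involved, using scikit-learn's Cohen's kappa on hypothetical labels; the label set and sentences shown here are invented and need not match the MULTINOT annotation scheme or the agreement statistic actually reported.

```python
# Sketch: inter-annotator agreement on hypothetical dynamic-modality labels.
from sklearn.metrics import cohen_kappa_score

# One label per sentence from each of two annotators (invented data).
annotator_a = ["ability", "volition", "ability", "necessity", "volition", "ability"]
annotator_b = ["ability", "volition", "necessity", "necessity", "volition", "ability"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")
```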


Author(s):  
Kaan Ant ◽  
Ugur Sogukpinar ◽  
Mehmet Fatih Amasyali

The use of databases containing semantic relationships between words is becoming increasingly widespread as a way of making natural language processing more effective. Unlike the bag-of-words approach, semantic spaces give the distances between words, but they do not express the type of the relation. In this study, it is shown how semantic spaces can be used to find the type of relationship, and the approach is compared with the template method. According to the results obtained on a very large scale, semantic spaces are more successful for the is_a and opposite relations, while the template approach is more successful for the at_location, made_of, and non-relational types.
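A toy sketch of the contrast described above: in a semantic (vector) space, a relation type can be guessed by comparing a word pair's offset vector against labelled example pairs, whereas the template approach matches lexical patterns in text. The vectors, example pairs, and patterns below are invented purely for illustration; real systems would use pretrained embeddings and large template inventories.

```python
# Toy sketch: relation typing from word-pair offsets in a semantic space
# vs. a lexical template. All vectors and patterns are invented.
import re
import numpy as np

VEC = {  # tiny hand-made "semantic space"
    "dog": np.array([0.9, 0.1, 0.0]),  "animal": np.array([0.7, 0.2, 0.1]),
    "car": np.array([0.1, 0.9, 0.0]),  "vehicle": np.array([-0.1, 1.0, 0.15]),
    "hot": np.array([0.0, 0.1, 0.9]),  "cold": np.array([0.0, 0.1, -0.9]),
}

EXAMPLES = {"is_a": ("dog", "animal"), "opposite": ("hot", "cold")}

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def relation_by_offset(w1, w2):
    """Pick the relation whose example pair has the most similar offset vector."""
    offset = VEC[w2] - VEC[w1]
    return max(EXAMPLES,
               key=lambda r: cos(offset, VEC[EXAMPLES[r][1]] - VEC[EXAMPLES[r][0]]))

def relation_by_template(sentence):
    """Template approach: match an explicit lexical pattern in text."""
    if re.search(r"\b(\w+) is a kind of (\w+)\b", sentence):
        return "is_a"
    if re.search(r"\b(\w+) is located in (\w+)\b", sentence):
        return "at_location"
    return "non_relational"

print(relation_by_offset("car", "vehicle"))                # is_a (offset resembles dog->animal)
print(relation_by_template("a car is a kind of vehicle"))  # is_a (pattern match)
```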


2019 ◽  
Vol 6 ◽  
Author(s):  
Catharina Marie Stille ◽  
Trevor Bekolay ◽  
Peter Blouw ◽  
Bernd J. Kröger


Author(s):
Subasish Das ◽  
Anandi Dutta ◽  
Tomas Lindheimer ◽  
Mohammad Jalayer ◽  
Zachary Elgart

The automotive industry is currently experiencing a revolution with the advent and deployment of autonomous vehicles. Several countries are conducting large-scale testing of autonomous vehicles on private and even public roads. It is important to examine the attitudes and potential concerns of end users towards autonomous cars before mass deployment. To facilitate the transition to autonomous vehicles, the automotive industry produces many videos on its products and technologies. The largest video sharing website, YouTube.com, hosts many videos on autonomous vehicle technology. Content analysis and text mining of the comments on the most-viewed videos can provide insight into potential end-user feedback. This study examines two questions: first, how do people view autonomous vehicles? Second, what polarities exist regarding (a) content and (b) automation level? The researchers found 107 videos on YouTube using a related keyword search and examined comments on the 15 most-viewed videos, which had a total of 60.9 million views and around 25,000 comments. The videos were manually clustered based on their content and automation level. This study used two natural language processing (NLP) tools to perform knowledge discovery from a bag of approximately seven million words. The key issues in the comment threads were mostly associated with efficiency, performance, trust, comfort, and safety. Perceptions of safety and risk were more prominent in the textual content when videos presented a full automation level. Sentiment analysis shows mixed sentiments towards autonomous vehicle technologies; however, positive sentiments were higher than negative ones.
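As an illustration of the kind of comment-level sentiment analysis described (the abstract does not name the study's actual toolchain; NLTK's VADER analyzer is used here only as a stand-in, and the comments are invented):

```python
# Sketch: polarity of YouTube-style comments with NLTK's VADER analyzer.
# The comments below are invented; the study's actual NLP tools may differ.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

comments = [
    "I would trust this car more than most human drivers, amazing tech.",
    "No way I'm letting a computer drive my kids around, way too risky.",
]
for comment in comments:
    scores = sia.polarity_scores(comment)
    print(f"{scores['compound']:+.2f}  {comment}")
```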


Author(s):  
Yan Huang ◽  
Akira Murakami ◽  
Theodora Alexopoulou ◽  
Anna Korhonen

As large-scale learner corpora become increasingly available, it is vital that natural language processing (NLP) technology is developed to provide rich linguistic annotations necessary for second language (L2) research. We present a system for automatically analyzing subcategorization frames (SCFs) for learner English. SCFs link lexis with morphosyntax, shedding light on the interplay between lexical and structural information in learner language. Meanwhile, SCFs are crucial to the study of a wide range of phenomena including individual verbs, verb classes and varying syntactic structures. To illustrate the usefulness of our system for learner corpus research and second language acquisition (SLA), we investigate how L2 learners diversify their use of SCFs in text and how this diversity changes with L2 proficiency.
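A rough sketch of what SCF extraction can look like with an off-the-shelf dependency parser (spaCy is used here as a stand-in; this is not the authors' system, and real SCF inventories are far more fine-grained than the dependency labels collected below):

```python
# Sketch: approximate a verb's subcategorization frame from a dependency parse.
# Requires spaCy and the en_core_web_sm model; the frame labels are simplified
# and do not reproduce the authors' SCF inventory.
import spacy

nlp = spacy.load("en_core_web_sm")
FRAME_DEPS = {"nsubj", "dobj", "iobj", "dative", "prep", "ccomp", "xcomp"}

def scfs(sentence):
    doc = nlp(sentence)
    frames = {}
    for token in doc:
        if token.pos_ == "VERB":
            deps = sorted(child.dep_ for child in token.children
                          if child.dep_ in FRAME_DEPS)
            frames[token.lemma_] = "+".join(deps) or "intransitive"
    return frames

print(scfs("She gave the student a book."))  # e.g. {'give': 'dative+dobj+nsubj'}
print(scfs("They decided to leave early."))  # e.g. {'decide': 'nsubj+xcomp', 'leave': 'intransitive'}
```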

