scholarly journals Designing and Validating an Annotation Model of Dynamic Modality for English and Spanish: Issues and Problems

10.29007/pc58 ◽  
2018 ◽  
Author(s):  
Julia Lavid ◽  
Marta Carretero ◽  
Juan Rafael Zamorano

In this paper we set forth an annotation model for dynamic modality in English and Spanish, given its relevance not only for contrastive linguistic purposes, but also for its impact on practical annotation tasks in the Natural Language Processing (NLP) community. An annotation scheme is proposed, which captures both the functional-semantic meanings and the language-specific realisations of dynamic meanings in both languages. The scheme is validated through a reliability study performed on a randomly selected set of one hundred and twenty sentences from the MULTINOT corpus, resulting in a high degree of inter-annotator agreement. We discuss our main findings and give attention to the difficult cases as they are currently being used to develop detailed guidelines for the large-scale annotation of dynamic modality in English and Spanish.

2021 ◽  
Author(s):  
Xinxu Shen ◽  
Troy Houser ◽  
David Victor Smith ◽  
Vishnu P. Murty

The use of naturalistic stimuli, such as narrative movies, is gaining popularity in many fields, characterizing memory, affect, and decision-making. Narrative recall paradigms are often used to capture the complexity and richness of memory for naturalistic events. However, scoring narrative recalls is time-consuming and prone to human biases. Here, we show the validity and reliability of using a natural language processing tool, the Universal Sentence Encoder (USE), to automatically score narrative recall. We compared the reliability in scoring made between two independent raters (i.e., hand-scored) and between our automated algorithm and individual raters (i.e., automated) on trial-unique, video clips of magic tricks. Study 1 showed that our automated segmentation approaches yielded high reliability and reflected measures yielded by hand-scoring, and further that the results using USE outperformed another popular natural language processing tool, GloVe. In study two, we tested whether our automated approach remained valid when testing individual’s varying on clinically-relevant dimensions that influence episodic memory, age and anxiety. We found that our automated approach was equally reliable across both age groups and anxiety groups, which shows the efficacy of our approach to assess narrative recall in large-scale individual difference analysis. In sum, these findings suggested that machine learning approaches implementing USE are a promising tool for scoring large-scale narrative recalls and perform individual difference analysis for research using naturalistic stimuli.


Author(s):  
Kaan Ant ◽  
Ugur Sogukpinar ◽  
Mehmet Fatif Amasyali

The use of databases those containing semantic relationships between words is becoming increasingly widespread in order to make natural language processing work more effective. Instead of the word-bag approach, the suggested semantic spaces give the distances between words, but they do not express the relation types. In this study, it is shown how semantic spaces can be used to find the type of relationship and it is compared with the template method. According to the results obtained on a very large scale, while is_a and opposite are more successful for semantic spaces for relations, the approach of templates is more successful in the relation types at_location, made_of and non relational.


2019 ◽  
Vol 6 ◽  
Author(s):  
Catharina Marie Stille ◽  
Trevor Bekolay ◽  
Peter Blouw ◽  
Bernd J. Kröger

Author(s):  
Subasish Das ◽  
Anandi Dutta ◽  
Tomas Lindheimer ◽  
Mohammad Jalayer ◽  
Zachary Elgart

The automotive industry is currently experiencing a revolution with the advent and deployment of autonomous vehicles. Several countries are conducting large-scale testing of autonomous vehicles on private and even public roads. It is important to examine the attitudes and potential concerns of end users towards autonomous cars before mass deployment. To facilitate the transition to autonomous vehicles, the automotive industry produces many videos on its products and technologies. The largest video sharing website, YouTube.com, hosts many videos on autonomous vehicle technology. Content analysis and text mining of the comments related to the videos with large numbers of views can provide insight about potential end-user feedback. This study examines two questions: first, how do people view autonomous vehicles? Second, what polarities exist regarding (a) content and (b) automation level? The researchers found 107 videos on YouTube using a related keyword search and examined comments on the 15 most-viewed videos, which had a total of 60.9 million views and around 25,000 comments. The videos were manually clustered based on their content and automation level. This study used two natural language processing (NLP) tools to perform knowledge discovery from a bag of approximately seven million words. The key issues in the comment threads were mostly associated with efficiency, performance, trust, comfort, and safety. The perception of safety and risk increased in the textual contents when videos presented full automation level. Sentiment analysis shows mixed sentiments towards autonomous vehicle technologies, however, the positive sentiments were higher than the negative.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Jennifer D’Souza ◽  
Sören Auer

Abstract Purpose This work aims to normalize the NlpContributions scheme (henceforward, NlpContributionGraph) to structure, directly from article sentences, the contributions information in Natural Language Processing (NLP) scholarly articles via a two-stage annotation methodology: 1) pilot stage—to define the scheme (described in prior work); and 2) adjudication stage—to normalize the graphing model (the focus of this paper). Design/methodology/approach We re-annotate, a second time, the contributions-pertinent information across 50 prior-annotated NLP scholarly articles in terms of a data pipeline comprising: contribution-centered sentences, phrases, and triple statements. To this end, specifically, care was taken in the adjudication annotation stage to reduce annotation noise while formulating the guidelines for our proposed novel NLP contributions structuring and graphing scheme. Findings The application of NlpContributionGraph on the 50 articles resulted finally in a dataset of 900 contribution-focused sentences, 4,702 contribution-information-centered phrases, and 2,980 surface-structured triples. The intra-annotation agreement between the first and second stages, in terms of F1-score, was 67.92% for sentences, 41.82% for phrases, and 22.31% for triple statements indicating that with increased granularity of the information, the annotation decision variance is greater. Research limitations NlpContributionGraph has limited scope for structuring scholarly contributions compared with STEM (Science, Technology, Engineering, and Medicine) scholarly knowledge at large. Further, the annotation scheme in this work is designed by only an intra-annotator consensus—a single annotator first annotated the data to propose the initial scheme, following which, the same annotator reannotated the data to normalize the annotations in an adjudication stage. However, the expected goal of this work is to achieve a standardized retrospective model of capturing NLP contributions from scholarly articles. This would entail a larger initiative of enlisting multiple annotators to accommodate different worldviews into a “single” set of structures and relationships as the final scheme. Given that the initial scheme is first proposed and the complexity of the annotation task in the realistic timeframe, our intra-annotation procedure is well-suited. Nevertheless, the model proposed in this work is presently limited since it does not incorporate multiple annotator worldviews. This is planned as future work to produce a robust model. Practical implications We demonstrate NlpContributionGraph data integrated into the Open Research Knowledge Graph (ORKG), a next-generation KG-based digital library with intelligent computations enabled over structured scholarly knowledge, as a viable aid to assist researchers in their day-to-day tasks. Originality/value NlpContributionGraph is a novel scheme to annotate research contributions from NLP articles and integrate them in a knowledge graph, which to the best of our knowledge does not exist in the community. Furthermore, our quantitative evaluations over the two-stage annotation tasks offer insights into task difficulty.


Sign in / Sign up

Export Citation Format

Share Document