multiple annotators
Recently Published Documents


TOTAL DOCUMENTS: 27 (FIVE YEARS: 12)
H-INDEX: 5 (FIVE YEARS: 1)

Terminology ◽  
2021 ◽  
Author(s):  
Oi Yee Kwong

Abstract In this paper, we address the system evaluation issue for commercial term extraction tools from the users’ perspective. We first revisit the gold standard approach commonly practised among researchers and discuss the challenges it may pose to end users, taking translators as a typical example. Considering the very different motivations and needs of users and researchers, a user-driven approach is proposed as a variation on and alternative to the gold standard approach, allowing users to assess and understand the performance of commercial tools more objectively. Its feasibility and usefulness are demonstrated by deploying a benchmarking dataset of English-Chinese financial terms, produced by multiple annotators, in a case study with SDL MultiTerm Extract. The results also provide insight for the future development of term extractors designed for translators, which will hopefully generate more accurate candidates, offer more customised features, enable a better user experience, and enjoy wider popularity as a computer-aided translation tool.
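To make the gold-standard comparison the paper starts from concrete, here is a minimal sketch of scoring a tool's extracted term candidates against a benchmark term list. The file names and the exact-match, case-insensitive matching rule are assumptions for illustration only, not the paper's evaluation protocol.

```python
# Hypothetical sketch: precision/recall of a term extractor's candidates
# against a benchmark term list built by annotators.

def load_terms(path):
    """Read one term per line, normalised to lower case."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def precision_recall(candidates, benchmark):
    """Score extracted candidates against a reference set of terms."""
    hits = candidates & benchmark
    precision = len(hits) / len(candidates) if candidates else 0.0
    recall = len(hits) / len(benchmark) if benchmark else 0.0
    return precision, recall

if __name__ == "__main__":
    candidates = load_terms("multiterm_candidates.txt")      # tool output (assumed file)
    benchmark = load_terms("financial_terms_benchmark.txt")  # annotator-built list (assumed file)
    p, r = precision_recall(candidates, benchmark)
    print(f"precision={p:.2%} recall={r:.2%}")
```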


2021 ◽  
Vol 11 (12) ◽  
pp. 5409
Author(s):  
Julián Gil-González ◽  
Andrés Valencia-Duque ◽  
Andrés Álvarez-Meza ◽  
Álvaro Orozco-Gutiérrez ◽  
Andrea García-Moreno

The increasing popularity of crowdsourcing platforms, e.g., Amazon Mechanical Turk, changes how datasets for supervised learning are built. In these cases, instead of datasets labeled by a single source (assumed to be an expert providing the absolute gold standard), databases holding labels from multiple annotators are provided. However, most state-of-the-art methods devoted to learning from multiple experts assume that the labelers’ behavior is homogeneous across the input feature space. Moreover, independence constraints are imposed on the annotators’ outputs. This paper presents a regularized chained deep neural network for classification tasks with multiple annotators. The introduced method, termed RCDNN, jointly predicts the ground truth label and the annotators’ performance from input space samples. In turn, RCDNN encodes interdependencies among the experts by analyzing the layers’ weights and includes l1, l2, and Monte-Carlo Dropout-based regularizers to deal with over-fitting in deep learning models. The obtained results (using both simulated and real-world annotators) demonstrate that RCDNN can handle multi-labeler classification scenarios, outperforming state-of-the-art techniques.
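The following is a minimal sketch of the general idea of jointly predicting a latent ground-truth class and per-annotator reliabilities from the same input, trained directly on noisy labels. It is a generic stand-in, not the authors' RCDNN; the architecture, the reliability-weighted loss, and the regularization weight are assumptions, with missing labels marked as -1.

```python
# Minimal multi-annotator classification sketch (not the authors' RCDNN).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiAnnotatorNet(nn.Module):
    def __init__(self, in_dim, n_classes, n_annotators, hidden=64, p_drop=0.2):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Dropout(p_drop),                      # Monte-Carlo-dropout-style regularisation
        )
        self.class_head = nn.Linear(hidden, n_classes)      # latent ground-truth class
        self.reliab_head = nn.Linear(hidden, n_annotators)  # per-annotator, per-sample reliability

    def forward(self, x):
        h = self.backbone(x)
        return self.class_head(h), torch.sigmoid(self.reliab_head(h))

def noisy_label_loss(logits, reliab, labels, reg_weight=0.01):
    """Cross-entropy against each annotator's label, weighted by that
    annotator's predicted reliability for the sample; -1 marks missing labels."""
    log_probs = F.log_softmax(logits, dim=-1)
    total, count = logits.new_zeros(()), 0
    for r in range(labels.shape[1]):                 # loop over annotators
        mask = labels[:, r] >= 0
        if mask.any():
            nll = F.nll_loss(log_probs[mask], labels[mask, r], reduction="none")
            total = total + (reliab[mask, r] * nll).mean()
            count += 1
    reg = (1.0 - reliab).pow(2).mean()               # keeps reliabilities from collapsing to zero
    return total / max(count, 1) + reg_weight * reg
```

At inference time, the class head alone gives the estimated ground-truth label, and repeated stochastic forward passes with dropout active yield a Monte-Carlo estimate of predictive uncertainty.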


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Jennifer D’Souza ◽  
Sören Auer

Abstract
Purpose: This work aims to normalize the NlpContributions scheme (henceforth NlpContributionGraph) to structure, directly from article sentences, the contributions information in Natural Language Processing (NLP) scholarly articles via a two-stage annotation methodology: 1) a pilot stage to define the scheme (described in prior work); and 2) an adjudication stage to normalize the graphing model (the focus of this paper).
Design/methodology/approach: We re-annotate, a second time, the contributions-pertinent information across 50 previously annotated NLP scholarly articles as a data pipeline comprising contribution-centered sentences, phrases, and triple statements. Specifically, care was taken in the adjudication annotation stage to reduce annotation noise while formulating the guidelines for our proposed novel NLP contributions structuring and graphing scheme.
Findings: Applying NlpContributionGraph to the 50 articles resulted in a dataset of 900 contribution-focused sentences, 4,702 contribution-information-centered phrases, and 2,980 surface-structured triples. The intra-annotation agreement between the first and second stages, in terms of F1 score, was 67.92% for sentences, 41.82% for phrases, and 22.31% for triple statements, indicating that annotation decision variance grows with the granularity of the information.
Research limitations: NlpContributionGraph has limited scope for structuring scholarly contributions compared with STEM (Science, Technology, Engineering, and Medicine) scholarly knowledge at large. Further, the annotation scheme in this work rests only on intra-annotator consensus: a single annotator first annotated the data to propose the initial scheme and then re-annotated the data to normalize the annotations in an adjudication stage. The eventual goal of this work, however, is a standardized retrospective model for capturing NLP contributions from scholarly articles, which would entail a larger initiative enlisting multiple annotators to accommodate different worldviews in a single final set of structures and relationships. Given that the initial scheme was only being proposed here, and considering the complexity of the annotation task within a realistic timeframe, our intra-annotator procedure was well suited; nevertheless, the model proposed in this work is presently limited in that it does not incorporate multiple annotator worldviews, which is planned as future work to produce a robust model.
Practical implications: We demonstrate NlpContributionGraph data integrated into the Open Research Knowledge Graph (ORKG), a next-generation KG-based digital library with intelligent computations enabled over structured scholarly knowledge, as a viable aid to assist researchers in their day-to-day tasks.
Originality/value: NlpContributionGraph is a novel scheme to annotate research contributions from NLP articles and integrate them in a knowledge graph, which to the best of our knowledge does not exist in the community. Furthermore, our quantitative evaluations over the two-stage annotation tasks offer insights into task difficulty.
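As an illustration of the agreement numbers reported above, here is a small sketch of pairwise agreement between two annotation rounds as an F1 score over sets of annotated units. Treating sentences, phrases, or (subject, predicate, object) triples as exact-match items is an assumption; the paper's matching criteria may differ, and the example triples are invented.

```python
# Illustrative sketch: agreement between two annotation rounds as an F1 score.

def agreement_f1(round1, round2):
    """F1 over exact-match overlap between two sets of annotated units."""
    round1, round2 = set(round1), set(round2)
    if not round1 or not round2:
        return 0.0
    overlap = len(round1 & round2)
    if overlap == 0:
        return 0.0
    precision = overlap / len(round2)
    recall = overlap / len(round1)
    return 2 * precision * recall / (precision + recall)

# Hypothetical contribution triples from a pilot and an adjudication round:
pilot = {("Model", "evaluatedOn", "simulated annotators"),
         ("Model", "hasRegularizer", "Monte-Carlo dropout")}
adjudicated = {("Model", "evaluatedOn", "simulated annotators"),
               ("Model", "uses", "chained deep neural network")}
print(f"triple-level agreement F1: {agreement_f1(pilot, adjudicated):.2%}")
```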


Author(s):  
J. Gil-Gonzalez ◽  
Juan-Jose Giraldo ◽  
A. M. Alvarez-Meza ◽  
A. Orozco-Gutierrez ◽  
M. A. Alvarez

2020 ◽  
Vol 34 (03) ◽  
pp. 2602-2610
Author(s):  
Ian Beaver ◽  
Cynthia Freeman ◽  
Abdullah Mueen

As Intelligent Virtual Agents (IVAs) increase in adoption and further emulate human personalities, we are interested in how humans apply relational strategies to them compared to other humans in a service environment. Human-computer data from three live customer service IVAs was collected, and annotators marked all text that was deemed unnecessary to the determination of user intention, as well as the presence of multiple intents. After merging the selections of multiple annotators, a second round of annotation determined the classes of relational language present in the unnecessary sections, such as Greetings, Backstory, Justification, Gratitude, Rants, or Expressing Emotions. We compare this usage to that of relational language in human-human service interactions. We show that removing this language from task-based inputs has a positive effect, both increasing confidence and improving responses as evaluated by humans, demonstrating the need for IVAs to anticipate relational language injection. This work provides a methodology for identifying relational segments and a baseline of human performance on this task, and it lays the groundwork for IVAs to reciprocate relational strategies in order to improve their believability.
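A minimal sketch of the pre-processing step described above: given annotated relational spans, drop them from a user utterance before it reaches the intent classifier. The function name, the label set, and the character-offset span format are illustrative assumptions, not the authors' data format.

```python
# Hypothetical sketch: strip annotated relational segments from an utterance
# so only the task-relevant text is passed on to the intent classifier.

RELATIONAL_CLASSES = {"Greeting", "Backstory", "Justification",
                      "Gratitude", "Rant", "ExpressingEmotion"}

def strip_relational(text, annotations):
    """annotations: list of (start, end, label) character spans to drop."""
    keep, cursor = [], 0
    for start, end, label in sorted(annotations):
        if label in RELATIONAL_CLASSES:
            keep.append(text[cursor:start])
            cursor = end
    keep.append(text[cursor:])
    return " ".join(part.strip() for part in keep if part.strip())

utterance = "Hi there! I've been a customer for years. I need to reset my password. Thanks!"
spans = [(0, 9, "Greeting"), (10, 41, "Backstory"), (71, 78, "Gratitude")]
print(strip_relational(utterance, spans))  # -> "I need to reset my password."
```

The stripped text would then be handed to whatever intent model the IVA uses, and its confidence compared against the confidence on the raw input.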


2020 ◽  
Author(s):  
Diego L Guarin ◽  
Babak Taati ◽  
Tessa Hadlock ◽  
Yana Yunusova

Abstract
Background: Automatic facial landmark localization in videos is an important first step in many computer vision applications, including the objective assessment of orofacial function. Convolutional neural networks (CNNs) for facial landmark localization are typically trained on faces of healthy young adults, so model performance is inferior when applied to faces of older adults or people with diseases that affect facial movements, a phenomenon known as algorithmic bias. Fine-tuning pre-trained CNN models with representative data is a well-known technique used to reduce algorithmic bias and improve performance on clinical populations. However, the question of how much data is needed to properly fine-tune the model remains open.
Methods: In this paper, we fine-tuned a popular CNN model for automatic facial landmark localization using different numbers of manually annotated photographs from patients with facial palsy, and evaluated the effect of the number of photographs used for fine-tuning on model performance by computing the normalized root mean squared error between the facial landmark positions predicted by the model and those provided by manual annotators. Furthermore, we studied the effect of annotator bias by fine-tuning and evaluating the model with data provided by multiple annotators.
Results: Our results showed that fine-tuning the model with as few as 8 photographs from a single patient significantly improved model performance on other individuals from the same clinical population, and that the best performance was achieved by fine-tuning the model with 320 photographs from 40 patients; using more photographs for fine-tuning did not improve performance further. Regarding annotator bias, we found that fine-tuning a CNN model with data from one annotator resulted in models biased against other annotators; our results also showed that this effect can be diminished by averaging data from multiple annotators.
Conclusions: It is possible to remove the algorithmic bias of a deep CNN model for automatic facial landmark localization using data from only 40 participants (320 photographs in total). These results pave the way for future clinical applications of CNN models for the automatic assessment of orofacial function in different clinical populations, including patients with Parkinson’s disease and stroke.
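Below is a small sketch of the two evaluation ingredients mentioned above: a normalized root mean squared landmark error, and averaging landmark sets from several annotators to dampen annotator bias. Landmark arrays of shape (N, 68, 2) and normalization by a per-image inter-ocular distance are assumptions; the paper may use a different landmark count or normalization.

```python
# Minimal sketch of landmark evaluation and annotator averaging (assumed data layout).
import numpy as np

def nrmse(pred, target, norm):
    """Per-image root-mean-square point distance, normalised and averaged over images."""
    sq_dist = ((pred - target) ** 2).sum(axis=-1)   # (N, 68) squared point distances
    rms = np.sqrt(sq_dist.mean(axis=-1))            # (N,) RMS distance per image
    return (rms / norm).mean()

def average_annotators(annotations):
    """Average landmark sets from several annotators; annotations is a list of (N, 68, 2) arrays."""
    return np.mean(np.stack(annotations, axis=0), axis=0)

# Hypothetical usage with synthetic coordinates standing in for real annotations:
rng = np.random.default_rng(0)
manual_a = rng.normal(size=(320, 68, 2))                              # annotator A
manual_b = manual_a + rng.normal(scale=0.05, size=manual_a.shape)     # annotator B
consensus = average_annotators([manual_a, manual_b])
predicted = consensus + rng.normal(scale=0.1, size=consensus.shape)   # model output stand-in
inter_ocular = np.full(320, 1.0)                                      # per-image normaliser (assumed)
print(f"NRMSE: {nrmse(predicted, consensus, inter_ocular):.3f}")
```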


2020 ◽  
Vol 34 (2) ◽  
pp. 143-164 ◽  
Author(s):  
Tobias Baur ◽  
Alexander Heimerl ◽  
Florian Lingenfelser ◽  
Johannes Wagner ◽  
Michel F. Valstar ◽  
...  

Abstract In this article, we introduce a novel workflow, which we subsume under the term “explainable cooperative machine learning”, and show its practical application in a data annotation and model training tool called NOVA. The main idea of our approach is to interactively incorporate the ‘human in the loop’ when training classification models from annotated data. In particular, NOVA offers a collaborative annotation backend where multiple annotators join their workforce. A central aspect is the possibility of applying semi-supervised active learning techniques already during the annotation process, by making it possible to pre-label data automatically, resulting in a drastic acceleration of the annotation process. Furthermore, the user interface implements recent eXplainable AI techniques to provide users with both a confidence value for the automatically predicted annotations and a visual explanation. We show in a use-case evaluation that our workflow is able to speed up the annotation process, and we further argue that by providing additional visual explanations, annotators come to understand the decision-making process, as well as the trustworthiness, of their trained machine learning models.
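The following is an illustrative sketch of the pre-labelling idea: a model trained on the frames annotated so far predicts labels and confidences for the remaining data, and only low-confidence segments are routed back to the human annotators. This is a generic scikit-learn stand-in, not NOVA's actual pipeline; the classifier choice and the 0.8 threshold are assumed values.

```python
# Illustrative cooperative pre-labelling sketch (generic stand-in, not NOVA's pipeline).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def prelabel(features_labeled, labels, features_unlabeled, threshold=0.8):
    """Train on annotated frames, pre-label the rest, flag low-confidence frames for review."""
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(features_labeled, labels)

    probs = model.predict_proba(features_unlabeled)   # per-class confidence
    confidence = probs.max(axis=1)
    predicted = model.classes_[probs.argmax(axis=1)]

    accepted = confidence >= threshold                # auto-filled annotations
    to_review = np.where(~accepted)[0]                # frames the human should check
    return predicted, confidence, to_review

# Hypothetical usage with random features standing in for audio/video descriptors:
rng = np.random.default_rng(1)
X_lab, y_lab = rng.normal(size=(200, 16)), rng.integers(0, 3, size=200)
X_unl = rng.normal(size=(1000, 16))
pred, conf, review_idx = prelabel(X_lab, y_lab, X_unl)
print(f"{len(review_idx)} of {len(X_unl)} frames need manual review")
```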

