text annotation
Recently Published Documents


TOTAL DOCUMENTS

98
(FIVE YEARS 45)

H-INDEX

9
(FIVE YEARS 2)

Author(s):  
Darshita Kumar ◽  
Kshitija Choudhari ◽  
Pooja Patel ◽  
Shambhavi Pandey ◽  
Aparna Hajare ◽  
...  

2021 ◽  
Author(s):  
Xuehong Wu ◽  
Junwen Duan ◽  
Jianhua Li ◽  
Min Li

2021 ◽  
Vol 11 (20) ◽  
pp. 9648
Author(s):  
Alexandros Kanterakis ◽  
Nikos Kanakaris ◽  
Manos Koutoulakis ◽  
Konstantina Pitianou ◽  
Nikos Karacapilidis ◽  
...  

Today, there are excellent resources for the semantic annotation of biomedical text. These resources range from ontologies and NLP tools to annotators and web services. Most are available either as open-source components (e.g., MetaMap) or as web services that offer free access (e.g., Whatizit). To use these resources in automatic text annotation pipelines, researchers face significant technical challenges. For open-source tools, the challenges include setting up the computational environment, resolving dependencies, and compiling and installing the software. For web services, the challenge is implementing clients that communicate with the respective web APIs. Even resources that are available as Docker containers (e.g., NCBO annotator) require significant technical skills for installation and setup. This work deals with the task of creating ready-to-install-and-run Research Objects (ROs) for a large collection of components in biomedical text analysis. These components include (a) tools such as cTAKES, NOBLE Coder, MetaMap, NCBO annotator, BeCAS, and Neji; (b) ontologies from BioPortal, NCBI BioSystems, and Open Biomedical Ontologies; and (c) text corpora such as BC4GO, Mantra Gold Standard Corpus, and the COVID-19 Open Research Dataset. We make these resources available in OpenBio.eu, an open-science RO repository and workflow management system. All ROs can be searched, shared, edited, downloaded, commented on, and rated. We also demonstrate how one can easily connect these ROs to form a large variety of text annotation pipelines.


2021 ◽  
Author(s):  
Franziska Weeber ◽  
Felix Hamborg ◽  
Karsten Donnay ◽  
Bela Gipp

2021 ◽  
Author(s):  
Timo Spinde ◽  
Kanishka Sinha ◽  
Norman Meuschke ◽  
Bela Gipp

2021 ◽  
Author(s):  
Sünje Paasch-Colberg ◽  
Joachim Trebbe ◽  
Christian Strippel ◽  
Martin Emmer

In the past decade, the public discourse on immigration in Germany has been strongly affected by right-wing populist, racist, and Islamophobic positions. This becomes evident especially in the comment sections of news websites and social media platforms, where user discussions often escalate and trigger hate comments against refugees and immigrants and also against journalists, politicians, and other groups. In view of the threatening consequences such sentiments can have for groups who are targeted by right-wing extremist violence, we take a closer look into such user discussions to gain detailed insights into the various forms of hate speech and offensive language against these groups. Using a modularized framework that goes beyond the common “hate/no-hate” dichotomy in the field, we conducted a structured text annotation of 5,031 user comments posted on German news websites and social media in March 2019. Most of the hate speech we found was directed against refugees and immigrants, while other groups were mostly exposed to various forms of offensive language. In comments containing hate speech, refugees and Muslims were frequently stereotyped as criminals, whereas extreme forms of hate speech, such as calls for violence, were rare in our data. These findings are discussed with a focus on their potential consequences for public discourse on immigration in Germany.


In this paper, the process of creating a Dependency Treebank for tweets in Urdu, a morphologically rich and less-resourced language, is described. The 500-tweet Urdu treebank is created by manually annotating the tweets with lemmas, POS tags, and morphological and syntactic relations using the Universal Dependencies annotation scheme, adapted to the peculiarities of Urdu social media text. The annotation process is evaluated through inter-annotator agreement for dependency relations; a total agreement of 94.5% and a resultant weighted Kappa of 0.876 were observed. The treebank is evaluated through 10-fold cross-validation using MaltParser with various feature settings. Results show an average UAS of 74%, LAS of 62.9%, and LA of 69.8%.
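
For readers unfamiliar with the three parser metrics reported above, the following is a minimal sketch of how they are conventionally computed. It assumes each token is represented as a (head index, dependency label) pair; the function name and the toy sentence are hypothetical and are not taken from the paper or from MaltParser.

```python
from typing import Dict, List, Tuple

# One token's analysis: (index of its head, dependency relation label).
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Dict[str, float]:
    """Return UAS, LAS and LA for one aligned gold/predicted token sequence."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must have equal length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    n = len(gold)
    # UAS: head is correct, label ignored.
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    # LA: label is correct, head ignored.
    la = sum(g[1] == p[1] for g, p in zip(gold, pred)) / n
    # LAS: both head and label are correct.
    las = sum(g == p for g, p in zip(gold, pred)) / n
    return {"UAS": uas, "LAS": las, "LA": la}


# Toy four-token example (hypothetical data).
gold_arcs = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "amod")]
pred_arcs = [(2, "nsubj"), (0, "root"), (2, "nmod"), (2, "amod")]
scores = attachment_scores(gold_arcs, pred_arcs)
print(scores)
```

Because LAS requires both the head and the label to match, it can never exceed UAS or LA, which is consistent with the ordering of the figures in the abstract.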


Author(s):  
João Rafael Almeida ◽  
João Figueira Silva ◽  
Sérgio Matos ◽  
Alejandro Pazos ◽  
José Luís Oliveira

The process of refining the research question in a medical study depends greatly on the current background of the investigated subject. The information found in prior works can directly impact several stages of the study, in particular the cohort definition stage. Besides previously published methods, researchers could also leverage other materials, such as the output of cohort selection tools, to enrich and accelerate their own work. However, this kind of information is not always captured by search engines. In this paper, we present a methodology, based on a combination of content-based retrieval and text annotation techniques, to identify relevant scientific publications related to a research question and to the selected data sources.


Genes ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 384
Author(s):  
Sara Castellano ◽  
Federica Cestari ◽  
Giovanni Faglioni ◽  
Elena Tenedini ◽  
Marco Marino ◽  
...  

The rapid evolution of Next Generation Sequencing in clinical settings, and the resulting challenge of variant reinterpretation given the constantly updated information, require robust data management systems and organized approaches. In this paper, we present iVar: a freely available and highly customizable tool with a user-friendly web interface. It represents a platform for the unified management of variants identified by different sequencing technologies. iVar accepts variant call format (VCF) files and text annotation files and processes them, optimizing data organization and avoiding redundancies. Updated annotations can be periodically re-uploaded and associated with variants as historically tracked attributes, i.e., modifications can be recorded whenever an updated value is imported, thus keeping track of all changes. Data can be visualized through variant-centered and sample-centered interfaces. A customizable search function can be used to periodically check whether the pathogenicity-related data of a variant have changed over time. Patient recontacting ensuing from variant reinterpretation is made easier by iVar through the effective identification of all patients present in the database carrying a specific variant. We tested iVar by uploading 4171 VCF files and 1463 annotation files, obtaining a database of 4166 samples and 22,569 unique variants. iVar has proven to be a useful tool with good performance in terms of collecting and managing data from a medium-throughput laboratory.
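
The idea of historically tracked attributes can be illustrated with a small sketch. This is not iVar's actual data model or API; the class, method, and attribute names below are hypothetical, and the sketch only assumes that a new history entry is recorded when an imported value differs from the latest stored one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TrackedVariant:
    """A variant whose annotation attributes keep their full change history."""

    variant_id: str
    # attribute name -> list of (import date, value), oldest first
    history: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)

    def import_annotation(self, attribute: str, value: str, import_date: str) -> bool:
        """Store a value; return True only if it differs from the latest one."""
        entries = self.history.setdefault(attribute, [])
        if entries and entries[-1][1] == value:
            return False  # unchanged value, nothing new to record
        entries.append((import_date, value))
        return True

    def current(self, attribute: str) -> Optional[str]:
        """Return the most recently recorded value, or None if never imported."""
        entries = self.history.get(attribute)
        return entries[-1][1] if entries else None

    def has_changed(self, attribute: str) -> bool:
        """True if the attribute has taken more than one value over time."""
        return len(self.history.get(attribute, [])) > 1


# Hypothetical variant and annotation values, for illustration only.
variant = TrackedVariant("chr1:12345:A>G")
variant.import_annotation("clinical_significance", "uncertain", "2020-01-15")
variant.import_annotation("clinical_significance", "uncertain", "2020-06-15")
variant.import_annotation("clinical_significance", "pathogenic", "2021-01-15")
print(variant.current("clinical_significance"), variant.has_changed("clinical_significance"))
```

A periodic check for reinterpretation then reduces to filtering variants for which `has_changed` is true on a pathogenicity-related attribute.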


2021 ◽  
Author(s):  
Brendan Kennedy ◽  
Ashwini Ashokkumar ◽  
Ryan L. Boyd ◽  
Morteza Dehghani

Due to the explosion of new sources of human language data and the rapid progression of computational methods for extracting meaning from natural language, language analysis is a promising, though complicated, category of psychological research. In this chapter, we give a modern perspective on language analysis as it applies to psychology, uniting historical context, the diverse range of domains studied in psychology via language, and the methodological rigor of natural language processing (NLP) and machine learning. Top–down methods (e.g., dictionary approaches, text annotation) are presented alongside bottom–up methods (e.g., topic modeling, word embedding, language modeling) in order to give the reader a comprehensive grounding in the tools available and the recommended practices involved in applying them. We conclude with a view of the future of language analysis, specifically the ways in which psychology and NLP will continue to co-develop.
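
As a rough illustration of the top-down dictionary approach mentioned above, the sketch below scores a text by the proportion of its tokens that fall into each category of a word list. The tiny lexicon and the function name are hypothetical; validated dictionaries used in psychological research are far larger and are not reproduced here.

```python
import re
from collections import Counter
from typing import Dict, Set


def dictionary_scores(text: str, lexicon: Dict[str, Set[str]]) -> Dict[str, float]:
    """Return, per category, the share of tokens in `text` found in that category."""
    # Simple lowercase tokenization on letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {category: 0.0 for category in lexicon}
    counts = Counter(tokens)
    return {
        category: sum(counts[word] for word in words) / len(tokens)
        for category, words in lexicon.items()
    }


# Toy lexicon (an assumption for illustration, not a validated instrument).
toy_lexicon = {"positive": {"happy", "glad"}, "negative": {"sad"}}
result = dictionary_scores("I am happy and glad, not sad.", toy_lexicon)
print(result)
```

The example also shows a known limitation of plain word counting: "not sad" still contributes to the negative category, which is one reason such methods are usually complemented by annotation or bottom-up models.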

