shared task
Recently Published Documents


TOTAL DOCUMENTS

675
(FIVE YEARS 307)

H-INDEX

28
(FIVE YEARS 5)

Author(s):  
Tharindu Ranasinghe ◽  
Marcos Zampieri

Offensive content is pervasive in social media and a reason for concern to companies and government organizations. Several studies have been published recently investigating methods to detect the various forms of such content (e.g., hate speech, cyberbullying, and cyberaggression). The clear majority of these studies deal with English, in part because most available annotated datasets contain English data. In this article, we take advantage of available English datasets by applying cross-lingual contextual word embeddings and transfer learning to make predictions in low-resource languages. We project predictions on comparable data in Arabic, Bengali, Danish, Greek, Hindi, Spanish, and Turkish. We report results of 0.8415 F1 macro for Bengali in the TRAC-2 shared task [23], 0.8532 F1 macro for Danish and 0.8701 F1 macro for Greek in OffensEval 2020 [58], 0.8568 F1 macro for Hindi in the HASOC 2019 shared task [27], and 0.7513 F1 macro for Spanish in SemEval-2019 Task 5 (HatEval) [7], showing that our approach compares favorably to the best systems submitted to recent shared tasks on these languages. Additionally, we report competitive performance on Arabic and Turkish using the training and development sets of the OffensEval 2020 shared task. The results for all languages confirm the robustness of cross-lingual contextual embeddings and transfer learning for this task.
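All of the scores above are macro-averaged F1, the standard metric in these shared tasks because offensive-language datasets are typically class-imbalanced. A minimal sketch of how macro F1 is computed (the toy labels below are illustrative, not from any of the cited datasets):

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight,
    so minority classes count as much as the majority class."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Toy binary example: offensive (OFF) vs. not offensive (NOT)
gold = ["OFF", "NOT", "NOT", "OFF", "NOT"]
pred = ["OFF", "NOT", "OFF", "OFF", "NOT"]
print(round(macro_f1(gold, pred), 4))  # → 0.8
```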


2021 ◽  
pp. 1-12
Author(s):  
Fazlourrahman Balouchzahi ◽  
Grigori Sidorov ◽  
Hosahalli Lakshmaiah Shashirekha

Complex learning approaches along with complicated and expensive features are not always the best or the only solution for Natural Language Processing (NLP) tasks. Despite huge progress and advancements in learning approaches such as Deep Learning (DL) and Transfer Learning (TL), there are many NLP tasks, such as Text Classification (TC), for which basic Machine Learning (ML) classifiers outperform DL or TL approaches. Moreover, an efficient feature engineering step can significantly improve the performance of ML-based systems. To check the efficacy of ML-based systems and feature engineering on TC, this paper explores characters, character sequences, syllables, word n-grams, and syntactic n-grams as features, and uses SHapley Additive exPlanations (SHAP) values to select the important features from the collection of extracted features. Voting Classifiers (VC) with soft and hard voting over four ML classifiers, namely Support Vector Machine (SVM) with Linear and Radial Basis Function (RBF) kernels, Logistic Regression (LR), and Random Forest (RF), were trained and evaluated on the Fake News Spreaders Profiling (FNSP) shared task dataset in PAN 2020. This shared task consists of profiling fake news spreaders in English and Spanish. The proposed models achieved an average accuracy of 0.785 across both languages and outperformed the best models submitted to this task.
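The two ingredients of this approach are simple enough to sketch directly: surface n-gram features and a voting ensemble. The snippet below is a minimal illustration in plain Python (the function names and the toy tie-free vote are assumptions for illustration; the paper's actual pipeline uses SHAP-based feature selection and trained SVM/LR/RF classifiers):

```python
from collections import Counter

def char_ngrams(text, n):
    """All contiguous character n-grams of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def word_ngrams(text, n):
    """All contiguous word n-grams of length n."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def hard_vote(predictions):
    """Majority vote over per-classifier predictions for one instance.
    With an even number of voters, ties fall to the first-seen label;
    soft voting (averaging class probabilities) avoids this issue."""
    return Counter(predictions).most_common(1)[0][0]

print(char_ngrams("fake", 3))                    # ['fak', 'ake']
print(word_ngrams("fake news spreads fast", 2))  # ['fake news', 'news spreads', 'spreads fast']
print(hard_vote(["fake", "real", "fake", "fake"]))  # 'fake'
```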


Author(s):  
Jaehun Shin ◽  
Wonkee Lee ◽  
Byung-Hyun Go ◽  
Baikjin Jung ◽  
Youngkil Kim ◽  
...  

Automatic post-editing (APE) is the study of correcting translation errors in the output of an unknown machine translation (MT) system and has been considered a method of improving translation quality without any modification to conventional MT systems. Recently, several variants of Transformer that take both the MT output and its corresponding source sentence as inputs have been proposed for APE, and models introducing an additional attention layer into the encoder to jointly encode the MT output with its source sentence recorded high ranks in the WMT19 APE shared task. We examine the effectiveness of this joint-encoding strategy in a controlled environment and compare four types of decoder multi-source attention strategies that have been introduced in previous APE models. The experimental results indicate that the joint-encoding strategy is effective and that taking the final encoded representation of the source sentence is more appropriate than taking such a representation from within the same encoder stack. Furthermore, among the multi-source attention strategies combined with joint-encoding, the strategy that applies attention to the concatenated input representation and the strategy that adds up the individual attentions to each input improve the quality of APE results over using joint-encoding alone.
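The two winning multi-source attention strategies differ only in where the two encoded inputs are combined: before attention (concatenate keys and values) or after it (sum the per-source attention outputs). A single-head NumPy sketch of this contrast, with random toy tensors standing in for real encoder states (the dimensions and the simplified, projection-free attention are assumptions for illustration):

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention (single head, no masking, no projections)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
d = 8
dec = rng.normal(size=(3, d))  # decoder queries (3 target positions)
src = rng.normal(size=(5, d))  # encoded source sentence (5 tokens)
mt = rng.normal(size=(4, d))   # encoded MT output (4 tokens)

# Strategy A: attend over the concatenated source and MT representations.
joint = np.concatenate([src, mt])
concat_out = attention(dec, joint, joint)

# Strategy B: attend to each input separately and add the results.
added_out = attention(dec, src, src) + attention(dec, mt, mt)

print(concat_out.shape, added_out.shape)  # (3, 8) (3, 8)
```

Both strategies yield one context vector per decoder position; they differ in whether the softmax normalizes over both inputs jointly (A) or over each input independently (B).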


2021 ◽  
Vol 72 (5-6) ◽  
pp. 285-290
Author(s):  
Susan Illing

Abstract The bachelor's thesis presented in this article covers the results of a shared task in the eHealth domain. It investigates whether the classification accuracy of selected clinical coding systems can be improved through the use of ensemble methods. The decisive criteria are the values of the evaluation measures Mean Average Precision and the F1 measure.
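Mean Average Precision, one of the two evaluation measures named above, rewards rankings that place relevant items (here, correct clinical codes) early. A minimal sketch of its computation (the query data below are hypothetical, not from the thesis):

```python
def average_precision(relevant, ranked):
    """AP for one query: mean of precision@k over each rank k that holds
    a relevant item, divided by the total number of relevant items."""
    hits, precisions = 0, []
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """MAP: AP averaged over all queries, given (relevant_set, ranking) pairs."""
    return sum(average_precision(r, rk) for r, rk in queries) / len(queries)

# Hypothetical example: two queries over made-up clinical codes
q1 = ({"C1", "C3"}, ["C1", "C2", "C3"])  # AP = (1/1 + 2/3) / 2 ≈ 0.8333
q2 = ({"C2"}, ["C1", "C2"])              # AP = (1/2) / 1 = 0.5
print(round(mean_average_precision([q1, q2]), 4))  # → 0.6667
```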


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0256874
Author(s):  
Iknoor Singh ◽  
Carolina Scarton ◽  
Kalina Bontcheva

The Coronavirus (COVID-19) pandemic has led to a rapidly growing ‘infodemic’ of health information online. This has motivated the need for accurate semantic search and retrieval of reliable COVID-19 information across millions of documents, in multiple languages. To address this challenge, this paper proposes a novel high-precision, high-recall neural Multistage BiCross encoder approach. It is a sequential three-stage ranking pipeline that uses the Okapi BM25 retrieval algorithm and transformer-based bi-encoders and cross-encoders to effectively rank the documents with respect to the given query. We present experimental results from our participation in the Multilingual Information Access (MLIA) shared task on COVID-19 multilingual semantic search. The independently evaluated MLIA results validate our approach and demonstrate that it outperforms other state-of-the-art approaches according to nearly all evaluation metrics in both monolingual and bilingual runs.
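The first stage of such a pipeline, Okapi BM25, is a classical lexical scoring function that the later neural bi-encoder and cross-encoder stages then rerank. A compact, self-contained sketch of BM25 scoring (toy documents and the in-memory scoring loop are illustrative assumptions; production systems precompute an inverted index):

```python
import math

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document against a tokenized query.
    k1 controls term-frequency saturation; b controls length normalization."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    scores = []
    for doc in docs:
        score = 0.0
        for term in query:
            tf = doc.count(term)
            df = sum(1 for d in docs if term in d)
            idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
            score += idf * tf * (k1 + 1) / (
                tf + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(score)
    return scores

docs = [["covid", "vaccine", "safety"],
        ["mask", "guidance"],
        ["covid", "symptoms", "covid"]]
scores = bm25_scores(["covid", "vaccine"], docs)
top = max(range(len(docs)), key=scores.__getitem__)  # index of best match
print(top, [round(s, 3) for s in scores])
```

In a multistage setup, only the top-k documents by BM25 score would be passed on to the more expensive neural rerankers.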


2021 ◽  
Author(s):  
R. Gretter ◽  
Marco Matassoni ◽  
D. Falavigna ◽  
A. Misra ◽  
C.W. Leong ◽  
...  
