ARQMath

The Answer Retrieval for Questions on Math (ARQMath) evaluation was run for the first time at CLEF 2020. ARQMath is the first Community Question Answering (CQA) shared task for math, retrieving existing answers from Math Stack Exchange (MSE) that can help to answer previously unseen math questions. ARQMath also introduces a new protocol for math formula search, where formulas are evaluated in context using a query formula's associated question post, and posts associated with each retrieved formula. Over 70 topics were annotated for each task by eight undergraduate students supervised by a professor of mathematics. A formula index is provided in three formats: LATEX, Presentation MathML, and Content MathML, avoiding the need for participants to extract these themselves. In addition to detailed relevance judgments, tools are provided to parse MSE data, generate question threads in HTML, and evaluate retrieval results. To make comparisons with participating systems fairer, nDCG' (i.e., nDCG for assessed hits only) is used to compare systems for each task. ARQMath will continue in CLEF 2021, with training data from 2020 and baseline systems for both tasks to reduce barriers to entry for this challenging problem domain.

Download Full-text

Arabic community question answering

Natural Language Engineering ◽

10.1017/s1351324918000426 ◽

2018 ◽

Vol 25 (1) ◽

pp. 5-41

Author(s):

PRESLAV NAKOV ◽

LLUÍS MÀRQUEZ ◽

ALESSANDRO MOSCHITTI ◽

HAMDY MUBARAK

Keyword(s):

Question Answering ◽

International Workshop ◽

Lessons Learned ◽

Future Research ◽

Learning Approaches ◽

Shared Task ◽

Community Question Answering ◽

Syntactic Information ◽

Corpus Creation ◽

Difficult Cases

AbstractWe analyze resources and models for Arabic community Question Answering (cQA). In particular, we focus on CQA-MD, our cQA corpus for Arabic in the domain of medical forums. We describe the corpus and the main challenges it poses due to its mix of informal and formal language, and of different Arabic dialects, as well as due to its medical nature. We further present a shared task on cQA at SemEval, the International Workshop on Semantic Evaluation, based on this corpus. We discuss the features and the machine learning approaches used by the teams who participated in the task, with focus on the models that exploit syntactic information using convolutional tree kernels and neural word embeddings. We further analyze and extend the outcome of the SemEval challenge by training a meta-classifier combining the output of several systems. This allows us to compare different features and different learning algorithms in an indirect way. Finally, we analyze the most frequent errors common to all approaches, categorizing them into prototypical cases, and zooming into the way syntactic information in tree kernel approaches can help solve some of the most difficult cases. We believe that our analysis and the lessons learned from the process of corpus creation as well as from the shared task analysis will be helpful for future research on Arabic cQA.

Download Full-text

Interactive Text Ranking with Bayesian Optimization: A Case Study on Community QA and Summarization

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00344 ◽

2020 ◽

Vol 8 ◽

pp. 759-775

Author(s):

Edwin Simpson ◽

Yang Gao ◽

Iryna Gurevych

Keyword(s):

Question Answering ◽

Training Data ◽

Bayesian Optimization ◽

Small Data ◽

Large Space ◽

Reward Function ◽

Community Question Answering ◽

Interactive Text ◽

Text Ranking

For many NLP applications, such as question answering and summarization, the goal is to select the best solution from a large space of candidates to meet a particular user’s needs. To address the lack of user or task-specific training data, we propose an interactive text ranking approach that actively selects pairs of candidates, from which the user selects the best. Unlike previous strategies, which attempt to learn a ranking across the whole candidate space, our method uses Bayesian optimization to focus the user’s labeling effort on high quality candidates and integrate prior knowledge to cope better with small data scenarios. We apply our method to community question answering (cQA) and extractive multidocument summarization, finding that it significantly outperforms existing interactive approaches. We also show that the ranking function learned by our method is an effective reward function for reinforcement learning, which improves the state of the art for interactive summarization.

Download Full-text

COALA: A Neural Coverage-Based Approach for Long Answer Selection with Small Data

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016932 ◽

2019 ◽

Vol 33 ◽

pp. 6932-6939 ◽

Cited By ~ 2

Author(s):

Andreas Rücklé ◽

Nafise Sadat Moosavi ◽

Iryna Gurevych

Keyword(s):

Question Answering ◽

Training Data ◽

Small Data ◽

Huge Number ◽

Passage Retrieval ◽

Large Margin ◽

Community Question Answering ◽

Syntactic Information ◽

Selection Approach ◽

Training Examples

Current neural network based community question answering (cQA) systems fall short of (1) properly handling long answers which are common in cQA; (2) performing under small data conditions, where a large amount of training data is unavailable—i.e., for some domains in English and even more so for a huge number of datasets in other languages; and (3) benefiting from syntactic information in the model—e.g., to differentiate between identical lexemes with different syntactic roles. In this paper, we propose COALA, an answer selection approach that (a) selects appropriate long answers due to an effective comparison of all question-answer aspects, (b) has the ability to generalize from a small number of training examples, and (c) makes use of the information about syntactic roles of words. We show that our approach outperforms existing answer selection models by a large margin on six cQA datasets from different domains. Furthermore, we report the best results on the passage retrieval benchmark WikiPassageQA.

Download Full-text