Supervised attention for answer selection in community question answering

Author(s):  
Thanh Thi Ha ◽  
Atsuhiro Takasu ◽  
Thanh Chinh Nguyen ◽  
Kiem Hieu Nguyen ◽  
Van Nha Nguyen ◽  
...  

Answer selection is an important task in Community Question Answering (CQA). In recent years, attention-based neural networks have been extensively studied in various natural language processing problems, including question answering. This paper explores matchLSTM for answer selection in CQA. The lexical gap in CQA is more challenging because questions and answers typically contain multiple sentences, irrelevant information, and noisy expressions. In our investigation, the word-by-word attention in the original model does not work well on social question-answer pairs. We propose integrating supervised attention into matchLSTM. Specifically, we leverage lexical-semantic information from external resources to guide the learning of attention weights for question-answer pairs. The proposed model learns more meaningful attention, which allows it to outperform the basic model. Our performance is among the top on SemEval datasets.
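A minimal sketch of the supervision idea, assuming PyTorch and illustrative tensor shapes (the alignment target `align_target`, built from an external lexical-semantic resource, is a hypothetical input; the paper's exact formulation may differ):

```python
import torch
import torch.nn.functional as F

def supervised_attention(h_q, h_a, align_target):
    """Word-by-word attention between question states h_q (Lq x d) and
    answer states h_a (La x d), with a supervision term pulling the
    attention weights toward external lexical-semantic alignments.

    align_target: (La x Lq) row-stochastic matrix derived from an
    external lexical resource (hypothetical input for this sketch).
    """
    scores = h_a @ h_q.t()                # (La x Lq) match scores
    attn = F.softmax(scores, dim=-1)      # attention over question words
    # KL divergence between learned and externally guided attention
    sup_loss = F.kl_div(attn.log(), align_target, reduction="batchmean")
    context = attn @ h_q                  # attended question summary per answer word
    return context, sup_loss

h_q, h_a = torch.randn(7, 64), torch.randn(9, 64)
tgt = F.softmax(torch.randn(9, 7), dim=-1)
ctx, sup = supervised_attention(h_q, h_a, tgt)
# total training loss would be: task_loss + lambda_att * sup
```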

2020 ◽  
Vol 34 (05) ◽  
pp. 8504-8511
Author(s):  
Arindam Mitra ◽  
Ishan Shrivastava ◽  
Chitta Baral

Natural Language Inference (NLI) plays an important role in many natural language processing tasks such as question answering. However, existing NLI modules trained on existing NLI datasets have several drawbacks. For example, they do not capture the notion of entity and role well and often end up making mistakes such as inferring “Peter signed a deal” from “John signed a deal”. As part of this work, we have developed two datasets that help mitigate such issues and make systems better at understanding the notion of “entities” and “roles”. After training the existing models on the new datasets, we observe that they still do not perform well on one of the new benchmarks. We then propose a modification to the “word-to-word” attention function, which has been uniformly reused across several popular NLI architectures. The resulting models perform as well as their unmodified counterparts on the existing benchmarks and significantly better on the new benchmarks that emphasize “roles” and “entities”.
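For reference, the “word-to-word” attention function reused across popular NLI architectures (e.g., ESIM-style models) is typically a dot-product soft alignment of this shape; this is a generic sketch, not the authors' proposed modification:

```python
import torch
import torch.nn.functional as F

def word_to_word_attention(premise, hypothesis):
    """Standard soft alignment reused across many NLI models:
    every hypothesis word attends over every premise word.

    premise:    (Lp, d) encoded premise tokens
    hypothesis: (Lh, d) encoded hypothesis tokens
    """
    e = hypothesis @ premise.t()      # (Lh, Lp) alignment scores
    attn = F.softmax(e, dim=-1)       # normalize over premise words
    aligned = attn @ premise          # premise summary per hypothesis word
    return aligned, attn

p, h = torch.randn(6, 128), torch.randn(5, 128)
aligned, attn = word_to_word_attention(p, h)  # (5, 128), (5, 6)
```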


2019 ◽  
Vol 56 (2) ◽  
pp. 274-290 ◽  
Author(s):  
Salvatore Romeo ◽  
Giovanni Da San Martino ◽  
Yonatan Belinkov ◽  
Alberto Barrón-Cedeño ◽  
Mohamed Eldesouki ◽  
...  

Author(s):  
Tanya Chowdhury ◽  
Sachin Kumar ◽  
Tanmoy Chakraborty

Attentional, RNN-based encoder-decoder architectures have obtained impressive performance on abstractive summarization of news articles. However, these methods fail to account for long-term dependencies within the sentences of a document. This problem is exacerbated in multi-document summarization tasks such as summarizing the popular opinion in threads on community question answering (CQA) websites such as Yahoo! Answers and Quora, where answers often overlap or contradict each other. In this work, we present a hierarchical encoder based on structural attention to model such inter-sentence and inter-document dependencies. We set the popular pointer-generator architecture and some of the architectures derived from it as our baselines and show that they fail to generate good summaries in a multi-document setting. We further illustrate that our proposed model achieves significant improvement over the baseline in both single- and multi-document summarization settings: in the former, it beats the baseline by 1.31 and 7.8 ROUGE-1 points on the CNN and CQA datasets, respectively; in the latter, performance is further improved by 1.6 ROUGE-1 points on the CQA dataset.
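A sketch of a hierarchical encoder in this spirit, assuming PyTorch; the paper's structural attention is approximated here by a plain learned sentence-level attention, and all shapes and sizes are illustrative:

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    """Words -> sentence vectors -> attention-weighted document vector.
    Inter-sentence dependencies are modeled by a sentence-level GRU
    plus attention (a simplification of structural attention)."""
    def __init__(self, emb_dim=100, hid=128):
        super().__init__()
        self.word_rnn = nn.GRU(emb_dim, hid, batch_first=True, bidirectional=True)
        self.sent_rnn = nn.GRU(2 * hid, hid, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * hid, 1)

    def forward(self, doc):
        # doc: (n_sents, n_words, emb_dim) -- one pre-embedded document
        word_states, _ = self.word_rnn(doc)                      # (S, W, 2h)
        sent_vecs = word_states.mean(dim=1)                      # word pooling: (S, 2h)
        sent_states, _ = self.sent_rnn(sent_vecs.unsqueeze(0))   # (1, S, 2h)
        weights = torch.softmax(self.att(sent_states), dim=1)    # (1, S, 1)
        doc_vec = (weights * sent_states).sum(dim=1)             # (1, 2h)
        return doc_vec, weights.squeeze(-1)

enc = HierarchicalEncoder()
doc = torch.randn(4, 12, 100)   # 4 sentences, 12 words each
vec, w = enc(doc)               # document vector feeds the decoder
```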


Author(s):  
Ting Huang ◽  
Gehui Shen ◽  
Zhi-Hong Deng

Recurrent Neural Networks (RNNs) are widely used in the field of natural language processing (NLP), ranging from text categorization to question answering and machine translation. However, RNNs generally read the whole text from beginning to end (or occasionally in reverse), which makes processing long texts inefficient. When reading a long document for a categorization task, such as topic categorization, large quantities of words are irrelevant and can be skipped. To this end, we propose Leap-LSTM, an LSTM-enhanced model which dynamically leaps between words while reading texts. At each step, we utilize several feature encoders to extract information from the preceding text, the following text, and the current word, and then determine whether to skip the current word. We evaluate Leap-LSTM on several text categorization tasks: sentiment analysis, news categorization, ontology classification, and topic classification, with five benchmark datasets. The experimental results show that our model reads faster and predicts better than a standard LSTM. Compared to previous models that can also skip words, our model achieves a better trade-off between performance and efficiency.
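The skip decision can be sketched as a small gate over the preceding hidden state and the current word embedding (a simplification: the paper uses several feature encoders, including one over the following text; the Gumbel-softmax trick here is one standard way to train such a discrete choice, assumed rather than taken from the paper):

```python
import torch
import torch.nn as nn

class SkipGate(nn.Module):
    """Decide per word whether the LSTM should read or skip it."""
    def __init__(self, emb_dim, hid):
        super().__init__()
        self.score = nn.Linear(hid + emb_dim, 2)  # [skip, read] logits

    def forward(self, h_prev, x_t, tau=1.0):
        logits = self.score(torch.cat([h_prev, x_t], dim=-1))
        # (Approximately) differentiable discrete choice during training
        choice = nn.functional.gumbel_softmax(logits, tau=tau, hard=True)
        return choice[..., 1]  # 1.0 = read the word, 0.0 = skip it

gate = SkipGate(emb_dim=100, hid=128)
read = gate(torch.randn(1, 128), torch.randn(1, 100))
# In the reading loop: h_t = read * lstm_step(x_t, h_prev) + (1 - read) * h_prev
```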


Author(s):  
Zhongbin Xie ◽  
Shuai Ma

Semantically matching two text sequences (usually two sentences) is a fundamental problem in NLP. Most previous methods either encode each of the two sentences into a vector representation (sentence-level embedding) or leverage word-level interaction features between the two sentences. In this study, we propose to take the sentence-level embedding features and the word-level interaction features as two distinct views of a sentence pair, and unify them within a Variational Autoencoder framework so that the sentence pair is matched in a semi-supervised manner. The proposed model is referred to as the Dual-View Variational AutoEncoder (DV-VAE), where optimizing the variational lower bound can be interpreted as an implicit Co-Training mechanism for two matching models over the distinct views. Experiments on SNLI, Quora, and a Community Question Answering dataset demonstrate the superiority of DV-VAE over several strong semi-supervised and supervised text matching models.
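An explicit stand-in for that implicit co-training, assuming two pre-computed feature views per pair; this is a deliberate simplification for intuition, not the DV-VAE objective itself:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualViewMatcher(nn.Module):
    """Two views of a sentence pair feed two classifiers; on unlabeled
    pairs each view is trained toward the other's prediction."""
    def __init__(self, sent_dim, inter_dim, n_classes=2):
        super().__init__()
        self.view_a = nn.Linear(sent_dim, n_classes)   # sentence-embedding view
        self.view_b = nn.Linear(inter_dim, n_classes)  # word-interaction view

    def forward(self, feat_a, feat_b):
        return self.view_a(feat_a), self.view_b(feat_b)

def semi_supervised_loss(logits_a, logits_b, label=None):
    if label is not None:  # labeled pair: supervise both views
        return F.cross_entropy(logits_a, label) + F.cross_entropy(logits_b, label)
    # unlabeled pair: each view matches the other's (detached) prediction
    pa, pb = F.softmax(logits_a, -1), F.softmax(logits_b, -1)
    return F.kl_div(F.log_softmax(logits_a, -1), pb.detach(), reduction="batchmean") \
         + F.kl_div(F.log_softmax(logits_b, -1), pa.detach(), reduction="batchmean")

m = DualViewMatcher(sent_dim=64, inter_dim=32)
la, lb = m(torch.randn(8, 64), torch.randn(8, 32))
loss_u = semi_supervised_loss(la, lb)                         # unlabeled batch
loss_l = semi_supervised_loss(la, lb, torch.zeros(8).long())  # labeled batch
```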


2020 ◽  
Vol 34 (05) ◽  
pp. 7651-7658 ◽  
Author(s):  
Yang Deng ◽  
Wai Lam ◽  
Yuexiang Xie ◽  
Daoyuan Chen ◽  
Yaliang Li ◽  
...  

Community question answering (CQA) has recently gained increasing popularity in both academia and industry. However, the redundancy and length of crowdsourced answers limit the performance of answer selection and lead to reading difficulties and misunderstandings for community users. To solve these problems, we tackle the tasks of answer selection and answer summary generation in CQA with a novel joint learning model. Specifically, we design a question-driven pointer-generator network, which exploits the correlation between question-answer pairs to aid in attending to the essential information when generating answer summaries. Meanwhile, we leverage the answer summaries to alleviate noise in the original lengthy answers when ranking the relevancy degrees of question-answer pairs. In addition, we construct a new large-scale CQA corpus, WikiHowQA, which contains long answers for answer selection as well as reference summaries for answer summarization. The experimental results show that the joint learning method effectively addresses the answer redundancy issue in CQA and achieves state-of-the-art results on both the answer selection and text summarization tasks. Furthermore, the proposed model transfers well and is applicable to resource-poor CQA tasks that lack reference answer summaries.
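One way the question-driven conditioning could look inside a pointer-generator step, with illustrative names and shapes (the paper's exact parameterization may differ):

```python
import torch
import torch.nn as nn

class QuestionDrivenCopyGate(nn.Module):
    """Sketch: the copy attention over the source answer and the
    generate-vs-copy probability are both conditioned on a question
    context vector in addition to the decoder state."""
    def __init__(self, hid):
        super().__init__()
        self.attn = nn.Linear(3 * hid, 1)   # scores source tokens
        self.p_gen = nn.Linear(3 * hid, 1)  # generate vs. copy

    def forward(self, dec_state, src_states, q_ctx):
        # dec_state: (h,), src_states: (Ls, h), q_ctx: (h,)
        Ls = src_states.size(0)
        feats = torch.cat([src_states,
                           dec_state.expand(Ls, -1),
                           q_ctx.expand(Ls, -1)], dim=-1)
        a = torch.softmax(self.attn(feats).squeeze(-1), dim=0)  # (Ls,)
        ctx = a @ src_states                                    # (h,)
        pg = torch.sigmoid(self.p_gen(torch.cat([dec_state, ctx, q_ctx])))
        return a, ctx, pg  # copy distribution, context, generation prob

gate = QuestionDrivenCopyGate(hid=128)
a, ctx, pg = gate(torch.randn(128), torch.randn(20, 128), torch.randn(128))
```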


2022 ◽  
Vol 40 (1) ◽  
pp. 1-23
Author(s):  
Xiao Zhang ◽  
Meng Liu ◽  
Jianhua Yin ◽  
Zhaochun Ren ◽  
Liqiang Nie

With the increasing prevalence of portable devices and the popularity of community Question Answering (cQA) sites, users can seamlessly post and answer many questions. To effectively organize the information for precise recommendation and easy searching, these platforms require users to select topics for their raised questions. However, due to limited experience, certain users fail to select appropriate topics for their questions. Therefore, automatic question tagging becomes an urgent and vital problem for cQA sites, yet it is non-trivial due to the following challenges. On the one hand, a vast number of meaningful topics are available in cQA sites yet underutilized; how to model them and tag them to relevant questions is highly challenging. On the other hand, related topics in cQA sites may be organized into a directed acyclic graph, so how to exploit relations among topics to enhance their representations is critical. To address these challenges, we devise a graph-guided topic ranking model to tag questions in cQA sites appropriately. In particular, we first design a topic information fusion module that learns the topic representation by jointly considering the name and description of the topic. Afterwards, given the special structure of topics, we propose an information propagation module to enhance the topic representation. As the comprehension of questions plays a vital role in question tagging, we design a multi-level context-modeling-based question encoder to obtain an enhanced question representation. Moreover, we introduce an interaction module to extract topic-aware question information and capture the interactive information between questions and topics. Finally, we utilize the interactive information to estimate ranking scores for topics. Extensive experiments on three Chinese cQA datasets demonstrate that our proposed model outperforms several state-of-the-art competitors.
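A generic message-passing sketch of propagation over the topic DAG followed by interaction-based ranking (illustrative only, not the paper's exact modules; the mixing weight `alpha` and dot-product scoring are assumptions):

```python
import torch

def propagate_topic_dag(topic_vecs, parents, alpha=0.5):
    """One round of information propagation over the topic DAG:
    each topic mixes its own vector with the mean of its parents'.

    topic_vecs: (T, d) fused name+description representations
    parents: dict {topic_index: [parent indices]}
    """
    out = topic_vecs.clone()
    for t, ps in parents.items():
        if ps:
            out[t] = alpha * topic_vecs[t] + (1 - alpha) * topic_vecs[ps].mean(dim=0)
    return out

vecs = torch.randn(5, 64)
dag = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [3]}
vecs = propagate_topic_dag(vecs, dag)

# Ranking: score each topic by its interaction with the question vector
q = torch.randn(64)
scores = vecs @ q   # (T,) -- higher score = better tag for the question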


Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6758
Author(s):  
Zihan Guo ◽  
Dezhi Han

Visual question answering (VQA) is a multi-modal task involving natural language processing (NLP) and computer vision (CV). It requires models to understand both visual and textual information simultaneously to predict the correct answer for an input image and question, and it has been widely used in intelligent transport systems, smart cities, and other fields. Advanced VQA approaches model dense interactions between image regions and question words by designing co-attention mechanisms to achieve better accuracy. However, modeling interactions between every image region and every question word forces the model to compute irrelevant information, distracting its attention. In this paper, to solve this problem, we propose a novel model called Multi-modal Explicit Sparse Attention Networks (MESAN), which concentrates the model's attention by explicitly selecting the parts of the input features that are most relevant to answering the input question. This method, based on top-k selection, reduces the interference caused by irrelevant information and ultimately helps the model achieve better performance. The experimental results on the benchmark dataset VQA v2 demonstrate the effectiveness of our model. Our best single model delivers 70.71% and 71.08% overall accuracy on the test-dev and test-std sets, respectively. In addition, we demonstrate through attention visualization that our model obtains better attended features than other advanced models. Our work proves that models with sparse attention mechanisms can also achieve competitive results on VQA datasets. We hope that it can promote the development of VQA models and the application of related artificial intelligence (AI) technology.
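The core top-k selection is concrete enough to sketch directly: scaled dot-product attention where all but the k largest scores per query are masked out before the softmax (shapes and the value of k are illustrative):

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=8):
    """Explicit sparse attention by top-k selection: scores outside the
    k largest per query row are set to -inf, so each query attends only
    to its most relevant keys."""
    scores = q @ k.transpose(-2, -1) / k.size(-1) ** 0.5       # (Lq, Lk)
    kth = torch.topk(scores, k=min(top_k, scores.size(-1)),
                     dim=-1).values[..., -1:]                  # k-th largest per row
    masked = scores.masked_fill(scores < kth, float("-inf"))
    return F.softmax(masked, dim=-1) @ v

q, k, v = torch.randn(10, 64), torch.randn(30, 64), torch.randn(30, 64)
out = topk_sparse_attention(q, k, v, top_k=5)   # (10, 64)
```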


2021 ◽  
Vol 15 (4) ◽  
pp. 1-16
Author(s):  
Negin Ghasemi ◽  
Ramin Fatourechi ◽  
Saeedeh Momtazi

The number of users who have the appropriate knowledge to answer questions in community question answering is lower than the number of users who ask them. Therefore, finding expert users who can answer a given question is crucial. In this article, we propose a framework to find experts for given questions and assign the related questions to them. The proposed model benefits from users' relations in a community along with the lexical and semantic similarities between a new question and existing answers. Node embedding is applied to the community graph to find similar users. Our experiments on four different Stack Exchange datasets show that adding community relations improves the performance of expert finding models.
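A minimal sketch of combining the two signals, assuming pre-computed question/answer text vectors and community-graph node embeddings (the linear combination and cosine similarity are illustrative choices, not the paper's exact scoring):

```python
import numpy as np

def rank_experts(q_vec, user_text_vecs, user_node_vecs, asker_node_vec, beta=0.5):
    """Score candidate answerers for a new question by combining
    (i) text similarity between the question and each user's past
    answers and (ii) graph proximity from community node embeddings."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    text_sim = np.array([cos(q_vec, u) for u in user_text_vecs])
    graph_sim = np.array([cos(asker_node_vec, u) for u in user_node_vecs])
    return beta * text_sim + (1 - beta) * graph_sim  # higher = better expert

scores = rank_experts(np.random.rand(100),
                      np.random.rand(5, 100),   # 5 users' answer vectors
                      np.random.rand(5, 32),    # their node embeddings
                      np.random.rand(32))       # asker's node embedding
```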


2020 ◽  
Vol 18 (1) ◽  
pp. 1-7
Author(s):  
Adnen Mahmoud ◽  
Mounir Zrigui

Paraphrase detection determines whether an original document and a suspect document convey the same meaning. It has attracted attention from researchers in many Natural Language Processing (NLP) tasks such as plagiarism detection, question answering, and information retrieval. Traditional methods (e.g., Term Frequency-Inverse Document Frequency (TF-IDF), Latent Dirichlet Allocation (LDA), and Latent Semantic Analysis (LSA)) cannot efficiently capture hidden semantic relations when sentences share few or no common words or when word co-occurrences are rare. Therefore, we propose a deep learning model based on Global word embeddings (GloVe) and a Recurrent Convolutional Neural Network (RCNN), which efficiently captures contextual dependencies between word vectors with precise semantic meanings. Given the lack of publicly available resources for Arabic, we automatically developed a paraphrase corpus that preserves the syntactic and semantic structures of Arabic sentences using a word2vec model and Part-Of-Speech (POS) annotation. Overall, experiments showed that our proposed model outperforms state-of-the-art methods in terms of precision and recall.
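A compact sketch of an RCNN-style encoder over GloVe vectors for sentence-pair scoring, with illustrative layer sizes and a cosine comparison (not the authors' exact configuration):

```python
import torch
import torch.nn as nn

class RCNNEncoder(nn.Module):
    """Recurrent Convolutional sketch: a BiGRU provides left/right
    context around each (GloVe-initialized) word vector, then a 1x1
    convolution and max-pooling produce a sentence vector."""
    def __init__(self, emb_dim=300, hid=128):
        super().__init__()
        self.rnn = nn.GRU(emb_dim, hid, batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(2 * hid + emb_dim, 2 * hid, kernel_size=1)

    def forward(self, x):                    # x: (B, L, emb_dim) GloVe vectors
        ctx, _ = self.rnn(x)                 # (B, L, 2*hid) contextual states
        feats = torch.cat([ctx, x], dim=-1)  # word + contextual features
        feats = torch.tanh(self.conv(feats.transpose(1, 2)))  # (B, 2*hid, L)
        return feats.max(dim=-1).values      # max-pool over words: (B, 2*hid)

enc = RCNNEncoder()
s1, s2 = enc(torch.randn(2, 15, 300)), enc(torch.randn(2, 18, 300))
score = torch.cosine_similarity(s1, s2)      # paraphrase score per pair
```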

