A Goal-Driven Tree-Structured Neural Model for Math Word Problems

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/736 ◽

2019 ◽

Cited By ~ 1

Author(s):

Zhipeng Xie ◽

Shichao Sun

Keyword(s):

Word Problems ◽

State Of The Art ◽

Neural Model ◽

Neural Models ◽

Top Down ◽

Feedforward Networks ◽

Whole Process ◽

Expression Tree ◽

Generate Solution ◽

Human Problem

Most existing neural models for math word problems use Seq2Seq models to generate solution expressions sequentially from left to right; their results are far from satisfactory because they lack the goal-driven mechanism commonly seen in human problem solving. This paper proposes a tree-structured neural model that generates an expression tree in a goal-driven manner. Given a math word problem, the model first identifies and encodes the goal to achieve, and then decomposes the goal into sub-goals combined by an operator in a top-down recursive way. The whole process is repeated until the goal is simple enough to be realized by a known quantity as a leaf node. During the process, two-layer gated-feedforward networks implement each step of goal decomposition, and a recursive neural network encodes fulfilled subtrees into subtree embeddings, which represent subtrees better than the simple goals of the subtrees do. Experimental results on the Math23K dataset show that our tree-structured model significantly outperforms several state-of-the-art models.
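
The top-down decomposition described in this abstract can be illustrated with a small sketch. It is not the authors' model: the neural goal-decomposition step is replaced by a hypothetical pre-order token stream, and the names `Node`, `build_tree`, and `evaluate` are assumptions made for illustration. Each call either emits an operator and recurses into two sub-goals (left before right, also an assumption of the sketch) or stops at a known quantity as a leaf.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Union

# Binary operators that can combine two sub-goals.
OPERATORS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


@dataclass
class Node:
    token: Union[str, float]
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(tokens: Iterator[Union[str, float]]) -> Node:
    """Build an expression tree top-down from a pre-order token stream.

    The token stream stands in for the model's predictions (hypothetical):
    an operator splits the current goal into two sub-goals, while a number
    realizes the goal as a leaf node.
    """
    token = next(tokens)
    if token in OPERATORS:
        left = build_tree(tokens)    # first sub-goal
        right = build_tree(tokens)   # second sub-goal
        return Node(token, left, right)
    return Node(float(token))


def evaluate(node: Node) -> float:
    """Evaluate a completed expression tree bottom-up."""
    if node.left is None or node.right is None:
        return node.token
    return OPERATORS[node.token](evaluate(node.left), evaluate(node.right))


# "3 boxes of 4 apples plus 2 extra apples" written in pre-order: + (* 3 4) 2
tree = build_tree(iter(["+", "*", 3, 4, 2]))
print(evaluate(tree))
```

Generating in pre-order means every partial output is a prefix of a valid tree, which is the structural property a left-to-right Seq2Seq decoder does not guarantee.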

Neural Unsupervised Semantic Role Labeling

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3461613 ◽

2021 ◽

Vol 20 (6) ◽

pp. 1-16

Author(s):

Kashif Munir ◽

Hai Zhao ◽

Zuchao Li

Keyword(s):

Argument Structure ◽

State Of The Art ◽

Neural Model ◽

Semantic Structure ◽

Semantic Role ◽

Semantic Role Labeling ◽

Neural Models ◽

Previous State ◽

Dependency Relations ◽

Predicate Argument Structure

The task of semantic role labeling (SRL) is dedicated to finding the predicate-argument structure. Previous works on SRL are mostly supervised and do not consider the difficulty of labeling each example, which can be very expensive and time-consuming. In this article, we present the first neural unsupervised model for SRL. We decompose the task into two argument-related subtasks, identification and clustering, and propose a pipeline that correspondingly consists of two neural modules. First, we train a neural model on two syntax-aware, statistically developed rules. The neural model gets the relevance signal for each token in a sentence and feeds it into a BiLSTM, followed by an adversarial layer that adds noise and classifies simultaneously, thus enabling the model to learn the semantic structure of a sentence. Then we propose another neural model for argument role clustering, which clusters the learned argument embeddings biased toward their dependency relations. Experiments on the CoNLL-2009 English dataset demonstrate that our model outperforms the previous state-of-the-art baseline among non-neural models for argument identification and classification.

Fine-Grained Entity Typing for Domain Independent Entity Linking

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6380 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8576-8583 ◽

Cited By ~ 1

Author(s):

Yasumasa Onoe ◽

Greg Durrett

Keyword(s):

State Of The Art ◽

Neural Model ◽

Test Time ◽

Entity Linking ◽

Neural Models ◽

Label Data ◽

Fine Grained ◽

Independent Entity ◽

Domain Independent ◽

Better Than

Neural entity linking models are very powerful, but run the risk of overfitting to the domain they are trained in. For this problem, a “domain” is characterized not just by genre of text but even by factors as specific as the particular distribution of entities, as neural models tend to overfit by memorizing properties of frequent entities in a dataset. We tackle the problem of building robust entity linking models that generalize effectively and do not rely on labeled entity linking data with a specific entity distribution. Rather than predicting entities directly, our approach models fine-grained entity properties, which can help disambiguate between even closely related entities. We derive a large inventory of types (tens of thousands) from Wikipedia categories, and use hyperlinked mentions in Wikipedia to distantly label data and train an entity typing model. At test time, we classify a mention with this typing model and use soft type predictions to link the mention to the most similar candidate entity. We evaluate our entity linking system on the CoNLL-YAGO dataset (Hoffart et al. 2011) and show that our approach outperforms prior domain-independent entity linking systems. We also test our approach in a harder setting derived from the WikilinksNED dataset (Eshel et al. 2017) where all the mention-entity pairs are unseen during test time. Results indicate that our approach generalizes better than a state-of-the-art neural model on the dataset.
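
The test-time step in this abstract (classify the mention into fine-grained types, then link to the most similar candidate) can be sketched as follows. This is a minimal illustration, not the authors' system: the scoring rule (sum of predicted probabilities over a candidate's types), the function names, and the toy type inventory are all assumptions.

```python
from typing import Dict, Set


def score_candidate(type_probs: Dict[str, float], candidate_types: Set[str]) -> float:
    """Score a candidate by summing the mention's soft type predictions
    over the types attached to that candidate (assumed scoring rule)."""
    return sum(type_probs.get(t, 0.0) for t in candidate_types)


def link_mention(type_probs: Dict[str, float], candidates: Dict[str, Set[str]]) -> str:
    """Return the candidate entity whose types best match the predictions."""
    return max(candidates, key=lambda name: score_candidate(type_probs, candidates[name]))


# Hypothetical soft predictions for "Washington" in "Washington signed the bill".
type_probs = {"person": 0.95, "politician": 0.90, "us_state": 0.10, "city": 0.05}

# Hypothetical candidate entities with toy type sets.
candidates = {
    "George Washington": {"person", "politician"},
    "Washington, D.C.": {"city", "capital"},
    "Washington (state)": {"us_state"},
}

print(link_mention(type_probs, candidates))
```

Because the linker only compares type sets, it needs no entity-specific training examples, which is what lets this kind of approach transfer across entity distributions.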

PERLEX: A Bilingual Persian-English Gold Dataset for Relation Extraction

Scientific Programming ◽

10.1155/2021/8893270 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Majid Asgari-Bidhendi ◽

Mehrdad Nasser ◽

Behrooz Janfada ◽

Behrouz Minaei-Bidgoli

Keyword(s):

Language Processing ◽

Question Answering ◽

State Of The Art ◽

Relation Extraction ◽

Neural Model ◽

Semantic Relations ◽

Base Population ◽

Neural Models ◽

Persian Language ◽

Knowledge Base Population

Relation extraction is the task of extracting semantic relations between entities in a sentence. It is an essential part of some natural language processing tasks such as information extraction, knowledge extraction, question answering, and knowledge base population. The main motivations of this research stem from a lack of a dataset for relation extraction in the Persian language as well as the necessity of extracting knowledge from the growing big data in the Persian language for different applications. In this paper, we present “PERLEX” as the first Persian dataset for relation extraction, which is an expert-translated version of the “SemEval-2010-Task-8” dataset. Moreover, this paper addresses Persian relation extraction utilizing state-of-the-art language-agnostic algorithms. We employ six different models for relation extraction on the proposed bilingual dataset, including a non-neural model (as the baseline), three neural models, and two deep learning models fed by multilingual BERT contextual word representations. The experiments result in the maximum F1-score of 77.66% (provided by BERTEM-MTB method) as the state of the art of relation extraction in the Persian language.

Meaningful Answer Generation of E-Commerce Question-Answering

ACM Transactions on Information Systems ◽

10.1145/3432689 ◽

2021 ◽

Vol 39 (2) ◽

pp. 1-26

Author(s):

Shen Gao ◽

Xiuying Chen ◽

Zhaochun Ren ◽

Dongyan Zhao ◽

Rui Yan

Keyword(s):

Large Scale ◽

Question Answering ◽

State Of The Art ◽

Neural Model ◽

Product Reviews ◽

Neural Models ◽

Product Attributes ◽

Human Evaluation ◽

Final Answer ◽

Answer Pattern

In e-commerce portals, generating answers for product-related questions has become a crucial task. In this article, we focus on the task of product-aware answer generation, which learns to generate an accurate and complete answer from large-scale unlabeled e-commerce reviews and product attributes. However, safe answer problems (i.e., neural models tend to generate meaningless and universal answers) pose significant challenges to text generation tasks, and the e-commerce question-answering task is no exception. To generate more meaningful answers, we propose a novel generative neural model, called the Meaningful Product Answer Generator (MPAG), which alleviates the safe answer problem by taking product reviews, product attributes, and a prototype answer into consideration. Product reviews and product attributes are used to provide meaningful content, while the prototype answer can yield a more diverse answer pattern. To this end, we propose a novel answer generator with a review reasoning module and a prototype answer reader. Our key idea is to obtain the correct question-aware information from a large-scale collection of reviews and learn how to write a coherent and meaningful answer from an existing prototype answer. To be more specific, we propose a read-and-write memory consisting of selective writing units to conduct reasoning among these reviews. We then employ a prototype reader consisting of comprehensive matching to extract the answer skeleton from the prototype answer. Finally, we propose an answer editor to generate the final answer by taking the question and the above parts as input. In extensive experiments on a real-world dataset collected from an e-commerce platform, our model achieves state-of-the-art performance in terms of both automatic metrics and human evaluations. Human evaluation also demonstrates that our model can consistently generate specific and proper answers.

Bottom-up and Layerwise Domain Adaptation for Pedestrian Detection in Thermal Images

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3418213 ◽

2021 ◽

Vol 17 (1) ◽

pp. 1-19

Author(s):

My Kieu ◽

Andrew D. Bagdanov ◽

Marco Bertini

Keyword(s):

Domain Adaptation ◽

State Of The Art ◽

Pedestrian Detection ◽

Challenging Problem ◽

Top Down ◽

Bottom Up ◽

Security Applications ◽

Lighting Conditions ◽

Initial Layers ◽

Single Modality

Pedestrian detection is a canonical problem for safety and security applications, and it remains a challenging problem due to the highly variable lighting conditions in which pedestrians must be detected. This article investigates several domain adaptation approaches to adapt RGB-trained detectors to the thermal domain. Building on our earlier work on domain adaptation for privacy-preserving pedestrian detection, we conduct an extensive experimental evaluation comparing top-down and bottom-up domain adaptation and also propose two new bottom-up domain adaptation strategies. For top-down domain adaptation, we leverage a detector pre-trained on RGB imagery and efficiently adapt it to perform pedestrian detection in the thermal domain. Our bottom-up domain adaptation approaches include two steps: first, we train an adapter segment corresponding to the initial layers of the RGB-trained detector to adapt to the new input distribution; then, we reconnect the adapter segment to the original RGB-trained detector for final adaptation with a top-down loss. To the best of our knowledge, our bottom-up domain adaptation approaches outperform the best-performing single-modality pedestrian detection results on KAIST and outperform the state of the art on FLIR.

Towards corpus and model: Hierarchical structured-attention-based features for Indonesian named entity recognition

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202286 ◽

2021 ◽

pp. 1-12

Author(s):

Yingwen Fu ◽

Nankai Lin ◽

Xiaotian Lin ◽

Shengyi Jiang

Keyword(s):

Language Processing ◽

State Of The Art ◽

Named Entity Recognition ◽

Entity Recognition ◽

Language Models ◽

Neural Models ◽

Performance Models ◽

Named Entity ◽

High Resource ◽

Benchmark Datasets

Named entity recognition (NER) is fundamental to natural language processing (NLP). Most state-of-the-art research on NER is based on pre-trained language models (PLMs) or classic neural models. However, this research is mainly oriented to high-resource languages such as English. For Indonesian, related resources (both datasets and technology) are not yet well developed. In addition, affixes are an important part of word composition in Indonesian, which makes character and token features essential for token-wise Indonesian NLP tasks. However, the features extracted by current top-performing models are insufficient. Targeting the Indonesian NER task, in this paper we build an Indonesian NER dataset (IDNER) comprising over 50 thousand sentences (over 670 thousand tokens) to alleviate the shortage of labeled resources in Indonesian. Furthermore, we construct a hierarchical structured-attention-based model (HSA) for Indonesian NER to extract sequence features from different perspectives. Specifically, we use an enhanced convolutional structure as well as an enhanced attention structure to extract deeper features from characters and tokens. Experimental results show that HSA achieves competitive performance on IDNER and three benchmark datasets.

Assessment of the Content of Dry Matter and Dry Organic Matter in Compost with Neural Modelling Methods

Agriculture ◽

10.3390/agriculture11040307 ◽

2021 ◽

Vol 11 (4) ◽

pp. 307

Author(s):

Dawid Wojcieszak ◽

Maciej Zaborowicz ◽

Jacek Przybył ◽

Piotr Boniecki ◽

Aleksander Jędruś

Keyword(s):

Image Analysis ◽

Organic Matter ◽

Dry Matter ◽

Neural Model ◽

Original Method ◽

Neural Models ◽

Test Error ◽

Effective Assessment ◽

Modelling Methods

Neural image analysis is commonly used to solve scientific problems of biosystems and mechanical engineering. The method has been applied, for example, to assess the quality of foodstuffs such as fruit and vegetables, cereal grains, and meat. The method can also be used to analyse composting processes. The scientific problem lets us formulate the research hypothesis: it is possible to identify representative traits of the image of composted material that are necessary to create a neural model supporting the assessment of the content of dry matter and dry organic matter in composted material. The research identified selected features of the composted material, and the methods of neural image analysis resulted in a new, original method enabling effective assessment of the content of dry matter and dry organic matter. The content of dry matter and dry organic matter can be analysed by means of parameters specifying the colour of compost. The best neural models developed for the assessment of the content of dry matter and dry organic matter in compost are: in visible light, RBF 19:19-2-1:1 (test error 0.0922) and MLP 14:14-14-11-1:1 (test error 0.1722); in mixed light, RBF 30:30-8-1:1 (test error 0.0764) and MLP 7:7-9-7-1:1 (test error 0.1795). The neural models generated for the compost images taken in mixed light had better qualitative characteristics.

Sentence Generation for Entity Description with Content-Plan Attention

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6439 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9057-9064

Author(s):

Bayu Trisedya ◽

Jianzhong Qi ◽

Rui Zhang

Keyword(s):

State Of The Art ◽

Neural Models ◽

Time Step ◽

Two Stage ◽

Sentence Generation ◽

Neural Data ◽

Attention Model ◽

Linear Sequence ◽

Proper Order ◽

Real World Datasets

We study neural data-to-text generation. Specifically, we consider a target entity that is associated with a set of attributes. We aim to generate a sentence to describe the target entity. Previous studies use encoder-decoder frameworks where the encoder treats the input as a linear sequence and uses LSTM to encode the sequence. However, linearizing a set of attributes may not yield the proper order of the attributes, and hence leads the encoder to produce an improper context to generate a description. To handle disordered input, recent studies propose two-stage neural models that use pointer networks to generate a content-plan (i.e., content-planner) and use the content-plan as input for an encoder-decoder model (i.e., text generator). However, in two-stage models, the content-planner may yield an incomplete content-plan, due to missing one or more salient attributes in the generated content-plan. This will in turn cause the text generator to generate an incomplete description. To address these problems, we propose a novel attention model that exploits content-plan to highlight salient attributes in a proper order. The challenge of integrating a content-plan in the attention model of an encoder-decoder framework is to align the content-plan and the generated description. We handle this problem by devising a coverage mechanism to track the extent to which the content-plan is exposed in the previous decoding time-step, and hence it helps our proposed attention model select the attributes to be mentioned in the description in a proper order. Experimental results show that our model outperforms state-of-the-art baselines by up to 3% and 5% in terms of BLEU score on two real-world datasets, respectively.
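
The idea of a coverage mechanism, tracking how much attention each attribute has already received so that later decoding steps move on to attributes not yet mentioned, can be illustrated with a toy sketch. This is a generic coverage penalty, not the paper's formulation: the scores, the `penalty` weight, and the function names are assumptions chosen for illustration.

```python
import math
from typing import List


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def decode_order(scores: List[float], steps: int, penalty: float = 10.0) -> List[int]:
    """Return the attribute index attended most strongly at each step.

    `coverage` accumulates past attention weights; subtracting it
    (scaled by `penalty`) lowers the score of attributes already covered.
    """
    coverage = [0.0] * len(scores)
    order = []
    for _ in range(steps):
        adjusted = [s - penalty * c for s, c in zip(scores, coverage)]
        attention = softmax(adjusted)
        coverage = [c + a for c, a in zip(coverage, attention)]
        order.append(max(range(len(attention)), key=attention.__getitem__))
    return order


# Hypothetical relevance scores for three attributes of an entity.
print(decode_order([4.0, 2.0, 0.0], steps=3))
```

Without the coverage term (`penalty=0.0`), the same attribute wins at every step, which is the repetition and omission problem a coverage mechanism is meant to reduce.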

Verification of filter efficiency of horizontal roughing filter by Weglin's design criteria and Artificial Neural Network

Drinking Water Engineering and Science ◽

10.5194/dwes-2-21-2009 ◽

2009 ◽

Vol 2 (1) ◽

pp. 21-27 ◽

Cited By ~ 1

Author(s):


Keyword(s):

Genetic Model ◽

Neural Model ◽

Model Verification ◽

Design Criteria ◽

Experimental Setup ◽

Mean Square ◽

Raw Water ◽

Neural Models ◽

Sand Filter ◽

Slow Sand Filter

The general objective of this study is to estimate the performance of the Horizontal Roughing Filter (HRF) by using Weglin's design criteria based on 1/3–2/3 filter theory. The main objective of the present study is to validate an HRF developed in the laboratory, used with a Slow Sand Filter (SSF) as a pretreatment unit, against Weglin's design criteria for HRF with respect to raw water condition and against a neuro-genetic model developed from the filter dataset. The results from the three models were compared to find whether the performance of the experimental HRF with SSF output conforms to the other two models, which would verify the validity of the former. According to the results, the experimental setup was coherent with the neural model but incoherent with the results from Weglin's formula, as the lowest mean square error was observed for the neuro-genetic model when compared with the values found from the experimental SSF-HRF unit. As neural models are known to learn a problem with utmost efficiency, the model verification result was taken as positive.
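
The comparison in this abstract rests on mean square error between model predictions and experimental measurements. A minimal sketch of that calculation follows; the function name is an assumption, and the numbers are illustrative placeholders, not values from the study.

```python
from typing import Sequence


def mean_square_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of squared differences between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Placeholder values only (NOT data from the paper).
experimental = [1.0, 2.0, 3.0]
model_a = [1.0, 2.0, 5.0]   # hypothetical predictions of one model
model_b = [1.0, 2.5, 3.0]   # hypothetical predictions of another model

errors = {
    "model_a": mean_square_error(experimental, model_a),
    "model_b": mean_square_error(experimental, model_b),
}
# The model with the lowest error agrees best with the experimental unit.
print(min(errors, key=errors.get))
```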

Bottom-up and Top-down: Bidirectional Additive Net for Edge Detection

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/83 ◽

2020 ◽

Author(s):

Lianli Gao ◽

Zhilong Zhou ◽

Heng Tao Shen ◽

Jingkuan Song

Keyword(s):

Edge Detection ◽

Spatial Information ◽

New Records ◽

State Of The Art ◽

Top Down ◽

Bottom Up ◽

Image Edge Detection ◽

Universal Network ◽

Image Edge ◽

Hierarchical Representations

Image edge detection is considered a cornerstone task in computer vision. Due to the nature of hierarchical representations learned in CNNs, it is intuitive to design side networks that use the richer convolutional features to improve edge detection. However, there is no consensus on how to integrate the hierarchical information. In this paper, we propose an effective, end-to-end framework, named Bidirectional Additive Net (BAN), for image edge detection. In the proposed framework, we focus on two main problems: 1) how to design a universal network for incorporating hierarchical information sufficiently; and 2) how to achieve effective information flow between different stages and gradually improve the edge map stage by stage. To tackle these problems, we design a consecutive bottom-up and top-down architecture, where a bottom-up branch can gradually remove detailed or sharp boundaries to enable accurate edge detection and a top-down branch offers a chance of error-correcting by revisiting the low-level features that contain rich textual and spatial information. An attended additive module (AAM) is designed to cumulatively refine edges by selecting pivotal features in each stage. Experimental results show that our proposed methods improve edge detection performance to new records and achieve state-of-the-art results on two public benchmarks: BSDS500 and NYUDv2.
