A Goal-Driven Tree-Structured Neural Model for Math Word Problems

Author(s):  
Zhipeng Xie ◽  
Shichao Sun

Most existing neural models for math word problems use Seq2Seq models to generate solution expressions sequentially from left to right. Their results are far from satisfactory because they lack the goal-driven mechanism commonly seen in human problem solving. This paper proposes a tree-structured neural model that generates an expression tree in a goal-driven manner. Given a math word problem, the model first identifies and encodes the goal to achieve, and then decomposes the goal, in a top-down recursive way, into sub-goals combined by an operator. The process is repeated until a goal is simple enough to be realized by a known quantity as a leaf node. During the process, two-layer gated-feedforward networks implement each step of goal decomposition, and a recursive neural network encodes fulfilled subtrees into subtree embeddings, which represent subtrees better than their goals alone. Experimental results on the Math23K dataset show that our tree-structured model significantly outperforms several state-of-the-art models.
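
The top-down decomposition described above can be illustrated with a minimal sketch. It is not the authors' model: the learned networks are replaced by a hypothetical lookup table (`TOY_POLICY`), the names `Node`, `decompose`, and `evaluate` are illustrative assumptions, and the subtree embeddings that condition the right sub-goal in the paper are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Binary operators that can combine two sub-goals.
OPS: Dict[str, Callable[[float, float], float]] = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


@dataclass
class Node:
    token: str                      # an operator or a quantity name
    left: Optional["Node"] = None
    right: Optional["Node"] = None


# A decision is (token, left_sub_goal, right_sub_goal); leaves carry no sub-goals.
Decision = Tuple[str, Optional[str], Optional[str]]


def decompose(goal: str, predict: Callable[[str], Decision]) -> Node:
    """Expand a goal top-down until every branch ends in a known quantity."""
    token, left_goal, right_goal = predict(goal)
    if token not in OPS:
        return Node(token)          # goal is realized by a quantity (leaf)
    return Node(token, decompose(left_goal, predict), decompose(right_goal, predict))


def evaluate(node: Node, quantities: Dict[str, float]) -> float:
    """Compute the value of a finished expression tree."""
    if node.token in OPS:
        return OPS[node.token](evaluate(node.left, quantities),
                               evaluate(node.right, quantities))
    return quantities[node.token]


# Hypothetical stand-in for the learned predictor (assumption, not from the paper).
TOY_POLICY: Dict[str, Decision] = {
    "total cost": ("*", "unit price", "item count"),
    "unit price": ("n1", None, None),
    "item count": ("+", "first batch", "second batch"),
    "first batch": ("n2", None, None),
    "second batch": ("n3", None, None),
}

tree = decompose("total cost", TOY_POLICY.__getitem__)
```

Calling `evaluate(tree, {"n1": 3.0, "n2": 2.0, "n3": 4.0})` computes the expression n1 * (n2 + n3) from the generated tree.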

Author(s):  
Kashif Munir ◽  
Hai Zhao ◽  
Zuchao Li

The task of semantic role labeling (SRL) is dedicated to finding the predicate-argument structure. Previous works on SRL are mostly supervised and do not consider the difficulty of labeling each example, which can be very expensive and time-consuming. In this article, we present the first neural unsupervised model for SRL. We decompose the task into two argument-related subtasks, identification and clustering, and propose a pipeline that correspondingly consists of two neural modules. First, we train a neural model on two syntax-aware, statistically developed rules. The neural model obtains a relevance signal for each token in a sentence and feeds it into a BiLSTM, followed by an adversarial layer that adds noise and classifies simultaneously, thus enabling the model to learn the semantic structure of a sentence. Then we propose another neural model for argument role clustering, which clusters the learned argument embeddings biased toward their dependency relations. Experiments on the CoNLL-2009 English dataset demonstrate that our model outperforms the previous state-of-the-art baseline in terms of non-neural models for argument identification and classification.


2020 ◽  
Vol 34 (05) ◽  
pp. 8576-8583 ◽  
Author(s):  
Yasumasa Onoe ◽  
Greg Durrett

Neural entity linking models are very powerful, but run the risk of overfitting to the domain they are trained in. For this problem, a “domain” is characterized not just by genre of text but even by factors as specific as the particular distribution of entities, as neural models tend to overfit by memorizing properties of frequent entities in a dataset. We tackle the problem of building robust entity linking models that generalize effectively and do not rely on labeled entity linking data with a specific entity distribution. Rather than predicting entities directly, our approach models fine-grained entity properties, which can help disambiguate between even closely related entities. We derive a large inventory of types (tens of thousands) from Wikipedia categories, and use hyperlinked mentions in Wikipedia to distantly label data and train an entity typing model. At test time, we classify a mention with this typing model and use soft type predictions to link the mention to the most similar candidate entity. We evaluate our entity linking system on the CoNLL-YAGO dataset (Hoffart et al. 2011) and show that our approach outperforms prior domain-independent entity linking systems. We also test our approach in a harder setting derived from the WikilinksNED dataset (Eshel et al. 2017) where all the mention-entity pairs are unseen during test time. Results indicate that our approach generalizes better than a state-of-the-art neural model on the dataset.
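
The test-time linking step can be sketched as follows. This is an illustrative assumption, not the paper's scoring function: each candidate is scored by summing the predicted probabilities of the types it carries, and `link_by_types`, the type names, and the candidate entries are all hypothetical.

```python
from typing import Dict, Set


def link_by_types(type_probs: Dict[str, float],
                  candidates: Dict[str, Set[str]]) -> str:
    """Return the candidate whose types best match the soft type predictions.

    type_probs: predicted probability per type for one mention.
    candidates: candidate entity name -> set of types it carries.
    """
    def score(entity: str) -> float:
        # Dot product between the soft predictions and a binary type vector.
        return sum(type_probs.get(t, 0.0) for t in candidates[entity])

    return max(candidates, key=score)
```

For a mention such as "Jordan" in a sports article, a typing model that assigns high probability to an athlete type would favor the basketball player over the country, even if neither entity was seen with labeled linking data. A plain sum favors candidates with many types, so a real system would likely normalize the score.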


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Majid Asgari-Bidhendi ◽  
Mehrdad Nasser ◽  
Behrooz Janfada ◽  
Behrouz Minaei-Bidgoli

Relation extraction is the task of extracting semantic relations between entities in a sentence. It is an essential part of some natural language processing tasks such as information extraction, knowledge extraction, question answering, and knowledge base population. The main motivations of this research stem from a lack of a dataset for relation extraction in the Persian language as well as the necessity of extracting knowledge from the growing big data in the Persian language for different applications. In this paper, we present “PERLEX” as the first Persian dataset for relation extraction, which is an expert-translated version of the “SemEval-2010-Task-8” dataset. Moreover, this paper addresses Persian relation extraction utilizing state-of-the-art language-agnostic algorithms. We employ six different models for relation extraction on the proposed bilingual dataset, including a non-neural model (as the baseline), three neural models, and two deep learning models fed by multilingual BERT contextual word representations. The experiments result in the maximum F1-score of 77.66% (provided by BERTEM-MTB method) as the state of the art of relation extraction in the Persian language.


2021 ◽  
Vol 39 (2) ◽  
pp. 1-26
Author(s):  
Shen Gao ◽  
Xiuying Chen ◽  
Zhaochun Ren ◽  
Dongyan Zhao ◽  
Rui Yan

In e-commerce portals, generating answers for product-related questions has become a crucial task. In this article, we focus on the task of product-aware answer generation, which learns to generate an accurate and complete answer from large-scale unlabeled e-commerce reviews and product attributes. However, safe answer problems (i.e., neural models tend to generate meaningless and universal answers) pose significant challenges to text generation tasks, and the e-commerce question-answering task is no exception. To generate more meaningful answers, we propose a novel generative neural model, called the Meaningful Product Answer Generator (MPAG), which alleviates the safe answer problem by taking product reviews, product attributes, and a prototype answer into consideration. Product reviews and product attributes are used to provide meaningful content, while the prototype answer can yield a more diverse answer pattern. To this end, we propose a novel answer generator with a review reasoning module and a prototype answer reader. Our key idea is to obtain the correct question-aware information from a large-scale collection of reviews and learn how to write a coherent and meaningful answer from an existing prototype answer. To be more specific, we propose a read-and-write memory consisting of selective writing units to conduct reasoning among these reviews. We then employ a prototype reader consisting of comprehensive matching to extract the answer skeleton from the prototype answer. Finally, we propose an answer editor to generate the final answer by taking the question and the above parts as input. Extensive experiments on a real-world dataset collected from an e-commerce platform show that our model achieves state-of-the-art performance in terms of both automatic metrics and human evaluations. Human evaluation also demonstrates that our model can consistently generate specific and proper answers.


Author(s):  
My Kieu ◽  
Andrew D. Bagdanov ◽  
Marco Bertini

Pedestrian detection is a canonical problem for safety and security applications, and it remains challenging because of the highly variable lighting conditions in which pedestrians must be detected. This article investigates several domain adaptation approaches to adapt RGB-trained detectors to the thermal domain. Building on our earlier work on domain adaptation for privacy-preserving pedestrian detection, we conduct an extensive experimental evaluation comparing top-down and bottom-up domain adaptation and also propose two new bottom-up domain adaptation strategies. For top-down domain adaptation, we leverage a detector pre-trained on RGB imagery and efficiently adapt it to perform pedestrian detection in the thermal domain. Our bottom-up domain adaptation approaches include two steps: first, we train an adapter segment, corresponding to the initial layers of the RGB-trained detector, to adapt to the new input distribution; then, we reconnect the adapter segment to the original RGB-trained detector for final adaptation with a top-down loss. To the best of our knowledge, our bottom-up domain adaptation approaches outperform the best-performing single-modality pedestrian detection results on KAIST and outperform the state of the art on FLIR.


2021 ◽  
pp. 1-12
Author(s):  
Yingwen Fu ◽  
Nankai Lin ◽  
Xiaotian Lin ◽  
Shengyi Jiang

Named entity recognition (NER) is fundamental to natural language processing (NLP). Most state-of-the-art research on NER is based on pre-trained language models (PLMs) or classic neural models. However, this research is mainly oriented to high-resource languages such as English. For Indonesian, related resources (both datasets and technology) are not yet well developed. Moreover, affixes are an important part of word formation in Indonesian, which makes character and token features essential for token-wise Indonesian NLP tasks. However, the features extracted by current top-performing models are insufficient. To address the Indonesian NER task, in this paper we build an Indonesian NER dataset (IDNER) comprising over 50 thousand sentences (over 670 thousand tokens) to alleviate the shortage of labeled resources in Indonesian. Furthermore, we construct a hierarchical structured-attention-based model (HSA) for Indonesian NER to extract sequence features from different perspectives. Specifically, we use an enhanced convolutional structure as well as an enhanced attention structure to extract deeper features from characters and tokens. Experimental results show that HSA achieves competitive performance on IDNER and three benchmark datasets.


Agriculture ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 307
Author(s):  
Dawid Wojcieszak ◽  
Maciej Zaborowicz ◽  
Jacek Przybył ◽  
Piotr Boniecki ◽  
Aleksander Jędruś

Neural image analysis is commonly used to solve scientific problems of biosystems and mechanical engineering. The method has been applied, for example, to assess the quality of foodstuffs such as fruit and vegetables, cereal grains, and meat. The method can also be used to analyse composting processes. This scientific problem leads to the research hypothesis: it is possible to identify representative traits of the image of composted material that are necessary to create a neural model supporting the assessment of the content of dry matter and dry organic matter in composted material. The research identified selected features of the composted material, and the methods of neural image analysis resulted in a new, original method enabling effective assessment of the content of dry matter and dry organic matter. The content of dry matter and dry organic matter can be analysed by means of parameters specifying the colour of compost. The best neural models developed for the assessment of the content of dry matter and dry organic matter in compost are: in visible light, RBF 19:19-2-1:1 (test error 0.0922) and MLP 14:14-14-11-1:1 (test error 0.1722); in mixed light, RBF 30:30-8-1:1 (test error 0.0764) and MLP 7:7-9-7-1:1 (test error 0.1795). The neural models generated for the compost images taken in mixed light had better qualitative characteristics.


2020 ◽  
Vol 34 (05) ◽  
pp. 9057-9064
Author(s):  
Bayu Trisedya ◽  
Jianzhong Qi ◽  
Rui Zhang

We study neural data-to-text generation. Specifically, we consider a target entity that is associated with a set of attributes, and we aim to generate a sentence that describes the target entity. Previous studies use encoder-decoder frameworks where the encoder treats the input as a linear sequence and uses an LSTM to encode the sequence. However, linearizing a set of attributes may not yield the proper order of the attributes, and hence leads the encoder to produce an improper context for generating a description. To handle disordered input, recent studies propose two-stage neural models that use pointer networks to generate a content-plan (i.e., content-planner) and use the content-plan as input for an encoder-decoder model (i.e., text generator). However, in two-stage models, the content-planner may yield an incomplete content-plan that misses one or more salient attributes. This will in turn cause the text generator to generate an incomplete description. To address these problems, we propose a novel attention model that exploits a content-plan to highlight salient attributes in a proper order. The challenge of integrating a content-plan in the attention model of an encoder-decoder framework is to align the content-plan and the generated description. We handle this problem by devising a coverage mechanism that tracks the extent to which the content-plan has been exposed in previous decoding time-steps, which helps our proposed attention model select the attributes to be mentioned in the description in a proper order. Experimental results show that our model outperforms state-of-the-art baselines by up to 3% and 5% in terms of BLEU score on two real-world datasets, respectively.
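
The idea of a coverage mechanism can be illustrated with a minimal sketch. It shows the generic technique of penalizing attributes that have already received attention, not the formulation used in the paper. The function names and the `penalty` parameter are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend_with_coverage(scores: List[float],
                         coverage: List[float],
                         penalty: float = 5.0) -> Tuple[List[float], List[float]]:
    """One decoding step of attention over attributes with a coverage penalty.

    scores:   raw relevance score per attribute at this step.
    coverage: attention mass each attribute has accumulated so far.
    penalty:  hypothetical strength of the discount for covered attributes.
    """
    adjusted = [s - penalty * c for s, c in zip(scores, coverage)]
    weights = softmax(adjusted)
    # Accumulate this step's attention into the coverage vector.
    new_coverage = [c + w for c, w in zip(coverage, weights)]
    return weights, new_coverage
```

Calling the function twice with the same raw scores moves attention away from the attribute that dominated the first step, which is the behavior that lets a decoder move on to attributes not yet mentioned.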


2009 ◽  
Vol 2 (1) ◽  
pp. 21-27 ◽  

Abstract. The general objective of this study is to estimate the performance of the Horizontal Roughing Filter (HRF) by using Weglin's design criteria based on the 1/3–2/3 filter theory. The main objective of the present study is to validate the HRF developed in the laboratory with Slow Sand Filter (SSF) as a pretreatment unit, with the help of Weglin's design criteria for HRF with respect to raw water condition and a neuro-genetic model developed from the filter dataset. The results from the three models were compared to determine whether the performance of the experimental HRF with SSF output conforms to the other two models, which would verify the validity of the former. According to the results, the experimental setup was consistent with the neural model but inconsistent with the results from Weglin's formula, as the lowest mean square error against the values from the experimental SSF-HRF unit was observed for the neuro-genetic model. As neural models are known to learn a problem with utmost efficiency, the model verification result was taken as positive.
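
The comparison by mean square error can be sketched as below. This is a generic illustration rather than the study's procedure. The function names are assumptions, and no values from the study are used.

```python
from typing import Dict, Sequence


def mean_squared_error(observed: Sequence[float],
                       predicted: Sequence[float]) -> float:
    """Average squared difference between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def closest_model(observed: Sequence[float],
                  model_outputs: Dict[str, Sequence[float]]) -> str:
    """Return the name of the model whose outputs have the lowest MSE."""
    return min(model_outputs,
               key=lambda name: mean_squared_error(observed, model_outputs[name]))
```

Given the measurements from an experimental unit and the outputs of each candidate model, `closest_model` returns the model that agrees most closely with the experiment under this metric.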


Author(s):  
Lianli Gao ◽  
Zhilong Zhou ◽  
Heng Tao Shen ◽  
Jingkuan Song

Image edge detection is considered a cornerstone task in computer vision. Because of the hierarchical representations learned by CNNs, it is intuitive to design side networks that use the richer convolutional features to improve edge detection. However, there is no consensus on how to integrate the hierarchical information. In this paper, we propose an effective end-to-end framework, named Bidirectional Additive Net (BAN), for image edge detection. In the proposed framework, we focus on two main problems: 1) how to design a universal network that incorporates hierarchical information sufficiently; and 2) how to achieve effective information flow between different stages and gradually improve the edge map stage by stage. To tackle these problems, we design a consecutive bottom-up and top-down architecture, where a bottom-up branch can gradually remove detailed or sharp boundaries to enable accurate edge detection and a top-down branch offers a chance of error-correcting by revisiting the low-level features that contain rich textual and spatial information. An attended additive module (AAM) is designed to cumulatively refine edges by selecting pivotal features in each stage. Experimental results show that our proposed methods improve edge detection performance to new records and achieve state-of-the-art results on two public benchmarks: BSDS500 and NYUDv2.

