Order-Free Learning Alleviating Exposure Bias in Multi-Label Classification

2020 ◽  
Vol 34 (04) ◽  
pp. 6038-6045
Author(s):  
Che-Ping Tsai ◽  
Hung-Yi Lee

Multi-label classification (MLC) assigns multiple labels to each sample. Prior studies show that MLC can be transformed into a sequence prediction problem with a recurrent neural network (RNN) decoder to model label dependency. However, training an RNN decoder requires a predefined order of labels, which is not directly available in the MLC specification. Moreover, an RNN trained this way tends to overfit the label combinations in the training set and has difficulty generating unseen label sequences. In this paper, we propose a new framework for MLC which does not rely on a predefined label order and thus alleviates exposure bias. The experimental results on three multi-label classification benchmark datasets show that our method outperforms competitive baselines by a large margin. We also find that the proposed approach has a higher probability than the baseline models of generating label combinations not seen during training. This result shows that the proposed approach has better generalization capability.
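The core problem the abstract describes can be seen in a minimal pure-Python sketch (not the authors' implementation): under a fixed label order, an order-sensitive sequence comparison penalizes a prediction that contains exactly the right labels in a different order, while an order-free set comparison does not. The labels below are made up for illustration.

```python
def sequence_match(pred, gold):
    """Order-sensitive comparison: a correct label set emitted in a
    different order counts as wrong, which is what a fixed-order
    RNN decoder is effectively trained toward."""
    return pred == gold

def set_match(pred, gold):
    """Order-free comparison: only the label set matters, as in the
    order-free framing of MLC."""
    return set(pred) == set(gold)

gold = ["dog", "grass", "outdoor"]
pred = ["outdoor", "dog", "grass"]   # same labels, different order

print(sequence_match(pred, gold))    # penalized by the fixed order
print(set_match(pred, gold))         # accepted when order is ignored
```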

Author(s):  
Giovanni Pilato ◽  
Filippo Sorbello ◽  
Giorgio Vassallo

In this paper, three quality factors are introduced to measure the quality of a neural network. Each factor deals with a particular aspect of quality: the network's ability to learn the training set samples; the generalization capability, related to the gradient of the network output function in the neighborhood of the training patterns; and the computational cost of the architecture during the production phase, related to the number of connections between neural units. The validity of the proposed solution has been tested on three well-known benchmarks. Experimental results show that the quality factors introduced in this paper can be a valid alternative to the test set method.
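The third quality factor, computational cost as a function of the number of connections, is easy to make concrete. The sketch below is an illustration assuming a plain fully connected feed-forward topology, not the paper's exact cost measure:

```python
def connection_count(layer_sizes):
    """Cost proxy for a fully connected feed-forward network:
    the total number of weights (connections) between
    consecutive layers."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# A 4-16-8-3 network: 4*16 + 16*8 + 8*3 = 216 connections.
print(connection_count([4, 16, 8, 3]))  # 216
```

Under this factor, two networks with equal training and generalization scores would be ranked by the smaller connection count.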


Author(s):  
Md. Asifuzzaman Jishan ◽  
Khan Raqib Mahmud ◽  
Abul Kalam Al Azad

We present a learning model that generates natural language descriptions of images. The model exploits the connections between natural language and visual data by producing text-line-based descriptions from a given image. Our Hybrid Recurrent Neural Network model builds on the Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Bi-directional Recurrent Neural Network (BRNN) models. We conducted experiments on three benchmark datasets, namely Flickr8K, Flickr30K, and MS COCO. Our hybrid model uses an LSTM to encode text lines or sentences independently of the object location and a BRNN for word representation; this reduces the computational complexity without compromising the accuracy of the descriptor. The model produces better accuracy in retrieving natural-language descriptions on these datasets.


2019 ◽  
Vol 2019 ◽  
pp. 1-11
Author(s):  
Yinping Gao ◽  
Daofang Chang ◽  
Ting Fang ◽  
Yiqun Fan

The effective forecasting of container volumes can provide decision support for port scheduling and operations. In this work, a long short-term memory (LSTM) recurrent neural network (RNN) is trained on the historical dataset to predict the daily volume of containers entering the storage yard. The raw dataset of daily container volumes in a certain port is chosen as the training set and preprocessed with a box plot. The LSTM model is then implemented in Python with the TensorFlow framework. This study also compares LSTM with other prediction methods, such as the ARIMA model and a BP neural network, and the prediction error of LSTM is lower than that of the other methods. These results suggest that the proposed LSTM model is helpful for predicting daily container volumes.
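The box-plot preprocessing step mentioned above usually means clipping or removing points outside the whiskers [Q1 - 1.5·IQR, Q3 + 1.5·IQR]. A minimal stdlib sketch of that idea (the paper does not specify whether outliers are clipped or dropped; clipping is assumed here, and the sample volumes are invented):

```python
def iqr_clip(values):
    """Box-plot style preprocessing: clip points outside
    [Q1 - 1.5*IQR, Q3 + 1.5*IQR] back to the whisker limits."""
    s = sorted(values)

    def quantile(q):
        # Linear interpolation between order statistics.
        pos = q * (len(s) - 1)
        lo, hi = int(pos), min(int(pos) + 1, len(s) - 1)
        return s[lo] + (s[hi] - s[lo]) * (pos - lo)

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [min(max(v, lo), hi) for v in values]

daily_volumes = [120, 130, 125, 128, 900, 122]  # 900 is an outlier spike
print(iqr_clip(daily_volumes))
```

The cleaned series is then windowed into input/target pairs for the LSTM.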


2020 ◽  
Vol 131 ◽  
pp. 291-299 ◽  
Author(s):  
Hang Su ◽  
Yingbai Hu ◽  
Hamid Reza Karimi ◽  
Alois Knoll ◽  
Giancarlo Ferrigno ◽  
...  

Satellite images are important for developing and protecting environmental resources and can be used for flood detection. Before-flooding and after-flooding satellite images are segmented, and their features are extracted by an integrated deep LRNN and CNN network to achieve high accuracy. It is also important that the trained LRNN and CNN can find the features of flooded regions sufficiently well, as this influences the effectiveness of flood relief. The data for the CNN and LRNN are divided into two sets, a training set and a testing set. During the training and testing phases, the before-flooding and after-flooding satellite images are segmented and extracted into data patches. All patches are processed by the LRNN so that changed or misdetected flooded regions are extracted accurately and without delay. The proposed method achieves a flood-region detection accuracy of 99%.
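The patch-extraction step described above can be sketched in a few lines of stdlib Python. This is an illustration under the assumption of non-overlapping square tiles (the paper does not state the patch size or overlap), with the image represented as a plain 2D list:

```python
def extract_patches(image, patch):
    """Split a 2D image (list of rows) into non-overlapping
    patch x patch tiles, as when feeding pre-/post-flood imagery
    to a patch-based classifier. Assumes dimensions divide evenly."""
    h = len(image)
    return [
        [row[c:c + patch] for row in image[r:r + patch]]
        for r in range(0, h, patch)
        for c in range(0, len(image[0]), patch)
    ]

img = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 image
patches = extract_patches(img, 2)
print(len(patches))   # four 2x2 patches
print(patches[0])
```

Corresponding patches from the before and after images would then be compared to flag changed (flooded) regions.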


2019 ◽  
Vol 11 (12) ◽  
pp. 247
Author(s):  
Xin Zhou ◽  
Peixin Dong ◽  
Jianping Xing ◽  
Peijia Sun

Accurate prediction of bus arrival times is a challenging problem in the public transportation field. Previous studies have shown that prediction accuracy improves when more heterogeneous measurements are used. So what other factors should be added to the prediction model? Traditional prediction methods mainly use the arrival time and the distance between stations, but do not make full use of dynamic factors such as passenger number, dwell time, and bus driving efficiency. We propose a novel approach that takes full advantage of these dynamic factors. Our approach is based on a Recurrent Neural Network (RNN). The experimental results indicate that a variety of prediction algorithms (such as the Support Vector Machine, Kalman filter, Multilayer Perceptron, and RNN) perform significantly better when dynamic factors are included. Further, we introduce an RNN with an attention mechanism to adaptively select the most relevant input factors. Experiments demonstrate that, with heterogeneous input factors, the prediction accuracy of the RNN with an attention mechanism is better than that of the RNN without one. The experimental results show the superior performance of our approach on the dataset provided by Jinan Public Transportation Corporation.
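The "adaptively select the most relevant input factors" step is, at its core, a softmax-weighted combination of the factor inputs. A minimal sketch, not the paper's architecture: in the real model the relevance scores are learned; here they are fixed numbers, and the factor values are invented for illustration.

```python
import math

def attention_weights(scores):
    """Softmax over relevance scores, yielding weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(factors, scores):
    """Weighted combination of heterogeneous input factors
    (e.g. distance, dwell time, passenger count), emphasizing
    the factors with the highest relevance scores."""
    return sum(w * f for w, f in zip(attention_weights(scores), factors))

factors = [3.2, 1.5, 0.8]   # hypothetical normalized factor values
scores = [2.0, 0.5, -1.0]   # hypothetical learned relevance scores
print(attend(factors, scores))
```

The attended value (here dominated by the first factor) is what feeds the recurrent prediction step in place of an unweighted concatenation.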


2020 ◽  
Vol 32 (23) ◽  
pp. 17309-17320
Author(s):  
Rolandos Alexandros Potamias ◽  
Georgios Siolas ◽  
Andreas-Georgios Stafylopatis

Figurative language (FL) is ubiquitous in social media discussion forums and chats, posing extra challenges to sentiment analysis endeavors. The identification of FL schemas in short texts remains largely an unresolved issue in the broader field of natural language processing, mainly due to their contradictory and metaphorical meaning content. The main FL expression forms are sarcasm, irony and metaphor. In the present paper, we employ advanced deep learning methodologies to tackle the problem of identifying these FL forms. Significantly extending our previous work (Potamias et al., in: International conference on engineering applications of neural networks, Springer, Berlin, pp 164–175, 2019), we propose a neural network methodology that builds on a recently proposed pre-trained transformer-based network architecture, further enhanced with a recurrent convolutional neural network. With this setup, data preprocessing is kept to a minimum. The performance of the devised hybrid neural architecture is tested on four benchmark datasets and contrasted with other relevant state-of-the-art methodologies and systems. The results demonstrate that the proposed methodology achieves state-of-the-art performance on all benchmark datasets, outperforming all other methodologies and published studies, often by a large margin.


Author(s):  
SOON-MAN CHOI ◽  
Il-SEOK OH

The conventional approach to the recognition of handwritten touching numeral pairs uses a two-step process: splitting the touching numerals and then recognizing the individual numerals. It shows a limitation mainly due to the large variation in touching styles between two numerals. In this paper, we adopt a segmentation-free approach, which regards a touching numeral pair as an atomic pattern. This raises two important issues: solving the large-set classification problem and constructing a large training set. For the 100-class classification, we use a modular neural network which consists of 100 separate subnetworks. We construct a training set that is balanced among the 100 classes and sufficiently large by extracting actual samples from a numeral database and synthesizing additional samples with a scheme that forces two numerals to touch. The experimental results show a promising performance.
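The balanced 100-class construction can be sketched abstractly: for every ordered digit pair 00–99, draw a fixed number of synthetic samples by combining one sample of each digit. In this stdlib sketch, string concatenation stands in for the paper's image-touching scheme, and the per-digit samples are toy placeholders:

```python
import itertools
import random

def synthesize_pairs(digit_samples, per_class):
    """Build a balanced 100-class training set: for each ordered
    digit pair (00..99), draw `per_class` synthetic samples by
    joining one randomly chosen sample of each digit.
    (Concatenation here stands in for forcing two numeral
    images to touch.)"""
    rng = random.Random(0)  # fixed seed for reproducibility
    dataset = {}
    for a, b in itertools.product(range(10), repeat=2):
        dataset[f"{a}{b}"] = [
            rng.choice(digit_samples[a]) + rng.choice(digit_samples[b])
            for _ in range(per_class)
        ]
    return dataset

samples = {d: [f"<{d}>"] for d in range(10)}   # one toy sample per digit
data = synthesize_pairs(samples, per_class=3)
print(len(data))        # 100 classes
print(len(data["07"]))  # 3 samples per class
```

Each of the 100 subnetworks in the modular classifier would then be trained against its own class versus the rest.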


2020 ◽  
Vol 34 (05) ◽  
pp. 9402-9409
Author(s):  
Lingyong Yan ◽  
Xianpei Han ◽  
Ben He ◽  
Le Sun

Bootstrapping for entity set expansion (ESE) has long been modeled as a multi-step pipelined process. Such a paradigm, unfortunately, often suffers from two main challenges: 1) the entities are expanded in multiple separate steps, which tends to introduce noisy entities and results in the semantic drift problem; 2) it is hard to exploit high-order entity-pattern relations for entity set expansion. In this paper, we propose an end-to-end bootstrapping neural network for entity set expansion, named BootstrapNet, which models bootstrapping in an encoder-decoder architecture. In the encoding stage, a graph attention network is used to capture both the first- and the high-order relations between entities and patterns and to encode useful information into their representations. In the decoding stage, the entities are sequentially expanded through a recurrent neural network, which outputs an entity at each step while its hidden state vectors, representing the target category, are updated after each expansion. Experimental results demonstrate substantial improvements of our model over previous ESE approaches.
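The sequential decoding loop — expand by one entity, then update a state representing the target category — can be illustrated with a tiny stdlib sketch. This is not BootstrapNet: the RNN state update is replaced by a running mean of the selected entities' vectors, and the 2-d embeddings are invented for illustration.

```python
def expand(seeds, candidates, emb, steps):
    """Sketch of sequential set expansion: keep a category state
    (here, the mean of selected entities' vectors); at each step,
    add the unselected candidate with the highest dot-product
    similarity to the state, then update the state."""
    selected = list(seeds)

    def mean_state():
        vecs = [emb[e] for e in selected]
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    state = mean_state()
    for _ in range(steps):
        best = max(
            (c for c in candidates if c not in selected),
            key=lambda c: sum(a * b for a, b in zip(emb[c], state)),
        )
        selected.append(best)
        state = mean_state()
    return selected

# Hypothetical embeddings: "paris"/"rome"/"berlin" cluster, "apple" does not.
emb = {"paris": [1.0, 0.1], "rome": [0.9, 0.2],
       "berlin": [0.95, 0.15], "apple": [0.0, 1.0]}
print(expand(["paris", "rome"], ["berlin", "apple"], emb, steps=1))
```

Without the state update, repeated greedy expansion from a fixed seed vector is exactly what lets semantic drift accumulate in pipelined bootstrapping.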


2021 ◽  
Vol 70 ◽  
pp. 545-566
Author(s):  
Yongjing Yin ◽  
Shaopeng Lai ◽  
Linfeng Song ◽  
Chulun Zhou ◽  
Xianpei Han ◽  
...  

As an important text coherence modeling task, sentence ordering aims to coherently organize a given set of unordered sentences. To achieve this goal, the most important step is to effectively capture and exploit the global dependencies among these sentences. In this paper, we propose a novel and flexible external-knowledge-enhanced graph-based neural network for sentence ordering. Specifically, we first represent the input sentences as a graph, where various kinds of relations (i.e., entity-entity, sentence-sentence and entity-sentence) are exploited to make the graph representation more expressive and less noisy. Then, we introduce a graph recurrent network to learn semantic representations of the sentences. To demonstrate the effectiveness of our model, we conduct experiments on several benchmark datasets. The experimental results and in-depth analysis show that our model significantly outperforms existing state-of-the-art models.
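The graph-construction step can be sketched simply: connect two sentences whenever they mention a common entity. This stdlib sketch collapses the paper's entity-sentence relations into direct sentence-sentence edges for brevity, and the example sentences and entity annotations are invented:

```python
def build_sentence_graph(sentences, entities_of):
    """Connect sentence i and sentence j with an undirected edge
    when they share at least one mentioned entity."""
    edges = set()
    n = len(sentences)
    for i in range(n):
        for j in range(i + 1, n):
            if entities_of[i] & entities_of[j]:
                edges.add((i, j))
    return edges

sentences = ["Marie Curie won the Nobel Prize.",
             "She shared the prize with Pierre Curie.",
             "The ceremony was held in Stockholm."]
entities = [{"Marie Curie", "Nobel Prize"},
            {"Nobel Prize", "Pierre Curie"},
            {"Stockholm"}]
print(build_sentence_graph(sentences, entities))  # {(0, 1)}
```

A graph recurrent network then propagates information along such edges so each sentence representation reflects its global context before ordering.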

