Implementation of Sparse Neural Networks on Fixed Size Arrays

Author(s):  
Manavendra Misra ◽  
V. K. Prasanna Kumar
Keyword(s):  
2021 ◽  
Vol 40 (1) ◽  
pp. 551-563
Author(s):  
Liqiong Lu ◽  
Dong Wu ◽  
Ziwei Tang ◽  
Yaohua Yi ◽  
Faliang Huang

This paper focuses on script identification in natural scene images. Traditional CNNs (Convolution Neural Networks) cannot solve this problem perfectly for two reasons: one is the arbitrary aspect ratios of scene images which bring much difficulty to traditional CNNs with a fixed size image as the input. And the other is that some scripts with minor differences are easily confused because they share a subset of characters with the same shapes. We propose a novel approach combing Score CNN, Attention CNN and patches. Attention CNN is utilized to determine whether a patch is a discriminative patch and calculate the contribution weight of the discriminative patch to script identification of the whole image. Score CNN uses a discriminative patch as input and predict the score of each script type. Firstly patches with the same size are extracted from the scene images. Secondly these patches are used as inputs to Score CNN and Attention CNN to train two patch-level classifiers. Finally, the results of multiple discriminative patches extracted from the same image via the above two classifiers are fused to obtain the script type of this image. Using patches with the same size as inputs to CNN can avoid the problems caused by arbitrary aspect ratios of scene images. The trained classifiers can mine discriminative patches to accurately identify some confusing scripts. The experimental results show the good performance of our approach on four public datasets.


2018 ◽  
Vol 10 (12) ◽  
pp. 115
Author(s):  
Wanli Yang ◽  
Yimin Chen ◽  
Chen Huang ◽  
Mingke Gao

In recent years, the application of deep neural networks to human behavior recognition has become a hot topic. Although remarkable achievements have been made in the field of image recognition, there are still many problems to be solved in the area of video. It is well known that convolutional neural networks require a fixed size image input, which not only limits the network structure but also affects the recognition accuracy. Although this problem has been solved in the field of images, it has not yet been broken through in the field of video. To address the input problem of fixed size video frames in video recognition, we propose a three-dimensional (3D) densely connected convolutional network based on spatial pyramid pooling (3D-DenseNet-SPP). As the name implies, the network structure is mainly composed of three parts: 3DCNN, DenseNet, and SPPNet. Our models were evaluated on a KTH dataset and UCF101 dataset separately. The experimental results showed that our model has better performance in the field of video-based behavior recognition in comparison to the existing models.


2018 ◽  
Vol 6 ◽  
pp. 687-702 ◽  
Author(s):  
Wenpeng Yin ◽  
Hinrich Schütze

In NLP, convolutional neural networks (CNNs) have benefited less than recurrent neural networks (RNNs) from attention mechanisms. We hypothesize that this is because the attention in CNNs has been mainly implemented as attentive pooling (i.e., it is applied to pooling) rather than as attentive convolution (i.e., it is integrated into convolution). Convolution is the differentiator of CNNs in that it can powerfully model the higher-level representation of a word by taking into account its local fixed-size context in the input text t x. In this work, we propose an attentive convolution network, ATTCONV. It extends the context scope of the convolution operation, deriving higher-level features for a word not only from local context, but also from information extracted from nonlocal context by the attention mechanism commonly used in RNNs. This nonlocal context can come (i) from parts of the input text t x that are distant or (ii) from extra (i.e., external) contexts t y. Experiments on sentence modeling with zero-context (sentiment analysis), single-context (textual entailment) and multiple-context (claim verification) demonstrate the effectiveness of ATTCONV in sentence representation learning with the incorporation of context. In particular, attentive convolution outperforms attentive pooling and is a strong competitor to popular attentive RNNs. 1


1999 ◽  
Vol 22 (8) ◽  
pp. 723-728 ◽  
Author(s):  
Artymiak ◽  
Bukowski ◽  
Feliks ◽  
Narberhaus ◽  
Zenner

1995 ◽  
Vol 40 (11) ◽  
pp. 1110-1110
Author(s):  
Stephen James Thomas

Sign in / Sign up

Export Citation Format

Share Document