Text Detection Using Multi-Stage Region Proposal Network Sensitive to Text Scale

Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1232
Author(s):  
Yoshito Nagaoka ◽  
Tomo Miyazaki ◽  
Yoshihiro Sugaya ◽  
Shinichiro Omachi

Recently, attention has surged concerning intelligent sensors using text detection. However, detecting small text remains challenging. To solve this problem, we propose a novel text detection CNN (convolutional neural network) architecture sensitive to text scale. We extract multi-resolution feature maps from multi-stage convolution layers, which are employed to prevent information loss and maintain the feature size. In addition, we developed the CNN considering the receptive field size to generate proposal stages. The experimental results show the importance of the receptive field size.
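As a rough illustration of why receptive field size matters when choosing where to attach proposal stages, the sketch below computes the theoretical receptive field after each layer of a small convolutional stack. The layer list is a hypothetical example, not the architecture from the paper, and `receptive_fields` is an illustrative helper name.

```python
from typing import List, Tuple


def receptive_fields(layers: List[Tuple[int, int]]) -> List[int]:
    """Return the theoretical receptive field size after each layer.

    Each layer is given as (kernel_size, stride), ordered from input to output.
    Padding does not change the receptive field size, so it is ignored here.
    """
    sizes = []
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between adjacent outputs of the current layer
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
        sizes.append(rf)
    return sizes


# Hypothetical stack (an assumption for illustration only):
# 3x3 conv, 3x3 conv, 2x2 pooling with stride 2, 3x3 conv
stack = [(3, 1), (3, 1), (2, 2), (3, 1)]
print(receptive_fields(stack))  # [3, 5, 6, 10]
```

Under this reading, a stage whose receptive field is far larger than a small text instance sees mostly background, which is one plausible reason to match proposal stages to text scale.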

2019 ◽  
Vol 53 (1) ◽  
pp. 2-19 ◽  
Author(s):  
Erion Çano ◽  
Maurizio Morisio

Purpose: The fabulous results of convolution neural networks in image-related tasks have attracted the attention of text mining, sentiment analysis and other text analysis researchers. It is, however, difficult to find enough data for feeding such networks, optimize their parameters, and make the right design choices when constructing network architectures. The purpose of this paper is to present the creation steps of two big data sets of song emotions. The authors also explore the usage of convolution and max-pooling neural layers on song lyrics, product and movie review text data sets. Three variants of a simple and flexible neural network architecture are also compared.
Design/methodology/approach: The intention was to spot any important patterns that can serve as guidelines for parameter optimization of similar models. The authors also wanted to identify architecture design choices that lead to high-performing sentiment analysis models. To this end, the authors conducted a series of experiments with neural architectures of various configurations.
Findings: The results indicate that parallel convolutions of filter lengths up to 3 are usually enough for capturing relevant text features. Also, the max-pooling region size should be adapted to the length of the text documents to produce the best feature maps.
Originality/value: The top results the authors obtained came from feature maps of lengths 6–18. A possible improvement for future neural network models for sentiment analysis is to predict the sentiment polarity of a document by aggregating predictions on smaller excerpts of the entire text.
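One way to read the finding about adapting the pooling region to document length is sketched below: given a document length and a filter length, pick a non-overlapping max-pooling region so that the resulting feature map has at most a chosen target length. The helper names, the "valid" stride-1 convolution, and the target length of 12 (picked from within the reported 6–18 range) are assumptions for illustration, not the authors' procedure.

```python
import math


def conv_output_length(doc_len: int, filter_len: int) -> int:
    """Length of a 'valid' 1D convolution output with stride 1."""
    return doc_len - filter_len + 1


def pool_region_for_target(doc_len: int, filter_len: int, target_len: int) -> int:
    """Smallest non-overlapping pooling region giving at most target_len outputs."""
    conv_len = conv_output_length(doc_len, filter_len)
    return max(1, math.ceil(conv_len / target_len))


def pooled_length(doc_len: int, filter_len: int, region: int) -> int:
    """Feature-map length after pooling; a trailing partial region is kept."""
    conv_len = conv_output_length(doc_len, filter_len)
    return math.ceil(conv_len / region)


# Parallel convolutions with filter lengths 1 to 3, as in the findings above.
# Document lengths (in tokens) are hypothetical.
for doc_len in (50, 200):
    for filter_len in (1, 2, 3):
        region = pool_region_for_target(doc_len, filter_len, target_len=12)
        print(doc_len, filter_len, region, pooled_length(doc_len, filter_len, region))
```

Longer documents get a proportionally larger pooling region, so short reviews and long lyrics end up with feature maps of comparable length.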


2005 ◽  
Vol 93 (6) ◽  
pp. 3537-3547 ◽  
Author(s):  
Chong Weng ◽  
Chun-I Yeh ◽  
Carl R. Stoelzel ◽  
Jose-Manuel Alonso

Each point in visual space is encoded at the level of the thalamus by a group of neighboring cells with overlapping receptive fields. Here we show that the receptive fields of these cells differ in size and response latency but not at random. We have found that in the cat lateral geniculate nucleus (LGN) the receptive field size and response latency of neighboring neurons are significantly correlated: the larger the receptive field, the faster the response to visual stimuli. This correlation is widespread in LGN. It is found in groups of cells belonging to the same type (e.g., Y cells), and of different types (i.e., X and Y), within a specific layer or across different layers. These results indicate that the inputs from the multiple geniculate afferents that converge onto a cortical cell (approximately 30) are likely to arrive in a sequence determined by the receptive field size of the geniculate afferents. Recent studies have shown that the peak of the spatial frequency tuning of a cortical cell shifts toward higher frequencies as the response progresses in time. Our results are consistent with the idea that these shifts in spatial frequency tuning arise from differences in the response time course of the thalamic inputs.


1987 ◽  
Vol 510 (1 Olfaction and) ◽  
pp. 504-505
Author(s):  
CHARLOTTE M. MISTRETTA ◽  
TAKATOSHI NAGAI ◽  
ROBERT M. BRADLEY

2008 ◽  
Vol 25 (4) ◽  
pp. 419-427 ◽  
Author(s):  
Kazunori Yamamoto ◽  
Hiroshi Jouhou ◽  
Masanori Iwasaki ◽  
Akimichi Kaneko ◽  
Masahiro Yamada

2006 ◽  
Vol 46 (4) ◽  
pp. 467-474 ◽  
Author(s):  
Herbert A. Reitsamer ◽  
Renate Pflug ◽  
Melchior Franz ◽  
Sonja Huber

1986 ◽  
Vol 55 (6) ◽  
pp. 1136-1152 ◽  
Author(s):  
C. L. Baker ◽  
M. S. Cynader

Responses of direction-selective neurons in cat striate cortex (area 17) were studied with flashed-bar stimuli. Spatial parameters of interactions within the receptive field giving rise to direction selectivity and of receptive-field subunits were quantitatively determined for the same cells and correlated. A bar stimulus flashed sequentially at two nearby locations in the receptive field produced direction-selective behavior comparable with that elicited by continuously moving stimuli. Each cell exhibited a characteristic optimal spatial displacement, Dopt, for which responses in the presumed preferred and null directions were maximally distinct. In all cases, Dopt was much smaller than the receptive-field size. The spatial structure of receptive fields in simple cells was studied using single narrow-bar stimuli flashed at different locations in the receptive field. The resulting line-weighting function exhibited alternating regions of ON and OFF responses having a characteristic spatial period or wavelength, lambda. Spatial subunit structure in complex cells was determined by flashing two bars simultaneously in the receptive field. The response as a function of bar separation was again a wavelike function having a spatial wavelength, lambda. Values of the optimal displacement for direction selectivity, Dopt, showed a clear relationship with the spatial wavelength, lambda, for a given unit. Dopt was also correlated to a somewhat lesser degree with receptive-field size. Generally, the ratio of Dopt to lambda was approximately 1/10 to 1/4, in agreement with theoretical predictions by Marr and Poggio. Taken together with the findings of Movshon et al., these results indicate a systematic relationship between Dopt and the spatial frequency of a sinusoidal grating, which is optimal for that cell. Such a relationship is consistent with the results of human psychophysical experiments on apparent motion.

