Improved Ring Radius Transform-Based Reconstruction for Video Character Recognition

Author(s):  
Zhiheng Huang ◽  
Palaiahnakote Shivakumara ◽  
Tong Lu ◽  
Umapada Pal ◽  
Michael Blumenstein ◽  
...  

Character shape reconstruction in video is challenging due to low contrast, complex backgrounds and the arbitrary orientation of characters. This work proposes an Improved Ring Radius Transform (IRRT) for reconstructing impaired characters through medial axis prediction. First, a novel Tangent Vector (TV) concept identifies each true pair of end pixels bounding a gap in an impaired character component. Next, a new normal vector concept determines the direction in which IRRT predicts medial axis pixels for each pair of end pixels. The prediction repeats iteratively until all medial axis pixels for every gap are found, and the medial axis pixels, together with their radii, are then used to reconstruct the shapes of impaired characters. The proposed technique is tested on benchmark datasets comprising video, natural scene, object and multilingual data to demonstrate that it reconstructs shapes well even for heterogeneous data. Comparative studies with different binarization and character recognition methods show that the proposed technique is effective, useful and outperforms existing methods.
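As a rough illustration of the radius-based reconstruction idea (a minimal sketch, not the authors' IRRT: the tangent/normal-vector prediction is replaced here by straight-line interpolation between a known end-pixel pair, and all function names are illustrative), the two core steps might look as follows in NumPy/SciPy:

import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def ring_radius_transform(stroke):
    # stroke: boolean image, True on character pixels.
    # Radius value at each pixel = distance to the nearest stroke-contour pixel.
    contour = stroke & ~binary_erosion(stroke)
    return distance_transform_edt(~contour)

def bridge_gap(stroke, p, q, radius):
    # Paint discs of the given radius along the line joining end pixels
    # p and q (row, col), filling the gap around the interpolated medial axis.
    n = int(max(abs(q[0] - p[0]), abs(q[1] - p[1]))) + 1
    rows = np.linspace(p[0], q[0], n).round().astype(int)
    cols = np.linspace(p[1], q[1], n).round().astype(int)
    yy, xx = np.mgrid[:stroke.shape[0], :stroke.shape[1]]
    out = stroke.copy()
    for r, c in zip(rows, cols):
        out |= (yy - r) ** 2 + (xx - c) ** 2 <= radius ** 2
    return out

In the paper's pipeline the medial axis path and per-pixel radii come from the iterative IRRT prediction rather than from linear interpolation and a single fixed radius.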

2021 ◽  
Vol 226 (4) ◽  
pp. 989-1006
Author(s):  
Ilenia Salsano ◽  
Valerio Santangelo ◽  
Emiliano Macaluso

Previous studies demonstrated that long-term memory of object position in natural scenes guides visuo-spatial attention during subsequent search. Memory-guided attention has been associated with the activation of memory regions (the medial-temporal cortex) and of the fronto-parietal attention network. Notably, these circuits represent external locations with different frames of reference: egocentric (i.e., eyes/head-centered) in the dorsal attention network vs. allocentric (i.e., world/scene-centered) in the medial temporal cortex. Here we used behavioral measures and fMRI to assess the contribution of egocentric and allocentric spatial information during memory-guided attention. At encoding, participants were presented with real-world scenes and asked to search for and memorize the location of a high-contrast target superimposed on half of the scenes. At retrieval, participants viewed the same scenes again, now all including a low-contrast target. In scenes that had included the target at encoding, the target appeared at the same scene location. Critically, scenes were now shown either from the same or a different viewpoint compared with encoding. This yielded a memory-by-view design (target seen/unseen × same/different view), which allowed us to tease apart the roles of allocentric vs. egocentric signals during memory-guided attention. Retrieval-related results showed greater search accuracy for seen than unseen targets, in both same and different views, indicating that memory contributes to visual search notwithstanding perspective changes. This view-change-independent effect was associated with activation of the left lateral intraparietal sulcus. Our results demonstrate that this parietal region mediates memory-guided attention by taking into account allocentric/scene-centered information about object positions in the external world.


2015 ◽  
Vol 4 (3) ◽  
pp. 1-29 ◽  
Author(s):  
P. Sudir ◽  
M. Ravishankar

In present-day video, text greatly helps video indexing and retrieval systems, as it often carries significant semantic information. Video text analysis is challenging due to varying backgrounds, multiple orientations and low contrast between text and non-text regions. The proposed approach explores a new framework for curved video text detection and recognition. Starting from the observation that curved text regions are well characterized by edge size and uniform texture, probable curved text edges are detected by processing wavelet sub-bands, and text is then localized using the fast LU-transform texture descriptor. Binarization is achieved by the maximal H-transform. A connected-component filtering method, followed by B-spline curve fitting on the centroid of each character, vertically aligns each oriented character (a sketch of this alignment step appears below), and the aligned text string is recognized by optical character recognition (OCR). Experiments on various curved video frames show that the proposed method is efficacious and robust in detecting and recognizing curved video text.
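Assuming character centroids have already been extracted (everything upstream of alignment, i.e. wavelet edge detection, LU-transform localization and H-transform binarization, is omitted, and the function name is illustrative), the B-spline baseline fit could be sketched with SciPy as:

import numpy as np
from scipy.interpolate import splev, splprep

def fit_text_baseline(centroids, smoothing=0.0):
    # centroids: list of (x, y) character centroids; returns a callable that
    # samples the fitted B-spline baseline at parameter values t in [0, 1].
    pts = np.asarray(sorted(centroids))     # order characters left to right
    k = min(3, len(pts) - 1)                # spline degree (needs m > k points)
    tck, _ = splprep([pts[:, 0], pts[:, 1]], s=smoothing, k=k)
    return lambda t: np.stack(splev(t, tck), axis=-1)

Rotating each character patch by the local tangent angle of the fitted curve straightens the string before it is passed to OCR.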


2014 ◽  
Vol 40 (4) ◽  
pp. 751-756 ◽  
Author(s):  
Cun-Zhao SHI ◽  
Chun-Heng WANG ◽  
Bai-Hua XIAO ◽  
Yang ZHANG ◽  
Song GAO

Author(s):  
XINGE YOU ◽  
QIUHUI CHEN ◽  
BIN FANG ◽  
YUAN YAN TANG

An essential step in character recognition is to extract the skeleton of the character. In this paper, an efficient algorithm is proposed to extract visually satisfactory skeletons from printed and handwritten characters, overcoming fundamental shortcomings of our previous skeletonization technique based on the maximum modulus symmetry of the wavelet transform (WT). The proposed method is motivated by desirable properties of the WT with specially constructed wavelet functions: the local modulus minima of the WT are scale-independent across level scales and are located at the medial axis of the symmetric contours of a character stroke. The modulus minima of the WT are therefore computed as the intrinsic skeletons of character strokes. For faster implementation, a multiscale processing technique is employed: major structures of the skeleton are extracted at the coarse scale, while fine structures are extracted at the fine scale. We have tested the algorithm on handwritten and printed character images. Experimental results show that the proposed algorithm is applicable not only to binary images but also to gray-level images, where other skeletonization techniques such as thinning and distance transforms can be impractical. Further, it effectively removes unwanted artifacts and branches from the extracted skeletons at the intersections and junctions of character strokes, and it is robust against noise where most existing methods perform poorly. A loose illustrative sketch follows.
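The sketch below is only a stand-in for the paper's method: it uses an off-the-shelf single-level Haar DWT via PyWavelets instead of the authors' constructed wavelet functions and multiscale scheme, and simply keeps local modulus minima that fall on the stroke; the function name is illustrative.

import numpy as np
import pywt
from scipy.ndimage import minimum_filter

def wavelet_skeleton(gray, stroke_mask, wavelet="haar"):
    # Single-level 2-D DWT; modulus from the horizontal/vertical detail bands.
    _, (lh, hl, hh) = pywt.dwt2(gray.astype(float), wavelet)
    modulus = np.hypot(lh, hl)
    modulus = np.kron(modulus, np.ones((2, 2)))   # upsample to input size
    modulus = modulus[: gray.shape[0], : gray.shape[1]]
    minima = modulus == minimum_filter(modulus, size=3)
    return minima & stroke_mask                   # keep minima on the strokes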


2001 ◽  
Vol 16 (21) ◽  
pp. 3535-3551 ◽  
Author(s):  
GUO-HONG YANG ◽  
SHI-XIANG FENG ◽  
GUANG-JIONG NI ◽  
YI-SHI DUAN

In Riemannian geometry, the relations between two transversal submanifolds and the global manifold are discussed without recourse to any concrete model. By replacing the normal vector of one submanifold with the tangent vector of the other, the metric tensors, Christoffel symbols and curvature tensors of the three manifolds are connected at the intersection points of the two submanifolds. When the inner product of the two tangent vectors of the submanifolds vanishes, corollaries of these relations yield the second fundamental form and the Gauss–Codazzi equations of conventional submanifold theory (recalled below). As a special case, a Euclidean global manifold is considered. It is pointed out that, to obtain a nonzero energy–momentum tensor for the matter field in a submanifold, there must be contributions from the above inner product and from the other submanifold. Generally speaking, a submanifold is closely related to the matter fields of the other submanifold, and the two submanifolds affect each other through the above inner product.
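For reference, in one common sign convention the Gauss and Codazzi equations that these corollaries recover read as follows, with $\bar{R}$ the ambient curvature, $R$ the submanifold curvature, $h$ the second fundamental form and $\perp$ the normal component:

    \bar{R}(X,Y,Z,W) = R(X,Y,Z,W) + \langle h(X,W),\, h(Y,Z)\rangle - \langle h(X,Z),\, h(Y,W)\rangle

    \big(\bar{R}(X,Y)Z\big)^{\perp} = (\nabla_X h)(Y,Z) - (\nabla_Y h)(X,Z)

Signs depend on the curvature convention; the paper's formulation generalizes these to intersecting transversal submanifolds with a nonvanishing tangent-vector inner product.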


2018 ◽  
Vol 232 ◽  
pp. 02007
Author(s):  
Qi Zhang

Most existing approaches for detecting salient areas in natural scenes are based on saliency contrast within the local context of the image. More recently, a few approaches consider not only the difference between foreground objects and the surrounding background areas, but also treat salient objects as candidates for the center of attention from a human perspective. This article surveys saliency detection with visual attention, covering visual cues of foreground salient areas, visual attention based on saliency maps, and deep-learning-based saliency detection; a minimal local-contrast sketch follows below. The published works are explained and described in detail, and some related key benchmark datasets are briefly presented. All works covered were published between 2013 and 2018, giving an overview of the progress of the field of saliency detection over that period.
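As a toy example of the local-contrast family the survey describes (illustrative only, not any specific published method), salience can be scored as the squared difference between a small center mean and a wider surround mean:

import numpy as np
from scipy.ndimage import uniform_filter

def local_contrast_saliency(gray, center=5, surround=21):
    # Salience = squared difference between a small local mean (center)
    # and a wider neighborhood mean (surround), normalized to [0, 1].
    g = gray.astype(float)
    sal = (uniform_filter(g, size=center) - uniform_filter(g, size=surround)) ** 2
    return sal / (sal.max() + 1e-12)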

