Detecting Text in Natural Scenes Based on a Reduction of Photometric Effects: Problem of Text Detection

Author(s):  
Alain Trémeau ◽  
Basura Fernando ◽  
Sezer Karaoglu ◽  
Damien Muselet
Author(s):  
Enze Xie ◽  
Yuhang Zang ◽  
Shuai Shao ◽  
Gang Yu ◽  
Cong Yao ◽  
...  

Scene text detection methods based on deep learning have achieved remarkable results over the past years. However, due to the high diversity and complexity of natural scenes, previous state-of-the-art text detection methods may still produce a considerable number of false positives when applied to images captured in real-world environments. To tackle this issue, mainly inspired by Mask R-CNN, we propose in this paper an effective model for scene text detection based on Feature Pyramid Network (FPN) and instance segmentation. We propose a supervised pyramid context network (SPCNET) to precisely locate text regions while suppressing false positives. Benefiting from the guidance of semantic information and a shared FPN, SPCNET obtains significantly enhanced performance while introducing marginal extra computation. Experiments on standard datasets demonstrate that our SPCNET clearly outperforms state-of-the-art methods. Specifically, it achieves an F-measure of 92.1% on ICDAR2013, 87.2% on ICDAR2015, 74.1% on ICDAR2017 MLT and 82.9% on
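The false-positive suppression the abstract describes can be illustrated by fusing each instance's classification score with the semantic-segmentation confidence under its mask. This is a minimal NumPy sketch of that idea, not SPCNET's exact re-scoring formula; the function name and signature are assumptions for illustration.

```python
import numpy as np

def rescore_instances(cls_scores, masks, semantic_map):
    """Hypothetical mask-guided re-scoring sketch: multiply each
    instance's classification score by the mean text probability of
    the semantic map inside its predicted mask, so instances sitting
    on low text-probability regions (false positives) are suppressed."""
    fused = []
    for score, mask in zip(cls_scores, masks):
        area = mask.sum()
        if area == 0:
            fused.append(0.0)
            continue
        # mean semantic text probability inside this instance's mask
        sem = (semantic_map * mask).sum() / area
        fused.append(float(score * sem))
    return fused
```

In this sketch, an instance whose mask covers a region the semantic branch scores near zero keeps almost none of its original confidence, which is the intuition behind using semantic guidance to prune false detections.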


2021 ◽  
pp. 198-212
Author(s):  
Aline Geovanna Soares ◽  
Byron Leite Dantas Bezerra ◽  
Estanislau Baptista Lima

Author(s):  
Kuntpong Woraratpanya ◽  
Pimlak Boonchukusol ◽  
Yoshimitsu Kuroki ◽  
Yasushi Kato

2015 ◽  
Vol 2015 ◽  
pp. 1-7 ◽  
Author(s):  
Lin Li ◽  
Shengsheng Yu ◽  
Luo Zhong ◽  
Xiaozhen Li

Multilingual text detection in natural scenes is still a challenging task in computer vision. In this paper, we apply an unsupervised learning algorithm to learn language-independent stroke features, and combine unsupervised stroke feature learning with automatic multilayer feature extraction to improve the representational power of text features. We also develop a novel nonlinear network, based on the traditional Convolutional Neural Network, that is able to detect multilingual text regions in images. The proposed method is evaluated on standard benchmarks and a multilingual dataset, and demonstrates improvement over previous work.


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 888
Author(s):  
Xiqi Wang ◽  
Shunyi Zheng ◽  
Ce Zhang ◽  
Rui Li ◽  
Li Gui

Accurate and efficient text detection in natural scenes is a fundamental yet challenging task in computer vision, especially when dealing with arbitrarily-oriented texts. Most contemporary text detection methods are designed to identify horizontal or approximately horizontal text, which cannot satisfy practical detection requirements for various real-world images such as image streams or videos. To address this lacuna, we propose a novel method called Rotational You Only Look Once (R-YOLO), a robust real-time convolutional neural network (CNN) model to detect arbitrarily-oriented texts in natural image scenes. First, a rotated anchor box with angle information is used as the text bounding box over various orientations. Second, features of various scales are extracted from the input image to determine the probability, confidence, and inclined bounding boxes of the text. Finally, Rotational Distance Intersection over Union Non-Maximum Suppression is used to eliminate redundancy and acquire the most accurate detection results. Benchmark experiments are conducted on four popular datasets, i.e., ICDAR2015, ICDAR2013, MSRA-TD500, and ICDAR2017-MLT. The results indicate that the proposed R-YOLO method significantly outperforms state-of-the-art methods in terms of detection efficiency while maintaining high accuracy; for example, it achieves an F-measure of 82.3% at 62.5 fps at 720p resolution on the ICDAR2015 dataset.
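The Distance-IoU NMS step described above can be sketched for the simplified axis-aligned case: the suppression criterion is IoU minus the squared center distance over the squared diagonal of the smallest enclosing box. R-YOLO applies this to rotated boxes, which additionally requires rotated-polygon intersection; this sketch keeps only the axis-aligned core, with assumed function names.

```python
import numpy as np

def diou(box_a, box_b):
    """Distance-IoU for axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # squared distance between box centers
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + \
         ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # squared diagonal of the smallest enclosing box
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + \
         (max(ay2, by2) - min(ay1, by1)) ** 2
    return iou - (d2 / c2 if c2 > 0 else 0.0)

def diou_nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep boxes in descending score order, dropping any
    box whose DIoU with an already-kept box exceeds thresh."""
    order = np.argsort(scores)[::-1]
    keep = []
    for i in order:
        if all(diou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep
```

The distance penalty means two boxes with equal IoU are suppressed more aggressively when their centers nearly coincide, which helps separate adjacent text lines.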


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2657
Author(s):  
Shuangshuang Li ◽  
Wenming Cao

Recently, various object detection frameworks have been applied to text detection tasks and have achieved good performance in the final detection. As the application scenarios of text detection expand, the research value of the topic has gradually increased. Text detection in natural scenes is more challenging than horizontal text detection based on quadrilateral boxes, particularly for curved text of arbitrary shape. Most networks balance target samples well in text detection, but handling small targets and extremely unbalanced data remains challenging; we continue to use PSENet to deal with such problems in this work. On the other hand, noting that most existing scene text detection methods use ResNet and FPN as the feature-extraction backbone, we improved the ResNet and FPN parts of PSENet to make early-stage feature extraction more effective. We propose SEMPANet, an anchor-free, one-stage framework that yields a lightweight model, reflected in a training time of about 24 h. Finally, we selected the two most representative datasets for oriented text and curved text to conduct experiments. On ICDAR2015, the improved network's results further verify its effectiveness, with a gain of 1.01% in F-measure over PSENet-1s. On CTW1500, the improved network performed better than the original network on average.
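Since the work builds on PSENet, its post-processing centers on progressive scale expansion: labeled text kernels from the smallest kernel map are grown breadth-first into larger kernel maps, so adjacent instances stay separated. This is a minimal sketch of that expansion step with a single larger kernel; names and data layout are illustrative.

```python
from collections import deque

def expand_kernels(min_kernel_labels, larger_kernel):
    """Grow instance labels from the smallest kernel into the next
    larger kernel map, breadth-first. Contested pixels go to whichever
    instance reaches them first.
    min_kernel_labels: 2D list, 0 = background, >0 = instance id.
    larger_kernel: 2D list of 0/1 (the next bigger kernel mask)."""
    h, w = len(min_kernel_labels), len(min_kernel_labels[0])
    out = [row[:] for row in min_kernel_labels]
    q = deque((y, x) for y in range(h) for x in range(w) if out[y][x] > 0)
    while q:
        y, x = q.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            # grow into unlabeled pixels that belong to the larger kernel
            if 0 <= ny < h and 0 <= nx < w and out[ny][nx] == 0 \
                    and larger_kernel[ny][nx] == 1:
                out[ny][nx] = out[y][x]
                q.append((ny, nx))
    return out
```

In the full algorithm this step is repeated once per kernel scale, from the smallest map up to the full text mask.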


Text detection from natural scene images and videos is imperative for real-world domain analysis applications. However, the text detection process is perplexing because of the exigent scenarios that the text exhibits. The information present in a video is either perceptual or semantic. Among the different content that exists in a video, text is a major component that describes the nature of the video. Text in video can be categorized into caption text and scene text: caption text is artificial text that is easy to detect, while scene text is natural text that is difficult to identify. In this paper, text extraction from natural images by an edge-based method is implemented. The algorithms are evaluated with a set of natural-scene images that vary in font size, illumination, scale, and text direction. Precision, accuracy, and recall rates are determined to evaluate performance. The proposed system worked for all difficult scenarios of varied text and gave better results than existing methods.
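Edge-based text localization of the kind described above typically thresholds a gradient-magnitude map and then groups edge-dense regions. The sketch below uses a simple central-difference edge map and a horizontal projection profile in place of the paper's specific pipeline; it is a minimal stand-in under those stated assumptions, not the authors' algorithm.

```python
import numpy as np

def edge_map(gray, thresh=0.2):
    """Simple gradient-magnitude edge map (a stand-in for Canny):
    central differences in x and y, thresholded at a fraction of the
    maximum magnitude."""
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]
    mag = np.hypot(gx, gy)
    return (mag > thresh * (mag.max() + 1e-8)).astype(np.uint8)

def text_row_bands(edges, min_density=0.1):
    """Locate candidate text lines via the horizontal projection
    profile: contiguous rows whose edge density exceeds min_density
    are grouped into (start_row, end_row) bands."""
    density = edges.mean(axis=1)
    rows = density > min_density
    bands, start = [], None
    for i, on in enumerate(rows):
        if on and start is None:
            start = i
        elif not on and start is not None:
            bands.append((start, i - 1))
            start = None
    if start is not None:
        bands.append((start, len(rows) - 1))
    return bands
```

Text regions produce dense, alternating edges, so their rows stand out in the projection profile; a second pass over column profiles within each band would then yield word-level boxes.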

