Multilingual Scene Text Detection Using Gradient Morphology

Author(s):  
Dibyajyoti Dhar ◽  
Neelotpal Chakraborty ◽  
Sayan Choudhury ◽  
Ashis Paul ◽  
Ayatullah Faruk Mollah ◽  
...  

Text detection in natural scene images is an interesting problem in the field of information retrieval. Several methods have been proposed over the past few decades for scene text detection. However, the robustness and efficiency of these methods are degraded by their high sensitivity to various image complexities. Moreover, in a multilingual environment where text may occur in multiple languages, a method may not be suitable for detecting scene text in certain languages. To counter these challenges, a gradient morphology-based method is proposed in this paper that proves robust against image complexities and efficiently detects scene text irrespective of language. The method is validated using low-quality images from standard multilingual datasets such as MSRA-TD500 and MLe2e. The performance of the method is compared with that of some state-of-the-art methods, and comparatively better results are observed.
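For illustration, a minimal OpenCV sketch of a generic gradient-morphology pipeline for extracting candidate text regions is given below; the kernel sizes, thresholds, and geometric filters are assumptions for demonstration, not the authors' exact configuration.

```python
# Illustrative sketch only: a generic gradient-morphology pipeline for
# candidate text region extraction with OpenCV. It is NOT the paper's
# exact method; kernel sizes and thresholds are assumptions.
import cv2

def detect_text_candidates(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Morphological gradient highlights strong local intensity transitions,
    # which text strokes typically produce regardless of script or language.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    gradient = cv2.morphologyEx(gray, cv2.MORPH_GRADIENT, kernel)

    # Binarize the gradient map (Otsu) and close gaps between characters
    # so that nearby strokes merge into word/line-level components.
    _, binary = cv2.threshold(gradient, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    close_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 1))
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, close_kernel)

    # Keep connected components whose geometry plausibly matches text.
    boxes = []
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w > 8 and h > 8 and 0.1 < h / float(w) < 10.0:
            boxes.append((x, y, w, h))
    return boxes
```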

Author(s):  
Enze Xie ◽  
Yuhang Zang ◽  
Shuai Shao ◽  
Gang Yu ◽  
Cong Yao ◽  
...  

Scene text detection methods based on deep learning have achieved remarkable results over the past years. However, due to the high diversity and complexity of natural scenes, previous state-of-the-art text detection methods may still produce a considerable number of false positives when applied to images captured in real-world environments. To tackle this issue, mainly inspired by Mask R-CNN, we propose in this paper an effective model for scene text detection based on Feature Pyramid Network (FPN) and instance segmentation. We propose a supervised pyramid context network (SPCNet) to precisely locate text regions while suppressing false positives. Benefiting from the guidance of semantic information and a shared FPN, SPCNet obtains significantly enhanced performance while introducing marginal extra computation. Experiments on standard datasets demonstrate that our SPCNet clearly outperforms state-of-the-art methods. Specifically, it achieves an F-measure of 92.1% on ICDAR2013, 87.2% on ICDAR2015, 74.1% on ICDAR2017 MLT and 82.9% on Total-Text.
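As a rough illustration of the Mask R-CNN / FPN base that SPCNet builds on, the following torchvision sketch configures an instance-segmentation model for a single text class; the paper's supervised pyramid context components (e.g. its text context guidance and re-scoring) are not reproduced here.

```python
# Illustrative sketch: a Mask R-CNN detector with an FPN backbone adapted to
# one "text" class using torchvision. This only shows the generic base that
# SPCNet-style methods build on, not the paper's full model.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

def build_text_maskrcnn(num_classes=2):  # background + text
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

    # Replace the box classification head for the two-class problem.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # Replace the mask head so it predicts per-instance text masks.
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, num_classes)
    return model

model = build_text_maskrcnn()
model.eval()
with torch.no_grad():
    # Returns a list of dicts with "boxes", "masks", "labels", "scores".
    preds = model([torch.rand(3, 800, 800)])
```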


2017 ◽  
Vol 260 ◽  
pp. 112-122 ◽  
Author(s):  
Chunna Tian ◽  
Yong Xia ◽  
Xiangnan Zhang ◽  
Xinbo Gao

2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Weijia Wu ◽  
Jici Xing ◽  
Cheng Yang ◽  
Yuxing Wang ◽  
Hong Zhou

The performance of text detection is crucial for the subsequent recognition task. Currently, the accuracy of text detectors still needs further improvement, particularly for texts with irregular shapes in complex environments. We propose a pixel-wise method based on instance segmentation for scene text detection. Specifically, a text instance is split into five components: a Text Skeleton and four Directional Pixel Regions; the instance is then restored from these elements, and a component that fails can receive supplementary information from the other regions. In addition, a Confidence Scoring Mechanism is designed to filter out regions that merely resemble text instances. Experiments on several challenging benchmarks demonstrate that our method achieves state-of-the-art results in scene text detection, with an F-measure of 84.6% on Total-Text and 86.3% on CTW1500.
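A simplified sketch of a skeleton-plus-directional-regions decomposition is shown below using scikit-image and SciPy; grouping pixels by the direction of their nearest skeleton point is an assumption made here for illustration and only approximates the paper's Directional Pixel Regions.

```python
# Illustrative sketch: decompose a binary text-instance mask into a skeleton
# plus four coarse directional regions (pixels grouped by the direction of
# their nearest skeleton point). Only an approximation of the paper's idea.
import numpy as np
from skimage.morphology import skeletonize
from scipy.ndimage import distance_transform_edt

def decompose_text_instance(mask):
    """mask: 2-D boolean array, True on text pixels."""
    skeleton = skeletonize(mask)

    # For every pixel, find the offset to its nearest skeleton pixel.
    # distance_transform_edt on the complement of the skeleton also returns,
    # per pixel, the indices of that nearest skeleton pixel.
    _, (near_r, near_c) = distance_transform_edt(~skeleton, return_indices=True)
    rows, cols = np.indices(mask.shape)
    dr, dc = rows - near_r, cols - near_c

    regions = np.zeros(mask.shape, dtype=np.uint8)  # 0 = background/skeleton
    body = mask & ~skeleton
    vertical = np.abs(dr) >= np.abs(dc)
    regions[body & vertical & (dr < 0)] = 1    # above the skeleton
    regions[body & vertical & (dr >= 0)] = 2   # below the skeleton
    regions[body & ~vertical & (dc < 0)] = 3   # left of the skeleton
    regions[body & ~vertical & (dc >= 0)] = 4  # right of the skeleton
    return skeleton, regions
```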


Author(s):  
Tong Li ◽  
Wanggen Li ◽  
Nannan Zhu ◽  
Xuecheng Gong ◽  
Jiajia Chen

Author(s):  
Rajae Moumen ◽  
Raddouane Chiheb ◽  
Rdouan Faizi

The aim of this research is to propose a fully convolutional approach to the problem of real-time scene text detection for the Arabic language. Text detection is performed using a two-step multi-scale approach. The first step uses a lightweight fully convolutional network, the TextBlockDetector FCN, an adaptation of VGG-16 that eliminates non-textual elements, localizes wide-scale text, and estimates text scale. The second step determines the narrow scale range of the text using a fully convolutional network for maximum performance. To evaluate the system, we compare the results of the framework with those obtained by a single VGG-16 fully deployed for text detection in one shot, as well as with previous state-of-the-art results. For training and testing, we build a dataset of 575 manually processed images and apply data augmentation to enrich the training process. The system scores a precision of 0.651 vs. 0.64 for the state of the art, and 24.3 FPS vs. 31.7 FPS for a fully deployed VGG-16.
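A minimal PyTorch sketch of a VGG-16-based fully convolutional text-block scorer in the spirit of the first stage is given below; the head layers, channel widths, and upsampling choice are assumptions, not the TextBlockDetector FCN architecture itself.

```python
# Illustrative sketch: a small VGG-16-based fully convolutional head that
# outputs a per-pixel text/non-text score map. Layer choices here are
# assumptions, not the paper's exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

class TextBlockFCN(nn.Module):
    def __init__(self):
        super().__init__()
        # Keep only the convolutional part of VGG-16 (output stride 32).
        self.backbone = vgg16(weights="DEFAULT").features
        # 1x1 convolutions replace the fully connected classifier.
        self.head = nn.Sequential(
            nn.Conv2d(512, 256, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 1, kernel_size=1),
        )

    def forward(self, x):
        h, w = x.shape[-2:]
        score = self.head(self.backbone(x))
        # Upsample the coarse score map back to input resolution.
        score = F.interpolate(score, size=(h, w), mode="bilinear", align_corners=False)
        return torch.sigmoid(score)  # per-pixel probability of "text block"

model = TextBlockFCN().eval()
with torch.no_grad():
    prob_map = model(torch.rand(1, 3, 512, 512))  # shape (1, 1, 512, 512)
```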


2021 ◽  
Author(s):  
Khalil Boukthir ◽  
Abdulrahman M. Qahtani ◽  
Omar Almutiry ◽  
Habib Dhahri ◽  
Adel Alimi

- A novel approach to reduced annotation based on Deep Active Learning is presented for Arabic text detection in natural scene images.
- A new Arabic text image dataset (7k images), named TSVD, collected using the Google Street View service.
- A new semi-automatic method for generating natural scene text images from the streets.
- Training samples are reduced to 1/5 of the original training size on average.
- Much less training data is needed to achieve a better Dice index: 0.84.
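For illustration, a generic uncertainty-based deep active learning loop of the kind these highlights describe is sketched below; the model interface (predict_proba, fit), the entropy criterion, and the per-round budget are assumptions, not the paper's exact procedure.

```python
# Illustrative sketch: a generic uncertainty-based deep active learning loop
# for reducing annotation effort. The selection criterion and budget are
# assumptions, not the paper's method.
import numpy as np

def active_learning_loop(model, unlabeled_images, label_fn, rounds=5, budget=100):
    """Iteratively pick the most uncertain images for manual annotation.

    model            : hypothetical wrapper exposing predict_proba(image) -> per-pixel
                       text probability map and fit(images, masks) for (re)training
    unlabeled_images : list of images not yet annotated
    label_fn         : callable returning a ground-truth mask (the human annotator)
    """
    labeled_images, labeled_masks = [], []
    pool = list(unlabeled_images)

    for _ in range(rounds):
        # Score each pooled image by mean prediction entropy (uncertainty).
        scores = []
        for img in pool:
            p = np.clip(model.predict_proba(img), 1e-6, 1 - 1e-6)
            entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p))
            scores.append(entropy.mean())

        # Send only the `budget` most uncertain images to the annotator.
        ranked = np.argsort(scores)[::-1][:budget]
        labeled_images += [pool[i] for i in ranked]
        labeled_masks += [label_fn(pool[i]) for i in ranked]
        pool = [img for i, img in enumerate(pool) if i not in set(ranked)]

        # Retrain on the growing labeled set.
        model.fit(labeled_images, labeled_masks)
    return model, labeled_images
```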


