Efficient Character Skew Rectification in Scene Text Images

Author(s):  
Michal Bušta ◽  
Tomáš Drtina ◽  
David Helekal ◽  
Lukáš Neumann ◽  
Jiří Matas
Keyword(s):  
Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1919
Author(s):  
Shuhua Liu ◽  
Huixin Xu ◽  
Qi Li ◽  
Fei Zhang ◽  
Kun Hou

To address the problem of robot object recognition in complex scenes, this paper proposes an object recognition method based on scene text reading. The proposed method simulates human-like behavior, identifying objects by carefully reading the text on them. First, deep learning models with high accuracy are adopted to detect and recognize text from multiple views. Second, a dataset of 102,000 Chinese and English scene text images, together with its inverse, is generated. Training the model on these two datasets improves the F-measure of text detection by 0.4% and the recognition accuracy by 1.26%. Finally, a robot object recognition method based on scene text reading is proposed. The robot detects and recognizes text in the image and stores the recognition results in a text file. When the user gives the robot a fetching instruction, the robot searches the text files for the corresponding keywords and obtains confidence scores for the objects in the scene image; the object with the maximum confidence is selected as the target. The results show that the robot can accurately distinguish objects of arbitrary shape and category and can effectively solve the problem of object recognition in home environments.
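As an illustration of the fetching step described above, here is a minimal Python sketch of keyword matching over stored recognition results; the names (`ocr_results`, `select_target`) and the similarity measure are assumptions for illustration, not the paper's implementation.

```python
# Sketch of the fetching step: match a user keyword against the text
# recognized on each object and pick the highest-confidence object.
# All names here are illustrative, not from the paper.
from difflib import SequenceMatcher

def keyword_confidence(keyword, recognized):
    # Similarity in [0, 1] between the keyword and one recognized string.
    return SequenceMatcher(None, keyword.lower(), recognized.lower()).ratio()

def select_target(keyword, ocr_results):
    # ocr_results maps object id -> list of text strings read from it.
    best_obj, best_conf = None, 0.0
    for obj_id, texts in ocr_results.items():
        conf = max((keyword_confidence(keyword, t) for t in texts), default=0.0)
        if conf > best_conf:
            best_obj, best_conf = obj_id, conf
    return best_obj, best_conf

# Example: the user asks the robot to fetch the "cola".
results = {"bottle_1": ["Green Tea"], "can_2": ["Cola"], "box_3": ["Crackers"]}
print(select_target("cola", results))  # -> ('can_2', 1.0)
```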


2018 ◽  
Vol 22 (4) ◽  
pp. 1361-1375 ◽  
Author(s):  
Ranjit Ghoshal ◽  
Anandarup Roy ◽  
Ayan Banerjee ◽  
Bibhas Chandra Dhara ◽  
Swapan K. Parui
Keyword(s):  

2020 ◽  
Vol 63 (2) ◽  
Author(s):  
Minghui Liao ◽  
Boyu Song ◽  
Shangbang Long ◽  
Minghang He ◽  
Cong Yao ◽  
...  

2021 ◽  
Author(s):  
Khalil Boukthir ◽  
Abdulrahman M. Qahtani ◽  
Omar Almutiry ◽  
Habib Dhahri ◽  
Adel Alimi

- A novel approach based on Deep Active Learning is presented to reduce annotation effort for Arabic text detection in natural scene images (a minimal sketch of such a loop follows this list).
- A new Arabic scene text image dataset (7k images), named TSVD, collected using the Google Street View service.
- A new semi-automatic method for generating natural scene text images from the streets.
- Training samples are reduced to 1/5 of the original training size on average.
- Much less training data is needed to achieve a better Dice index: 0.84.
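Below is a minimal, illustrative sketch of a deep active learning acquisition loop of the kind the highlights describe; the interfaces (`detector.uncertainty`, `detector.train`, the `oracle` callback) are assumptions for illustration, not the paper's actual API.

```python
# Illustrative deep active learning loop: repeatedly annotate only the
# images the current detector is least certain about. All interfaces
# (detector.uncertainty, detector.train, oracle) are assumed, not from
# the paper.
def active_learning_loop(detector, unlabeled, budget_per_round, rounds, oracle):
    labeled = []
    for _ in range(rounds):
        # Rank unlabeled images by model uncertainty, e.g. mean entropy
        # of the pixel-wise text/non-text predictions.
        ranked = sorted(unlabeled, key=detector.uncertainty, reverse=True)
        picked = ranked[:budget_per_round]
        # The human oracle annotates only the selected images.
        labeled += [(img, oracle(img)) for img in picked]
        unlabeled = ranked[budget_per_round:]
        detector.train(labeled)  # retrain on the enlarged labeled set
    return detector, labeled
```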


Author(s):  
Neelotpal Chakraborty ◽  
Soumyadeep Kundu ◽  
Sayantan Paul ◽  
Ayatullah Faruk Mollah ◽  
Subhadip Basu ◽  
...  

2021 ◽  
Vol 421 ◽  
pp. 222-233
Author(s):  
Mengkai Ma ◽  
Qiu-Feng Wang ◽  
Shan Huang ◽  
Shen Huang ◽  
Yannis Goulermas ◽  
...  

Author(s):  
Shancheng Fang ◽  
Hongtao Xie ◽  
Jianjun Chen ◽  
Jianlong Tan ◽  
Yongdong Zhang

In this work, we propose an entirely learning-based method to automatically synthesize text sequences in natural images using conditional adversarial networks. Since vanilla GANs struggle to capture structural text patterns, directly employing GANs for text image synthesis typically yields illegible images. We therefore design a two-stage architecture to generate repeated characters in images. First, a character generator synthesizes the local appearance of each character independently, so that a legible character sequence is obtained. To achieve style consistency across characters, we propose a novel style loss based on variance minimization. Second, we design a pixel-manipulation word generator, constrained by self-regularization, which learns to convert the local characters into a plausible word image. Experiments on the SVHN, ICDAR, and IIIT5K datasets demonstrate that our method synthesizes visually appealing text images. We also show that the high-quality images synthesized by our method can boost the performance of a scene text recognition algorithm.
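The variance-minimization style loss can be illustrated with a short sketch. The following assumes each character's style is summarized by the channel-wise mean of its feature map and penalizes the variance of these style vectors across the characters of a word; the paper's exact feature choice may differ.

```python
# Sketch of a variance-minimization style loss (assumed formulation):
# summarize each character's style as the channel-wise mean of its
# feature map, then penalize the variance of these style vectors
# across the characters of one word.
import torch

def style_variance_loss(char_features):
    # char_features: (N, C, H, W), one feature map per character.
    style = char_features.mean(dim=(2, 3))          # (N, C) style vectors
    return style.var(dim=0, unbiased=False).mean()  # variance across chars

feats = torch.randn(5, 64, 16, 16)                  # 5 characters in a word
print(style_variance_loss(feats))  # scalar; 0 when all styles are identical
```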

