Camera-Based Bi-lingual Script Identification at Word Level using SFTA Features

Most documents in application areas such as government, business, and research are available as bi-lingual or multi-lingual text documents. Such documents are captured by video or still cameras, and the script of the text must be identified before automatic reading and editing. In this paper, an attempt is made to address the problem of script identification from camera-captured document images using Segmentation-based Fractal Texture Analysis (SFTA) features. The input image is decomposed into a group of binary images by applying Two-Threshold Binary Decomposition (TTBD), with the number of thresholds fixed empirically at nt = 3. From each decomposed binary image, Box Count, Mean Gray Level, and Pixel Count are extracted to form the feature vector. This feature vector is submitted to a K-NN classifier to identify the script of the input document image. In all, 10 scripts of Indian languages are considered, each paired with English to form bi-lingual documents. The novelty of the paper is that 7 features are selected as potential features to obtain the highest accuracy. These 7 features, Box Count (3), Mean Gray Level (2), and Pixel Count (2), yielded 87.02% recognition accuracy for the English and Hindi script combination on the collected dataset, and encouraging results for the other combinations. They were selected from the full set of 18 features using a technique named feed-forward feature selection.
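The decomposition step described in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names and the hand-picked threshold values are assumptions made for illustration (SFTA as usually described derives its thresholds with a multi-level Otsu procedure), and the Box Count feature is left out. With three thresholds the sketch yields six binary images, which with three features per image is consistent with the 18-feature set mentioned above.

```python
def ttbd(image, thresholds, max_level=255):
    """Two-threshold binary decomposition of a grayscale image.

    image is a list of rows of integer gray levels; thresholds are
    hypothetical values supplied by the caller (an assumption here).
    """
    ts = sorted(thresholds)
    masks = []
    # Single-threshold images: a pixel is 1 when its gray level exceeds t.
    for t in ts:
        masks.append([[1 if p > t else 0 for p in row] for row in image])
    # Two-threshold images: a pixel is 1 when lower < gray level <= upper,
    # with the maximum gray level closing the last interval.
    uppers = ts[1:] + [max_level]
    for lower, upper in zip(ts, uppers):
        masks.append([[1 if lower < p <= upper else 0 for p in row]
                      for row in image])
    return masks


def pixel_count(mask):
    """Number of foreground pixels in a binary mask."""
    return sum(sum(row) for row in mask)


def mean_gray_level(image, mask):
    """Mean gray level of the image pixels selected by the mask."""
    values = [p for irow, mrow in zip(image, mask)
              for p, m in zip(irow, mrow) if m]
    return sum(values) / len(values) if values else 0.0


def sfta_like_features(image, thresholds):
    """Pixel Count and Mean Gray Level for every decomposed binary image."""
    features = []
    for mask in ttbd(image, thresholds):
        features.append(pixel_count(mask))
        features.append(mean_gray_level(image, mask))
    return features
```

A feature vector built this way could then be passed to any nearest-neighbour classifier; the classifier and the feature selection step are not shown here.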


Script Identification of Camera Based Bilingual Document Images Using SFTA Features

10.4018/978-1-6684-3690-5.ch040 ◽

2022 ◽

pp. 811-822

Author(s):

B.V. Dhandra ◽

Satishkumar Mallappa ◽

Gururaj Mukarambi

Keyword(s):

Block Size ◽

Texture Features ◽

Input Image ◽

Document Image ◽

Optimal Size ◽

Document Images ◽

Binary Images ◽

Script Identification ◽

Unified Algorithm ◽

Block Sizes

In this article, an exhaustive experiment is carried out to test the performance of Segmentation-based Fractal Texture Analysis (SFTA) features with nt = 4 pairs and nt = 8 pairs, geometric features, and their combinations. A unified algorithm is designed to identify the scripts of camera-captured bi-lingual document images containing the international language English together with one of the Hindi, Kannada, Telugu, Malayalam, Bengali, Oriya, Punjabi, and Urdu scripts. The SFTA algorithm decomposes the input image into a set of binary images, from which the fractal dimensions of the resulting regions are computed in order to describe the segmented texture patterns. This motivates the use of SFTA features as texture features to identify the scripts of camera-based document images, which are affected by non-homogeneous illumination (Resolution). An experiment is carried out on eleven scripts, each with 1000 sample images, at block sizes of 128 × 128, 256 × 256, 512 × 512, and 1024 × 1024. It is observed that the block size 512 × 512 gives the maximum accuracy of 86.45% for the Gujarati and English script combination and is the optimal size. The novelty of this article is that a unified algorithm is developed for script identification of bilingual document images.
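The fractal dimension mentioned in this abstract is commonly estimated by box counting. The following is a minimal sketch under stated assumptions: the function names and the default box sizes are illustrative, the estimate is applied to whatever binary mask is passed in, and any border extraction that SFTA performs beforehand is not shown.

```python
import math


def box_count(mask, size):
    """Number of size x size boxes that contain at least one foreground pixel."""
    height, width = len(mask), len(mask[0])
    count = 0
    for top in range(0, height, size):
        for left in range(0, width, size):
            if any(mask[y][x]
                   for y in range(top, min(top + size, height))
                   for x in range(left, min(left + size, width))):
                count += 1
    return count


def box_counting_dimension(mask, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log N(s) against log(1/s) over the box sizes."""
    xs, ys = [], []
    for s in sizes:
        n = box_count(mask, s)
        if n > 0:
            xs.append(math.log(1.0 / s))
            ys.append(math.log(n))
    if len(xs) < 2:
        return 0.0  # empty mask: no slope can be fitted
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

As a sanity check on the idea, a completely filled square should give a slope close to 2 and a straight one-pixel line a slope close to 1.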


Script Identification of Camera Based Bilingual Document Images Using SFTA Features

International Journal of Technology and Human Interaction ◽

10.4018/ijthi.2019100101 ◽

2019 ◽

Vol 15 (4) ◽

pp. 1-12

Author(s):

B.V. Dhandra ◽

Satishkumar Mallappa ◽

Gururaj Mukarambi

Keyword(s):

Block Size ◽

Texture Features ◽

Input Image ◽

Document Image ◽

Optimal Size ◽

Document Images ◽

Binary Images ◽

Script Identification ◽

Unified Algorithm ◽

Block Sizes


Document Image Quality Assessment with Relaying Reference to Determine Minimum Readable Resolution for Compression

Electronic Imaging ◽

10.2352/issn.2470-1173.2020.9.iqsp-323 ◽

2020 ◽

Vol 2020 (9) ◽

pp. 323-1-323-8

Author(s):

Litao Hu ◽

Zhenhua Hu ◽

Peter Bauer ◽

Todd J. Harris ◽

Jan P. Allebach

Keyword(s):

Image Quality ◽

Quality Assessment ◽

Image Quality Assessment ◽

Research Area ◽

Input Image ◽

Quality Score ◽

Document Image ◽

Digital Cameras ◽

Active Research ◽

Traditional Approaches

Image quality assessment has been a very active research area in the field of image processing, and numerous methods have been proposed. However, most existing methods focus on digital images that only or mainly contain pictures or photos taken by digital cameras. Traditional approaches evaluate an input image as a whole and try to estimate a quality score for the image, in order to give viewers an idea of how “good” the image looks. In this paper, we mainly focus on evaluating the quality of symbolic content such as text, bar codes, QR codes, lines, and handwriting in target images. A quality score for this kind of information can be based on whether or not it is readable by a human or recognizable by a decoder. Moreover, we mainly study the viewing quality of the scanned document of a printed image. For this purpose, we propose a novel image quality assessment algorithm that is able to determine the readability of a scanned document or of regions in a scanned document. Experimental results on some testing images demonstrate the effectiveness of our method.


Attention-based deep learning networks for identification of human gait using radar micro-Doppler spectrograms

International Journal of Microwave and Wireless Technologies ◽

10.1017/s1759078721000830 ◽

2021 ◽

pp. 1-6

Author(s):

Hannah Garcia Doherty ◽

Roberto Arnaiz Burgueño ◽

Roeland P. Trommel ◽

Vasileios Papanastasiou ◽

Ronny I. A. Harmanny

Keyword(s):

Neural Networks ◽

Feature Vector ◽

Classification Performance ◽

Input Image ◽

Human Gait ◽

Learning Networks ◽

Class Label ◽

Deep Convolutional Neural Networks ◽

Network Layers ◽

Feature Dimension

Identification of human individuals within a group of 39 persons using micro-Doppler (μ-D) features has been investigated. Deep convolutional neural networks with two different training procedures have been used to perform classification. Visualization of the inner network layers revealed the sections of the input image most relevant when determining the class label of the target. A convolutional block attention module is added to provide a weighted feature vector in the channel and feature dimension, highlighting the relevant μ-D feature-filled areas in the image and improving classification performance.


A comparative review of the challenges encountered in sentiment analysis of Indian regional language tweets vs English language tweets

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.21.12394 ◽

2018 ◽

Vol 7 (2.21) ◽

pp. 319

Author(s):

Saini Jacob Soman ◽

P Swaminathan ◽

R Anandan ◽

K Kalaivani

Keyword(s):

Sentiment Analysis ◽

Life Events ◽

Social Networking Sites ◽

Positive Emotions ◽

English Language ◽

Indian Languages ◽

Regional Language ◽

Comparative Review ◽

Computational Research ◽

Regional Languages

With the growing use of online media for sharing views, sentiments, and opinions about products, services, organizations, and people, microblogging and social networking sites are gaining huge popularity. Twitter, one of the biggest social media sites, is used by many people to share their life events, views, and opinions on different areas and concepts. Sentiment analysis is the computational research of reviews, opinions, attitudes, views, and people's emotions about different products, services, firms, and topics by categorizing them as negative or positive emotions. Sentiment analysis of tweets is a challenging task. This paper presents a critical review comparing the challenges associated with sentiment analysis of tweets in the English language versus Indian regional languages. Five Indian languages, namely Tamil, Malayalam, Telugu, Hindi, and Bengali, are considered, and several challenges associated with analyzing Twitter sentiment in those languages are identified through systematic review and conceptualized in the form of a framework.


WRITER IDENTIFICATION BY TEXTURE ANALYSIS BASED ON KANNADA HANDWRITING

International Journal of Communication Networks and Security ◽

10.47893/ijcns.2012.1057 ◽

2012 ◽

pp. 80-85

Author(s):

B.V. DHANDRA ◽

VIJAYALAXMI.M. B ◽

GURURAJ MUKARAMBI ◽

MALLIKARJUN. HANGARGE

Keyword(s):

Texture Analysis ◽

Identification Problem ◽

Document Image ◽

Independent Method ◽

Gray Level ◽

Writer Identification ◽

Texture Image ◽

Major Research ◽

Gabor Filtering ◽

Occurrence Matrix

Writer identification is an important area of research because of its various applications, and it is a challenging task. Most research on writer identification is based on handwritten English documents, using both text-independent and text-dependent approaches. However, there is no significant work on identifying writers from Kannada documents. Hence, in this paper, we propose a text-independent method for off-line writer identification based on Kannada handwritten scripts. By treating each individual’s handwriting as a different texture image, a set of features based on the Discrete Cosine Transform, Gabor filtering, and the gray level co-occurrence matrix is extracted from preprocessed document image blocks. Experimental results demonstrate that Gabor energy features are more effective than DCT- and GLCM-based features for writer identification among 20 people.


Word-Level Script Identification Using Texture Based Features

International Journal of System Dynamics Applications ◽

10.4018/ijsda.2015040105 ◽

2015 ◽

Vol 4 (2) ◽

pp. 74-94

Author(s):

Pawan Kumar Singh ◽

Ram Sarkar ◽

Mita Nasipuri

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Statistical Significance ◽

Document Image ◽

Statistical Significance Testing ◽

Script Identification ◽

Word Level ◽

Histograms Of Oriented Gradients ◽

Handwritten Text ◽

Identification Technique

Script identification has been an appealing research interest in the field of document image analysis over the last few decades. Accurate recognition of the script is paramount to many post-processing steps such as automated document sorting, machine translation, and searching of text written in a particular script in a multilingual environment. For automatic processing of such documents through Optical Character Recognition (OCR) software, it is necessary to identify the script of each word in the document before feeding it to the OCR of the individual script. In this paper, a robust word-level handwritten script identification technique is proposed that uses texture-based features to identify words written in any of seven popular scripts, namely Bangla, Devanagari, Gurumukhi, Malayalam, Oriya, Telugu, and Roman. The texture-based features comprise a combination of Histograms of Oriented Gradients (HOG) and Moment invariants. The technique has been tested on 7000 handwritten text words, with each script contributing 1000 words. Based on the identification accuracies and statistical significance testing of seven well-known classifiers, Multi-Layer Perceptron (MLP) has been chosen as the final classifier, which is then tested comprehensively using different folds and different epoch sizes. The overall accuracy of the system is found to be 94.7% using a 5-fold cross-validation scheme, which is quite impressive considering the complexities and shape variations of the said scripts. This is an extended version of the paper described in (Singh et al., 2014).


A Hybrid Lossless-Lossy Binary Image Compression Scheme

International Journal of Computer Vision and Image Processing ◽

10.4018/ijcvip.2013100103 ◽

2013 ◽

Vol 3 (4) ◽

pp. 37-50

Author(s):

Saif alZahir ◽

Syed M. Naqvi

Keyword(s):

Image Compression ◽

Mathematical Models ◽

Regression Models ◽

Input Image ◽

Binary Image ◽

Large Set ◽

Image Simulation ◽

Compression Scheme ◽

Binary Images ◽

Visual Distortion

In this paper, the authors present a binary image compression scheme that can be used for either lossless or lossy compression requirements. This scheme contains five new contributions. The lossless component of the scheme partitions the input image into a number of non-overlapping rectangles using a new line-by-line method. The upper-left and lower-right vertices of each rectangle are identified, and their coordinates are efficiently encoded using three methods of representation and compression. The lossy component, on the other hand, provides higher compression through two techniques. 1) It reduces the number of rectangles from the input image using our mathematical regression models. These mathematical models guarantee image quality, so that rectangle reduction does not produce visual distortion in the image. The mathematical models have been obtained through subjective tests and regression analysis on a large set of binary images. 2) Further compression gain is achieved by discarding isolated pixels and 1-pixel rectangles from the image. Simulation results show that the proposed schemes provide significant improvements over previously published work for both the lossy and the lossless components.
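To make the idea of a line-by-line rectangle partition concrete, here is a minimal sketch. It is a generic run-merging approach chosen for illustration, not the authors' method: horizontal runs of 1-pixels are found row by row, and a run that repeats with the same column span on the next row extends the rectangle above it. The function names and the (top, left, bottom, right) tuple layout are assumptions.

```python
def partition_into_rectangles(image):
    """Cover all 1-pixels with non-overlapping (top, left, bottom, right) rectangles."""
    finished = []
    open_rects = {}  # (left, right) column span -> top row of a growing rectangle
    for y, row in enumerate(image):
        # Collect the horizontal runs of 1-pixels on this row.
        runs = set()
        x, width = 0, len(row)
        while x < width:
            if row[x]:
                start = x
                while x < width and row[x]:
                    x += 1
                runs.add((start, x - 1))
            else:
                x += 1
        # Close rectangles whose column span does not continue on this row.
        for span in list(open_rects):
            if span not in runs:
                top = open_rects.pop(span)
                finished.append((top, span[0], y - 1, span[1]))
        # Start a new rectangle for every run that is not already open.
        for span in runs:
            open_rects.setdefault(span, y)
    last_row = len(image) - 1
    for span, top in open_rects.items():
        finished.append((top, span[0], last_row, span[1]))
    return sorted(finished)


def rectangles_to_image(rectangles, height, width):
    """Rebuild the binary image from its rectangles (lossless round trip)."""
    image = [[0] * width for _ in range(height)]
    for top, left, bottom, right in rectangles:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                image[y][x] = 1
    return image
```

Only the two corner coordinates per rectangle need to be stored, which is what makes the representation attractive for compression; the encoding of those coordinates is not sketched here.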


GRAPH-BASED THINNING FOR BINARY IMAGES

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001493000510 ◽

1993 ◽

Vol 07 (05) ◽

pp. 1009-1030 ◽

Cited By ~ 7

Author(s):

SATOSHI SUZUKI ◽

NAONORI UEDA ◽

JACK SKLANSKY

Keyword(s):

Input Image ◽

Binary Image ◽

Feature Points ◽

Shape Distortion ◽

Binary Images ◽

Ideal Line ◽

Geographical Maps ◽

Line Patterns

A thinning method for binary images is proposed which converts digital binary images into line patterns. The proposed method suppresses shape distortion as well as false feature points, thereby producing more natural line patterns than existing methods. In addition, this method guarantees that the produced line patterns are one pixel in width everywhere. In this method, an input binary image is transformed into a graph in which 1-pixels correspond to nodes and neighboring nodes are connected by edges. Next, nodes unnecessary for preserving the topology of the input image and the edges connecting them are deleted symmetrically. Then, edges that do not contribute to the preservation of the topology of the input image are deleted. The advantages of this graph-based thinning method are confirmed by applying it to ideal line patterns and geographical maps.
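The first step described here, turning a binary image into a graph whose nodes are the 1-pixels, can be sketched as follows. The function name and the choice of 8-connectivity are assumptions for illustration; the node and edge deletion rules that perform the actual thinning are specific to the paper and are not reproduced.

```python
def pixel_graph(image):
    """Adjacency sets for the 1-pixels of a binary image, using 8-connectivity."""
    nodes = {(y, x)
             for y, row in enumerate(image)
             for x, value in enumerate(row) if value}
    graph = {node: set() for node in nodes}
    for (y, x) in nodes:
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                neighbour = (y + dy, x + dx)
                # Connect to every neighbouring 1-pixel except the pixel itself.
                if (dy, dx) != (0, 0) and neighbour in nodes:
                    graph[(y, x)].add(neighbour)
    return graph
```

A thinning procedure built on this structure would then remove nodes and edges while checking that connectivity of the graph is preserved.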


Weighted combination of per-frame recognition results for text recognition in a video stream

Computer Optics ◽

10.18287/2412-6179-co-795 ◽

2021 ◽

Vol 45 (1) ◽

pp. 77-89

Author(s):

O. Petrova ◽

K. Bulatov ◽

V.V. Arlazarov ◽

V.L. Arlazarov

Keyword(s):

Video Stream ◽

Input Image ◽

Document Image ◽

Text Recognition ◽

Weighting Method ◽

Document Recognition ◽

Perspective Distortion ◽

Character Weighting ◽

Specialized Equipment ◽

Weighted Combination

The range of uses of automated document recognition has expanded and, as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of interest. However, it is not always possible to ensure controlled capturing conditions and, consequently, high quality of input images. Unlike specialized scanners, mobile cameras allow using a video stream as an input, thus obtaining several images of the recognized object, captured with various characteristics. In this case, the problem of combining the information from multiple input frames arises. In this paper, we propose a weighting model for the process of combining the per-frame recognition results, two approaches to the weighted combination of the text recognition results, and two weighting criteria. The effectiveness of the proposed approaches is tested using datasets of identity documents captured with a mobile device camera in different conditions, including perspective distortion of the document image and low lighting conditions. The experimental results show that the weighted combination can improve the text recognition result quality in the video stream, and the per-character weighting method with input image focus estimation as a base criterion allows one to achieve the best results on the datasets analyzed.
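The general idea of a weighted combination of per-frame results can be shown with a simple weighted vote. This is a minimal sketch and not the weighting model proposed in the paper: it assumes the per-frame strings are already aligned and of equal length, and the per-frame weights (for example, a focus estimate) are hypothetical inputs supplied by the caller.

```python
from collections import defaultdict


def combine_per_frame_results(results, weights):
    """Weighted vote per character position over aligned per-frame strings."""
    if not results:
        return ""
    if len(weights) != len(results):
        raise ValueError("one weight is required per frame result")
    length = len(results[0])
    if any(len(text) != length for text in results):
        raise ValueError("this sketch assumes equal-length, aligned results")
    combined = []
    for position in range(length):
        votes = defaultdict(float)
        for text, weight in zip(results, weights):
            votes[text[position]] += weight
        # Keep the character with the largest accumulated weight.
        combined.append(max(votes, key=votes.get))
    return "".join(combined)
```

In practice, results from different frames rarely have the same length, so a real system would need an alignment step before voting.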
