Character Segmentation and Recognition of Marathi Language

Author(s):  
Saurabh Ravindra Nikam

Abstract: Segmentation is one of the most important processes in character recognition, and it largely decides the success of the recognition method. Segmentation decomposes an image of a sequence of characters into sub-images of individual symbols by segmenting lines and words; the image is partitioned into multiple parts. Segmenting handwritten words into characters is a critical task because of the complexity of structural features and the variety of writing styles. Without segmenting touching characters it is difficult to recognize the individual characters, hence the need for segmentation of touching characters in a word. Here we consider Marathi words and Marathi numerals for segmentation. The algorithm is used to segment lines and then characters, and the segmented characters are stored in a result variable. It first separates the lines and then separates the characters from the input image. This procedure is repeated until the end of the input is reached. Keywords: Image Segmentation, Handwritten Marathi Characters, Marathi Numbers, OCR.

2019 ◽  
Vol 12 (2) ◽  
pp. 67
Author(s):  
Yuna Sugianela ◽  
Nanik Suciati

Automation of Javanese script translation is needed to make it easier for people to understand the meaning of ancient Javanese script. Taking a Javanese script image as input, the translation system generally consists of character segmentation, character recognition, and combining the recognized characters into a meaningful word. Segmentation, which obtains the region of interest of each character, is an important process in the translation system. In previous research, segmentation using the projection profile method separated each character well. The method can handle overlapping characters, but it still produces truncated characters. In this study, we propose a new segmentation method to reduce the number of truncated characters. The first step of the proposed method is pre-processing, which consists of converting the input into a binary image and cleaning noise. The next step is to determine the connected component labels, which then serve as character candidates. Some candidates are still represented by more than one label, so we need a process to merge the connected component labels whose centroid distance is less than a threshold. We evaluate the proposed method using Intersection over Union (IoU). The evaluation shows a best accuracy of 93.26%.
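The merging step and the IoU metric mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each candidate is given as a bounding box `(x_min, y_min, x_max, y_max)`, uses the box centre as a stand-in for the connected component centroid, and the function names and the threshold value are hypothetical.

```python
from math import hypot

# Boxes are (x_min, y_min, x_max, y_max) in pixels, with max exclusive.
# Assumption: the box centre approximates the component centroid.

def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0

def centre(box):
    """Centre point of a box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)

def merge_close_boxes(boxes, threshold):
    """Repeatedly merge any two boxes whose centres are closer than threshold."""
    merged = [tuple(b) for b in boxes]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                (xi, yi), (xj, yj) = centre(merged[i]), centre(merged[j])
                if hypot(xi - xj, yi - yj) < threshold:
                    a, b = merged[i], merged[j]
                    # Replace the pair with the smallest box covering both.
                    merged[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                 max(a[2], b[2]), max(a[3], b[3]))
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged

# A character body, a detached mark just below it, and a distant character.
candidates = [(0, 0, 4, 10), (1, 11, 3, 13), (20, 0, 24, 10)]
characters = merge_close_boxes(candidates, threshold=8)
```

A predicted box can then be scored against a hand-labelled one with `iou(predicted, ground_truth)`; how per-character IoU values are aggregated into the reported accuracy is not stated in the abstract.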


Author(s):  
Ipsita Pattnaik ◽  
Tushar Patnaik

Optical Character Recognition (OCR) is a field concerned with converting printed text into a computer-understandable, editable format. Odia is a regional language used in Odisha, West Bengal and Jharkhand, spoken by over forty million people. Such wide dependence on the language makes it important to preserve its script and to obtain a digital, editable version of Odia text. We propose a framework that takes an image of computer-printed Odia script as input and produces a computer-readable, user-editable version of it by recognizing the characters printed in the input image. The system uses various techniques to improve the image, then performs line segmentation, followed by word segmentation and finally character segmentation, using horizontal and vertical projection profiles.
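As a rough illustration of projection-profile segmentation, the sketch below splits a binary image into text lines using the horizontal profile (ink per row) and then splits one line into column runs using the vertical profile (ink per column). It assumes the image is a list of rows with 1 for ink and 0 for background; the function names are hypothetical and this is not the framework described in the abstract. Telling word gaps from character gaps would need an additional gap-width threshold, which is omitted here.

```python
def runs_of_ink(profile):
    """Return (start, end) pairs of consecutive non-zero entries, end exclusive."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the run ended at a blank entry
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def segment_lines(image):
    """Horizontal projection: sum each row, then find runs of non-empty rows."""
    return runs_of_ink([sum(row) for row in image])

def segment_columns(image, top, bottom):
    """Vertical projection restricted to rows top..bottom (bottom exclusive)."""
    width = len(image[0])
    profile = [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs_of_ink(profile)

# Tiny synthetic page: two text lines separated by a blank row.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
]
lines = segment_lines(page)
glyphs_per_line = [segment_columns(page, top, bottom) for top, bottom in lines]
```

Each `(start, end)` pair in `glyphs_per_line` gives the column span of one candidate glyph within its line.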


The concept of digitization has marked a revolution in data conversion, data storage and data sharing by converting non-editable typographic and handwritten text into editable electronic text. Though numerous such works have been carried out across the world in various languages using Optical Character Recognition (OCR), satisfactory output has been observed in only a few languages. This paper is a step towards the digitization of two of the most widely spoken languages in the Indian subcontinent, Hindi and Bengali, using Google's open-source OCR engine, Tesseract. Working on the scripts of these two languages of Brahmi origin has its own challenges owing to their varied traits of character segmentation and word formation. Here, the training of Tesseract with data sets of Hindi and Bengali typographic and handwritten characters is combined with a pre-processing stage involving input image customization and image augmentation that significantly enhances image quality, allowing Tesseract to offer more accurate results, especially for handwritten text and obscure images. The system also incorporates English translation and text-to-speech features, which are significant for non-native readers and visually impaired users. The focal idea of this paper is to reach a wider audience by enabling digitization on the Android platform. Comparative analysis on three kinds of input (images with typographic text, images with handwritten text, and inferior-quality images) shows that the approach does, to a certain extent, produce superior output in at least two of the three cases compared to the most consistent Android application currently available.


2021 ◽  
Vol 13 (1) ◽  
pp. 89-99
Author(s):  
Marcelo Eidi Imamura ◽  
Francisco Assis da Silva ◽  
Leandro Luiz de Almeida ◽  
Danillo Roberto Pereira ◽  
Almir Olivette Artero ◽  
...  

Brazil has a large fleet of vehicles running daily along urban roads and highways, which calls for a computational solution to assist in control and management. In this work we developed an application to detect and recognize license plates in real time, with various possible applications. The methodology has three main stages: plate detection, character segmentation and recognition. For the detection stage we used the YOLO library, which uses machine learning techniques to detect objects in real time. YOLO was trained on a dataset of plate images in different environments. In the segmentation stage, the individual characters on the plate were separated using image processing methods. In the last stage, character recognition was performed using two convolutional neural networks, obtaining a hit rate of 83.33%.


Author(s):  
Hendy Gunawan ◽  
Janson Hendryli ◽  
Dyah Erny Herwindiati

The Image Conversion Program of Music Notation into Numeric Notation is a character recognition system that accepts an image of music notation as input and produces a DOCX file containing the numeric notation of that image. Music notation has a note value and a rhythmic value and is written on a music stave. The system consists of four main processes: preprocessing (grayscale conversion and thresholding), notation line segmentation, notation character segmentation, and template matching. Template matching is used to recognize the music notation obtained after segmentation. Recognition is performed by comparing the image with template images previously stored in the database. The system has a 100% success rate on character segmentation and a 38.4843% success rate on character recognition with template matching.
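The template-matching step can be sketched as follows. This is only an illustration under stated assumptions: the segmented symbol and every stored template are binary images of identical size, similarity is the fraction of agreeing pixels (the abstract does not say which measure the system uses), and the template labels are hypothetical.

```python
def match_score(glyph, template):
    """Fraction of pixels that agree between two equally sized binary images."""
    total = agree = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            agree += (g == t)
    return agree / total

def recognise(glyph, templates):
    """Return (label, score) for the stored template that best matches glyph."""
    scored = ((label, match_score(glyph, tpl)) for label, tpl in templates.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical 2x2 templates standing in for stored notation symbols.
templates = {
    "quarter": [[1, 1], [1, 0]],
    "half":    [[0, 1], [1, 1]],
}
segmented = [[1, 1], [0, 0]]
label, score = recognise(segmented, templates)
```

Because the score depends on exact pixel alignment, small shifts or size differences between the segmented symbol and the template lower it sharply, which is one general reason template matching can recognise poorly even when segmentation succeeds.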


Author(s):  
Siddharth Salar et al.

Handwritten text recognition is still an open research problem in the area of Optical Character Recognition (OCR). This paper proposes an efficient approach towards the development of handwritten text recognition systems. The primary goal is to create an AI algorithm that enables entity and information extraction from documents with handwritten annotations, with the aim of identifying handwritten words in an image. The project extracts text, whether handwritten or machine printed, and converts it into a computer-understandable, editable format. To implement this project we used PyTesseract, an open-source OCR engine used to recognize handwritten text, and OpenCV, a library used from Python to solve computer vision problems. The input image is processed in several steps: pre-processing of the image, text localization, character segmentation, character recognition, and finally post-processing. Further image processing algorithms can also be used to deal with multiple characters in a single image, or with tilted or rotated images. The trained system gives an average accuracy of more than 95% on unseen test images.


Author(s):  
B. Likith Ram ◽  
P. Naga Sai Teja ◽  
Y. Sai Avinash Kumar ◽  
Ch. Sai Raj

A License Plate Recognition (LPR) system is an application of computer vision and image processing technology that takes video of vehicles, uses a vehicle frame as the input image, extracts the number plate from the whole vehicle image, and outputs the number plate information as text. The overall accuracy and efficiency of the whole LPR system depend on the number plate extraction phase, because the character segmentation and character recognition phases depend on its output. The higher the quality of the captured vehicle image, the better the chances of correctly extracting the number plate area. The image is segmented using a bilateral filtering algorithm and the Canny edge detection algorithm. We then read the license plate from the processed image using py-tesseract OCR and match the retrieved text, which is the vehicle number plate, against a database. Finally, we get the details of the particular vehicle from the database.


2021 ◽  
Author(s):  
S. Prabu ◽  
J.M. Gnanasekar

Image processing techniques are an essential part of current computer technologies and play a vital role in applications such as medicine, object detection, video surveillance and computer vision. An important process in image processing is image segmentation, which splits an image into smaller parts called segments. Segmentation simplifies the image representation so that the image can be analyzed. Many algorithms have been developed for segmenting images based on particular features of the pixels. In this paper, different segmentation algorithms are reviewed and analyzed, and a comparison of all the algorithms is presented. This comparison study is useful for increasing the accuracy and performance of segmentation methods in various image processing domains.


Author(s):  
J. Magelin Mary ◽  
Chitra K. ◽  
Y. Arockia Suganthi

Image processing, in general, involves applying signal processing to the input image in order to isolate its individual color planes. It plays an important role in image analysis and computer vision. This paper compares the efficiency of two approaches to detecting breast cancer in medical image processing. The fundamental aim is to apply image mining to medical image processing using classification rules generated by a genetic algorithm. Using the extracted border as the parameter, the border pixels are treated as population strings for the genetic algorithm (GA) and Ant Colony Optimization (ACO) in order to find the optimum value among the border pixels. We also compare the cost of ACO and GA to determine which gives the better solution for identifying an affected area in a medical image, based on computational time.

