text line extraction
Recently Published Documents


TOTAL DOCUMENTS

65
(FIVE YEARS 11)

H-INDEX

13
(FIVE YEARS 1)

Author(s):  
Shibaprasad Sen ◽  
Ankan Bhattacharyya ◽  
Ram Sarkar ◽  
Kaushik Roy

The work reported in this article deals with a ground truth generation scheme for online handwritten Bangla documents at the text-line, word, and stroke levels. The aim of the proposed scheme is twofold: first, to build a document-level database that future researchers can use in this field; second, to provide ground truth information that will help other researchers evaluate the performance of their algorithms for text-line extraction, word extraction, word segmentation, stroke recognition, and word recognition. The scheme starts with text-line extraction from the online handwritten Bangla documents, then extracts words from the text-lines, and finally segments those words into basic strokes. After word segmentation, the basic strokes are assigned appropriate class labels using a modified distance-based feature extraction procedure and an MLP (Multi-layer Perceptron) classifier. The Unicode for the words is then generated from the sequence of stroke labels. XML files are used to store the stroke-, word-, and text-line-level ground truth information for the corresponding documents. The proposed system is semi-automatic, and each step (text-line extraction, word extraction, word segmentation, and stroke recognition) has been implemented using different algorithms. The procedure thus greatly reduces manual intervention by cutting the number of mouse clicks required to extract text-lines and words from the document and to segment the words into basic strokes. The integrated stroke recognition module also helps to minimize the manual labor needed to assign appropriate stroke labels. The resource is freely available and can be accessed at https://byanjon.herokuapp.com/ .
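
The abstract does not give the XML schema, so the following is only a minimal sketch of how ground truth at the text-line, word, and stroke levels might be nested in one XML file. The element names (document, textline, word, stroke), the attribute names, the stroke labels such as "s12", and the helper build_ground_truth are all hypothetical and are not taken from the article.

```python
import xml.etree.ElementTree as ET


def build_ground_truth(doc_id, lines):
    """Build a nested XML tree: document > textline > word > stroke.

    `lines` is a list of text-lines; each text-line is a list of words;
    each word is a (unicode_text, [stroke_label, ...]) pair.
    All element and attribute names here are illustrative assumptions.
    """
    root = ET.Element("document", id=doc_id)
    for line_idx, words in enumerate(lines):
        line_el = ET.SubElement(root, "textline", index=str(line_idx))
        for word_idx, (text, stroke_labels) in enumerate(words):
            # The word-level Unicode string is stored next to its stroke sequence.
            word_el = ET.SubElement(
                line_el, "word", index=str(word_idx), unicode=text
            )
            for stroke_idx, label in enumerate(stroke_labels):
                ET.SubElement(
                    word_el, "stroke", index=str(stroke_idx), label=label
                )
    return root


# Hypothetical sample: two text-lines, one word each, with made-up stroke labels.
sample = [
    [("কথা", ["s12", "s03", "s40"])],
    [("মা", ["s25", "s03"])],
]
root = build_ground_truth("doc-001", sample)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Keeping the three levels in one nested tree means a text-line, word, or stroke evaluation can each read the level it needs from the same file; the actual files produced by the reported system may be organized differently.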


Author(s):  
Khader Mohammad ◽  
Aziz Qaroush ◽  
Mahdi Washha ◽  
Sos Agaian ◽  
Iyad Tumar

2020 ◽  
Vol 140 ◽  
pp. 112916 ◽  
Author(s):  
Soumyadeep Kundu ◽  
Sayantan Paul ◽  
Suman Kumar Bera ◽  
Ajith Abraham ◽  
Ram Sarkar
