Multi-task Layout Analysis for Historical Handwritten Documents Using Fully Convolutional Networks

Author(s):  
Yue Xu ◽  
Fei Yin ◽  
Zhaoxiang Zhang ◽  
Cheng-Lin Liu

Layout analysis is a fundamental process in document image analysis and understanding. It consists of several sub-processes, such as page segmentation, text line segmentation, and baseline detection. In this work, we propose a multi-task layout analysis method that uses a single fully convolutional network (FCN) to solve these three problems simultaneously. The FCN is trained to segment the document image into different regions and to detect the center line of each text line by classifying pixels into different categories. Through supervised learning on document images with pixel-wise labels, the FCN can extract discriminative features and perform pixel-wise classification accurately. After pixel-wise classification, post-processing steps reduce noise, correct wrong segmentations, and identify overlapping regions. Experimental results on the public dataset DIVA-HisDB, which contains challenging medieval manuscripts, demonstrate the effectiveness and superiority of the proposed method.
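As a rough illustration of the pixel-wise classification and noise-reduction steps described above, the following sketch assigns each pixel the class with the highest score and then relabels very small connected components as background. The function names, the convention that class 0 is background, and the `min_size` threshold are hypothetical; the abstract does not specify the actual post-processing rules, and no network is trained here.

```python
from collections import deque

def argmax_labels(scores):
    """scores[y][x] is a list of per-class scores; return a 2D map of class indices."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in scores]

def remove_small_components(labels, min_size, background=0):
    """Relabel 4-connected same-class components smaller than min_size as background.

    This is an assumed noise-reduction rule, not the one used in the paper.
    """
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if seen[y][x] or labels[y][x] == background:
                continue
            cls = labels[y][x]
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and labels[ny][nx] == cls):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) < min_size:
                for cy, cx in component:
                    out[cy][cx] = background
    return out
```

In a real pipeline the scores would come from the FCN output; here they are plain nested lists so the sketch stays self-contained.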

2014 ◽  
Vol 23 (3) ◽  
pp. 245-260 ◽  
Author(s):  
Ram Sarkar ◽  
Nibaran Das ◽  
Subhadip Basu ◽  
Mahantapas Kundu ◽  
Mita Nasipuri

A novel piecewise water flow technique for text line extraction from multi-skewed document images of handwritten text in different scripts is presented here. The basic water flow technique assumes that hypothetical water flows from both the left and right sides of the image frame. This flow of water fills the gaps between consecutive objects (texts) but is obstructed if any object lies in its path. All unwetted regions in the document image are then labeled distinctly to extract the text lines. However, the technique fails when two neighboring text lines touch each other, as the water is obstructed by the touching segment(s). To overcome this difficulty, we have modified the basic water flow technique by applying it iteratively over vertically segmented document images. The main purpose of this vertical segmentation is to localize the text line segment(s) where two text lines are joined. These segments are then horizontally fragmented, and each fragment is assigned to the text line to which it actually belongs. In this way, the probable data loss during isolation of the touching text line segment is minimized. Both the current and the basic techniques have been tested on three different databases: CMATERdb 1.1.1, CMATERdb 1.1.2, and the ICDAR2009 handwritten segmentation contest pages. The test results show that the present technique outperforms the basic one on all three databases.
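The basic (non-piecewise) idea can be sketched as follows. This is a simplified model under stated assumptions, not the authors' implementation: water is assumed to advance one column at a time and spread to the three neighboring rows of the next column (roughly a 45° spread), a pixel counts as unwetted only if neither the left-to-right nor the right-to-left flow reaches it, and the names `flow` and `extract_lines` are hypothetical.

```python
from collections import deque

def flow(ink, direction):
    """Return wet[y][x] for water entering from the left (direction=1) or right (-1).

    ink[y][x] is truthy for text pixels, which always block the water.
    """
    h, w = len(ink), len(ink[0])
    columns = range(w) if direction == 1 else range(w - 1, -1, -1)
    wet = [[False] * w for _ in range(h)]
    prev = None
    for x in columns:
        for y in range(h):
            if ink[y][x]:
                continue  # obstruction: stays dry
            if prev is None:
                wet[y][x] = True  # water enters at the frame edge
            else:
                # Water arrives from the previous column, same row or one row off.
                wet[y][x] = any(wet[ny][prev] for ny in (y - 1, y, y + 1) if 0 <= ny < h)
        prev = x
    return wet

def extract_lines(ink):
    """Label 4-connected unwetted regions; returns (label map, number of regions)."""
    h, w = len(ink), len(ink[0])
    left, right = flow(ink, 1), flow(ink, -1)
    dry = [[not (left[y][x] or right[y][x]) for x in range(w)] for y in range(h)]
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if not dry[y][x] or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and dry[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

In this model, a narrow gap between two words on the same line stays dry because each word shadows it from one side, so the words merge into one labeled region. Two touching text lines would also merge, which is the failure case the piecewise variant addresses.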


2013 ◽  
Vol 64 (4) ◽  
pp. 238-243 ◽  
Author(s):  
Darko Brodić ◽  
Zoran N. Milivojević

The paper presents an algorithm for text line segmentation based on the oriented anisotropic Gaussian kernel. Initially, the document image is split into connected components delimited by bounding boxes. Redundant fragments are removed from these connected components. Binary moments are then applied to each connected component to estimate the local text skew, and the orientation of the anisotropic Gaussian kernel is set according to this estimate. After the algorithm is applied, boundary-growing areas are established around the connected components. These areas are of major importance for the evaluation of text line segmentation. For testing purposes, the algorithm is evaluated on different text samples. A comparative analysis is made between the algorithm with and without orientation of the anisotropic Gaussian kernel. The results show an improvement in text line segmentation.
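A minimal sketch of the two ingredients named above, assuming the standard second-order central moment formula for orientation and a plain rotated Gaussian. The function names, the kernel size, and the sigma values are illustrative assumptions, not parameters taken from the paper.

```python
import math

def component_orientation(pixels):
    """Estimate the skew angle (radians) of a component from its central moments.

    pixels is a list of (x, y) coordinates of the component's foreground pixels.
    """
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    mu20 = sum((x - mean_x) ** 2 for x, _ in pixels)
    mu02 = sum((y - mean_y) ** 2 for _, y in pixels)
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in pixels)
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)

def oriented_gaussian_kernel(half, sigma_u, sigma_v, theta):
    """Build a normalized (2*half+1)^2 Gaussian kernel whose long axis is rotated by theta.

    sigma_u is the spread along the rotated axis, sigma_v the spread across it.
    """
    c, s = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = x * c + y * s    # coordinate along the text direction
            v = -x * s + y * c   # coordinate across the text direction
            row.append(math.exp(-(u * u / (2 * sigma_u ** 2) + v * v / (2 * sigma_v ** 2))))
        kernel.append(row)
    total = sum(map(sum, kernel))
    return [[value / total for value in row] for row in kernel]
```

Choosing `sigma_u` larger than `sigma_v` stretches the kernel along the estimated text direction, so smoothing would tend to join characters within a line more than across lines.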


2015 ◽  
Vol 66 (3) ◽  
pp. 132-141 ◽  
Author(s):  
Darko Brodić

This manuscript proposes an extension to the water flow algorithm for text line segmentation. The basic algorithm assumes that hypothetical water flows across the document image frame at a few specified angles, from left to right and vice versa. As a result, unwetted image regions that incorporate text are extracted. These regions are of major importance for text line segmentation. The extension modifies the water flow function that creates the unwetted region: the linear water flow function used in the basic algorithm is replaced with its power function counterpart. The extended method was tested, examined, and evaluated on different text samples. The results are encouraging because the extension improves text line segmentation, which is a key processing stage.


2017 ◽  
Vol 2 (2) ◽  
pp. 60
Author(s):  
Erick Paulus ◽  
Mira Suryani ◽  
Setiawan Hadi ◽  
Akik Hidayat

The varying image quality of old Sundanese documents can be a real challenge for text line segmentation. This paper describes the results of an investigation of two text line segmentation methods, the projection profile method and the Seam Carving method, on several collections of Sundanese document images. The in-depth investigation covers handwritten documents written on lontar and paper media. A comparative experimental study was used as the investigative methodology. The performance of both methods is tested on color images and binary images using the evaluation metrics provided in the ICDAR 2013 handwriting segmentation competition. Experimental results show that the projection profile method works best on binary images in which the writing is relatively horizontal, while the Seam Carving method is able to segment lines in a non-linear manner and achieves performance above 80%. With the addition of a binarization step in the pre-processing stage, the performance of the Seam Carving method can increase up to 99%, and the number of segmented lines is close to the number of ground-truth lines.
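The projection profile method mentioned above can be sketched on a binary image as follows. This is a minimal illustration that assumes roughly horizontal writing, with foreground pixels stored as 1; the function names and the `min_ink` threshold are hypothetical and not taken from the paper.

```python
def projection_profile(binary):
    """Count foreground pixels in each row of a binary image (list of 0/1 rows)."""
    return [sum(row) for row in binary]

def segment_lines(binary, min_ink=1):
    """Return (first_row, last_row) bands whose row counts reach min_ink."""
    profile = projection_profile(binary)
    bands, start = [], None
    for y, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = y                      # a text band begins
        elif count < min_ink and start is not None:
            bands.append((start, y - 1))   # a gap row closes the band
            start = None
    if start is not None:
        bands.append((start, len(profile) - 1))  # band runs to the last row
    return bands
```

Because the bands are cut along whole rows, skewed or curved lines would blur the profile, which is consistent with the reported result that this method suits relatively horizontal writing.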

