document image segmentation
Recently Published Documents


TOTAL DOCUMENTS: 60 (FIVE YEARS: 5)

H-INDEX: 10 (FIVE YEARS: 0)

Author(s):  
Omar Boudraa ◽  
Walid Khaled Hidouci ◽  
Dominique Michelucci

Segmentation is a critical step in historical document image analysis systems: it determines the quality of the subsequent search, understanding, recognition and interpretation processes. It isolates the objects of interest and separates the regions to be analysed (paragraphs, lines, words and characters) from other entities (figures, graphs, tables, etc.). This stage follows thresholding, which improves the quality of the document and separates its foreground from its background, and skew detection and correction, which straightens the document. Here, a hybrid method is proposed to locate words and characters in both handwritten and printed documents. Numerical results on four common datasets of old, degraded document images indicate that the approach is robust and precise, with recall and precision reaching approximately 97.7% and 97.9%, respectively.
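
To make the order of the stages concrete, here is a minimal sketch of thresholding followed by projection-profile segmentation. It is not the hybrid method of the paper, which the abstract does not detail; projection profiles are a classical baseline swapped in for illustration. The function names (`binarize`, `segment_runs`, `text_lines`, `columns_in_line`) and the fixed threshold of 128 are assumptions, and skew correction is assumed to have been applied already.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, 0 = black, 255 = white


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Global thresholding: ink pixels become 1, background becomes 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges where the profile is non-zero."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def text_lines(binary: Image) -> List[Tuple[int, int]]:
    """Horizontal projection: rows containing ink are grouped into text lines."""
    return segment_runs([sum(row) for row in binary])


def columns_in_line(binary: Image, top: int, bottom: int) -> List[Tuple[int, int]]:
    """Vertical projection inside one text line: ink columns are grouped into blobs."""
    width = len(binary[0])
    profile = [sum(binary[r][c] for r in range(top, bottom)) for c in range(width)]
    return segment_runs(profile)


# Toy 5x5 page: one text line on rows 1-2 and another on row 4.
page = [
    [255, 255, 255, 255, 255],
    [0, 0, 255, 0, 255],
    [0, 255, 255, 0, 255],
    [255, 255, 255, 255, 255],
    [255, 0, 0, 0, 255],
]
ink = binarize(page)
lines = text_lines(ink)
first_line_blobs = columns_in_line(ink, *lines[0])
```

On degraded historical pages a global threshold and plain projection profiles are usually too fragile, which is one reason hybrid methods such as the one proposed above are of interest.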


2020 ◽  
Author(s):  
Yangfan Tong ◽  
Shuo Feng ◽  
Ruiqing Zhang

Abstract: To address the difficulty of recognizing and processing Film text in the Film Internet of Things, this paper seeks a method that can effectively identify the content of Film text. It uses the Mask R-CNN algorithm with ResNet101 as the backbone network to build a Film document image segmentation model. The optimal hyperparameters are anchor-box aspect ratios of [0.5, 1, 3], a non-maximum suppression threshold of 0.15, and a confidence threshold of 0.85, which give an F1 score of 0.8951. With the same hyperparameters evaluated at an IoU of 0.8, the F1 score is 0.7417. According to the results of the Pattern Recognition Laboratory of the Chinese Academy of Sciences, the model ranks first at an IoU of 0.6 and second at an IoU of 0.8, where the first-ranked entry is a non-end-to-end, single-task model. This suggests that the hyperparameter tuning and model training were relatively successful. The experimental results show that Mask R-CNN can accurately identify all the formulas in the Film text, and that it is significantly better at identifying small objects such as formulas in Film text images than the traditional Fast R-CNN and Faster R-CNN.
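
The role of the confidence threshold, the non-maximum suppression (NMS) threshold and the IoU evaluation threshold can be illustrated with a small, dependency-free sketch. It is not the authors' code: the helper names (`iou`, `filter_detections`, `f1_at_iou`), the greedy matching rule and the toy boxes are assumptions made for illustration; only the values 0.85, 0.15, 0.6 and 0.8 come from the abstract.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes: List[Box], scores: List[float],
                      min_conf: float = 0.85, nms_thresh: float = 0.15) -> List[Box]:
    """Drop low-confidence boxes, then apply greedy NMS in descending score order."""
    order = sorted((i for i, s in enumerate(scores) if s >= min_conf),
                   key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # Keep a box only if it overlaps every already-kept box by at most nms_thresh.
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in kept):
            kept.append(i)
    return [boxes[i] for i in kept]


def f1_at_iou(preds: List[Box], truths: List[Box], iou_thresh: float) -> float:
    """F1 score where a prediction counts as correct if it matches an unused
    ground-truth box with IoU >= iou_thresh (greedy one-to-one matching)."""
    unmatched = list(truths)
    tp = 0
    for p in preds:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= iou_thresh:
            tp += 1
            unmatched.remove(best)
    if tp == 0:
        return 0.0
    precision = tp / len(preds)
    recall = tp / len(truths)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): four raw detections and two ground-truth formulas.
raw_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (40, 40, 50, 50)]
raw_scores = [0.95, 0.90, 0.88, 0.50]
truth = [(0, 0, 10, 7), (20, 20, 30, 30)]

detections = filter_detections(raw_boxes, raw_scores)
f1_loose = f1_at_iou(detections, truth, 0.6)
f1_strict = f1_at_iou(detections, truth, 0.8)
```

In this toy case the first detection overlaps its ground-truth box with an IoU of 0.7, so it counts as correct at the 0.6 threshold but not at 0.8. The same mechanism explains why a reported F1 score drops when the IoU requirement is tightened.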


Author(s):  
Ricardo Batista das Neves ◽  
Luiz Felipe Vercosa ◽  
David Macedo ◽  
Byron Leite Dantas Bezerra ◽  
Cleber Zanchettin
