document page
Recently Published Documents



Author(s):  
Jayati Mukherjee ◽  
Swapan K. Parui ◽  
Utpal Roy

Segmenting text lines and words in an unconstrained handwritten or degraded machine-printed document is a challenging document analysis problem because of the heterogeneity of document structure. Documents often exhibit uneven skew between lines as well as broken words. This article addresses the segmentation of a document page image into lines and words. We propose an unsupervised, robust, and simple statistical method to segment a document image that is either handwritten or machine-printed (degraded or otherwise). The method treats segmentation as a two-class classification problem, in which classification is based on the distribution of gap sizes (between lines and between words) in a binary page image. The method is easy to implement, requires no pre-processing other than binarization of the input image, and does not need high computational resources. It is unsupervised in the sense that no annotated document page images are necessary, so the issue of a training database does not arise: given a document page image, the parameters needed for segmenting text lines and words are learned in an unsupervised manner. We applied the method to several popular publicly available handwritten and machine-printed datasets (ISIDDI, IAM-Hist, IAM, PBOK) covering different Indian and other languages and different fonts. Several experimental results are presented to show the effectiveness and robustness of the method. We also evaluated it on the ICDAR-2013 handwriting segmentation contest dataset, where it outperforms the winning method. In addition, we suggest a quantitative measure of the level of degradation of a document page image.
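
The idea of classifying gaps into two classes by their size distribution can be illustrated with a minimal sketch. The abstract does not say which classifier the authors use, so the Otsu-style threshold below (maximizing between-class variance over the observed gap sizes) is a stand-in chosen for illustration, not the published method. The function names `gap_sizes` and `two_class_threshold` and the sample projection profile are hypothetical.

```python
def gap_sizes(profile):
    """Return lengths of interior zero-runs in a projection profile.

    `profile` is a list of ink-pixel counts per column (or per row) of a
    binarized image. Leading and trailing blank runs are ignored.
    """
    gaps = []
    run = 0
    seen_ink = False
    for count in profile:
        if count > 0:
            if seen_ink and run > 0:
                gaps.append(run)
            run = 0
            seen_ink = True
        else:
            run += 1
    return gaps


def two_class_threshold(gaps):
    """Pick the cut that maximizes between-class variance (Otsu-style).

    This is an assumed stand-in for the paper's classifier. Returns None
    when all gaps have the same size, i.e. there is only one class.
    """
    values = sorted(set(gaps))
    n = len(gaps)
    best_t, best_score = None, -1.0
    for lo, hi in zip(values, values[1:]):
        small = [g for g in gaps if g <= lo]
        large = [g for g in gaps if g > lo]
        w0, w1 = len(small) / n, len(large) / n
        m0 = sum(small) / len(small)
        m1 = sum(large) / len(large)
        score = w0 * w1 * (m0 - m1) ** 2
        if score > best_score:
            best_score = score
            best_t = (lo + hi) / 2  # midpoint between adjacent gap sizes
    return best_t


# Hypothetical vertical projection profile of one text line.
profile = [0, 3, 2, 0, 4, 0, 0, 0, 0, 0, 0, 5, 1, 0, 2,
           0, 0, 0, 0, 0, 0, 0, 3, 0]
gaps = gap_sizes(profile)
threshold = two_class_threshold(gaps)
# True marks a gap classified as a word boundary, False as intra-word.
is_word_gap = [g > threshold for g in gaps]
```

Because the threshold is estimated from the gaps of the page itself, no annotated training data is involved, which matches the unsupervised setting the abstract describes.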


2021 ◽  
Author(s):  
Prashanth Pillai ◽  
Purnaprajna Mangsuli

In the O&G (Oil & Gas) industry, unstructured data sources such as technical reports on hydrocarbon production, daily drilling, well construction, etc. contain valuable information. This information, however, is conveyed through various formats such as tables, forms, text, and figures. Detecting these different entities in documents is essential for building a structured representation of the information within and for automated processing of documents at scale. Our work presents a document layout analysis workflow that detects and localizes different entities using a deep learning-based framework. The workflow comprises a transformer-based object-detection framework that identifies the spatial location of entities in a document page. The key elements of the object-detection pipeline include a residual network backbone for feature extraction and an encoder-decoder transformer based on the latest detection transformers (DETR) to predict object-bounding boxes and category labels. Object detection is formulated as a direct set prediction task using bipartite matching, which eliminates conventional operations such as anchor box generation and non-maximal suppression. The lack of sufficient publicly available document layout data sets that incorporate the artifacts observed in historical O&G technical reports is often a major challenge. We attempt to address this challenge with a novel training data augmentation methodology. The dense occurrence of elements in a page can often introduce uncertainties that result in bounding boxes cutting through text content. We adopt a bounding box post-processing methodology that refines the bounding box coordinates to minimize undercuts. The proposed document layout analysis pipeline was trained to detect entity types such as headings, text blocks, tables, forms, and images/charts in a document page.
A wide range of pages from lithology, stratigraphy, drilling, and field development reports were used for model training. The reports also included a considerable number of historical scanned reports. The trained object-detection model was evaluated on a test data set prepared from the O&G reports. DETR demonstrated superior performance when compared with Mask R-CNN on our dataset.
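
The set-prediction formulation with bipartite matching can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the matching cost here combines only the negative class probability and an L1 box distance (DETR additionally uses a generalized IoU term), and the optimal assignment is found by brute force over permutations rather than with the Hungarian algorithm, which is practical only for very small sets. The names `match_cost` and `bipartite_match` and the sample predictions are hypothetical.

```python
from itertools import permutations


def match_cost(pred, target):
    """Cost of assigning one prediction to one ground-truth entity.

    Simplified: negative probability of the target class plus the L1
    distance between normalized (x0, y0, x1, y1) boxes.
    """
    cls_cost = -pred["probs"][target["label"]]
    box_cost = sum(abs(p - t) for p, t in zip(pred["box"], target["box"]))
    return cls_cost + box_cost


def bipartite_match(preds, targets):
    """Return (pred_index, target_index) pairs with minimal total cost.

    Brute-force search, assuming len(preds) >= len(targets). Predictions
    left unmatched would be trained toward a "no object" class.
    """
    best, best_cost = (), float("inf")
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[p], targets[t])
                   for t, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return [(p, t) for t, p in enumerate(best)]


# Hypothetical model outputs for one page (three prediction slots).
preds = [
    {"box": (0.1, 0.1, 0.5, 0.2), "probs": {"heading": 0.9, "table": 0.1}},
    {"box": (0.1, 0.4, 0.9, 0.9), "probs": {"heading": 0.2, "table": 0.8}},
    {"box": (0.5, 0.5, 0.6, 0.6), "probs": {"heading": 0.5, "table": 0.5}},
]
# Hypothetical ground truth (two entities).
targets = [
    {"label": "table", "box": (0.1, 0.4, 0.9, 0.9)},
    {"label": "heading", "box": (0.1, 0.1, 0.5, 0.2)},
]
matches = bipartite_match(preds, targets)
```

Because each ground-truth entity is matched to exactly one prediction, duplicate detections are penalized during training, which is why this formulation does not need non-maximal suppression as a separate step.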


2021 ◽  
Vol 2021 (04) ◽  
pp. 0426
Author(s):  
Terry Bollinger

For anyone trying to understand both the basics and the full range of options available when making a DOI metadata submission to Crossref, this linked table of XML element and attribute descriptions gives one small publisher’s best understanding of the most recent version of Crossref’s metadata submission elements and attributes. As of April 2021, the most recent version of Crossref XML files is 4.4.2. This table provides definitions for the six Crossref XML Schema Definition (xsd) files that include the most commonly used description elements of a DOI submission: crossref4.4.2.xsd, common4.4.2.xsd, fundref.xsd, AccessIndicators.xsd, clinicaltrials.xsd, and relations.xsd. The table also includes a brief description of the main features of the externally defined jats:abstract (JATS) element. This table focuses not on XML syntax but on the intent and structure of the elements from a small publisher perspective. This table is one small publisher’s interpretation of Crossref XML and is not authoritative in any way. It will inevitably contain errors, and the author takes no responsibility for its use, which is necessarily and entirely at your own risk. Any submissions created with information from this table should be verified for correctness against the official automated documentation and tools at the Crossref submission site. Note, however, that occasional errors and inconsistencies in those Crossref XML files were uncovered during the creation of this table. Every effort has been made here to document inconsistencies both in the original files and in this interpretation of those files. Important links to Crossref documentation, including comments on the apparent status of Crossref web pages, are provided in the References section after the table on the last document page.


Author(s):  
Rajneesh Rani ◽  
Renu Dhir ◽  
Deepti Kakkar ◽  
Nonita Sharma

Identifying the script in a document page image is the first step for an OCR system that processes multi-script documents. In a multilingual, multi-script world, document processing systems that rely on human involvement to select the appropriate OCR package are undesirable and inefficient. The development of robust and efficient methods for automatic script identification is therefore of major importance for automatic document processing in a multilingual, multi-script environment. The basic objective is to arrive at intuitive methods with straightforward implementations that do not compromise efficiency. The aim of this work is to evaluate state-of-the-art feature extraction and classification techniques for automatic script identification of printed and handwritten documents and to propose the best combination of the two.


Author(s):  
Marian Wagdy ◽  
Khaild Amin ◽  
Mina Ibrahim

In recent years, handheld digital devices such as PDAs and camera phones have become widespread and are used to capture documents such as posters, magazines, and books. This is the simplest way of disseminating and collecting information. Unfortunately, a snapshot of a document taken in an uncontrolled environment suffers from perspective and geometric distortions, especially when the picture is taken of a rolled document, a page of a thick book, a multi-folded document, or a crumpled page. In such cases, the most common distortion is warped text lines. In this paper, we present a survey and a comparative study of document image dewarping techniques that aim to solve the curled-line and geometric distortion problems. We introduce a new classification of the available document image dewarping techniques and investigate their available datasets. Finally, we present the evaluation metrics used to test these techniques.


Author(s):  
Ricardo Batista das Neves Junior ◽  
Estanislau Lima ◽  
Byron L.D. Bezerra ◽  
Cleber Zanchettin ◽  
Alejandro H. Toselli
