document layout Latest Research Papers

Segmentation for document layout analysis: not dead yet

International Journal on Document Analysis and Recognition (IJDAR) ◽

10.1007/s10032-021-00391-3 ◽

2022 ◽

Author(s):

Logan Markewich ◽

Hao Zhang ◽

Yubin Xing ◽

Navid Lambert-Shirzad ◽

Zhexin Jiang ◽

...

Keyword(s):

Layout Analysis ◽

Document Layout Analysis ◽

Document Layout

Guías de publicación: ¿eficiencia editorial o desesperación profesional? [Publication guidelines: editorial efficiency or professional desperation?]

Anuario ThinkEPI ◽

10.3145/thinkepi.2021.e15e05 ◽

2021 ◽

Author(s):

Enrique Orduña-Malea

Keyword(s):

Scientific Research ◽

Research Management ◽

Scientific Journals ◽

Ethical Aspects ◽

Research Staff ◽

Editorial Policies ◽

Journal Editorial ◽

Document Layout ◽

Publication Guidelines ◽

Author Publication

Author publication guidelines (APG) are created by scientific journals to instruct authors when submitting manuscripts for publication. These documents cover formal elements that articles must comply with for submission (e.g., format of references, document layout, word limit, and structure), as well as ethical aspects of the scientific research and the journal's editorial policies. Despite the importance of these documents for research management, their clarity and quality vary widely among journals, causing frustration for research staff and financial expense for publishers. The objective of this study is to propose a decalogue of generic recommendations for preparing publication guidelines and to establish a taxonomy of the informative elements to be included in these documents.

Document Layout Analysis Using Detection Transformers

10.2118/207266-ms ◽

2021 ◽

Author(s):

Prashanth Pillai ◽

Purnaprajna Mangsuli

Keyword(s):

Deep Learning ◽

Object Detection ◽

Superior Performance ◽

Layout Analysis ◽

Bounding Box ◽

Document Layout Analysis ◽

Wide Range ◽

Document Layout ◽

Bounding Boxes ◽

Document Page

In the oil and gas (O&G) industry, unstructured data sources such as technical reports on hydrocarbon production, daily drilling, well construction, etc. contain valuable information. This information, however, is conveyed through various formats such as tables, forms, text, figures, etc. Detecting these different entities in documents is essential for building a structured representation of the information within and for automated processing of documents at scale. Our work presents a deep learning-based document layout analysis workflow to detect and localize these entities. The workflow comprises a transformer-based object-detection framework that identifies the spatial location of entities in a document page. The key elements of the object-detection pipeline include a residual network backbone for feature extraction and an encoder-decoder transformer based on the latest detection transformers (DETR) to predict object bounding boxes and category labels. Object detection is formulated as a direct set prediction task using bipartite matching, which eliminates conventional operations such as anchor box generation and non-maximum suppression. A major challenge is the scarcity of publicly available document layout data sets that incorporate the artifacts observed in historical O&G technical reports. We attempt to address this challenge with a novel training data augmentation methodology. The dense occurrence of elements in a page can often introduce uncertainties, resulting in bounding boxes that cut through text content. We adopt a bounding box post-processing methodology that refines the bounding box coordinates to minimize undercuts. The proposed document layout analysis pipeline was trained to detect entity types such as headings, text blocks, tables, forms, and images/charts in a document page.
A wide range of pages from lithology, stratigraphy, drilling, and field development reports were used for model training. The reports also included a considerable number of historical scanned reports. The trained object-detection model was evaluated on a test data set prepared from the O&G reports. DETR demonstrated superior performance compared with Mask R-CNN on our data set.
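The set-prediction step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a simplified matching cost (one minus the predicted probability of the true class, plus the L1 distance between normalized boxes) and finds the optimal one-to-one assignment by brute force, which is only practical for a handful of boxes. DETR itself uses the Hungarian algorithm and adds a generalized IoU term to the cost. The names `match_cost` and `bipartite_match` and the dictionary layout are hypothetical.

```python
from itertools import permutations


def l1_box_distance(box_a, box_b):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def match_cost(pred, target, class_weight=1.0, box_weight=1.0):
    """Simplified (assumed) cost of pairing one prediction with one target."""
    class_cost = 1.0 - pred["probs"][target["label"]]
    box_cost = l1_box_distance(pred["box"], target["box"])
    return class_weight * class_cost + box_weight * box_cost


def bipartite_match(preds, targets):
    """Return (assignment, total_cost) for the cheapest one-to-one pairing.

    assignment[j] is the index of the prediction matched to target j.
    Predictions left unmatched would be trained toward "no object".
    Brute force over permutations: fine for a sketch, not for real pages.
    """
    if len(targets) > len(preds):
        raise ValueError("need at least as many predictions as targets")
    best_cost, best_perm = float("inf"), ()
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[i], targets[j]) for j, i in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(best_perm), best_cost


# Toy page: three predictions, two ground-truth entities.
# Boxes are (center_x, center_y, width, height), normalized to [0, 1].
preds = [
    {"probs": {"table": 0.1, "heading": 0.9}, "box": (0.5, 0.1, 0.8, 0.1)},
    {"probs": {"table": 0.8, "heading": 0.2}, "box": (0.5, 0.6, 0.9, 0.4)},
    {"probs": {"table": 0.5, "heading": 0.5}, "box": (0.2, 0.2, 0.1, 0.1)},
]
targets = [
    {"label": "table", "box": (0.5, 0.6, 0.9, 0.4)},
    {"label": "heading", "box": (0.5, 0.1, 0.8, 0.1)},
]
assignment, total_cost = bipartite_match(preds, targets)
print(assignment)  # [1, 0]: the table pairs with prediction 1, the heading with prediction 0
```

Because each ground-truth entity is paired with exactly one prediction, duplicate detections are penalized during training rather than filtered afterwards, which is why a pipeline of this kind can omit non-maximum suppression.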

White Appearance for Optimal Text-Background Lightness Combination Document Layout on a Tablet Display under Normal Light Levels

Color and Imaging Conference ◽

10.2352/issn.2169-2629.2021.29.188 ◽

2021 ◽

Vol 2021 (29) ◽

pp. 188-192

Author(s):

Huang Hsin-Pou ◽

Li Hung-Chung ◽

Wei Minchen ◽

Huang Yu-Cheng

Keyword(s):

The Other ◽

Visual Comfort ◽

Light Levels ◽

Other Hand ◽

Document Layout ◽

Normal Light

In this study, two psychophysical experiments are carried out to understand the visual comfort and white appearance of a tablet display. Twenty-four observers assess the visual comfort of document layouts, and eleven observers rate the whiteness percentage of the stimulus under normal light levels with a CCT of 6500 K. The results of the visual comfort experiment indicate that black text on a light grey background provides better visual comfort. The white appearance experiment shows that observers rate the stimulus with a CCT of 6515 K and a Duv of 0 as the whitest.

Complex Document Layout Segmentation Based on An Encoder-Decoder Architecture

Journal of Physics Conference Series ◽

10.1088/1742-6596/2010/1/012024 ◽

2021 ◽

Vol 2010 (1) ◽

pp. 012024

Author(s):

Jia Yao ◽

Linlin Huang

Keyword(s):

Decoder Architecture ◽

Document Layout

French vital records data gathering and analysis through image processing and machine learning algorithms

Journal of Data Mining & Digital Humanities ◽

10.46298/jdmdh.7327 ◽

2021 ◽

Vol 2021 ◽

Author(s):

Cyprien Plateau-Holleville ◽

Enzo Bonnot ◽

Franck Gechter ◽

Laurent Heyberger

Keyword(s):

Data Extraction ◽

Data Gathering ◽

Extraction Process ◽

Point Of View ◽

Machine Learning Algorithms ◽

The Social ◽

Vital Records ◽

Document Layout ◽

Scanned Documents

Vital records are rich in meaningful historical data concerning city as well as countryside inhabitants that can be used, among other things, to study former populations and reveal their social, economic, and demographic characteristics. However, these studies face a major difficulty in collecting the data they need, since most of these records are scanned documents that require a manual transcription step before the data can be gathered and exploited from a historical point of view. This step slows down historical research and is an obstacle to a better knowledge of the habits of the population according to their social conditions. In this paper, we therefore present a modular and self-sufficient analysis pipeline that uses state-of-the-art algorithms, mostly independent of the document layout, and aims to automate this data extraction process.

Document Layout Analysis via Dynamic Residual Feature Fusion

2021 IEEE International Conference on Multimedia and Expo (ICME) ◽

10.1109/icme51207.2021.9428465 ◽

2021 ◽

Author(s):

Xingjiao Wu ◽

Ziling Hu ◽

Xiangcheng Du ◽

Jing Yang ◽

Liang He

Keyword(s):

Feature Fusion ◽

Layout Analysis ◽

Document Layout Analysis ◽

Document Layout

Investigating Document Layout and Placement Strategies for Collaborative Sensemaking in Augmented Reality

Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems ◽

10.1145/3411763.3451588 ◽

2021 ◽

Author(s):

Weizhou Luo ◽

Anke Lehmann ◽

Yushan Yang ◽

Raimund Dachselt

Keyword(s):

Augmented Reality ◽

Collaborative Sensemaking ◽

Document Layout

Document Layout Analysis with an Enhanced Object Detector

2021 5th International Conference on Pattern Recognition and Image Analysis (IPRIA) ◽

10.1109/ipria53572.2021.9483509 ◽

2021 ◽

Author(s):

Mohammad Minouei ◽

Mohammad Reza Soheili ◽

Didier Stricker

Keyword(s):

Layout Analysis ◽

Document Layout Analysis ◽

Document Layout

Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

Journal of Data Mining & Digital Humanities ◽

10.46298/jdmdh.6107 ◽

2021 ◽

Vol HistoInformatics (HistoInformatics) ◽

Author(s):

Raphaël Barman ◽

Maud Ehrmann ◽

Simon Clematide ◽

Sofia Ares Oliveira ◽

Frédéric Kaplan

Keyword(s):

Predictive Power ◽

Research Work ◽

Semantic Segmentation ◽

Visual Features ◽

Learning Techniques ◽

Document Layout Analysis ◽

Document Layout ◽

Series Of Experiments ◽

Textual Features ◽

Extract Information

The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration. Research work seeking to automatically process facsimiles and extract information from them is multiplying, with document layout analysis as an essential first step. Although the identification and categorization of segments of interest in document images have seen significant progress in recent years thanks to deep learning techniques, many challenges remain, including the use of finer-grained segmentation typologies and the consideration of complex, heterogeneous documents such as historical newspapers. Moreover, most approaches consider visual features only, ignoring the textual signal. In this context, we introduce a multimodal approach for the semantic segmentation of historical newspapers that combines visual and textual features. Based on a series of experiments on diachronic Swiss and Luxembourgish newspapers, we investigate, among other things, the predictive power of visual and textual features and their capacity to generalize across time and sources. Results show consistent improvement of multimodal models over a strong visual baseline, as well as better robustness to high material variance.
