Logical Layout Analysis using Deep Learning

Author(s):  
Annus Zulfiqar ◽  
Adnan Ul-Hasan ◽  
Faisal Shafait
2021 ◽  
Author(s):  
Prashanth Pillai ◽  
Purnaprajna Mangsuli

Abstract In the O&G (Oil & Gas) industry, unstructured data sources such as technical reports on hydrocarbon production, daily drilling, and well construction contain valuable information. This information, however, is conveyed in various formats such as tables, forms, text, and figures. Detecting these entities in documents is essential for building a structured representation of the information they contain and for automated processing of documents at scale. Our work presents a deep learning-based document layout analysis workflow to detect and localize these entities. The workflow comprises a transformer-based object-detection framework that identifies the spatial location of entities in a document page. The key elements of the object-detection pipeline are a residual network backbone for feature extraction and an encoder-decoder transformer, based on the detection transformer (DETR), that predicts object bounding boxes and category labels. Object detection is formulated as a direct set prediction task using bipartite matching, which eliminates conventional operations such as anchor box generation and non-maximum suppression. The lack of sufficient publicly available document layout datasets that incorporate the artifacts observed in historical O&G technical reports is often a major challenge. We attempt to address this challenge with a novel training data augmentation methodology. The dense occurrence of elements in a page can introduce uncertainties that result in bounding boxes cutting through text content. We adopt a bounding box post-processing methodology that refines the bounding box coordinates to minimize such undercuts. The proposed document layout analysis pipeline was trained to detect entity types such as headings, text blocks, tables, forms, and images/charts in a document page.
A wide range of pages from lithology, stratigraphy, drilling, and field development reports was used for model training. The reports also included a considerable number of historical scanned reports. The trained object-detection model was evaluated on a test dataset prepared from the O&G reports. DETR demonstrated superior performance compared with Mask R-CNN on our dataset.
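
The bipartite matching step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cost function (negative class probability plus a weighted L1 box distance), the `box_weight` value, the dictionary layout, and the toy page data are all assumptions made for illustration. The generalized IoU term that DETR adds to its matching cost is omitted, and a brute-force search stands in for the Hungarian algorithm, so the sketch only suits very small sets.

```python
from itertools import permutations

def l1_distance(box_a, box_b):
    """Sum of absolute coordinate differences between two (cx, cy, w, h) boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))

def match_cost(pred, target, box_weight=5.0):
    """Cost of assigning one prediction to one ground-truth entity.

    Lower is better: a high probability for the true class and a nearby box.
    box_weight is an assumed value, not taken from the abstract.
    """
    class_cost = -pred["probs"][target["label"]]
    return class_cost + box_weight * l1_distance(pred["box"], target["box"])

def bipartite_match(preds, targets, box_weight=5.0):
    """Return sorted (pred_index, target_index) pairs with minimum total cost.

    Each target gets exactly one prediction and no prediction is used twice.
    Brute force over permutations: fine for toy inputs only.
    """
    if len(preds) < len(targets):
        raise ValueError("need at least as many predictions as targets")
    best_cost, best_pairs = float("inf"), []
    for pred_ids in permutations(range(len(preds)), len(targets)):
        cost = sum(
            match_cost(preds[p], targets[t], box_weight)
            for t, p in enumerate(pred_ids)
        )
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(p, t) for t, p in enumerate(pred_ids)]
    return sorted(best_pairs)

# Hypothetical page with two entities; label 0 = table, label 1 = text block.
targets = [
    {"label": 0, "box": (0.5, 0.7, 0.8, 0.4)},
    {"label": 1, "box": (0.5, 0.2, 0.8, 0.2)},
]
# Three predictions; the third has no counterpart and stays unmatched.
preds = [
    {"probs": [0.1, 0.9], "box": (0.5, 0.2, 0.8, 0.2)},
    {"probs": [0.8, 0.2], "box": (0.5, 0.7, 0.8, 0.4)},
    {"probs": [0.5, 0.5], "box": (0.1, 0.1, 0.1, 0.1)},
]
matches = bipartite_match(preds, targets)
print(matches)  # [(0, 1), (1, 0)]
```

Because every ground-truth entity is paired with exactly one prediction, duplicate detections are penalized during training rather than filtered afterwards, which is why this formulation can drop non-maximum suppression. In a set-prediction detector, predictions left unmatched are typically trained toward a "no object" class.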


Author(s):  
Stellan Ohlsson

2019 ◽  
Vol 53 (3) ◽  
pp. 281-294
Author(s):  
Jean-Michel Foucart ◽  
Augustin Chavanne ◽  
Jérôme Bourriau

Artificial Intelligence (AI) is expected to contribute to medicine in many ways. In orthodontics, several automated solutions have been available for some years in X-ray imaging (automated cephalometric analysis, automated airway analysis) or for a few months (automatic analysis of digital models, automated set-up; CS Model +, Carestream Dental™). The aim of this two-part study is to evaluate the reliability of automated model analysis, in terms of both digitization and segmentation. Comparing the model analysis results obtained automatically with those obtained by several orthodontists demonstrates the reliability of the automatic analysis: the measurement error ultimately ranges between 0.08 and 1.04 mm, which is not significant and is comparable with the inter-observer measurement errors reported in the literature. These results open new perspectives on the contribution of AI to orthodontics, which, building on deep learning and big data, should make it possible in the medium term to move toward a more preventive and more predictive orthodontics.


2020 ◽  
Author(s):  
L Pennig ◽  
L Lourenco Caldeira ◽  
C Hoyer ◽  
L Görtz ◽  
R Shahzad ◽  
...  

2020 ◽  
Author(s):  
A Heinrich ◽  
M Engler ◽  
D Dachoua ◽  
U Teichgräber ◽  
F Güttler
