document image processing Latest Research Papers

Rough Set Based Analysis of Document Images

10.31237/osf.io/36xm4 ◽

2021 ◽

Author(s):

Ushasi Chaudhuri

Keyword(s):

Image Processing ◽

Rough Set ◽

Character Recognition ◽

Document Image ◽

Training Dataset ◽

Small Subset ◽

Theoretic Model ◽

Binary Representation ◽

Small Range ◽

Document Image Processing

Rough set theory is a well-studied subject with a solid theoretical foundation and many applications, yet its use in image processing has been sparse. Most well-known algorithms for document image processing tasks such as character recognition, character spotting, and logo retrieval rely on supervised classification, which slows the system down as the diversity of documents increases and requires a large training dataset. To avoid the tedium and pitfalls of training without compromising efficiency, we introduce a rough-set-theoretic model. It is designed to perform unsupervised classification of optical characters and logos with a small subset of attributes, called the semi-reduct. The semi-reduct attributes are mostly geometric and topological in nature, each taking a small range of discrete values estimated from different combinatorial characteristics of rough-set approximations. This leads to quick and easy discernibility of almost all the characters and logos. In this thesis, we first explain the basics of rough set theory. We then propose various attributes that can be easily computed from the binary representation of the images. In subsequent chapters we show how to select an appropriate subset of these attributes, the semi-reduct, to perform a document processing task. We demonstrate that these attributes can be used to design a character recognition system that is efficient in both computation and storage. Using a different semi-reduct, we show that one can also solve the very delicate task of character spotting in ancient inscriptions. Additionally, we propose appropriate pre-processing steps to binarize old and dilapidated inscriptions. Finally, we propose a novel technique for logo retrieval using a suitably prepared semi-reduct. Comparison with other existing techniques substantiates our claim that attributes from the rough set are indeed good candidates for document image processing.
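
For readers unfamiliar with the rough-set approximations the abstract refers to, the following is a minimal sketch of the standard lower and upper approximations of a target set under an indiscernibility relation. It is not the thesis's method: the function name `approximations`, the toy glyph table, and the attributes `holes` and `endpoints` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper) rough-set approximations of `target`.

    objects:    dict mapping object name -> dict of attribute values
    attributes: attribute names that define indiscernibility
    target:     set of object names to approximate
    """
    # Objects with identical values on `attributes` are indiscernible
    # and fall into the same equivalence class.
    classes = defaultdict(set)
    for name, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class lies entirely inside the target
            lower |= block
        if block & target:    # class overlaps the target
            upper |= block
    return lower, upper


# Hypothetical toy data: glyphs described by two made-up discrete attributes.
glyphs = {
    "O": {"holes": 1, "endpoints": 0},
    "D": {"holes": 1, "endpoints": 0},
    "B": {"holes": 2, "endpoints": 0},
    "C": {"holes": 0, "endpoints": 2},
}
target = {"O", "C"}
lower, upper = approximations(glyphs, ["holes", "endpoints"], target)
```

In this toy table "O" and "D" share the same attribute values, so they cannot be told apart: "O" is only in the upper approximation, while "C" is in both. Adding an attribute that separates "O" from "D" would make the two approximations coincide.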

BID Dataset: a challenge dataset for document processing tasks

10.5753/sibgrapi.est.2020.12997 ◽

2020 ◽

Author(s):

Álysson De Sá Soares ◽

Ricardo Batista Das Neves Junior ◽

Byron Leite Dantas Bezerra

Keyword(s):

Image Processing ◽

Data Privacy ◽

Document Image ◽

Research Development ◽

Online Systems ◽

Document Image Processing ◽

Identity Document ◽

Identification Documents ◽

Governmental Institutions ◽

Public Datasets

The digital relationship between companies and customers takes place through online systems in which consumers must upload pictures of their identification documents to prove their identities. The large volume of document images this produces encourages research into image processing systems that automate tasks usually performed by humans, such as Document Type Classification and Document Reading. The lack of public datasets of identification documents delays research in document image processing, because researchers must either seek partnerships with private or governmental institutions to obtain the data or build their own datasets. In this context, the main contributions of this work are a system to support the automatic creation of public identification document datasets and the Brazilian Identity Document Dataset (BID Dataset), the first public dataset of Brazilian identification documents. To comply with current personal data privacy law, all information in the BID Dataset comes from fake data. This work aims to speed up research in identification document image processing, since researchers will be able to use the BID Dataset freely in their work.

iDocChip - A Configurable Hardware Architecture for Historical Document Image Processing: Text Line Extraction

2019 International Conference on ReConFigurable Computing and FPGAs (ReConFig) ◽

10.1109/reconfig48160.2019.8994761 ◽

2019 ◽

Author(s):

Menbere Kina Tekleyohannes ◽

Vladimir Rybalkin ◽

Muhammad Mohsin Ghaffar ◽

Norbert Wehn ◽

Andreas Dengel

Keyword(s):

Image Processing ◽

Document Image ◽

Hardware Architecture ◽

Text Line ◽

Historical Document ◽

Line Extraction ◽

Document Image Processing ◽

Configurable Hardware ◽

Text Line Extraction

Hyperspectral document image processing: Applications, challenges and future prospects

Pattern Recognition ◽

10.1016/j.patcog.2019.01.026 ◽

2019 ◽

Vol 90 ◽

pp. 12-22 ◽

Cited By ~ 15

Author(s):

Rizwan Qureshi ◽

Muhammad Uzair ◽

Khurram Khurshid ◽

Hong Yan

Keyword(s):

Image Processing ◽

Document Image ◽

Future Prospects ◽

Document Image Processing

Document Image Processing for Scanning and Printing

10.1007/978-3-030-05342-0 ◽

2019 ◽

Author(s):

Ilia V. Safonov ◽

Ilya V. Kurilin ◽

Michael N. Rychagov ◽

Ekaterina V. Tolstaya

Keyword(s):

Image Processing ◽

Document Image ◽

Document Image Processing

10.3390/books978-3-03897-106-1 ◽

2018 ◽

Keyword(s):

Image Processing ◽

Document Image ◽

Document Image Processing

Journal of Imaging ◽

10.3390/jimaging4070084 ◽

2018 ◽

Vol 4 (7) ◽

pp. 84 ◽

Cited By ~ 2

Author(s):

Laurence Likforman-Sulem ◽

Ergina Kavallieratou

Keyword(s):

Image Processing ◽

Document Image ◽

Document Image Processing

Learning Surrogate Models of Document Image Quality Metrics for Automated Document Image Processing

2018 13th IAPR International Workshop on Document Analysis Systems (DAS) ◽

10.1109/das.2018.14 ◽

2018 ◽

Cited By ~ 2

Author(s):

Prashant Singh ◽

Ekta Vats ◽

Anders Hast

Keyword(s):

Image Processing ◽

Image Quality ◽

Surrogate Models ◽

Quality Metrics ◽

Document Image ◽

Document Image Processing ◽

Image Quality Metrics

libcrn, an Open-Source Document Image Processing Library

2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR) ◽

10.1109/icfhr.2016.0049 ◽

2016 ◽

Author(s):

Yann Leydier ◽

Jean Duong ◽

Stephane Bres ◽

Veronique Eglin ◽

Frank Lebourgeois ◽

...

Keyword(s):

Image Processing ◽

Open Source ◽

Document Image ◽

Document Image Processing ◽

Source Document

ESTIMATION AND CORRECTION OF MULTIPLE SKEW IN DOCUMENT IMAGE PROCESSING

International Journal of Advance Engineering and Research Development ◽

10.21090/ijaerd.030515 ◽

2016 ◽

Vol 3 (05) ◽

Cited By ~ 1

Keyword(s):

Image Processing ◽

Document Image ◽

Document Image Processing ◽

Estimation And Correction
