SpottingNet: Learning the Similarity of Word Images with Convolutional Neural Network for Word Spotting in Handwritten Historical Documents

Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a query-by-example word spotting framework for historical documents. We compare four word representation models: sequence alignment using dynamic time warping (DTW) as a baseline reference, a bag-of-visual-words approach as a statistical model, a pseudo-structural model based on a Loci features representation, and a structural approach in which words are represented by graphs. The four approaches have been tested on two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally perform better. However, large descriptors are difficult to deploy in a retrieval scenario where word spotting requires indexing millions of word images.
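As an illustration of the DTW baseline mentioned above, the following is a minimal sketch of query-by-example ranking with dynamic time warping. It assumes each word image has already been converted into a sequence of per-column feature vectors; the function names `dtw_distance` and `rank_by_dtw` and the toy features are hypothetical, and the paper's actual features and any path constraints are not reproduced here.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors.

    `a` and `b` are lists of equal-dimension tuples (assumed here to be
    one feature vector per image column).
    """
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def rank_by_dtw(query, candidates):
    """Return candidate indices sorted from most to least similar."""
    return sorted(range(len(candidates)),
                  key=lambda k: dtw_distance(query, candidates[k]))


# Toy usage: the second candidate is a time-stretched copy of the query.
query = [(0.0,), (1.0,), (2.0,)]
candidates = [
    [(5.0,), (5.0,), (5.0,)],
    [(0.0,), (1.0,), (1.0,), (2.0,)],
]
ranking = rank_by_dtw(query, candidates)
```

The quadratic cost per comparison is one reason alignment-based matching is usually treated as a baseline rather than as an indexing strategy for very large collections.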


PHOCNet: A Deep Convolutional Neural Network for Word Spotting in Handwritten Documents

2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR) ◽

10.1109/icfhr.2016.0060 ◽

2016 ◽

Cited By ~ 58

Author(s):

Sebastian Sudholt ◽

Gernot A. Fink

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Deep Convolutional Neural Network ◽

Word Spotting ◽

Handwritten Documents


Semantic Segmentation of Historical Documents via Fully-Convolutional Neural Network

Speech and Computer - Lecture Notes in Computer Science ◽

10.1007/978-3-030-26061-3_15 ◽

2019 ◽

pp. 142-149 ◽

Cited By ~ 1

Author(s):

Ivan Gruber ◽

Miroslav Hlaváč ◽

Marek Hrúz ◽

Miloš Železný

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Semantic Segmentation ◽

Historical Documents


CONVOLUTIONAL NEURAL NETWORK BASED ALGORITHM FOR CECUM ACHIEVEMENT CONFIRMATION

10.1055/s-0040-1705059 ◽

2020 ◽

Author(s):

S Kashin ◽

D Zavyalov ◽

A Rusakov ◽

V Khryashchev ◽

A Lebedev

Keyword(s):

Neural Network ◽

Convolutional Neural Network


No-Reference Utility Estimation with a Convolutional Neural Network

Electronic Imaging ◽

10.2352/issn.2470-1173.2018.09.iriacv-202 ◽

2018 ◽

Vol 2018 (9) ◽

pp. 202-1-202-6 ◽

Cited By ~ 2

Author(s):

Edward T. Scott ◽

Sheila S. Hemami

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Utility Estimation


Non-Blind Image Deconvolution Based on “Ringing” Removal Using Convolutional Neural Network

Electronic Imaging ◽

10.2352/issn.2470-1173.2020.10.ipas-180 ◽

2020 ◽

Vol 2020 (10) ◽

pp. 181-1-181-7

Author(s):

Takahiro Kudo ◽

Takanori Fujisawa ◽

Takuro Yamaguchi ◽

Masaaki Ikehara

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Network Architecture ◽

Large Scale ◽

Blind Deconvolution ◽

Training Dataset ◽

Image Deconvolution ◽

Classic Problem ◽

Key Points ◽

Blind Image

Image deconvolution has recently attracted considerable attention. There are two kinds of approaches: non-blind and blind. Non-blind deconvolution is a classic image deblurring problem, which assumes that the point spread function (PSF) is known and spatially invariant. Recently, convolutional neural networks (CNNs) have been used for non-blind deconvolution. Although CNNs can cope with complex variations in unseen images, some existing CNN-based methods can only handle small PSFs and do not consider the large PSFs that occur in the real world. In this paper we propose a non-blind deconvolution framework based on a CNN that can remove large-scale ringing in a deblurred image. Our method has three key points. First, our network architecture is able to preserve both large and small features in the image. Second, the training dataset is created so as to preserve details. Third, we extend the images to minimize the effects of large ringing at the image borders. In our experiments, we used three kinds of large PSFs and observed high-precision results from our method, both quantitatively and qualitatively.
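To make the non-blind setting concrete, here is a minimal one-dimensional sketch in which the PSF is known and spatially invariant. It uses classical Landweber iteration, not the CNN proposed in the abstract, and it assumes circular boundary handling and a non-negative PSF that sums to 1; the function names are hypothetical.

```python
def circular_blur(signal, psf):
    """Blur a 1-D signal with a known PSF, wrapping around at the borders."""
    n, half = len(signal), len(psf) // 2
    return [
        sum(psf[k] * signal[(i + k - half) % n] for k in range(len(psf)))
        for i in range(n)
    ]


def circular_blur_adjoint(signal, psf):
    """Adjoint (transpose) of circular_blur, needed by the gradient step."""
    n, half = len(signal), len(psf) // 2
    return [
        sum(psf[k] * signal[(j - k + half) % n] for k in range(len(psf)))
        for j in range(n)
    ]


def landweber_deconvolve(blurred, psf, iterations=50, step=1.0):
    """Non-blind deconvolution by Landweber iteration.

    Repeats x <- x + step * A^T (y - A x), where A is the known blur.
    A step of 1.0 is safe under the assumption that the PSF is
    non-negative and sums to 1.
    """
    estimate = list(blurred)  # start from the observed signal
    for _ in range(iterations):
        residual = [y - ax for y, ax in
                    zip(blurred, circular_blur(estimate, psf))]
        correction = circular_blur_adjoint(residual, psf)
        estimate = [x + step * c for x, c in zip(estimate, correction)]
    return estimate


# Toy usage: blur a step edge, then restore it with the known PSF.
sharp = [0.0] * 8 + [1.0] * 8
psf = [0.25, 0.5, 0.25]
observed = circular_blur(sharp, psf)
restored = landweber_deconvolve(observed, psf, iterations=100)
```

Frequencies that the PSF suppresses completely cannot be recovered this way, and with noisy observations classical inversions of this kind tend to produce ringing, which is the artifact the abstract's method targets.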
