Handwritten Text Line Recognition Using Deep Learning

Offline Handwritten Text Recognition Using Deep Learning: A Review

Journal of Physics: Conference Series ◽

10.1088/1742-6596/1848/1/012015 ◽

2021 ◽

Vol 1848 (1) ◽

pp. 012015

Author(s):

Yintong Wang ◽

Wenjie Xiao ◽

Shuo Li

Keyword(s):

Deep Learning ◽

Text Recognition ◽

Handwritten Text ◽

Handwritten Text Recognition

INTEGRATION OF n-GRAM LANGUAGE MODELS IN MULTIPLE CLASSIFIER SYSTEMS FOR OFFLINE HANDWRITTEN TEXT LINE RECOGNITION

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001408006855 ◽

2008 ◽

Vol 22 (07) ◽

pp. 1301-1321 ◽

Cited By ~ 2

Author(s):

ROMAN BERTOLAMI ◽

HORST BUNKE

Keyword(s):

Language Model ◽

Language Models ◽

Combination Method ◽

Text Line ◽

Multiple Classifier Systems ◽

Classifier Systems ◽

Handwritten Text ◽

Handwritten Text Recognition ◽

Multiple Classifier ◽

N Gram

Current multiple classifier systems for unconstrained handwritten text recognition do not provide a straightforward way to utilize language model information. In this paper, we describe a generic method to integrate a statistical n-gram language model into the combination of multiple offline handwritten text line recognizers. The proposed method first builds a word transition network and then rescores this network with an n-gram language model. Experimental evaluation conducted on a large dataset of offline handwritten text lines shows that the proposed approach improves the recognition accuracy over a reference system as well as over the original combination method that does not include a language model.
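
The rescoring step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the word transition network is a simple sequence of slots, each holding candidate words with recognizer log-scores, and it uses a toy bigram model in place of a trained n-gram model. The names `rescore`, `toy_bigram_logprob`, `BIGRAMS`, `NETWORK`, and the `lm_weight` parameter are hypothetical.

```python
import math

# Toy bigram probabilities (assumed values, for illustration only).
BIGRAMS = {
    ("<s>", "the"): 0.5,
    ("the", "cat"): 0.4,
    ("the", "cut"): 0.01,
    ("cat", "sat"): 0.5,
    ("cut", "sat"): 0.01,
}


def toy_bigram_logprob(prev, word):
    """Return log P(word | prev), with a small floor for unseen bigrams."""
    return math.log(BIGRAMS.get((prev, word), 1e-4))


def rescore(network, bigram_logprob, lm_weight=1.0):
    """Find the best word sequence through the network with a Viterbi search.

    network: list of slots; each slot is a list of (word, recognizer_log_score).
    bigram_logprob(prev, word): language model log-probability of word after prev.
    lm_weight: scaling factor applied to the language model score.
    """
    # best maps the last word of a partial path to (total score, path).
    best = {"<s>": (0.0, [])}
    for slot in network:
        new_best = {}
        for word, rec_score in slot:
            for prev, (score, path) in best.items():
                total = score + rec_score + lm_weight * bigram_logprob(prev, word)
                if word not in new_best or total > new_best[word][0]:
                    new_best[word] = (total, path + [word])
        best = new_best
    return max(best.values(), key=lambda entry: entry[0])[1]


# Hypothetical network: the recognizers slightly prefer "cut" over "cat".
NETWORK = [
    [("the", math.log(0.9)), ("she", math.log(0.1))],
    [("cut", math.log(0.55)), ("cat", math.log(0.45))],
    [("sat", math.log(0.8)), ("set", math.log(0.2))],
]

print(rescore(NETWORK, toy_bigram_logprob, lm_weight=0.0))  # recognizer scores only
print(rescore(NETWORK, toy_bigram_logprob))                 # with the language model
```

With `lm_weight=0.0` the search follows the recognizer scores alone and picks "cut"; with the bigram scores included, the path through "cat" wins because the toy model rates "the cat sat" as far more likely.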

Graphic line extraction as an important element of handwriting analysis

Issues of Forensic Science ◽

10.34836/pk.2016.293.4 ◽

2016 ◽

Vol 293 ◽

pp. 81-85

Author(s):

Mieczysław Goc ◽

Krystyn Łuszczuk ◽

Andrzej Łuszczuk ◽

...

Keyword(s):

Computer Application ◽

Text Line ◽

Handwriting Analysis ◽

Line Extraction ◽

Handwritten Text

The article presents the capabilities and operating procedures of EDYTOR, a computer application designed for easy separation of a handwritten text line from a background containing elements that interfere with the examined object. The application, developed by a team of specialists from the Polish Forensic Association, is mainly used in handwriting analysis.

Handwritten Text Recognition using Deep Learning with TensorFlow

International Journal of Engineering Research and ◽

10.17577/ijertv9is050534 ◽

2020 ◽

Vol 9 (05)

Author(s):

Sri. Yugandhar Manchala ◽

Jayaram Kinthali ◽

Kowshik Kotha ◽

Kanithi Santosh Kumar ◽

Jagilinki Jayalaxmi

Keyword(s):

Deep Learning ◽

Text Recognition ◽

Handwritten Text ◽

Handwritten Text Recognition

Unconstrained Handwritten Text Line Segmentation for Kannada Language

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.j9624.1081219 ◽

2019 ◽

Vol 8 (12) ◽

pp. 953-956

Keyword(s):

Character Recognition ◽

Recognition System ◽

Text Line ◽

Connected Component ◽

Horizontal Projection ◽

Text Documents ◽

Handwritten Text ◽

Kannada Language ◽

System Separation ◽

Line Segmentation

Segmentation is the division of something into smaller parts and is one of the components of a character recognition system. In segmentation, characters, words, and lines are separated from text documents. Character recognition is a process that allows computers to recognize written or printed characters, such as numbers or letters, and to convert them into a form that the computer can use. The accuracy of an OCR system is measured by taking the output of an OCR run for an image and comparing it to the original version of the same text. The main aim of this paper is to examine text line segmentation methods such as projection profiles and the Weighted Bucket Method. The proposed method applies the horizontal projection profile and connected component methods to handwritten Kannada text. These methods are used for experimentation, and their accuracy and results are then compared.
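
The horizontal projection profile named in this abstract can be sketched in a few lines. This is an illustrative sketch, not the paper's method: it assumes a binarized page given as a list of rows with 1 for ink and 0 for background, it does not cover the connected component step, and the names `horizontal_projection`, `segment_lines`, and `min_ink` are hypothetical.

```python
def horizontal_projection(image):
    """Count ink pixels in each row of a binary image (1 = ink, 0 = background)."""
    return [sum(row) for row in image]


def segment_lines(image, min_ink=1):
    """Return (start_row, end_row) pairs for text lines, end row exclusive.

    A text line is a maximal run of consecutive rows whose ink count
    is at least min_ink; rows below that threshold are treated as gaps.
    """
    profile = horizontal_projection(image)
    lines = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                   # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))    # the text line ends at a gap row
            start = None
    if start is not None:               # the line runs to the bottom edge
        lines.append((start, len(profile)))
    return lines


# Hypothetical 6x4 page with two text lines separated by blank rows.
PAGE = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]

print(horizontal_projection(PAGE))  # [0, 2, 3, 0, 0, 2]
print(segment_lines(PAGE))          # [(1, 3), (5, 6)]
```

A plain profile like this assumes that text lines are roughly horizontal and do not touch or overlap; skewed or touching handwritten lines need additional handling.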

Towards a scientific workflow featuring Natural Language Processing for the digitisation of natural history collections

Research Ideas and Outcomes ◽

10.3897/rio.6.e55789 ◽

2020 ◽

Vol 6 ◽

Cited By ~ 3

Author(s):

David Owen ◽

Laurence Livermore ◽

Quentin Groom ◽

Alex Hardisty ◽

Thijs Leegwater ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural History ◽

Natural Language ◽

Language Processing ◽

Scientific Workflow ◽

Entity Recognition ◽

Research Activities ◽

Handwritten Text ◽

Segmented Images

We describe an effective approach to automated text digitisation with respect to natural history specimen labels. These labels contain much useful data about the specimen including its collector, country of origin, and collection date. Our approach to automatically extracting these data takes the form of a pipeline. Recommendations are made for the pipeline's component parts based on some of the state-of-the-art technologies. Optical Character Recognition (OCR) can be used to digitise text on images of specimens. However, recognising text quickly and accurately from these images can be a challenge for OCR. We show that OCR performance can be improved by prior segmentation of specimen images into their component parts. This ensures that only text-bearing labels are submitted for OCR processing as opposed to whole specimen images, which inevitably contain non-textual information that may lead to false positive readings. In our testing Tesseract OCR version 4.0.0 offers promising text recognition accuracy with segmented images. Not all the text on specimen labels is printed. Handwritten text varies much more and does not conform to standard shapes and sizes of individual characters, which poses an additional challenge for OCR. Recently, deep learning has allowed for significant advances in this area. Google's Cloud Vision, which is based on deep learning, is trained on large-scale datasets, and is shown to be quite adept at this task. This may take us some way towards negating the need for humans to routinely transcribe handwritten text. Determining the countries and collectors of specimens has been the goal of previous automated text digitisation research activities. Our approach also focuses on these two pieces of information. An area of Natural Language Processing (NLP) known as Named Entity Recognition (NER) has matured enough to semi-automate this task. 
Our experiments demonstrated that existing approaches can accurately recognise location and person names within the text extracted from segmented images via Tesseract version 4.0.0. Potentially, NER could be used in conjunction with other online services, such as those of the Biodiversity Heritage Library to map the named entities to entities in the biodiversity literature (https://www.biodiversitylibrary.org/docs/api3.html). We have highlighted the main recommendations for potential pipeline components. The document also provides guidance on selecting appropriate software solutions. These include automatic language identification, terminology extraction, and integrating all pipeline components into a scientific workflow to automate the overall digitisation process.

Using Hidden Markov Models as a Tool for Handwritten Text Line Segmentation

Ninth International Conference on Document Analysis and Recognition (ICDAR 2007) Vol 2 ◽

10.1109/icdar.2007.4378666 ◽

2007 ◽

Cited By ~ 13

Author(s):

F. Luthy ◽

T. Varga ◽

H. Bunke

Keyword(s):

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Text Line ◽

Handwritten Text ◽

Text Line Segmentation ◽

Line Segmentation

Combining diverse on-line and off-line systems for handwritten text line recognition

Pattern Recognition ◽

10.1016/j.patcog.2008.10.030 ◽

2009 ◽

Vol 42 (12) ◽

pp. 3254-3263 ◽

Cited By ~ 11

Author(s):

Marcus Liwicki ◽

Horst Bunke

Keyword(s):

Text Line ◽

Handwritten Text ◽

On Line

Multiple Handwritten Text Line Recognition Systems Derived from Specific Integration of a Language Model

Eighth International Conference on Document Analysis and Recognition (ICDAR'05) ◽

10.1109/icdar.2005.167 ◽

2005 ◽

Cited By ~ 6

Author(s):

R. Bertolami ◽

H. Bunke

Keyword(s):

Language Model ◽

Text Line ◽

Handwritten Text ◽

Recognition Systems

Ensemble Methods to Improve the Performance of an English Handwritten Text Line Recognizer

Arabic and Chinese Handwriting Recognition - Lecture Notes in Computer Science ◽

10.1007/978-3-540-78199-8_16 ◽

2008 ◽

pp. 265-277 ◽

Cited By ~ 1

Author(s):

Roman Bertolami ◽

Horst Bunke

Keyword(s):

Ensemble Methods ◽

Text Line ◽

Handwritten Text
