Evaluation of Text Legibility in Alternative Imaging Approaches to Microfiche Digitization

2021 ◽  
Vol 2021 (1) ◽  
pp. 96-101
Author(s):  
Hilda Deborah ◽  
Dipendra J. Mandal

Microfiche was a common format for microform reproductions of documents and was used extensively for archival storage before the move to digital formats. While contemporary documents are still available for digitization, others from older historical periods are no longer physically accessible for various reasons. In some cases, their microfiche copies are available, making microfiche digitization a must. However, a microfiche reader is not always available and, even when it is, it is a machine made for reading and not for data collection. In this work, the performance of two imaging devices is evaluated as alternatives to the traditional microfiche reader, by means of optical character recognition (OCR). Results show that this alternative surpasses the performance of a microfiche reader in terms of text legibility.
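One way to make an OCR-based legibility comparison concrete is a character error rate (CER): the edit distance between the OCR output and a ground-truth transcription, divided by the transcription length. The abstract does not say which metric the authors used, so the sketch below is an assumption, and the device names and sample strings are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Lower is better; 0.0 means the OCR output matches the reference."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, ocr_output) / len(reference)


# Hypothetical OCR outputs for the same frame captured by two devices.
reference = "microfiche digitization"
readings = {
    "device_a": "microfiche digitization",
    "device_b": "rnicrofiche digitizat1on",
}
scores = {name: character_error_rate(reference, text)
          for name, text in readings.items()}
```

Under this assumed metric, the device with the lower score yields the more legible capture for that frame.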

2021 ◽  
Author(s):  
Michael Schwartz ◽  

Many companies have tried to automate data collection for handheld Digital Multimeters (DMMs) using Optical Character Recognition (OCR). Only recently have companies tried to perform this task using Artificial Intelligence (AI) technology, Cal Lab Solutions being one of them in 2020. But when we developed our first prototype application, we discovered how difficult it is to get a good value with every measurement and test point. A year later, with lessons learned and better software in hand, this paper continues that AI project. In Beta-1, we learned the difficulties of AI reading segmented displays. There are no pre-trained models for this type of display, so we needed to train a model. This required the testing of thousands of images, so we changed the scope of the project to a continual learning AI project. This paper will cover how we built our continuous learning AI model to show how any lab with a webcam can start automating those handheld DMMs with software that gets smarter over time.


2021 ◽  
pp. 56-61
Author(s):  
Rachna Tewani ◽  
...  

In today's world, everything is being digitized, and data scanning tools and photography are in widespread use. When we have a lot of image data, it becomes important to accumulate the data in a form that is useful for the company or organization. Doing this manually is tedious and time-consuming. Hence, to simplify the job, we have developed a Flask API that takes an image folder as an object and returns an Excel sheet of relevant data from the image data. We have used optical character recognition and software like pytesseract to extract data from images. Further in the process, we have used natural language processing, and finally, we have found relevant data using the glob and regex modules. This model is helpful for collecting data from registration certificates, letting us easily store data like chassis number, owner name, car number, etc., and it can be applied to Aadhaar cards and PAN cards.
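The folder-to-table step described above could look roughly like the following sketch. It is not the authors' implementation: the field patterns, the function names, and the sample text are hypothetical, and the OCR step is passed in as a function so the example runs without an OCR engine installed.

```python
import glob
import os
import re
from typing import Callable, Dict, List

# Hypothetical field patterns; real certificate layouts will differ.
FIELD_PATTERNS = {
    "chassis_number": re.compile(
        r"Chassis\s*No\.?\s*[:\-]?\s*([A-Z0-9]{17})", re.IGNORECASE),
    "owner_name": re.compile(
        r"Owner\s*Name\s*[:\-]?\s*([A-Za-z .]+)", re.IGNORECASE),
    "car_number": re.compile(
        r"Reg(?:istration)?\.?\s*No\.?\s*[:\-]?\s*([A-Z]{2}\d{2}[A-Z]{1,2}\d{4})",
        re.IGNORECASE),
}


def extract_fields(text: str) -> Dict[str, str]:
    """Pull each field out of OCR text; missing fields become empty strings."""
    record = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        record[field] = match.group(1).strip() if match else ""
    return record


def process_folder(folder: str, ocr: Callable[[str], str]) -> List[Dict[str, str]]:
    """Run the supplied OCR function on every PNG in a folder, one row per image."""
    rows = []
    for path in sorted(glob.glob(os.path.join(folder, "*.png"))):
        row = {"file": os.path.basename(path)}
        row.update(extract_fields(ocr(path)))
        rows.append(row)
    return rows


# Hypothetical OCR text standing in for one scanned certificate.
sample_text = "Owner Name: A. Kumar\nChassis No: MA1AB2CD3EF456789\nReg. No: DL01AB1234"
sample_record = extract_fields(sample_text)
```

With pytesseract and Pillow installed, the `ocr` argument could be something like `lambda p: pytesseract.image_to_string(Image.open(p))`, and the resulting rows could be written to a spreadsheet with a library such as pandas. Both choices are assumptions here, not details given in the abstract.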


1997 ◽  
Vol 9 (1-3) ◽  
pp. 58-77
Author(s):  
Vitaly Kliatskine ◽  
Eugene Shchepin ◽  
Gunnar Thorvaldsen ◽  
Konstantin Zingerman ◽  
Valery Lazarev

In principle, printed source material should be made machine-readable with systems for Optical Character Recognition, rather than being typed once more. Off-the-shelf commercial OCR programs tend, however, to be inadequate for lists with a complex layout. The tax assessment lists that assess most nineteenth-century farms in Norway constitute one example among a series of valuable sources which can only be interpreted successfully with specially designed OCR software. This paper considers the problems involved in the recognition of material with a complex table structure, outlining a new algorithmic model based on ‘linked hierarchies’. Within the scope of this model, a variety of tables and layouts can be described and recognized. The ‘linked hierarchies’ model has been implemented in the ‘CRIPT’ OCR software system, which successfully reads tables with a complex structure from several different historical sources.


2020 ◽  
Vol 2020 (1) ◽  
pp. 78-81
Author(s):  
Simone Zini ◽  
Simone Bianco ◽  
Raimondo Schettini

Rain removal from pictures taken under bad weather conditions is a challenging task that aims to improve the overall quality and visibility of a scene. The enhanced images usually constitute the input for subsequent Computer Vision tasks such as detection and classification. In this paper, we present a Convolutional Neural Network, based on the Pix2Pix model, for rain streak removal from images, with specific interest in evaluating the results of the processing operation with respect to the Optical Character Recognition (OCR) task. In particular, we present a way to generate a rainy version of the Street View Text Dataset (R-SVTD) for "text detection and recognition" evaluation in bad weather conditions. Experimental results on this dataset show that our model is able to outperform the state of the art in terms of two commonly used image quality metrics, and that it is capable of improving the performance of an OCR model in detecting and recognising text in the wild.


2014 ◽  
Vol 6 (1) ◽  
pp. 36-39
Author(s):  
Kevin Purwito

This paper describes one of the many extensions of Optical Character Recognition (OCR): Optical Music Recognition (OMR). OMR is used to convert sheet music into a digital format, such as MIDI or MusicXML. Many musical symbols are commonly used in sheet music and therefore need to be recognized by OMR, such as staff; treble, bass, alto and tenor clef; sharp, flat and natural; beams, staccato, staccatissimo, dynamic, tenuto, marcato, stopped note, harmonic and fermata; notes; rests; ties and slurs; and also mordent and turn. OMR usually has four main processes, namely Preprocessing, Music Symbol Recognition, Musical Notation Reconstruction and Final Representation Construction. Each of these four processes uses different methods and algorithms, and each still needs further development and research. Many applications already use OMR, but none gives a perfect result. Therefore, besides development and research on each OMR process, there is also a need for development and research on a combined recognizer, which combines the results from different OMR applications to increase the accuracy of the final result. Index Terms: music, optical character recognition, optical music recognition, musical symbol, image processing, combined recognizer
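The combined-recognizer idea can be illustrated with a simple per-symbol majority vote. The abstract does not specify how results are combined, so this is only a sketch under two assumptions: each OMR application emits a sequence of symbol labels, and those sequences have already been aligned to the same length. The symbol labels and recognizer names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def combine_by_majority(outputs: Sequence[Sequence[str]]) -> List[str]:
    """Pick, at each position, the symbol most recognizers agree on."""
    if not outputs:
        return []
    length = len(outputs[0])
    if any(len(seq) != length for seq in outputs):
        raise ValueError("all recognizer outputs must be aligned to the same length")
    combined = []
    for symbols in zip(*outputs):
        # On a tie, the symbol seen first (earliest recognizer) wins.
        winner, _ = Counter(symbols).most_common(1)[0]
        combined.append(winner)
    return combined


# Hypothetical outputs from three OMR applications for the same bar.
recognizer_a = ["treble_clef", "C4_quarter", "D4_quarter", "rest_quarter"]
recognizer_b = ["treble_clef", "C4_quarter", "D4_eighth", "rest_quarter"]
recognizer_c = ["bass_clef", "C4_quarter", "D4_quarter", "rest_quarter"]

combined = combine_by_majority([recognizer_a, recognizer_b, recognizer_c])
```

Each recognizer makes at most one error here, and the vote recovers a sequence that no single error corrupts. Real outputs would first need an alignment step, which this sketch leaves out.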

