Recognition of Images Formed in Photographs on the Eyes of Different Subjects

2021 · Vol 11 (4) · pp. 3023-3029
Author(s): Muhammad Junaid, Luqman Shah, Ali Imran Jehangiri, Fahad Ali Khan, Yousaf Saeed, ...

With each passing day, the resolutions of still-image and video cameras rise. This improvement makes it possible to extract useful information about the scene facing the photographed subjects from their reflective surfaces. Especially important is the idea of capturing the images formed on the eyes of photographed people and animals. The motivation behind this research is to explore the forensic value of images and videos by analyzing the reflections of the scene behind the camera. This analysis may include extraction, detection, and recognition of objects that are in front of the subjects but behind the camera. In the national context such videos and photographs are not rare; specifically, an abductee's video footage at a good resolution may give important clues to the identity of the kidnapper. Our aim is to extract the visual information formed in human eyes from still images as well as from video clips, and then to recognize that extracted information. Initially our experiments are limited to character extraction and recognition, covering computerized characters of different styles and font sizes as well as handwritten ones. Although a variety of Optical Character Recognition (OCR) tools are available for character extraction and recognition, they only give reliable results on clear (zoomed) images.
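
A minimal sketch of the kind of pipeline described above, assuming OpenCV's bundled Haar eye cascade and the pytesseract wrapper for Tesseract; the authors specify nothing beyond the use of OCR, so every call here is illustrative:

```python
# Sketch: locate eyes, enlarge the tiny reflection region, then OCR it.
# Assumptions: OpenCV's stock haarcascade_eye.xml and pytesseract, not
# the authors' actual pipeline.
import cv2
import pytesseract

def read_text_from_eye_reflections(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    eye_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_eye.xml")
    results = []
    for (x, y, w, h) in eye_cascade.detectMultiScale(gray, 1.1, 5):
        # Crop the eye region and upscale it: reflections occupy only a
        # few pixels, and OCR engines need reasonably large glyphs.
        eye = gray[y:y + h, x:x + w]
        eye = cv2.resize(eye, None, fx=8, fy=8,
                         interpolation=cv2.INTER_CUBIC)
        # Boost local contrast, then binarize before recognition.
        eye = cv2.equalizeHist(eye)
        _, binary = cv2.threshold(eye, 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # --psm 7 treats the crop as a single line of text.
        results.append(pytesseract.image_to_string(binary, config="--psm 7"))
    return results
```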

2020 · Vol 10 (8) · pp. 2794
Author(s): Uduak Edet, Daniel Mann

A study was conducted to determine the visual requirements of a remote supervisor of an autonomous sprayer. Observation of a sprayer operator identified nine distinct “look zones” that occupied his visual attention, with 39% of his time spent viewing the look zone ahead of the sprayer. While the operator was being observed, additional GoPro cameras recorded video of the sprayer in operation from ten distinct perspectives (some look zones were visible from the operator’s seat, while others were selected to show other regions of the sprayer that might interest an operator). In a subsequent laboratory study, 29 experienced sprayer operators were recruited to view and comment on clips selected from the video footage collected during the initial ride-along. Only the two views from the perspective of the operator’s seat were rated highly as providing important information, even though participants were able to identify relevant information in all ten clips. Generally, participants used the clips to obtain information about boom status, the location and movement of the sprayer within the field, weather conditions (especially wind), obstacles to be avoided, crop conditions, and field conditions. Sprayer operators with more than 15 years of experience provided more insightful descriptions of the clips than their less experienced peers. Designers can influence which features the user will perceive by positioning the camera so that those features are prominent in its field of view. Overall, experienced sprayer operators preferred the concept of presenting visual information on an automation interface as live video rather than as a graphical display of icons or symbols.


Author(s): Jiapeng Wang, Tianwei Wang, Guozhi Tang, Lianwen Jin, Weihong Ma, ...

Visual information extraction (VIE) has attracted increasing attention in recent years. Existing methods usually first organize optical character recognition (OCR) results into plain text and then use token-level category annotations as supervision to train a sequence tagging model. However, this approach incurs high annotation costs and may suffer from label confusion, and OCR errors also significantly affect the final performance. In this paper, we propose a unified weakly-supervised learning framework called TCPNet (Tag, Copy or Predict Network), which introduces 1) an efficient encoder that simultaneously models the semantic and layout information in 2D OCR results; 2) a weakly-supervised training method that utilizes only sequence-level supervision; and 3) a flexible and switchable decoder with two inference modes: one (Copy or Predict Mode) outputs key information sequences of different categories by copying a token from the input or predicting one at each time step, and the other (Tag Mode) directly tags the input sequence in a single forward pass. Our method achieves new state-of-the-art performance on several public benchmarks, demonstrating its effectiveness.
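
The Copy or Predict Mode resembles pointer-generator decoding: at each step the decoder mixes a vocabulary ("predict") distribution with an attention-based ("copy") distribution over the input tokens. The sketch below shows one such decoding step with assumed tensor names, shapes, and a simple dot-product attention; TCPNet's actual decoder is not reproduced here:

```python
# Pointer-generator-style "copy or predict" step; a sketch only, with
# assumed shapes -- not TCPNet's actual architecture.
import torch
import torch.nn.functional as F

def copy_or_predict_step(dec_state, enc_states, src_token_ids, W_vocab, w_gate):
    # dec_state:     (B, H)    decoder hidden state at this time step
    # enc_states:    (B, T, H) encoder states for the T input (OCR) tokens
    # src_token_ids: (B, T)    vocabulary ids of those input tokens
    # W_vocab: (H, V) and w_gate: (H, 1) are learned projections.
    # Dot-product attention over the input doubles as the copy distribution.
    attn_logits = torch.bmm(enc_states, dec_state.unsqueeze(2)).squeeze(2)
    copy_dist = F.softmax(attn_logits, dim=1)            # (B, T)
    # "Predict" branch: a distribution over the whole vocabulary.
    vocab_dist = F.softmax(dec_state @ W_vocab, dim=1)   # (B, V)
    # Gate choosing between copying an input token and predicting one.
    p_copy = torch.sigmoid(dec_state @ w_gate)           # (B, 1)
    # Scatter copy probabilities onto their vocabulary ids and mix.
    mixed = (1.0 - p_copy) * vocab_dist
    mixed = mixed.scatter_add(1, src_token_ids, p_copy * copy_dist)
    return mixed                                         # (B, V)
```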


1997 · Vol 9 (1-3) · pp. 58-77
Author(s): Vitaly Kliatskine, Eugene Shchepin, Gunnar Thorvaldsen, Konstantin Zingerman, Valery Lazarev

In principle, printed source material should be made machine-readable with systems for Optical Character Recognition, rather than being typed once more. Off-the-shelf commercial OCR programs tend, however, to be inadequate for lists with a complex layout. The tax assessment lists that assess most nineteenth-century farms in Norway constitute one example among a series of valuable sources which can only be interpreted successfully with specially designed OCR software. This paper considers the problems involved in the recognition of material with a complex table structure, outlining a new algorithmic model based on ‘linked hierarchies’. Within the scope of this model, a variety of tables and layouts can be described and recognized. The ‘linked hierarchies’ model has been implemented in the ‘CRIPT’ OCR software system, which successfully reads tables with a complex structure from several different historical sources.
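
The abstract does not detail the ‘linked hierarchies’ model, but one minimal reading is a pair of layout hierarchies (rows and columns) whose leaves are linked through recognized cells, so a table can be traversed either way. The sketch below is purely illustrative of that reading, not of CRIPT's internals:

```python
# Illustrative sketch only: an assumed reading of 'linked hierarchies',
# with row and column hierarchies linked via shared cells.
from dataclasses import dataclass, field

@dataclass
class Node:
    """A node in one layout hierarchy (e.g. a column group or a row)."""
    label: str
    children: list = field(default_factory=list)

@dataclass
class Cell:
    """A recognized text fragment linked into BOTH hierarchies."""
    text: str
    row: Node
    column: Node

# A tax list might have a nested column hierarchy ("owner" -> "name",
# "address") and a flat row hierarchy; each OCR'd fragment becomes a
# Cell linking its row leaf and column leaf.
owner = Node("owner", [Node("name"), Node("address")])
row1 = Node("farm 1")
cell = Cell("Ola Nordmann", row=row1, column=owner.children[0])
```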


2020 · Vol 2020 (1) · pp. 78-81
Author(s): Simone Zini, Simone Bianco, Raimondo Schettini

Rain removal from pictures taken under bad weather conditions is a challenging task that aims to improve the overall quality and visibility of a scene. The enhanced images usually constitute the input for subsequent computer vision tasks such as detection and classification. In this paper, we present a Convolutional Neural Network, based on the Pix2Pix model, for removing rain streaks from images, with a specific interest in evaluating the results of the processing with respect to the Optical Character Recognition (OCR) task. In particular, we present a way to generate a rainy version of the Street View Text Dataset (R-SVTD) for evaluating text detection and recognition in bad weather conditions. Experimental results on this dataset show that our model outperforms the state of the art in terms of two commonly used image quality metrics, and that it is capable of improving the performance of an OCR model at detecting and recognising text in the wild.
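
A minimal sketch of a Pix2Pix-style generator update for deraining, assuming a paired dataset such as R-SVTD and the standard conditional-GAN objective (adversarial loss plus weighted L1); the authors' exact architecture and hyperparameters are not given in the abstract:

```python
# Sketch of one Pix2Pix-style generator training step for deraining.
# G and D are assumed generator/discriminator modules; the paper's
# actual networks and settings are not reproduced here.
import torch
import torch.nn.functional as F

def generator_step(G, D, opt_G, rainy, clean, lambda_l1=100.0):
    # rainy, clean: (B, 3, H, W) paired images (e.g. from R-SVTD).
    fake = G(rainy)
    # Adversarial term: the discriminator sees (condition, output) pairs.
    pred_fake = D(torch.cat([rainy, fake], dim=1))
    loss_gan = F.binary_cross_entropy_with_logits(
        pred_fake, torch.ones_like(pred_fake))
    # The L1 term keeps the derained image close to the ground truth,
    # which is what preserves the thin strokes OCR depends on.
    loss_l1 = F.l1_loss(fake, clean)
    loss = loss_gan + lambda_l1 * loss_l1
    opt_G.zero_grad()
    loss.backward()
    opt_G.step()
    return loss.item()
```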


2014 · Vol 6 (1) · pp. 36-39
Author(s): Kevin Purwito

This paper describes one of the many extensions of Optical Character Recognition (OCR): Optical Music Recognition (OMR). OMR is used to recognize musical sheets and convert them into a digital format such as MIDI or MusicXML. Many musical symbols commonly used in musical sheets need to be recognized by OMR, such as the staff; the treble, bass, alto and tenor clefs; sharps, flats and naturals; beams, staccato, staccatissimo, dynamics, tenuto, marcato, stopped notes, harmonics and fermatas; notes; rests; ties and slurs; and also mordents and turns. OMR usually comprises four main processes, namely Preprocessing, Music Symbol Recognition, Musical Notation Reconstruction and Final Representation Construction. Each of these four processes uses different methods and algorithms, and each still needs further development and research. Many applications already use OMR, but none gives perfect results. Therefore, besides the development of and research into each OMR process, there is also a need for research into a combined recognizer, which combines the results from different OMR applications to increase the accuracy of the final result. Index Terms—Music, optical character recognition, optical music recognition, musical symbol, image processing, combined recognizer
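
The proposed combined recognizer is not specified further; one minimal reading is symbol-level majority voting over the aligned outputs of several OMR systems, sketched here under that assumption:

```python
# Sketch of the 'combined recognizer' idea as simple majority voting.
# Assumption: the OMR outputs are already aligned position by position.
from collections import Counter

def combine_recognizers(outputs):
    """outputs: list of symbol sequences, one per OMR system."""
    combined = []
    for symbols in zip(*outputs):
        # Keep the symbol most recognizers agree on at this position.
        combined.append(Counter(symbols).most_common(1)[0][0])
    return combined

# Example: three OMR systems disagree on one note's duration.
print(combine_recognizers([
    ["treble_clef", "C4_quarter", "D4_quarter"],
    ["treble_clef", "C4_quarter", "D4_eighth"],
    ["treble_clef", "C4_quarter", "D4_quarter"],
]))  # -> ['treble_clef', 'C4_quarter', 'D4_quarter']
```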

