Algorithm for choosing the best frame in a video stream in the task of identity document recognition

During document recognition in a video stream captured with a mobile device camera, the image quality of the document varies greatly from frame to frame. Sometimes the recognition system is required not only to recognize all the specified attributes of the document, but also to select a final document image of the best quality. This is necessary, for example, for archiving or for providing various services; in some countries it may be required by law. In this case, the recognition system needs to assess the quality of frames in the video stream and choose the “best” frame. In this paper we consider a solution to this problem in which the “best” frame is one where all specified attributes are present in the document image in readable form. The method was tuned on a private dataset and then tested on documents from the open MIDV-2019 dataset. The result obtained is practically applicable in recognition systems.
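The selection criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes, purely for illustration, that each frame comes with a per-attribute readability confidence in [0, 1], and it scores a frame by its weakest required attribute. The names `frame_score` and `choose_best_frame` and the confidence values are hypothetical.

```python
from typing import Dict, List, Optional


def frame_score(field_confidences: Dict[str, float], required_fields: List[str]) -> float:
    """Score a frame by its weakest required attribute.

    A missing attribute counts as confidence 0.0, so a frame that lacks
    any required attribute scores 0.0.
    """
    return min(field_confidences.get(name, 0.0) for name in required_fields)


def choose_best_frame(
    frames: List[Dict[str, float]], required_fields: List[str]
) -> Optional[int]:
    """Return the index of the highest-scoring frame.

    Returns None when no frame has every required attribute with a
    positive confidence. `required_fields` is assumed to be non-empty.
    """
    best_index: Optional[int] = None
    best_score = 0.0
    for index, confidences in enumerate(frames):
        score = frame_score(confidences, required_fields)
        if score > best_score:
            best_index, best_score = index, score
    return best_index


# Hypothetical per-frame confidences for two required attributes.
frames = [
    {"name": 0.9, "number": 0.2},   # "number" is barely readable
    {"name": 0.7, "number": 0.6},   # every attribute is reasonably readable
    {"name": 0.95},                 # "number" is missing entirely
]
best = choose_best_frame(frames, ["name", "number"])
```

Taking the minimum rather than the mean reflects the requirement that all specified attributes be readable: one unreadable attribute disqualifies an otherwise sharp frame.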

Blind Quality Assessment of Iris Images Acquired in Visible Light for Biometric Recognition

Sensors ◽

10.3390/s20051308 ◽

2020 ◽

Vol 20 (5) ◽

pp. 1308 ◽

Cited By ~ 1

Author(s):

Mohsen Jenadeleh ◽

Marius Pedersen ◽

Dietmar Saupe

Keyword(s):

Image Quality ◽

Visible Light ◽

Iris Recognition ◽

Recognition Performance ◽

Rejection Rate ◽

Recognition System ◽

Poor Quality ◽

Iris Image ◽

Recognition Systems

Image quality is a key issue affecting the performance of biometric systems. Ensuring the quality of iris images acquired in unconstrained imaging conditions in visible light poses many challenges to iris recognition systems. Poor-quality iris images increase the false rejection rate and decrease the performance of the systems by quality filtering. Methods that can accurately predict iris image quality can improve the efficiency of quality-control protocols in iris recognition systems. We propose a fast blind/no-reference metric for predicting iris image quality. The proposed metric is based on statistical features of the sign and the magnitude of local image intensities. The experiments, conducted with a reference iris recognition system and three datasets of iris images acquired in visible light, showed that the quality of iris images strongly affects the recognition performance and is highly correlated with the iris matching scores. Rejecting poor-quality iris images improved the performance of the iris recognition system. In addition, we analyzed the effect of iris image quality on the accuracy of the iris segmentation module in the iris recognition system.
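As a rough illustration of what statistics of the sign and magnitude of local image intensities can look like, the sketch below computes two simple values from differences between neighbouring pixels. It is not the metric proposed in the paper; the choice of right and down neighbours, the two statistics, and the function name `sign_magnitude_features` are assumptions made for this example.

```python
from typing import List, Tuple


def sign_magnitude_features(image: List[List[float]]) -> Tuple[float, float]:
    """Return (fraction of non-negative differences, mean absolute difference).

    Differences are taken between each pixel and its right and lower
    neighbours. `image` is assumed to be a non-empty rectangular grid of
    grey-level intensities.
    """
    rows, cols = len(image), len(image[0])
    non_negative = 0
    total_magnitude = 0.0
    count = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols:
                    diff = image[nr][nc] - image[r][c]
                    non_negative += diff >= 0       # sign component
                    total_magnitude += abs(diff)    # magnitude component
                    count += 1
    if count == 0:
        raise ValueError("image needs at least two pixels")
    return non_negative / count, total_magnitude / count


flat = sign_magnitude_features([[5, 5], [5, 5]])
checker = sign_magnitude_features([[0, 10], [10, 0]])
```

A blurred or flat region yields a small mean magnitude, while a textured region yields a larger one; a real no-reference metric would feed richer statistics of this kind into a trained quality predictor.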

Arabic Cursive Text Recognition from Natural Scene Images

Applied Sciences ◽

10.3390/app9020236 ◽

2019 ◽

Vol 9 (2) ◽

pp. 236 ◽

Cited By ~ 6

Author(s):

Saad Ahmed ◽

Saeeda Naz ◽

Muhammad Razzak ◽

Rubiyah Yusof

Keyword(s):

Recognition System ◽

Document Image ◽

Text Recognition ◽

Chinese Script ◽

Challenging Problem ◽

Future Directions ◽

Scene Text ◽

Comprehensive Survey ◽

Recognition Systems ◽

Scene Text Recognition

This paper presents a comprehensive survey on Arabic cursive scene text recognition. Recent publications in this field show a shift in the interest of document image analysis researchers from recognition of optical characters to recognition of characters appearing in natural images. Scene text recognition is a challenging problem because the text varies in font style, size, alignment, orientation, reflection, illumination, blurriness and background complexity. Among cursive scripts, Arabic scene text recognition is considered a more challenging problem due to joined writing, variations of the same character, a large number of ligatures, the number of baselines, etc. Surveys on Latin and Chinese script-based scene text recognition systems can be found, but the Arabic-like scene text recognition problem is yet to be addressed in detail. In this manuscript, a description is provided to highlight some of the latest techniques presented for text classification. The presented techniques following a deep learning architecture are equally suitable for the development of Arabic cursive scene text recognition systems. The issues pertaining to text localization and feature extraction are also presented. Moreover, this article emphasizes the importance of having a benchmark cursive scene text dataset. Based on the discussion, future directions are outlined, some of which may give researchers insight into cursive scene text.

Weighted combination of per-frame recognition results for text recognition in a video stream

Computer Optics ◽

10.18287/2412-6179-co-795 ◽

2021 ◽

Vol 45 (1) ◽

pp. 77-89

Author(s):

O. Petrova ◽

K. Bulatov ◽

V.V. Arlazarov ◽

V.L. Arlazarov

Keyword(s):

Video Stream ◽

Input Image ◽

Document Image ◽

Text Recognition ◽

Weighting Method ◽

Document Recognition ◽

Perspective Distortion ◽

Character Weighting ◽

Specialized Equipment ◽

Weighted Combination

The range of uses of automated document recognition has expanded and, as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of interest. However, it is not always possible to ensure controlled capturing conditions and, consequently, high quality of input images. Unlike specialized scanners, mobile cameras allow a video stream to be used as input, thus providing several images of the recognized object captured with various characteristics. In this case, the problem of combining the information from multiple input frames arises. In this paper, we propose a weighting model for the process of combining the per-frame recognition results, two approaches to the weighted combination of the text recognition results, and two weighting criteria. The effectiveness of the proposed approaches is tested using datasets of identity documents captured with a mobile device camera in different conditions, including perspective distortion of the document image and low lighting conditions. The experimental results show that the weighted combination can improve the quality of the text recognition result in the video stream, and that the per-character weighting method with input image focus estimation as a base criterion achieves the best results on the datasets analyzed.
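The idea of weighting per-frame results can be sketched as a weighted vote at each character position. This is a simplified illustration, not the model proposed in the paper: it assumes the per-frame strings are already aligned to equal length and that each frame carries a single weight (for example, a focus estimate). The function name `combine_weighted` and the sample weights are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def combine_weighted(results: List[Tuple[str, float]]) -> str:
    """Combine per-frame text results by weighted voting per character position.

    `results` holds (recognized_text, frame_weight) pairs. All texts must
    have the same length; real systems would first align strings of
    different lengths.
    """
    if not results:
        return ""
    length = len(results[0][0])
    if any(len(text) != length for text, _ in results):
        raise ValueError("this sketch assumes pre-aligned strings of equal length")
    combined = []
    for position in range(length):
        votes: Dict[str, float] = defaultdict(float)
        for text, weight in results:
            votes[text[position]] += weight  # each frame votes with its weight
        combined.append(max(votes, key=votes.get))
    return "".join(combined)


# Hypothetical per-frame results with focus-based weights.
merged = combine_weighted([("IVAN0V", 0.2), ("IVANOV", 0.9), ("1VANOV", 0.3)])
```

Here the blurry frames misread single characters, but the sharper frame carries enough weight at each position to outvote them.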

MIDV-500: a dataset for identity document analysis and recognition on mobile devices in video stream

Computer Optics ◽

10.18287/2412-6179-2019-43-5-818-824 ◽

2019 ◽

Vol 43 (5) ◽

pp. 818-824 ◽

Cited By ~ 7

Author(s):

V.V. Arlazarov ◽

K. Bulatov ◽

T. Chernov ◽

V.L. Arlazarov

Keyword(s):

Mobile Devices ◽

Face Detection ◽

Data Extraction ◽

Personal Data ◽

Ground Truth ◽

Document Analysis ◽

Video Stream ◽

Text Line ◽

Document Recognition ◽

Identity Document

A lot of research has been devoted to identity document analysis and recognition on mobile devices. However, no publicly available datasets designed for this particular problem currently exist. There are a few datasets that are useful for associated subtasks, but more specialized datasets are required to facilitate a more comprehensive scientific and technical approach to identity document recognition. In this paper we present a Mobile Identity Document Video dataset (MIDV-500) consisting of 500 video clips for 50 different identity document types with ground truth, which enables research on a wide range of document analysis problems. The paper presents characteristics of the dataset and evaluation results for existing methods of face detection, text line recognition, and document field data extraction. Since an important feature of identity documents is their sensitivity, as they contain personal data, all source document images used in MIDV-500 are either in the public domain or distributed under public copyright licenses. The main goal of this paper is to present a dataset. However, in addition and as a baseline, we present evaluation results for existing methods for face detection, text line recognition, and document data extraction, using the presented dataset.

The Challenges of Developing a Living Arabic Phonetic Dictionary for Speech Recognition System: A Literature Review

Advanced Journal of Social Science ◽

10.21467/ajss.8.1.164-170 ◽

2021 ◽

Vol 8 (1) ◽

pp. 164-170

Author(s):

Mohammad Husam Alhumsi ◽

Saleh Belhassen

Keyword(s):

Speech Recognition ◽

Literature Review ◽

Recognition System ◽

Speech Recognition System ◽

System A ◽

Normal Human ◽

Recognition Systems ◽

Translation Systems ◽

Arabic Speech Recognition

Phonetic dictionaries are regarded as pivotal components of speech recognition systems. The aim of speech recognition research is to build a machine that accurately identifies normal human speech and distinguishes it from that of any other speaker. The literature affirms that Arabic phonetics is one of the major problems in Arabic speech recognition. Therefore, this paper reviews previous studies tackling the challenges faced in creating an Arabic phonetic dictionary for Arabic speech recognition. It has been found that speech recognition systems have investigated areas of difference concerning Arabic phonetics. In addition, an Arabic phonetic dictionary should be created in which the Arabic vowels’ phonemes are considered a component of the consonants’ phonemes. Thus, the incorporation of developed machine translation systems may enhance the quality of the system. The paper concludes with the existing challenges facing an Arabic phonetic dictionary.

Image quality assessment for video stream recognition systems

Tenth International Conference on Machine Vision (ICMV 2017) ◽

10.1117/12.2309628 ◽

2018 ◽

Cited By ~ 1

Author(s):

Nikita Razumnuy ◽

Alexander Kozharinov ◽

Vladimir Arlazarov ◽

Dmitry P. Nikolaev ◽

Timofey Chernov

Keyword(s):

Image Quality ◽

Quality Assessment ◽

Image Quality Assessment ◽

Video Stream ◽

Recognition Systems

Comparative analysis of face recognition systems built using standard architecture blocks

Telecom IT ◽

10.31854/2307-1303-2020-8-3-94-101 ◽

2020 ◽

Vol 8 (3) ◽

pp. 94-101

Author(s):

E. Kalyashov

Keyword(s):

Face Recognition ◽

Comparative Analysis ◽

Original Data ◽

Recognition System ◽

Practical Relevance ◽

Research Subject ◽

Face Recognition System ◽

Recognition Quality ◽

Recognition Systems

Research subject. The article reviews ways of constructing face recognition systems based on standard modules. Method. The study is based on a comparison of the performance and recognition quality of various pipelines. Core results. The recognition quality achieved and its dependence on the type of original data are presented. Practical relevance. The results could be used when implementing various face recognition system pipelines.

Evaluation and image quality comparison of ultra-thin fibre endoscopes for vascular endoscopy

Current Directions in Biomedical Engineering ◽

10.1515/cdbme-2017-0048 ◽

2017 ◽

Vol 3 (2) ◽

pp. 231-233

Author(s):

Axel Boese ◽

Akhil Karthasseril Sivankutty ◽

Michael Friebe

Keyword(s):

Image Quality ◽

Diagnostic Value ◽

Medical Applications ◽

Test Set ◽

Vascular Endoscopy ◽

Set Up ◽

Different Diameters ◽

Test Objects

Medical applications like vascular endoscopy can provide additional diagnostic value for future procedures [1]. For these applications an optimal compromise between image quality and fibre flexibility has to be identified. Image quality of endoscopes is normally estimated using flat test objects or charts. For the application of vascular endoscopy a tubular test setup seems to be more beneficial. We compare three fibre endoscopes with different diameters and numbers of integrated fibres in terms of image quality and flexibility. Based on the results, a recommendation for possible applications in vascular endoscopy is given.

Application of dynamic saliency maps to the video stream recognition systems with image quality assessment

Eleventh International Conference on Machine Vision (ICMV 2018) ◽

10.1117/12.2522768 ◽

2019 ◽

Author(s):

Timofey Chernov ◽

Sergey Ilyuhin ◽

Vladimir V. Arlazarov

Keyword(s):

Image Quality ◽

Quality Assessment ◽

Image Quality Assessment ◽

Video Stream ◽

Saliency Maps ◽

Recognition Systems

Observability of Very Thick Specimen by Using Transmission Scanning Electron Microscope

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100064189 ◽

1971 ◽

Vol 29 ◽

pp. 26-27

Author(s):

K. Shibatomi ◽

T. Yamanoto ◽

H. Koike

Keyword(s):

Electron Microscope ◽

Image Quality ◽

Scanning Electron Microscope ◽

Chromatic Aberration ◽

Image Contrast ◽

Objective Lens ◽

Thick Specimen ◽

Transmission Electron ◽

Scanning Electron

In the observation of a thick specimen by means of a transmission electron microscope, the intensity of electrons passing through the objective lens aperture is greatly reduced, so that the image is almost invisible. In addition, it has been reported that chromatic aberration causes deterioration of the image contrast rather than of the resolution. The scanning electron microscope, however, is capable of electrically amplifying the signal of decreasing intensity and is also free from chromatic aberration, so that the deterioration of the image contrast due to the aberration can be prevented. The electrical improvement of the image quality can be carried out by using the fascinating features of the SEM, that is, the amplification of a weak input signal forming the image and the discriminating action on the high-level signal of the background. This paper reports some of the experimental results on the thickness dependence of the observability and quality of the image in the case of the transmission SEM.
