A Survey on Deep Learning Based Eye Gaze Estimation Methods

2021 ◽  
Vol 3 (3) ◽  
pp. 190-207
Author(s):  
S. K. B. Sangeetha

In recent years, deep-learning systems have made great progress, particularly in the disciplines of computer vision and pattern recognition. Deep-learning technology can be used to enable inference models to perform real-time object detection and recognition. Using deep-learning-based designs, eye tracking systems can determine the position of the eyes or pupils, regardless of whether visible-light or near-infrared image sensors are utilized. For emerging vehicle electronic systems, such as driver monitoring systems and new touch screens, accurate and efficient eye gaze estimation is critical. In demanding, unregulated, low-power situations, such systems must operate efficiently and at a reasonable cost. A thorough examination of the different deep learning approaches is required to take into consideration all the limitations and opportunities of eye gaze tracking. The goal of this research is to review the history of eye gaze tracking and how deep learning has contributed to computer vision-based tracking. Finally, this research presents a generalized system model for deep learning-driven eye gaze direction diagnostics, as well as a comparison of several approaches.

2021 ◽  
Vol 11 (2) ◽  
pp. 851
Author(s):  
Wei-Liang Ou ◽  
Tzu-Ling Kuo ◽  
Chin-Chieh Chang ◽  
Chih-Peng Fan

In this study, for the application of visible-light wearable eye trackers, a pupil tracking methodology based on deep-learning technology is developed. By applying deep-learning object detection based on the You Only Look Once (YOLO) model, the proposed pupil tracking method can effectively estimate and predict the center of the pupil in visible-light mode. Testing the pupil tracking performance with the developed YOLOv3-tiny-based model, the detection accuracy is as high as 80%, and the recall rate is close to 83%. In addition, the average visible-light pupil tracking errors of the proposed YOLO-based deep-learning design are smaller than 2 pixels in the training mode and 5 pixels in the cross-person test, which are much smaller than those of a previous ellipse-fitting design without deep-learning technology under the same visible-light conditions. After combination with the calibration process, the average gaze tracking errors of the proposed YOLOv3-tiny-based pupil tracking models are smaller than 2.9 and 3.5 degrees in the training and testing modes, respectively, and the proposed visible-light wearable gaze tracking system performs at up to 20 frames per second (FPS) on the GPU-based embedded software platform.
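The recall and pixel-error figures above imply a simple evaluation loop over predicted versus ground-truth pupil centers. The sketch below is illustrative only: the function names, the 5-pixel threshold, and the toy coordinates are assumptions, not the paper's actual code or data.

```python
import math

def pixel_error(pred, truth):
    """Euclidean distance in pixels between predicted and true pupil centers."""
    return math.hypot(pred[0] - truth[0], pred[1] - truth[1])

def evaluate(predictions, truths, max_error=5.0):
    """Count a detection as correct when its center error is within max_error px."""
    errors = [pixel_error(p, t) for p, t in zip(predictions, truths)]
    hits = sum(1 for e in errors if e <= max_error)
    recall = hits / len(truths)
    mean_error = sum(errors) / len(errors)
    return recall, mean_error

# Toy predicted and ground-truth pupil centers (x, y), in pixels.
preds = [(100, 102), (50, 50), (201, 199)]
truth = [(100, 100), (58, 50), (200, 200)]
recall, err = evaluate(preds, truth)
```

A real harness would additionally separate a "training mode" split from a cross-person split, as the paper does, and report each independently.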


Author(s):  
Jun-Li Xu ◽  
Cecilia Riccioli ◽  
Ana Herrero-Langreo ◽  
Aoife Gowen

Deep learning (DL) has recently achieved considerable success in a wide range of applications, such as speech recognition, machine translation and visual recognition. This tutorial provides guidelines and useful strategies for applying DL techniques to pixel-wise classification of spectral images. A one-dimensional convolutional neural network (1-D CNN) is used to extract features from the spectral domain, which are subsequently used for classification. In contrast to conventional classification methods for spectral images, which examine primarily the spectral context, a three-dimensional (3-D) CNN is applied to simultaneously extract spatial and spectral features to enhance classification accuracy. This tutorial paper explains, in a stepwise manner, how to develop 1-D CNN and 3-D CNN models to discriminate spectral imaging data in a food authenticity context. The example image data provided consist of three varieties of puffed cereals imaged in the NIR range (943–1643 nm). The tutorial is presented in the MATLAB environment, and the scripts and dataset used are provided. Starting from spectral image pre-processing (background removal and spectral pre-treatment), the typical steps encountered in the development of CNN models are presented. The example dataset demonstrates that deep learning approaches can increase classification accuracy compared to conventional approaches, increasing the pixel-level accuracy of the model tested on an independent image from 92.33% using partial least squares discriminant analysis to 99.4% using the 3-D CNN model. The paper concludes with a discussion of the challenges and suggestions in the application of DL techniques for spectral image classification.
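The core operation of the 1-D CNN described above is a filter slid along the wavelength axis of each pixel's spectrum. The following minimal sketch shows that operation in isolation (the tutorial itself works in MATLAB; this Python fragment, its function names, and its toy spectrum are illustrative assumptions, and a real model stacks many learned filters):

```python
def conv1d(spectrum, kernel):
    """Valid-mode 1-D convolution (cross-correlation) of a spectrum with a kernel."""
    k = len(kernel)
    return [sum(spectrum[i + j] * kernel[j] for j in range(k))
            for i in range(len(spectrum) - k + 1)]

def relu(xs):
    """Rectified linear activation applied elementwise."""
    return [max(0.0, x) for x in xs]

# A difference filter responds to spectral edges (absorption band transitions).
spectrum = [0.1, 0.1, 0.9, 0.9, 0.2]
features = relu(conv1d(spectrum, [-1.0, 1.0]))
```

In the full 1-D CNN, such feature maps are pooled and fed to dense layers that output one class per pixel; the 3-D CNN extends the same sliding-filter idea across the two spatial axes as well.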


2019 ◽  
Vol 9 (7) ◽  
pp. 1385 ◽  
Author(s):  
Luca Donati ◽  
Eleonora Iotti ◽  
Giulio Mordonini ◽  
Andrea Prati

Visual classification of commercial products is a branch of the wider fields of object detection and feature extraction in computer vision and, in particular, an important step in the creative workflow of the fashion industry. Automatically classifying garment features makes both designers and data experts aware of their overall production, which is fundamental for organizing marketing campaigns, avoiding duplicates, categorizing apparel products for e-commerce purposes, and so on. There are many different techniques for visual classification, ranging from standard image processing to machine learning approaches. This work, carried out in collaboration with Adidas AG™, describes a real-world study aimed at automatically recognizing and classifying logos, stripes, colors, and other features of clothing solely from final rendering images of the products. Specifically, both deep learning and image processing techniques, such as template matching, were used. The result is a novel system for image recognition and feature extraction that has high classification accuracy and is reliable and robust enough to be used by a company like Adidas. This paper presents the main problems and proposed solutions in the development of this system, together with experimental results on the Adidas AG™ dataset.
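The template-matching technique mentioned above can be sketched as sliding a small logo template over an image and scoring each position with normalized cross-correlation. This is a generic textbook version under assumed toy data, not the authors' implementation; a production system would typically use an optimized routine such as OpenCV's matchTemplate.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation between two equally sized flat patches."""
    n = len(patch)
    mp = sum(patch) / n
    mt = sum(template) / n
    num = sum((p - mp) * (t - mt) for p, t in zip(patch, template))
    den = math.sqrt(sum((p - mp) ** 2 for p in patch) *
                    sum((t - mt) ** 2 for t in template))
    return num / den if den else 0.0

def match(image, template):
    """Return the top-left (row, col) of the window with the highest NCC score."""
    th, tw = len(template), len(template[0])
    flat = [t for row in template for t in row]
    best, best_pos = -2.0, (0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            score = ncc(patch, flat)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

# Toy grayscale image containing the 2x2 "logo" template at position (1, 1).
image = [[0, 0, 0, 0],
         [0, 9, 1, 0],
         [0, 1, 9, 0],
         [0, 0, 0, 0]]
template = [[9, 1],
            [1, 9]]
pos = match(image, template)
```

Because NCC normalizes out brightness and contrast, the same template can be found under the varying lighting of different product renderings, which is one reason template matching remains a useful complement to learned classifiers.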


Author(s):  
M. Sester ◽  
Y. Feng ◽  
F. Thiemann

<p><strong>Abstract.</strong> Cartographic generalization is a problem which poses interesting challenges to automation. Whereas plenty of algorithms have been developed for the different sub-problems of generalization (e.g. simplification, displacement, aggregation), there are still cases which are not generalized adequately or in a satisfactory way. The main problem is the interplay between different operators. In those cases the benchmark is the human operator, who is able to design an aesthetic and correct representation of the physical reality.</p><p>Deep learning methods have shown tremendous success on interpretation problems for which algorithmic methods have deficits. A prominent example is the classification and interpretation of images, where deep learning approaches outperform traditional computer vision methods. In both domains &ndash; computer vision and cartography &ndash; humans are able to produce a solution; a prerequisite for this is the possibility to generate many training examples for the different cases. Thus, the idea in this paper is to employ deep learning for cartographic generalization tasks, especially for the task of building generalization. An advantage of this task is that many training data sets are available from existing map series. The approach is a first attempt using an existing network.</p><p>In the paper, the details of the implementation will be reported, together with an in-depth analysis of the results. An outlook on future work will be given.</p>


2021 ◽  
Author(s):  
Antoine Bouziat ◽  
Sylvain Desroziers ◽  
Abdoulaye Koroko ◽  
Antoine Lechevallier ◽  
Mathieu Feraille ◽  
...  

<p>Automation and robotics raise growing interest in the mining industry. If not already a reality, it is no longer science fiction to imagine autonomous robots routinely participating in the exploration and extraction of mineral raw materials in the near future. Among the various scientific and technical issues to be addressed towards this objective, this study focuses on automating the real-time characterisation of rock images captured in the field, either to discriminate rock types and mineral species or to detect small elements such as mineral grains or metallic nuggets. To do so, we investigate the potential of methods from the Computer Vision community, a subfield of Artificial Intelligence dedicated to image processing. In particular, we aim to assess the potential of Deep Learning approaches and convolutional neural networks (CNNs) for the analysis of field sample pictures, highlighting key challenges ahead of industrial use in operational contexts.</p><p>In a first initiative, we appraise Deep Learning methods to classify photographs of macroscopic rock samples among 12 lithological families. Using a reference CNN architecture and a collection of 2700 images, we achieve a prediction accuracy above 90% on new pictures of good photographic quality. Nonetheless, we then seek to improve the robustness of the method for on-the-fly field photographs. To do so, we train an additional CNN to automatically separate the rock sample from the background with a detection algorithm. We also introduce a more sophisticated classification method combining a set of several CNNs with a decision tree. The CNNs are specifically trained to recognise petrological features such as textures, structures or mineral species, while the decision tree mimics the naturalist methodology for lithological identification.</p><p>In a second initiative, we evaluate Deep Learning techniques to spot and delimit specific elements in finer-scale images. We use a data set of carbonate thin sections with various species of microfossils. The data come from a sedimentology study, but analogies can be drawn with igneous geology use cases. We train four state-of-the-art Deep Learning methods for object detection with a limited data set of 15 annotated images. The results on 130 other thin-section images are then qualitatively assessed by expert geologists, and precisions and inference times are quantitatively measured. The four models show good capabilities in detecting and categorising the microfossils. However, differences in accuracy and performance are underlined, leading to recommendations for comparable projects in a mining context.</p><p>Altogether, this study illustrates the power of Computer Vision and Deep Learning approaches to automate rock image analysis. However, to make the most of these technologies in mining activities, stimulating research opportunities lie in adapting the algorithms to the geological use cases, embedding as much geological knowledge as possible in the statistical models, and limiting the number of training images to be manually interpreted beforehand.</p>


2019 ◽  
Vol 8 (6) ◽  
pp. 258 ◽  
Author(s):  
Yu Feng ◽  
Frank Thiemann ◽  
Monika Sester

Cartographic generalization is a problem which poses interesting challenges to automation. Whereas plenty of algorithms have been developed for the different sub-problems of generalization (e.g., simplification, displacement, aggregation), there are still cases which are not generalized adequately or in a satisfactory way. The main problem is the interplay between different operators. In those cases the benchmark is the human operator, who is able to design an aesthetic and correct representation of the physical reality. Deep learning methods have shown tremendous success on interpretation problems for which algorithmic methods have deficits. A prominent example is the classification and interpretation of images, where deep learning approaches outperform traditional computer vision methods. In both domains, computer vision and cartography, humans are able to produce good solutions. A prerequisite for the application of deep learning is the availability of many representative training examples for the situation to be learned. As this is given in cartography (there are many existing map series), the idea in this paper is to employ deep convolutional neural networks (DCNNs) for cartographic generalization tasks, especially for the task of building generalization. Three network architectures, namely U-net, residual U-net and generative adversarial network (GAN), are evaluated both quantitatively and qualitatively, based on their performance at target map scales of 1:10,000, 1:15,000 and 1:25,000. The results indicate that deep learning models can successfully learn cartographic generalization operations implicitly in one single model. The residual U-net outperforms the others and achieves the best generalization performance.
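For raster-to-raster models such as the U-nets compared above, a common quantitative score is per-image intersection over union (IoU) between the predicted and target building masks. The abstract does not state which metric the authors used, so the sketch below is only an example of such a comparison, on toy 0/1 grids.

```python
def iou(pred, target):
    """Intersection over union of two binary raster masks of equal shape."""
    inter = union = 0
    for prow, trow in zip(pred, target):
        for p, t in zip(prow, trow):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 1.0

# Toy 3x3 building masks: the model predicted one extra building cell.
pred   = [[1, 1, 0],
          [1, 1, 0],
          [0, 0, 0]]
target = [[1, 1, 0],
          [1, 0, 0],
          [0, 0, 0]]
score = iou(pred, target)
```

Averaging such scores over a held-out set of map tiles, separately per target scale (1:10,000, 1:15,000, 1:25,000), would give one way to rank the three architectures quantitatively.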


Electronics ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 932
Author(s):  
Yueh-Peng Chen ◽  
Tzuo-Yau Fan ◽  
Her-Chang Chao

Traditional watermarking techniques extract the watermark from a suspected image, allowing the copyright information regarding the image owner to be identified by the naked eye or by similarity estimation methods such as bit error rate and normalized correlation. However, this process is subjective and should be made more objective. In this paper, we implemented a model based on deep-learning technology, known as WMNet, that can accurately identify the watermark copyright. In the past, establishing deep learning models required collecting a large amount of training data. While constructing WMNet, we implemented a simulated process to generate a large number of distorted watermarks, and then collected them to form a training dataset. However, not all watermarks in the training dataset could properly provide copyright information. Therefore, according to the set restrictions, we divided the watermarks in the training dataset into two categories; consequently, WMNet could learn to identify the copyright information that the watermarks contained, so as to assist in the copyright verification process. Even if the retrieved watermark information was incomplete, the copyright information it contained could still be interpreted objectively and accurately. The results show that the method proposed by this study is relatively effective.
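The two baseline similarity measures named above, bit error rate (BER) and normalized correlation (NC), can be computed directly on binary watermarks. The sketch below uses standard textbook definitions on toy bit lists; the paper's exact formulations and data are not reproduced here.

```python
import math

def bit_error_rate(extracted, original):
    """Fraction of watermark bits that differ between the two patterns."""
    diffs = sum(1 for e, o in zip(extracted, original) if e != o)
    return diffs / len(original)

def normalized_correlation(extracted, original):
    """Correlation of the two bit patterns after mapping bits to {-1, +1}."""
    e = [2 * b - 1 for b in extracted]
    o = [2 * b - 1 for b in original]
    num = sum(x * y for x, y in zip(e, o))
    den = math.sqrt(sum(x * x for x in e) * sum(y * y for y in o))
    return num / den

original  = [1, 0, 1, 1, 0, 0, 1, 0]
extracted = [1, 0, 1, 0, 0, 0, 1, 1]  # two bits flipped by distortion
ber = bit_error_rate(extracted, original)
nc = normalized_correlation(extracted, original)
```

A threshold on BER or NC then yields a verification decision; the paper's point is that a learned classifier such as WMNet can replace this hand-tuned threshold with a decision learned from many simulated distortions.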


Author(s):  
Giovanna Castellano ◽  
Gennaro Vessio

This paper provides an overview of some of the most relevant deep learning approaches to pattern extraction and recognition in visual arts, particularly painting and drawing. Recent advances in deep learning and computer vision, coupled with the growing availability of large digitized visual art collections, have opened new opportunities for computer science researchers to assist the art community with automatic tools to analyse and further understand visual arts. Among other benefits, a deeper understanding of visual arts has the potential to make them more accessible to a wider population, ultimately supporting the spread of culture.

