A Deep Learning-Based Approach to Video-Based Eye Tracking for Human Psychophysics

2021 ◽  
Vol 15 ◽  
Author(s):  
Niklas Zdarsky ◽  
Stefan Treue ◽  
Moein Esghaei

Real-time gaze tracking provides crucial input to psychophysics studies and neuromarketing applications. Many modern eye-tracking solutions are expensive, mainly because of the high-end hardware specialized for processing infrared camera images. Here, we introduce a deep learning-based approach that uses the video frames of low-cost web cameras. Using DeepLabCut (DLC), an open-source toolbox for extracting points of interest from videos, we obtained facial landmarks critical to gaze location and estimated the point of gaze on a computer screen via a shallow neural network. Tested on three extreme poses, this architecture reached a median error of about one degree of visual angle. Our results contribute to the growing field of deep learning approaches to eye tracking, laying the foundation for further investigation by researchers in psychophysics and neuromarketing.
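The pipeline above (landmark coordinates in, screen coordinates out) can be sketched with a shallow network. Everything below is illustrative: the landmark count, the synthetic data, and the closed-form random-feature training are assumptions standing in for the authors' gradient-trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: each input row stacks the (x, y) coordinates of 8
# facial landmarks (pupil centres, eye corners, ...) as DeepLabCut might
# report them; each target row is a gaze point on the screen.
n_samples, n_features = 400, 16
X = rng.normal(size=(n_samples, n_features))
W_true = rng.normal(size=(n_features, 2))
Y = np.tanh(0.3 * X @ W_true)        # synthetic, mildly nonlinear gaze

# Shallow network: one fixed random tanh hidden layer plus a linear
# readout solved in closed form, a quick stand-in for a gradient-trained
# shallow network.
W1 = rng.normal(scale=0.2, size=(n_features, 64))
H = np.tanh(X @ W1)                              # hidden activations
H1 = np.hstack([H, np.ones((n_samples, 1))])     # bias column
W2, *_ = np.linalg.lstsq(H1, Y, rcond=None)      # least-squares readout

pred = H1 @ W2                                   # predicted gaze points
mse = float(((pred - Y) ** 2).mean())
```

The closed-form readout keeps the sketch short and deterministic; in practice the hidden layer would also be trained, and the inputs would come from DLC's landmark detections rather than random data.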

Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2984
Author(s):  
Yue Mu ◽  
Tai-Shen Chen ◽  
Seishi Ninomiya ◽  
Wei Guo

Automatic detection of intact tomatoes on plants is highly desirable for low-cost and optimal management in tomato farming. Mature tomato detection has been widely studied, but immature tomato detection, which matters more for long-term yield prediction, is difficult to perform with traditional image analysis, especially when the fruit is occluded by leaves. A tomato detector that generalizes well to real cultivation scenes and is robust to fruit occlusion and variable lighting conditions is therefore highly desired. In this study, we build a model that automatically detects intact green tomatoes regardless of occlusion or growth stage using deep learning. The model uses a Faster Region-based Convolutional Neural Network (Faster R-CNN) with a ResNet-101 backbone, transfer-learned from the Common Objects in Context (COCO) dataset. On the test dataset, detection achieved a high average precision of 87.83% (intersection over union ≥ 0.5) and high tomato-counting accuracy (R2 = 0.87). In addition, all detected boxes were merged into one image to compile a tomato location map and estimate fruit size along one row of the greenhouse. Through tomato detection, counting, location and size estimation, this method shows great potential for ripeness and yield prediction.
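The evaluation criterion above treats a detection as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. A minimal, illustrative sketch of IoU and greedy detection-to-truth matching (not the authors' evaluation code):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, xb - xa) * max(0.0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def count_matches(detections, ground_truth, thresh=0.5):
    """Greedily match each detection to at most one unused ground-truth
    box; count it as a true positive when IoU >= thresh."""
    unmatched = list(ground_truth)
    tp = 0
    for det in detections:
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= thresh:
            tp += 1
            unmatched.remove(best)
    return tp
```

True-positive counts like this, swept over detector confidence thresholds, are what the reported average precision summarizes.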


2021 ◽  
Vol 3 (3) ◽  
pp. 190-207
Author(s):  
S. K. B. Sangeetha

In recent years, deep-learning systems have made great progress, particularly in the disciplines of computer vision and pattern recognition. Deep-learning technology can be used to enable inference models to perform real-time object detection and recognition. Using deep-learning-based designs, eye-tracking systems can determine the position of eyes or pupils, regardless of whether visible-light or near-infrared image sensors are used. For emerging electronic vehicle systems, such as driver monitoring systems and new touch screens, accurate and reliable eye gaze estimation is critical. In demanding, unregulated, low-power situations, such systems must operate efficiently and at a reasonable cost. A thorough examination of the different deep learning approaches is required to take into account all of the limitations and opportunities of eye gaze tracking. The goal of this research is to review the history of eye gaze tracking and how deep learning has contributed to computer vision-based tracking. Finally, this research presents a generalized system model for deep learning-driven eye gaze direction diagnostics, as well as a comparison of several approaches.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Rajshree Varma ◽  
Yugandhara Verma ◽  
Priya Vijayvargiya ◽  
Prathamesh P. Churi

Purpose
The rapid advancement of online communication technology and fingertip access to the Internet have enabled news channels, freelance reporters and websites to disseminate fake news to a global audience at low cost. Amid the coronavirus disease 2019 (COVID-19) pandemic, individuals are exposed to these false and potentially harmful claims and stories, which may hamper the vaccination process. Psychological studies reveal that the human ability to detect deception is only slightly better than chance; there is therefore a growing need for automated strategies to combat fake news, which traverses these platforms at an alarming rate. This paper systematically reviews existing fake news detection technologies by exploring various machine learning and deep learning techniques pre- and post-pandemic, which, to the best of the authors' knowledge, has not been done before.
Design/methodology/approach
The literature review on fake news detection is divided into three major parts. The authors searched for papers on machine learning and deep learning approaches to fake news detection published from 2017 onward. Papers were initially retrieved through the Google Scholar platform and then scrutinized for quality, with "Scopus" and "Web of Science" kept as quality indexing parameters. Research gaps, available databases, data pre-processing, feature extraction techniques and evaluation methods for current fake news detection technologies are explored and illustrated using tables, charts and trees.
Findings
The review is organized into two approaches, machine learning and deep learning, to present a clear objective and better understanding. The authors then offer a viewpoint on which approach is better, along with future research trends, issues and challenges for researchers, given the relevance and urgency of a detailed and thorough analysis of existing models. The paper also examines fake news detection during COVID-19, from which it can be inferred that research and modeling are shifting toward ensemble approaches.
Originality/value
The study also identifies several novel automated web-based approaches used by researchers to assess the validity of pandemic news that have proven successful, although currently reported accuracy has not yet reached consistent levels in the real world.
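As a toy illustration of the machine learning branch of such detectors, a multinomial Naive Bayes classifier over bag-of-words features can be sketched. The corpus, labels and word choices below are entirely invented for illustration and do not come from the review:

```python
import math
from collections import Counter

# Toy labelled corpus; 1 = fake, 0 = real.  Purely illustrative.
train = [
    ("miracle cure stops virus overnight", 1),
    ("secret lab leak covered up by elites", 1),
    ("celebrity reveals vaccine microchip plot", 1),
    ("health ministry publishes vaccination schedule", 0),
    ("trial results show vaccine efficacy data", 0),
    ("who releases updated travel guidance", 0),
]

# Per-class word counts for multinomial Naive Bayes.
counts = {0: Counter(), 1: Counter()}
docs = {0: 0, 1: 0}
for text, label in train:
    counts[label].update(text.split())
    docs[label] += 1
vocab = set(counts[0]) | set(counts[1])

def log_posterior(text, label):
    """Log prior plus Laplace-smoothed log likelihood of the words."""
    total = sum(counts[label].values())
    lp = math.log(docs[label] / len(train))
    for w in text.split():
        lp += math.log((counts[label][w] + 1) / (total + len(vocab)))
    return lp

def predict(text):
    return max((0, 1), key=lambda c: log_posterior(text, c))
```

Real systems in the surveyed literature replace the toy counts with TF-IDF or learned embeddings and the classifier with stronger models, but the probabilistic scoring idea is the same.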


2016 ◽  
Vol 2016 ◽  
pp. 1-14 ◽  
Author(s):  
Onur Ferhat ◽  
Fernando Vilariño

Despite the availability of accurate commercial gaze-tracker devices working with infrared (IR) technology, visible light gaze tracking constitutes an interesting alternative by allowing scalability and removing hardware requirements. In recent years, this field has seen research demonstrating performance comparable to the IR alternatives. In this work, we survey previous work on remote, visible light gaze trackers and analyze the explored techniques from various perspectives, such as calibration strategies, head pose invariance, and gaze estimation techniques. We also provide information on related aspects of research, such as public datasets to test against, open source projects to build upon, and gaze tracking services to use directly in applications. With all this information, we aim to provide contemporary and future researchers with a map detailing previously explored ideas and the required tools.


2020 ◽  
Author(s):  
Hanan Alghamdi ◽  
Ghada Amoudi ◽  
Salma Elhag ◽  
Kawther Saeedi ◽  
Jomanah Nasser

Chest X-ray (CXR) imaging is a standard and crucial examination method used for suspected cases of coronavirus disease (COVID-19). In profoundly affected or limited-resource areas, CXR imaging is preferable owing to its availability, low cost, and rapid results. However, given the rapidly spreading nature of COVID-19, manual reading of such tests could limit the efficiency of pandemic control and prevention. In response to this issue, artificial intelligence methods such as deep learning are promising options for automatic diagnosis because they have achieved state-of-the-art performance in the analysis of visual information and a wide range of medical images. This paper reviews and critically assesses the preprint and published reports between March and May 2020 on the diagnosis of COVID-19 via CXR images using convolutional neural networks and other deep learning architectures. Despite the encouraging results, there is an urgent need for public, comprehensive, and diverse datasets. Further investigation of explainable and justifiable decisions is also required for more robust, transparent, and accurate predictions.


Author(s):  
Gajanan Tudavekar ◽  
Santosh S. Saraf ◽  
Sanjay R. Patil

Video inpainting aims to fill in the missing regions of video frames in a visually pleasing way. It is a challenging task owing to the variety of motions across different frames. Existing methods usually use attention models to inpaint videos by retrieving the missing content from other frames. Nevertheless, these methods suffer from irregular attention weights across the spatio-temporal dimensions, giving rise to artifacts in the inpainted video. To overcome this problem, a Spatio-Temporal Inference Transformer Network (STITN) has been proposed. The STITN aligns the frames to be inpainted and inpaints all frames concurrently, while a spatio-temporal adversarial loss function further improves it. Our method performs considerably better than existing deep learning approaches in quantitative and qualitative evaluation.


Over the last decade, many disciplines have made great strides in deep learning, especially computer vision and image processing. However, video coding based on deep learning is still in its initial stage. This work discusses representative research on deep learning for image/video coding, an active research area since 2015. With the number of devices on the Internet increasing, we face the challenges of low-cost transmission over a network as well as security and safety; with existing schemes it is hard to determine the exact data size, the cost of encryption and decryption, and the amount of noise introduced in communication. Our proposed unified framework for encryption and decryption of images based on an autoencoder (UFED) can control these costs using modern techniques such as deep learning and neural networks. The autoencoder operates much like a CNN and is trained on images and video frames to extract image features. In this framework, the encoder maps the image into a compact latent-space (compressed) representation. We achieved a better image-compression ratio with the autoencoder than with JPEG, which typically achieves 10:1 compression with little perceptible loss in image quality. This research also evaluated the accuracy of image reconstruction from the latent space: our proposed deep learning technique achieved over 97.8% accuracy on the standard quantitative evaluation measure, far better than previously implemented models.
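The encode-to-latent-space, decode-to-reconstruct idea can be sketched with the simplest possible autoencoder, a linear one, whose optimum is given in closed form by PCA. This is a stand-in for the trained nonlinear autoencoder described above; the data, dimensions and compression ratio are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "image" data: 200 flattened 8x8 patches that actually live on a
# 4-dimensional subspace, so a small latent code can represent them.
latent_true = rng.normal(size=(200, 4))
mixing = rng.normal(size=(4, 64))
patches = latent_true @ mixing           # shape (200, 64)

# The optimal *linear* autoencoder is given by PCA: the top-k right
# singular vectors act as encoder, and their transpose as decoder.
k = 4                                    # latent dimensionality
_, _, Vt = np.linalg.svd(patches, full_matrices=False)
encode = lambda x: x @ Vt[:k].T          # 64 values -> 4-value code
decode = lambda z: z @ Vt[:k]            # 4-value code -> 64 values

codes = encode(patches)
recon = decode(codes)
compression_ratio = patches.shape[1] / k          # here 16:1
rel_err = np.linalg.norm(recon - patches) / np.linalg.norm(patches)
```

Because the toy data is exactly rank 4, the reconstruction is essentially lossless; real images are not low-rank, which is why the nonlinear encoder and decoder are trained instead of solved in closed form.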


2012 ◽  
Vol 263-266 ◽  
pp. 2399-2402
Author(s):  
Chi Wu Huang ◽  
Zong Sian Jiang ◽  
Wei Fan Kao ◽  
Yen Lin Huang

This paper presents the development of a low-cost eye-tracking system built by modifying a commercial off-the-shelf camera and integrating it with properly tuned open-source drivers and user-defined application programs. The system configuration is proposed, and the gaze tracking, approximated by a least-squares polynomial mapping, is described. Comparisons with other low-cost systems as well as a commercial system are provided. Our system obtained the highest image capture rate, 180 frames per second, and the ISO 9241-Part 9 test favored our system in terms of response time and correct response rate. We are currently developing an application to evaluate gaze-tracking accuracy. Real-time gaze tracking and head-movement estimation are issues for future work.
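A least-squares polynomial mapping of this kind fits polynomial terms of the pupil position to known screen points collected during calibration. The sketch below is illustrative: the second-order feature set, the 9-point calibration grid and the screen resolution are assumptions, not the paper's actual parameters.

```python
import numpy as np

def poly_features(px, py):
    """Second-order polynomial terms of a pupil-centre coordinate."""
    return np.stack([np.ones_like(px), px, py,
                     px * py, px**2, py**2], axis=-1)

# Calibration data: pupil positions (normalised camera coordinates) and
# the screen points the user fixated during a 9-point calibration.
pupil = np.array([[0.1, 0.2], [0.5, 0.2], [0.9, 0.2],
                  [0.1, 0.5], [0.5, 0.5], [0.9, 0.5],
                  [0.1, 0.8], [0.5, 0.8], [0.9, 0.8]])
screen = np.array([[0, 0], [640, 0], [1280, 0],
                   [0, 360], [640, 360], [1280, 360],
                   [0, 720], [640, 720], [1280, 720]], dtype=float)

# Solve the polynomial coefficients by least squares, one column of
# coefficients per screen axis.
A = poly_features(pupil[:, 0], pupil[:, 1])          # 9 x 6 design matrix
coeff, *_ = np.linalg.lstsq(A, screen, rcond=None)

def gaze(px, py):
    """Map a pupil position to an estimated screen point in pixels."""
    return poly_features(np.asarray(px), np.asarray(py)) @ coeff
```

After calibration, each new pupil detection is pushed through `gaze` to obtain the on-screen point; more calibration points over-determine the system and average out measurement noise.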


2019 ◽  
Vol 2019 (1) ◽  
pp. 360-368
Author(s):  
Mekides Assefa Abebe ◽  
Jon Yngve Hardeberg

Different whiteboard image degradations greatly reduce the legibility of pen-stroke content as well as the overall quality of the images. Consequently, researchers have addressed the problem through various image enhancement techniques. Most state-of-the-art approaches apply common image processing techniques such as background/foreground segmentation, text extraction, contrast and color enhancement, and white balancing. However, such conventional enhancement methods are incapable of recovering severely degraded pen-stroke content and produce artifacts in the presence of complex pen-stroke illustrations. To surmount these problems, the authors have proposed a deep learning-based solution. They have contributed a new whiteboard image dataset and adopted two deep convolutional neural network architectures for whiteboard image quality enhancement. Their evaluations of the trained models demonstrated superior performance over the conventional methods.

