POnline: An Online Pupil Annotation Tool Employing Crowd-sourcing and Engagement Mechanisms

2019 ◽  
Vol 6 ◽  
pp. 176-191
Author(s):  
David Gil de Gómez Pérez ◽  
Roman Bednarik

Pupil center and pupil contour are two of the most important features in the eye images used for video-based eye tracking. Well-annotated databases are needed to benchmark existing and new pupil detection and gaze estimation algorithms. Unfortunately, creating such a data set is costly and labor-intensive, requiring substantial manual work by annotators. In addition, the reliability of manual annotations is hard to establish when the number of annotators is low. To facilitate progress in gaze tracking algorithm research, we created an online pupil annotation tool that engages many users through gamification and harnesses the power of the crowd to create reliable annotations [artstein2005bias]. We describe the tool and the mechanisms employed, and report results on the annotation of a publicly available data set. Finally, we demonstrate an example use of the new high-quality annotations in a comparison of two state-of-the-art pupil center detection algorithms.
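The reliability gain from having many annotators comes down to how their answers are aggregated. Below is a minimal sketch of one plausible aggregation rule (not necessarily the one POnline uses): median-based outlier rejection followed by averaging of the remaining pupil-center clicks.

```python
# Illustrative sketch (not the paper's actual aggregation rule): combine
# many crowd-sourced pupil-center clicks into one consensus annotation by
# discarding outliers and averaging the rest.
import numpy as np

def consensus_pupil_center(clicks, thresh=2.0):
    """clicks: (N, 2) array of (x, y) annotations from N crowd workers."""
    clicks = np.asarray(clicks, dtype=float)
    median = np.median(clicks, axis=0)
    dist = np.linalg.norm(clicks - median, axis=1)
    spread = np.median(dist) + 1e-9        # robust spread estimate
    inliers = clicks[dist / spread < thresh]  # drop annotators far from the median
    return inliers.mean(axis=0)

# Example: five annotators, one careless outlier.
clicks = [(101, 98), (99, 100), (100, 101), (102, 99), (150, 40)]
print(consensus_pupil_center(clicks))  # close to (100.5, 99.5)
```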

2018 ◽  
Vol 9 (1) ◽  
pp. 6-18 ◽  
Author(s):  
Dario Cazzato ◽  
Fabio Dominio ◽  
Roberto Manduchi ◽  
Silvia M. Castro

Automatic gaze estimation that does not rely on commercial, expensive eye-tracking hardware can enable several applications in human-computer interaction (HCI) and human behavior analysis. It is therefore not surprising that many related techniques and methods have been investigated in recent years. However, very few camera-based systems proposed in the literature are both real-time and robust. In this work, we propose a real-time, calibration-free gaze estimation system that needs no person-dependent calibration, copes with illumination changes and head pose variations, and works over a wide range of distances from the camera. Our solution is based on a 3D appearance-based method that processes images from a built-in laptop camera. Real-time performance is obtained by combining head pose information with geometrical eye features to train a machine learning algorithm. Our method has been validated on a data set of images of users in natural environments and shows promising results. The possibility of a real-time implementation, combined with the good quality of gaze tracking, makes this system suitable for various HCI applications.
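The abstract does not specify the learning algorithm, so the sketch below only illustrates the general recipe: concatenate head-pose angles with geometrical eye features and fit an off-the-shelf regressor to screen coordinates. The feature names, synthetic data, and regressor choice are all our own assumptions, not the authors' design.

```python
# Hedged sketch: head pose + geometrical eye features -> gaze regressor.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in features: [yaw, pitch, roll, iris_dx, iris_dy, eye_aspect]
X = rng.normal(size=(500, 6))
# Fake gaze targets (pixels) loosely driven by pose and iris offset.
y = X[:, :2] * 100 + X[:, 3:5] * 40 + rng.normal(scale=5, size=(500, 2))

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(model.predict(X[:1]))  # predicted (x, y) gaze point for one frame
```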


2006 ◽  
Vol 5 (3) ◽  
pp. 41-45 ◽  
Author(s):  
Yong-Moo Kwon ◽  
Kyeong-Won Jeon ◽  
Jeongseok Ki ◽  
Qonita M. Shahab ◽  
Sangwoo Jo ◽  
...  

Several studies have investigated 2D gaze tracking techniques on 2D screens for human-computer interaction. However, gaze-based interaction with stereo images or 3D content has not yet been reported. Stereo display techniques are now emerging for immersive services, and 3D interaction techniques are needed in 3D content service environments. This paper presents a 3D gaze estimation technique and its application to gaze-based interaction on a parallax barrier stereo display.
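With a gaze ray estimated for each eye, the 3D point of regard can be approximated by where the two rays converge. The sketch below shows one standard way to compute this (the midpoint of the shortest segment between the two rays); the paper's actual geometry may differ.

```python
# Assumed geometry, not the paper's exact algorithm: approximate the 3D
# point of regard as the midpoint of the shortest segment between two gaze rays.
import numpy as np

def point_of_regard(o1, d1, o2, d2):
    """o*: eye positions, d*: gaze directions (3-vectors)."""
    d1, d2 = d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)
    w = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b + 1e-12          # ~0 when the rays are parallel
    t1 = (b * e - c * d) / denom           # parameter along ray 1
    t2 = (a * e - b * d) / denom           # parameter along ray 2
    p1, p2 = o1 + t1 * d1, o2 + t2 * d2    # closest points on each ray
    return (p1 + p2) / 2

# Eyes 6 cm apart, both converging on a point ~40 cm ahead.
o1, o2 = np.array([-0.03, 0, 0]), np.array([0.03, 0, 0])
target = np.array([0.0, 0.05, 0.4])
print(point_of_regard(o1, target - o1, o2, target - o2))  # ≈ [0, 0.05, 0.4]
```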


Author(s):  
Sinh Huynh ◽  
Rajesh Krishna Balan ◽  
JeongGil Ko

Gaze tracking is a key building block used in many mobile applications, including entertainment, personal productivity, accessibility, medical diagnosis, and visual attention monitoring. In this paper, we present iMon, an appearance-based gaze tracking system that is designed for use on mobile phones and achieves significantly greater accuracy than prior state-of-the-art solutions. iMon achieves this by comprehensively considering the gaze estimation pipeline and overcoming three different sources of error. First, instead of assuming that the user's gaze is fixed to a single 2D coordinate, we construct each gaze label as a probabilistic 2D heatmap, which absorbs the uncertainty in the exact gaze point caused by microsaccadic eye motions. Second, we design an image enhancement model to refine visual details and remove motion blur from input eye images. Third, we apply a calibration scheme that corrects for differences between the perceived and actual gaze points caused by individual differences in the Kappa angle. With all these improvements, iMon achieves a person-independent per-frame tracking error of 1.49 cm (on smartphones) and 1.94 cm (on tablets) when tested on the GazeCapture dataset, and 2.01 cm on the TabletGaze dataset, outperforming the previous state-of-the-art solutions by ~22% to 28%. By averaging multiple per-frame estimates that belong to the same fixation point and applying personal calibration, the tracking error is further reduced to 1.11 cm (smartphones) and 1.59 cm (tablets). Finally, we built an implementation that runs on an iPhone 12 Pro and show that iMon can run at up to 60 frames per second, making gaze-based control of applications possible.
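The heatmap label idea can be illustrated with a small sketch: a gaze point is encoded as a normalized 2D Gaussian over a coarse grid, so small fixational movements are absorbed by the label rather than penalized as error. The grid size and sigma here are assumed values, not iMon's.

```python
# Sketch of a probabilistic 2D heatmap gaze label (parameter values assumed).
import numpy as np

def gaze_heatmap(gx, gy, grid=32, sigma=1.5):
    """Encode a gaze point in [0, 1]^2 as a grid x grid probability map."""
    ax = np.arange(grid)
    xx, yy = np.meshgrid(ax, ax)
    cx, cy = gx * (grid - 1), gy * (grid - 1)
    h = np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2 * sigma ** 2))
    return h / h.sum()  # normalize so the map is a probability distribution

hm = gaze_heatmap(0.25, 0.6)
print(hm.shape, hm.sum())  # (32, 32) 1.0
```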


2017 ◽  
Vol 2017 ◽  
pp. 1-10 ◽  
Author(s):  
Warapon Chinsatit ◽  
Takeshi Saitoh

This paper presents a convolutional neural network (CNN)-based pupil center detection method for a wearable gaze estimation system using infrared eye images. The pupil center position of a user's eye can be used in various applications, such as human-computer interaction, medical diagnosis, and psychological studies. However, users tend to blink frequently, which makes estimating gaze direction difficult. The proposed method uses two CNN models: the first classifies the eye state, and the second estimates the pupil center position. The classification model filters out images with closed eyes and terminates the gaze estimation process when the input image shows a closed eye. In addition, this paper presents a process for creating an eye image dataset using a wearable camera. This dataset, which was used to evaluate the proposed method, contains approximately 20,000 images with a wide variation of eye states. We evaluated the proposed method from various perspectives. The results show that it achieves good accuracy and has the potential for application in wearable gaze estimation.
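The two-model pipeline can be summarized schematically as below; the CNN internals are placeholders passed in as callables, and the open-eye threshold is an assumed parameter.

```python
# Schematic of the two-stage pipeline as described (model internals are
# placeholders): a classifier first rejects closed-eye frames, and only
# open eyes are passed to the pupil-center regressor.
def detect_pupil(eye_image, eye_state_cnn, pupil_center_cnn, open_thresh=0.5):
    """Returns the (x, y) pupil center, or None for a closed/blinking eye."""
    p_open = eye_state_cnn(eye_image)      # stage 1: open-eye probability
    if p_open < open_thresh:
        return None                        # terminate: blink detected
    return pupil_center_cnn(eye_image)     # stage 2: regress pupil center
```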


Sensors ◽  
2018 ◽  
Vol 18 (7) ◽  
pp. 2292 ◽  
Author(s):  
Zijing Wan ◽  
Xiangjun Wang ◽  
Kai Zhou ◽  
Xiaoyun Chen ◽  
Xiaoqing Wang

In this paper, a novel 3D gaze estimation method for a wearable gaze tracking device is proposed, based on the pupillary accommodation reflex of human vision. Firstly, a 3D gaze measurement model is built; by combining the line-of-sight convergence point with the pupil size, this model can measure the 3D Point-of-Regard in free space. Secondly, a gaze tracking device is described; using four cameras and semi-transparent mirrors, it can accurately extract the spatial coordinates of the pupil and the eye corner from images. Thirdly, a simple calibration process for the measuring system is proposed, which can be sketched as follows: (1) each eye is imaged by a pair of binocular stereo cameras, with the semi-transparent mirrors providing a wider field of view; (2) the spatial coordinates of the pupil center and the inner eye corner are extracted from the stereo camera images, and the pupil size is calculated from these features; (3) the pupil size and the line-of-sight convergence point are computed while the user watches a calibration target at different distances, and the parameters of the gaze estimation model are determined. Fourthly, an algorithm for finding the line-of-sight convergence point is proposed, and the 3D Point-of-Regard is estimated using the obtained line-of-sight measurement model. Three groups of experiments were conducted to demonstrate the effectiveness of the proposed method. This approach yields the spatial coordinates of the Point-of-Regard in free space, which has great potential for wearable device applications. A sketch of the stereo-extraction sub-step in (2) follows.
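As a hedged illustration of step (2), the sketch below triangulates a pupil position from a rectified, calibrated stereo pair. The focal length and baseline are made-up values, and the actual device uses four cameras with semi-transparent mirrors rather than this simplified geometry.

```python
# Illustrative rectified-stereo triangulation (assumed setup, not the exact
# device calibration): recover the 3D pupil position from its pixel
# coordinates in a calibrated binocular camera pair.
import numpy as np

def triangulate(xl, yl, xr, f=600.0, baseline=0.04):
    """xl, yl, xr: pixel coords (principal point removed); f in px, baseline in m."""
    disparity = xl - xr              # horizontal shift between the two views
    Z = f * baseline / disparity     # depth from disparity
    X = xl * Z / f                   # back-project to metric coordinates
    Y = yl * Z / f
    return np.array([X, Y, Z])

print(triangulate(120.0, -30.0, 40.0))  # 3D pupil position in meters
```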


2021 ◽  
Vol 11 (19) ◽  
pp. 9068
Author(s):  
Mohd Faizan Ansari ◽  
Pawel Kasprowski ◽  
Marcin Obetkal

Gaze estimation plays a significant role in understanding human behavior and in human-computer interaction. Many methods for gaze estimation are currently available; however, most require additional hardware for data acquisition, which adds cost to gaze tracking. Classic gaze tracking approaches usually require systematic prior knowledge or expertise for practical operation. Moreover, they are fundamentally based on characteristics of the eye region, using infrared light and the iris glint to track the gaze point, which requires high-quality images, particular environmental conditions, and an additional light source. Recent studies on appearance-based gaze estimation have demonstrated the capability of neural networks, especially convolutional neural networks (CNNs), to decode the gaze information present in eye images, significantly simplifying gaze estimation. In this paper, we present a CNN-based gaze estimation method that can be applied to various platforms without additional hardware. An easy and fast data collection method is used to gather face and eye images from an unmodified desktop camera. The proposed method achieved good results, demonstrating that gaze can be predicted with reasonable accuracy without any additional tools.
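As an illustration of the appearance-based CNN idea, here is a minimal PyTorch sketch mapping a grayscale eye crop to a 2D screen coordinate. The architecture and input size are our own choices, not the network proposed in the paper.

```python
# Minimal appearance-based gaze CNN sketch (illustrative architecture only).
import torch
import torch.nn as nn

class GazeCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),   # 1x64x64 -> 16x30x30
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),  # -> 32x13x13
            nn.AdaptiveAvgPool2d(4),                    # -> 32x4x4
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32 * 4 * 4, 64),
                                  nn.ReLU(), nn.Linear(64, 2))  # (x, y) on screen

    def forward(self, x):
        return self.head(self.features(x))

model = GazeCNN()
eye = torch.randn(1, 1, 64, 64)   # one grayscale eye crop
print(model(eye).shape)           # torch.Size([1, 2])
```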


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 699
Author(s):  
David Romero-Bascones ◽  
Maitane Barrenechea ◽  
Ane Murueta-Goyena ◽  
Marta Galdós ◽  
Juan Carlos Gómez-Esteban ◽  
...  

Disentangling the cellular anatomy that gives rise to human visual perception is one of the main challenges of ophthalmology. Of particular interest is the foveal pit, a concave depression at the center of the retina that captures light from the gaze center. In recent years, there has been growing interest in studying the morphology of the foveal pit by extracting geometrical features from optical coherence tomography (OCT) images. Despite this, little attention has been devoted to comparing existing approaches for two key methodological steps: locating the foveal center and mathematically modelling the foveal pit. Building on a dataset of 185 healthy subjects imaged twice, we first study the image alignment accuracy of four different foveal center location methods. Second, we compare state-of-the-art foveal pit mathematical models in terms of fitting error, repeatability, and bias. The results indicate the importance of using a robust foveal center location method to align images. Moreover, we show that foveal pit models can improve the agreement between different acquisition protocols. Nevertheless, they can also introduce important biases into the parameter estimates, which should be taken into account.
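As a worked example of the model-fitting step, the sketch below fits a simple Gaussian pit profile to a synthetic radial thickness scan with scipy. The abstract does not name the models compared, so this model choice is purely illustrative.

```python
# Illustrative foveal-pit model fit (the Gaussian pit model is an assumption,
# used for demonstration only): fit a parametric profile to a radial
# retinal-thickness scan and read off the pit parameters.
import numpy as np
from scipy.optimize import curve_fit

def gaussian_pit(r, base, depth, sigma):
    """Retinal thickness vs. radial distance r: a flat base minus a Gaussian pit."""
    return base - depth * np.exp(-r**2 / (2 * sigma**2))

r = np.linspace(-2, 2, 81)                        # radial distance (mm)
true = gaussian_pit(r, 300.0, 120.0, 0.5)         # synthetic ground truth (um)
thickness = true + np.random.default_rng(0).normal(scale=3, size=r.size)

params, _ = curve_fit(gaussian_pit, r, thickness, p0=[280, 100, 1.0])
print(params)  # recovered [base, depth, sigma], close to [300, 120, 0.5]
```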


Healthcare ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 885
Author(s):  
Yoanda Alim Syahbana ◽  
Yokota Yasunari ◽  
Morita Hiroyuki ◽  
Aoki Mitsuhiro ◽  
Suzuki Kanade ◽  
...  

The detection of nystagmus using video oculography suffers accuracy problems when patients who complain of dizziness have difficulty fully opening their eyes; pupil detection and tracking under this condition affect the accuracy of the nystagmus waveform. In this research, we design a pupil detection method that uses a pattern matching approach, approximating the pupil with a Mexican hat-type ellipse pattern, to deal with this problem. We evaluate the performance of the proposed method against a conventional Hough transform method on eye movement videos retrieved from Gifu University Hospital. The results show that the proposed method can detect and track the pupil position even when only 20% of the pupil is visible, whereas the conventional Hough transform performs well only when 90% of the pupil is visible. We also evaluate the proposed method on the Labelled Pupils in the Wild (LPW) data set, where it achieves a Mean Square Error (MSE) of 1.47, much lower than the conventional Hough transform method's MSE of 9.53. Finally, we validate the results by consulting three medical specialists regarding the nystagmus waveform; they agreed that the waveform can be evaluated clinically without contradicting their diagnoses.
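The matched-filter idea can be sketched as follows, using a simplified circular Mexican-hat kernel in place of the paper's elliptical pattern: because the kernel is zero-mean and center-weighted, a dark pupil produces a clear response peak even when uniform brightness varies across the image.

```python
# Rough sketch of Mexican-hat pattern matching (our simplified circular
# kernel; the paper uses an elliptical variant tuned to the pupil shape).
import numpy as np
from scipy.signal import fftconvolve

def mexican_hat_2d(size=31, sigma=5.0):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = (xx**2 + yy**2) / (2 * sigma**2)
    k = (1 - r2) * np.exp(-r2)       # 2D Ricker ("Mexican hat") wavelet
    return k - k.mean()              # zero mean: ignores uniform brightness

def find_pupil(gray):
    """gray: 2D float image; returns the (row, col) of the best pupil match."""
    inverted = gray.max() - gray                              # dark pupil -> bright blob
    response = fftconvolve(inverted, mexican_hat_2d(), mode="same")
    return np.unravel_index(np.argmax(response), response.shape)

# Synthetic test: a dark disc ("pupil") on a bright background.
img = np.full((120, 160), 200.0)
yy, xx = np.ogrid[:120, :160]
img[(yy - 70) ** 2 + (xx - 50) ** 2 < 8**2] = 30.0
print(find_pupil(img))  # ≈ (70, 50)
```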

