Predicting Visual Search Task Success from Eye Gaze Data as a Basis for User-Adaptive Information Visualization Systems

2021 ◽  
Vol 11 (2) ◽  
pp. 1-25
Author(s):  
Moritz Spiller ◽  
Ying-Hsang Liu ◽  
Md Zakir Hossain ◽  
Tom Gedeon ◽  
Julia Geissler ◽  
...  

Information visualizations are an efficient means of supporting users in understanding large amounts of complex, interconnected data; user comprehension, however, depends on individual factors such as cognitive abilities. The research literature provides evidence that user-adaptive information visualizations positively impact users’ performance in visualization tasks. This study contributes toward the development of a computational model that predicts users’ success in visual search tasks from eye gaze data and can thereby drive such user-adaptive systems. State-of-the-art deep learning models for time series classification were trained on sequential eye gaze data obtained from 40 study participants’ interaction with a circular and an organizational graph. The results suggest that such models yield higher accuracy than a baseline classifier and the models previously used for this purpose. In particular, a Multivariate Long Short-Term Memory Fully Convolutional Network (MLSTM-FCN) shows encouraging performance for use in online user-adaptive systems. Given this finding, such a computational model can infer users’ need for support during interaction with a graph and trigger appropriate interventions in user-adaptive information visualization systems. This also simplifies the design of such systems, since additional interaction data such as mouse clicks are not required.
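As a rough illustration of the model class that performed best here, the following PyTorch sketch pairs an LSTM branch with a fully convolutional branch over a multivariate gaze sequence and concatenates both for classification. All sizes (channel count, layer widths, dropout) are assumptions for illustration, not the study's settings, and the squeeze-and-excite blocks of the full MLSTM-FCN are omitted.

```python
import torch
import torch.nn as nn

class MLSTMFCN(nn.Module):
    """Simplified MLSTM-FCN: parallel LSTM and 1D-convolutional branches
    over a multivariate time series (hypothetical gaze channels such as
    x, y, pupil diameter, fixation flag)."""
    def __init__(self, n_channels=4, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, 128, batch_first=True)
        self.dropout = nn.Dropout(0.8)
        self.fcn = nn.Sequential(
            nn.Conv1d(n_channels, 128, 8, padding="same"), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 256, 5, padding="same"), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Conv1d(256, 128, 3, padding="same"), nn.BatchNorm1d(128), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                 # global average pooling
        )
        self.head = nn.Linear(128 + 128, n_classes)

    def forward(self, x):                            # x: (batch, time, channels)
        _, (h, _) = self.lstm(x)                     # final hidden state
        lstm_feat = self.dropout(h[-1])              # (batch, 128)
        conv_feat = self.fcn(x.transpose(1, 2)).squeeze(-1)  # (batch, 128)
        return self.head(torch.cat([lstm_feat, conv_feat], dim=1))
```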

2018 ◽  
Vol 09 (03) ◽  
pp. 511-518 ◽  
Author(s):  
Dawn Dowding ◽  
Jacqueline Merrill

Background Heuristic evaluation is used in human–computer interaction studies to assess the usability of information systems. Nielsen's widely used heuristics, first developed in 1990, are appropriate for general usability but do not specifically address usability in systems that produce information visualizations. Objective This article develops a heuristic evaluation checklist that can be used to evaluate systems that produce information visualizations. Principles from Nielsen's heuristics were combined with heuristic principles developed by prior researchers specifically to evaluate information visualization. Methods We used the nominal group technique to determine an appropriate final set. The combined usability principles and associated factors were distributed via email to a group of 12 informatics experts from a range of health care disciplines. Respondents were asked to rate each factor on its importance as an evaluation heuristic for visualization systems on a scale from 1 (definitely don't include) to 10 (definitely include). The distribution of scores for each item was calculated, and a median score of ≥8 represented consensus for inclusion in the final checklist. Results Ten of the 12 experts responded with rankings and written comments. The final checklist consists of 10 usability principles (7 general and 3 specific to information visualization) substantiated by 49 usability factors. Three nursing informatics experts then used the checklist to evaluate a vital sign dashboard developed for home care nurses, following a task list designed to exercise the dashboard's full functionality. The experts used the checklist without difficulty and indicated that it covered all major usability problems encountered during task completion. Conclusion The growing capacity to generate and electronically process health data suggests that data visualization will become increasingly important. A checklist of usability heuristics for evaluating information visualization systems can help assure high quality in electronic data systems developed for health care.
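The consensus rule is straightforward to operationalize; a minimal Python sketch, where the factor names and scores are invented for illustration:

```python
import statistics

# Ten hypothetical expert ratings (1-10) per candidate usability factor;
# a factor enters the checklist when its median rating is >= 8.
ratings = {
    "visibility_of_system_status": [9, 8, 10, 7, 8, 9, 8, 10, 9, 8],
    "aesthetic_minimalist_design": [6, 7, 8, 5, 7, 6, 8, 7, 6, 7],
}
checklist = [f for f, scores in ratings.items() if statistics.median(scores) >= 8]
print(checklist)  # ['visibility_of_system_status']
```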


2020 ◽  
Vol 2 ◽  
Author(s):  
Ben Steichen ◽  
Bo Fu

Information visualizations can be regarded as one of the most powerful cognitive tools for significantly amplifying human cognition. However, traditional information visualization systems have been designed in a manner that does not account for individual user differences, even though human cognitive abilities and styles have been shown to differ significantly. To address this research gap, novel adaptive systems need to be developed that are able to (1) infer individual user characteristics and (2) provide an adaptation mechanism that personalizes the system to the inferred characteristics. This paper presents a first step toward this goal by investigating the extent to which a user's cognitive style can be inferred from their behavior with an information visualization system. In particular, it presents a series of experiments that utilize features calculated from eye gaze data to infer a user's cognitive style. Several different data and feature sets are examined, and the results overall show that eye gaze data can be used successfully to infer a user's cognitive style during information visualization usage.
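Gaze features in such studies are typically aggregates over fixations and saccades rather than raw gaze samples; the NumPy sketch below computes an illustrative feature set (not the paper's exact one) that could feed a cognitive-style classifier.

```python
import numpy as np

def gaze_summary_features(fixations):
    """Summarize a fixation sequence into aggregate features.
    `fixations` is an (n, 3) array of [x, y, duration_ms] rows;
    the feature set here is illustrative, not the paper's."""
    durations = fixations[:, 2]
    dx = np.diff(fixations[:, 0])
    dy = np.diff(fixations[:, 1])
    saccade_len = np.hypot(dx, dy)      # distance between consecutive fixations
    return {
        "n_fixations": len(fixations),
        "fixation_dur_mean": durations.mean(),
        "fixation_dur_std": durations.std(),
        "saccade_len_mean": saccade_len.mean(),
        "saccade_len_std": saccade_len.std(),
        "scanpath_len": saccade_len.sum(),
    }
```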


2020 ◽  
Author(s):  
Julian Jara-Ettinger ◽  
Paula Rubio-Fernandez

A foundational assumption of human communication is that speakers ought to say as much as necessary, but no more. How speakers determine what is necessary in a given context, however, is unclear. In studies of referential communication, this expectation is often formalized as the idea that speakers should construct reference by selecting the shortest, sufficiently informative, description. Here we propose that reference production is, instead, a process whereby speakers adopt listeners’ perspectives to facilitate their visual search, without concern for utterance length. We show that a computational model of our proposal predicts graded acceptability judgments with quantitative accuracy, systematically outperforming brevity models. Our model also explains crosslinguistic differences in speakers’ propensity to over-specify in different visual contexts. Our findings suggest that reference production is best understood as driven by a cooperative goal to help the listener understand the intended message, rather than by an egocentric effort to minimize utterance length.


2021 ◽  
Vol 11 (10) ◽  
pp. 4426
Author(s):  
Chunyan Ma ◽  
Ji Fan ◽  
Jinghao Yao ◽  
Tao Zhang

Computer vision-based action recognition of basketball players in training and competition has gradually become a research hotspot. However, owing to complex technical actions, diverse backgrounds, and limb occlusion, it remains a challenging task without effective solutions or public dataset benchmarks. In this study, we defined 32 kinds of atomic actions covering most of the complex actions of basketball players and built the NPU RGB+D dataset (a large-scale basketball action recognition dataset with RGB image data and depth data captured at Northwestern Polytechnical University) for 12 kinds of actions of 10 professional basketball players, comprising 2,169 RGB+D videos and 75,000 frames, including RGB frame sequences, depth maps, and skeleton coordinates. By extracting spatial features from the distances and angles between the joint points of basketball players, we created a new feature-enhanced skeleton-based method for basketball player action recognition, called LSTM-DGCN, based on the deep graph convolutional network (DGCN) and long short-term memory (LSTM) methods. Many advanced action recognition methods were evaluated on our dataset and compared with our proposed method. The experimental results show that the NPU RGB+D dataset is a competitive benchmark for current action recognition algorithms and that our LSTM-DGCN outperforms state-of-the-art action recognition methods on various evaluation criteria on our dataset. Our action taxonomy and the NPU RGB+D dataset are valuable resources for basketball player action recognition research. The feature-enhanced LSTM-DGCN improves the motion expressiveness of the skeleton data and thereby yields more accurate action recognition.
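The distance and angle features that feed LSTM-DGCN can be pictured with a small NumPy sketch over one frame of skeleton coordinates; the joint indices and the choice of angles are assumptions, not the paper's exact feature definition.

```python
import numpy as np

def joint_geometry_features(skeleton):
    """Pairwise joint distances plus an example joint angle from one
    frame of 3D skeleton data. `skeleton` is an (n_joints, 3) array."""
    diffs = skeleton[:, None, :] - skeleton[None, :, :]
    distances = np.linalg.norm(diffs, axis=-1)       # (n_joints, n_joints)

    def angle(a, b, c):
        """Angle at joint b formed by segments b->a and b->c (radians)."""
        u, v = skeleton[a] - skeleton[b], skeleton[c] - skeleton[b]
        cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
        return np.arccos(np.clip(cos, -1.0, 1.0))

    right_elbow = angle(4, 5, 6)   # shoulder -> elbow -> wrist (assumed indices)
    iu = np.triu_indices_from(distances, k=1)        # unique joint pairs
    return np.concatenate([distances[iu], [right_elbow]])
```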


Information ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 3
Author(s):  
Shuang Chen ◽  
Zengcai Wang ◽  
Wenxin Chen

The effective detection of driver drowsiness is an important measure for preventing traffic accidents. Most existing drowsiness detection methods use only a single facial feature to identify fatigue status, ignoring both the complex correlations between fatigue features and their temporal dynamics, which reduces recognition accuracy. To solve these problems, we propose a driver sleepiness estimation model based on factorized bilinear feature fusion and a long short-term recurrent convolutional network to detect driver drowsiness efficiently and accurately. The proposed framework includes three modules: fatigue feature extraction, fatigue feature fusion, and driver drowsiness detection. First, we use a convolutional neural network (CNN) to extract deep representations of eye- and mouth-related fatigue features from the face region detected in each video frame. Then, based on the factorized bilinear feature fusion model, we perform a nonlinear fusion of the deep feature representations of the eyes and mouth. Finally, we feed the sequence of fused frame-level features into a long short-term memory (LSTM) unit to capture the temporal information of the features and use a softmax classifier to detect sleepiness. The proposed framework was evaluated on the National Tsing Hua University drowsy driver detection (NTHU-DDD) video dataset. The experimental results show that this method has better stability and robustness than competing methods.
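Factorized bilinear pooling approximates a full bilinear (outer-product) interaction between two feature vectors with low-rank projections; the PyTorch sketch below follows the common MFB-style formulation, with all dimensions chosen for illustration rather than taken from the paper.

```python
import torch
import torch.nn as nn

class FactorizedBilinearFusion(nn.Module):
    """MFB-style factorized bilinear pooling of eye and mouth features
    (illustrative dimensions, not the paper's configuration)."""
    def __init__(self, dim_eye=256, dim_mouth=256, out_dim=512, k=4):
        super().__init__()
        self.k = k
        self.proj_eye = nn.Linear(dim_eye, out_dim * k)
        self.proj_mouth = nn.Linear(dim_mouth, out_dim * k)

    def forward(self, eye, mouth):
        joint = self.proj_eye(eye) * self.proj_mouth(mouth)    # elementwise product
        joint = joint.view(joint.size(0), -1, self.k).sum(-1)  # sum-pool over rank k
        # signed square root + L2 normalization, standard for bilinear pooling
        joint = torch.sign(joint) * torch.sqrt(torch.abs(joint) + 1e-8)
        return nn.functional.normalize(joint, dim=-1)
```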


Author(s):  
Sophia Bano ◽  
Francisco Vasconcelos ◽  
Emmanuel Vander Poorten ◽  
Tom Vercauteren ◽  
Sebastien Ourselin ◽  
...  

Purpose Fetoscopic laser photocoagulation is a minimally invasive surgery for the treatment of twin-to-twin transfusion syndrome (TTTS). Using a lens/fibre-optic scope inserted into the amniotic cavity, the abnormal placental vascular anastomoses are identified and ablated to regulate blood flow to both fetuses. A limited field of view, occlusions due to the fetus, and low visibility make it difficult to identify all vascular anastomoses. Automatic computer-assisted techniques may provide a better understanding of the anatomical structure during surgery for risk-free laser photocoagulation and may facilitate improved mosaicking of fetoscopic videos. Methods We propose FetNet, a combined convolutional neural network (CNN) and long short-term memory (LSTM) recurrent neural network architecture for the spatio-temporal identification of fetoscopic events. We adapted an existing CNN architecture for spatial feature extraction and integrated it with the LSTM network for end-to-end spatio-temporal inference. We introduced differential learning rates during model training to utilise the pre-trained CNN weights effectively. This may support computer-assisted interventions (CAI) during fetoscopic laser photocoagulation. Results We performed a quantitative evaluation of our method using 7 in vivo fetoscopic videos captured from different human TTTS cases, with a total duration of 5551 s (138,780 frames). To test the robustness of the proposed approach, we performed 7-fold cross-validation in which each video was treated as the hold-out test set and training was performed on the remaining videos. Conclusion FetNet achieved superior performance compared with existing CNN-based methods and provided improved inference because of its spatio-temporal modelling. Online testing of FetNet on a Tesla V100-DGXS-32GB GPU achieved a frame rate of 114 fps. These results show that our method could potentially provide a real-time solution for CAI, automating occlusion and photocoagulation identification during fetoscopic procedures.
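Differential learning rates of the kind described are commonly implemented with optimizer parameter groups; a minimal PyTorch sketch, where the module names, layer sizes, and rate values are assumptions rather than FetNet's actual configuration:

```python
import torch
import torch.nn as nn

# Minimal stand-in for a FetNet-like model: a pre-trained CNN backbone
# feeding an LSTM head (module names and sizes are hypothetical).
class CNNLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn_backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.lstm_head = nn.LSTM(16, 64, batch_first=True)

model = CNNLSTM()

# Differential learning rates: fine-tune the pre-trained backbone gently
# while training the randomly initialized recurrent head faster.
optimizer = torch.optim.Adam([
    {"params": model.cnn_backbone.parameters(), "lr": 1e-5},
    {"params": model.lstm_head.parameters(),    "lr": 1e-3},
])
```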


1982 ◽  
Vol 54 (3_suppl) ◽  
pp. 1299-1302 ◽  
Author(s):  
Douglas Cellar ◽  
Gerald V. Barrett ◽  
Ralph Alexander ◽  
Dennis Doverspike ◽  
Jay C. Thomas ◽  
...  

To obtain a more precise understanding of the constructs underlying complex monitoring, measures of short-term memory and visual search were administered to 7 male and 13 female college students. The hypothesis was that faster short-term memory and visual search would be related to successful monitoring. A correlational analysis indicated that choice reaction time was related to performance (r = –.38 and –.43) while the rate of serial comparisons was not (r = –.08 and –.28). It was concluded that information-processing measures enhanced the understanding of the underlying processes in monitoring beyond that provided by traditional cognitive tests.


2018 ◽  
Vol 10 (11) ◽  
pp. 1827 ◽  
Author(s):  
Ahram Song ◽  
Jaewan Choi ◽  
Youkyung Han ◽  
Yongil Kim

Hyperspectral change detection (CD) can be performed effectively using deep-learning networks. However, these approaches require qualified training samples, ground-truth data are difficult to obtain in the real world, and structural limitations make it difficult to preserve spatial information during training. To solve these problems, our study proposes a novel CD method for hyperspectral images (HSIs), comprising sample generation and a deep-learning network called the recurrent three-dimensional (3D) fully convolutional network (Re3FCN), which merges the advantages of a 3D fully convolutional network (FCN) and a convolutional long short-term memory (ConvLSTM). Principal component analysis (PCA) and the spectral correlation angle (SCA) were used to generate training samples with high probabilities of being changed or unchanged, a strategy that allows training on fewer but more representative samples. The Re3FCN mainly comprises spectral–spatial and temporal modules. In particular, the spectral–spatial module with 3D convolutional layers extracts spectral and spatial features from the HSIs simultaneously, whilst the temporal module with ConvLSTM records and analyzes the multi-temporal HSI change information. The study first proposes a simple and effective method for generating training samples, which can be applied effectively to cases with no labeled data. Re3FCN can perform end-to-end detection of binary and multiple changes, and it can receive multi-temporal HSIs directly as input without separately learning the characteristics of multiple changes. Finally, the network extracts joint spectral–spatial–temporal features and preserves the spatial structure during learning through its fully convolutional structure. This study is the first to use a 3D FCN and a ConvLSTM for remote-sensing CD. To demonstrate the effectiveness of the proposed method, we performed binary and multi-class CD experiments. The results revealed that Re3FCN outperformed conventional methods such as change vector analysis, iteratively reweighted multivariate alteration detection, PCA-SCA, FCN, and the combination of 2D convolutional layers and fully connected LSTM.
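For intuition, the sample-generation step can be pictured with the spectral correlation angle alone; the sketch below uses the common arccos((ρ + 1)/2) formulation over the Pearson correlation ρ of two pixel spectra, with random data standing in for real bands, and should be read as an illustration rather than the paper's exact pipeline.

```python
import numpy as np

def spectral_correlation_angle(s1, s2):
    """SCA between two pixel spectra: small angles indicate spectrally
    similar (likely unchanged) pixels, large angles likely changed ones."""
    rho = np.corrcoef(s1, s2)[0, 1]           # Pearson correlation of the spectra
    return np.arccos((rho + 1.0) / 2.0)

# Hypothetical use for sample generation: pixel pairs with angles far
# below/above chosen thresholds become confident unchanged/changed samples.
t1 = np.random.rand(200)                      # pixel spectrum at time 1 (200 bands)
t2 = t1 + 0.01 * np.random.randn(200)         # nearly identical at time 2
print(spectral_correlation_angle(t1, t2))     # close to arccos(1) = 0
```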

