Unsupervised Temporal Consistency Metric for Video Segmentation in Highly-Automated Driving

Author(s):  
Serin Varghese ◽  
Yasin Bayzidi ◽  
Andreas Bär ◽  
Nikhil Kapoor ◽  
Sounak Lahiri ◽  
...  
2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jordan Navarro ◽  
Otto Lappi ◽  
François Osiurak ◽  
Emma Hernout ◽  
Catherine Gabaude ◽  
...  

Abstract: Active visual scanning of the scene is a key task element in all forms of human locomotion. In the field of driving, steering (lateral control) and speed adjustment (longitudinal control) models are largely based on drivers’ visual inputs. Despite the knowledge gained on gaze behaviour behind the wheel, our understanding of the sequential aspects of the gaze strategies that actively sample that input remains limited. Here, we apply scan path analysis to investigate sequences of visual scanning in manual and highly automated simulated driving. Five stereotypical visual sequences were identified under manual driving: forward polling (i.e. far road explorations), guidance, backwards polling (i.e. near road explorations), scenery, and speed monitoring scan paths. Previously undocumented backwards polling scan paths were the most frequent. Under highly automated driving, the relative frequency of backwards polling scan paths decreased, the relative frequency of guidance scan paths increased, and scan paths specific to automation supervision appeared. The results shed new light on the gaze patterns engaged while driving. Methodological and empirical questions for future studies are discussed.
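The core comparison in the abstract — relative frequencies of classified scan-path categories under different driving conditions — can be sketched in a few lines. The category names are taken from the abstract; the function, its input format (one category label per detected gaze sequence), and any counts fed to it are illustrative assumptions, not the authors' analysis pipeline.

```python
from collections import Counter

# The five scan-path categories identified under manual driving (per the abstract).
CATEGORIES = ["forward_polling", "guidance", "backwards_polling",
              "scenery", "speed_monitoring"]

def relative_frequencies(sequences):
    """Relative frequency of each scan-path category in a list of
    classified gaze sequences (hypothetical input format)."""
    counts = Counter(sequences)
    total = len(sequences)
    return {c: counts.get(c, 0) / total for c in CATEGORIES}
```

Comparing the dictionaries returned for manual versus highly automated sessions would then expose shifts such as the reported decrease in backwards polling and increase in guidance scan paths.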


2020 ◽  
Vol 34 (07) ◽  
pp. 10713-10720
Author(s):  
Mingyu Ding ◽  
Zhe Wang ◽  
Bolei Zhou ◽  
Jianping Shi ◽  
Zhiwu Lu ◽  
...  

A major challenge for video semantic segmentation is the lack of labeled data. In most benchmark datasets, only one frame of each video clip is annotated, so most supervised methods fail to utilize information from the remaining frames. To exploit the spatio-temporal information in videos, many previous works use pre-computed optical flow, which encodes temporal consistency to improve video segmentation. However, video segmentation and optical flow estimation are still treated as two separate tasks. In this paper, we propose a novel framework for joint video semantic segmentation and optical flow estimation. Semantic segmentation contributes semantic information for handling occlusion, yielding more robust optical flow estimation, while the non-occluded optical flow provides accurate pixel-level temporal correspondences that guarantee the temporal consistency of the segmentation. Moreover, our framework can utilize both labeled and unlabeled frames in a video through joint training, while requiring no additional computation at inference time. Extensive experiments show that the proposed model allows video semantic segmentation and optical flow estimation to benefit from each other, and that it outperforms existing methods under the same settings on both tasks.
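The mechanism described above — using non-occluded optical flow to establish pixel-level temporal correspondences and check label agreement across frames — can be illustrated with a minimal NumPy sketch. This also matches the idea behind an unsupervised temporal consistency metric as in the first entry's title. Everything here is an illustrative assumption: the backward-flow convention, nearest-neighbour sampling, and the function names are not taken from either paper.

```python
import numpy as np

def warp_labels(prev_labels, flow):
    """Warp a previous frame's label map to the current frame using
    backward optical flow, where flow[y, x] = (dx, dy) points from the
    current pixel to its correspondence in the previous frame.
    Nearest-neighbour sampling; an illustrative convention, not the paper's."""
    h, w = prev_labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    px = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    py = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev_labels[py, px]

def temporal_consistency(prev_labels, curr_labels, flow, valid=None):
    """Fraction of pixels whose predicted class agrees with the flow-warped
    previous prediction -- a simple unsupervised consistency score.
    `valid` can mask out occluded pixels, where flow correspondences
    are unreliable."""
    warped = warp_labels(prev_labels, flow)
    agree = warped == curr_labels
    if valid is not None:
        agree = agree[valid]
    return float(agree.mean())
```

A score of 1.0 means every (non-occluded) pixel keeps its class along the flow correspondences; flickering predictions push the score down, which is why such agreement terms are commonly used as unsupervised consistency losses or metrics.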


Author(s):  
Natasha Merat ◽  
A. Hamish Jamson ◽  
Frank C. H. Lai ◽  
Oliver Carsten

Author(s):  
Fabienne Roche ◽  
Anna Somieski ◽  
Stefan Brandenburg

Objective: We investigated drivers’ behavior and subjective experience when repeatedly taking over control of their vehicles, depending on the design of the takeover request (TOR) and the modality of the nondriving-related task (NDRT). Background: Previous research has shown that taking over vehicle control after highly automated driving poses several problems for drivers. There is evidence that TOR design and NDRT modality may influence takeover behavior and that driver behavior changes with experience. Method: Forty participants were requested to resume control of their simulated vehicle six times. The TOR design (auditory or visual-auditory) and the NDRT modality (auditory or visual) were varied. Drivers’ takeover behavior, gaze patterns, and subjective workload were recorded and analyzed. Results: Results suggest that drivers adapt their behavior with repeated experience of takeover situations. An auditory TOR leads to safer takeover behavior than a visual-auditory TOR, and with an auditory TOR, takeover behavior improves with experience. Engaging in the visually demanding NDRT leads to fewer gazes on the road than engaging in the auditory NDRT. Participants’ fixation duration on the road decreased over the three takeovers with the visually demanding NDRT. Conclusions: The results imply that (a) drivers adapt their behavior to repeated takeovers, (b) auditory TOR designs might be preferable to visual-auditory TOR designs, and (c) auditorily demanding NDRTs allow drivers to focus more on the driving scene. Application: The results of the present study can be used to design TORs and to determine which NDRTs should be allowed in highly automated driving.

