A Face Tracking Method in Videos Based on Convolutional Neural Networks

Author(s):  
Zihan Ren ◽  
Jianwei Li ◽  
Xiaoying Zhang ◽  
Shuangyuan Yang ◽  
Fuhao Zou

Face tracking in surveillance videos is an important problem in computer vision and has practical significance. This paper proposes a new face tracking framework for videos based on convolutional neural networks (CNNs) and the Kalman filter algorithm. The framework uses a rough-to-fine CNN to detect faces in each frame of the video. The rough-to-fine CNN method achieves higher accuracy in complex scenes involving face rotation, lighting changes, and occlusion. When face tracking fails due to severe occlusion or significant rotation, the framework uses the Kalman filter to predict the face position. Experimental results show that the proposed method has high precision and fast processing speed.
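The fallback described in the abstract (predicting the face position when detection fails) can be illustrated with a minimal sketch. The abstract does not give the state model or noise settings, so everything here is an assumption: a constant-velocity model per image coordinate, the noise values `q` and `r`, and the hypothetical names `AxisKalman` and `track`. The CNN detector is replaced by a list of face centers, with `None` marking frames where detection failed.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one image coordinate (assumed model)."""

    def __init__(self, pos, q=1e-2, r=1.0):
        self.pos, self.vel = float(pos), 0.0
        # Covariance entries [[p00, p01], [p10, p11]]; velocity starts uncertain.
        self.p00, self.p01, self.p10, self.p11 = 1.0, 0.0, 0.0, 10.0
        self.q, self.r = q, r  # process and measurement noise (assumed values)

    def predict(self, dt=1.0):
        # State: x = F x with F = [[1, dt], [0, 1]]
        self.pos += self.vel * dt
        # Covariance: P = F P F^T + Q with Q = diag(q, q)
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.pos

    def update(self, z):
        # Only the position is measured: H = [1, 0]
        s = self.p00 + self.r
        k0, k1 = self.p00 / s, self.p10 / s
        y = z - self.pos
        self.pos += k0 * y
        self.vel += k1 * y
        # Covariance: P = (I - K H) P
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


def track(detections):
    """detections: list of (x, y) face centers, or None where detection failed."""
    kx = ky = None
    out = []
    for det in detections:
        if kx is None:
            if det is None:
                out.append(None)  # nothing to predict from yet
                continue
            kx, ky = AxisKalman(det[0]), AxisKalman(det[1])
            out.append((kx.pos, ky.pos))
            continue
        px, py = kx.predict(), ky.predict()
        if det is not None:
            kx.update(det[0])
            ky.update(det[1])
            out.append((kx.pos, ky.pos))
        else:
            out.append((px, py))  # detection failed: fall back to the prediction
    return out


if __name__ == "__main__":
    frames = [(0, 0), (10, 5), (20, 10), (30, 15), None, (50, 25)]
    for i, p in enumerate(track(frames)):
        print(i, p)
```

In this sketch the filter runs on every frame, so a missed detection costs nothing extra: the predicted position simply stands in for the missing measurement until the detector recovers.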

2021 ◽  
pp. 147592172110537
Author(s):  
Dong H Kang ◽  
Young-Jin Cha

Recently, crack segmentation has been studied using deep convolutional neural networks. However, significant deficiencies remain in the preparation of ground truth data, consideration of complex scenes, development of an object-specific network for crack segmentation, and use of an evaluation method, among other issues. In this paper, a novel semantic transformer representation network (STRNet) is developed for pixel-level crack segmentation in complex scenes in real time. STRNet is composed of a squeeze-and-excitation attention-based encoder, a multi-head attention-based decoder, coarse upsampling, a focal-Tversky loss function, and a learnable swish activation function, which keep the network concise while maintaining its fast processing speed. A method for evaluating the level of complexity of image scenes is also proposed. The proposed network is trained with 1203 images with further extensive synthesis-based augmentation, and it is evaluated on 545 test images (1280 × 720, 1024 × 512); it achieves 91.7%, 92.7%, 92.2%, and 92.6% in terms of precision, recall, F1 score, and mIoU (mean intersection over union), respectively. Its performance is compared with that of recently developed advanced networks (Attention U-net, CrackSegNet, Deeplab V3+, FPHBN, and Unet++), with STRNet showing the best performance in the evaluation metrics; it also achieves the fastest processing, at 49.2 frames per second.
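The four reported metrics can be made concrete with a small sketch that computes them from a predicted and a ground-truth binary mask. This is not the paper's evaluation code: the function name `binary_metrics` is hypothetical, masks are assumed to be flat sequences of 0/1 labels (1 = crack), and mIoU is assumed to be the mean of the crack and background IoU, which the abstract does not specify.

```python
def binary_metrics(pred, truth):
    """Pixel-level metrics for flat 0/1 masks of equal length (1 = crack)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # IoU per class; mIoU averaged over crack and background (assumption).
    iou_crack = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    iou_background = tn / (tn + fp + fn) if tn + fp + fn else 0.0
    miou = (iou_crack + iou_background) / 2

    return {"precision": precision, "recall": recall, "f1": f1, "miou": miou}


if __name__ == "__main__":
    pred = [1, 1, 0, 0, 1, 0]
    truth = [1, 0, 0, 0, 1, 1]
    print(binary_metrics(pred, truth))
```

As a consistency check on the reported figures, the harmonic mean of 91.7% precision and 92.7% recall is about 92.2%, which matches the stated F1 score.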

