video recognition
Recently Published Documents


Total documents: 144 (five years: 89)
H-index: 11 (five years: 5)

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yanghong Liu ◽  
Jintao Liu

This paper uses a three-dimensional anisotropic diffusion equation to study students' concentration through video recognition in English teaching classrooms. A multifeature-fusion face liveness detection method based on the diffusion model extracts Diffusion Kernel (DK) features and deep features from diffusion-processed face images. DK features give a nonlinear description of the correlation between successive face images and express face image sequences in the temporal dimension. Deep features are extracted by a pretrained deep neural network, which can express the complex nonlinear mapping relationships of images and reflect the more abstract, implicit information in face images. To make the face image features more effective, the extracted DK and deep features are fused with a multiple kernel learning method to obtain the best combination and the corresponding weights. The two features complement each other, and the fused features are more discriminative, which provides a strong basis for deciding whether a face image is live. Experiments show that the method performs well, effectively discriminating live faces in images and resisting forged-face attacks. Building on these face detection and expression recognition algorithms, a classroom concentration analysis system is designed. It acquires and processes classroom images in real time, records student attendance using face detection and face recognition, and analyzes concentration from the face integrity and facial expression of students facing the blackboard. It then visualizes the classroom data to give teachers, students, and parents more data support.
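To illustrate the kind of edge-preserving smoothing that anisotropic diffusion performs, here is a minimal sketch of an explicit Perona-Malik scheme. It is a one-dimensional simplification chosen for readability, not the three-dimensional equation used in the paper, and the function name, the exponential conductance, and the parameters `kappa` and `step` are all assumptions for illustration.

```python
import math

def perona_malik_1d(signal, n_iter=20, kappa=10.0, step=0.2):
    """Explicit Perona-Malik diffusion on a 1D signal (illustrative sketch only)."""
    u = [float(v) for v in signal]
    for _ in range(n_iter):
        new_u = u[:]
        for i in range(len(u)):
            # Differences to the right and left neighbours (zero flux at the ends).
            d_right = u[i + 1] - u[i] if i + 1 < len(u) else 0.0
            d_left = u[i - 1] - u[i] if i > 0 else 0.0
            # Conductance drops toward 0 across large differences, so edges survive.
            c_right = math.exp(-(d_right / kappa) ** 2)
            c_left = math.exp(-(d_left / kappa) ** 2)
            new_u[i] = u[i] + step * (c_right * d_right + c_left * d_left)
        u = new_u
    return u

# Hypothetical data: a noisy step with a sharp edge between index 4 and 5.
noisy_step = [0, 1, -1, 2, 0, 100, 101, 99, 100, 102]
smoothed = perona_malik_1d(noisy_step)
```

In this toy signal the small fluctuations on each side of the step are smoothed, while the jump itself is left almost untouched because the conductance across it is close to zero.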


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Zhiyuan Wang ◽  
Chongyuan Bi ◽  
Songhui You ◽  
Junjie Yao

In this paper, we study sports video recognition using an improved hidden Markov model. The feature module is a complex gesture recognition module based on hidden Markov model gesture features: it applies those features to gesture recognition and, building on simple gesture recognition, recognizes complex gestures composed of simple ones. The combination of the two modules forms the overall technique of this paper, which can be applied to many scenarios, including special high-security scenarios that require real-time feedback and public indoor scenarios, and can provide different prevention measures and services for different age groups. Deepening the feature extraction network improves the experimental results; however, a two-dimensional convolutional neural network loses temporal information when extracting features, so this paper uses a three-dimensional convolutional network to extract features from the video in both time and space. Multiple binary classifications are performed on the extracted features to achieve multilabel classification. A multistream residual neural network extracts features from video data of three modalities, and the extracted feature vectors are fed into an attention network, which selects the information most critical for video recognition from the large amount of spatiotemporal information and further learns the temporal dependencies between consecutive video frames. The multistream network outputs are then fused to obtain the final predicted category. With the model trained and optimized end to end, recognition accuracies of 92.7% and 64.4% are achieved on the dataset, respectively.
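The idea of reaching multilabel classification through multiple binary classifications can be sketched as follows. This is a minimal illustration, assuming each label gets its own independent sigmoid score and a fixed threshold; the label names, the scores, and the function name are hypothetical and not taken from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def multilabel_predict(logits, labels, threshold=0.5):
    """Make one independent yes/no decision per label from its own score."""
    return [name for name, z in zip(labels, logits) if sigmoid(z) >= threshold]

# Hypothetical per-label scores, standing in for a network's output on one clip.
labels = ["running", "jumping", "throwing"]
logits = [2.1, -0.7, 0.3]
predicted = multilabel_predict(logits, labels)
print(predicted)
```

Because each label is thresholded on its own, a clip can receive zero, one, or several labels, unlike a softmax classifier, which ranks mutually exclusive classes against each other.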


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Yumeng Sun

As urban economic construction and urban planning develop, governments and communities face higher requirements for community management, community services, and related work. Video recognition technology is an important technical aid for accurate management and service by governments and communities. In video recognition, traditional algorithms based on partial differential equations destroy image edges and details. This paper therefore improves the traditional partial differential equation algorithm for image recognition: it selects the GAC model, which is based on image segmentation, as the main function and optimizes the stop function of its equation to improve the segmentation of community case images. In the image smoothing layer, the paper uses the second derivative from image processing as the inherent feature for image recognition, which addresses rough image edges and improves the algorithm's processing efficiency. To further preserve the details of community case images, the paper integrates a Gaussian curvature driving function into the improved partial differential equation algorithm, protecting details in the smooth regions of the video being recognized and overcoming the shortcomings of the traditional algorithm. Experimental results show that the improved algorithm raises video recognition accuracy by about 5% over the traditional algorithm while preserving the detail integrity of the recognized video.
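For orientation, one common textbook form of the level-set evolution is given below, assuming that GAC here refers to the geodesic active contour model. The abstract does not state the paper's optimized stop function, so the $g$ shown is only the conventional choice. The symbols are assumptions for illustration: $\phi$ is the level-set function, $I$ the image, $G_\sigma$ a Gaussian smoothing kernel, and $c$ a constant balloon force.

```latex
\frac{\partial \phi}{\partial t}
  = g(|\nabla I|)\,|\nabla \phi|\,\bigl(\kappa + c\bigr) + \nabla g \cdot \nabla \phi,
\qquad
\kappa = \operatorname{div}\!\left(\frac{\nabla \phi}{|\nabla \phi|}\right),
\qquad
g(|\nabla I|) = \frac{1}{1 + |\nabla (G_\sigma * I)|^{2}}
```

The stop function $g$ is close to 1 in flat regions and close to 0 where the smoothed image gradient is large, so the contour slows down and settles on edges.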


Author(s):  
Zuxuan Wu ◽  
Hengduo Li ◽  
Yingbin Zheng ◽  
Caiming Xiong ◽  
Yu-Gang Jiang ◽  
...  

Author(s):  
Dabbara Praveen

Intelligent video recognition based on deep learning can be used to build an automated video analytics program. CCTV cameras are used wherever safety is paramount, but monitoring them manually is tedious and time-consuming. Security threats take different forms in different contexts, such as identity theft, violence, and explosions. In this project, we analyse video feeds in real time and identify unusual events such as violence or theft. Deep learning simulates the way the human brain processes data for tasks such as detection, speech recognition, and decision making. It can learn without human guidance from unstructured and unlabelled data.


2021 ◽  
Author(s):  
Na Li ◽  
Kuangang Fan ◽  
Ouyang Qinghua ◽  
Yahui Liu

Author(s):  
Zeyuan Wang ◽  
Chaofeng Sha ◽  
Su Yang

We explore black-box adversarial attacks on video recognition models. Because a video's high dimensionality makes searching for adversarial perturbations computationally expensive, attacks are performed only on selected key regions and key frames. One way to select key frames is to use heuristic algorithms that evaluate the importance of each frame and choose the essential ones, but the sorting and searching involved are time-inefficient. To speed up the attack, we propose a reinforcement-learning-based frame selection strategy. Specifically, the agent explores the difference between the original class and the target class of videos to make selection decisions, and it receives rewards from threat models that indicate the quality of those decisions. We also use saliency detection to select key regions, and in zeroth-order optimization we estimate only the sign of the gradient rather than the gradient itself, which further accelerates the attack. The trained model can be used directly in the untargeted attack, or with a little fine-tuning in the targeted attack, which saves computation time. A range of empirical results on real datasets demonstrates the effectiveness and efficiency of the proposed method.
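The idea of estimating only the sign of the gradient from function queries can be sketched as below. This is a minimal illustration, assuming coordinate-wise central differences on a toy loss; the paper's actual estimator, step sizes, and threat models are not given in the abstract, and every name here is hypothetical.

```python
def estimate_gradient_sign(loss_fn, x, delta=1e-3):
    """Estimate sign(dL/dx_i) for each coordinate using only loss queries."""
    signs = []
    for i in range(len(x)):
        x_plus = x[:]
        x_plus[i] += delta
        x_minus = x[:]
        x_minus[i] -= delta
        diff = loss_fn(x_plus) - loss_fn(x_minus)
        signs.append((diff > 0) - (diff < 0))  # +1, 0, or -1
    return signs

def sign_descent(loss_fn, x, step=0.05, n_iter=50):
    """Move each coordinate a fixed step against the estimated gradient sign."""
    x = list(x)
    for _ in range(n_iter):
        signs = estimate_gradient_sign(loss_fn, x)
        x = [xi - step * si for xi, si in zip(x, signs)]
    return x

def toy_loss(x):
    # Hypothetical stand-in for a query-only model loss, minimised at (1, -2).
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

result = sign_descent(toy_loss, [0.0, 0.0])
```

Only the direction of each finite difference is used, so the update needs no gradient access and no accurate magnitude estimate. This per-coordinate version costs two queries per dimension and is shown only because it is easy to follow.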

