A Crowd Sensing Approach to Video Classification of Traffic Accident Hotspots

Author(s):  
Bernhard Gahr ◽  
Benjamin Ryder ◽  
André Dahlinger ◽  
Felix Wortmann


Author(s):  
Hehe Fan ◽  
Zhongwen Xu ◽  
Linchao Zhu ◽  
Chenggang Yan ◽  
Jianjun Ge ◽  
...  

We aim to significantly reduce the computational cost of classifying temporally untrimmed videos while retaining similar accuracy. Existing video classification methods sample frames at a predefined frequency over the entire video. In contrast, we propose an end-to-end deep reinforcement learning approach that enables an agent to classify videos by watching only a very small portion of the frames, much as humans do. We make two main contributions. First, information is not evenly distributed across video frames over time. An agent needs to watch more carefully when a clip is informative and skip frames that are redundant or irrelevant. The proposed approach enables the agent to adapt its sampling rate to the video content and skip most frames without loss of information. Second, the number of frames an agent should watch to reach a confident decision varies greatly from one video to another. We incorporate an adaptive stop network that measures a confidence score and generates a timely trigger to stop the agent from watching further, which improves efficiency without loss of accuracy. Our approach significantly reduces the computational cost on the large-scale YouTube-8M dataset, while the accuracy remains the same.
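As a rough illustration of the mechanism this abstract describes, the following PyTorch sketch shows an agent loop with a learned skip head (adaptive sampling rate) and a stop head (adaptive stop trigger). All module names, layer sizes, and thresholds are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class AdaptiveWatcher(nn.Module):
    """Sketch of an agent that classifies a video while watching only a
    small portion of frames: at each step it updates a recurrent state,
    predicts class logits, how many frames to skip next, and a stop
    confidence. Sizes and names are illustrative assumptions."""
    def __init__(self, feat_dim=2048, hidden=512, n_classes=10, max_skip=25):
        super().__init__()
        self.rnn = nn.GRUCell(feat_dim, hidden)       # running video state
        self.classifier = nn.Linear(hidden, n_classes)
        self.skip_head = nn.Linear(hidden, max_skip)  # adaptive sampling rate
        self.stop_head = nn.Linear(hidden, 1)         # adaptive stop trigger

    def forward(self, frame_feats, stop_threshold=0.9):
        # frame_feats: (num_frames, feat_dim); assumes at least one frame
        h = frame_feats.new_zeros(1, self.rnn.hidden_size)
        t, watched = 0, 0
        while t < frame_feats.shape[0]:
            h = self.rnn(frame_feats[t:t + 1], h)
            watched += 1
            logits = self.classifier(h)
            # stop early once the confidence score is high enough
            if torch.sigmoid(self.stop_head(h)).item() > stop_threshold:
                break
            # skip ahead past frames the agent deems redundant
            t += 1 + self.skip_head(h).argmax().item()
        return logits, watched
```

In the paper's setting the skip and stop decisions are trained with reinforcement learning; the greedy inference loop above is only meant to show how adaptive skipping and early stopping reduce the number of frames watched.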


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7184
Author(s):  
Kunyoung Lee ◽  
Eui Chul Lee

Clinical studies have demonstrated that spontaneous and posed smiles differ spatiotemporally in facial muscle movements, such as laterally asymmetric movements that use different facial muscles. In this study, a model was developed that classifies videos of the two types of smiles using a 3D convolutional neural network (CNN) in a Siamese architecture, with a neutral expression as the reference input. The proposed model makes the following contributions. First, it solves the problem caused by differences in appearance between individuals, because it learns the spatiotemporal differences between an individual's neutral expression and their spontaneous and posed smiles. Second, using a neutral expression as an anchor improves model accuracy compared with the conventional method using genuine and impostor pairs. Third, using a neutral expression as an anchor image makes it possible to build a fully automated classification system for spontaneous and posed smiles. In addition, visualizations were designed for the Siamese architecture-based 3D CNN to analyze the accuracy improvement and to compare the proposed and conventional methods through feature analysis using principal component analysis (PCA).
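A minimal PyTorch sketch of the idea follows, assuming a shared 3D-CNN encoder applied to both the neutral-expression anchor clip and the smile clip, with the classifier operating on the difference of the two embeddings; the layer sizes are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Siamese3DCNN(nn.Module):
    """Sketch of a Siamese 3D CNN smile classifier: one encoder with
    shared weights embeds both clips, and the head classifies the
    difference, removing the person-specific appearance baseline."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(          # shared weights (Siamese)
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())
        self.head = nn.Linear(32, n_classes)   # spontaneous vs. posed

    def forward(self, neutral_clip, smile_clip):
        # clips: (batch, channels, frames, height, width)
        anchor = self.encoder(neutral_clip)
        smile = self.encoder(smile_clip)
        return self.head(smile - anchor)
```

Subtracting the anchor embedding is one simple way to condition on the neutral expression; the paper may combine the two streams differently.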


Healthcare ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 1579
Author(s):  
Wansuk Choi ◽  
Seoyoon Heo

The purpose of this study was to classify upper limb tension test (ULTT) videos through transfer learning with pre-trained deep learning models and to compare the performance of the models. We conducted transfer learning by combining a pre-trained convolutional neural network (CNN) model into a Python-based deep learning pipeline. Videos were sourced from YouTube, and 103,116 frames converted from the video clips were analyzed. In the modeling implementation, the steps of importing the required modules, preprocessing the data for training, defining the model, compiling it, creating the model, and fitting it were applied in sequence; a sketch of this pipeline follows the abstract. The compared models were Xception, InceptionV3, DenseNet201, NASNetMobile, DenseNet121, VGG16, VGG19, and ResNet101, and fine-tuning was performed. They were trained in a high-performance computing environment, and validation accuracy and validation loss were measured as comparative indicators of performance. Relatively low validation loss and high validation accuracy were obtained from the Xception, InceptionV3, and DenseNet201 models, indicating excellent performance compared with the other models. In contrast, VGG16, VGG19, and ResNet101 yielded relatively high validation loss and low validation accuracy. The difference between validation accuracy and validation loss was narrow for the Xception, InceptionV3, and DenseNet201 models. This study suggests that training with transfer learning can classify ULTT videos, and that there are performance differences between models.
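The modeling sequence described above (import modules, preprocess, define, compile, create, fit, then fine-tune) maps naturally onto a Keras transfer-learning pipeline. A minimal sketch with Xception is below; the frame directories and hyperparameters are hypothetical.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications.xception import Xception, preprocess_input

# hypothetical folders of extracted video frames, one subfolder per class
train_ds = tf.keras.utils.image_dataset_from_directory(
    "frames/train", image_size=(299, 299), batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "frames/val", image_size=(299, 299), batch_size=32)

# pre-trained base with the ImageNet classification top removed
base = Xception(weights="imagenet", include_top=False, pooling="avg",
                input_shape=(299, 299, 3))
base.trainable = False  # freeze pre-trained weights for the first stage

inputs = layers.Input(shape=(299, 299, 3))
x = preprocess_input(inputs)
x = base(x, training=False)
outputs = layers.Dense(len(train_ds.class_names), activation="softmax")(x)
model = models.Model(inputs, outputs)

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)

# fine-tuning: unfreeze the base and retrain with a small learning rate
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=3)
```

Swapping `Xception` for `InceptionV3`, `DenseNet201`, and so on (each with its own input size and `preprocess_input`) reproduces the comparison setup described in the study.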


2021 ◽  
Vol 18 (1) ◽  
pp. 102-120
Author(s):  
Aprilia Lutviana Dewi ◽  
Budyanra Budyanra

Traffic accidents among students are one of the problems experienced in the Greater Jakarta area. The World Health Organization (WHO) has stated that younger drivers, including students, are the group most vulnerable to traffic accidents. According to Badan Pusat Statistik (BPS), an estimated 301,120 Jabodetabek commuters experienced a traffic accident in 2019. Moreover, 13 to 14 of every 100 commuters who experienced a traffic accident were student commuters, that is, commuters whose main activity is attending school. Therefore, this study was conducted to determine the factors that affect the accident status of Jabodetabek student commuters in 2019 and their odds ratios, using the 2019 Jabodetabek Commuter Survey data. The analytical method is binary logistic regression with parameters estimated by penalized maximum likelihood estimation (PMLE). The results showed that age, gender, last education, mode of transportation, classification of the area of residence, distance traveled, and area of activity had a significant influence on the accident status of Jabodetabek student commuters. Furthermore, student commuters who live in rural areas have the highest tendency to experience a traffic accident.
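The PMLE step here plausibly refers to Firth's penalized likelihood, a common remedy when logistic regression faces rare events or separation; assuming that, the sketch below implements the standard Firth-modified Newton iteration in NumPy, with hypothetical toy data.

```python
import numpy as np

def firth_logit(X, y, n_iter=100, tol=1e-8):
    """Firth-penalized MLE for binary logistic regression.
    X: (n, p) design matrix with an intercept column; y: (n,) in {0, 1}."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))     # fitted probabilities
        W = mu * (1.0 - mu)                        # IRLS weights
        info_inv = np.linalg.inv(X.T @ (W[:, None] * X))
        # leverages: diag of W^1/2 X (X'WX)^-1 X' W^1/2
        h = np.einsum('ij,jk,ik->i', X, info_inv, X) * W
        # Firth correction adds h * (1/2 - mu) to the score residuals
        step = info_inv @ (X.T @ (y - mu + h * (0.5 - mu)))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# toy usage with one hypothetical binary predictor (e.g., rural residence)
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=500)
y = rng.binomial(1, np.where(x == 1, 0.25, 0.14))
beta = firth_logit(np.column_stack([np.ones(500), x]), y)
print("estimated odds ratio for rural residence:", np.exp(beta[1]))
```

Exponentiating a fitted coefficient gives the odds ratio of the kind reported in the study.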

