Reply to comment about The Video Classification of Intubation score: a new description tool for tracheal intubation using videolaryngoscopy
2022, Vol. 39 (2), pp. 183-184 (reply); 2021, Vol. 38 (3), pp. 324-326 (original article)
Author(s): Rajinder Singh Chaggar, Sneh Vinu Shah, Michael Berry, Rajan Saini, Sanooj Soni, ...


Author(s): Hehe Fan, Zhongwen Xu, Linchao Zhu, Chenggang Yan, Jianjun Ge, ...

We aim to significantly reduce the computational cost of classifying temporally untrimmed videos while retaining similar accuracy. Existing video classification methods sample frames at a predefined frequency over the entire video. In contrast, we propose an end-to-end deep reinforcement learning approach that enables an agent to classify a video by watching only a small portion of its frames, much as humans do. We make two main contributions. First, information is not distributed uniformly across video frames over time. An agent needs to watch more carefully when a clip is informative and skip frames that are redundant or irrelevant. The proposed approach enables the agent to adapt its sampling rate to the video content and skip most of the frames without loss of information. Second, the number of frames an agent must watch to reach a confident decision varies greatly from one video to another. We incorporate an adaptive stop network that measures a confidence score and generates a timely trigger to stop the agent from watching further, which improves efficiency without loss of accuracy. Our approach significantly reduces the computational cost on the large-scale YouTube-8M dataset while keeping accuracy unchanged.
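To make the idea concrete, the sketch below (not the authors' code) shows one way such an agent could be structured in Python with PyTorch: a recurrent state summarises the frames watched so far, a skip head chooses how far to jump ahead, and a stop head fires once the classification is confident enough. All module names, feature dimensions, and the stop threshold are illustrative assumptions.

```python
# Minimal sketch of adaptive frame sampling with an early-stop head.
# Sizes, names, and the stop threshold are assumptions, not the published model.
import torch
import torch.nn as nn

class AdaptiveWatcher(nn.Module):
    def __init__(self, feat_dim=2048, hidden=512, num_classes=400, max_skip=25):
        super().__init__()
        self.rnn = nn.GRUCell(feat_dim, hidden)            # running summary of watched frames
        self.classifier = nn.Linear(hidden, num_classes)   # current label belief
        self.skip_head = nn.Linear(hidden, max_skip)        # how many frames to jump ahead
        self.stop_head = nn.Linear(hidden, 1)               # confidence that we can stop now

    def forward(self, frame_feats, stop_threshold=0.9):
        """frame_feats: (T, feat_dim) pre-extracted per-frame features."""
        T = frame_feats.size(0)
        h = frame_feats.new_zeros(1, self.rnn.hidden_size)
        t, watched = 0, 0
        while t < T:
            h = self.rnn(frame_feats[t:t + 1], h)
            watched += 1
            logits = self.classifier(h)
            stop_p = torch.sigmoid(self.stop_head(h))
            if stop_p.item() > stop_threshold:               # confident enough: stop early
                break
            skip = self.skip_head(h).argmax(dim=-1).item() + 1
            t += skip                                        # adaptive sampling rate
        return logits, watched

# Usage: feats = torch.randn(300, 2048); logits, n_watched = AdaptiveWatcher()(feats)
```

Because the skip and stop decisions are discrete, in a full system they would typically be trained with a policy-gradient method such as REINFORCE alongside the classification loss; the sketch only shows the inference-time control flow.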


Sensors, 2020, Vol. 20 (24), pp. 7184
Author(s): Kunyoung Lee, Eui Chul Lee

Clinical studies have demonstrated that spontaneous and posed smiles differ spatiotemporally in facial muscle movements, such as laterally asymmetric movements, because they engage different facial muscles. In this study, a model was developed that classifies videos of the two smile types using a 3D convolutional neural network (CNN) within a Siamese architecture, with a neutral expression as the reference input. The proposed model makes the following contributions. First, it addresses the problem caused by differences in appearance between individuals, because it learns the spatiotemporal differences between an individual's neutral expression and their spontaneous or posed smiles. Second, using a neutral expression as an anchor improves model accuracy compared with the conventional method that uses genuine and impostor pairs. Third, using a neutral expression as the anchor image makes a fully automated classification system for spontaneous and posed smiles possible. In addition, visualizations were designed for the Siamese-architecture 3D CNN to analyze the accuracy improvement and to compare the proposed and conventional methods through feature analysis using principal component analysis (PCA).
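The sketch below is an assumed, minimal rendering of this idea rather than the published model: a shared 3D-CNN branch encodes both the neutral-expression clip (the anchor) and the smile clip, and the classifier operates on the difference between the two embeddings, so person-specific appearance largely cancels out. Layer sizes and clip dimensions are placeholders.

```python
# Minimal Siamese 3D-CNN sketch: neutral clip as anchor, smile clip as probe.
import torch
import torch.nn as nn

class Encoder3D(nn.Module):
    """Shared 3D-CNN branch; channel counts are illustrative."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.AdaptiveAvgPool3d(1),                       # (B, 32, 1, 1, 1)
        )

    def forward(self, clip):                               # clip: (B, 3, T, H, W)
        return self.net(clip).flatten(1)                   # (B, 32) embedding

class SiameseSmileNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = Encoder3D()                         # weights shared between branches
        self.head = nn.Linear(32, 2)                       # spontaneous vs posed

    def forward(self, neutral_clip, smile_clip):
        anchor = self.encoder(neutral_clip)                # neutral expression as reference
        smile = self.encoder(smile_clip)
        return self.head(smile - anchor)                   # classify the person-specific change

# Usage:
# logits = SiameseSmileNet()(torch.randn(2, 3, 16, 64, 64), torch.randn(2, 3, 16, 64, 64))
```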


Healthcare, 2021, Vol. 9 (11), pp. 1579
Author(s): Wansuk Choi, Seoyoon Heo

The purpose of this study was to classify upper limb tension test (ULTT) videos through transfer learning with pre-trained deep learning models and to compare the performance of the models. We conducted transfer learning by combining pre-trained convolutional neural network (CNN) models into a Python-based deep learning pipeline. Videos were collected from YouTube, and 103,116 frames converted from the video clips were analyzed. In the modeling implementation, the steps of importing the required modules, preprocessing the data for training, defining the model, compiling it, creating the model, and fitting it were applied in sequence. The compared models were Xception, InceptionV3, DenseNet201, NASNetMobile, DenseNet121, VGG16, VGG19, and ResNet101, and fine-tuning was performed. They were trained in a high-performance computing environment, and validation accuracy and validation loss were measured as comparative indicators of performance. Relatively low validation loss and high validation accuracy were obtained with the Xception, InceptionV3, and DenseNet201 models, indicating excellent performance compared with the other models. In contrast, VGG16, VGG19, and ResNet101 yielded relatively high validation loss and low validation accuracy. The differences between the validation accuracy and validation loss of the Xception, InceptionV3, and DenseNet201 models fell within a narrow range. This study suggests that training with transfer learning can classify ULTT videos, and that performance differs between models.
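The sequence described above (import modules, preprocess, define, compile, fit, then fine-tune) maps directly onto the Keras workflow. The sketch below illustrates it with Xception, one of the compared backbones; the dataset paths, image size, class count and names, and training hyperparameters are assumptions for illustration only.

```python
# Minimal transfer-learning sketch with a Keras application model (Xception shown).
# Paths, class count, and hyperparameters are placeholders, not the study's settings.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import Xception

NUM_CLASSES = 4            # e.g. ULTT1, ULTT2a, ULTT2b, ULTT3 (assumed class labels)
IMG_SIZE = (299, 299)

# Frames extracted from the video clips, arranged in class-named subfolders.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "frames/train", image_size=IMG_SIZE, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "frames/val", image_size=IMG_SIZE, batch_size=32)

base = Xception(weights="imagenet", include_top=False, input_shape=IMG_SIZE + (3,))
base.trainable = False                                     # freeze pre-trained backbone first

model = models.Sequential([
    layers.Rescaling(1.0 / 127.5, offset=-1),              # Xception expects inputs in [-1, 1]
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=10)

# Fine-tuning pass: unfreeze the backbone and retrain with a small learning rate.
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)
```

Swapping in InceptionV3, DenseNet201, VGG16, and the other compared backbones changes only the imported application model and its expected input size and preprocessing; the rest of the workflow stays the same.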


2017, Vol. 10 (2), pp. 413-416
Author(s): H. B. Basanth

Digital images are widespread today and can be divided into natural images and computer graphic (CG) images. Discriminating natural images from CG images is useful in applications that include flower classification, image indexing, video classification, and many more. With the rapid growth of image-rendering technology, users can produce highly realistic CG images using sophisticated graphics software packages. Because of this high realism, it is very difficult to distinguish CG images from natural images with the naked eye. This paper presents a comparative study of the existing schemes used to classify digital images.

