Development of 3D convolutional neural network to recognize human activities using moderate computation machine

2021 ◽  
Vol 10 (6) ◽  
pp. 3137-3146
Author(s):  
Malik A. Alsaedi ◽  
Abdulrahman Saeed Mohialdeen ◽  
Baraa Munqith Albaker

Human activity recognition (HAR) has recently been used in numerous applications, including smart homes, to monitor human behavior, automate homes according to human activities, and support entertainment, fall detection, violence detection, and people care. Vision-based recognition is the most powerful method widely used in HAR system implementations due to its ability to recognize complex human activities. This paper addresses the design of a 3D convolutional neural network (3D-CNN) model that can be used in smart homes to identify several activities. The model is trained on the KTH dataset, which contains activities such as walking, running, jogging, handwaving, handclapping, and boxing. Despite the challenges posed by illumination effects, background variation, and variety in human bodies, the proposed model reached an accuracy of 93.33%. The model was implemented, trained, and tested on a moderate-computation machine, and the results show that the proposed model successfully recognizes human activities with reasonable computational cost.
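A minimal 3D-CNN sketch of this kind of clip classifier is shown below, assuming 16-frame grayscale clips at 64×64 resolution and the six KTH classes; the layer sizes are illustrative, not the authors' exact architecture.

```python
# Minimal 3D-CNN sketch for clip classification (illustrative sizes, not the
# paper's exact model). Input: batch of 16-frame, 64x64 grayscale clips.
import torch
import torch.nn as nn

class HAR3DCNN(nn.Module):
    def __init__(self, num_classes=6):  # six KTH activity classes
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=3, padding=1),   # (N, 32, 16, 64, 64)
            nn.ReLU(),
            nn.MaxPool3d(2),                              # (N, 32, 8, 32, 32)
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(2),                              # (N, 64, 4, 16, 16)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 16 * 16, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, clips):
        return self.classifier(self.features(clips))

model = HAR3DCNN()
dummy_clip = torch.randn(2, 1, 16, 64, 64)  # batch of two 16-frame clips
print(model(dummy_clip).shape)              # torch.Size([2, 6])
```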

Sensors ◽  
2019 ◽  
Vol 19 (11) ◽  
pp. 2472 ◽  
Author(s):  
Fath U Min Ullah ◽  
Amin Ullah ◽  
Khan Muhammad ◽  
Ijaz Ul Haq ◽  
Sung Wook Baik

The worldwide deployment of surveillance cameras in smart cities has enabled researchers to analyze a gigantic volume of data for automatic monitoring. An enhanced security system in smart cities, schools, hospitals, and other surveillance domains is mandatory for detecting violent or abnormal activities and avoiding casualties that could cause social, economic, and ecological damage. Automatic detection of violence enabling quick action is very significant and can efficiently assist the concerned departments. In this paper, we propose a triple-staged end-to-end deep learning violence detection framework. First, persons are detected in the surveillance video stream using a lightweight convolutional neural network (CNN) model to avoid the voluminous processing of useless frames. Second, a sequence of 16 frames with detected persons is passed to a 3D CNN, where the spatiotemporal features of the sequence are extracted and fed to a Softmax classifier. Furthermore, we optimized the 3D CNN model using an open visual inference and neural network optimization toolkit developed by Intel, which converts the trained model into an intermediate representation and adjusts it for optimal execution on the end platform for the final prediction of violent activity. After detection of a violent activity, an alert is transmitted to the nearest police station or security department so that prompt preventive action can be taken. We found that our proposed method outperforms existing state-of-the-art methods on different benchmark datasets.
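The staged gating can be sketched as follows; `person_present` and `violence_net` are hypothetical stand-ins for the lightweight person-detection CNN and the spatiotemporal 3D CNN, the alert threshold is an assumption, and the OpenVINO optimization step is omitted.

```python
# Hedged sketch of the triple-staged gating idea: buffer only frames in which
# a (stub) person detector fires, then classify each 16-frame sequence.
from collections import deque
import torch
import torch.nn as nn

SEQ_LEN = 16

def person_present(frame: torch.Tensor) -> bool:
    """Stub for the lightweight person-detection CNN (placeholder logic)."""
    return bool(frame.mean() > 0)

violence_net = nn.Sequential(   # stand-in for the 3D CNN, not the paper's model
    nn.Conv3d(3, 8, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool3d(1),
    nn.Flatten(),
    nn.Linear(8, 2),            # violent vs. non-violent logits
)

buffer = deque(maxlen=SEQ_LEN)
for _ in range(100):                       # simulated video stream
    frame = torch.randn(3, 112, 112)
    if not person_present(frame):          # stage 1: skip useless frames
        continue
    buffer.append(frame)
    if len(buffer) == SEQ_LEN:             # stage 2: classify the sequence
        clip = torch.stack(list(buffer), dim=1).unsqueeze(0)  # (1, 3, 16, H, W)
        probs = violence_net(clip).softmax(dim=1)
        if probs[0, 1] > 0.9:              # stage 3: alert (threshold assumed)
            print("ALERT: violent activity suspected", probs[0, 1].item())
        buffer.clear()
```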


2021 ◽  
Vol 11 (6) ◽  
pp. 2838
Author(s):  
Nikitha Johnsirani Venkatesan ◽  
Dong Ryeol Shin ◽  
Choon Sung Nam

In the medical field, early detection of lung nodules is indispensable for increasing patient survival. The quality of medical images can be enhanced by intensifying the radiation dose, but a high radiation dose can provoke cancer, which forces experts to use a limited dose; reducing the dose, in turn, generates noise in CT scans. We propose an optimal convolutional neural network model in which Gaussian noise is removed for better classification and increased training accuracy. Experimental demonstration on the 160 GB LUNA16 dataset shows that our proposed method exhibits superior results. Classification accuracy, specificity, sensitivity, precision, recall, F1 measure, and area under the ROC curve (AUC) are taken as evaluation metrics. We compared the performance of our proposed model on several platforms, namely Apache Spark, GPU, and CPU, to reduce the training time without compromising accuracy. Our results show that Apache Spark, integrated with a deep learning framework, is suitable for parallel training computation with high accuracy.
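As a hedged illustration of the denoising step, the snippet below smooths a synthetic noisy CT slice with a Gaussian filter, one simple way to suppress Gaussian noise; the paper's exact denoising method and network are not reproduced here.

```python
# Gaussian-filter denoising sketch on a synthetic phantom slice (assumed
# technique for illustration; not the paper's exact preprocessing).
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
x, y = np.meshgrid(np.linspace(-1, 1, 512), np.linspace(-1, 1, 512))
slice_clean = np.exp(-(x**2 + y**2) / 0.2)            # smooth stand-in CT slice
slice_noisy = slice_clean + rng.normal(0, 0.1, slice_clean.shape)

slice_denoised = gaussian_filter(slice_noisy, sigma=1.0)

mse_before = np.mean((slice_noisy - slice_clean) ** 2)
mse_after = np.mean((slice_denoised - slice_clean) ** 2)
print(f"MSE before: {mse_before:.4f}, after: {mse_after:.4f}")
```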


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2648
Author(s):  
Muhammad Aamir ◽  
Tariq Ali ◽  
Muhammad Irfan ◽  
Ahmad Shaf ◽  
Muhammad Zeeshan Azam ◽  
...  

Natural disasters not only disturb the human ecological system but also destroy property and critical infrastructure, and can even lead to permanent changes in the ecosystem. Disasters can be caused by naturally occurring events such as earthquakes, cyclones, floods, and wildfires. Many deep learning techniques have been applied by various researchers to detect and classify natural disasters and thereby mitigate losses in ecosystems, but detection of natural disasters still faces issues due to the complex and imbalanced structures of images. To tackle this problem, we propose a multilayered deep convolutional neural network. The proposed model works in two blocks: a Block-I convolutional neural network (B-I CNN) for detecting the occurrence of disasters, and a Block-II convolutional neural network (B-II CNN) for classifying natural disaster intensity types, with different filters and parameters. The model is tested on 4,428 natural images, and performance is expressed through statistical measures: sensitivity (SE), 97.54%; specificity (SP), 98.22%; accuracy rate (AR), 99.92%; precision (PRE), 97.79%; and F1-score (F1), 97.97%. The overall accuracy of the whole model is 99.92%, which is competitive with state-of-the-art algorithms.
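The two-block cascade can be sketched as below, assuming a binary Block-I detector and a three-way Block-II intensity classifier; the layer choices and class counts are illustrative, not the paper's exact configuration.

```python
# Two-block cascade sketch: Block-I decides whether a disaster is present;
# only positive images reach Block-II for intensity classification.
import torch
import torch.nn as nn

def small_cnn(out_classes: int) -> nn.Module:
    """Illustrative CNN block; the paper uses different filters/parameters."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, out_classes),
    )

block_i = small_cnn(2)    # disaster vs. no disaster
block_ii = small_cnn(3)   # e.g., low / moderate / severe intensity (assumed)

image = torch.randn(1, 3, 224, 224)
if block_i(image).argmax(dim=1).item() == 1:      # Block-I: detection
    intensity = block_ii(image).argmax(dim=1)     # Block-II: intensity type
    print("disaster detected, intensity class:", intensity.item())
else:
    print("no disaster detected")
```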


Author(s):  
Young Hyun Kim ◽  
Eun-Gyu Ha ◽  
Kug Jin Jeon ◽  
Chena Lee ◽  
Sang-Sun Han

Objectives: This study aimed to develop a fully automated human identification method based on a convolutional neural network (CNN) with a large-scale dental panoramic radiograph (DPR) dataset. Methods: In total, 2,760 DPRs were collected from 746 subjects who each had 2 to 17 DPRs with varying image characteristics due to dental treatments (tooth extraction, oral surgery, prosthetics, or orthodontics) or tooth development. The test dataset comprised the latest DPR of each subject (746 images), and the remaining DPRs (2,014 images) were used for model training. A modified VGG16 model with two fully connected layers was applied for human identification. The proposed model was evaluated with rank-1, rank-3, and rank-5 accuracies, running time, and gradient-weighted class activation mapping (Grad-CAM) images. Results: The model achieved rank-1, rank-3, and rank-5 accuracies of 82.84%, 89.14%, and 92.23%, respectively. All rank-1 accuracy values of the proposed model were above 80% regardless of changes in image characteristics. The average training time was 60.9 s per epoch, and the prediction time for the 746 test DPRs was short (3.2 s/image). The Grad-CAM technique verified that the model identified humans by focusing on identifiable dental information. Conclusion: The proposed model showed good performance in fully automatic human identification despite differing image characteristics of DPRs acquired from the same patients. It is expected to assist experts in fast and accurate identification by comparing large numbers of images and proposing identification candidates at high speed.
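A minimal sketch of the head modification is shown below, assuming 746 subject classes and an illustrative two-layer fully connected head on top of torchvision's VGG16; the paper's exact layer sizes and preprocessing are not reproduced.

```python
# VGG16 adapted for identification: replace the classifier with two FC layers
# (layer widths assumed for illustration).
import torch
import torch.nn as nn
from torchvision import models

num_subjects = 746
vgg = models.vgg16(weights=None)  # use weights="IMAGENET1K_V1" to fine-tune
vgg.classifier = nn.Sequential(   # two fully connected layers, as described
    nn.Linear(512 * 7 * 7, 1024),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(1024, num_subjects),
)

dpr = torch.randn(1, 3, 224, 224)        # stand-in panoramic radiograph
logits = vgg(dpr)
top5 = logits.topk(5, dim=1).indices     # rank-5 identification candidates
print(top5)
```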


2021 ◽  
Vol 16 ◽  
Author(s):  
Di Gai ◽  
Xuanjing Shen ◽  
Haipeng Chen

Background: Effective classification of melting curves helps measure the specificity of amplified products and exclude the influence of invalid data on subsequent experiments. Objective: In this paper, a convolutional neural network (CNN) classification model based on dynamic filters is proposed, which can categorize the number of peaks in a melting curve image and distinguish pollution data represented by noise peaks. Method: The main advantage of the proposed model is that it adopts filters that change with the input, using these dynamic filters to capture more information in the image and make the network's learning more accurate. In addition, a residual module is used to extract the characteristics of the melting curve, and the pooling operation is replaced with atrous convolution to prevent the loss of context information. Result: To train the proposed model, a novel melting curve dataset was created, comprising a balanced subset and an unbalanced subset. The proposed method was compared with seven representative deep learning methods using six classification-based assessment criteria. Experimental results show that the proposed method not only markedly outperforms the other state-of-the-art methods in accuracy but also requires much less running time. Conclusion: These results show that the proposed method is suitable for judging the specificity of amplification products from the melting curve, while overcoming the low efficiency and bias of manual selection.
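One way to realize the dynamic-filter idea is sketched below: a small branch predicts a per-sample, per-channel depthwise kernel from the input and applies it to that same input. This is an interpretation for illustration, not the authors' exact module; the residual blocks and atrous convolutions are omitted here.

```python
# Dynamic-filter sketch: kernels are generated from the input itself
# (interpretation of the technique; shapes and sizes are illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicFilterConv(nn.Module):
    def __init__(self, channels=8, k=3):
        super().__init__()
        self.k = k
        # filter-generating branch: global pooling -> per-channel k x k kernel
        self.gen = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels * k * k),
        )

    def forward(self, x):
        n, c, h, w = x.shape
        kernels = self.gen(x).view(n * c, 1, self.k, self.k)
        # depthwise conv with a distinct kernel per sample and channel
        out = F.conv2d(x.view(1, n * c, h, w), kernels,
                       padding=self.k // 2, groups=n * c)
        return out.view(n, c, h, w)

layer = DynamicFilterConv()
curves = torch.randn(4, 8, 64, 64)  # stand-in melting-curve feature maps
print(layer(curves).shape)          # torch.Size([4, 8, 64, 64])
```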


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Defeng Lv ◽  
Huawei Wang ◽  
Changchang Che

Purpose The purpose of this study is to achieve accurate intelligent fault diagnosis of rolling bearings. Design/methodology/approach To extract deep features from the original vibration signal and improve the generalization ability and robustness of the fault diagnosis model, this paper proposes a fault diagnosis method for rolling bearings based on a multiscale convolutional neural network (MCNN) and decision fusion. The original vibration signals are normalized and matrixed to form grayscale image samples, and multiscale samples are obtained by convolving these samples with different convolution kernels. An MCNN is then constructed for fault diagnosis, and its results are fed into a decision fusion model to obtain comprehensive fault diagnosis results. Findings Bearing datasets with multiple multivariate time series are used to verify the effectiveness of the proposed method. The proposed model achieves 99.8% fault diagnosis accuracy; based on the MCNN and decision fusion, accuracy improves by 0.7%-3.4% compared with other models. Originality/value The proposed model extracts deep general features of vibration signals via the MCNN and obtains robust fault diagnosis results through the decision fusion model. For long vibration-signal time series with noise, the proposed model can still achieve accurate fault diagnosis.
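The multiscale-and-fusion pipeline can be sketched as follows: a vibration segment is normalized and matrixed into a grayscale image, convolved with kernels of different sizes to form multiscale samples, classified by per-scale CNN branches, and fused by averaging the softmax outputs. All sizes, the averaging kernels, and the fusion rule are illustrative assumptions.

```python
# Multiscale CNN + decision fusion sketch (illustrative shapes and fusion rule).
import torch
import torch.nn as nn
import torch.nn.functional as F

def scale_branch(num_classes=4):
    """Illustrative per-scale CNN branch; class count assumed."""
    return nn.Sequential(
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, num_classes),
    )

signal = torch.randn(4096)                 # one vibration-signal segment
signal = (signal - signal.min()) / (signal.max() - signal.min())  # normalize
image = signal.view(1, 1, 64, 64)          # matrixed into a grayscale image

kernel_sizes = (3, 5, 7)                   # different convolution kernels
branches = [scale_branch() for _ in kernel_sizes]

probs = []
for k, branch in zip(kernel_sizes, branches):
    kernel = torch.full((1, 1, k, k), 1.0 / (k * k))   # averaging kernel
    multiscale = F.conv2d(image, kernel, padding=k // 2)
    probs.append(branch(multiscale).softmax(dim=1))

fused = torch.stack(probs).mean(dim=0)     # decision fusion by averaging
print("fault class:", fused.argmax(dim=1).item())
```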


2020 ◽  
Vol 38 (5) ◽  
pp. 5615-5626
Author(s):  
Junsuo Qu ◽  
Ning Qiao ◽  
Haonan Shi ◽  
Chang Su ◽  
Abolfazl Razi

2019 ◽  
Vol 2019 ◽  
pp. 1-12 ◽  
Author(s):  
Jianfang Cao ◽  
Chenyan Wu ◽  
Lichao Chen ◽  
Hongyan Cui ◽  
Guoqing Feng

In today's society, image resources are everywhere, and the number of available images can be overwhelming. Determining how to rapidly and effectively query, retrieve, and organize image information has become a popular research topic, and automatic image annotation is the key to text-based image retrieval. If the annotated training samples are not balanced across semantic labels, labeling accuracy for low-frequency labels can be poor. In this study, a dual-channel convolutional neural network (DCCNN) was designed to improve the accuracy of automatic labeling. The model integrates two convolutional neural network (CNN) channels with different structures: one channel is trained on the low-frequency samples, increasing their proportion in the model, and the other is trained on the full training set. In the labeling process, the outputs of the two channels are fused to obtain a labeling decision. We verified the proposed model on the Caltech-256, Pascal VOC 2007, and Pascal VOC 2012 standard datasets. On the Pascal VOC 2012 dataset, the proposed DCCNN achieves an overall labeling accuracy of up to 93.4% after 100 training iterations: 8.9% higher than the CNN and 15% higher than the traditional method. The CNN requires 2,500 training iterations to reach a similar accuracy. On the 50,000-image dataset drawn from Caltech-256 and Pascal VOC 2012, the performance of the DCCNN is relatively stable, with an average labeling accuracy above 93%, whereas the CNN reaches only 91% even after extended training. Furthermore, the DCCNN's labeling accuracy for low-frequency words is approximately 10% higher than that of the CNN, further verifying the reliability of the proposed model.
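A hedged sketch of the dual-channel fusion is given below: one channel stands for the CNN trained on the full training set, the other for the channel trained with an increased proportion of low-frequency samples, and their outputs are fused at annotation time. The architectures and the simple averaging fusion rule are assumptions for illustration.

```python
# Dual-channel fusion sketch: two CNN channels, fused label distributions.
import torch
import torch.nn as nn

def channel_cnn(num_labels=256):
    """Illustrative channel architecture; not the paper's exact structure."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, num_labels),
    )

full_channel = channel_cnn()       # trained on all training samples
low_freq_channel = channel_cnn()   # trained with oversampled rare labels

image = torch.randn(1, 3, 128, 128)
scores = (full_channel(image).softmax(dim=1)
          + low_freq_channel(image).softmax(dim=1)) / 2  # fuse the decisions
print("predicted label:", scores.argmax(dim=1).item())
```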


2020 ◽  
Vol 10 (15) ◽  
pp. 5333
Author(s):  
Anam Manzoor ◽  
Waqar Ahmad ◽  
Muhammad Ehatisham-ul-Haq ◽  
Abdul Hannan ◽  
Muhammad Asif Khan ◽  
...  

Emotions are a fundamental part of human behavior and can be stimulated in numerous ways. In daily life, we come across different types of objects, such as cakes, crabs, televisions, and trees, which may excite certain emotions. Likewise, the object images that we see and share on different platforms are also capable of expressing or inducing human emotions. Inferring emotion tags from these object images has great significance, as it can play a vital role in recommendation systems, image retrieval, human behavior analysis, and advertisement applications. The existing schemes for emotion tag perception are based on visual features, like the color and texture of an image, which are adversely affected by lighting conditions. The main objective of our study is to address this problem by introducing a novel idea of inferring emotion tags from images based on object-related features. To this end, we first created an emotion-tagged dataset from the publicly available object detection dataset (i.e., Caltech-256) using subjective evaluation from 212 users. Next, we used a convolutional neural network-based model to automatically extract high-level features from object images for recognizing nine emotion categories: amusement, awe, anger, boredom, contentment, disgust, excitement, fear, and sadness. Experimental results on our emotion-tagged dataset confirm the success of the proposed idea in terms of accuracy, precision, recall, specificity, and F1-score. Overall, the proposed scheme achieved accuracy rates of approximately 85% and 79% using top-level and bottom-level emotion tagging, respectively. We also performed a gender-based analysis of inferred emotion tags and observed that male and female subjects differ in their emotion perception of different object categories.
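A minimal transfer-learning sketch of this idea follows: a pretrained backbone extracts high-level object features, and a new head maps them to the nine emotion categories. The ResNet-18 backbone is an assumption, not the authors' stated model.

```python
# Emotion-tag inference sketch: CNN features -> nine emotion categories
# (backbone choice assumed for illustration).
import torch
import torch.nn as nn
from torchvision import models

EMOTIONS = ["amusement", "awe", "anger", "boredom", "contentment",
            "disgust", "excitement", "fear", "sadness"]

backbone = models.resnet18(weights=None)   # weights="IMAGENET1K_V1" in practice
backbone.fc = nn.Linear(backbone.fc.in_features, len(EMOTIONS))

image = torch.randn(1, 3, 224, 224)        # stand-in Caltech-256 object image
tag = EMOTIONS[backbone(image).argmax(dim=1).item()]
print("inferred emotion tag:", tag)
```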

