DeepScene: Scene classification via convolutional neural network with spatial pyramid pooling

2021 ◽  
pp. 116382
Author(s):  
Pui Sin Yee ◽  
Kian Ming Lim ◽  
Chin Poo Lee

Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 1058
Author(s):  
Zhanghui Liu ◽  
Yudong Zhang ◽  
Yuzhong Chen ◽  
Xinwen Fan ◽  
Chen Dong

Domain generation algorithms (DGAs) use specific parameters as random seeds to generate large numbers of random domain names in order to evade malicious domain name detection. This greatly increases the difficulty of detecting and defending against botnets and malware. Traditional models for detecting algorithmically generated domain names generally rely on manually extracting statistical characteristics from the domain names or network traffic and then employing classifiers to distinguish the algorithmically generated ones; such models always require labor-intensive manual feature engineering. In contrast, most state-of-the-art models based on deep neural networks are sensitive to imbalance in the sample distribution and cannot fully exploit the discriminative class features in domain names or network traffic, leading to decreased detection accuracy. To address these issues, we employ the borderline synthetic minority over-sampling technique (SMOTE) to improve sample balance. We also propose a recurrent convolutional neural network with spatial pyramid pooling (RCNN-SPP) to extract discriminative and distinctive class features. The recurrent convolutional neural network combines a convolutional neural network (CNN) and a bi-directional long short-term memory network (Bi-LSTM) to extract both semantic and contextual information from domain names. We then employ the spatial pyramid pooling strategy to refine the contextual representation by capturing multi-scale contextual information from domain names. Experimental results on different domain name datasets demonstrate that our model achieves 92.36% accuracy, an 89.55% recall rate, a 90.46% F1-score, and a 95.39% AUC in distinguishing DGA-generated from legitimate domain names, and 92.45% accuracy, a 90.12% recall rate, a 90.86% F1-score, and a 96.59% AUC on multi-classification problems. It achieves significant improvement over existing models in terms of accuracy and robustness.
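The core idea here, pooling the Bi-LSTM's variable-length output into a fixed-length multi-scale summary, can be sketched in a few lines of PyTorch. The sketch below is illustrative only: the character-vocabulary size, layer widths, and pyramid levels (1, 2, 4) are assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RCNNSPP(nn.Module):
    def __init__(self, vocab_size=40, embed_dim=32, conv_ch=64,
                 hidden=64, levels=(1, 2, 4), n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # CNN extracts local character n-gram (semantic) features.
        self.conv = nn.Conv1d(embed_dim, conv_ch, kernel_size=3, padding=1)
        # Bi-LSTM captures contextual information in both directions.
        self.bilstm = nn.LSTM(conv_ch, hidden, batch_first=True,
                              bidirectional=True)
        self.levels = levels
        self.fc = nn.Linear(2 * hidden * sum(levels), n_classes)

    def forward(self, x):                      # x: (batch, seq_len) char ids
        h = self.embed(x).transpose(1, 2)      # (batch, embed_dim, seq_len)
        h = F.relu(self.conv(h)).transpose(1, 2)
        h, _ = self.bilstm(h)                  # (batch, seq_len, 2*hidden)
        h = h.transpose(1, 2)                  # (batch, 2*hidden, seq_len)
        # 1-D spatial pyramid pooling: a fixed-length, multi-scale summary
        # that does not depend on the domain-name length.
        pooled = [F.adaptive_max_pool1d(h, L).flatten(1) for L in self.levels]
        return self.fc(torch.cat(pooled, dim=1))

logits = RCNNSPP()(torch.randint(0, 40, (8, 25)))   # 8 names, 25 chars each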


2020 ◽  
Vol 10 (21) ◽  
pp. 7898
Author(s):  
Akm Ashiquzzaman ◽  
Hyunmin Lee ◽  
Kwangki Kim ◽  
Hye-Young Kim ◽  
Jaehyung Park ◽  
...  

Current deep convolutional neural network (DCNN)-based hand gesture detectors achieve high precision, but the sheer computing power this form of classification demands makes it very difficult to run in remote environments with limited computational resources. Moreover, classical DCNN architectures have a fixed number of input dimensions, which forces preprocessing and makes them impractical for real-world applications. In this research, a practical DCNN with an optimized architecture is proposed: filter/node pruning reduces the model size, and spatial pyramid pooling (SPP) makes the model input-dimension-invariant. This compact SPP-DCNN module uses 65% fewer parameters than traditional classifiers and operates almost 3× faster than classical models. Moreover, the proposed algorithm, which decodes gestures or sign-language finger-spelling from videos, achieves benchmark-best accuracy with the fastest processing speed. This method paves the way for various practical hand-gesture-based human-computer interaction (HCI) applications.
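To see how SPP removes the fixed-input-dimension constraint mentioned above, consider this minimal 2-D SPP layer in PyTorch. The pyramid levels, channel counts, and the single convolution standing in for a full backbone are all assumptions made for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SPP2d(nn.Module):
    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.levels = levels

    def forward(self, x):                       # x: (batch, C, H, W), any H, W
        # Pool to fixed grid sizes, so the flattened length is always
        # C * (1 + 4 + 16), whatever the input resolution.
        return torch.cat(
            [F.adaptive_max_pool2d(x, L).flatten(1) for L in self.levels],
            dim=1)

conv = nn.Conv2d(3, 16, 3, padding=1)
spp = SPP2d()
head = nn.Linear(16 * 21, 10)                   # fixed size for any input
for hw in [(64, 64), (120, 90)]:                # two different resolutions
    img = torch.randn(2, 3, *hw)
    out = head(spp(F.relu(conv(img))))          # no resizing or preprocessing
    print(out.shape)                            # torch.Size([2, 10]) both times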


2021 ◽  
Vol 2083 (4) ◽  
pp. 042030
Author(s):  
Ziang Xu

This paper presents a lightweight Hierarchical Fusion Convolutional Neural Network (HF-CNN) which can be used for grasp detection. The network mainly employs residual structures, atrous spatial pyramid pooling (ASPP), and encoding-decoding-based feature fusion. Compared with typical grasp detection networks, the network in this paper greatly improves robustness and generalizability on detection tasks by extensively extracting feature information from the images. In our test with the Cornell University dataset, we achieve 85% accuracy when detecting unknown objects.
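As a rough illustration of the ASPP component named above, the following PyTorch module runs parallel atrous (dilated) convolutions and fuses them with a 1×1 projection. The dilation rates and channel counts are assumptions for the sketch, not the paper's values.

import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch=256, out_ch=256, rates=(1, 6, 12, 18)):
        super().__init__()
        # Parallel branches at increasing dilation rates see increasingly
        # large receptive fields while keeping the spatial size unchanged.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, kernel_size=3 if r > 1 else 1,
                      padding=r if r > 1 else 0, dilation=r)
            for r in rates])
        # 1x1 conv fuses the concatenated multi-scale branches.
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):                       # x: (batch, in_ch, H, W)
        feats = [b(x) for b in self.branches]   # each keeps the same H, W
        return self.project(torch.cat(feats, dim=1))

y = ASPP()(torch.randn(1, 256, 32, 32))         # -> (1, 256, 32, 32)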

