CNN and RNN mixed model for image classification

2019 ◽  
Vol 277 ◽  
pp. 02001 ◽  
Author(s):  
Qiwei Yin ◽  
Ruixun Zhang ◽  
XiuLi Shao

In this paper, we propose a mixed CNN (convolutional neural network) and RNN (recurrent neural network) model for image classification, called the CNN-RNN model. Image data can be viewed as two-dimensional wave data, and the convolution operation is a filtering process: it filters out non-critical band information in an image and retains the important features of the image. The CNN-RNN model uses the RNN to capture the dependency and continuity features in the intermediate-layer outputs of the CNN, and then connects these intermediate features to the final fully-connected network for classification prediction, which yields better classification accuracy. At the same time, to satisfy the RNN's restriction on the length of the input sequence and to prevent gradient explosion or vanishing gradients in the network, this paper applies the wavelet transform (WT), a method related to the Fourier transform, to filter the input data. We test the proposed CNN-RNN model on the widely used CIFAR-10 dataset. The results show that the proposed method achieves better classification performance than the original CNN network and merits further investigation.
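
As an illustration of the general idea only (not the authors' exact architecture), the following minimal PyTorch sketch re-interprets the CNN's intermediate feature maps as a sequence and feeds them to an RNN before the final fully-connected classifier; the layer sizes, the use of a GRU, and the row-wise sequencing are assumptions.

```python
# Hedged sketch of a CNN-RNN hybrid for CIFAR-10-sized inputs.
# The exact layer configuration is an assumption for illustration.
import torch
import torch.nn as nn

class CNNRNNClassifier(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Convolutional stage: filters out non-critical information, keeps salient features.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 32 x 16 x 16 for a 32x32 input
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 64 x 8 x 8
        )
        # Recurrent stage: treat each row of the feature map as one step of a sequence,
        # so the RNN can model dependencies between intermediate CNN features.
        self.rnn = nn.GRU(input_size=64 * 8, hidden_size=128, batch_first=True)
        self.fc = nn.Linear(128, num_classes)             # final fully-connected prediction

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.cnn(x)                               # (B, 64, 8, 8)
        b, c, h, w = feats.shape
        seq = feats.permute(0, 2, 1, 3).reshape(b, h, c * w)  # (B, 8, 64*8)
        _, hidden = self.rnn(seq)                         # hidden: (1, B, 128)
        return self.fc(hidden[-1])                        # class logits

logits = CNNRNNClassifier()(torch.randn(4, 3, 32, 32))    # e.g. a CIFAR-10-sized batch
```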

2021 ◽  
Vol 11 (15) ◽  
pp. 6721
Author(s):  
Jinyeong Wang ◽  
Sanghwan Lee

In increasing manufacturing productivity with automated surface inspection in smart factories, the demand for machine vision is rising. Recently, convolutional neural networks (CNNs) have demonstrated outstanding performance and solved many problems in the field of computer vision, and many machine vision systems therefore adopt CNNs for surface defect inspection. In this study, we developed an effective data augmentation method for grayscale images in CNN-based machine vision with mono cameras. Our method can be applied to grayscale industrial images, and it achieves outstanding performance in both image classification and object detection tasks. The main contributions of this study are as follows: (1) We propose a data augmentation method that can be applied when training CNNs with industrial images taken by mono cameras. (2) We demonstrate that image classification and object detection performance improve when training with industrial image data augmented by the proposed method. Through the proposed method, many machine-vision problems involving mono cameras can be effectively solved using CNNs.
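
The abstract does not detail the augmentation operations, so the following is only a hedged sketch of a grayscale augmentation pipeline for mono-camera images; the specific transforms (flips, small rotations, brightness/contrast jitter, additive noise) are assumptions, not the paper's method.

```python
# Hedged sketch: grayscale augmentation pipeline for mono-camera industrial images.
import torch
from torchvision import transforms

class AddGaussianNoise:
    """Add pixel noise to simulate sensor variation on a mono camera (illustrative)."""
    def __init__(self, std: float = 0.02):
        self.std = std

    def __call__(self, img: torch.Tensor) -> torch.Tensor:
        return (img + torch.randn_like(img) * self.std).clamp(0.0, 1.0)

grayscale_augment = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),     # ensure a single-channel image
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=5),            # small rotation, defect stays visible
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),                           # -> float tensor in [0, 1]
    AddGaussianNoise(std=0.02),
])
```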


Mathematics ◽  
2021 ◽  
Vol 9 (6) ◽  
pp. 624
Author(s):  
Stefan Rohrmanstorfer ◽  
Mikhail Komarov ◽  
Felix Mödritscher

With the ever-increasing amount of image data, it has become a necessity to automatically locate and process information in these images. Because fashion is captured in images, the fashion sector provides an ideal foundation for a service or application built on an image classification model. In this article, the state of the art for image classification is analyzed and discussed. Based on this knowledge, four different approaches are implemented to extract features from fashion data. For this purpose, a human-worn fashion dataset with 2567 images was created and then significantly enlarged through the performed image operations. The results show that convolutional neural networks are the undisputed standard for classifying images, and that TensorFlow is the best library to build them. Moreover, through the introduction of dropout layers, data augmentation, and transfer learning, model overfitting was successfully prevented, and the validation accuracy on the created dataset was incrementally improved from an initial 69% to a final 84%. More distinct apparel such as trousers, shoes, and hats was classified more accurately than other upper-body clothes.
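
The exact model configuration is not given in the abstract, but the overfitting countermeasures it names (dropout layers, data augmentation, transfer learning) can be sketched in TensorFlow/Keras roughly as follows; the MobileNetV2 backbone, image size, and number of classes are assumptions for illustration only.

```python
# Hedged TensorFlow/Keras sketch: transfer learning + augmentation + dropout.
import tensorflow as tf

NUM_CLASSES = 4  # assumed grouping, e.g. trousers, shoes, hats, upper-body clothes

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # transfer learning: reuse ImageNet features, train only the head

model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.RandomFlip("horizontal"),        # data augmentation layers
    tf.keras.layers.RandomRotation(0.05),
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0),  # scale pixels to [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),                    # dropout against overfitting
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```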


2018 ◽  
Vol 12 (04) ◽  
pp. 481-500 ◽  
Author(s):  
Naifan Zhuang ◽  
The Duc Kieu ◽  
Jun Ye ◽  
Kien A. Hua

With the growth of crowd phenomena in the real world, crowd scene understanding is becoming an important task for anomaly detection and public security. Visual ambiguities and occlusions, high density, low mobility, and scene semantics, however, make this problem a great challenge. In this paper, we propose an end-to-end deep architecture, convolutional nonlinear differential recurrent neural networks (CNDRNNs), for crowd scene understanding. CNDRNNs consist of GoogleNet Inception V3 convolutional neural networks (CNNs) and nonlinear differential recurrent neural networks (RNNs). Different from traditional non-end-to-end solutions, which separate feature extraction from parameter learning, CNDRNN uses a unified deep model to optimize the parameters of the CNN and the RNN jointly, and thus has the potential to produce a more harmonious model. The proposed architecture takes sequential raw image data as input and does not rely on tracklet or trajectory detection; it therefore has clear advantages over traditional flow-based and trajectory-based methods, especially in challenging crowd scenarios with high density and low mobility. Taking advantage of both the CNN and the RNN, CNDRNN can effectively analyze crowd semantics: the CNN is good at modeling the semantic crowd scene information, while the nonlinear differential RNN models the motion information. The individual and increasing orders of the derivative of states (DoS) in the differential RNN progressively build up the ability of the long short-term memory (LSTM) gates to detect different levels of salient dynamical patterns, with deeper stacked layers modeling higher orders of DoS. Lastly, existing LSTM-based crowd scene solutions explore deep temporal information and are claimed to be "deep in time." Our proposed CNDRNN, by contrast, models spatial and temporal information in a unified architecture and is thus "deep in space and time." Extensive performance studies on the Violent-Flows, CUHK Crowd, and NUS-HGA datasets show that the proposed technique significantly outperforms state-of-the-art methods.
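
A rough PyTorch sketch of the end-to-end CNN+RNN pattern described above is given below; a small CNN stands in for GoogleNet Inception V3 and a standard LSTM stands in for the nonlinear differential RNN (dRNN), so this is a simplification of the idea, not the authors' CNDRNN model.

```python
# Hedged sketch: per-frame CNN features fed to an LSTM, trained end-to-end on raw clips.
import torch
import torch.nn as nn

class CrowdSceneNet(nn.Module):
    def __init__(self, num_classes: int = 2, feat_dim: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(                     # per-frame spatial features
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, 128, batch_first=True)  # temporal dynamics
        self.fc = nn.Linear(128, num_classes)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, time, channels, height, width) of raw frames; no tracklets needed
        b, t, c, h, w = clips.shape
        feats = self.cnn(clips.reshape(b * t, c, h, w)).reshape(b, t, -1)
        _, (hidden, _) = self.lstm(feats)
        return self.fc(hidden[-1])                    # e.g. violent vs. non-violent scene

logits = CrowdSceneNet()(torch.randn(2, 16, 3, 112, 112))  # two 16-frame clips
```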


2021 ◽  
Author(s):  
Daisuke Matsuoka

Image data classification using machine learning is an effective method for detecting atmospheric phenomena. However, extreme weather events with only a small number of cases reduce classification accuracy owing to the imbalance between the target class and the other classes. In order to build a highly accurate classification model, we held a data analysis competition to determine the best classification performance for two classes of cloud image data: tropical cyclones (including their precursors) and all other clouds. The top models in the competition used minority-class oversampling, majority-class undersampling, ensemble learning, deeper neural networks, and cost-sensitive loss functions to improve the imbalanced classification performance. In particular, the best of the 209 submissions improved classification capability by 65.4% over comparable conventional methods as measured by the false alarm ratio.
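
Two of the imbalance countermeasures named above, minority-class oversampling and a cost-sensitive loss, can be sketched in PyTorch as follows; the class counts and weighting scheme are illustrative assumptions, not the competition entries' settings.

```python
# Hedged sketch: weighted sampling (oversampling) and a cost-sensitive loss
# for an imbalanced two-class cloud dataset.
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Toy two-class dataset: 95% "other clouds" (label 0), 5% "tropical cyclone" (label 1).
labels = torch.cat([torch.zeros(950, dtype=torch.long), torch.ones(50, dtype=torch.long)])
images = torch.randn(1000, 1, 64, 64)
dataset = TensorDataset(images, labels)

# Oversample the minority class so each batch is roughly balanced.
class_counts = torch.bincount(labels).float()
sample_weights = (1.0 / class_counts)[labels]        # rare class -> larger sampling weight
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

# Cost-sensitive loss: misclassifying the rare class costs more.
class_weights = class_counts.sum() / (2.0 * class_counts)
criterion = torch.nn.CrossEntropyLoss(weight=class_weights)
```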


2020 ◽  
Vol 397 ◽  
pp. 48-59 ◽  
Author(s):  
Riqiang Gao ◽  
Yuankai Huo ◽  
Shunxing Bao ◽  
Yucheng Tang ◽  
Sanja L. Antic ◽  
...  
