Research of object recognition using neural network Inception-v3 model operating on Raspberry Pi B3+

2021 ◽  
Vol 15 (1) ◽  
pp. 13-22
Author(s):  
An Toan Nguyen ◽  
◽  
Ngoc Thien Nguyen ◽  
Thanh Truc Nguyen

Image Classification is the most important problem in the field of computer vision. It is very simple and has many practical applications, the image classifier is responsible for assigning a label to the input image from a fixed category group. This article has applied image classification to identify objects by giving the image of the object to be identified, then labeling the image and announcing the label name (object name) through the audio channel. The classification is based on the neural network Inception-v3 model that has been trained on Tensorflow and used Raspberian operating system running on the Raspberry Pi 3 B+ to create a device capable of recognizing objects which compact size and convenient to apply in many fields.

Author(s):  
Hongguo Su ◽  
Mingyuan Zhang ◽  
Shengyuan Li ◽  
Xuefeng Zhao

In the last couple of years, advancements in the deep learning, especially in convolutional neural networks, proved to be a boon for the image classification and recognition tasks. One of the important practical applications of object detection and image classification can be for security enhancement. If dangerous objects or scenes can be identified automatically, then a lot of accidents can be prevented. For this purpose, in this paper we made use of state-of-the-art implementation of Faster Region-based Convolutional Neural Network (Faster R-CNN) based on the monitoring video of hoisting sites to train a model to detect the dangerous object and the worker. By extracting the locations of them, object-human interactions during hoisting, mainly for changes in their spatial location relationship, can be understood whereby estimating whether the scene is safe or dangerous. Experimental results showed that the pre-trained model achieved good performance with a high mean average precision of 97.66% on object detection and the proposed method fulfilled the goal of dangerous scenes recognition perfectly.


Author(s):  
T.K. Biryukova

Classic neural networks suppose trainable parameters to include just weights of neurons. This paper proposes parabolic integrodifferential splines (ID-splines), developed by author, as a new kind of activation function (AF) for neural networks, where ID-splines coefficients are also trainable parameters. Parameters of ID-spline AF together with weights of neurons are vary during the training in order to minimize the loss function thus reducing the training time and increasing the operation speed of the neural network. The newly developed algorithm enables software implementation of the ID-spline AF as a tool for neural networks construction, training and operation. It is proposed to use the same ID-spline AF for neurons in the same layer, but different for different layers. In this case, the parameters of the ID-spline AF for a particular layer change during the training process independently of the activation functions (AFs) of other network layers. In order to comply with the continuity condition for the derivative of the parabolic ID-spline on the interval (x x0, n) , its parameters fi (i= 0,...,n) should be calculated using the tridiagonal system of linear algebraic equations: To solve the system it is necessary to use two more equations arising from the boundary conditions for specific problems. For exam- ple the values of the grid function (if they are known) in the points (x x0, n) may be used for solving the system above: f f x0 = ( 0) , f f xn = ( n) . The parameters Iii+1 (i= 0,...,n−1 ) are used as trainable parameters of neural networks. The grid boundaries and spacing of the nodes of ID-spline AF are best chosen experimentally. The optimal selection of grid nodes allows improving the quality of results produced by the neural network. The formula for a parabolic ID-spline is such that the complexity of the calculations does not depend on whether the grid of nodes is uniform or non-uniform. An experimental comparison of the results of image classification from the popular FashionMNIST dataset by convolutional neural 0, x< 0 networks with the ID-spline AFs and the well-known ReLUx( ) =AF was carried out. The results reveal that the usage x x, ≥ 0 of the ID-spline AFs provides better accuracy of neural network operation than the ReLU AF. The training time for two convolutional layers network with two ID-spline AFs is just about 2 times longer than with two instances of ReLU AF. Doubling of the training time due to complexity of the ID-spline formula is the acceptable price for significantly better accuracy of the network. Wherein the difference of an operation speed of the networks with ID-spline and ReLU AFs will be negligible. The use of trainable ID-spline AFs makes it possible to simplify the architecture of neural networks without losing their efficiency. The modification of the well-known neural networks (ResNet etc.) by replacing traditional AFs with ID-spline AFs is a promising approach to increase the neural network operation accuracy. In a majority of cases, such a substitution does not require to train the network from scratch because it allows to use pre-trained on large datasets neuron weights supplied by standard software libraries for neural network construction thus substantially shortening training time.


2020 ◽  
Vol 6 (4) ◽  
pp. 467-476
Author(s):  
Xinxin Liu ◽  
Yunfeng Zhang ◽  
Fangxun Bao ◽  
Kai Shao ◽  
Ziyi Sun ◽  
...  

AbstractThis paper proposes a kernel-blending connection approximated by a neural network (KBNN) for image classification. A kernel mapping connection structure, guaranteed by the function approximation theorem, is devised to blend feature extraction and feature classification through neural network learning. First, a feature extractor learns features from the raw images. Next, an automatically constructed kernel mapping connection maps the feature vectors into a feature space. Finally, a linear classifier is used as an output layer of the neural network to provide classification results. Furthermore, a novel loss function involving a cross-entropy loss and a hinge loss is proposed to improve the generalizability of the neural network. Experimental results on three well-known image datasets illustrate that the proposed method has good classification accuracy and generalizability.


2020 ◽  
Vol 23 (6) ◽  
pp. 1172-1191
Author(s):  
Artem Aleksandrovich Elizarov ◽  
Evgenii Viktorovich Razinkov

Recently, such a direction of machine learning as reinforcement learning has been actively developing. As a consequence, attempts are being made to use reinforcement learning for solving computer vision problems, in particular for solving the problem of image classification. The tasks of computer vision are currently one of the most urgent tasks of artificial intelligence. The article proposes a method for image classification in the form of a deep neural network using reinforcement learning. The idea of ​​the developed method comes down to solving the problem of a contextual multi-armed bandit using various strategies for achieving a compromise between exploitation and research and reinforcement learning algorithms. Strategies such as -greedy, -softmax, -decay-softmax, and the UCB1 method, and reinforcement learning algorithms such as DQN, REINFORCE, and A2C are considered. The analysis of the influence of various parameters on the efficiency of the method is carried out, and options for further development of the method are proposed.


2021 ◽  
Vol 8 (3) ◽  
pp. 15-27
Author(s):  
Mohamed N. Sweilam ◽  
Nikolay Tolstokulakov

Depth estimation has made great progress in the last few years due to its applications in robotics science and computer vision. Various methods have been implemented and enhanced to estimate the depth without flickers and missing holes. Despite this progress, it is still one of the main challenges for researchers, especially for the video applications which have more complexity of the neural network which af ects the run time. Moreover to use such input like monocular video for depth estimation is considered an attractive idea, particularly for hand-held devices such as mobile phones, they are very popular for capturing pictures and videos, in addition to having a limited amount of RAM. Here in this work, we focus on enhancing the existing consistent depth estimation for monocular videos approach to be with less usage of RAM and with using less number of parameters without having a significant reduction in the quality of the depth estimation.


2021 ◽  
Author(s):  
Ghassan Mohammed Halawani

The main purpose of this project is to modify a convolutional neural network for image classification, based on a deep-learning framework. A transfer learning technique is used by the MATLAB interface to Alex-Net to train and modify the parameters in the last two fully connected layers of Alex-Net with a new dataset to perform classifications of thousands of images. First, the general common architecture of most neural networks and their benefits are presented. The mathematical models and the role of each part in the neural network are explained in detail. Second, different neural networks are studied in terms of architecture, application, and the working method to highlight the strengths and weaknesses of each of neural network. The final part conducts a detailed study on one of the most powerful deep-learning networks in image classification – i.e. the convolutional neural network – and how it can be modified to suit different classification tasks by using transfer learning technique in MATLAB.


Author(s):  
Zecong Ye ◽  
Zhiqiang Gao ◽  
Xiaolong Cui ◽  
Yaojie Wang ◽  
Nanliang Shan

AbstractIn image classification field, existing work tends to modify the network structure to obtain higher accuracy or faster speed. However, some studies have found that the neural network usually has texture bias effect, which means that the neural network is more sensitive to the texture information than the shape information. Based on such phenomenon, we propose a new way to improve network performance by making full use of gradient information. The dual features network (DuFeNet) is proposed in this paper. In DuFeNet, one sub-network is used to learn the information of gradient features, and the other is a traditional neural network with texture bias. The structure of DuFeNet is easy to implement in the original neural network structure. The experimental results clearly show that DuFeNet can achieve better accuracy in image classification and detection. It can increase the shape bias of the network adapted to human visual perception. Besides, DuFeNet can be used without modifying the structure of the original network at lower additional parameters cost.


2020 ◽  
Vol 2 (2) ◽  
pp. 23
Author(s):  
Lei Wang

<p>As an important research achievement in the field of brain like computing, deep convolution neural network has been widely used in many fields such as computer vision, natural language processing, information retrieval, speech recognition, semantic understanding and so on. It has set off a wave of neural network research in industry and academia and promoted the development of artificial intelligence. At present, the deep convolution neural network mainly simulates the complex hierarchical cognitive laws of the human brain by increasing the number of layers of the network, using a larger training data set, and improving the network structure or training learning algorithm of the existing neural network, so as to narrow the gap with the visual system of the human brain and enable the machine to acquire the capability of "abstract concepts". Deep convolution neural network has achieved great success in many computer vision tasks such as image classification, target detection, face recognition, pedestrian recognition, etc. Firstly, this paper reviews the development history of convolutional neural networks. Then, the working principle of the deep convolution neural network is analyzed in detail. Then, this paper mainly introduces the representative achievements of convolution neural network from the following two aspects, and shows the improvement effect of various technical methods on image classification accuracy through examples. From the aspect of adding network layers, the structures of classical convolutional neural networks such as AlexNet, ZF-Net, VGG, GoogLeNet and ResNet are discussed and analyzed. From the aspect of increasing the size of data set, the difficulties of manually adding labeled samples and the effect of using data amplification technology on improving the performance of neural network are introduced. This paper focuses on the latest research progress of convolution neural network in image classification and face recognition. Finally, the problems and challenges to be solved in future brain-like intelligence research based on deep convolution neural network are proposed.</p>


2020 ◽  
Vol 10 (14) ◽  
pp. 4806 ◽  
Author(s):  
Ho-Hyoung Choi ◽  
Hyun-Soo Kang ◽  
Byoung-Ju Yun

For more than a decade, both academia and industry have focused attention on the computer vision and in particular the computational color constancy (CVCC). The CVCC is used as a fundamental preprocessing task in a wide range of computer vision applications. While our human visual system (HVS) has the innate ability to perceive constant surface colors of objects under varying illumination spectra, the computer vision is facing the color constancy challenge in nature. Accordingly, this article proposes novel convolutional neural network (CNN) architecture based on the residual neural network which consists of pre-activation, atrous or dilated convolution and batch normalization. The proposed network can automatically decide what to learn from input image data and how to pool without supervision. When receiving input image data, the proposed network crops each image into image patches prior to training. Once the network begins learning, local semantic information is automatically extracted from the image patches and fed to its novel pooling layer. As a result of the semantic pooling, a weighted map or a mask is generated. Simultaneously, the extracted information is estimated and combined to form global information during training. The use of the novel pooling layer enables the proposed network to distinguish between useful data and noisy data, and thus efficiently remove noisy data during learning and evaluating. The main contribution of the proposed network is taking CVCC to higher accuracy and efficiency by adopting the novel pooling method. The experimental results demonstrate that the proposed network outperforms its conventional counterparts in estimation accuracy.


Author(s):  
S Gopi Naik

Abstract: The plan is to establish an integrated system that can manage high-quality visual information and also detect weapons quickly and efficiently. It is obtained by integrating ARM-based computer vision and optimization algorithms with deep neural networks able to detect the presence of a threat. The whole system is connected to a Raspberry Pi module, which will capture live broadcasting and evaluate it using a deep convolutional neural network. Due to the intimate interaction between object identification and video and image analysis in real-time objects, By generating sophisticated ensembles that incorporate various low-level picture features with high-level information from object detection and scenario classifiers, their performance can quickly plateau. Deep learning models, which can learn semantic, high-level, deeper features, have been developed to overcome the issues that are present in optimization algorithms. It presents a review of deep learning based object detection frameworks that use Convolutional Neural Network layers for better understanding of object detection. The Mobile-Net SSD model behaves differently in network design, training methods, and optimization functions, among other things. The crime rate in suspicious areas has been reduced as a consequence of weapon detection. However, security is always a major concern in human life. The Raspberry Pi module, or computer vision, has been extensively used in the detection and monitoring of weapons. Due to the growing rate of human safety protection, privacy and the integration of live broadcasting systems which can detect and analyse images, suspicious areas are becoming indispensable in intelligence. This process uses a Mobile-Net SSD algorithm to achieve automatic weapons and object detection. Keywords: Computer Vision, Weapon and Object Detection, Raspberry Pi Camera, RTSP, SMTP, Mobile-Net SSD, CNN, Artificial Intelligence.


Sign in / Sign up

Export Citation Format

Share Document