Designing a Supermarket Service Robot Based on Deep Convolutional Neural Networks

Symmetry ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 360
Author(s):  
Aihua Chen ◽  
Benquan Yang ◽  
Yueli Cui ◽  
Yuefen Chen ◽  
Shiqing Zhang ◽  
et al.

In order to save shoppers' time and reduce the labor cost of supermarket operations, this paper proposes a supermarket service robot based on deep convolutional neural networks (DCNNs). Firstly, the hardware and software architecture of the robot is designed according to the shopping environment and needs of supermarkets. The robot uses robot operating system (ROS) middleware on a Raspberry Pi as its control kernel to communicate wirelessly with customers and staff. For flexible movement, omnidirectional wheels symmetrically installed under the robot chassis are used for tracking. An infrared detection module detects whether commodities are present in the warehouse or on the shelves, so that the robot can grasp and place commodities accurately. Secondly, the recently developed single shot multibox detector (SSD), a typical DCNN model, is employed to detect and identify objects. Finally, to verify the robot's performance, a supermarket environment is designed to simulate a real-world scenario for experiments. Experimental results show that the designed robot can automatically complete the procurement and replenishment of commodities and presents promising performance on commodity detection and recognition tasks.
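The SSD model mentioned above tiles a set of default (anchor) boxes over feature maps at several scales and predicts class scores and box offsets for each. As a rough illustration of that first step, here is a minimal numpy sketch of default-box generation; the grid sizes, scales, and aspect ratios are illustrative, not the paper's configuration:

```python
import numpy as np

def default_boxes(feature_map_sizes, scales, aspect_ratios):
    """Generate SSD-style default boxes as (cx, cy, w, h) in [0, 1] coordinates."""
    boxes = []
    for fm, scale in zip(feature_map_sizes, scales):
        for i in range(fm):
            for j in range(fm):
                cx, cy = (j + 0.5) / fm, (i + 0.5) / fm  # cell centre
                for ar in aspect_ratios:
                    # Aspect ratio reshapes the box while keeping its area.
                    w = scale * np.sqrt(ar)
                    h = scale / np.sqrt(ar)
                    boxes.append((cx, cy, w, h))
    return np.array(boxes)

# A fine 8x8 map for small objects and a coarse 4x4 map for large ones.
boxes = default_boxes(feature_map_sizes=[8, 4], scales=[0.2, 0.5],
                      aspect_ratios=[1.0, 2.0, 0.5])
print(boxes.shape)  # (8*8 + 4*4) * 3 = (240, 4)
```

The detection head then predicts, for every one of these boxes, a class score per commodity category plus four offsets that refine the box to the object.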

Sensors ◽  
2020 ◽  
Vol 20 (13) ◽  
pp. 3718 ◽  
Author(s):  
Hieu Nguyen ◽  
Yuzeng Wang ◽  
Zhaoyang Wang

Single-shot 3D imaging and shape reconstruction has seen a surge of interest due to the rapid evolution of sensing technologies. In this paper, a robust single-shot 3D shape reconstruction technique integrating the structured-light technique with deep convolutional neural networks (CNNs) is proposed. The input of the technique is a single fringe-pattern image, and the output is the corresponding depth map for 3D shape reconstruction. The essential training and validation datasets with high-quality 3D ground-truth labels are prepared using a multi-frequency fringe projection profilometry technique. Unlike conventional 3D shape reconstruction methods, which involve complex algorithms and intensive computation to determine phase distributions or pixel disparities and then the depth map, the proposed approach uses an end-to-end network architecture to directly transform a 2D image into its corresponding 3D depth map without extra processing. Three CNN-based models are adopted for comparison. Furthermore, the accurate structured-light-based 3D imaging dataset used in this paper is made publicly available. Experiments have been conducted to demonstrate the validity and robustness of the proposed technique, which is capable of satisfying various 3D shape reconstruction demands in scientific research and engineering applications.
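The physical link the network learns to invert is that surface depth modulates the phase of a projected sinusoidal fringe. A toy numpy sketch of that forward relationship, with illustrative constants (the paper's multi-frequency profilometry setup is more involved):

```python
import numpy as np

def fringe_image(depth, frequency=10.0, phase_gain=2.0, bias=0.5, amplitude=0.5):
    """Render an idealised fringe pattern whose local phase is shifted by depth."""
    h, w = depth.shape
    x = np.linspace(0.0, 1.0, w)           # normalised horizontal coordinate
    carrier = 2.0 * np.pi * frequency * x  # base sinusoidal carrier
    phase = phase_gain * depth             # depth perturbs the carrier phase
    return bias + amplitude * np.cos(carrier[None, :] + phase)

depth = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))  # toy ramp surface
img = fringe_image(depth)
print(img.shape)  # (64, 64), intensities in [0, 1]
```

The end-to-end CNN in the paper learns the inverse mapping, from one such fringe image directly back to the depth map, instead of recovering the phase analytically.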


Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2393 ◽  
Author(s):  
Daniel Octavian Melinte ◽  
Luige Vladareanu

The interaction between humans and an NAO robot using deep convolutional neural networks (CNNs) is presented in this paper, based on an innovative end-to-end pipeline that serializes two optimized CNNs, one for face recognition (FR) and another for facial expression recognition (FER), in order to obtain real-time inference speed for the entire process. Two models for FR are considered: one known to be very accurate but with low inference speed (faster region-based convolutional neural network), and one that is less accurate but has high inference speed (single shot detector convolutional neural network). For emotion recognition, transfer learning and fine-tuning of three CNN models (VGG, Inception V3 and ResNet) have been used. The overall results show that the single shot detector (SSD CNN) and faster region-based convolutional neural network (Faster R-CNN) models for face detection achieve almost the same accuracy: 97.8% for Faster R-CNN under PASCAL visual object classes (PASCAL VOC) evaluation metrics and 97.42% for SSD Inception. In terms of FER, ResNet obtained the highest training accuracy (90.14%), while the visual geometry group (VGG) network reached 87% and Inception V3 reached 81%. The results show improvements of over 10% when using two serialized CNNs instead of the FER CNN alone, while the recent rectified adaptive moment optimizer (RAdam) led to better generalization and an accuracy improvement of 3%–4% on each emotion recognition CNN.
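The serialized design, in which the detector's face crop becomes the classifier's input, can be sketched schematically. The two stand-in functions below are placeholders for the trained networks (Faster R-CNN/SSD for detection, VGG/Inception V3/ResNet for FER); only the pipeline wiring is the point:

```python
import numpy as np

def detect_face(image):
    """Placeholder for the face-detection CNN: returns one (x, y, w, h) box.
    A trained Faster R-CNN or SSD would produce this box in the real system."""
    h, w = image.shape[:2]
    return (w // 4, h // 4, w // 2, h // 2)  # dummy central box

def classify_emotion(face_crop):
    """Placeholder for the FER CNN; maps the crop to one of the emotion labels."""
    labels = ["neutral", "happy", "sad", "angry"]
    return labels[int(face_crop.mean() * len(labels)) % len(labels)]

def pipeline(image):
    """Serialized inference: the detector's crop feeds the expression classifier."""
    x, y, w, h = detect_face(image)
    crop = image[y:y + h, x:x + w]
    return classify_emotion(crop)

frame = np.random.default_rng(0).random((128, 128, 3))
print(pipeline(frame))  # one of the four emotion labels
```

Running the classifier only on the detected crop, rather than the whole frame, is what lets the serialized pair outperform the FER CNN alone.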


2021 ◽  
Vol 10 (2) ◽  
pp. 153-162
Author(s):  
Gibson Kimutai ◽  
Alexander Ngenzi ◽  
Said Rutabayiro Ngoga ◽  
Rose C. Ramkat ◽  
Anna Förster

Abstract. Tea (Camellia sinensis) is one of the most consumed drinks across the world. Based on processing techniques, there are more than 15 000 categories of tea, but the main categories include yellow tea, Oolong tea, Ilex tea, black tea, matcha tea, green tea, and sencha tea, among others. Black tea is the most popular category worldwide. Black tea processing involves the following stages: plucking, withering, cutting, tearing, curling, fermentation, drying, and sorting. Although all of these stages affect the quality of the processed tea, fermentation is the most vital, as it directly defines the quality. Fermentation is a time-bound process, and its optimum is currently detected manually by tea tasters, who monitor colour change, smell the tea, and taste the tea as fermentation progresses. This paper explores the use of the internet of things (IoT), deep convolutional neural networks, and image processing with a majority voting technique to detect the optimum fermentation of black tea. The prototype was built from Raspberry Pi 3 boards with a Pi camera to take real-time images of tea as fermentation progresses. We deployed the prototype in the Sisibo Tea Factory for training, validation, and evaluation. When evaluated on offline images, the deep learner had a perfect precision and accuracy of 1.0 each. On real-time images, it recorded a highest precision and accuracy of 0.9589 and 0.8646, respectively. Additionally, it recorded an average precision and accuracy of 0.9737 and 0.8953, respectively, when a majority voting technique was applied in decision-making. From the results, it is evident that the prototype can be used to monitor the fermentation of the various categories of tea that undergo fermentation, including Oolong and black tea, among others.
The prototype can also be scaled up by retraining it to monitor the fermentation of other crops, including coffee and cocoa.
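The majority voting step that lifted the average precision and accuracy is simple to state: take the label predicted most often across a window of per-frame CNN outputs. A minimal sketch, with illustrative labels:

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted most often across a window of frames."""
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]

# Per-frame CNN outputs over a short window; one noisy frame is outvoted.
window = ["fermented", "fermented", "underfermented", "fermented", "fermented"]
print(majority_vote(window))  # -> "fermented"
```

Voting over a window smooths out single-frame misclassifications caused by lighting or camera noise, which is why the averaged figures exceed the raw real-time ones.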


Author(s):  
Pouria Asadi ◽  
Hamid Mehrabi ◽  
Alireza Asadi ◽  
Melody Ahmadi

Pavement distress assessment is a significant aspect of pavement management. Automated pavement crack detection is a challenging task that has been researched for decades owing to complicated pavement conditions. Current pavement condition assessment procedures are extensively time-consuming, expensive, and labor-intensive. The primary goal of this paper is to develop a cost-effective and reliable platform for automated pavement crack detection on a single-board ARM-based computer, using a red, green, blue, depth (RGB-D) sensor and deep learning detection models. To the best of our knowledge, this is the first pavement crack data set prepared using a car-mounted global-shutter RGB-D sensor and annotated according to the PASCAL visual object classes protocol; it is named PAVDIS2020. The data set comprises 2,085 pavement crack images captured in a wide variety of weather and illuminance conditions, with 5,587 instances of pavement cracks included in these images. A unified implementation of the Faster region-based convolutional neural network (Faster R-CNN) and single shot multibox detector (SSD) meta-architectures is used to evaluate the accuracy, speed, and memory-usage trade-offs across various convolutional neural network backbones and other training parameters on PAVDIS2020. The proposed pavement crack detection model classified cracks with 97.6% accuracy on the PAVDIS2020 data set and locates pavement crack patterns at 12 frames per second on a passively cooled Raspberry Pi 4 single-board computer.
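Evaluation under the PASCAL VOC protocol that the annotations follow hinges on intersection over union (IoU): a predicted crack box counts as correct when its IoU with a ground-truth box exceeds a threshold (0.5 in standard VOC scoring). A minimal sketch of the IoU computation:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # overlap area, 0 if disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A prediction overlapping half of a ground-truth crack box.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # -> 1/3 ~= 0.333
```

Here the 50-unit overlap divided by the 150-unit union gives 1/3, below the 0.5 threshold, so this prediction would be scored as a miss.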


2020 ◽  
Vol 2020 (10) ◽  
pp. 28-1-28-7 ◽  
Author(s):  
Kazuki Endo ◽  
Masayuki Tanaka ◽  
Masatoshi Okutomi

Classification of degraded images is very important in practice because images are usually degraded by compression, noise, blurring, etc. Nevertheless, most research on image classification focuses only on clean images without any degradation. Some papers have already proposed deep convolutional neural networks composed of an image restoration network and a classification network to classify degraded images. This paper proposes an alternative approach in which a degraded image and an additional degradation parameter are used together for classification. The proposed classification network has two inputs: the degraded image and the degradation parameter. An estimation network for the degradation parameter is also incorporated for cases where the degradation parameters of the degraded images are unknown. The experimental results show that the proposed method outperforms a straightforward approach in which the classification network is trained with degraded images only.
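One common way to realize such a two-input network is to append the scalar degradation parameter to the flattened image features before the classifier head. The numpy sketch below illustrates that fusion pattern only; the feature extractor, dimensions, and weights are illustrative stand-ins, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(42)

def extract_features(image):
    """Stand-in for the convolutional feature extractor: two global statistics."""
    return np.array([image.mean(), image.std()])

def classify(image, degradation_param, weights, bias):
    """Two-input head: image features concatenated with the degradation
    parameter, followed by one linear layer and a softmax."""
    features = np.concatenate([extract_features(image), [degradation_param]])
    logits = features @ weights + bias
    exp = np.exp(logits - logits.max())  # stabilised softmax
    return exp / exp.sum()               # class probabilities

weights = rng.normal(size=(3, 4))  # 3 fused inputs -> 4 classes (illustrative)
bias = np.zeros(4)
image = rng.random((32, 32))
probs = classify(image, degradation_param=0.3, weights=weights, bias=bias)
print(probs.shape)  # (4,), probabilities summing to 1
```

When the degradation parameter is unknown, the paper's estimation network would supply the `degradation_param` value instead of it being given.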

