Road Scene Recognition of Forklift AGV Equipment Based on Deep Learning

Gang Liu; Rongxu Zhang; Yanyan Wang; Rongjun Man

doi:10.3390/pr9111955

Road Scene Recognition of Forklift AGV Equipment Based on Deep Learning

Processes ◽

10.3390/pr9111955 ◽

2021 ◽

Vol 9 (11) ◽

pp. 1955

Author(s):

Gang Liu ◽

Rongxu Zhang ◽

Yanyan Wang ◽

Rongjun Man

Keyword(s):

Neural Network ◽

Deep Learning ◽

Path Planning ◽

Large Scale ◽

Semantic Segmentation ◽

Mobile Terminal ◽

Activation Function ◽

Scene Recognition ◽

Convolution Neural Network ◽

Distribution Centers

Applying scene recognition from intelligent robotics to forklift AGV equipment is of great significance for improving the automation and intelligence level of distribution centers. Using a camera to collect images of the environment breaks through the limitations of traditional guideway and positioning equipment, and benefits path planning and system expansion in the later stages of warehouse construction. Taking the forklift AGV equipment in the distribution center as the research object, this paper explores scene recognition and path planning for forklift AGV equipment based on a deep convolution neural network. Based on the characteristics of the warehouse environment, a semantic segmentation network for warehouse scene recognition is established and a scene recognition method suited to the warehouse environment is proposed, so that the equipment can use deep learning to learn environment features and achieve accurate recognition in a large-scale environment without adding environmental landmarks. This provides an effective convolution neural network model for the scene recognition of forklift AGV equipment in the warehouse environment. The activation function layer of the model is studied using an activation function with better gradient performance. The results show that the H-Swish activation function outperforms the ReLU function in recognition accuracy and computational complexity, and that it can save costs as a form of computation on the mobile terminal.
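The abstract compares H-Swish with ReLU but does not state the formulas. As a minimal sketch, assuming the paper uses the commonly cited definition h-swish(x) = x · ReLU6(x + 3) / 6, the two activations can be written as follows. The function names are illustrative and are not taken from the paper.

```python
def relu(x: float) -> float:
    """Standard rectified linear unit: max(0, x)."""
    return max(0.0, x)


def relu6(x: float) -> float:
    """ReLU clipped to the range [0, 6]."""
    return min(max(0.0, x), 6.0)


def h_swish(x: float) -> float:
    """Hard swish, assumed here to be x * ReLU6(x + 3) / 6.

    It needs only additions, comparisons and a multiplication,
    with no exponential, which is why it is considered cheap
    on mobile hardware.
    """
    return x * relu6(x + 3.0) / 6.0


if __name__ == "__main__":
    for value in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(value, relu(value), h_swish(value))
```

Unlike ReLU, this function is nonzero for inputs between -3 and 0, so small negative activations still pass a signal.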

Identification of Microcontroller Based Objects Using Image Classification in Python

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.37920 ◽

2021 ◽

Vol 9 (9) ◽

pp. 87-92

Author(s):

Zenith Nandy

Keyword(s):

Neural Network ◽

Deep Learning ◽

Image Classification ◽

Programming Language ◽

Multiple Testing ◽

Activation Function ◽

Convolution Neural Network ◽

Output Layer ◽

Deep Learning Neural Network ◽

Percent Accuracy

Abstract: In this paper, I built an AI model using deep learning that identifies whether a given image shows an Arduino, a Beaglebone Black, or a Jetson Nano. Identification is based on prediction. The model is trained using 300 to 350 datasets of each category and is tested multiple times using different images at different angles, background colours, and sizes. After multiple tests, the model is found to have 95 percent accuracy. The model is Sequential and uses a Convolution Neural Network (CNN) as its architecture. The activation function of each layer is ReLU, and that of the output layer is Softmax. The output is a prediction and is therefore a probability. This is an application-based project. All scripting is done in the Python 3 programming language. Keywords: image classification, microcontroller boards, python, AI, deep learning, neural network
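To illustrate why a Softmax output layer yields a probability, here is a minimal sketch in plain Python. It is not the author's model: the `CLASSES` list, the `predict` helper and the example logits are assumptions made only for this illustration.

```python
import math

# Hypothetical class order; the abstract does not specify one.
CLASSES = ["Arduino", "Beaglebone Black", "Jetson Nano"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the most probable class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


if __name__ == "__main__":
    # Made-up output-layer scores for a single image.
    print(predict([2.0, 1.0, 0.1]))
```

The predicted label is simply the class with the largest probability, and the probability itself can serve as a rough confidence score.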

Using Convolution Neural Network for Defective Image Classification of Industrial Components

Mobile Information Systems ◽

10.1155/2021/9092589 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Hao Wu ◽

Zhi Zhou

Keyword(s):

Neural Network ◽

Experimental Study ◽

Deep Learning ◽

Large Scale ◽

Intelligent System ◽

Convolution Neural Network ◽

Lens Distortion ◽

Proposed Model ◽

Industrial Cameras

Computer vision provides effective solutions to many imaging-related problems, including automatic image segmentation and classification. Artificially trained models can be employed to tag images and identify objects spontaneously. In large-scale manufacturing, industrial cameras are used to take images of components continuously for several reasons. Because of limitations caused by motion, lens distortion, and noise, some defective images are captured, and these must be identified and separated. One common way to address this problem is to inspect the images manually. However, this solution is not only very time-consuming but also inaccurate. The paper proposes a deep learning-based artificially intelligent system that can be trained quickly and can identify faulty images. For this purpose, a pretrained convolution neural network based on the PyTorch framework is employed to extract discriminating features from the dataset, which are then used for the classification task. To reduce the chance of overfitting, the proposed model also employs Dropout to adjust the network. The experimental study reveals that the system can classify normal and defective images with an accuracy of over 91%.
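The abstract mentions Dropout as the measure against overfitting without giving details. The following is a minimal sketch of the usual "inverted" dropout in plain Python, assuming a drop probability `p`. The function name and parameters are illustrative and do not reproduce the paper's PyTorch code.

```python
import random


def dropout(activations, p=0.5, training=True, rng=None):
    """Inverted dropout on a list of activations.

    During training each value is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p), so the expected value is
    unchanged. At inference time the input is returned as is.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


if __name__ == "__main__":
    features = [0.2, 1.5, 0.7, 3.1]
    print(dropout(features, p=0.5, rng=random.Random(0)))
    print(dropout(features, training=False))
```

Because units are dropped at random on every training step, the network cannot rely on any single feature, which tends to reduce overfitting.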

Deep Convolutional Neural Network for Large-Scale Date Palm Tree Mapping from UAV-Based Images

Remote Sensing ◽

10.3390/rs13142787 ◽

2021 ◽

Vol 13 (14) ◽

pp. 2787

Author(s):

Mohamed Barakat A. Gibril ◽

Helmi Zulhaidi Mohd Shafri ◽

Abdallah Shanableh ◽

Rami Al-Ruzouq ◽

Aimrun Wayayok ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Large Scale ◽

Date Palm ◽

Semantic Segmentation ◽

Palm Tree ◽

Convolutional Networks ◽

Fully Convolutional Networks ◽

Palm Trees

Large-scale mapping of date palm trees is vital for their consistent monitoring and sustainable management, considering their substantial commercial, environmental, and cultural value. This study presents an automatic approach for the large-scale mapping of date palm trees from very-high-spatial-resolution (VHSR) unmanned aerial vehicle (UAV) datasets, based on a deep learning approach. A U-Shape convolutional neural network (U-Net), based on a deep residual learning framework, was developed for the semantic segmentation of date palm trees. A comprehensive set of labeled data was established to enable the training and evaluation of the proposed segmentation model and increase its generalization capability. The performance of the proposed approach was compared with those of various state-of-the-art fully convolutional networks (FCNs) with different encoder architectures, including U-Net (based on VGG-16 backbone), pyramid scene parsing network, and two variants of DeepLab V3+. Experimental results showed that the proposed model outperformed other FCNs in the validation and testing datasets. The generalizability evaluation of the proposed approach on a comprehensive and complex testing dataset exhibited higher classification accuracy and showed that date palm trees could be automatically mapped from VHSR UAV images with an F-score, mean intersection over union, precision, and recall of 91%, 85%, 0.91, and 0.92, respectively. The proposed approach provides an efficient deep learning architecture for the automatic mapping of date palm trees from VHSR UAV-based images.
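The reported F-score, intersection over union, precision, and recall all derive from pixel-level counts of true positives, false positives, and false negatives. The sketch below computes them for a single class from two flattened binary masks. The function name and the toy masks are assumptions for illustration, and the paper's mean IoU would additionally average the IoU over classes.

```python
def segmentation_metrics(pred, truth):
    """Precision, recall, F-score and IoU for one class.

    pred and truth are equal-length sequences of 0/1 pixel labels.
    """
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1          # predicted palm, actually palm
        elif p and not t:
            fp += 1          # predicted palm, actually background
        elif t and not p:
            fn += 1          # missed palm pixel
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall,
            "f_score": f_score, "iou": iou}


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1]
    reference = [1, 0, 0, 1, 1]
    print(segmentation_metrics(predicted, reference))
```

Note that IoU is always less than or equal to the F-score for the same counts, which is consistent with the lower IoU figure in the abstract.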

Orchard Mapping with Deep Learning Semantic Segmentation

Sensors ◽

10.3390/s21113813 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3813

Author(s):

Athanasios Anagnostis ◽

Aristotelis C. Tagarakis ◽

Dimitrios Kateris ◽

Vasileios Moysiadis ◽

Claus Grøn Sørensen ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Semantic Segmentation ◽

Automated Detection ◽

Aerial Images ◽

Training Dataset ◽

Field Boundary ◽

Different Seasons ◽

Detection And Localization ◽

Different Levels

This study aimed to propose an approach for orchard trees segmentation using aerial images based on a deep learning convolutional neural network variant, namely the U-net network. The purpose was the automated detection and localization of the canopy of orchard trees under various conditions (i.e., different seasons, different tree ages, different levels of weed coverage). The implemented dataset was composed of images from three different walnut orchards. The achieved variability of the dataset resulted in obtaining images that fell under seven different use cases. The best-trained model achieved 91%, 90%, and 87% accuracy for training, validation, and testing, respectively. The trained model was also tested on never-before-seen orthomosaic images or orchards based on two methods (oversampling and undersampling) in order to tackle issues with out-of-the-field boundary transparent pixels from the image. Even though the training dataset did not contain orthomosaic images, it achieved performance levels that reached up to 99%, demonstrating the robustness of the proposed approach.

Convolutional Neural Network for the Semantic Segmentation of Remote Sensing Images

Mobile Networks and Applications ◽

10.1007/s11036-020-01703-3 ◽

2021 ◽

Vol 26 (1) ◽

pp. 200-215

Author(s):

Muhammad Alam ◽

Jian-Feng Wang ◽

Cong Guangpei ◽

LV Yunrong ◽

Yuanfang Chen

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Neural Networks ◽

Image Processing ◽

Deep Learning ◽

Semantic Segmentation ◽

Natural Scene ◽

Remote Sensing Images ◽

Advantages And Disadvantages ◽

Target Segmentation

Abstract: In recent years, the success of deep learning in natural scene image processing has boosted its application in the analysis of remote sensing images. In this paper, we apply Convolutional Neural Networks (CNN) to the semantic segmentation of remote sensing images. We improve the Encoder-Decoder CNN structure SegNet with index pooling and U-net to make them suitable for multi-target semantic segmentation of remote sensing images. The results show that these two models have their own advantages and disadvantages in the segmentation of different objects. In addition, we propose an integrated algorithm that combines these two models. Experimental results show that the presented integrated algorithm can exploit the advantages of both models for multi-target segmentation and achieve better segmentation than either model alone.

SketchGNN: Semantic Sketch Segmentation with Graph Neural Networks

ACM Transactions on Graphics ◽

10.1145/3450284 ◽

2021 ◽

Vol 40 (3) ◽

pp. 1-13

Author(s):

Lumin Yang ◽

Jiajie Zhuang ◽

Hongbo Fu ◽

Xiangzhi Wei ◽

Kun Zhou ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Architecture ◽

Large Scale ◽

State Of The Art ◽

Semantic Segmentation ◽

Structure Information ◽

Graph Neural Networks ◽

Node Labels ◽

Point Level

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, our SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract the features at three levels, i.e., point-level, stroke-level, and sketch-level. SketchGNN significantly improves the accuracy of the state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric over a large-scale challenging SPG dataset) and has orders of magnitude fewer parameters than both image-based and sequence-based methods.
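The graph representation described above can be sketched in a few lines. This is an assumption-laden illustration, not the SketchGNN code: it supposes that each stroke arrives as a list of already sampled (x, y) points and that the "stroke structure" edges link consecutive points within the same stroke. The function name `sketch_to_graph` is hypothetical.

```python
def sketch_to_graph(strokes):
    """Turn a list of strokes into nodes, edges and stroke ids.

    strokes: list of strokes, each a list of (x, y) sampled points.
    Returns (nodes, edges, stroke_ids) where edges hold index pairs
    of consecutive points on the same stroke.
    """
    nodes = []       # one node per sampled point
    edges = []       # (i, j) pairs of node indices
    stroke_ids = []  # which stroke each node belongs to
    for sid, stroke in enumerate(strokes):
        start = len(nodes)
        nodes.extend(stroke)
        stroke_ids.extend([sid] * len(stroke))
        # Link neighbouring points; no edge crosses stroke boundaries.
        edges.extend((i, i + 1)
                     for i in range(start, start + len(stroke) - 1))
    return nodes, edges, stroke_ids


if __name__ == "__main__":
    example = [[(0, 0), (1, 0), (2, 0)], [(5, 5), (6, 6)]]
    print(sketch_to_graph(example))
```

A per-node classifier would then assign a semantic label to each entry of `nodes`, and `stroke_ids` allows point-level predictions to be pooled per stroke.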

Classification of Skin Disease Using Deep Learning Neural Networks with MobileNet V2 and LSTM

Sensors ◽

10.3390/s21082852 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2852

Author(s):

Parvathaneni Naga Srinivasu ◽

Jalluri Gnana SivaSai ◽

Muhammad Fazal Ijaz ◽

Akash Kumar Bhoi ◽

Wonjoon Kim ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Skin Disease ◽

Network Architecture ◽

Large Scale ◽

Short Term Memory ◽

Convolutional Networks ◽

Occurrence Matrix

Deep learning models are efficient in learning the features that assist in understanding complex patterns precisely. This study proposed a computerized process of classifying skin disease through deep learning-based MobileNet V2 and Long Short Term Memory (LSTM). The MobileNet V2 model proved to be efficient, with better accuracy, and can work on lightweight computational devices. The proposed model is efficient in maintaining stateful information for precise predictions. A grey-level co-occurrence matrix is used for assessing the progress of diseased growth. The performance has been compared against other state-of-the-art models such as Fine-Tuned Neural Networks (FTNN), Convolutional Neural Network (CNN), Very Deep Convolutional Networks for Large-Scale Image Recognition developed by Visual Geometry Group (VGG), and convolutional neural network architecture that expanded with few changes. The HAM10000 dataset is used, and the proposed method has outperformed other methods with more than 85% accuracy. Its robustness in recognizing the affected region much faster, with almost 2× fewer computations than the conventional MobileNet model, results in minimal computational effort. Furthermore, a mobile application is designed for instant and proper action. It helps patients and dermatologists identify the type of disease from an image of the affected region at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners efficiently and effectively diagnose skin conditions, thereby reducing further complications and morbidity.

Real-time deep learning semantic segmentation during intra-operative surgery for 3D augmented reality assistance

International Journal of Computer Assisted Radiology and Surgery ◽

10.1007/s11548-021-02432-y ◽

2021 ◽

Author(s):

Leonardo Tanzi ◽

Pietro Piazzolla ◽

Francesco Porpiglia ◽

Enrico Vezzetti

Keyword(s):

Neural Network ◽

Deep Learning ◽

Augmented Reality ◽

Ad Hoc ◽

Geodesic Distance ◽

Semantic Segmentation ◽

Endoscopic Image ◽

Operative Surgery ◽

Endoscopic Videos

Abstract. Purpose: The current study aimed to propose a Deep Learning (DL) and Augmented Reality (AR) based solution for in-vivo robot-assisted radical prostatectomy (RARP), to improve the precision of a published work from our group. We implemented a two-step automatic system to align a 3D virtual ad-hoc model of a patient’s organ with its 2D endoscopic image, to assist surgeons during the procedure. Methods: This approach was carried out using a Convolutional Neural Network (CNN) based structure for semantic segmentation and a subsequent elaboration of the obtained output, which produced the parameters needed for attaching the 3D model. We used a dataset obtained from 5 endoscopic videos (A, B, C, D, E), selected and tagged by our team’s specialists. We then evaluated the best-performing pairing of segmentation architecture and neural network and tested the overlay performance. Results: U-Net stood out as the most effective architecture for segmentation. ResNet and MobileNet obtained similar Intersection over Union (IoU) results, but MobileNet was able to process almost twice as many operations per second. This segmentation technique outperformed the results from the former work, obtaining an average IoU for the catheter of 0.894 (σ = 0.076) compared to 0.339 (σ = 0.195). These modifications also led to an improvement in the 3D overlay performance, in particular in the Euclidean Distance between the predicted and actual model’s anchor point, from 12.569 (σ = 4.456) to 4.160 (σ = 1.448), and in the Geodesic Distance between the predicted and actual model’s rotations, from 0.266 (σ = 0.131) to 0.169 (σ = 0.073). Conclusion: This work is a further step toward the adoption of DL and AR in the surgery domain. In future works, we will overcome the limits of this approach and finally improve every step of the surgical procedure.

SHEDR: An End-to-End Deep Neural Event Detection and Recommendation Framework for Hyperlocal News Using Social Media

INFORMS Journal on Computing ◽

10.1287/ijoc.2021.1112 ◽

2021 ◽

Author(s):

Yuheng Hu ◽

Yili Hong

Keyword(s):

Neural Network ◽

Social Media ◽

Deep Learning ◽

Event Detection ◽

Large Scale ◽

Short Term Memory ◽

State Of The Art ◽

Neural Network Models ◽

Neural Event ◽

End To End

Residents often rely on newspapers and television to gather hyperlocal news for community awareness and engagement. More recently, social media have emerged as an increasingly important source of hyperlocal news. Thus far, the literature on using social media to create desirable societal benefits, such as civic awareness and engagement, is still in its infancy. One key challenge in this research stream is to distill information from noisy social media data streams for community members in a timely and accurate manner. In this work, we develop SHEDR (social media–based hyperlocal event detection and recommendation), an end-to-end neural event detection and recommendation framework with a particular use case for Twitter to facilitate residents’ information seeking of hyperlocal events. The key model innovation in SHEDR lies in the design of the hyperlocal event detector and the event recommender. First, we harness the power of two popular deep neural network models, the convolutional neural network (CNN) and long short-term memory (LSTM), in a novel joint CNN-LSTM model to characterize spatiotemporal dependencies for capturing unusualness in a region of interest, which is classified as a hyperlocal event. Next, we develop a neural pairwise ranking algorithm for recommending detected hyperlocal events to residents based on their interests. To alleviate the sparsity issue and improve personalization, our algorithm incorporates several types of contextual information covering topic, social, and geographical proximities. We perform comprehensive evaluations based on two large-scale data sets comprising geotagged tweets covering Seattle and Chicago. We demonstrate the effectiveness of our framework in comparison with several state-of-the-art approaches. We show that our hyperlocal event detection and recommendation models consistently and significantly outperform other approaches in terms of precision, recall, and F-1 scores.
Summary of Contribution: In this paper, we focus on a novel and important, yet largely underexplored application of computing: how to improve civic engagement in local neighborhoods via local news sharing and consumption based on social media feeds. To address this question, we propose two new computational and data-driven methods: (1) a deep learning–based hyperlocal event detection algorithm that scans spatially and temporally to detect hyperlocal events from geotagged Twitter feeds; and (2) a personalized deep learning–based hyperlocal event recommender system that systematically integrates several contextual cues such as topical, geographical, and social proximity to recommend the detected hyperlocal events to potential users. We conduct a series of experiments to examine our proposed models. The outcomes demonstrate that our algorithms are significantly better than the state-of-the-art models and can provide users with more relevant information about the local neighborhoods that they live in, which in turn may boost their community engagement.
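The abstract does not specify the loss behind its pairwise ranking algorithm. As a hedged illustration of the general idea only, the sketch below uses a common logistic pairwise loss, -log σ(s_preferred - s_other), which is small when an event the resident engaged with is scored above one they ignored. The function name and the example scores are hypothetical and are not taken from SHEDR.

```python
import math


def pairwise_logistic_loss(score_preferred, score_other):
    """Logistic pairwise ranking loss, log(1 + exp(-(s_p - s_o))).

    Written in a numerically stable form so that large score gaps
    do not overflow math.exp.
    """
    diff = score_preferred - score_other
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


if __name__ == "__main__":
    # Made-up model scores for two events shown to one resident.
    print(pairwise_logistic_loss(2.3, 0.4))   # correct order, low loss
    print(pairwise_logistic_loss(0.4, 2.3))   # wrong order, higher loss
```

Training on such pairs only requires relative preferences, which is one reason pairwise objectives are popular when explicit ratings are sparse.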

Strong-Structural Convolution Neural Network for Semantic Segmentation

Pattern Recognition and Image Analysis ◽

10.1134/s1054661819040126 ◽

2019 ◽

Vol 29 (4) ◽

pp. 716-729

Author(s):

Yi Ouyang

Keyword(s):

Neural Network ◽

Semantic Segmentation ◽

Convolution Neural Network