Research on Multi-Object Sorting System Based on Deep Learning

Sensors ◽  
2021 ◽  
Vol 21 (18) ◽  
pp. 6238
Author(s):  
Hongyan Zhang ◽  
Huawei Liang ◽  
Tao Ni ◽  
Lingtao Huang ◽  
Jinsong Yang

As a complex task, robot sorting has become a research hotspot. To enable robots to perform simple, efficient, stable and accurate sorting of stacked multi-objects in unstructured scenes, this paper builds a robot multi-object sorting system. First, a rotated-object detection model is constructed, and the placement states of five common objects in unstructured scenes are collected as the training set. The trained model provides the position, rotation angle and category of each target object. Next, an instance segmentation model is built and trained on the same data set; the optimized Mask R-CNN instance segmentation network segments the object surface pixels, from which the upper-surface point cloud is extracted and its normal vector computed. The object's attitude is then obtained by fusing this normal vector with the rotation angle produced by the rotated-object detection network, and the grasping order is calculated from the average depth of each surface. Finally, after the obtained object posture, category and grasping sequence are fused, the rotated-object detection network, the instance segmentation network and the complete robot sorting system are tested on the established experimental platform. Based on this system, the grasp success rate is compared between the single networks and the integrated network. The experimental results show that the proposed deep-learning-based multi-object sorting system can sort stacked objects efficiently, accurately and stably in unstructured scenes.
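
The normal-vector and grasp-ordering steps described above can be illustrated with a minimal sketch, assuming each object's upper surface is available as an N x 3 point cloud in the camera frame (NumPy only; the function names are hypothetical):

```python
import numpy as np

def surface_normal(points):
    """Estimate the normal of an upper-surface point cloud (N x 3) via PCA:
    the eigenvector of the covariance matrix with the smallest eigenvalue."""
    centered = points - points.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    normal = eigvecs[:, 0]                       # direction of least variance
    return -normal if normal[2] > 0 else normal  # orient toward camera (+z assumed pointing away)

def grasping_order(surfaces):
    """Sort object indices by the average depth (z) of their upper surfaces,
    so the topmost object in the stack is grasped first."""
    return sorted(range(len(surfaces)), key=lambda i: surfaces[i][:, 2].mean())
```

The estimated normal would then be fused with the rotation angle from the rotated-object detection network to form the full grasp pose.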

Author(s):  
Kitsuchart Pasupa ◽  
Phongsathorn Kittiworapanya ◽  
Napasin Hongngern ◽  
Kuntpong Woraratpanya

Abstract Evaluation of car damage from an accident is one of the most important processes in the car insurance business. Currently, it still requires a manual examination of every basic part. It is expected that a smart device will be able to do this evaluation more efficiently in the future. In this study, we evaluated and compared five deep learning algorithms for semantic segmentation of car parts. The baseline reference algorithm was Mask R-CNN, and the other algorithms were HTC, CBNet, PANet, and GCNet. Instance segmentation runs were conducted with all five algorithms. HTC with ResNet-50 was the best algorithm for instance segmentation on various kinds of cars such as sedans, trucks, and SUVs. It achieved a mean average precision of 55.2 on our original data set, which assigned different labels to the left and right sides, and 59.1 when a single label was assigned to both sides. In addition, the models from every algorithm were tested for robustness by running them on images of parts in real environments with various weather conditions (snow, frost, fog) and various lighting conditions. GCNet was the most robust; it achieved a mean performance under corruption (mPC) of 35.2 and a relative degradation of performance on corrupted data compared to clean data (rPC) of 64.4% when left and right sides were assigned different labels, and mPC = 38.1 and rPC = 69.6% when left- and right-side parts were considered the same part. The findings from this study may directly benefit developers of automated car damage evaluation systems in their quest for the best design.
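
As described in the abstract, mPC is the mean AP over the corruption types, and rPC expresses mPC as a fraction of the clean-data AP. A minimal sketch, assuming the per-corruption AP values are already available:

```python
import numpy as np

def robustness_metrics(ap_clean, ap_corrupted):
    """ap_clean: AP on clean data; ap_corrupted: iterable of AP values, one per
    corruption type/severity. Returns (mPC, rPC) with rPC as a percentage."""
    mpc = float(np.mean(list(ap_corrupted)))
    rpc = 100.0 * mpc / ap_clean
    return mpc, rpc
```

For example, the reported GCNet figures (mPC = 35.2, rPC = 64.4%) imply a clean AP of roughly 35.2 / 0.644, about 54.7.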


2021 ◽  
Vol 2021 ◽  
pp. 1-6
Author(s):  
Yi Lv ◽  
Zhengbo Yin ◽  
Zhezhou Yu

In order to improve the accuracy of remote sensing image target detection, this paper proposes a deep-learning-based remote sensing image target detection algorithm, DFS. First, a dimension clustering module, a loss function, and sliding-window segmentation detection are designed. The data set used in the experiments comes from Google Earth and contains 6 types of objects: airplanes, boats, warehouses, large ships, bridges, and ports. The training set, validation set, and test set contain 73,490, 22,722, and 2,138 images, respectively. It is assumed that the number of detected positive and negative samples is A and B, respectively, and the number of undetected positive and negative samples is C and D, respectively. The precision-recall curves of DFS for the six types of targets show that DFS detects bridges best and boats worst. The main reason is that bridges are relatively large and clearly distinguished from the background in the image, so they are easy to detect, whereas boats are very small and easily blend into the background, so they are difficult to detect. Compared with YOLOv2, the mAP of DFS is improved by 12.82%, the detection accuracy is improved by 13%, and the recall rate is slightly decreased, by 1%. Relative to the number of detected targets, the number of false positives (FPs) of DFS is much smaller than that of YOLOv2, so the false positive rate is greatly reduced. In addition, the average IoU of DFS is 11.84% higher than that of YOLOv2. The DFS algorithm thus has clear advantages for small-target detection and for large remote sensing images.
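
Under the abstract's notation, and assuming A corresponds to true positives, B to detected negatives (false positives), C to missed positives (false negatives) and D to true negatives, precision and recall follow directly; a minimal sketch:

```python
def precision_recall(A, B, C, D):
    """A: detected positives, B: detected negatives, C: undetected positives,
    D: undetected negatives (this mapping to TP/FP/FN/TN is an assumption)."""
    precision = A / (A + B)   # fraction of detections that are correct
    recall = A / (A + C)      # fraction of true objects that are found
    return precision, recall
```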


2021 ◽  
Vol 2132 (1) ◽  
pp. 012028
Author(s):  
Fan Yang

Abstract With more and more in-depth research on deep learning algorithms in recent years, how to use deep learning methods to detect targets in remote sensing images is key to improving the utilization efficiency of remote sensing data and realizing the transformation from data to knowledge. This paper proposes an improved YOLOv3 algorithm to address the missed and false detections of the original YOLOv3 algorithm on remote sensing targets of different sizes and widely varying aspect ratios. First, the K-means algorithm is used to cluster the data set and obtain the anchor box dimensions; second, dilated convolution with an expansion rate of 2 replaces the general convolution in the feature extraction part; then, four scales are used for prediction; finally, the improved algorithm is applied to the recognition of bridges, harbors and airports. The results show that the detection performance of the improved algorithm is about 2% better than that of the original algorithm.
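
The dilated-convolution replacement mentioned above can be sketched in PyTorch (the framework and channel sizes are assumptions for illustration, not the paper's configuration):

```python
import torch.nn as nn

# A standard 3x3 convolution as typically used in the YOLOv3 feature extractor ...
standard_conv = nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1)

# ... and a dilated counterpart with an expansion (dilation) rate of 2.
# padding=2 keeps the spatial size unchanged while enlarging the receptive field.
dilated_conv = nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=2, dilation=2)
```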


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Sumin Ma ◽  
Wenhui Huang

Since the breakthrough of deep learning in object classification in 2012, extraordinary achievements have been made in the field of target detection, but the high time and space complexity of deep-learning-based target detection networks has hindered their application in actual products. To address this problem, this paper first uses the MobileNet classification network to optimize the Faster R-CNN target detection network. Experimental results on the rare earth element detection data set show that the MobileNet classification network alone is not suitable for optimizing the Faster R-CNN network. This paper therefore proposes a classification network that combines VGG16 and MobileNet and uses the fused network to optimize the Faster R-CNN target detection network. Experimental results on the same data set show that the Faster R-CNN network optimized with the fused classification network combines the advantages of the VGG16-based and MobileNet-based Faster R-CNN networks for detecting rare earth elements. A further contribution of this article is that results on 5 time series data sets show that CDA-WR has better predictive performance than other ELM variant models, and that, based on deep learning, the accuracy of determining trace cerium in rocks and minerals is increased by more than 50%. The target detection and recognition method is also integrated into the intelligent robot used in this work, giving the robot the ability to accurately detect target objects in real time.
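
One plausible way to realize the VGG16/MobileNet fusion as a Faster R-CNN backbone is to concatenate the two feature maps channel-wise. The sketch below, using recent torchvision, is an assumption about the architecture rather than the paper's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class FusedBackbone(nn.Module):
    """Concatenates VGG16 and MobileNetV2 feature maps for use as a
    Faster R-CNN backbone (illustrative; channel counts from torchvision)."""
    def __init__(self):
        super().__init__()
        self.vgg = models.vgg16(weights=None).features            # 512-channel output
        self.mobile = models.mobilenet_v2(weights=None).features  # 1280-channel output
        self.out_channels = 512 + 1280   # attribute expected by torchvision's FasterRCNN

    def forward(self, x):
        v = self.vgg(x)
        m = self.mobile(x)
        m = F.interpolate(m, size=v.shape[-2:], mode="bilinear", align_corners=False)
        return torch.cat([v, m], dim=1)
```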


2019 ◽  
Vol 2019 (1) ◽  
pp. 360-368
Author(s):  
Mekides Assefa Abebe ◽  
Jon Yngve Hardeberg

Different whiteboard image degradations highly reduce the legibility of pen-stroke content as well as the overall quality of the images. Consequently, different researchers have addressed the problem through various image enhancement techniques. Most state-of-the-art approaches applied common image processing techniques such as background-foreground segmentation, text extraction, contrast and color enhancement, and white balancing. However, such conventional enhancement methods are incapable of recovering severely degraded pen-stroke content and produce artifacts in the presence of complex pen-stroke illustrations. To surmount these problems, the authors propose a deep-learning-based solution. They contribute a new whiteboard image data set and adopt two deep convolutional neural network architectures for whiteboard image quality enhancement. Their evaluations of the trained models demonstrate superior performance over the conventional methods.


2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT data set consists of complex patterns of handwritten Arabic text-lines. This paper contributes in three main aspects: (1) pre-processing, (2) a deep-learning-based approach, and (3) data augmentation. The pre-processing step includes pruning extra white space and de-skewing the skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflections. Combining data augmentation with the deep learning approach yields a promising improvement in results, raising the Character Recognition (CR) rate to 80.02% from a baseline of 75.08%.
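
The CTC stage of such a pipeline can be sketched with PyTorch's built-in loss (the MDLSTM itself is not part of standard PyTorch, so random tensors stand in for its per-frame outputs):

```python
import torch
import torch.nn as nn

# T frames along the text-line, C character classes (CTC blank at index 0), batch of 1.
T, C, target_len = 50, 40, 12
log_probs = torch.randn(T, 1, C).log_softmax(dim=2)   # stand-in for MDLSTM outputs
targets = torch.randint(1, C, (1, target_len), dtype=torch.long)
input_lengths = torch.tensor([T])
target_lengths = torch.tensor([target_len])

ctc = nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```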


Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier-transform-inspired method to classify human activities from time series sensor data. Methods: Our method begins by decomposing the 1D input signal into 2D patterns, motivated by the Fourier transform. The decomposition is aided by a Long Short-Term Memory (LSTM) network, which captures the temporal dependency of the signal and produces encoded sequences. The sequences, once arranged into a 2D array, can represent the fingerprints of the signals. The benefit of such a transformation is that we can exploit recent advances in deep learning models for image classification, such as the Convolutional Neural Network (CNN). Results: The proposed model is therefore a combination of LSTM and CNN. We evaluate the model over two data sets. For the first data set, which is more standardized than the other, our model outperforms previous works or is at least comparable. For the second data set, we devise schemes to generate training and testing data by varying the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that the accuracy is over 95% in some cases. We also analyze the effect of these parameters on performance.
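
A minimal PyTorch sketch of the LSTM-then-CNN idea (layer sizes are placeholders, not the authors' configuration): the LSTM encodes the 1D signal, its per-step outputs are arranged as a 2D fingerprint, and a small CNN classifies the resulting single-channel image.

```python
import torch
import torch.nn as nn

class LSTM2DCNN(nn.Module):
    def __init__(self, hidden=64, n_classes=6):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(16 * 8 * 8, n_classes),
        )

    def forward(self, signal):              # signal: (batch, seq_len, 1)
        encoded, _ = self.lstm(signal)      # (batch, seq_len, hidden)
        fingerprint = encoded.unsqueeze(1)  # treat (seq_len x hidden) as a 2D image
        return self.cnn(fingerprint)
```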


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Yahya Albalawi ◽  
Jim Buckley ◽  
Nikola S. Nikolov

Abstract This paper presents a comprehensive evaluation of data pre-processing and word embedding techniques in the context of Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processing techniques applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another for testing the generality of our models only. Our results point to the conclusion that only four of the 26 pre-processing techniques improve classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier, with an F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to BLSTM led to the most accurate model, with an F1 score of 75.2% and accuracy of 90.7%, compared to an F1 score of 90.8% achieved by Mazajak CBOW for the same architecture but with a lower accuracy of 70.89%. Our results also show that the performance of the best of the traditional classifiers we trained is comparable to that of the deep learning methods on the first data set, but significantly worse on the second data set.
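
A hedged sketch of the BLSTM-with-pre-trained-embeddings setup (PyTorch assumed; the class count and hidden size are placeholders, and the embedding matrix would be built from the Mazajak vectors):

```python
import torch
import torch.nn as nn

class TweetBLSTM(nn.Module):
    def __init__(self, embedding_matrix, n_classes=2, hidden=128):
        super().__init__()
        vocab_size, dim = embedding_matrix.shape
        self.embed = nn.Embedding.from_pretrained(
            torch.as_tensor(embedding_matrix, dtype=torch.float), freeze=True)
        self.blstm = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, token_ids):            # token_ids: (batch, seq_len)
        x = self.embed(token_ids)
        _, (h, _) = self.blstm(x)            # h: (2, batch, hidden)
        return self.fc(torch.cat([h[0], h[1]], dim=1))
```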


Author(s):  
Jianping Ju ◽  
Hong Zheng ◽  
Xiaohang Xu ◽  
Zhongyuan Guo ◽  
Zhaohui Zheng ◽  
...  

Abstract Although convolutional neural networks have achieved success in the field of image classification, there are still challenges in the field of agricultural product quality sorting, such as machine-vision-based jujube defect detection. The performance of jujube defect detection mainly depends on the feature extraction and the classifier used. Due to the diversity of the jujube material and the variability of the testing environment, traditional manual feature extraction often fails to meet the requirements of practical application. In this paper, a jujube sorting model for small data sets, based on a convolutional neural network and transfer learning, is proposed to meet the actual demands of jujube defect detection. First, the original images collected from the actual jujube sorting production line were pre-processed and augmented to establish a data set of five categories of jujube defects. The original CNN model was then improved by embedding the SE module and by replacing the softmax loss function with the triplet loss function and the center loss function. Finally, the model pre-trained on the ImageNet data set was trained on the jujube defects data set, so that the pre-trained parameters could fit the parameter distribution of the jujube defect images; this distribution was transferred to the jujube defects data set to complete the transfer of the model and realize the detection and classification of jujube defects. The classification results are visualized by heatmaps, and classification accuracy and confusion matrices are analysed against the comparison models. The experimental results show that the SE-ResNet50-CL model optimizes the fine-grained classification of jujube defects and reaches a test accuracy of 94.15%. The model has good stability and high recognition accuracy in complex environments.
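
The SE module embedded into ResNet-50 here follows the standard squeeze-and-excitation pattern; a minimal PyTorch sketch (the reduction ratio of 16 is an assumption):

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Global average pooling squeezes each channel to a scalar, a two-layer
    bottleneck produces per-channel weights, and the input is rescaled."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1)
        return x * w
```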

