A robust identification method for nonferrous metal scraps based on deep learning and superpixel optimization

2021 ◽  
pp. 0734242X2098788
Author(s):  
Yifeng Li ◽  
Xunpeng Qin ◽  
Zhenyuan Zhang ◽  
Huanyu Dong

End-of-life vehicles (ELVs) are a particularly potent source of supply for metals. Hence, recycling and sorting techniques for ferrous and nonferrous metal scraps from ELVs significantly increase metal resource utilization. However, different kinds of nonferrous metal scraps, such as aluminium (Al) and copper (Cu), are not further classified automatically owing to the lack of suitable techniques. The purpose of this study is to propose an identification method for different nonferrous metal scraps, facilitate their further separation, achieve better management of recycled metal resources and increase sustainability. A convolutional neural network (CNN) and SEEDS (superpixels extracted via energy-driven sampling) were adopted in this study. To build the classifier, 80 training images of randomly chosen Al and Cu scraps were taken, and several practical methods were proposed, including training patch generation with SEEDS, image data augmentation and automatic labelling of large amounts of training data. To obtain more accurate results, SEEDS was also used to refine the coarse results produced by the pretrained CNN model. Five indicators were adopted to evaluate the final identification results. Furthermore, 15 test samples covering different classification environments were tested with the proposed model, which performed well on all of the evaluation indicators, with an average precision of 0.98. The results demonstrate that the proposed model is robust for metal scrap identification, can be extended to complex industrial environments, and opens new possibilities for highly accurate automatic nonferrous metal scrap classification.
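As a rough illustration of the patch-generation step, the sketch below oversegments a scrap image with OpenCV's SEEDS implementation and crops one fixed-size patch per superpixel. The patch size, superpixel count and centroid-cropping rule are illustrative assumptions, not the authors' settings.

```python
# Minimal sketch of SEEDS-based training-patch generation
# (requires opencv-contrib-python); parameter values are assumptions.
import cv2
import numpy as np

def superpixel_patches(image_bgr, num_superpixels=200, patch_size=32):
    h, w, c = image_bgr.shape  # assumes image larger than patch_size
    seeds = cv2.ximgproc.createSuperpixelSEEDS(
        w, h, c, num_superpixels, num_levels=4, prior=2, histogram_bins=5)
    seeds.iterate(image_bgr, 10)          # refine superpixel boundaries
    labels = seeds.getLabels()            # per-pixel superpixel index
    patches = []
    for k in range(seeds.getNumberOfSuperpixels()):
        ys, xs = np.nonzero(labels == k)
        if len(ys) == 0:
            continue
        cy, cx = int(ys.mean()), int(xs.mean())   # superpixel centroid
        y0 = int(np.clip(cy - patch_size // 2, 0, h - patch_size))
        x0 = int(np.clip(cx - patch_size // 2, 0, w - patch_size))
        patches.append(image_bgr[y0:y0 + patch_size, x0:x0 + patch_size])
    return patches
```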

Author(s):  
Peilian Zhao ◽  
Cunli Mao ◽  
Zhengtao Yu

Aspect-Based Sentiment Analysis (ABSA), a fine-grained opinion-mining task that aims to extract the sentiment expressed towards a specific target in text, is important in many real-world applications, especially in the legal field. In this paper, we therefore study two problems of End-to-End Aspect-Based Sentiment Analysis (E2E-ABSA) in the legal field: the limited availability of labeled training data and the neglect of in-domain knowledge representation. We propose a new deep learning method, named Semi-ETEKGs, which applies an E2E framework with knowledge graph (KG) embeddings in the legal field after data augmentation (DA). Specifically, we pre-train BERT embeddings and in-domain KG embeddings on unlabeled data and on labeled data with case elements after DA, and then feed both embeddings into the E2E framework to classify the polarity of each target entity. Finally, we built a case-related dataset based on a popular ABSA benchmark to evaluate Semi-ETEKGs, and experiments on this dataset of microblog comments show that our proposed model significantly outperforms the compared methods.
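A minimal sketch of the embedding-fusion idea follows: contextual BERT token features are concatenated with pre-trained in-domain KG vectors and classified per token. The dimensions, tag count and token-to-KG alignment are assumptions, not the paper's exact configuration.

```python
# Sketch: fuse BERT token embeddings with KG embeddings for unified
# E2E-ABSA tagging; kg_dim and num_tags are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import BertModel

class E2EFusionTagger(nn.Module):
    def __init__(self, kg_dim=100, num_tags=7):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        self.classifier = nn.Linear(self.bert.config.hidden_size + kg_dim,
                                    num_tags)

    def forward(self, input_ids, attention_mask, kg_embeds):
        # kg_embeds: (batch, seq_len, kg_dim) KG vectors aligned to tokens
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        fused = torch.cat([hidden, kg_embeds], dim=-1)
        return self.classifier(fused)     # per-token aspect/polarity tags
```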


2020 ◽  
Vol 12 (6) ◽  
pp. 1014
Author(s):  
Jingchao Jiang ◽  
Cheng-Zhi Qin ◽  
Juan Yu ◽  
Changxiu Cheng ◽  
Junzhi Liu ◽  
...  

Reference objects in video images can be used to indicate urban waterlogging depths. Detecting these reference objects is the key step in obtaining waterlogging depths from video images. Object detection models based on convolutional neural networks (CNNs) have been used to detect reference objects, but they require a large number of labeled images as training data to ensure applicability at a city scale. However, it is hard to collect a sufficient number of urban flooding images containing valuable reference objects, and manually labeling images is time-consuming and expensive. To solve this problem, we present a method to synthesize training images. Firstly, original images containing reference objects and original images with water surfaces are collected from open data sources, and the reference objects and water surfaces are cropped from them. Secondly, the cropped reference objects and water surfaces are enriched via data augmentation techniques to ensure diversity. Finally, the enriched reference objects and water surfaces are combined to generate a synthetic image dataset with annotations, which is then used to train a CNN-based object detection model. Waterlogging depths are calculated from the reference objects detected by the trained model. A real video dataset and an artificial image dataset are used to evaluate the effectiveness of the proposed method. The results show that the detection model trained on the synthetic image dataset can effectively detect reference objects and achieves acceptable accuracy in the derived waterlogging depths. The proposed method thus has the potential to monitor waterlogging depths at a city scale.
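As a rough sketch of the compositing step, the snippet below pastes an augmented reference-object crop onto a water-surface background and records its bounding box as the annotation. The augmentation choices and annotation format are assumptions, not the authors' exact pipeline.

```python
# Sketch: compose a synthetic training image from a cropped reference
# object and a water-surface background; assumes object fits in background.
import random
from PIL import Image, ImageEnhance

def synthesize(background_path, object_path):
    bg = Image.open(background_path).convert("RGB")
    obj = Image.open(object_path).convert("RGBA")   # cropped reference object
    # simple augmentations: random scale and brightness
    scale = random.uniform(0.5, 1.2)
    obj = obj.resize((int(obj.width * scale), int(obj.height * scale)))
    obj = ImageEnhance.Brightness(obj).enhance(random.uniform(0.8, 1.2))
    x = random.randint(0, bg.width - obj.width)
    y = random.randint(0, bg.height - obj.height)
    bg.paste(obj, (x, y), obj)                      # alpha-respecting paste
    bbox = (x, y, x + obj.width, y + obj.height)    # detection annotation
    return bg, bbox
```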


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Wenting Qiao ◽  
Hongwei Zhang ◽  
Fei Zhu ◽  
Qiande Wu

Traditional methods for detecting cracks in concrete bridges suffer from low accuracy and weak robustness. Using crack image data obtained from bending tests of reinforced concrete beams, this article proposes a crack identification method for concrete structures based on an improved U-net convolutional neural network. Firstly, a bending test of concrete beams is conducted to collect crack images. Secondly, crack image datasets are built using data augmentation, and selected cracks are labelled. Thirdly, an improved inception module and an atrous spatial pyramid pooling (ASPP) module are added to the U-net architecture. Finally, crack widths are measured from the binary crack images produced by the improved U-net model. The average precision of the proposed model on the test set is 11.7% higher than that of the plain U-net segmentation model. The average relative error of the crack widths obtained with the proposed model is 13.2%, which is 18.6% lower than that measured with the ACTIS system. The results indicate that the proposed method is accurate, robust and suitable for crack identification in concrete structures.
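The sketch below shows a generic ASPP block of the kind added to the improved U-net: parallel dilated convolutions capture crack context at multiple scales. The dilation rates and channel widths are assumptions, not the paper's values.

```python
# Sketch of an atrous spatial pyramid pooling (ASPP) block in PyTorch;
# rates and widths are illustrative assumptions.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True))
            for r in rates])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        # parallel dilated convolutions see the crack at several scales
        feats = [b(x) for b in self.branches]
        return self.project(torch.cat(feats, dim=1))
```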


PLoS ONE ◽  
2021 ◽  
Vol 16 (4) ◽  
pp. e0250093
Author(s):  
Fabian Englbrecht ◽  
Iris E. Ruider ◽  
Andreas R. Bausch

Dataset annotation is a time- and labor-intensive task and an integral requirement for training and testing deep learning models. Image segmentation in life-science microscopy requires annotated image datasets for object detection tasks such as instance segmentation. Although the amount of annotated image data required has been steadily reduced by methods such as data augmentation, manual or semi-automated annotation remains the most labor- and cost-intensive step in cell nuclei segmentation with deep neural networks. In this work we propose a system that fully automates the annotation of a custom fluorescent cell nuclei image dataset, reducing nuclei labelling time by up to 99.5%. The output of our system provides high-quality training data for machine learning applications that identify the positions of cell nuclei in microscopy images. Our experiments show that the automatically annotated dataset yields segmentation performance on par with manual annotation. In addition, we show that our system enables a single workflow from raw data input to the desired nuclei segmentation and tracking results without relying on pre-trained models or third-party training datasets.
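A minimal sketch of one way such automatic annotation can work is shown below: threshold the fluorescence channel and label connected components as instance masks. This mirrors the general idea only; the authors' pipeline details may differ.

```python
# Sketch: auto-annotate fluorescent nuclei via Otsu thresholding and
# connected-component labelling; min_area is an illustrative assumption.
import numpy as np
from skimage import io, filters, measure, morphology

def auto_annotate(nuclei_image_path, min_area=50):
    img = io.imread(nuclei_image_path)           # fluorescence channel
    mask = img > filters.threshold_otsu(img)     # foreground nuclei
    mask = morphology.remove_small_objects(mask, min_size=min_area)
    labels = measure.label(mask)                 # one integer id per nucleus
    boxes = [r.bbox for r in measure.regionprops(labels)]
    return labels, boxes                         # instance masks + boxes
```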


2019 ◽  
Vol 2019 ◽  
pp. 1-13 ◽  
Author(s):  
Yunong Tian ◽  
Guodong Yang ◽  
Zhe Wang ◽  
En Li ◽  
Zize Liang

Plant disease is one of the primary causes of crop yield reduction. With the development of computer vision and deep learning technology, autonomous detection of plant surface lesions in images collected by optical sensors has become an important research direction for timely crop disease diagnosis. In this paper, an anthracnose lesion detection method based on deep learning is proposed. Firstly, to address the shortage of image data caused by the sporadic occurrence of apple diseases, a Cycle-Consistent Adversarial Network (CycleGAN) is used for data augmentation in addition to traditional image augmentation techniques. These methods effectively enrich the diversity of the training data and provide a solid foundation for training the detection model. On this basis, a densely connected neural network (DenseNet) is used to optimize the lower-resolution feature layers of the YOLO-V3 model. DenseNet greatly improves feature reuse within the network and enhances the detection results of YOLO-V3. Experiments verify that the improved model outperforms Faster R-CNN with a VGG16 backbone, the original YOLO-V3 model and three other state-of-the-art networks in detection performance while running in real time. The proposed method is well suited to detecting anthracnose lesions on apple surfaces in orchards.
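The sketch below illustrates the dense-connection idea used to strengthen YOLO-V3's low-resolution feature layers: each layer receives the concatenation of all preceding feature maps, DenseNet-style. The growth rate and layer count are assumptions.

```python
# Sketch of a DenseNet-style block for feature reuse; sizes are assumptions.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_ch, growth=32, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, growth, 3, padding=1, bias=False)))
            ch += growth

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))  # reuse all earlier maps
        return torch.cat(feats, dim=1)
```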


2021 ◽  
Vol 11 (12) ◽  
pp. 5586
Author(s):  
Eunkyeong Kim ◽  
Jinyong Kim ◽  
Hansoo Lee ◽  
Sungshin Kim

Artificial intelligence technologies and robot vision systems are core technologies in smart factories. There is currently scholarly interest in automatic feature extraction from smart-factory data using deep learning networks, but sufficient training data are required to train these networks, and barely perceptible noise can degrade classification accuracy. Therefore, to increase the amount of training data and achieve robustness against noise attacks, this study developed a data augmentation method based on the adaptive inverse peak signal-to-noise ratio (PSNR) that accounts for the color characteristics of the training images. The method automatically determines the optimal perturbation range of a color perturbation scheme, generating images with weights derived from the characteristics of the training images. The experimental results showed that the proposed method could generate new training images from the originals, classify noisy images more accurately and generally improve classification accuracy. This demonstrates that the proposed method is effective and robust to noise, even when training data are deficient.
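As a loose illustration of PSNR-guided color perturbation, the sketch below jitters the color channels and keeps only perturbed samples whose distortion, measured by PSNR against the original, stays within a target band. The acceptance band and retry scheme are assumptions, not the paper's exact formula.

```python
# Sketch: accept color perturbations whose PSNR against the original
# lies in a chosen range; all thresholds are illustrative assumptions.
import numpy as np

def psnr(a, b):
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse) if mse > 0 else np.inf

def color_perturb(img, max_shift=30, min_psnr=25, max_psnr=40, tries=10):
    for _ in range(tries):
        shift = np.random.uniform(-max_shift, max_shift, size=(1, 1, 3))
        out = np.clip(img.astype(np.float64) + shift, 0, 255).astype(np.uint8)
        if min_psnr <= psnr(img, out) <= max_psnr:
            return out                     # distortion within the target band
    return img                             # fall back to the original image
```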


Author(s):  
Boonnatee Sakboonyarat ◽  
Pinyo Taeprasartsit

Objective: Cascaded/attention-based neural networks have become common in image segmentation. This work proposes to improve their robustness by adding discriminative image enhancement to the attention mechanism. Unlike prior work, this image enhancement can also serve as data augmentation and is easily adapted to existing models; its generality can improve accuracy across multiple segmentation tasks and datasets. Methods: The method first localizes a target organ in 2D to obtain a tight neighborhood of the organ in each slice. Next, it computes an HU histogram of a region combined from multiple 2D neighborhoods, which lets it adaptively handle HU-range differences among images. HUs are then nonlinearly stretched through a parameterized mapping function, providing discriminative features for the neural network. Varying the function parameters creates different intensity distributions of the target region, effectively enhancing and augmenting the image data at the same time. The HU-reassigned region is then fed to a segmentation model for training. Results: Our experiments on liver and kidney segmentation showed that even a simple cascaded 2D U-Net could deliver competitive performance on a variety of datasets. In addition, cross-validation and ablation analysis indicated that the method remains robust even when the number of original training samples is limited. Conclusion: With the proposed technique, a simple model with limited training data can deliver competitive performance. Significance: The method significantly improves the robustness of a trained model and readily generalizes to other segmentation tasks and attention-based models. Accurate models can thus be kept simpler, saving computing resources.
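A minimal sketch of the parameterized HU-stretching idea follows: HUs in a target window are mapped through a nonlinear (here gamma-like) curve, and varying the parameters yields differently enhanced copies of the same region, serving enhancement and augmentation at once. The mapping form and window values are assumptions.

```python
# Sketch: nonlinear HU stretching as joint enhancement + augmentation;
# window bounds and the gamma mapping are illustrative assumptions.
import numpy as np

def stretch_hu(volume, lo=-100.0, hi=300.0, gamma=0.7):
    # clip to the organ's HU window, normalize, then stretch nonlinearly
    x = np.clip(volume, lo, hi)
    x = (x - lo) / (hi - lo)          # map window to [0, 1]
    return x ** gamma                 # gamma < 1 brightens soft-tissue contrast

# augmentation: sample gamma (and/or the window) per training example
# aug = stretch_hu(ct_region, gamma=np.random.uniform(0.5, 1.5))
```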


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Taohong Zhang ◽  
Suli Fan ◽  
Junnan Hu ◽  
Xuxu Guo ◽  
Qianqian Li ◽  
...  

In this paper, a feature fusion method with guided training (FGT-Net) is constructed to fuse image data and numerical data for recognition tasks that cannot be classified accurately from images alone. The proposed structure is divided into a shared-weight network, a feature fusion layer and a classification layer. First, the guided training method is proposed to optimize the training process: representative images and training images are fed into the shared-weight network, which learns to extract image features more effectively. The image features and numerical features are then fused in the feature fusion layer and passed to the classification layer for the classification task. The loss is calculated from the outputs of both the shared-weight network and the classification layer. Experiments verify the effectiveness of the proposed model: FGT-Net achieves an accuracy of 87.8%, which is 15% higher than a ShuffleNetv2 CNN (which can process image data only) and 9.8% higher than a DNN (which processes structured data only).
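The sketch below shows the fusion idea in a generic form: a CNN branch embeds the image, an MLP branch embeds the numerical features, and the two are concatenated before classification. The branch sizes and backbone are assumptions (ShuffleNetV2 is used here only because the paper compares against it).

```python
# Sketch: image + numerical feature fusion; dimensions are assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

class FusionNet(nn.Module):
    def __init__(self, num_numeric, num_classes):
        super().__init__()
        self.cnn = models.shufflenet_v2_x1_0(num_classes=256)  # image branch
        self.mlp = nn.Sequential(nn.Linear(num_numeric, 64), nn.ReLU(),
                                 nn.Linear(64, 64), nn.ReLU())
        self.classifier = nn.Linear(256 + 64, num_classes)

    def forward(self, image, numeric):
        fused = torch.cat([self.cnn(image), self.mlp(numeric)], dim=-1)
        return self.classifier(fused)
```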


2020 ◽  
Vol 34 (6) ◽  
pp. 693-700
Author(s):  
Glen Bennet Hermon ◽  
Durgansh Sharma

Earlier techniques for identifying individual lions (Panthera leo) relied on manual data recording, a process with various shortcomings. This research work aims to automate the encoding of the uniqueness of each lion's whisker-spot pattern, non-invasively, from photographs. The main bottleneck was the limited availability of image data for individual lions. The proposed model embeds the pattern of a specific individual as a distinct cluster within its embedding space. This is achieved with a triplet loss function which, owing to its one-shot learning nature, can train a deep inception network with little training data. Photographic images vary in lighting, pose, angle and other respects; since these issues are nonlinear in nature, deep learning techniques are preferred for the target model. An inception network is trained to generate 128-dimensional vectors unique to each lion. This paper elaborates on these deep machine learning techniques and the other processes used to create the model.
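A minimal sketch of triplet training for such embeddings follows: images of the same lion are pulled together and different lions pushed apart in a 128-dimensional space. The backbone variant, margin and normalization are assumptions; Inception-v3 expects 299x299 inputs.

```python
# Sketch: triplet-loss training of a 128-d whisker-spot embedder;
# margin and backbone settings are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

embedder = models.inception_v3(aux_logits=False, num_classes=128)
triplet_loss = nn.TripletMarginLoss(margin=0.2)

def train_step(anchor, positive, negative, optimizer):
    # L2-normalize so distances live on the unit hypersphere
    a, p, n = (nn.functional.normalize(embedder(x), dim=-1)
               for x in (anchor, positive, negative))
    loss = triplet_loss(a, p, n)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```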


Author(s):  
M. S. Mueller ◽  
T. Sattler ◽  
M. Pollefeys ◽  
B. Jutzi

Abstract. The performance of machine learning and deep learning algorithms for image analysis depends significantly on the quantity and quality of the training data. Generating annotated training data is often costly, time-consuming and laborious, and data augmentation is a powerful way to overcome these drawbacks. We therefore augment training data by rendering images with arbitrary poses from 3D models to increase the quantity of training images. These rendered images usually show artifacts and are of limited use for advanced image analysis, so we propose to use image-to-image translation to transform images from the rendered domain to the captured domain. We show that translated images in the captured domain are of higher quality than the rendered images. Moreover, we demonstrate that image-to-image translation based on rendered 3D models enhances the performance of common computer vision tasks, namely feature matching, image retrieval and visual localization. The experimental results clearly show the improvement of translated over rendered images for all investigated tasks. In addition, we present the advantages of utilizing translated images over exclusively captured images for visual localization.
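As a rough sketch of the inference step, rendered views are passed through a trained rendered-to-captured generator before being used for feature matching or localization. The checkpoint path and generator object below are hypothetical placeholders; the paper's specific translation model may differ.

```python
# Sketch: apply a trained rendered->captured generator to rendered images;
# the checkpoint "rendered2captured_generator.pt" is a hypothetical name.
import torch

generator = torch.load("rendered2captured_generator.pt")  # full saved model
generator.eval()

@torch.no_grad()
def translate(rendered_batch):
    # rendered_batch: (N, 3, H, W) in [-1, 1], typical for GAN generators
    return generator(rendered_batch).clamp(-1, 1)
```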

