Instance Segmentation Method Based on Improved Mask R-CNN for the Stacked Electronic Components

Electronics, 2020, Vol 9 (6), pp. 886
Author(s): Zhixian Yang, Ruixia Dong, Hao Xu, Jinan Gu

Object-detection methods based on deep learning play an important role in achieving machine automation. In order to achieve fast and accurate autonomous detection of stacked electronic components, an instance segmentation method based on an improved Mask R-CNN algorithm was proposed. By optimizing the feature extraction network, the performance of Mask R-CNN was improved. A dataset of electronic components containing 1200 images (992 × 744 pixels) covering four types of components was developed. Experiments on this dataset showed that the model is faster, more lightweight, and more accurate than the original: its speed is about twice that of Mask R-CNN, its size is 0.35 times that of Mask R-CNN, and its average precision (AP) is improved by about two points.
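As a rough illustration of the pipeline described in this abstract, the sketch below runs instance segmentation with the stock torchvision Mask R-CNN; the authors' optimized feature extraction network is not public here, so the model, the image path "components.jpg" and the 0.7 score threshold are placeholder assumptions, not their configuration.

import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Stand-in model: stock Mask R-CNN with a ResNet-50 FPN backbone.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()

# Placeholder path for a 992 x 744 image of stacked electronic components.
image = to_tensor(Image.open("components.jpg").convert("RGB"))

with torch.no_grad():
    output = model([image])[0]  # dict with boxes, labels, scores, masks

# Keep only confident detections before counting component instances.
keep = output["scores"] > 0.7
masks = output["masks"][keep]   # (N, 1, H, W) soft instance masks
boxes = output["boxes"][keep]   # (N, 4) boxes in xyxy format
print(f"{int(keep.sum())} instances detected")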

Author(s): Ferdi Doğan, Ibrahim Turkoğlu

Images obtained by remote sensing contain important data about the ground surface, and detecting objects on the ground surface in such images is an important problem. Deep learning models are known to give good results in object detection studies, but their relative strengths are not well established; it should therefore be clarified which model is superior for object detection and which should be used in practice. This study aimed to reveal the relative strengths of deep learning models by comparing their performance in detecting multiple objects. Eleven deep learning models frequently encountered in the literature were applied to detecting objects of 14 classes in the DOTA dataset. 49,053 objects in 888 images were used for training with the AlexNet, VGG16, VGG19, GoogLeNet, SqueezeNet, ResNet18, ResNet50, ResNet101, InceptionResNetV2, InceptionV3 and DenseNet201 models. After training, 13,772 objects of 14 classes in 277 images were used for testing with RCNN, one of the object detection methods. The performance of each model on the 14 classes was measured using Average Precision (AP) and Mean Average Precision (mAP). Performance differed from class to class, and the best-performing model varied by class. Overall, the highest mAP over the 14 classes was obtained by VGG16 with 24.64, while the lowest was obtained by InceptionResNetV2 with 11.78. This article demonstrates the success of deep learning models in detecting multiple objects in practice and is intended as a resource for researchers working on this subject.
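Since the comparison above is reported in terms of AP and mAP, the following sketch shows one common way to turn a per-class precision-recall curve into AP and average the result over classes; the all-point interpolation scheme and the numbers in per_class_ap are illustrative assumptions, not values from the study.

import numpy as np

def average_precision(recall, precision):
    # Area under the interpolated precision-recall curve (all-point scheme).
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # Make precision monotonically non-increasing from right to left.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Accumulate area only where recall actually changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

# One AP value per class (14 entries for DOTA); all numbers here are made up.
per_class_ap = {
    "plane": average_precision(np.array([0.1, 0.4, 0.8]),
                               np.array([1.0, 0.75, 0.5])),
    "ship": 0.28,     # placeholder value
    "harbor": 0.19,   # placeholder value
}
mAP = sum(per_class_ap.values()) / len(per_class_ap)
print(f"mAP = {mAP:.4f}")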


Author(s): M. N. Favorskaya, L. C. Jain

Introduction: Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches has been developed since the 1990s. In recent years, saliency detection has become one of the actively studied topics in the theory of Convolutional Neural Networks (CNNs), and many original solutions using CNNs have been proposed for salient object detection and even event detection. Purpose: A detailed survey of saliency detection methods in the deep learning era allows one to understand the current possibilities of the CNN approach for visual analysis conducted by human eye tracking and digital image processing. Results: The survey reflects the recent advances in saliency detection using CNNs. Different models available in the literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection, are discussed in chronological order. It is worth noting that automatic salient event detection in long videos became possible using the recently introduced 3D CNNs combined with 2D CNNs for salient audio detection. The article also presents a short description of public image and video datasets with annotated salient objects or events, as well as the metrics often used to evaluate results. Practical relevance: This survey is a contribution to the study of rapidly developing deep learning methods for saliency detection in images and videos.


Electronics, 2021, Vol 10 (4), pp. 517
Author(s): Seong-heum Kim, Youngbae Hwang

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easy to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.


2021, Vol 11 (3), pp. 968
Author(s): Yingchun Sun, Wang Gao, Shuguo Pan, Tao Zhao, Yahui Peng

Recently, multi-level feature networks have been extensively used in instance segmentation. However, because not all features are beneficial to instance segmentation tasks, the performance of networks cannot be adequately improved by synthesizing multi-level convolutional features indiscriminately. To solve this problem, an attention-based feature pyramid module (AFPM) is proposed, which integrates the attention mechanism into a multi-level feature pyramid network to efficiently and pertinently extract the high-level semantic features and low-level spatial structure features for instance segmentation. Firstly, we adopt a convolutional block attention module (CBAM) in feature extraction and sequentially generate attention maps that focus on instance-related features along the channel and spatial dimensions. Secondly, we build inter-dimensional dependencies through a convolutional triplet attention module (CTAM) in the lateral attention connections, which propagates helpful semantic feature maps and filters out redundant features irrelevant to instance objects. Finally, we construct branches for feature enhancement to strengthen detailed information and boost the entire feature hierarchy of the network. Experimental results on the Cityscapes dataset show that the proposed module outperforms other strong methods under different evaluation metrics and effectively improves the performance of the instance segmentation method.
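A minimal PyTorch sketch of the channel-then-spatial attention used by CBAM is given below; the reduction ratio, kernel size and tensor shapes are illustrative assumptions, and the paper's AFPM wiring into the feature pyramid is not reproduced here.

import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        # Channel attention: shared MLP over average- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 convolution over channel-wise average and max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        channel_att = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                                    self.mlp(x.amax(dim=(2, 3))))
        x = x * channel_att.view(b, c, 1, 1)
        spatial_in = torch.cat([x.mean(dim=1, keepdim=True),
                                x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(spatial_in))

feature_map = torch.randn(1, 256, 64, 64)   # e.g. one pyramid level
refined = CBAM(256)(feature_map)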


2021, Vol 11 (13), pp. 6085
Author(s): Jesus Salido, Vanesa Lomas, Jesus Ruiz-Santaquiteria, Oscar Deniz

There is a great need for preventive mechanisms against shootings and terrorist acts in public spaces with a large influx of people. While surveillance cameras have become common, the need for 24/7 monitoring and real-time response requires automatic detection methods. This paper presents a study of three convolutional neural network (CNN) models applied to the automatic detection of handguns in video surveillance images. It investigates the reduction of false positives obtained by including, in the training dataset, pose information associated with the way the handguns are held. The results highlight the best average precision (96.36%) and recall (97.23%), obtained by RetinaNet fine-tuned with an unfrozen ResNet-50 backbone, and the best precision (96.23%) and F1 score (93.36%), obtained by YOLOv3 trained on the dataset including pose information. The latter architecture was the only one that showed a consistent improvement of around 2% when pose information was expressly considered during training.
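For context, a hedged sketch of fine-tuning a RetinaNet with an unfrozen ResNet-50 backbone for a single handgun class is shown below using torchvision; the pose-annotated training set, the learning rate and the class count are assumptions and do not reproduce the authors' setup.

import torch
import torchvision
from torchvision.models.detection.retinanet import RetinaNetClassificationHead

model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True)

# Replace the classification head for two classes (background + handgun).
num_anchors = model.head.classification_head.num_anchors
model.head.classification_head = RetinaNetClassificationHead(
    in_channels=256, num_anchors=num_anchors, num_classes=2)

# Unfreeze the whole backbone so it is fine-tuned together with the heads.
for p in model.backbone.parameters():
    p.requires_grad = True

optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad],
    lr=1e-3, momentum=0.9, weight_decay=1e-4)

model.train()
# Training step (dataset-specific loading omitted):
# images  = list of CHW float tensors
# targets = list of dicts with "boxes" (xyxy) and "labels"
# losses = model(images, targets); sum(losses.values()).backward(); optimizer.step()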


Entropy, 2020, Vol 22 (2), pp. 249
Author(s): Weiguo Zhang, Chenggang Zhao, Yuxing Li

The quality and efficiency of generating face-swap images have been markedly strengthened by deep learning. For instance, the face-swap manipulations produced by DeepFake are so realistic that it is difficult to distinguish authenticity through automatic or manual detection. To improve the efficiency of distinguishing face-swap images generated by DeepFake from real facial images, a novel counterfeit feature extraction technique was developed based on deep learning and error level analysis (ELA). The method is related to entropy and information theory, such as the cross-entropy loss function in the final softmax layer. The DeepFake algorithm can only generate images of limited resolution, which results in two different image compression ratios between the fake face area (foreground) and the original area (background) and leaves distinctive counterfeit traces. The ELA method can detect whether such different image compression ratios are present. A convolutional neural network (CNN), one of the representative technologies of deep learning, can then extract the counterfeit feature and detect whether images are fake. Experiments show that the training efficiency of the CNN model can be significantly improved by the ELA method. In addition, the proposed technique accurately extracts the counterfeit feature and is therefore simpler and more efficient than direct detection methods. Specifically, without loss of accuracy, the amount of computation can be significantly reduced (the required floating-point computing power is reduced by more than 90%).
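The error level analysis step can be sketched in a few lines: re-save the image as JPEG at a fixed quality and inspect the per-pixel difference, which tends to be larger in regions with a different compression history (such as a pasted face). The quality value, scaling and file names below are illustrative choices, not the paper's exact settings.

import io
from PIL import Image, ImageChops

def ela(path, quality=90):
    original = Image.open(path).convert("RGB")
    # Re-compress the image in memory at the chosen JPEG quality.
    buffer = io.BytesIO()
    original.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer)
    # Pixel-wise absolute difference highlights compression inconsistencies.
    diff = ImageChops.difference(original, resaved)
    max_diff = max(max(band) for band in diff.getextrema()) or 1
    # Rescale so the strongest residual maps to 255 for visualization.
    return diff.point(lambda px: px * 255 // max_diff)

# The resulting ELA map would then be fed to the CNN as its input feature.
ela("suspect_face.jpg").save("suspect_face_ela.png")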


2021, Vol 7 (9), pp. 176
Author(s): Daniel Queirós da Silva, Filipe Neves dos Santos, Armando Jorge Sousa, Vítor Filipe

Mobile robotics in forests is currently a hugely important topic due to the recurring occurrence of forest wildfires, which makes on-site management of forest inventory and biomass necessary. To tackle this issue, this work presents a study on ground-level detection of forest tree trunks in visible and thermal images using deep learning-based object detection methods. For this purpose, a forestry dataset composed of 2895 images was built and made publicly available. Using this dataset, five models were trained and benchmarked to detect tree trunks: SSD MobileNetV2, SSD Inception-v2, SSD ResNet50, SSDLite MobileDet and YOLOv4 Tiny. Promising results were obtained; YOLOv4 Tiny was the best model, achieving the highest AP (90%) and F1 score (89%). The inference time of these models was also evaluated on CPU and GPU, and YOLOv4 Tiny was the fastest detector running on GPU (8 ms). This work will support the development of vision perception systems for smarter forestry robots.
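A hedged sketch of how such inference times can be measured for YOLOv4 Tiny with OpenCV's DNN module is given below; the config/weights paths, input resolution and number of timing runs are placeholders, and the trunk-detection weights from the paper are not used.

import time
import cv2

net = cv2.dnn.readNetFromDarknet("yolov4-tiny.cfg", "yolov4-tiny.weights")
# Switch to DNN_BACKEND_CUDA / DNN_TARGET_CUDA to time the GPU path instead.
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

image = cv2.imread("trunk.jpg")
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
out_names = net.getUnconnectedOutLayersNames()

net.setInput(blob)
net.forward(out_names)  # warm-up run

runs = 100
start = time.perf_counter()
for _ in range(runs):
    net.setInput(blob)
    net.forward(out_names)
print(f"mean inference time: {(time.perf_counter() - start) * 1000 / runs:.1f} ms")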


2020, pp. 123-145
Author(s): Sushma Jaiswal, Tarun Jaiswal

In computer vision, object detection is a very important and exciting area of study. Object detection is used in numerous fields such as security surveillance and autonomous driving. Deep learning-based object detection techniques have developed at a very fast pace and have attracted the attention of many researchers. A main focus of the 21st century is the comprehensive and rigorous development of object-detection frameworks. In this study, we first review and evaluate the various object detection approaches and describe the benchmark datasets. We also provide a wide-ranging overview of object detection approaches in an organized way, covering both first- and second-stage detectors. Lastly, we consider the construction of these object detection approaches to suggest directions for further research.

