Applying Data Augmentation and Mask R-CNN-based Instance Segmentation Method for Mixed-Type Wafer Maps Defect Patterns Classification

Author(s): Ming-Chuan Chiu, Tao-Ming Chen
2020, Vol 10 (18), pp. 6502

Author(s): Shinjin Kang, Jong-in Choi

On the game screen, the UI provides key information for gameplay. A vision-based deep learning network exploits only the raw pixel information of the screen; if we additionally extract the information conveyed by the UI and use it as a separate input, we can improve the learning efficiency of such networks. To do this, we need to effectively segment UI components such as buttons, image icons, and gauge bars on the game screen so that only the relevant regions are analyzed. In this paper, we propose a methodology that segments UI components in a game by using synthetic game images created in a game engine. We developed a tool that approximately detects the UI areas on the game screen and used it to generate a large amount of synthetic labeled data. We then trained a Pix2Pix network on this data to perform UI segmentation. The trained network can segment the UI areas of the target game regardless of the position of the corresponding UI components. Our methodology can help analyze the game screen without applying data augmentation to game screenshots, and it can also help vision researchers who need to extract semantic information from game image data.
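As a rough illustration of the training setup described above, the following is a minimal sketch (not the authors' code) of a pix2pix-style generator trained with a supervised reconstruction loss on synthetic (screenshot, UI-mask) pairs. The tiny generator and random tensors are hypothetical placeholders; a full Pix2Pix would add a PatchGAN discriminator and an adversarial loss term.

```python
# Sketch: supervised training of a small image-to-image generator that maps
# game screenshots to UI-component masks. Placeholder sizes and data only.
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    """Very small encoder-decoder standing in for the pix2pix generator."""
    def __init__(self, in_ch=3, out_ch=1):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, out_ch, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

def train_step(gen, optimizer, screenshot, ui_mask):
    """One supervised step on a (synthetic screenshot, UI mask) pair."""
    optimizer.zero_grad()
    pred = gen(screenshot)
    loss = nn.functional.l1_loss(pred, ui_mask)  # pix2pix would add an adversarial term
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    gen = TinyGenerator()
    opt = torch.optim.Adam(gen.parameters(), lr=2e-4)
    # Random tensors stand in for a batch of synthetic training pairs.
    screenshot = torch.rand(4, 3, 128, 128)
    ui_mask = torch.rand(4, 1, 128, 128)
    print(train_step(gen, opt, screenshot, ui_mask))
```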


Sensors, 2021, Vol 21 (1), pp. 275
Author(s): Ruben Panero Martinez, Ionut Schiopu, Bruno Cornelis, Adrian Munteanu

The paper proposes a novel instance segmentation method for traffic videos devised for deployment on real-time embedded devices. A novel neural network architecture is proposed using a multi-resolution feature extraction backbone and improved network designs for the object detection and instance segmentation branches. A novel post-processing method is introduced to reduce the rate of false detections by evaluating the quality of the output masks. An improved network training procedure is proposed based on a novel label assignment algorithm. An ablation study on the speed-vs.-performance trade-off further modifies the two branches and replaces the conventional ResNet-based, performance-oriented backbone with a lightweight, speed-oriented design. The proposed architectural variations achieve real-time performance when deployed on embedded devices. The experimental results demonstrate that the proposed instance segmentation method for traffic videos outperforms the You Only Look At CoefficienTs (YOLACT) algorithm, a state-of-the-art real-time instance segmentation method. The proposed architecture achieves 31.57 average precision on the COCO dataset, while its speed-oriented variations reach up to 66.25 frames per second on the Jetson AGX Xavier module.
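As a hedged illustration of the mask-quality-based post-processing idea (not the paper's exact criterion), the sketch below scores each predicted soft mask by its mean foreground confidence and drops detections that fall below an illustrative threshold:

```python
# Sketch: reject low-quality masks to reduce false detections.
# The quality score here (mean soft-mask probability on its own foreground)
# is an assumed stand-in for the paper's evaluation of mask quality.
import torch

def filter_by_mask_quality(soft_masks, scores, quality_thresh=0.6):
    """soft_masks: (N, H, W) probabilities; scores: (N,) detection confidences."""
    keep = []
    for i, m in enumerate(soft_masks):
        fg = m > 0.5
        if fg.sum() == 0:
            continue
        quality = m[fg].mean()  # how confident the mask is on its own foreground
        if quality >= quality_thresh:
            keep.append(i)
    idx = torch.tensor(keep, dtype=torch.long)
    return soft_masks[idx], scores[idx]

masks = torch.rand(5, 64, 64)
scores = torch.rand(5)
kept_masks, kept_scores = filter_by_mask_quality(masks, scores)
print(kept_masks.shape, kept_scores.shape)
```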


2021, Vol 11 (3), pp. 968
Author(s): Yingchun Sun, Wang Gao, Shuguo Pan, Tao Zhao, Yahui Peng

Recently, multi-level feature networks have been extensively used in instance segmentation. However, because not all features are beneficial to instance segmentation tasks, the performance of networks cannot be adequately improved by synthesizing multi-level convolutional features indiscriminately. To solve this problem, an attention-based feature pyramid module (AFPM) is proposed, which integrates an attention mechanism into a multi-level feature pyramid network to efficiently and selectively extract high-level semantic features and low-level spatial structure features for instance segmentation. Firstly, we adopt a convolutional block attention module (CBAM) in feature extraction and sequentially generate attention maps that focus on instance-related features along the channel and spatial dimensions. Secondly, we build inter-dimensional dependencies through a convolutional triplet attention module (CTAM) in lateral attention connections, which propagates helpful semantic feature maps and filters out redundant features irrelevant to instance objects. Finally, we construct feature enhancement branches that strengthen detailed information to boost the entire feature hierarchy of the network. The experimental results on the Cityscapes dataset show that the proposed module outperforms other strong methods under different evaluation metrics and effectively improves the performance of the instance segmentation method.
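For reference, the following is a minimal sketch of a CBAM-style block (channel attention followed by spatial attention), the building block named above; layer sizes and the reduction ratio are illustrative rather than the authors' configuration.

```python
# Sketch: channel attention (shared MLP over avg- and max-pooled descriptors)
# followed by spatial attention (conv over channel-pooled maps), as in CBAM.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * w

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w

class CBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))

feat = torch.rand(2, 256, 32, 32)
print(CBAM(256)(feat).shape)  # same shape, attention-refined features
```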


Sensors, 2021, Vol 21 (23), pp. 7966
Author(s): Dixiao Wei, Qiongshui Wu, Xianpei Wang, Meng Tian, Bowen Li

Radiography is an essential basis for the diagnosis of fractures. For pediatric elbow joint diagnosis, the doctor needs to identify abnormalities based on the location and shape of each bone, which is a great challenge for AI algorithms when interpreting radiographs. Bone instance segmentation is an effective upstream task for automatic radiograph interpretation. Pediatric elbow bone instance segmentation is the process by which each bone is extracted separately from a radiograph. However, the arbitrary orientations and the overlapping of bones pose issues for bone instance segmentation. In this paper, we design a detection-segmentation pipeline to tackle these problems by using rotational bounding boxes to detect bones and by proposing a robust segmentation method. The proposed pipeline mainly contains three parts: (i) we use a Faster R-CNN-style architecture to detect and locate bones; (ii) we adopt the Oriented Bounding Box (OBB) to improve localization accuracy; (iii) we design the Global-Local Fusion Segmentation Network to combine the global and local contexts of overlapping bones. To verify the effectiveness of our proposal, we conduct experiments on our self-constructed dataset, which contains 1274 well-annotated pediatric elbow radiographs. The qualitative and quantitative results indicate that the network significantly improves the performance of bone extraction. Our methodology shows good potential for applying deep learning to bone instance segmentation in radiographs.
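Purely as a hedged sketch of the global-local fusion idea (the paper's actual Global-Local Fusion Segmentation Network design is not reproduced here), the example below fuses a local ROI feature map with a globally pooled context vector by concatenation before predicting a mask; all layer names and sizes are hypothetical.

```python
# Sketch: concatenation-based fusion of a local ROI feature map with an
# image-level context vector, followed by a per-pixel mask prediction.
import torch
import torch.nn as nn

class GlobalLocalFusionHead(nn.Module):
    def __init__(self, local_ch=256, global_ch=256):
        super().__init__()
        self.fuse = nn.Conv2d(local_ch + global_ch, 256, 3, padding=1)
        self.mask = nn.Conv2d(256, 1, 1)

    def forward(self, local_feat, global_feat):
        # local_feat: (N, C, h, w) ROI features; global_feat: (N, C) image-level context
        g = global_feat[:, :, None, None].expand(-1, -1, *local_feat.shape[2:])
        x = torch.relu(self.fuse(torch.cat([local_feat, g], dim=1)))
        return torch.sigmoid(self.mask(x))  # per-pixel bone probability inside the ROI

local = torch.rand(4, 256, 14, 14)
glob = torch.rand(4, 256)
print(GlobalLocalFusionHead()(local, glob).shape)  # (4, 1, 14, 14)
```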


2020, Vol 2020, pp. 1-10
Author(s): Xiao Ke, Pengqiang Du

Automatic identification of vehicles is an important topic in the field of Intelligent Transportation Systems (ITS), and the vehicle logo is one of the most important characteristics of a vehicle. Therefore, vehicle logo detection and recognition are important research topics. Because the area of a vehicle logo is often too small to be detected and the available datasets are too small for training on complex scenes, and considering both recognition speed and robustness to complex scenes, we use deep learning methods based on data optimization for vehicle logos in complex scenes. We propose three augmentation strategies for vehicle logo data: the cross-sliding segmentation method, the small frame method, and the Gaussian distribution segmentation method. For the problem of small sample size, we use the cross-sliding segmentation method, which can effectively increase the amount of data without changing the aspect ratio of the original vehicle logo image. To expand the area of the logos in the images, we develop the small frame method, which improves the detection results for small-area vehicle logos. To enrich the positional diversity of vehicle logos in the images, we propose the Gaussian distribution segmentation method, and the results show that this method is very effective. The F1 value of our method in the YOLO framework is 0.7765, and the precision is greatly improved to 0.9295. In the Faster R-CNN framework, the F1 value of our method is 0.7799, also an improvement over the baseline. The experimental results show that the above optimization methods represent the features of vehicle logos better than the traditional method.
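To illustrate how a Gaussian-position augmentation of this kind can be implemented (my interpretation, not the authors' code), the sketch below pastes a cropped logo patch into a background image at a position sampled from a Gaussian around the image centre and returns the corresponding bounding box; the sigma fraction is an arbitrary choice.

```python
# Sketch: paste a logo crop at a Gaussian-sampled position to enrich
# positional diversity of training data. Parameters are illustrative.
import numpy as np

def gaussian_paste(background, logo, sigma_frac=0.25, rng=None):
    """background: (H, W, 3) uint8; logo: (h, w, 3) uint8. Returns image and box."""
    rng = rng or np.random.default_rng()
    H, W, _ = background.shape
    h, w, _ = logo.shape
    # Sample the patch centre from a Gaussian around the image centre, then clamp.
    cy = int(np.clip(rng.normal(H / 2, sigma_frac * H), h / 2, H - h / 2))
    cx = int(np.clip(rng.normal(W / 2, sigma_frac * W), w / 2, W - w / 2))
    y0, x0 = cy - h // 2, cx - w // 2
    out = background.copy()
    out[y0:y0 + h, x0:x0 + w] = logo
    return out, (x0, y0, x0 + w, y0 + h)  # augmented image and pasted bounding box

bg = np.zeros((480, 640, 3), dtype=np.uint8)
logo = np.full((40, 60, 3), 255, dtype=np.uint8)
aug, box = gaussian_paste(bg, logo)
print(aug.shape, box)
```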


Sensors, 2021, Vol 21 (9), pp. 3251
Author(s): Shuqin Tu, Weijun Yuan, Yun Liang, Fan Wang, Hua Wan

Instance segmentation is an accurate and reliable method for segmenting images of adhesive (touching) pigs, and is critical for providing health and welfare information on individual pigs, such as body condition score, live weight, and activity behaviors in group-housed environments. In this paper, a PigMS R-CNN framework based on mask scoring R-CNN (MS R-CNN) is explored to segment adhesive pig areas in group-pig images and to identify and locate individual group-housed pigs. The PigMS R-CNN consists of three processes. First, a 101-layer residual network combined with a feature pyramid network (FPN) is used as the feature extraction network to obtain feature maps for input images. Then, based on these feature maps, the region proposal network generates regions of interest (RoIs). Finally, for each RoI, we obtain the location, classification, and segmentation results of the detected pigs through the regression, classification, and mask branches of the PigMS R-CNN head network. To avoid missed targets and erroneous detections in overlapping or adhering areas of group-housed pigs, the PigMS R-CNN framework replaces traditional NMS with soft non-maximum suppression (soft-NMS) in the post-processing selection of detected pigs. The MS R-CNN framework with traditional NMS obtains results with an F1 of 0.9228. By setting the soft-NMS threshold to 0.7, PigMS R-CNN detects the target pigs with an F1 of 0.9374. This work explores a new instance segmentation method for images of adhesive group-housed pigs, providing a valuable basis for vision-based, real-time automatic pig monitoring and welfare evaluation.
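For reference, the following is a minimal sketch of Gaussian soft-NMS over axis-aligned boxes, the post-processing choice described above; the sigma and score threshold are illustrative, not the values used in the paper.

```python
# Sketch: soft-NMS decays the scores of overlapping detections instead of
# discarding them outright, which helps keep touching/overlapping targets.
import numpy as np

def iou(box, boxes):
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    boxes, scores = boxes.astype(float).copy(), scores.astype(float).copy()
    keep = []
    while len(boxes) > 0:
        i = scores.argmax()
        keep.append((boxes[i], scores[i]))
        ious = iou(boxes[i], boxes)
        # Gaussian penalty: higher overlap -> stronger score decay.
        scores = scores * np.exp(-(ious ** 2) / sigma)
        mask = (scores > score_thresh) & (np.arange(len(boxes)) != i)
        boxes, scores = boxes[mask], scores[mask]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(len(soft_nms(boxes, scores)))
```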


Sensors, 2021, Vol 21 (19), pp. 6565
Author(s): Usman Afzaal, Bhuwan Bhattarai, Yagya Raj Pandeya, Joonwhoan Lee

Plant diseases must be identified at the earliest stage to pursue appropriate treatment procedures and reduce economic and quality losses. There is an indispensable need for low-cost and highly accurate approaches for diagnosing plant diseases. Deep neural networks have achieved state-of-the-art performance in numerous areas of human life, including the agriculture sector. The current state of the literature indicates that only a limited number of datasets are available for autonomous strawberry disease and pest detection that allow fine-grained instance segmentation. To this end, we introduce a novel dataset comprising 2500 images of seven kinds of strawberry diseases, which enables the development of deep learning-based autonomous detection systems that segment strawberry diseases under complex background conditions. As a baseline for future work, we propose a model based on the Mask R-CNN architecture that effectively performs instance segmentation for these seven diseases. We use a ResNet backbone together with a systematic data augmentation approach that allows segmentation of the target diseases under complex environmental conditions, achieving a final mean average precision of 82.43%.
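As a hedged baseline sketch (assuming torchvision's off-the-shelf Mask R-CNN with a ResNet-50-FPN backbone as a stand-in for the paper's model), the example below sets num_classes to 8 for the seven disease classes plus background and runs one training forward pass on a dummy annotated sample; in practice the images and targets would come from the annotated dataset, with augmentation applied to both images and masks.

```python
# Sketch: instance-segmentation baseline with torchvision's Mask R-CNN.
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(num_classes=8)  # 7 strawberry diseases + background
model.train()

# Dummy training sample; real images/targets come from the annotated dataset,
# with data augmentation (e.g., flips) applied consistently to boxes and masks.
image = torch.rand(3, 512, 512)
target = {
    "boxes": torch.tensor([[50.0, 60.0, 200.0, 220.0]]),
    "labels": torch.tensor([1]),
    "masks": torch.zeros(1, 512, 512, dtype=torch.uint8),
}
target["masks"][0, 60:220, 50:200] = 1

loss_dict = model([image], [target])  # dict of classification/box/mask losses
print(sum(loss_dict.values()).item())
```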


Author(s): Hui Ying, Zhaojin Huang, Shu Liu, Tianjia Shao, Kun Zhou

Current instance segmentation methods can be categorized into segmentation-based methods and proposal-based methods. The former performs segmentation first and then clusters pixels, while the latter detects objects first and then predicts a mask for each object proposal. In this work, we propose a single-stage method, named EmbedMask, that unifies both approaches by combining their advantages, so it can achieve good performance in instance segmentation and produce high-resolution masks at high speed. EmbedMask introduces two newly defined embeddings for mask prediction: pixel embeddings and proposal embeddings. During training, we enforce each pixel embedding to be close to its coupled proposal embedding if they belong to the same instance. During inference, pixels are assigned to the mask of a proposal if their embeddings are similar. This mechanism brings several benefits. First, pixel-level clustering enables EmbedMask to generate high-resolution masks and avoids the complicated two-stage mask prediction. Second, the proposal embeddings simplify and strengthen the clustering procedure, so our method achieves higher speed and better performance than segmentation-based methods. Without bells and whistles, EmbedMask outperforms the state-of-the-art instance segmentation method Mask R-CNN on the challenging COCO dataset, obtaining more detailed masks at a higher speed.
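As a minimal sketch of the inference-time assignment rule described above (the embedding dimension and distance margin are illustrative, not the paper's settings), each pixel is given to the proposal whose embedding is nearest to its own, provided the distance clears a threshold:

```python
# Sketch: assign pixels to proposals by embedding similarity.
import torch

def assign_pixels_to_proposals(pixel_emb, proposal_emb, margin=1.0):
    """pixel_emb: (D, H, W); proposal_emb: (K, D). Returns (K, H, W) binary masks."""
    D, H, W = pixel_emb.shape
    pix = pixel_emb.view(D, -1).t()                     # (H*W, D)
    dist = torch.cdist(pix, proposal_emb)               # (H*W, K) pairwise distances
    nearest = dist.argmin(dim=1)                        # closest proposal per pixel
    inside = dist.gather(1, nearest[:, None]).squeeze(1) < margin
    masks = torch.zeros(proposal_emb.shape[0], H * W, dtype=torch.bool)
    masks[nearest[inside], torch.arange(H * W)[inside]] = True
    return masks.view(-1, H, W)

pixels = torch.rand(16, 64, 64)
proposals = torch.rand(5, 16)
print(assign_pixels_to_proposals(pixels, proposals).shape)  # (5, 64, 64)
```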


2021
Author(s): Vivian Wen Hui Wong, Max Ferguson, Kincho H. Law, Yung-Tsun Tina Lee, Paul Witherell

Additive manufacturing (AM) provides design flexibility and allows rapid fabrication of parts with complex geometries. The presence of internal defects, however, can degrade the performance of the fabricated part. X-ray Computed Tomography (XCT) is a non-destructive inspection technique often used for AM parts. Although defects within AM specimens can be identified and segmented by manually thresholding the XCT images, the process can be tedious and inefficient, and the segmentation results can be ambiguous. The variation in the shapes and appearances of defects also makes it difficult to segment them accurately. This paper describes an automatic defect segmentation method using U-Net-based deep convolutional neural network (CNN) architectures. Several U-Net variants are trained and validated on an AM XCT image dataset containing pores and cracks, achieving a best mean intersection over union (IoU) of 0.993. The performance of the various U-Net models is compared and analyzed, and several techniques in data augmentation and model development specific to AM porosity segmentation with XCT images are introduced. This work demonstrates that, using XCT images, U-Net can be effectively applied to automatic segmentation of AM porosity with high accuracy. The method can potentially help improve quality control of AM parts in industrial settings.
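To make the architecture concrete, here is a minimal sketch of a U-Net-style binary segmenter with a single downsampling stage, a single upsampling stage, and one skip connection; channel counts are illustrative and far shallower than the variants evaluated in the paper.

```python
# Sketch: tiny U-Net producing a per-pixel pore probability for XCT slices.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(),
    )

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(1, 16)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = conv_block(32, 16)   # 16 (skip) + 16 (upsampled)
        self.out = nn.Conv2d(16, 1, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        b = self.bottleneck(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(b), e1], dim=1))  # skip connection
        return torch.sigmoid(self.out(d1))

xct_slice = torch.rand(2, 1, 128, 128)  # grayscale XCT slices
print(TinyUNet()(xct_slice).shape)      # (2, 1, 128, 128)
```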


Water, 2020, Vol 12 (8), pp. 2258
Author(s): Zhihao Wei, Kebin Jia, Xiaowei Jia, Ankush Khandelwal, Vipin Kumar

Global river monitoring is an important mission within the remote sensing community. One of the main challenges faced by this mission is generating an accurate water mask from remote sensing images (RSI) of rivers (RSIR), especially on a global scale with varied river features. Aiming at better water area classification using semantic information, this paper presents a segmentation method for global river monitoring based on semantic clustering and semantic fusion. Firstly, an encoder-decoder network (AEN)-based architecture is proposed to obtain semantic features from RSIR. Secondly, a clustering-based semantic fusion method is proposed to divide the semantic features of RSIR into groups and to train convolutional neural network (CNN) models corresponding to each group using data augmentation and semi-supervised learning. Thirdly, a semantic distance-based segmentation fusion method is proposed for fusing the CNN models' results into the final segmentation mask. We built a global river dataset that contains multiple river segments from each continent of the world based on Sentinel-2 satellite imagery. The results show that the F1-score of the proposed segmentation method is 93.32%, which outperforms several state-of-the-art algorithms and demonstrates that grouping semantic information helps to better segment RSIR at a global scale.
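As a rough, hedged sketch of the clustering-and-routing idea (not the paper's AEN or CNN models), the example below groups semantic feature vectors with k-means and, at inference, routes an image to the segmentation model of its nearest cluster; the feature dimensions, cluster count, and placeholder models are all hypothetical.

```python
# Sketch: cluster semantic features, then route each image to the model
# trained for its nearest semantic group.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
train_features = rng.normal(size=(200, 64))   # semantic features of training river images
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(train_features)

# One hypothetical segmentation model per cluster; here a placeholder callable.
cluster_models = {k: (lambda img, k=k: f"mask from model {k}") for k in range(4)}

def segment(image_features, image):
    cluster = int(kmeans.predict(image_features[None])[0])  # nearest semantic group
    return cluster_models[cluster](image)

print(segment(rng.normal(size=64), image=None))
```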

