PLRSNet: a semantic segmentation network for segmenting plant leaf region under complex background

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Srinivas Talasila ◽  
Kirti Rawal ◽  
Gaurav Sethi

Purpose – Extraction of the leaf region from plant leaf images is a prerequisite for species recognition, disease detection and classification, and other tasks required for crop management. Several approaches have been developed to segment the leaf region from the background. However, most were applied to images taken under laboratory setups or against a plain background, whereas leaf segmentation methods are most valuable when applied to real cultivation-field images containing complex backgrounds. So far, no efficient method has been developed that automatically segments the leaf region from a complex background specifically for black gram plant leaf images.

Design/methodology/approach – Extracting leaf regions from a complex background is cumbersome, and the proposed PLRSNet (Plant Leaf Region Segmentation Net) is one solution to this problem. In this paper, a customized deep network is designed and applied to extract leaf regions from images taken in cultivation fields.

Findings – The proposed PLRSNet is compared with state-of-the-art methods, and the experimental results show that it yields a Similarity Index/Dice of 96.9%, a Jaccard/IoU of 94.2%, a Correct Detection Ratio of 98.55%, a Total Segmentation Error of 0.059 and an Average Surface Distance of 3.037, representing a significant improvement over existing methods, particularly on cultivation-field images.

Originality/value – In this work, a customized deep learning network, named PLRSNet, is designed for segmenting the plant leaf region under a complex background.
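The Similarity Index/Dice and Jaccard/IoU figures reported above can be reproduced from binary masks in a few lines; a minimal sketch (not the authors' code), with hypothetical toy masks:

```python
import numpy as np

def dice_and_iou(pred, gt):
    """Compute Dice coefficient and Jaccard index (IoU) for binary masks."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())
    iou = inter / union
    return dice, iou

# toy predicted and ground-truth leaf masks
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
d, j = dice_and_iou(pred, gt)  # inter=2, union=4 -> dice=4/6, iou=0.5
```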

2020 ◽  
pp. 110208
Author(s):  
Chenglong Feng ◽  
Lizhen Wang ◽  
Peng Xu ◽  
Zhaowei Chu ◽  
Jie Yao ◽  
...  

2021 ◽  
Vol 13 (13) ◽  
pp. 2524
Author(s):  
Ziyi Chen ◽  
Dilong Li ◽  
Wentao Fan ◽  
Haiyan Guan ◽  
Cheng Wang ◽  
...  

Deep learning models have brought great breakthroughs in building extraction from high-resolution optical remote-sensing images. In recent research, the self-attention module has attracted wide interest across many fields, including building extraction. However, most current deep learning models that incorporate self-attention still overlook the effectiveness of reconstruction bias. By tipping the balance between the encoding and decoding abilities, i.e., making the decoding network considerably more complex than the encoding network, the semantic segmentation ability is reinforced. To remedy the lack of research combining self-attention and reconstruction-bias modules for building extraction, this paper presents a U-Net architecture that combines the two. In the encoding part, a self-attention module is added to learn attention weights over the inputs, so that the network pays more attention to positions likely to contain salient regions. In the decoding part, multiple large convolutional up-sampling operations are used to increase the reconstruction ability. We test our model on two openly available datasets, the WHU and Massachusetts Building datasets, achieving IoU scores of 89.39% and 73.49%, respectively. Compared with several recent well-known semantic segmentation methods and representative building extraction methods, our method achieves satisfactory results.
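The self-attention weighting described above can be sketched in its standard scaled dot-product form over flattened feature positions; this is a generic numpy illustration, not the paper's implementation (the query/key/value projections are assumed to be identity for brevity):

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over feature positions.

    x: (n_positions, d) feature matrix. Returns the attended features
    and the (n_positions, n_positions) attention-weight matrix.
    """
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)                 # pairwise affinities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # row-wise softmax
    return weights @ x, weights

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))   # 6 positions, 4-dim features
out, w = self_attention(feats)
```

Each output position is a weighted mixture of all positions, which is how salient regions can influence the whole feature map.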


Author(s):  
R. B. Andrade ◽  
G. A. O. P. Costa ◽  
G. L. A. Mota ◽  
M. X. Ortega ◽  
R. Q. Feitosa ◽  
...  

Abstract. Deforestation is a wide-reaching problem, responsible for serious environmental issues such as biodiversity loss and global climate change. Containing approximately ten percent of all biomass on the planet and home to one tenth of the known species, the Amazon biome has faced significant deforestation pressure in recent decades. Devising efficient deforestation detection methods is therefore key to combating illegal deforestation and to aiding the conception of public policies that promote sustainable development in the Amazon. In this work, we implement and evaluate a deforestation detection approach based on a fully convolutional deep learning (DL) model, DeepLabv3+. We compare the results obtained with this approach to those of previously proposed DL-based methods (Early Fusion and Siamese Convolutional Network) using Landsat OLI-8 images acquired at different dates, covering a region of the Amazon forest. To evaluate the sensitivity of the methods to the amount of training data, we also evaluate them using varying training sample set sizes. The results show that all tested variants of the proposed method significantly outperform the other DL-based methods in terms of overall accuracy and F1-score. The gains in performance were even more substantial when limited amounts of samples were used to train the evaluated methods.
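Of the compared baselines, Early Fusion feeds both acquisition dates to a single network by stacking the co-registered images along the channel axis; a minimal sketch (band count and shapes are illustrative, not taken from the paper):

```python
import numpy as np

def early_fusion(img_t1, img_t2):
    """Stack two co-registered images (H, W, C) along channels -> (H, W, 2C)."""
    assert img_t1.shape == img_t2.shape, "images must be co-registered"
    return np.concatenate([img_t1, img_t2], axis=-1)

t1 = np.zeros((64, 64, 7))   # e.g. 7 spectral bands, date 1
t2 = np.ones((64, 64, 7))    # same bands, date 2
fused = early_fusion(t1, t2)
```

The Siamese baseline instead processes each date through weight-shared branches and compares the resulting feature maps.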


2018 ◽  
pp. 1955-1967
Author(s):  
Haifeng Zhao ◽  
Jiangtao Wang ◽  
Wankou Yang

This chapter presents a graph-based approach to automatically categorizing plant and insect species. In this approach, the plant leaf and insect objects are segmented from the background semi-automatically. For each object, the contour is then extracted, and the contour points form the vertices of a graph. We propose a vectorization method that recovers clique histogram vectors from the graphs for classification. The clique histogram represents the distribution of one vertex with respect to its adjacent vertices. This treatment permits the use of a codebook approach to represent the graph as a set of codewords that can be used for support vector machine classification. The experimental results show that the method is not only effective but also robust, and comparable with other methods in the literature for species recognition.
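The clique histogram of a vertex, i.e., the distribution of codeword labels among its adjacent vertices, can be sketched as follows; the toy adjacency matrix and labels here are hypothetical, not from the chapter:

```python
import numpy as np

def clique_histograms(adj, labels, n_codewords):
    """For each vertex, histogram the codeword labels of its adjacent vertices.

    adj: (n, n) boolean adjacency matrix; labels: (n,) int codeword per vertex.
    Returns an (n, n_codewords) array of clique-histogram vectors.
    """
    n = adj.shape[0]
    hists = np.zeros((n, n_codewords))
    for v in range(n):
        for u in np.flatnonzero(adj[v]):
            hists[v, labels[u]] += 1
    return hists

# toy contour graph: 4 vertices in a cycle, labels from a 3-word codebook
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=bool)
labels = np.array([0, 1, 2, 1])
H = clique_histograms(adj, labels, 3)
```

Each row of `H` is then mapped to a codeword, and the bag of codewords feeds the SVM classifier.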


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2501 ◽  
Author(s):  
Yanan Song ◽  
Liang Gao ◽  
Xinyu Li ◽  
Weiming Shen

Deep learning is robust to perturbations of a point cloud, an important data form in the Internet of Things. However, it cannot effectively capture the local information of the point cloud or recognize the fine-grained features of an object. Different levels of features in a deep learning network can be integrated to obtain local information, but this strategy increases network complexity. This paper proposes an effective point cloud encoding method that helps the deep learning network utilize local information. An axis-aligned cube is used to search for a local region that represents the local information. All of the points in the local region are used to construct the feature representation of each point. These feature representations are then input to a deep learning network. Two well-known datasets, the ModelNet40 shape classification benchmark and the Stanford 3D Indoor Semantics Dataset, are used to test the performance of the proposed method. Compared with other methods with complicated structures, the proposed method, with only a simple deep learning network, achieves higher accuracy in 3D object classification and semantic segmentation.
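The axis-aligned cube search amounts to selecting the points within a Chebyshev-distance (max-coordinate) threshold of the query point; a minimal sketch with hypothetical coordinates and cube size:

```python
import numpy as np

def cube_neighbors(points, center, half_size):
    """Indices of points inside an axis-aligned cube around `center`.

    Chebyshev distance <= half_size defines the cube of side 2*half_size.
    """
    d = np.abs(points - center).max(axis=1)
    return np.flatnonzero(d <= half_size)

pts = np.array([[0.0, 0.0, 0.0],
                [0.2, 0.1, -0.1],
                [1.0, 0.0, 0.0],
                [0.3, 0.3, 0.3]])
idx = cube_neighbors(pts, pts[0], 0.35)   # local region of the first point
```

The points gathered this way would then be aggregated into the per-point feature representation fed to the network.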


2009 ◽  
pp. 150-171 ◽  
Author(s):  
Shilin Wang ◽  
Alan Wee-Chung Liew ◽  
Wing Hong Lau ◽  
Shu Hung Leung

As the first step of many visual speech recognition and visual speaker authentication systems, robust and accurate lip region segmentation is of vital importance for lip image analysis. However, most current techniques break down when dealing with lip images with complex and inhomogeneous background regions such as mustaches and beards. To solve this problem, a Multi-class, Shape-guided FCM (MS-FCM) clustering algorithm is proposed in this chapter. In the proposed approach, one cluster is set for the lip region and a combination of multiple clusters for the background, which generally includes the skin region, lip shadow or beards. Based on the spatial distribution of the lip cluster, a spatial penalty term incorporating location information is introduced into the objective function, so that pixels with similar color but located in different regions can be differentiated. Experimental results show that the proposed algorithm provides an accurate lip-background partition even for images with complex background features.
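The mechanism of adding a spatial penalty to a fuzzy c-means objective can be illustrated with a single membership update; the additive penalty used here is a placeholder to show the idea, not the exact MS-FCM formulation:

```python
import numpy as np

def fcm_memberships(X, centers, spatial_penalty, alpha=1.0, m=2.0):
    """One fuzzy c-means membership update with an additive spatial penalty.

    X: (n, d) pixel features; centers: (c, d) cluster centers;
    spatial_penalty: (c, n), large where pixel k is spatially unlikely
    to belong to cluster i. Returns (c, n) memberships summing to 1
    per pixel.
    """
    d2 = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(-1)  # (c, n)
    eff = d2 + alpha * spatial_penalty + 1e-12   # epsilon avoids div by zero
    inv = eff ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

# toy 1-D "pixels" and two clusters, no spatial penalty for illustration
X = np.array([[0.0], [0.5], [1.0]])
centers = np.array([[0.0], [1.0]])
U = fcm_memberships(X, centers, np.zeros((2, 3)))
```

With a nonzero penalty, a pixel with lip-like color but lying far from the lip cluster's spatial distribution would see its lip membership suppressed.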


2019 ◽  
Vol 15 (3) ◽  
pp. 346-358
Author(s):  
Luciano Barbosa

Purpose – Matching instances of the same entity, a task known as entity resolution, is a key step in the process of data integration. This paper proposes a deep learning network that learns different representations of Web entities for entity resolution.

Design/methodology/approach – To match Web entities, the proposed network learns the following representations: embeddings, which are vector representations of the words in the entities in a low-dimensional space; convolutional vectors from a convolutional layer, which capture short-distance patterns in word sequences; and bag-of-words vectors, created by a bow layer that learns weights for words in the vocabulary based on the task at hand. Given a pair of entities, the similarity between their learned representations is used as a feature for a binary classifier that identifies a possible match. In addition, the classifier uses a modification of inverse document frequency for pairs, which identifies discriminative words in pairs of entities.

Findings – The proposed approach was evaluated on two commercial and two academic entity resolution benchmark data sets. The results show that the proposed strategy outperforms previous approaches on the commercial data sets, which are more challenging, and achieves results similar to its competitors on the academic data sets.

Originality/value – No previous work has used a single deep learning framework to learn different representations of Web entities for entity resolution.
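The intuition behind an IDF-based pair feature, i.e., that rare words shared by both entities are strong match evidence, can be sketched as follows; the paper's actual modification of IDF for pairs differs, and the scoring function here is purely illustrative:

```python
import math

def idf(corpus):
    """Standard inverse document frequency over a list of token lists."""
    n = len(corpus)
    df = {}
    for doc in corpus:
        for w in set(doc):
            df[w] = df.get(w, 0) + 1
    return {w: math.log(n / c) for w, c in df.items()}

def pair_overlap_score(a, b, idf_map):
    """Sum the IDF of words shared by the two entities: rare shared
    words contribute more (a rough stand-in for a pair-IDF feature)."""
    return sum(idf_map.get(w, 0.0) for w in set(a) & set(b))

corpus = [["apple", "iphone", "12"],
          ["apple", "ipad"],
          ["samsung", "galaxy", "12"]]
weights = idf(corpus)
score = pair_overlap_score(corpus[0], corpus[1], weights)
```

Such a score could be appended to the representation-similarity features before the binary match classifier.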


Author(s):  
Mathieu Dubois ◽  
Paola K. Rozo ◽  
Alexander Gepperth ◽  
O. Fabio A. Gonzalez ◽  
David Filliat

2020 ◽  
Vol 34 (07) ◽  
pp. 11717-11724
Author(s):  
Wenfeng Luo ◽  
Meng Yang

Current weakly-supervised semantic segmentation methods often estimate initial supervision from class activation maps (CAM), which produce sparse discriminative object seeds and rely on image saliency to provide background cues when only class labels are available. To eliminate the demand for extra data to train a saliency detector, we propose to discover the class pattern inherent in lower-layer convolution features, which were scarcely explored in previous CAM methods. Specifically, we first project the convolution features into a low-dimensional space and then determine a decision boundary to generate class-agnostic maps for each semantic category present in the image. Features from lower layers are more generic and are thus capable of generating proxy ground truth with more accurate and complete objects. Experiments on the PASCAL VOC 2012 dataset show that the proposed saliency-free method outperforms previous approaches under the same weakly-supervised setting and achieves superior segmentation results: 64.5% mIoU on the validation set and 64.6% on the test set.
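The projection-plus-decision-boundary step can be illustrated with a first-principal-component projection and a simple threshold; a simplified stand-in for the paper's class-agnostic map generation, with synthetic features in place of real convolution activations:

```python
import numpy as np

def project_and_split(features, threshold=0.0):
    """Project features onto their first principal component and split
    positions at `threshold` (a toy decision boundary).

    features: (n_positions, d). Returns a boolean class-agnostic mask.
    """
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[0]       # 1-D projection per spatial position
    return proj > threshold

rng = np.random.default_rng(1)
fg = rng.normal(3.0, 0.1, size=(10, 8))   # toy "object" features
bg = rng.normal(0.0, 0.1, size=(10, 8))   # toy "background" features
mask = project_and_split(np.vstack([fg, bg]))
```

The sign of the principal component is arbitrary, so the mask separates the two groups but either side may be marked True.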

