Fish Segmentation in Sonar Images by Mask R-CNN on Feature Maps of Conditional Random Fields

Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7625
Author(s):  
Chin-Chun Chang ◽  
Yen-Po Wang ◽  
Shyi-Chyi Cheng

Imaging sonar systems are widely used for monitoring fish behavior in turbid or low-ambient-light waters. Analyzing fish behavior in sonar images often requires fish segmentation. In this paper, Mask R-CNN is adopted for segmenting fish in sonar images. Sonar images acquired from different shallow waters can differ considerably in the contrast between fish and the background. That difference can make a Mask R-CNN trained on examples collected from one fish farm ineffective for fish segmentation at other fish farms. A preprocessing convolutional neural network (PreCNN) is therefore proposed to provide “standardized” feature maps for Mask R-CNN and to ease applying a Mask R-CNN trained for one fish farm to others. PreCNN aims at decoupling the learning of fish instances from the learning of fish-culture environments. PreCNN is a semantic segmentation network integrated with conditional random fields. It can utilize successive sonar images and can be trained by semi-supervised learning to make use of unlabeled information. Experimental results show that Mask R-CNN on the output of PreCNN is more accurate than Mask R-CNN applied directly to sonar images. Applying Mask R-CNN plus PreCNN trained for one fish farm to new fish farms is also more effective.

2020 ◽  
Vol 10 (5) ◽  
pp. 1679
Author(s):  
Xinying Xu ◽  
Yujing Xue ◽  
Xiaoxia Han ◽  
Zhe Zhang ◽  
Jun Xie ◽  
...  

Image semantic segmentation (ISS) segments an image into regions labeled with different semantic categories. Most existing ISS methods are based on fully supervised learning, which requires pixel-level labeling for training the model. Such labeling is often very time-consuming and labor-intensive, yet still subject to manual errors and subjective inconsistency. To tackle these difficulties, a weakly supervised ISS approach is proposed that specifically addresses the challenging problem of label inference from image level to pixel level, using image patches and conditional random fields (CRF). An improved simple linear iterative clustering (SLIC) algorithm is employed to extract superpixels for image segmentation. Specifically, it generates a different number of superpixels for each image, and these superpixels guide the extraction of image patches based on the image-level label information. Based on the extracted image patches, a CRF model is constructed for inferring semantic class labels; it uses the potential energy function to map from image-level to pixel-level labels. Finally, a patch-based CRF (PBCRF) model is used to accomplish the weakly supervised ISS. Experiments conducted on two publicly available benchmark datasets, MSRC and PASCAL VOC 2012, demonstrate that the proposed algorithm yields very promising results compared with quite a few state-of-the-art ISS methods, including some deep learning-based models.
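The abstract does not spell out the CRF potentials, so the following is only a generic illustration of how a CRF energy can drive label inference, not the PBCRF model itself. It assumes a small grid of sites (pixels or patches), a hypothetical table of per-site unary costs, a Potts pairwise penalty `beta` for disagreeing 4-connected neighbours, and iterated conditional modes (ICM) as a simple swapped-in inference method.

```python
from typing import Iterator, List, Tuple

Labels = List[List[int]]
Unary = List[List[List[float]]]  # unary[r][c][k] = cost of label k at site (r, c)


def neighbors(r: int, c: int, rows: int, cols: int) -> Iterator[Tuple[int, int]]:
    """Yield the 4-connected neighbours of (r, c) that lie inside the grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            yield rr, cc


def energy(labels: Labels, unary: Unary, beta: float) -> float:
    """Sum of unary costs plus beta for every neighbouring pair that disagrees."""
    rows, cols = len(labels), len(labels[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            total += unary[r][c][labels[r][c]]
            # Visit each edge once by looking only right and down.
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                total += beta
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                total += beta
    return total


def icm(unary: Unary, beta: float, sweeps: int = 5) -> Labels:
    """Greedy inference: start from the cheapest unary label, then repeatedly
    give each site the label with the lowest local (unary + pairwise) cost."""
    rows, cols = len(unary), len(unary[0])
    n_labels = len(unary[0][0])
    labels = [[min(range(n_labels), key=lambda k: unary[r][c][k])
               for c in range(cols)] for r in range(rows)]
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                def local_cost(k: int) -> float:
                    disagree = sum(labels[rr][cc] != k
                                   for rr, cc in neighbors(r, c, rows, cols))
                    return unary[r][c][k] + beta * disagree
                best = min(range(n_labels), key=local_cost)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels


# Hypothetical 3x3 example: every site clearly prefers label 0, except the
# centre, which weakly prefers label 1 and is overruled by its neighbours.
example_unary = [[[0.0, 1.0] for _ in range(3)] for _ in range(3)]
example_unary[1][1] = [0.6, 0.4]
print(icm(example_unary, beta=0.5))
```

ICM only finds a local minimum of the energy; it is used here because it fits in a few lines, not because the paper is known to use it.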


2019 ◽  
Vol 7 (2) ◽  
pp. 22 ◽  
Author(s):  
Francisco Francisco ◽  
Jan Sundberg

Techniques for marine monitoring have evolved greatly over the past decades, making the acquisition of environmental data safer, more reliable and more efficient. At the same time, the marine renewable energy sector has introduced different ways of exploring the oceans. Marine energy is mostly harvested in murky and highly energetic places where conventional data acquisition techniques are impractical. This new frontier in marine operations calls for new techniques for environmental data acquisition, processing and analysis. Modern sonar systems, operating at high frequencies, can acquire detailed images of the underwater environment. Variables such as occurrence, size, class and behavior of a variety of aquatic species of fish, birds, and mammals that coexist within marine energy sites can be monitored using imaging sonar systems. Although sonar images can provide high levels of detail, in most cases they are still difficult to decipher. To facilitate the classification of targets in sonar images, this study introduces a framework for extracting visual features of marine animals that serve as unique signatures. The acoustic visibility measure (AVM) is introduced here as a technique for identifying and classifying targets by comparing the observed size with a standard value. This information can be used to instruct algorithms and protocols in order to automate the identification and classification of underwater targets using imaging sonar systems. Using image processing algorithms embedded in Proviwer4 and FIJI software, this study found that acoustic images can be effectively used to classify cod, harbour and grey seals, and orcas through their size, shape and swimming behavior. The sonar images showed that cod occurred as bright, 0.9 m long, ellipsoidal targets shoaling in groups. Harbour seals occurred as bright, torpedo-like, fast-moving targets, whereas grey seals occurred as bulky, ellipsoidal targets with serpentine movements. Orcas or larger marine mammals occurred with relatively low visibility on the acoustic images compared to their body size, which measured between 4 m and 7 m. This framework provides a new window for qualitative and quantitative observations of underwater targets, and with further improvements, the method can be useful for environmental studies within marine renewable energy farms and for other purposes.
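The abstract describes the AVM only as a comparison of observed size with a standard value and gives no formula. The sketch below is therefore one hypothetical reading, not the authors' definition: it takes the ratio of observed length to a reference length and picks the class whose ratio is closest to 1 on a log scale. The function names are invented for illustration. The 0.9 m cod length comes from the abstract; the 5.5 m orca reference is an assumed midpoint of the reported 4 m to 7 m range.

```python
import math
from typing import Dict


def size_ratio(observed_m: float, reference_m: float) -> float:
    """Ratio of the observed target length to a reference length (both in metres)."""
    if observed_m <= 0 or reference_m <= 0:
        raise ValueError("lengths must be positive")
    return observed_m / reference_m


def closest_class(observed_m: float, references_m: Dict[str, float]) -> str:
    """Return the class whose reference length best matches the observation.

    Uses |log(ratio)| so that being twice as long and half as long count
    as equally poor matches.
    """
    return min(
        references_m,
        key=lambda name: abs(math.log(size_ratio(observed_m, references_m[name]))),
    )


# Reference lengths: cod from the abstract; orca value is an assumption.
references_m = {"cod": 0.9, "orca": 5.5}
print(closest_class(1.0, references_m))
print(closest_class(4.2, references_m))
```

Size alone would not separate harbour seals from grey seals, which the study distinguishes by shape and swimming behavior, so a fuller classifier would need those features as well.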


Sensors ◽  
2019 ◽  
Vol 19 (24) ◽  
pp. 5361 ◽  
Author(s):  
Bruno Artacho ◽  
Andreas Savakis

We propose a new efficient architecture for semantic segmentation, based on a “Waterfall” Atrous Spatial Pooling architecture, that achieves a considerable accuracy increase while decreasing the number of network parameters and the memory footprint. The proposed Waterfall architecture leverages the efficiency of progressive filtering in the cascade architecture while maintaining multiscale fields-of-view comparable to spatial pyramid configurations. Additionally, our method does not rely on a postprocessing stage with Conditional Random Fields, which further reduces complexity and required training time. We demonstrate that the Waterfall approach with a ResNet backbone is a robust and efficient architecture for semantic segmentation, obtaining state-of-the-art results with a significant reduction in the number of parameters on the Pascal VOC and Cityscapes datasets.
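The contrast between progressive (cascade) filtering and parallel pyramid branches can be illustrated with a one-dimensional toy. This is a sketch under stated assumptions, not the published network: the signal, the all-ones kernel, the dilation rates `[1, 2, 4]` and the function names are all chosen here for illustration. In the waterfall variant each atrous branch filters the previous branch's output and every intermediate output is kept; in the parallel variant each branch filters the input directly.

```python
from typing import List, Sequence


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """Same-length 1D dilated cross-correlation with zero padding."""
    n, k = len(signal), len(kernel)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - k // 2) * rate  # taps are spaced `rate` samples apart
            if 0 <= idx < n:
                acc += w * signal[idx]
        out.append(acc)
    return out


def waterfall(signal: Sequence[float], kernel: Sequence[float], rates: Sequence[int]) -> List[List[float]]:
    """Cascade: each branch filters the previous branch's output; all outputs are kept."""
    outputs = []
    current = list(signal)
    for rate in rates:
        current = atrous_conv1d(current, kernel, rate)
        outputs.append(current)
    return outputs


def parallel(signal: Sequence[float], kernel: Sequence[float], rates: Sequence[int]) -> List[List[float]]:
    """Pyramid-style: every branch filters the original input independently."""
    return [atrous_conv1d(signal, kernel, rate) for rate in rates]


def support(values: Sequence[float]) -> int:
    """Number of non-zero samples, a rough proxy for the field-of-view of an impulse response."""
    return sum(1 for v in values if v != 0.0)


impulse = [0.0] * 21
impulse[10] = 1.0
kernel = [1.0, 1.0, 1.0]
rates = [1, 2, 4]

print([support(o) for o in waterfall(impulse, kernel, rates)])  # [3, 7, 15]
print([support(o) for o in parallel(impulse, kernel, rates)])   # [3, 3, 3]
```

In this toy, the cascaded branches see progressively wider and denser neighbourhoods while reusing earlier results, which is the intuition behind keeping multiscale fields-of-view with progressive filtering.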

