Evolutionary Computation for Feature Manipulation in Salient Object Detection

2021 ◽  
Author(s):  
◽  
Shima Afzali Vahed Moghaddam

<p>The human visual system can efficiently cope with complex natural scenes containing various objects at different scales using the visual attention mechanism. Salient object detection (SOD) aims to simulate the capability of the human visual system in prioritizing objects for high-level processing. SOD is the process of identifying and localizing the most attention-grabbing object(s) in a scene and separating the whole extent of the object(s) from the scene. In SOD, significant research has been dedicated to designing and introducing new features to the domain. The existing saliency feature space suffers from several difficulties: it is high-dimensional, its features are not equally important, some features are irrelevant, and the original features are not informative enough. These difficulties can lead to various performance limitations. Feature manipulation is the process of improving the input feature space to enhance learning quality and performance. Evolutionary computation (EC) techniques have been employed in a wide range of tasks due to their powerful search abilities. Genetic programming (GP) and particle swarm optimization (PSO) are well-known EC techniques that have been used for feature manipulation. The overall goal of this thesis is to develop feature manipulation methods, including feature weighting, feature selection, and feature construction, using EC techniques to improve the input feature set for SOD. This thesis proposes a feature weighting method that uses PSO to explore the relative contribution of each saliency feature in the feature combination process. Saliency features are features extracted from different levels of an image (e.g., pixel, segmentation) to compute saliency values over the entire image. The experimental results show that different datasets favour different weights for the employed features.
The results also reveal that, by considering the importance of each feature in the combination process, the proposed method achieves better performance than the competing methods. This thesis proposes a new bottom-up SOD method that detects salient objects by constructing two new informative saliency features and designing a new feature combination framework. The proposed method aims to develop features that target different regions of the image, and it strikes a good balance between computational time and performance. This thesis proposes a GP-based method to automatically construct foreground and background saliency features. The automatically constructed features do not require domain knowledge and are more informative than the manually constructed features. The results show that GP is robust to changes in the input feature set (e.g., adding more features to the input feature set) and improves performance by introducing more informative features to the SOD domain. This thesis proposes a GP-based SOD method that automatically produces saliency maps (2-D maps containing saliency values) for different types of images. This GP-based SOD method applies feature selection and feature combination during the learning process. GP has a built-in feature selection process that selects informative features from the original set and combines the selected features to produce the final saliency map. The results show that GP can explore a large search space and find a good way to combine different input features. This thesis introduces GP for the first time to construct high-level saliency features from low-level features for SOD, with the aim of improving SOD performance, particularly on challenging and complex SOD tasks. The proposed method constructs fewer features that achieve better saliency performance than the original full feature set.</p>
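The PSO-based feature weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the fitness function (mean absolute error against a ground-truth mask), the PSO hyperparameters, the synthetic data, and all function names (`combine`, `mae`, `pso_feature_weights`) are assumptions made for illustration only.

```python
import numpy as np

def combine(features, weights):
    """Weighted sum of feature maps (k, H, W) using non-negative weights normalised to sum to 1."""
    w = np.clip(weights, 0.0, None)
    w = w / (w.sum() + 1e-12)
    return np.tensordot(w, features, axes=1)  # -> (H, W)

def mae(pred, target):
    """Mean absolute error between a saliency map and a ground-truth mask."""
    return float(np.mean(np.abs(pred - target)))

def pso_feature_weights(features, target, n_particles=20, n_iters=50,
                        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over one weight per feature (hyperparameters are assumed values)."""
    rng = np.random.default_rng(seed)
    dim = features.shape[0]
    pos = rng.random((n_particles, dim))          # candidate weight vectors in [0, 1]
    vel = np.zeros_like(pos)
    pbest = pos.copy()                            # best position seen by each particle
    pbest_fit = np.array([mae(combine(features, p), target) for p in pos])
    g = int(np.argmin(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + pull towards personal best + pull towards global best.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)
        fit = np.array([mae(combine(features, p), target) for p in pos])
        improved = fit < pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = int(np.argmin(pbest_fit))
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest / (gbest.sum() + 1e-12), gbest_fit

# Synthetic demo (assumed data): one informative feature and two pure-noise features.
rng = np.random.default_rng(42)
target = np.zeros((32, 32))
target[8:24, 8:24] = 1.0
informative = np.clip(target + 0.1 * rng.normal(size=target.shape), 0.0, 1.0)
features = np.stack([informative, rng.random(target.shape), rng.random(target.shape)])

weights, fitness = pso_feature_weights(features, target)
baseline = mae(combine(features, np.ones(3)), target)  # equal-weight combination
print("weights:", weights, "fitness:", fitness, "equal-weight baseline:", baseline)
```

In this toy setting the search is expected to shift weight towards the informative feature, which mirrors the abstract's observation that features do not contribute equally; on real data the fitness measure and training set would need to follow the thesis itself.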


2020 ◽  
Vol 34 (07) ◽  
pp. 10599-10606 ◽  
Author(s):  
Zuyao Chen ◽  
Qianqian Xu ◽  
Runmin Cong ◽  
Qingming Huang

Deep convolutional neural networks have achieved competitive performance in salient object detection, where learning effective and comprehensive features plays a critical role. Most previous works adopted multi-level feature integration yet ignored the gap between different features. In addition, high-level features are diluted as they are passed along the top-down pathway. To remedy these issues, we propose a novel network named GCPANet that effectively integrates low-level appearance features, high-level semantic features, and global context features through progressive context-aware Feature Interweaved Aggregation (FIA) modules and generates the saliency map in a supervised way. Moreover, a Head Attention (HA) module is used to reduce information redundancy and enhance the top-layer features by leveraging spatial and channel-wise attention, and a Self Refinement (SR) module is used to further refine and heighten the input features. Furthermore, we design a Global Context Flow (GCF) module to generate global context information at different stages, which aims to learn the relationship among different salient regions and alleviate the dilution effect of high-level features. Experimental results on six benchmark datasets demonstrate that the proposed approach outperforms state-of-the-art methods both quantitatively and qualitatively.


2017 ◽  
Vol 11 (3) ◽  
pp. 199-206 ◽  
Author(s):  
Anzhi Wang ◽  
Minghui Wang ◽  
Gang Pan ◽  
Xiaoyan Yuan

2021 ◽  
Vol 13 (11) ◽  
pp. 2163
Author(s):  
Zhou Huang ◽  
Huaixin Chen ◽  
Biyuan Liu ◽  
Zhixi Wang

Although remarkable progress has been made in salient object detection (SOD) in natural scene images (NSI), the SOD of optical remote sensing images (RSI) still faces significant challenges due to various spatial resolutions, cluttered backgrounds, and complex imaging conditions. Two difficulties stand out: (1) accurately locating salient objects; and (2) capturing the subtle boundaries of salient objects. This paper explores the inherent properties of multi-level features to develop a novel semantic-guided attention refinement network (SARNet) for SOD of RSI. Specifically, the proposed semantic guided decoder (SGD) roughly but accurately locates the multi-scale object by aggregating multiple high-level features, and this global semantic information then guides the integration of subsequent features in a step-by-step feedback manner to make full use of deep multi-level features. Simultaneously, the proposed parallel attention fusion (PAF) module combines cross-level features and semantic-guided information to refine the object’s boundary and gradually highlight the entire object area. Finally, the proposed network is trained end to end in a fully supervised manner. Quantitative and qualitative evaluations on two public RSI datasets and additional NSI datasets across five metrics show that our SARNet is superior to 14 state-of-the-art (SOTA) methods without any post-processing.


2020 ◽  
Vol 10 (17) ◽  
pp. 5806 ◽  
Author(s):  
Yuzhen Chen ◽  
Wujie Zhou

Depth information has been widely used to improve RGB-D salient object detection by extracting attention maps to determine the position of objects in an image. However, non-salient objects may be close to the depth sensor and present high pixel intensities in the depth maps. This inevitably leads to erroneous emphasis on non-salient areas and may have a negative impact on the saliency results. To mitigate this problem, we propose a hybrid attention neural network that fuses middle- and high-level RGB features with depth features to generate a hybrid attention map that removes background information. The proposed network extracts multilevel features from RGB images using the Res2Net architecture and then integrates high-level features from depth maps using the Inception-v4-ResNet2 architecture. The mixed high-level RGB features and depth features generate the hybrid attention map, which is then multiplied with the low-level RGB features. After decoding by several convolutions and upsampling, we obtain the final saliency prediction, achieving state-of-the-art performance on the NJUD and NLPR datasets. Moreover, the proposed network has good generalization ability compared with other methods. An ablation study demonstrates that the proposed network effectively performs saliency prediction even when non-salient objects interfere with detection. In fact, after removing the branch with high-level RGB features, the RGB attention map that guides the network for saliency prediction is lost, and all the performance measures decline. The resulting prediction map from the ablation study shows the effect of non-salient objects close to the depth sensor. This effect is not present when using the complete hybrid attention network. Therefore, RGB information can correct and supplement depth information, and the corresponding hybrid attention map is more robust than a conventional attention map constructed only from depth information.
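The gating step described in this abstract, where a hybrid attention map built from high-level RGB and depth features is multiplied with low-level RGB features, can be sketched roughly as follows. This is not the paper's network: the fusion rule (channel mean followed by a sigmoid), the nearest-neighbour upsampling, the tensor shapes, and the function names are all assumptions chosen to keep the example small.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hybrid_attention(rgb_high, depth_high):
    """Fuse high-level RGB and depth features (C, h, w) into one attention map (h, w).

    Assumed fusion: concatenate along channels, average, squash to (0, 1).
    A real network would use learned convolutions here.
    """
    fused = np.concatenate([rgb_high, depth_high], axis=0)
    return sigmoid(fused.mean(axis=0))

def upsample_nearest(att, factor):
    """Nearest-neighbour upsampling of a 2-D map by an integer factor."""
    return np.repeat(np.repeat(att, factor, axis=0), factor, axis=1)

def apply_attention(low_feats, att):
    """Multiply every low-level feature channel (C, H, W) by the attention map (H, W)."""
    return low_feats * att[None, :, :]

# Synthetic demo with assumed shapes.
rng = np.random.default_rng(0)
rgb_high = rng.normal(size=(4, 8, 8))      # high-level RGB features
depth_high = rng.normal(size=(2, 8, 8))    # high-level depth features
low_feats = rng.normal(size=(3, 32, 32))   # low-level RGB features

att = upsample_nearest(hybrid_attention(rgb_high, depth_high), factor=4)
gated = apply_attention(low_feats, att)
print(att.shape, gated.shape)
```

Because the attention values lie strictly between 0 and 1, the multiplication can only attenuate low-level responses, which is the sense in which such a map suppresses background regions.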

