Cross-Modality Attention with Semantic Graph Embedding for Multi-Label Classification

2020 ◽  
Vol 34 (07) ◽  
pp. 12709-12716
Author(s):  
Renchun You ◽  
Zhiyao Guo ◽  
Lei Cui ◽  
Xiang Long ◽  
Yingze Bao ◽  
...  

Multi-label image and video classification are fundamental yet challenging tasks in computer vision. The main challenges lie in capturing spatial or temporal dependencies between labels and discovering the locations of discriminative features for each class. To overcome these challenges, we propose to use cross-modality attention with semantic graph embedding for multi-label classification. Based on the constructed label graph, we propose an adjacency-based similarity graph embedding method to learn semantic label embeddings, which explicitly exploit label relationships. Our novel cross-modality attention maps are then generated under the guidance of the learned label embeddings. Experiments on two multi-label image classification datasets (MS-COCO and NUS-WIDE) show that our method outperforms existing state-of-the-art methods. In addition, we validate our method on a large multi-label video classification dataset (YouTube-8M Segments), and the evaluation results demonstrate the generalization capability of our method.
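
As a rough illustration of the two ingredients described above, the sketch below (not the authors' implementation) derives label embeddings from a label co-occurrence adjacency matrix and uses them to attend over flattened spatial features. The random projection standing in for the learned graph embedding, the function names, and all shapes are placeholder assumptions.

```python
import numpy as np

def label_embeddings_from_adjacency(adjacency, dim, seed=0):
    """Hypothetical stand-in for an adjacency-based similarity graph embedding:
    each label's embedding is a linear projection of its row-normalized
    co-occurrence profile, so labels that co-occur with similar label sets
    receive similar embeddings."""
    rng = np.random.default_rng(seed)
    profiles = adjacency / (adjacency.sum(axis=1, keepdims=True) + 1e-8)  # (C, C)
    projection = rng.standard_normal((adjacency.shape[1], dim))           # placeholder for learned weights
    return profiles @ projection                                          # (C, dim)

def cross_modality_attention(visual_feats, label_embs):
    """Per-label attention over spatial locations.

    visual_feats: (H*W, dim) flattened spatial features (e.g. a CNN feature map).
    label_embs:   (C, dim) semantic label embeddings.
    Returns attention weights (C, H*W) and label-specific attended features (C, dim).
    """
    scores = label_embs @ visual_feats.T / np.sqrt(label_embs.shape[1])
    scores -= scores.max(axis=1, keepdims=True)        # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)            # softmax over spatial locations
    return attn, attn @ visual_feats

# Toy usage: 5 labels, a 7x7 feature map with 64-dimensional features.
A = np.random.rand(5, 5); A = (A + A.T) / 2            # symmetric co-occurrence counts
E = label_embeddings_from_adjacency(A, dim=64)
F = np.random.rand(49, 64)
attn_maps, label_feats = cross_modality_attention(F, E)
print(attn_maps.shape, label_feats.shape)               # (5, 49) (5, 64)
```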

Author(s):  
Chang Tang ◽  
Xinzhong Zhu ◽  
Xinwang Liu ◽  
Lizhe Wang

Multi-view unsupervised feature selection (MV-UFS) aims to select a feature subset from multi-view data without using sample labels. However, existing MV-UFS algorithms do not adequately consider the local structure across views or the diversity of different views, which can adversely affect the performance of subsequent learning tasks. In this paper, we propose a cross-view local structure preserved diversity and consensus semantic learning model for MV-UFS, termed CRV-DCL for brevity, to address these issues. Specifically, we project each view of the data into a common semantic label space composed of a consensus part and a diversity part, with the aim of capturing both the common information and the distinguishing knowledge across different views. Furthermore, an inter-view similarity graph between each pair of views and an intra-view similarity graph for each view are constructed to preserve the local structure of the data across different views and among different samples within the same view. An l2,1-norm constraint is imposed on the feature projection matrix to select discriminative features. We design an efficient algorithm with a convergence guarantee to solve the resulting optimization problem. Extensive experiments are conducted on six publicly available real-world multi-view datasets, and the results demonstrate the effectiveness of CRV-DCL.
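
The sketch below illustrates two of the building blocks the abstract mentions, under placeholder assumptions: an intra-view k-nearest-neighbour similarity graph for preserving local structure, and feature scoring by the row-wise l2 norms of a projection matrix, which is how an l2,1-norm-regularized projection is typically turned into a feature ranking. It is not the CRV-DCL optimization itself; the function names, the Gaussian affinity, and the random projection matrix are illustrative.

```python
import numpy as np

def knn_similarity_graph(X, k=5, sigma=1.0):
    """Intra-view similarity graph: Gaussian weights on the k nearest neighbours.

    X: (n_samples, n_features) data matrix of one view.
    Returns a symmetric (n, n) affinity matrix that encodes local structure.
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:k + 1]                   # skip the point itself
        W[i, idx] = np.exp(-d2[i, idx] / (2 * sigma ** 2))
    return np.maximum(W, W.T)                              # symmetrize

def l21_feature_scores(P):
    """Row-wise l2 norms of a (d x c) feature projection matrix P.

    Under an l2,1-norm penalty the rows of P become sparse, so the norm of
    row j serves as the importance score of feature j for selection."""
    return np.linalg.norm(P, axis=1)

# Toy usage: rank the features of one view with a placeholder projection matrix.
X = np.random.rand(30, 10)                                  # 30 samples, 10 features
W = knn_similarity_graph(X, k=5)
P = np.random.rand(10, 4)                                   # stand-in for a learned projection
print(np.argsort(-l21_feature_scores(P))[:3])               # indices of the 3 top-ranked features
```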


2018 ◽  
pp. 2211-2232
Author(s):  
C. J. Prabhakar ◽  
P. U. Praveen Kumar

In this chapter, the authors provide an overview of state-of-the-art image enhancement and restoration techniques for underwater images. Underwater imaging is one of the challenging tasks in the field of image processing and computer vision. Underwater images typically suffer from non-uniform lighting, low contrast, diminished color, and blurring due to attenuation and scattering of light in the underwater environment, so it is necessary to preprocess these images before applying computer vision techniques. Over the last few decades, many researchers have developed various image enhancement and restoration algorithms for improving the quality of images captured in underwater environments. The authors present a brief survey of image enhancement and restoration algorithms for underwater images. At the end of the chapter, they present an overview of their approach, which is well accepted by the image processing community, to enhance the quality of underwater images. The technique consists of filtering steps, namely homomorphic filtering, wavelet-based image denoising, bilateral filtering, and contrast equalization, which are applied sequentially. The proposed method yields better visualization of objects captured in underwater environments than other existing methods.
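
A minimal grayscale sketch of such a sequential pipeline is given below. The order of the filters follows the description above, but all parameter values, the Gaussian high-frequency-emphasis form of the homomorphic filter, and the use of CLAHE for contrast equalization are assumptions rather than the chapter's exact settings (requires OpenCV and PyWavelets).

```python
import numpy as np
import cv2
import pywt

def homomorphic_filter(gray, gamma_l=0.5, gamma_h=1.5, cutoff=30):
    """Suppress non-uniform illumination: filter the log-image in the frequency
    domain with a Gaussian high-frequency-emphasis filter (assumed form)."""
    rows, cols = gray.shape
    F = np.fft.fftshift(np.fft.fft2(np.log1p(gray.astype(np.float32))))
    u = np.arange(rows) - rows / 2
    v = np.arange(cols) - cols / 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2
    H = (gamma_h - gamma_l) * (1 - np.exp(-D2 / (2 * cutoff ** 2))) + gamma_l
    filtered = np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))
    out = np.expm1(filtered)
    return cv2.normalize(out, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

def wavelet_denoise(gray, wavelet="db4", level=2, thr=10.0):
    """Soft-threshold the detail coefficients of a 2-level wavelet decomposition."""
    coeffs = pywt.wavedec2(gray.astype(np.float32), wavelet, level=level)
    coeffs[1:] = [tuple(pywt.threshold(c, thr, mode="soft") for c in lvl)
                  for lvl in coeffs[1:]]
    rec = pywt.waverec2(coeffs, wavelet)[:gray.shape[0], :gray.shape[1]]
    return np.clip(rec, 0, 255).astype(np.uint8)

def enhance_underwater(gray):
    """The four filters applied sequentially, as the chapter describes."""
    x = homomorphic_filter(gray)
    x = wavelet_denoise(x)
    x = cv2.bilateralFilter(x, d=9, sigmaColor=75, sigmaSpace=75)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(x)                                   # contrast equalization

# Usage: img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE); out = enhance_underwater(img)
```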


Sensors ◽  
2020 ◽  
Vol 20 (2) ◽  
pp. 495 ◽  
Author(s):  
Sophy Ai ◽  
Jangwoo Kwon

Low-light image enhancement is one of the most challenging tasks in computer vision, and it is actively researched and used to solve various problems. Image processing usually achieves good performance under normal lighting conditions, but under low-light conditions an image turns out noisy and dark, which makes subsequent computer vision tasks difficult. To make buried details more visible and to reduce blur and noise in a low-light captured image, a low-light image enhancement step is necessary. Many different techniques have been investigated for this purpose. However, most of these approaches require considerable effort or expensive equipment: for example, the image has to be captured as a raw camera file in order to be processed, and such methods do not perform well under extreme low-light conditions. In this paper, we propose a new convolutional network, Attention U-net (the integration of an attention gate and a U-net network), which works on common file types (.PNG, .JPEG, .JPG, etc.), relies primarily on deep learning, and does not require the raw image file from the camera. It addresses the problem of surveillance camera security in smart city environments and can perform under the most extreme low-light conditions.
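
The snippet below sketches a generic additive attention gate on a U-net skip connection, the kind of module the name "Attention U-net" refers to; it is not the authors' exact architecture, and the channel sizes and the assumption that the gating signal has already been upsampled to the skip resolution are illustrative (PyTorch).

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Generic additive attention gate for a U-net skip connection.

    The coarser decoder (gating) signal is used to suppress irrelevant
    activations in the encoder skip features before concatenation."""
    def __init__(self, skip_ch, gate_ch, inter_ch):
        super().__init__()
        self.w_x = nn.Conv2d(skip_ch, inter_ch, kernel_size=1)
        self.w_g = nn.Conv2d(gate_ch, inter_ch, kernel_size=1)
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, kernel_size=1), nn.Sigmoid())
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x, g):
        # x: skip features (B, skip_ch, H, W); g: gating signal at the same resolution.
        alpha = self.psi(self.relu(self.w_x(x) + self.w_g(g)))   # (B, 1, H, W) in [0, 1]
        return x * alpha                                         # re-weighted skip features

# Toy usage on a single skip connection.
gate = AttentionGate(skip_ch=64, gate_ch=128, inter_ch=32)
x = torch.randn(1, 64, 64, 64)       # encoder skip features
g = torch.randn(1, 128, 64, 64)      # decoder features upsampled to the same size
print(gate(x, g).shape)              # torch.Size([1, 64, 64, 64])
```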


Author(s):  
Qiang Li ◽  
Xiangling Kong ◽  
Yunjun Xu

In recent years, autonomous robots have been gradually introduced into various agricultural operations to address the ever-increasing labor shortage. Accurate navigation from one row to another is one of the many challenging tasks for an autonomous robot scouting in semi-structured agricultural fields. In this study, a marker-based row alignment control is proposed for the cross-bed motion of a scouting robot in strawberry fields. Specifically, a feature-based computer vision algorithm is used to detect primitive markers placed at the end of each planting bed. The image coordinates of the detected markers are then used to guide the robot to move away from one row and align with the next. The proposed method is low-cost, robust to varying lighting conditions, and has been validated on a local strawberry farm.
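
A minimal sketch of this idea is shown below: detect the marker in the camera image and steer proportionally to its horizontal offset from the image center. The color-threshold detector, the HSV range, and the gain are placeholders; the paper only specifies a feature-based detector for primitive markers (OpenCV 4.x).

```python
import cv2
import numpy as np

def detect_marker_center(bgr, lower_hsv=(40, 80, 80), upper_hsv=(80, 255, 255)):
    """Hypothetical marker detector: threshold a distinctive marker colour in HSV
    and return the centroid (u, v) of the largest blob, or None if nothing is found."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(lower_hsv), np.array(upper_hsv))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x signature
    if not contours:
        return None
    m = cv2.moments(max(contours, key=cv2.contourArea))
    if m["m00"] == 0:
        return None
    return m["m10"] / m["m00"], m["m01"] / m["m00"]

def alignment_command(marker_center, image_width, gain=0.002):
    """Proportional steering: turn toward the marker until it sits on the image
    centerline, i.e. until the robot is aligned with the next planting bed."""
    u, _ = marker_center
    error_px = u - image_width / 2.0
    return -gain * error_px          # angular velocity command; sign per robot convention

# Usage: frame = cv2.imread("end_of_bed.jpg"); c = detect_marker_center(frame)
# if c is not None: omega = alignment_command(c, frame.shape[1])
```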

