Rain Detection and Removal via Shrinkage-based Sparse Coding and Learned Rain Dictionary

2020, Vol 64 (3), pp. 30501-1–30501-17
Author(s): Chang-Hwan Son, Xiao-Ping Zhang

Rain removal is essential for autonomous driving because it preserves the object details that are useful for feature extraction and removes the rain structures that hinder it. Based on a linear superposition model in which the observed rain image is decomposed into two layers, a rain layer and a non-rain layer, conventional rain removal methods alternately estimate these two layers from a single observed image based on prior modeling. However, the prior knowledge used for the rain structures is not always correct because various types of rain structures can appear in rain images, which results in inaccurate rain removal. Therefore, in this article, a novel rain removal method based on a scribbled rain image set and a new shrinkage-based sparse coding model is proposed. The scribbled rain images carry information about which pixels contain rain structures, so various types of rain structures can be modeled, owing to the abundance of rain structures in the image set. To detect the rain regions, two approaches are presented: one based on reconstruction error comparison (REC) via a learned rain dictionary, and the other based on a deep convolutional neural network (DCNN). Given the detected rain regions, the proposed shrinkage-based sparse coding model determines how much to reduce the sparse codes of the rain dictionary while maintaining the sparse codes of the non-rain dictionary for accurate rain removal. Experimental results verified that the proposed model removes rain structures and preserves object details thanks to the REC- or DCNN-based rain detection on the scribbled rain image set. Moreover, the proposed method proved more effective than conventional methods at separating rain structures from similar object structures.
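The shrinkage step can be illustrated with a minimal sketch (the variable names, the OMP-based coding, and the linear shrinkage rule below are illustrative assumptions, not the authors' exact formulation): sparse codes are computed over a combined rain/non-rain dictionary, and the coefficients of the rain atoms are attenuated according to the detected rain likelihood while the non-rain coefficients are left untouched.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def derain_patch(patch, D_rain, D_nonrain, rain_score, n_nonzero=10):
    """Illustrative shrinkage-based sparse coding step (not the authors' exact model).

    patch      : flattened image patch, shape (p,)
    D_rain     : learned rain dictionary, shape (p, k_r)
    D_nonrain  : learned non-rain dictionary, shape (p, k_n)
    rain_score : detected rain likelihood for this patch in [0, 1]
                 (e.g. from an REC- or DCNN-based detector)
    """
    D = np.hstack([D_rain, D_nonrain])                     # combined dictionary
    codes = orthogonal_mp(D, patch, n_nonzero_coefs=n_nonzero)

    k_r = D_rain.shape[1]
    codes[:k_r] *= (1.0 - rain_score)                      # shrink rain-dictionary codes
    # non-rain codes (codes[k_r:]) are kept unchanged

    return D @ codes                                       # reconstructed de-rained patch
```

In this sketch a detector score of 1 fully suppresses the rain-dictionary contribution, while patches flagged as rain-free are reconstructed unchanged.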

2018, Vol 116, pp. 212–217
Author(s): Jinsheng Xiao, Wentao Zou, Yunhua Chen, Wen Wang, Junfeng Lei

Sensors, 2020, Vol 21 (1), pp. 15
Author(s): Filippo Aleotti, Giulio Zaccaroni, Luca Bartolomei, Matteo Poggi, Fabio Tosi, ...

Depth perception is paramount for tackling real-world problems, ranging from autonomous driving to consumer applications. For the latter, depth estimation from a single image would represent the most versatile solution since a standard camera is available on almost any handheld device. Nonetheless, two main issues limit the practical deployment of monocular depth estimation methods on such devices: (i) the low reliability when deployed in the wild and (ii) the resources needed to achieve real-time performance, often not compatible with low-power embedded systems. Therefore, in this paper, we deeply investigate all these issues, showing how they are both addressable by adopting appropriate network design and training strategies. Moreover, we also outline how to map the resulting networks on handheld devices to achieve real-time performance. Our thorough evaluation highlights the ability of such fast networks to generalize well to new environments, a crucial feature required to tackle the extremely varied contexts faced in real applications. Indeed, to further support this evidence, we report experimental results concerning real-time, depth-aware augmented reality and image blurring with smartphones in the wild.
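As an illustration of the depth-aware image blurring application mentioned above, the sketch below is a simplified, hypothetical post-processing step (not the paper's implementation): pixels are blurred in proportion to their distance from a chosen focus depth, using a small stack of pre-blurred images.

```python
import numpy as np
import cv2  # OpenCV, used here only for Gaussian blurring

def depth_aware_blur(image, depth, focus_depth, n_levels=4, max_sigma=8.0):
    """Illustrative depth-aware blur: pixels far from the focus plane are blurred more.

    image       : HxWx3 uint8 image
    depth       : HxW float array of predicted (relative) depth
    focus_depth : depth value that should stay sharp
    """
    # Normalised distance of each pixel from the focus plane
    d = np.abs(depth - focus_depth)
    d = d / (d.max() + 1e-6)

    # Precompute a small stack of progressively blurred images
    blurred = [image.astype(np.float32)]
    for level in range(1, n_levels):
        sigma = max_sigma * level / (n_levels - 1)
        blurred.append(cv2.GaussianBlur(image, (0, 0), sigma).astype(np.float32))

    # Pick a blur level per pixel according to its distance from the focus plane
    idx = np.clip((d * (n_levels - 1)).round().astype(int), 0, n_levels - 1)
    out = np.zeros_like(blurred[0])
    for level in range(n_levels):
        mask = (idx == level)[..., None]
        out = np.where(mask, blurred[level], out)
    return out.astype(np.uint8)
```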


Author(s): Cong Wang, Xiaoying Xing, Yutong Wu, Zhixun Su, Junyang Chen

Author(s): Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, Andrej Risteski

Word embeddings are ubiquitous in NLP and information retrieval, but it is unclear what they represent when the word is polysemous. Here it is shown that multiple word senses reside in linear superposition within the word embedding and simple sparse coding can recover vectors that approximately capture the senses. The success of our approach, which applies to several embedding methods, is mathematically explained using a variant of the random walk on discourses model (Arora et al., 2016). A novel aspect of our technique is that each extracted word sense is accompanied by one of about 2000 “discourse atoms” that gives a succinct description of which other words co-occur with that word sense. Discourse atoms can be of independent interest, and make the method potentially more useful. Empirical tests are used to verify and support the theory.
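A minimal sketch of the sparse coding step is given below (random placeholder embeddings and reduced dimensions are assumptions; the paper works with full pretrained vocabularies and roughly 2000 atoms): each word vector is expressed as a sparse combination of learned atoms, and the atoms with non-zero coefficients for a polysemous word approximate its distinct senses.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

# Placeholder embedding matrix; in practice the rows would be pretrained word
# vectors (e.g. GloVe or word2vec) and n_components would be ~2000 discourse atoms.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((2000, 300))

learner = MiniBatchDictionaryLearning(n_components=200,
                                      transform_algorithm="omp",
                                      transform_n_nonzero_coefs=5,
                                      random_state=0)
codes = learner.fit_transform(embeddings)   # sparse codes, shape (n_words, n_atoms)
atoms = learner.components_                 # learned atoms, shape (n_atoms, dim)

# For a polysemous word, the atoms with non-zero coefficients approximate its senses;
# the words nearest to each such atom give a succinct description of that sense.
word_idx = 0                                # placeholder word index
sense_atoms = np.flatnonzero(codes[word_idx])
```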


IEEE Access, 2018, Vol 6, pp. 66522–66535
Author(s): Haiying Xia, Ruibin Zhuge, Haisheng Li, Shuxiang Song, Frank Jiang, ...

Author(s): Mingqin Liu, Xiaoguang Zhang, Guiyun Xu

Continuous image sequence recognition is more difficult than single-image recognition because both the classification of continuous image sequences and the recognition of sequence boundaries must be highly accurate. Hence, a sequence-alignment method for action segmentation and classification is proposed: a template sequence is reconstructed by estimating the mean action of each class, and the distance between a single image and the template sequence is computed via sparse coding within Dynamic Time Warping. The proposed method is compared with the methods of Kulkarni et al. [Continuous action recognition based on sequence alignment, Int. J. Comput. Vis., pp. 1–26] and Hoai et al. [Joint segmentation and classification of human actions in video, IEEE Conf. on Computer Vision and Pattern Recognition, 2008, pp. 108–119] in terms of continuous and isolated recognition accuracy, and the comparison clearly shows that the proposed method outperforms the others. When applied to continuous gesture classification, it not only recognizes gesture categories more quickly and accurately, but also handles continuous action recognition in video more realistically than other existing methods.
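The alignment step can be sketched with the standard Dynamic Time Warping recursion (a simplified illustration; the per-frame cost here is a plain Euclidean distance placeholder, whereas the paper uses a sparse-coding based distance between a single image and the template sequence):

```python
import numpy as np

def frame_distance(x, t):
    """Placeholder per-frame cost; the paper uses a sparse-coding based distance."""
    return np.linalg.norm(x - t)

def dtw_distance(sequence, template):
    """Classic dynamic time warping between two feature sequences (lists of vectors)."""
    n, m = len(sequence), len(template)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(sequence[i - 1], template[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],       # insertion
                                 cost[i, j - 1],       # deletion
                                 cost[i - 1, j - 1])   # match
    return cost[n, m]
```

A continuous sequence would then be segmented and labelled by choosing, for each candidate segment, the class template with the smallest DTW distance.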


Author(s): J. Jung, K. Bang, G. Sohn, C. Armenakis

In this paper, a new model-to-image framework is proposed that automatically aligns a single airborne image with existing 3D building models using geometric hashing. As a prerequisite for various applications such as data fusion, object tracking, change detection and texture mapping, the proposed registration method determines accurate exterior orientation parameters (EOPs) of a single image. This model-to-image matching process consists of three steps: 1) feature extraction, 2) similarity measure and matching, and 3) adjustment of the EOPs of a single image. For feature extraction, two types of matching cues are proposed: edged corner points, which represent the saliency of building corner points with their associated edges, and contextual relations among the edged corner points within an individual roof. These matching features are extracted from both the 3D building models and the single airborne image. A set of candidate corner matches is found with a given proximity measure through geometric hashing, and optimal matches are then determined by maximizing a matching cost that encodes the contextual similarity between matching candidates. The final matched corners are used to adjust the EOPs of the single airborne image by the least-squares method based on the collinearity equations. The results show that acceptable EOP accuracy for a single image can be achieved by the proposed registration approach, offering an alternative to the labour-intensive manual registration process.
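The final adjustment step can be sketched as a nonlinear least-squares problem over the six EOPs using the collinearity equations (a simplified pinhole formulation with illustrative parameter names, rotation convention, and synthetic data, not the authors' exact implementation):

```python
import numpy as np
from scipy.spatial.transform import Rotation
from scipy.optimize import least_squares

def collinearity_residuals(eop, object_points, image_points, focal_length):
    """Residuals of the collinearity equations for matched building corners.

    eop           : [X0, Y0, Z0, omega, phi, kappa] exterior orientation parameters
    object_points : (N, 3) corner coordinates from the 3D building models
    image_points  : (N, 2) matched image coordinates (principal point at the origin)
    focal_length  : focal length in the same units as image_points
    """
    camera_centre = eop[:3]
    R = Rotation.from_euler("xyz", eop[3:]).as_matrix()    # omega, phi, kappa (radians)
    cam = (object_points - camera_centre) @ R.T            # points in the camera frame
    proj = -focal_length * cam[:, :2] / cam[:, 2:3]        # x = -f*Xc/Zc, y = -f*Yc/Zc
    return (proj - image_points).ravel()

# Illustrative refinement from matched corners (synthetic data, not real measurements)
rng = np.random.default_rng(1)
corners_3d = rng.uniform(-50, 50, size=(12, 3))            # hypothetical building corners
true_eop = np.array([5.0, -3.0, 800.0, 0.01, -0.02, 0.05])
observed = collinearity_residuals(true_eop, corners_3d,
                                  np.zeros((12, 2)), 100.0).reshape(-1, 2)
eop0 = np.array([0.0, 0.0, 780.0, 0.0, 0.0, 0.0])          # rough initial EOPs
result = least_squares(collinearity_residuals, eop0,
                       args=(corners_3d, observed, 100.0))
```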

