Explicit feature disentanglement for visual place recognition across appearance changes

2021 ◽  
Vol 18 (6) ◽  
pp. 172988142110374
Author(s):  
Li Tang ◽  
Yue Wang ◽  
Qimeng Tan ◽  
Rong Xiong

In the long-term deployment of mobile robots, changing appearance brings challenges for localization. When a robot travels to the same place or restarts from an existing map, global localization is needed, where place recognition provides coarse position information. For visual sensors, appearance changes such as the transition from day to night and seasonal variation can reduce the performance of a visual place recognition system. To address this problem, we propose to learn domain-unrelated features across extreme appearance changes, where a domain denotes a specific appearance condition, such as a season or a kind of weather. We use an adversarial network with two discriminators to disentangle domain-related and domain-unrelated features from images, and the domain-unrelated features are used as descriptors in place recognition. Given images from different domains, our network is trained in a self-supervised manner that does not require correspondences between these domains. Moreover, our feature extractors are shared among all domains, making it possible to cover more appearance conditions without increasing model complexity. Qualitative and quantitative results on two toy cases show that our network can disentangle domain-related and domain-unrelated features from given data. Experiments on three public datasets and one proposed dataset for visual place recognition illustrate the performance of our method compared with several typical algorithms. An ablation study validates the effectiveness of the introduced discriminators in our network. Additionally, we use a four-domain dataset to verify that the network can extend to multiple domains with a single model while achieving similar performance.
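
To make the mechanism concrete, the following is a minimal PyTorch sketch of the adversarial disentanglement idea, assuming a shared encoder that splits each image into a domain-related and a domain-unrelated code, plus a domain discriminator that the encoder must fool. All layer sizes and names are illustrative, and only one of the paper's two discriminators is shown; the second would act analogously on the complementary code.

```python
# Minimal sketch of adversarial feature disentanglement (illustrative, not the
# authors' code). A shared encoder splits each image into a domain-related and
# a domain-unrelated code; a domain discriminator learns to recognize the
# domain from the domain-unrelated code, while the encoder learns to fool it.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, 2 * feat_dim),
        )
        self.feat_dim = feat_dim

    def forward(self, x):
        h = self.backbone(x)
        # First half: domain-related code; second half: domain-unrelated descriptor.
        return h[:, :self.feat_dim], h[:, self.feat_dim:]

class DomainDiscriminator(nn.Module):
    def __init__(self, feat_dim=256, n_domains=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_domains))

    def forward(self, z):
        return self.net(z)

enc, disc = Encoder(), DomainDiscriminator()
opt_e = torch.optim.Adam(enc.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 64, 64)    # toy batch
domains = torch.randint(0, 2, (8,))   # e.g. 0 = day, 1 = night

# Discriminator step: predict the domain from the domain-unrelated code.
_, z_u = enc(images)
loss_d = ce(disc(z_u.detach()), domains)
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Encoder step: fool the discriminator so the descriptor carries no domain cue.
_, z_u = enc(images)
loss_e = -ce(disc(z_u), domains)
opt_e.zero_grad(); loss_e.backward(); opt_e.step()
```

After training, the domain-unrelated half of the code would serve as the place descriptor, which is why no correspondences between domains are needed.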

Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1810
Author(s):  
Dat Tien Nguyen ◽  
Tuyen Danh Pham ◽  
Ganbayar Batchuluun ◽  
Kyoung Jun Noh ◽  
Kang Ryoung Park

Although face-based biometric recognition systems have been widely used in many applications, they remain vulnerable to presentation attacks, which use fake samples to deceive the recognition system. To overcome this problem, presentation attack detection (PAD) methods for face recognition systems (face-PAD), which aim to classify real and presentation attack face images before performing a recognition task, have been developed. However, the performance of PAD systems is limited and biased due to the lack of presentation attack images for training. In this paper, we propose a method for artificially generating presentation attack face images by learning the characteristics of real and presentation attack images from a few captured images. As a result, our proposed method saves time in collecting presentation attack samples for training PAD systems and can potentially enhance their performance. Our study is the first attempt to generate presentation attack (PA) face images for PAD systems based on the CycleGAN network, a deep-learning-based framework for image generation. In addition, we propose a new measurement method to evaluate the quality of generated PA images based on a face-PAD system. Through experiments with two public datasets (CASIA and Replay-mobile), we show that the generated face images capture the characteristics of presentation attack images, making them usable as captured presentation attack samples for PAD system training.
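
As a rough illustration of the generation pipeline, here is a heavily simplified CycleGAN-style sketch in PyTorch. The architectures, loss weights, and variable names are assumptions for illustration, not the published method's configuration.

```python
# Illustrative CycleGAN-style sketch for translating bona-fide face crops into
# presentation-attack-like images (and back). Hypothetical, simplified models;
# the published method's exact architecture and losses are not reproduced here.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.InstanceNorm2d(cout), nn.ReLU())

class Generator(nn.Module):          # image -> image translator
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(conv_block(3, 64), conv_block(64, 64),
                                 nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):      # PatchGAN-style real/fake critic
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 64, 4, stride=2, padding=1),
                                 nn.LeakyReLU(0.2),
                                 nn.Conv2d(64, 1, 4, stride=2, padding=1))
    def forward(self, x):
        return self.net(x)

G_r2a, G_a2r = Generator(), Generator()   # real->attack, attack->real
D_a = Discriminator()                     # judges the attack domain
l1, mse = nn.L1Loss(), nn.MSELoss()

real = torch.randn(4, 3, 128, 128)        # bona-fide face batch (toy data)
fake_attack = G_r2a(real)                 # synthesized presentation attack
recon = G_a2r(fake_attack)                # cycle back to the real domain

adv = mse(D_a(fake_attack), torch.ones_like(D_a(fake_attack)))  # LSGAN term
cyc = l1(recon, real)                     # cycle-consistency term
loss_G = adv + 10.0 * cyc                 # lambda_cyc = 10 as in CycleGAN
```

The cycle-consistency term is what lets the mapping be learned from unpaired real and attack images, which matches the paper's premise of having only a few captured samples.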


2017 ◽  
Vol 30 (7) ◽  
pp. e4146 ◽  
Author(s):  
Loukas Bampis ◽  
Savvas Chatzichristofis ◽  
Chryssanthi Iakovidou ◽  
Angelos Amanatiadis ◽  
Yiannis Boutalis ◽  
...  

Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6515
Author(s):  
Lu Xiong ◽  
Zhenwen Deng ◽  
Yuyao Huang ◽  
Weixin Du ◽  
Xiaolong Zhao ◽  
...  

Perception of road structures, especially traffic intersections, by visual sensors is an essential task for automated driving. However, compared with intersection detection or visual place recognition, intersection re-identification (intersection re-ID), which strongly affects driving behavior decisions along given routes, has long been neglected by researchers. This paper explores intersection re-ID with a monocular camera. We propose a Hybrid Double-Level re-identification approach that exploits two branches of a deep convolutional neural network to accomplish multiple tasks: classification of intersections and their fine attributes, and global localization in topological maps. Furthermore, we propose mixed-loss training for the network to learn the similarity of two intersection images. As no public datasets are available for the intersection re-ID task, building on the RobotCar dataset we propose a new dataset with carefully labeled intersection attributes, called "RobotCar Intersection", which covers more than 30,000 images of eight intersections in different seasons and at different times of day. Additionally, we provide another dataset, called "Campus Intersection", consisting of panoramic images of eight intersections on a university campus to verify our topological map updating strategy. Experimental results demonstrate that our proposed approach achieves promising results in re-ID of both coarse road intersections and their global pose, and is well suited for updating and completing topological maps.
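
A hedged sketch of how such a double-level, multi-task network could be wired up in PyTorch is given below; the trunk, head sizes, and the particular mix of losses are illustrative guesses, not the paper's exact design.

```python
# Hedged sketch of a two-branch ("double-level") network: a shared CNN trunk
# feeds one head that classifies intersection identity and attributes, and
# another that produces an embedding trained with a pairwise similarity loss.
import torch
import torch.nn as nn

class IntersectionReID(nn.Module):
    def __init__(self, n_intersections=8, n_attributes=5, emb_dim=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.cls_head = nn.Linear(64, n_intersections)   # which intersection
        self.attr_head = nn.Linear(64, n_attributes)     # fine attributes
        self.emb_head = nn.Linear(64, emb_dim)           # similarity embedding

    def forward(self, x):
        h = self.trunk(x)
        return self.cls_head(h), self.attr_head(h), self.emb_head(h)

model = IntersectionReID()
ce = nn.CrossEntropyLoss()
bce = nn.BCEWithLogitsLoss()
contrastive = nn.CosineEmbeddingLoss()

a = torch.randn(4, 3, 128, 128)                 # anchor images (toy)
b = torch.randn(4, 3, 128, 128)                 # paired images (toy)
ids = torch.randint(0, 8, (4,))
attrs = torch.randint(0, 2, (4, 5)).float()
same = (torch.rand(4) > 0.5).float() * 2 - 1    # +1 same place, -1 different

logits, attr_logits, emb_a = model(a)
_, _, emb_b = model(b)

# "Mixed" loss: classification + attribute + pairwise similarity terms.
loss = ce(logits, ids) + bce(attr_logits, attrs) + contrastive(emb_a, emb_b, same)
```

The embedding head is what would be matched against stored nodes when updating a topological map, while the classification heads give the coarse intersection identity and attributes.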


2019 ◽  
Vol 9 (15) ◽  
pp. 3146 ◽  
Author(s):  
Bo Yang ◽  
Xiaosu Xu ◽  
Jun Li ◽  
Hong Zhang

Landmark generation is an essential component of landmark-based visual place recognition. In this paper, we present a simple yet effective method, called multi-scale sliding window (MSW), for landmark generation to improve the performance of place recognition. In our method, we generate landmarks that form a uniform distribution over multiple landmark scales (sizes) within an appropriate range by sampling an image with a sliding window. This is in contrast to conventional landmark generation methods, which typically depend on detecting objects whose size distributions are uneven and, as a result, may not achieve shift invariance and viewpoint invariance, two important properties in visual place recognition. We conducted experiments on four challenging datasets to demonstrate that our method significantly improves the recognition performance of a standard landmark-based visual place recognition system. Our method is simple, with a single input parameter (the required landmark scales), and efficient, as it does not involve detecting objects.
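
One plausible reading of the method is the following short Python sketch, which samples square windows of several fixed scales on a regular grid so that landmark sizes are uniformly distributed; the scales and stride fraction are illustrative choices, not the paper's values.

```python
# A minimal interpretation of multi-scale sliding-window landmark generation:
# sample square windows at several scales on a regular grid so that landmark
# sizes are uniformly distributed, with no object detector involved.
def generate_landmarks(img_w, img_h, scales=(64, 96, 128), stride_frac=0.5):
    """Return (x, y, size) boxes covering the image at each scale."""
    boxes = []
    for s in scales:
        stride = max(1, int(s * stride_frac))   # 50% overlap between windows
        for y in range(0, img_h - s + 1, stride):
            for x in range(0, img_w - s + 1, stride):
                boxes.append((x, y, s))
    return boxes

# Example: a 640x480 image yields a uniform mix of 64-, 96- and 128-px landmarks.
landmarks = generate_landmarks(640, 480)
print(len(landmarks), landmarks[:3])
```

Because the grid is regular and scale-stratified, nearby viewpoints produce heavily overlapping landmark sets, which is the intuition behind the claimed shift and viewpoint invariance.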


2015 ◽  
Vol 35 (4) ◽  
pp. 334-356 ◽  
Author(s):  
Elena S. Stumm ◽  
Christopher Mei ◽  
Simon Lacroix

2021 ◽  
Vol 6 (3) ◽  
pp. 5976-5983
Author(s):  
Maria Waheed ◽  
Michael Milford ◽  
Klaus McDonald-Maier ◽  
Shoaib Ehsan

Author(s):  
Xinyi Li ◽  
Liqiong Chang ◽  
Fangfang Song ◽  
Ju Wang ◽  
Xiaojiang Chen ◽  
...  

This paper focuses on a fundamental question in Wi-Fi-based gesture recognition: "Can we use the knowledge learned from some users to perform gesture recognition for others?" This problem is also known as cross-target recognition. It arises in many practical deployments of Wi-Fi-based gesture recognition where it is prohibitively expensive to collect training data from every single user. We present CrossGR, a low-cost cross-target gesture recognition system. As a departure from existing approaches, CrossGR does not require prior knowledge (such as who is currently performing a gesture) of the target user. Instead, CrossGR employs a deep neural network to extract user-agnostic but gesture-related Wi-Fi signal characteristics to perform gesture recognition. To provide sufficient training data for an effective deep learning model, CrossGR employs a generative adversarial network to automatically generate synthetic training data from a small set of real-world examples collected from a small number of users. This strategy allows CrossGR to minimize user involvement and the associated cost of collecting training examples for building an accurate gesture recognition system. We evaluate CrossGR by applying it to gesture recognition across 10 users and 15 gestures. Experimental results show that CrossGR achieves an accuracy of over 82.6% (up to 99.75%). Compared to state-of-the-art recognition systems, CrossGR delivers comparable recognition accuracy while using an order of magnitude fewer training samples collected from end-users.
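
The augmentation strategy can be illustrated with a small PyTorch sketch of a conditional generator that produces synthetic Wi-Fi feature vectors for a given gesture label, which are then mixed with a few real samples to train the gesture classifier. Dimensions, labels, and the assumption of an already-trained generator are hypothetical, not CrossGR's actual design.

```python
# Hedged sketch of GAN-based data augmentation for cross-target gesture
# recognition: a conditional generator synthesizes Wi-Fi feature vectors per
# gesture label; the classifier trains on real plus synthetic samples.
import torch
import torch.nn as nn

N_GESTURES, SIG_DIM, NOISE_DIM = 15, 256, 64

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(N_GESTURES, 32)
        self.net = nn.Sequential(nn.Linear(NOISE_DIM + 32, 256), nn.ReLU(),
                                 nn.Linear(256, SIG_DIM))
    def forward(self, z, labels):
        return self.net(torch.cat([z, self.embed(labels)], dim=1))

G = Generator()   # assume it has been adversarially trained against a critic

# Augment a small real dataset with synthetic samples for the same labels.
real_x = torch.randn(32, SIG_DIM)                  # few real examples (toy)
real_y = torch.randint(0, N_GESTURES, (32,))
z = torch.randn(128, NOISE_DIM)
syn_y = torch.randint(0, N_GESTURES, (128,))
syn_x = G(z, syn_y).detach()

train_x = torch.cat([real_x, syn_x])               # mixed training set
train_y = torch.cat([real_y, syn_y])
classifier = nn.Sequential(nn.Linear(SIG_DIM, 128), nn.ReLU(),
                           nn.Linear(128, N_GESTURES))
loss = nn.CrossEntropyLoss()(classifier(train_x), train_y)
```

In this reading, the order-of-magnitude reduction in required end-user data comes from the 128 synthetic samples standing in for real collections.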


2021 ◽  
Vol 11 (4) ◽  
pp. 1380
Author(s):  
Yingbo Zhou ◽  
Pengcheng Zhao ◽  
Weiqin Tong ◽  
Yongxin Zhu

While Generative Adversarial Networks (GANs) have shown promising performance in image generation, they suffer from issues such as mode collapse and training instability. To stabilize GAN training and improve the quality and diversity of synthesized images, we propose a simple yet effective approach called Contrastive Distance Learning GAN (CDL-GAN). Specifically, we add Consistent Contrastive Distance (CoCD) and Characteristic Contrastive Distance (ChCD) terms to a principled framework to improve GAN performance. CoCD explicitly maximizes the ratio of the distance between generated images to the increment between their noise vectors, strengthening image feature learning for the generator. ChCD measures the sampling distance of the encoded images in Euclidean space to boost feature representations for the discriminator. We implement the framework by employing a Siamese network as a module in GANs, without any modification to the backbone. Both qualitative and quantitative experiments on three public datasets demonstrate the effectiveness of our method.
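
The CoCD term admits a simple rendering in PyTorch, shown below under the assumption that it maximizes the ratio of image-space distance to noise-space distance for pairs of samples. This is one plausible reading of the abstract, not the paper's exact formulation, and the toy generator is purely illustrative.

```python
# Illustrative "consistent contrastive distance" term: encourage the distance
# between two generated images to grow with the distance between their noise
# vectors, which discourages mode collapse. Plausible reading of the abstract.
import torch
import torch.nn as nn

# Toy generator mapping 64-d noise to flattened 3x32x32 images (illustrative).
G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 3 * 32 * 32))

def cocd_loss(G, z1, z2, eps=1e-8):
    """Penalize a small image-space distance relative to the noise-space step."""
    d_img = (G(z1) - G(z2)).flatten(1).norm(dim=1)   # distance between outputs
    d_z = (z1 - z2).norm(dim=1)                      # distance between inputs
    return -(d_img / (d_z + eps)).mean()             # maximize the ratio

z1, z2 = torch.randn(8, 64), torch.randn(8, 64)
loss = cocd_loss(G, z1, z2)
# In training this would be added to the usual adversarial generator loss,
# e.g. loss_G = adv_loss + lambda_cocd * cocd_loss(G, z1, z2).
```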

