Non-Local Spatial and Temporal Attention Network for Video-Based Person Re-Identification

2020 ◽  
Vol 10 (15) ◽  
pp. 5385
Author(s):  
Zheng Liu ◽  
Feixiang Du ◽  
Wang Li ◽  
Xu Liu ◽  
Qiang Zou

Given a video containing a person, the video-based person re-identification (Re-ID) task aims to identify the same person in videos captured by different cameras. How to embed the spatial-temporal information of a video into its feature representation is a crucial challenge. Most existing methods fail to make full use of the relationship between frames during feature extraction. In this work, we propose a plug-and-play non-local attention module (NLAM) for frame-level feature extraction. NLAM, based on global spatial attention and channel attention, helps the network to determine the location of the person in each frame. In addition, we propose a non-local temporal pooling (NLTP) method for temporal feature aggregation, which can effectively capture long-range and global dependencies among the frames of the video. Our model obtained impressive results on different datasets compared to the state-of-the-art methods. In particular, it achieved a rank-1 accuracy of 86.3% on the MARS (Motion Analysis and Re-identification Set) dataset without re-ranking, which is 1.4% higher than the previous state-of-the-art method. On the DukeMTMC-VideoReID (Duke Multi-Target Multi-Camera Video Reidentification) dataset, our method also reached 95% rank-1 accuracy and 94.5% mAP (mean Average Precision).
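The idea of pooling frame features through a non-local operation can be illustrated with a minimal sketch. This is not the authors' NLTP: it assumes a generic, weight-free self-attention over frame-level feature vectors with a residual connection, followed by mean pooling. The function name `non_local_temporal_pool` and the toy clip are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def non_local_temporal_pool(frames):
    """Pool T frame-level feature vectors into one clip-level vector.

    Assumption: each frame is refined by adding an attention-weighted sum
    of ALL frames (a generic non-local operation, no learned weights),
    and the refined frames are then averaged over time.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    refined = []
    for query in frames:
        # Similarity of this frame to every frame in the clip.
        weights = softmax([dot(query, key) / scale for key in frames])
        context = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
        # Residual connection keeps the original frame feature.
        refined.append([q + c for q, c in zip(query, context)])
    return [sum(f[d] for f in refined) / len(refined) for d in range(dim)]


# Hypothetical toy clip: three frames with 2-D features.
clip = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
pooled = non_local_temporal_pool(clip)
```

Because every frame attends to every other frame, distant frames can influence each other directly, which is the long-range dependency the abstract refers to. A trained module would add learned projections for the query, key, and value.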

2020 ◽  
Vol 34 (07) ◽  
pp. 11077-11084
Author(s):  
Yung-Han Huang ◽  
Kuang-Jui Hsu ◽  
Shyh-Kang Jeng ◽  
Yen-Yu Lin

Video re-localization aims to localize a sub-sequence, called target segment, in an untrimmed reference video that is similar to a given query video. In this work, we propose an attention-based model to accomplish this task in a weakly supervised setting. Namely, we derive our CNN-based model without using the annotated locations of the target segments in reference videos. Our model contains three modules. First, it employs a pre-trained C3D network for feature extraction. Second, we design an attention mechanism to extract multiscale temporal features, which are then used to estimate the similarity between the query video and a reference video. Third, a localization layer detects where the target segment is in the reference video by determining whether each frame in the reference video is consistent with the query video. The resultant CNN model is derived based on the proposed co-attention loss which discriminatively separates the target segment from the reference video. This loss maximizes the similarity between the query video and the target segment while minimizing the similarity between the target segment and the rest of the reference video. Our model can be modified to fully supervised re-localization. Our method is evaluated on a public dataset and achieves the state-of-the-art performance under both weakly supervised and fully supervised settings.
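The objective described above, pulling the query toward the target segment while pushing the target segment away from the rest of the reference video, can be sketched as follows. This is an illustrative assumption rather than the paper's co-attention loss: it uses cosine similarity and per-frame attention scores in [0, 1], and the names `co_attention_style_loss` and `weighted_mean` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den > 0 else 0.0


def weighted_mean(features, weights):
    """Attention-weighted average of per-frame feature vectors."""
    total = sum(weights)
    dim = len(features[0])
    if total == 0:
        return [0.0] * dim
    return [sum(w * f[d] for w, f in zip(weights, features)) / total for d in range(dim)]


def co_attention_style_loss(query, ref_frames, attention):
    """Lower is better: high query/target similarity, low target/rest similarity.

    attention[t] is the assumed probability that reference frame t belongs
    to the target segment; no segment annotations are used.
    """
    target = weighted_mean(ref_frames, attention)
    rest = weighted_mean(ref_frames, [1.0 - a for a in attention])
    return cosine(target, rest) - cosine(query, target)


# Hypothetical example: the first two reference frames match the query.
query = [1.0, 0.0]
ref_frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
good = co_attention_style_loss(query, ref_frames, [1.0, 1.0, 0.0, 0.0])
bad = co_attention_style_loss(query, ref_frames, [0.0, 0.0, 1.0, 1.0])
```

In this toy case the attention that selects the matching frames yields the lower loss, so minimizing it would steer the attention toward the target segment without location labels.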


2017 ◽  
Vol 9 (3) ◽  
pp. 58-72 ◽  
Author(s):  
Guangyu Wang ◽  
Xiaotian Wu ◽  
WeiQi Yan

The security of currency has attracted public attention. Despite the development of various anti-counterfeiting methods for currency notes, counterfeiters are still able to produce illegal copies and circulate them in the market without being detected. After reviewing related work in currency security, this paper presents a comparative study of feature extraction and classification algorithms for currency note authentication. We extract various computational features from a dataset consisting of US dollar (USD), Chinese Yuan (CNY) and New Zealand Dollar (NZD) notes and apply the classification algorithms to currency identification. Our contributions are to find and implement various algorithms from the existing literature and choose the best approaches for use.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Jifeng Guo ◽  
Zhiqi Pang ◽  
Wenbo Sun ◽  
Shi Li ◽  
Yu Chen

Active learning aims to select the most valuable unlabelled samples for annotation. In this paper, we propose a redundancy removal adversarial active learning (RRAAL) method based on norm online uncertainty indicator, which selects samples based on their distribution, uncertainty, and redundancy. RRAAL includes a representation generator, state discriminator, and redundancy removal module (RRM). The purpose of the representation generator is to learn the feature representation of a sample, and the state discriminator predicts the state of the feature vector after concatenation. We added a sample discriminator to the representation generator to improve the representation learning ability of the generator and designed a norm online uncertainty indicator (Norm-OUI) to provide a more accurate uncertainty score for the state discriminator. In addition, we designed an RRM based on a greedy algorithm to reduce the number of redundant samples in the labelled pool. The experimental results on four datasets show that the state discriminator, Norm-OUI, and RRM can improve the performance of RRAAL, and RRAAL outperforms the previous state-of-the-art active learning methods.
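A greedy redundancy filter of the kind mentioned above might look like the following sketch. The details are assumptions, not the paper's RRM: candidates are visited in order of decreasing uncertainty, and a candidate is skipped when its cosine similarity to an already selected sample exceeds a threshold. The names `greedy_select` and `max_similarity` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den > 0 else 0.0


def greedy_select(candidates, budget, max_similarity=0.95):
    """Pick up to `budget` uncertain samples, skipping near-duplicates.

    candidates: list of (sample_id, uncertainty, feature_vector) tuples.
    """
    # Most uncertain candidates are considered first.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    chosen = []
    for sample_id, _, feature in ranked:
        if len(chosen) >= budget:
            break
        # Keep the sample only if it is not too close to anything kept so far.
        if all(cosine(feature, kept) <= max_similarity for _, kept in chosen):
            chosen.append((sample_id, feature))
    return [sample_id for sample_id, _ in chosen]


# Hypothetical unlabelled pool: "b" is a near-duplicate of "a".
pool = [
    ("a", 0.9, [1.0, 0.0]),
    ("b", 0.8, [0.99, 0.01]),
    ("c", 0.5, [0.0, 1.0]),
]
selected = greedy_select(pool, budget=2)
```

Here `selected` contains "a" and "c": the more uncertain "b" is dropped because it adds little beyond "a", so the annotation budget goes to a more diverse sample.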


2018 ◽  
pp. 252-269
Author(s):  
Guangyu Wang ◽  
Xiaotian Wu ◽  
WeiQi Yan

The security of currency has attracted public attention. Despite the development of various anti-counterfeiting methods for currency notes, counterfeiters are still able to produce illegal copies and circulate them in the market without being detected. After reviewing related work in currency security, this paper presents a comparative study of feature extraction and classification algorithms for currency note authentication. We extract various computational features from a dataset consisting of US dollar (USD), Chinese Yuan (CNY) and New Zealand Dollar (NZD) notes and apply the classification algorithms to currency identification. Our contributions are to find and implement various algorithms from the existing literature and choose the best approaches for use.


Author(s):  
Mirko Luca Lobina ◽  
Luigi Atzori ◽  
Davide Mula

Many audio watermarking techniques presented in recent years make use of masking and psychological models derived from signal processing. This basic idea is successful because it guarantees a high level of robustness and bandwidth of the watermark as well as fidelity of the watermarked signal. This chapter first describes the relationship between digital rights management, intellectual property, and the use of watermarking techniques. Then, the combined use of watermarking and masking models is detailed, providing schemes, examples, and references. Finally, the authors present two strategies that make use of a masking model, applied to a classic watermarking technique. The joint use of classic frameworks and masking models seems to be one of the trends for the future of research in watermarking. Several tests comparing the proposed strategies with the state of the art are also offered to give an idea of how to assess the effectiveness of a watermarking technique.


2020 ◽  
Vol 10 (7) ◽  
pp. 2474
Author(s):  
Honglie Wang ◽  
Shouqian Sun ◽  
Lunan Zhou ◽  
Lilin Guo ◽  
Xin Min ◽  
...  

Vehicle re-identification is attracting an increasing amount of attention in intelligent transportation and is widely used in public security. In comparison to person re-identification, vehicle re-identification is more challenging because vehicles with different IDs are generated by a unified pipeline and can only be distinguished based on subtle differences in their features such as lights, ornaments, and decorations. In this paper, we propose a local feature-aware Siamese matching model for vehicle re-identification. The model focuses on the informative parts of an image, which are the parts most likely to differ among vehicles with different IDs. In addition, we utilize Siamese feature matching to better supervise our attention. Furthermore, a perspective transformer network, which can eliminate image deformation, has been designed for feature extraction. We have conducted extensive experiments on three large-scale vehicle re-ID datasets, i.e., VeRi-776, VehicleID, and PKU-VD, and the results show that our method is superior to the state-of-the-art methods.


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Ali Mohamad Mouazen ◽  
Ana Beatriz Hernández-Lara

Purpose: Smart cities attract efficient and profitable economic activities, contribute to the societal welfare of their citizens and foster the efficient use and conservation of natural resources. Developing smart cities has become a priority for many developed countries, but as they are preferred destinations for migrants, this raises sustainability issues. They attract people who are seeking a better quality of life, smart services and solutions, a better environment and business activities. The purpose of this paper is to review the state of the art on the relationship between smart cities and migration, with a view to determining sustainability.
Design/methodology/approach: A bibliometric review and text mining analyses were conducted on publications between 2000 and 2019.
Findings: The results determined the main parameters of this research topic in terms of its growth, top journals and articles. The role of sustainability in the relationship between smart cities and migration is also identified, highlighting the special interest of its social dimension.
Originality/value: A bibliometric approach has not been used previously to investigate the link between smart cities and migration. However, given the current relevance of both phenomena, their emergence and growth, this approach is appropriate in determining the state of the art and its main descriptors, with special emphasis on the sustainability implications.


Author(s):  
Denis Delfitto

This chapter surveys the state of the art on expletive negation (EN), by discussing: (i) the relationship between EN and negative concord; (ii) EN as a real negation; (iii) EN as a special formative linked to an additional evaluative/expressive layer in the semantics of language. Moreover, the chapter offers a potentially unifying analysis of EN in comparative, exclamative, and temporal clauses: EN as an operator of implicature denial. On this approach, the fact that EN is logically and compositionally independent from what is said follows from the fact that EN shifts the semantics of negation to the layer of implicated meaning. Some of the interpretive effects normally linked to the expressive/evaluative analysis of EN can arguably be derived as side-effects of this semantic analysis. The proposal advanced here has a number of implications regarding the relationship among morpho-syntax, pragmatic enrichment, and the non-incremental analysis of negation in theories of negation processing.


Author(s):  
Julio C. Díaz-Montes ◽  
Jesús Manuel Dorador-González

A review of the state of the art in prosthetic hands is presented; this review covers the most common commercial prostheses and prototypes under development. In this analysis, prosthetic devices were divided into six systems: actuation, reduction, blocking, transmission, flexion and support. The information obtained is presented according to those systems. The most important features of each system are presented together with their relationship with the performance of the entire prosthesis. An analysis of how prostheses take advantage of the capabilities of current technologies is presented. Recommendations for improving the performance of upper limb prostheses are proposed.


2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Xiaojun Lu ◽  
Xu Duan ◽  
Xiuping Mao ◽  
Yuanyuan Li ◽  
Xiangde Zhang

This paper proposes a method that uses feature fusion to represent images better for face detection after feature extraction by a deep convolutional neural network (DCNN). First, we learn features from data with Clarifai net and VGG Net-D (16 layers) separately; then we fuse the features extracted from the two nets. To obtain a more compact feature representation and reduce computational complexity, we reduce the dimension of the fused features by PCA. Finally, we classify each window as face or non-face with a binary SVM classifier. In particular, we exploit offset max-pooling to extract features densely with a sliding window, which leads to better matches between faces and detection windows; thus the detection result is more accurate. Experimental results show that our method can detect faces with severe occlusion and large variations in pose and scale. In particular, our method achieves an 89.24% recall rate on FDDB and 97.19% average precision on AFW.

