A unified framework for automated person re-identification

2020 ◽  
Vol 71 (7) ◽  
pp. 868-880
Author(s):  
Nguyen Hong-Quan ◽  
Nguyen Thuy-Binh ◽  
Tran Duc-Long ◽  
Le Thi-Lan

With the rapid growth of camera networks, video analysis systems have become increasingly popular and are used in a wide range of practical applications. In this paper, we focus on the person re-identification (person ReID) task, which is a crucial step in video analysis systems. The purpose of person ReID is to associate multiple images of a given person as that person moves through a network of non-overlapping cameras. Much effort has been devoted to person ReID. However, most studies deal only with well-aligned bounding boxes that are detected manually and treated as perfect inputs for person ReID. In practice, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may strongly affect ReID performance. The contribution of this paper is twofold. First, a unified framework for person ReID based on deep learning models is proposed. This framework couples a deep neural network for person detection with a deep-learning-based tracking method. In addition, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and used to evaluate all three steps of the fully automated person ReID framework.
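The final matching step of such a pipeline can be illustrated with a minimal sketch. It assumes that each detected and tracked person has already been reduced to a feature vector (for example by a ResNet-style network) and that gallery candidates are ranked by cosine similarity. The abstract does not state which similarity measure the authors use, so the metric, the function names, and the toy vectors below are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, treat as dissimilar
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return (person_id, similarity) pairs, most similar first.

    `gallery` is a list of (person_id, feature_vector) pairs.
    """
    scored = [(pid, cosine_similarity(query, feat)) for pid, feat in gallery]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy 3-dimensional features; real ReID features have far more dimensions.
query = [0.9, 0.1, 0.0]
gallery = [
    ("person_A", [0.8, 0.2, 0.1]),
    ("person_B", [0.0, 0.1, 0.9]),
    ("person_C", [0.5, 0.5, 0.0]),
]
ranking = rank_gallery(query, gallery)
```

In a fully automated system the query and gallery vectors would come from boxes produced by the detection and tracking steps, which is why errors in those steps propagate into the ranking.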


Author(s):  
Jiajia Liao ◽  
Yujun Liu ◽  
Yingchao Piao ◽  
Jinhe Su ◽  
Guorong Cai ◽  
...  

Recent advances in camera-equipped drone applications have increased the demand for deep-learning-based visual object detection algorithms for aerial images. A single deep learning model is limited in accuracy. Inspired by the fact that ensemble learning can significantly improve a model's generalization ability, we introduce a novel integration strategy that combines the inference results of two different methods without non-maximum suppression. In this paper, a global and local ensemble network (GLE-Net) is proposed to improve prediction quality by considering global weights for the different models and adjusting local weights for the bounding boxes. Specifically, the global module assigns different weights to the models. In the local module, we group the bounding boxes that correspond to the same object into a cluster. Each cluster generates a final predicted box, which is assigned the highest score in the cluster. Experiments on the VisDrone2019 benchmark show promising performance of GLE-Net compared with the baseline network.
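The clustering idea can be sketched as follows. This is not the GLE-Net implementation: the abstract only says that boxes belonging to the same object are grouped, that models carry global weights, and that the fused box receives the highest score in its cluster. The greedy IoU-based grouping, the 0.5 threshold, the weighted coordinate average, and every name below are assumptions chosen to make the example concrete.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(detections, model_weights, iou_threshold=0.5):
    """Fuse detections from several models into one box per cluster.

    `detections` is a list of (model_name, box, score) tuples.
    """
    # Visit confident boxes first so that they seed the clusters.
    ordered = sorted(detections, key=lambda d: d[2], reverse=True)
    clusters = []
    for det in ordered:
        for cluster in clusters:
            if iou(cluster[0][1], det[1]) >= iou_threshold:
                cluster.append(det)
                break
        else:
            clusters.append([det])

    fused = []
    for cluster in clusters:
        # Assumed local weight: global model weight times detection score.
        weights = [model_weights[name] * score for name, _, score in cluster]
        total = sum(weights)
        if total > 0:
            box = tuple(
                sum(w * b[i] for w, (_, b, _) in zip(weights, cluster)) / total
                for i in range(4)
            )
        else:
            box = cluster[0][1]  # fall back to the seed box
        # As in the abstract, the fused box takes the highest cluster score.
        fused.append((box, max(score for _, _, score in cluster)))
    return fused


detections = [
    ("model_a", (10, 10, 50, 50), 0.9),
    ("model_b", (12, 12, 52, 52), 0.6),
    ("model_b", (100, 100, 140, 140), 0.7),
]
fused = fuse_detections(detections, {"model_a": 0.6, "model_b": 0.4})
```

Unlike non-maximum suppression, which discards the overlapping lower-scored box, this kind of fusion lets every box in a cluster contribute to the final coordinates.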


Author(s):  
Ye Wang ◽  
Yueru Chen ◽  
Jongmoo Choi ◽  
C.-C. Jay Kuo

This paper reports a visible and thermal drone monitoring system that integrates deep-learning-based detection and tracking modules. The biggest challenge in adopting deep learning methods for drone detection is the paucity of drone images for training, especially thermal drone images. To address this issue, we develop two data augmentation techniques. One is a model-based drone augmentation technique that automatically generates visible drone images with a bounding box label at the drone's location. The other exploits an adversarial data augmentation methodology to create thermal drone images. To track a small flying drone, we utilize the residual information between consecutive image frames. Finally, we present an integrated detection and tracking system that outperforms each individual module performing detection or tracking only. The experiments show that, even when trained on synthetic data, the proposed system performs well on real-world drone images with complex backgrounds. The USC drone detection and tracking dataset with user-labeled bounding boxes is available to the public.
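The notion of residual information between consecutive frames can be illustrated with plain frame differencing. The abstract does not describe how the residual is computed or fed to the tracker, so the absolute-difference residual, the threshold, and the function names below are assumptions for illustration only.

```python
def frame_residual(prev_frame, curr_frame):
    """Per-pixel absolute difference between two grayscale frames.

    Frames are lists of rows, each row a list of intensity values.
    """
    return [
        [abs(curr - prev) for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def motion_bounding_box(residual, threshold):
    """Smallest box (x_min, y_min, x_max, y_max) covering changed pixels.

    Returns None when no pixel exceeds the threshold.
    """
    changed = [
        (x, y)
        for y, row in enumerate(residual)
        for x, value in enumerate(row)
        if value > threshold
    ]
    if not changed:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return (min(xs), min(ys), max(xs), max(ys))


# A static background with a small bright object appearing in the new frame.
prev_frame = [[10] * 5 for _ in range(4)]
curr_frame = [row[:] for row in prev_frame]
curr_frame[1][3] = 200
curr_frame[2][3] = 180

residual = frame_residual(prev_frame, curr_frame)
box = motion_bounding_box(residual, threshold=50)
```

Because a static background cancels out in the residual, a small moving object stands out even when it occupies only a few pixels.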


Author(s):  
Guillermo Hernández ◽  
Sara Rodríguez ◽  
Angélica González ◽  
Juan Manuel Corchado ◽  
Javier Prieto

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Mansheng Xiao ◽  
Yuezhong Wu ◽  
Guocai Zuo ◽  
Shuangnan Fan ◽  
Huijun Yu ◽  
...  

Next-generation networks are data-driven by design but face uncertainty due to changing user group patterns and the hybrid nature of the infrastructures running these systems. Meanwhile, the amount of data gathered in computer systems is increasing. How to classify and process this massive data so as to reduce data transmission in the network is a problem worth studying. Recent research uses deep learning to propose solutions for these and related issues. However, deep learning faces problems such as overfitting that may undermine its effectiveness in solving different network problems. This paper considers the overfitting problem of convolutional neural network (CNN) models in practical applications. An algorithm combining maximum pooling dropout and weight attenuation is proposed to avoid overfitting. First, maximum pooling dropout is applied in the pooling layer of the model to sparsify the neurons; then, regularization based on weight attenuation is introduced to reduce the complexity of the model when the gradient of the loss function is calculated by backpropagation. Theoretical analysis and experiments show that the proposed method can effectively avoid overfitting and can reduce the classification error rate by more than 10% on average compared with other methods. The proposed method can improve the quality of different deep-learning-based solutions designed for data management and processing in next-generation networks.
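The two ingredients can be sketched in a few lines. The sketch assumes that "maximum pooling dropout" means randomly zeroing activations inside each pooling window before taking the maximum, and that "weight attenuation" corresponds to the usual L2 weight-decay term added to the gradient. The abstract gives no formulas, so both interpretations, the hyperparameter values, and the function names are assumptions, and the activations are assumed non-negative (for example after ReLU).

```python
import random


def max_pool_dropout(feature_map, pool=2, retain_prob=0.5, rng=None):
    """Max pooling in which each activation is kept with `retain_prob`.

    Dropped activations are replaced by 0.0 before the maximum is taken.
    """
    if rng is None:
        rng = random.Random()
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, height - pool + 1, pool):
        row = []
        for left in range(0, width - pool + 1, pool):
            kept = []
            for y in range(top, top + pool):
                for x in range(left, left + pool):
                    value = feature_map[y][x]
                    kept.append(value if rng.random() < retain_prob else 0.0)
            row.append(max(kept))
        pooled.append(row)
    return pooled


def sgd_step_with_weight_decay(weights, grads, lr=0.1, weight_decay=0.01):
    """One gradient step with an L2 penalty: w <- w - lr * (g + decay * w)."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


feature_map = [
    [1, 2, 0, 1],
    [3, 4, 1, 0],
    [0, 1, 5, 6],
    [2, 0, 7, 8],
]
# With retain_prob=1.0 nothing is dropped, so this is ordinary max pooling.
plain = max_pool_dropout(feature_map, retain_prob=1.0)
# A seeded generator makes the stochastic version reproducible.
sparse = max_pool_dropout(feature_map, retain_prob=0.5, rng=random.Random(0))
updated = sgd_step_with_weight_decay([1.0], [0.5])
```

Dropout of this kind is applied only during training; at inference time the pooling layer would run without the random mask.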


2020 ◽  
Vol 10 (13) ◽  
pp. 4423
Author(s):  
Huu-Huy Ngo ◽  
Feng-Cheng Lin ◽  
Yang-Ting Sehn ◽  
Mengru Tu ◽  
Chyi-Ren Dow

Studies on room monitoring have focused only on objects in a single, uniform posture or on low-density groups. Considering the wide use of convolutional neural networks for object detection, especially person detection, we use deep learning and perspective correction techniques to propose a room monitoring system that can detect persons in different motion states, high-density groups, and persons who appear small because of their distance from the camera. This system uses consecutive frames from the monitoring camera as input images. Two approaches are used: perspective correction and person detection. First, perspective correction is used to transform an input image into a 2D top-view image. This allows users to observe the system more easily in different views (2D and 3D views). Second, the proposed person detection scheme combines the Mask region-based convolutional neural network (Mask R-CNN) scheme and the tile technique for person detection, especially for detecting small-sized persons. All results are stored in a cloud database. Moreover, new person coordinates in 2D images are generated from the final bounding boxes, and heat maps are created according to the 2D images; these enable users to examine the system quickly in different views. Additionally, a system prototype is developed to demonstrate the feasibility of the proposed system. Experimental results show that our proposed system outperforms existing schemes in terms of accuracy, mean absolute error (MAE), and root mean squared error (RMSE).
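A tile technique of this kind is commonly implemented by cutting the frame into overlapping windows, running the detector on each window, and shifting the resulting boxes back into full-image coordinates. The sketch below shows only that bookkeeping. The tile size, the overlap, and the function names are assumptions, since the abstract does not give the authors' tiling parameters, and the detector itself is left out.

```python
def tile_starts(length, tile, stride):
    """Start offsets along one axis so that tiles cover [0, length)."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile is flush with the border
    return starts


def tile_windows(width, height, tile, overlap):
    """Overlapping windows (x0, y0, x1, y1) covering the whole image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, stride)
        for x in tile_starts(width, tile, stride)
    ]


def to_image_coords(box, window):
    """Shift a box detected inside `window` into full-image coordinates."""
    x0, y0 = window[0], window[1]
    return (box[0] + x0, box[1] + y0, box[2] + x0, box[3] + y0)


windows = tile_windows(width=100, height=60, tile=60, overlap=20)
# A hypothetical detection at (5, 10, 15, 30) inside the second tile.
shifted = to_image_coords((5, 10, 15, 30), windows[1])
```

Persons who fall in the overlap between tiles are detected twice, so a real pipeline would still need a merging step on the shifted boxes.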


Sensors ◽  
2019 ◽  
Vol 19 (16) ◽  
pp. 3506 ◽  
Author(s):  
Christoffer Bøgelund Rasmussen ◽  
Thomas B. Moeslund

Efficient and robust evaluation of kernel processing in corn silage gives farmers an important indicator of the quality of their harvested crop. Current methods are cumbersome to conduct and take from hours to days. We present the adoption of two deep-learning-based methods for kernel processing prediction without the cumbersome step of separating kernels and stover before capturing images. The methods show that kernels can be detected both with bounding boxes and with pixel-level instance segmentation. Networks were trained on up to 1393 images containing just over 6907 manually annotated kernel instances. Both methods showed promising results despite the challenging setting, with an average precision at an intersection-over-union of 0.5 of 34.0% and 36.1% for the bounding-box and instance segmentation networks, respectively, on a test set consisting of images from three different harvest seasons. Additionally, the Kernel Processing Score (KPS) of the annotations correlated strongly with the KPS of the model predictions, with the best result at r(15) = 0.88, p = 0.00003. The adoption of deep-learning-based object recognition approaches for kernel processing measurement has the potential to shorten the quality assessment process to minutes, greatly aiding farmers in the strenuous harvesting season.
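The reported statistic is a Pearson correlation coefficient, and in the usual r(df) notation the 15 is the number of degrees of freedom, n - 2, which corresponds to 17 paired KPS values. A minimal sketch of the computation follows. The KPS numbers in it are invented placeholders for illustration, not data from the study, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Placeholder KPS values (percent), one pair per sample; not real data.
kps_annotated = [60.0, 65.0, 70.0, 75.0]
kps_predicted = [58.0, 66.0, 69.0, 77.0]

r = pearson_r(kps_annotated, kps_predicted)
degrees_of_freedom = len(kps_annotated) - 2
```

A high r shows that predictions rise and fall with the annotations, but it does not by itself show that the predicted KPS values are unbiased.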

