Spatio-temporal object detection by deep learning: Video-interlacing to improve multi-object tracking

2019 ◽  
Vol 88 ◽  
pp. 120-131 ◽  
Author(s):  
Ala Mhalla ◽  
Thierry Chateau ◽  
Najoua Essoukri Ben Amara

Author(s):
Dan Oneata ◽  
Jerome Revaud ◽  
Jakob Verbeek ◽  
Cordelia Schmid

2021 ◽  
Author(s):  
Mirela Beloiu ◽  
Dimitris Poursanidis ◽  
Samuel Hoffmann ◽  
Nektarios Chrysoulakis ◽  
Carl Beierkuhnlein

Recent advances in deep learning techniques for object detection and the availability of high-resolution images facilitate the analysis of both temporal and spatial vegetation patterns in remote areas. High-resolution satellite imagery has been used successfully to detect trees in small areas with homogeneous rather than heterogeneous forests, in which single tree species show a strong contrast to their neighbors and the surrounding landscape. However, no research to date has detected trees at the treeline in the remote and complex heterogeneous landscape of Greece using deep learning methods. We integrated high-resolution aerial images, climate data, and topographical characteristics to study treeline dynamics over 70 years in the Samaria National Park on the Mediterranean island of Crete, Greece. We combined mapping techniques with deep learning approaches to detect and analyze spatio-temporal dynamics in treeline position and tree density. We used visual image interpretation to detect single trees on high-resolution aerial imagery from 1945, 2008, and 2015. Using the RGB aerial images from 2008 and 2015, we tested a convolutional neural network (CNN) object detection approach (SSD) and a CNN-based segmentation technique (U-Net). Based on the mapping and deep learning approaches, we did not detect a shift in treeline elevation over the last 70 years, despite warming, although tree density increased. However, we show that the CNN approach accurately detects and maps tree position and density at the treeline. We also reveal that the treeline elevation on Crete varies with topography: it decreases from the southern to the northern study sites. We explain these differences between study sites by the long-term interaction between topographical characteristics and meteorological factors. The study highlights the feasibility of using deep learning and high-resolution imagery as a promising technique for monitoring forests in remote areas.
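As a rough illustration of how such a CNN detector can be applied to large aerial scenes, the sketch below tiles an orthophoto and runs a torchvision SSD model on each tile, shifting the detected boxes back into full-image coordinates. The fine-tuned weights file, the two-class layout (background + tree), and the tile size are assumptions for illustration, not details taken from the paper.

```python
# Hedged sketch: tile a large aerial orthophoto and run an SSD detector on each tile.
# The weights path ("tree_ssd.pth"), class layout, and tile size are assumptions.
import torch
from torchvision.models.detection import ssd300_vgg16

def detect_trees(image, tile=300, stride=300, score_thr=0.5, weights_path="tree_ssd.pth"):
    """image: float tensor [3, H, W] in [0, 1] (RGB aerial imagery)."""
    model = ssd300_vgg16(num_classes=2)  # background + "tree" (assumed class layout)
    model.load_state_dict(torch.load(weights_path, map_location="cpu"))
    model.eval()

    detections = []
    _, H, W = image.shape
    with torch.no_grad():
        for y in range(0, H - tile + 1, stride):
            for x in range(0, W - tile + 1, stride):
                crop = image[:, y:y + tile, x:x + tile]
                out = model([crop])[0]            # dict with 'boxes', 'scores', 'labels'
                boxes = out["boxes"][out["scores"] > score_thr]
                boxes[:, [0, 2]] += x             # shift tile coords back to the full image
                boxes[:, [1, 3]] += y
                detections.append(boxes)
    return torch.cat(detections) if detections else torch.empty(0, 4)
```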


2021 ◽  
pp. 143-152
Author(s):  
Daniel Cores ◽  
Víctor M. Brea ◽  
Manuel Mucientes

Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6358
Author(s):  
Youngkeun Lee ◽  
Sang-ha Lee ◽  
Jisang Yoo ◽  
Soonchul Kwon

Multi-object tracking is a significant field in computer vision since it provides essential information for video surveillance and analysis. Several deep learning-based approaches have been developed to improve the performance of multi-object tracking by applying the most accurate and efficient combinations of object detection models and appearance embedding extraction models. However, two-stage methods show a low inference speed since embedding extraction can only be performed after object detection. To alleviate this problem, single-shot methods, which perform object detection and embedding extraction simultaneously, have been developed and have drastically improved the inference speed; however, they trade accuracy for efficiency. Therefore, this study proposes an enhanced single-shot multi-object tracking system that improves accuracy while maintaining a high inference speed. With strong feature extraction and fusion, the object detector of our model achieves an AP score of 69.93% on the UA-DETRAC dataset and outperforms previous state-of-the-art methods such as FairMOT and JDE. Based on the improved object detection performance, our multi-object tracking system achieves a MOTA score of 68.5% and a PR-MOTA score of 24.5% on the same dataset, also surpassing the previous state-of-the-art trackers.
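The following is a minimal sketch of the data-association step that single-shot (JDE/FairMOT-style) trackers typically perform once detections and appearance embeddings have been produced in one network pass: track and detection embeddings are compared by cosine distance and matched with the Hungarian algorithm. The cost threshold and the plain cosine cost are illustrative assumptions, not the exact formulation used in this paper.

```python
# Minimal sketch of embedding-based data association for single-shot MOT.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_embs, det_embs, max_cost=0.4):
    """track_embs: [T, D], det_embs: [N, D]; returns list of (track_idx, det_idx)."""
    if len(track_embs) == 0 or len(det_embs) == 0:
        return []
    t = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    d = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    cost = 1.0 - t @ d.T                      # cosine distance
    rows, cols = linear_sum_assignment(cost)  # Hungarian matching
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < max_cost]
```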


Sensors ◽  
2020 ◽  
Vol 20 (2) ◽  
pp. 532 ◽  
Author(s):  
Antoine Mauri ◽  
Redouane Khemmar ◽  
Benoit Decoux ◽  
Nicolas Ragot ◽  
Romain Rossi ◽  
...  

In core computer vision tasks, we have witnessed significant advances in object detection, localisation, and tracking. However, methods that detect, localise, and track objects in road environments while meeting real-time constraints are still lacking. In this paper, our objective is to develop a deep learning multi-object detection and tracking technique applied to road smart mobility. Firstly, we propose an effective detector based on YOLOv3, which we adapt to our context. Subsequently, to successfully localise the detected objects, we put forward an adaptive method for extracting 3D information, i.e., depth maps. To do so, a comparative study is carried out between two approaches: Monodepth2 for monocular vision and MADNet for stereoscopic vision. These approaches are then evaluated on datasets containing depth information in order to determine which solution performs best under real-time conditions. Object tracking is necessary to mitigate the risk of collisions. Unlike traditional tracking approaches, which require target initialization beforehand, our approach uses information from object detection and distance estimation to initialize targets and track them afterwards. Specifically, we propose to improve the SORT approach for 3D object tracking by introducing an extended Kalman filter to better estimate the position of objects. Extensive experiments carried out on the KITTI dataset show that our proposal outperforms state-of-the-art approaches.
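As a hedged sketch of the 3D localisation step described above, the snippet below back-projects the centre of a 2D detection box into camera coordinates using a predicted depth map and the pinhole camera model. The intrinsics are placeholders, and any depth network returning a metric per-pixel depth map (e.g., Monodepth2 or MADNet) could supply `depth_map`; this illustrates the general idea, not the paper's exact pipeline.

```python
# Hedged sketch: lift a 2D detection to a 3D point using a depth map and pinhole intrinsics.
import numpy as np

def box_to_3d(box, depth_map, fx, fy, cx, cy):
    """box: (x1, y1, x2, y2) in pixels; depth_map: [H, W] metric depth; returns (X, Y, Z)."""
    u = int((box[0] + box[2]) / 2)
    v = int((box[1] + box[3]) / 2)
    # Median depth over a small window is more robust than a single pixel.
    win = depth_map[max(v - 2, 0):v + 3, max(u - 2, 0):u + 3]
    Z = float(np.median(win))
    X = (u - cx) * Z / fx
    Y = (v - cy) * Z / fy
    return X, Y, Z
```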


Computer vision tasks such as object detection, object tracking, gesture prediction, and video prediction are currently being solved effectively with deep learning models. Video frame prediction involves predicting the next few frames of a video given the previous frame or frames as input. The main challenge in video frame prediction is that the predicted future frames are blurry. This paper focuses on removing noise from the predicted frames using denoising autoencoders to address this issue. The proposed work trains an LSTM model that generates future frames from a sequence of input frames. The predicted output is then fed to a denoising autoencoder, which attempts to remove the blur. Our approach is implemented on the Moving MNIST dataset. The proposed method improves accuracy, and its results are compared with those of denoising autoencoders alone, an LSTM alone, and an LSTM combined with denoising autoencoders.
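A minimal PyTorch sketch of this two-stage idea is given below: an LSTM over flattened frames predicts the next frame, and a small convolutional denoising autoencoder cleans up the blurry prediction. The layer sizes are illustrative assumptions chosen for 64x64 Moving MNIST frames, not the paper's exact architecture.

```python
# Minimal sketch: LSTM frame predictor followed by a convolutional denoising autoencoder.
import torch
import torch.nn as nn

class FramePredictor(nn.Module):
    def __init__(self, frame_dim=64 * 64, hidden=1024):
        super().__init__()
        self.lstm = nn.LSTM(frame_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, frame_dim)

    def forward(self, frames):                       # frames: [B, T, 64*64]
        out, _ = self.lstm(frames)
        return torch.sigmoid(self.head(out[:, -1]))  # predicted next frame: [B, 64*64]

class DenoisingAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                                 nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                                 nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):                            # x: [B, 1, 64, 64]
        return self.dec(self.enc(x))

# Usage sketch: pred = FramePredictor()(seq).view(-1, 1, 64, 64); sharp = DenoisingAE()(pred)
```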


Author(s):  
M. N. Favorskaya ◽  
L. C. Jain

Introduction: Saliency detection is a fundamental task in computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches has been developed since the 1990s. In recent years, saliency detection has become one of the most actively studied topics in the theory of convolutional neural networks (CNNs). Many original CNN-based solutions have been proposed for salient object detection and even event detection. Purpose: A detailed survey of saliency detection methods in the deep learning era helps to understand the current possibilities of the CNN approach for visual analysis based on human eye tracking and digital image processing. Results: The survey reflects the recent advances in saliency detection using CNNs. Different models available in the literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection, are discussed in chronological order. It is worth noting that automatic salient event detection in long videos became possible using recently introduced 3D CNNs combined with 2D CNNs for salient audio detection. We also present a short description of public image and video datasets with annotated salient objects or events, as well as the metrics often used for evaluating the results. Practical relevance: This survey contributes to the study of rapidly developing deep learning methods for saliency detection in images and videos.
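For concreteness, below is a short sketch of two metrics commonly reported for salient object detection and referred to in the survey as often-used evaluation metrics: the mean absolute error between a predicted saliency map and the ground-truth mask, and the F-measure at a fixed threshold with the beta^2 = 0.3 weighting conventional in the saliency literature. This is a generic illustration, not code from the surveyed works.

```python
# Sketch of two standard salient-object-detection metrics: MAE and thresholded F-measure.
import numpy as np

def saliency_mae(pred, gt):
    """pred, gt: float arrays in [0, 1] of the same shape."""
    return float(np.mean(np.abs(pred - gt)))

def saliency_f_measure(pred, gt, threshold=0.5, beta2=0.3):
    binary = pred >= threshold
    gt_bin = gt >= 0.5
    tp = np.logical_and(binary, gt_bin).sum()
    precision = tp / (binary.sum() + 1e-8)
    recall = tp / (gt_bin.sum() + 1e-8)
    return float((1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8))
```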

