Resolving Conflicts in Object Tracking in Video Stream Employing Key Point Matching

Author(s):  
Grzegorz Szwoch
2013 ◽  
Vol 5 (2) ◽  
pp. 74-78


Author(s):  
Tomyslav Sledevič

The paper describes an FPGA-based implementation of a modified speeded-up robust features (SURF) algorithm. An FPGA was selected so that parallel processes, implemented in VHDL, ensure real-time feature extraction. A sliding 84×84 window stores integral-image pixels and accelerates the Hessian determinant calculation, orientation assignment and descriptor estimation. Local extremum search is used to find interest points in 8 scales. The simplified descriptor and the orientation vector are calculated in parallel in 6 scales. The algorithm was evaluated by tracking a four-point marker and drawing a plane or a cube from it. All parts of the algorithm run on a 25 MHz clock. The video stream was generated by a 60 fps, 640×480 pixel camera. Article in Lithuanian.
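The constant-time box sums that make the sliding-window Hessian cheap rest on the integral image. A minimal NumPy sketch of that core idea (function names are illustrative, not taken from the paper's VHDL design):

```python
import numpy as np

def integral_image(img):
    # ii[y, x] holds the sum of img[:y+1, :x+1]; this is the quantity
    # the 84x84 sliding window in the paper stores per pixel.
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, y0, x0, y1, x1):
    # Sum of pixels in the inclusive rectangle (y0, x0)-(y1, x1)
    # using at most 4 lookups, independent of the rectangle size;
    # SURF's box-filter Hessian approximation is built from such sums.
    s = ii[y1, x1]
    if y0 > 0:
        s -= ii[y0 - 1, x1]
    if x0 > 0:
        s -= ii[y1, x0 - 1]
    if y0 > 0 and x0 > 0:
        s += ii[y0 - 1, x0 - 1]
    return s
```

On an FPGA the four lookups map naturally to parallel reads from the window buffer, which is why the scheme suits a fixed 25 MHz pipeline.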


2020 ◽  
Author(s):  
Dominika Przewlocka ◽  
Mateusz Wasala ◽  
Hubert Szolc ◽  
Krzysztof Blachut ◽  
Tomasz Kryjak

This paper presents research on optimising visual object tracking with a Siamese neural network for embedded vision systems. It was assumed that the solution should operate in real time, preferably on a high-resolution video stream, with the lowest possible energy consumption. To meet these requirements, techniques such as reduced computational precision and pruning were considered. Brevitas, a tool for the optimisation and quantisation of neural networks for FPGA implementation, was used. A number of training scenarios were tested with varying levels of optimisation, from 16-bit integer uniform quantisation down to ternary and binary networks. Next, the influence of these optimisations on tracking performance was evaluated. It was possible to reduce the size of the convolutional filters up to 10 times relative to the original network. The obtained results indicate that quantisation can significantly reduce the memory and computational complexity of the proposed network while still enabling precise tracking, thus allowing its use in embedded vision systems. Moreover, weight quantisation positively affects network training by reducing overfitting.
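Brevitas automates precision reduction during training; as background only, a minimal NumPy sketch of symmetric uniform ("fake") quantisation of a weight tensor, assuming per-tensor scaling (the function name and details are illustrative, not the paper's configuration):

```python
import numpy as np

def quantise_uniform(w, bits):
    # Map weights onto 2^(bits-1) - 1 symmetric integer levels, then
    # dequantise; for bits = 2 this degenerates to a ternary network
    # with values {-max, 0, +max}, one of the extremes the paper tests.
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / levels
    if scale == 0:
        scale = 1.0  # all-zero tensor: avoid division by zero
    q = np.clip(np.round(w / scale), -levels, levels)
    return q * scale
```

Training with such quantised weights in the forward pass (while keeping full-precision gradients) is the usual quantisation-aware training scheme that tools like Brevitas implement for FPGA targets.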


2020 ◽  
Vol 56 (6) ◽  
pp. 642-648
Author(s):  
Yu. N. Zolotukhin ◽  
K. Yu. Kotov ◽  
A. A. Nesterov ◽  
E. D. Semenyuk

2014 ◽  
Vol 602-605 ◽  
pp. 1689-1692
Author(s):  
Cong Lin ◽  
Chi Man Pun

A novel visual object tracking method for colour video streams, based on the traditional particle filter, is proposed in this paper. Feature vectors are extracted from the coefficient matrices of a fast three-dimensional Discrete Cosine Transform (fast 3-D DCT). As experiments showed, the feature is very robust to occlusion and rotation and is not sensitive to scale changes. The proposed method is efficient enough for real-time applications. Experiments were carried out on datasets commonly used in the literature. The results are satisfactory and show that the estimated trace follows the target object very closely.
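A hedged sketch of the feature-extraction step, using SciPy's `dctn` for the 3-D DCT of an RGB patch; keeping only a low-frequency corner of the coefficient volume (the corner size `k` is an illustrative choice, not the paper's parameter):

```python
import numpy as np
from scipy.fft import dctn

def dct3_feature(patch, k=4):
    # 3-D DCT over height, width and colour channels of a tracked patch;
    # the low-frequency k x k x 3 corner forms a compact feature vector
    # that a particle filter can compare against its candidate patches.
    coeffs = dctn(patch.astype(np.float64), norm='ortho')
    return coeffs[:k, :k, :].ravel()
```

Low-frequency DCT coefficients capture coarse appearance, which is one plausible reason such a feature tolerates occlusion and small rotations better than raw pixels.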


2021 ◽  
Vol 1 ◽  
Author(s):  
Karim El Khoury ◽  
Jonathan Samelson ◽  
Benoît Macq

The extensive rise of high-definition CCTV camera footage has stimulated both the data-compression and data-analysis research fields. The increased awareness among citizens of the vulnerability of their private information creates a third challenge for the video surveillance community, which must also encompass privacy protection. In this paper, we aim to address these needs by proposing a deep-learning-based object tracking solution that operates on compressed-domain residual frames. The goal is to provide a public and privacy-friendly image representation for data analysis. We explore a scenario where tracking is performed directly on a restricted part of the information extracted from the compressed domain: we use exclusively the residual frames already generated by the video compression codec to train and test our network. This very compact representation also acts as an information filter, limiting the amount of private information leaked in a video stream. We show that using residual frames for deep-learning-based object tracking can be just as effective as using classical decoded frames. More precisely, residual frames are particularly beneficial in simple video surveillance scenarios with non-overlapping and continuous traffic.
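As an illustration only: a plain frame difference approximates the kind of residual a codec produces (real codec residuals are motion-compensated inside the encoder, which this sketch does not model). Static background, and with it much identifying detail, cancels out:

```python
import numpy as np

def residual_frame(prev, curr):
    # Signed difference between consecutive frames; pixels that did not
    # change drop to zero, so the representation retains mostly motion,
    # which is the privacy-filtering property exploited in the paper.
    return curr.astype(np.int16) - prev.astype(np.int16)
```

A tracker trained on such frames sees moving objects while most appearance detail of the static scene never reaches it.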



