ViTT: Vision Transformer Tracker

Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5608
Author(s):  
Xiaoning Zhu ◽  
Yannan Jia ◽  
Sun Jian ◽  
Lize Gu ◽  
Zhang Pu

This paper presents a new model for multi-object tracking (MOT) with a transformer. MOT is a spatiotemporal correlation task among objects of interest and one of the crucial technologies for multi-unmanned aerial vehicle (multi-UAV) systems. The transformer is a self-attention-based encoder-decoder architecture that has been successfully used in natural language processing and is emerging in computer vision. This study proposes the Vision Transformer Tracker (ViTT), which uses a transformer encoder as the backbone and takes images directly as input. Compared with convolutional networks, it can model global context at every encoder layer from the beginning, which addresses the challenges of occlusion and complex scenarios. The model simultaneously outputs object locations and corresponding appearance embeddings in a shared network through multi-task learning. Our work demonstrates the superiority and effectiveness of transformer-based networks in complex computer vision tasks and paves the way for applying the pure transformer in MOT. We evaluated the proposed model on the MOT16 dataset, where it achieved 65.7% MOTA, a competitive result compared with other typical multi-object trackers.
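For readers unfamiliar with the MOTA score quoted in the abstract above, the following minimal sketch shows how the metric is conventionally computed from error counts accumulated over all frames: MOTA = 1 - (FN + FP + IDSW) / GT. The function name and the counts in the usage note are hypothetical and are not taken from the paper.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy over a whole sequence.

    All four arguments are totals summed over every frame:
    missed objects, spurious detections, identity switches,
    and ground-truth object instances.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_score = mota(false_negatives=20, false_positives=10,
                     id_switches=5, num_gt=100)
```

With these illustrative counts, 35 errors against 100 ground-truth instances give a score of 0.65. Because the error total can exceed the number of ground-truth instances, MOTA can be negative.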

Author(s):  
Santosh Kumar Mishra ◽  
Rijul Dhir ◽  
Sriparna Saha ◽  
Pushpak Bhattacharyya

Image captioning is the process of generating a textual description of an image that describes its salient parts. It is an important problem because it combines computer vision, which is used for understanding images, with natural language processing, which is used for language modeling. Much work has been done on image captioning for the English language. In this article, we develop a model for image captioning in the Hindi language. Hindi is the official language of India, and it is the fourth most spoken language in the world, spoken in India and South Asia. To the best of our knowledge, this is the first attempt to generate image captions in the Hindi language. A dataset is manually created by translating the well-known MSCOCO dataset from English to Hindi. Different types of attention-based architectures are then developed for image captioning in the Hindi language. These attention mechanisms have not previously been applied to Hindi. The results of the proposed model are compared with several baselines in terms of BLEU scores, and they show that our model performs better than the others. Manual evaluation of the obtained captions in terms of adequacy and fluency also reveals the effectiveness of our proposed approach. Availability of resources: The code for the article is available at https://github.com/santosh1821cs03/Image_Captioning_Hindi_Language ; the dataset will be made available at http://www.iitp.ac.in/~ai-nlp-ml/resources.html .


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4227
Author(s):  
Nicolás Jacob-Loyola ◽  
Felipe Muñoz-La Rivera ◽  
Rodrigo F. Herrera ◽  
Edison Atencio

The physical progress of a construction project is monitored by an inspector responsible for verifying and backing up progress information, usually through site photography. Progress monitoring has improved, thanks to advances in image acquisition, computer vision, and the development of unmanned aerial vehicles (UAVs). However, no comprehensive and simple methodology exists to guide practitioners and facilitate the use of these methods. This research provides recommendations for the periodic recording of the physical progress of a construction site through the manual operation of UAVs and the use of point clouds obtained under photogrammetric techniques. The programmed progress is then compared with the actual progress made in a 4D BIM environment. This methodology was applied in the construction of a reinforced concrete residential building. The results showed the methodology is effective for UAV operation in the work site and the use of the photogrammetric visual records for the monitoring of the physical progress and the communication of the work performed to the project stakeholders.


Actuators ◽  
2018 ◽  
Vol 8 (1) ◽  
pp. 1 ◽  
Author(s):  
Sunan Huang ◽  
Rodney Swee Huat Teo ◽  
Wenqi Liu

It is well known that collision-free control is a crucial issue in the path planning of unmanned aerial vehicles (UAVs). In this paper, we explore a collision avoidance scheme for a multi-UAV system. The research is based on the concept of multi-UAV cooperation combined with information fusion. Using the fused information, the velocity obstacle method is adopted to design a decentralized collision avoidance algorithm. Four case studies are presented to demonstrate the effectiveness of the proposed method. The first two case studies verify whether UAVs can avoid a static circular or polygonal obstacle. The third case verifies whether a UAV can handle a temporary communication failure. The fourth case verifies whether UAVs can avoid other moving UAVs and static obstacles. Finally, a hardware-in-the-loop test is given to further illustrate the effectiveness of the proposed method.
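The velocity obstacle idea mentioned in the abstract above can be illustrated with a basic geometric test. The sketch below is not the paper's decentralized algorithm and does not use fused information. It only checks, in 2D and with an unbounded time horizon, whether the current velocity of one disc-shaped vehicle would eventually bring it into contact with another disc-shaped vehicle or obstacle, assuming both keep constant velocity. The function name and the disc model are assumptions made for illustration.

```python
def in_velocity_obstacle(p_a, v_a, p_b, v_b, combined_radius):
    """Return True if A's velocity lies inside the velocity obstacle of B.

    p_a, p_b: (x, y) positions; v_a, v_b: (x, y) velocities.
    combined_radius: sum of the two disc radii (hypothetical disc model).
    """
    # Position of B relative to A, and velocity of A relative to B.
    rx, ry = p_b[0] - p_a[0], p_b[1] - p_a[1]
    vx, vy = v_a[0] - v_b[0], v_a[1] - v_b[1]

    # Already overlapping: every velocity counts as colliding.
    if rx * rx + ry * ry <= combined_radius ** 2:
        return True

    # Not closing in (or no relative motion): no future collision.
    closing = rx * vx + ry * vy
    if closing <= 0:
        return False

    # The relative-velocity ray hits the disc when the perpendicular
    # distance from the disc centre to the ray is below the radius.
    cross = rx * vy - ry * vx
    return cross * cross < combined_radius ** 2 * (vx * vx + vy * vy)


# A flies straight at a static obstacle 10 m ahead: inside the obstacle set.
head_on = in_velocity_obstacle((0, 0), (1, 0), (10, 0), (0, 0), 1.0)
# A flies diagonally and passes well clear of it: outside the obstacle set.
diagonal = in_velocity_obstacle((0, 0), (1, 1), (10, 0), (0, 0), 1.0)
```

An avoidance scheme built on this test would choose a new velocity for which the function returns False with respect to every neighbour. How the candidate velocities are generated and ranked is a design choice that this sketch leaves open.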


2015 ◽  
Vol 4 (8) ◽  
pp. 46 ◽  
Author(s):  
Pedro Ortiz Coder

Techniques for graphical heritage documentation have improved in recent years. Modern photogrammetry and laser scanning both offer good quality for this purpose. In this document, we explain a simple photogrammetric method that yields accurate results. It is important to distinguish it from other, less accurate methods based on computer vision. The photogrammetric solution is applied in this test to pictures taken from an unmanned aerial vehicle (UAV) at an archaeological site in Extremadura.


Author(s):  
Maryna Zharikova ◽  
Vladimir Sherstjuk

In this chapter, the authors propose an approach to using a heterogeneous team of unmanned aerial vehicles and remote sensing techniques to perform tactical forest firefighting operations. The authors present the three-level architecture of the multi-UAV-based forest firefighting monitoring system; the features of patrolling, confirming, and monitoring missions; and the functions of the UAVs in such missions. The authors consider an infrastructure for UAV ground support and the equipment used for UAV control. A method for integrating data into a fire-spreading model in a real-time DSS for forest fire response is proposed. The proposed approach has been tested with a multi-UAV team that included three drones for the patrol missions, one helicopter for the confirmation mission, and one octocopter for the monitoring mission. The performance of this multi-UAV team has been studied under laboratory conditions. The experiment has shown that the proposed approach provides the required credibility and efficiency of fire prediction and response.


2019 ◽  
Vol 9 (15) ◽  
pp. 3196 ◽  
Author(s):  
Lidia María Belmonte ◽  
Rafael Morales ◽  
Antonio Fernández-Caballero

Personal assistant robots provide novel technological solutions in order to monitor people’s activities, helping them in their daily lives. In this sense, unmanned aerial vehicles (UAVs) can also bring forward a present and future model of assistant robots. To develop aerial assistants, it is necessary to address the issue of autonomous navigation based on visual cues. Indeed, navigating autonomously is still a challenge in which computer vision technologies tend to play an outstanding role. Thus, the design of vision systems and algorithms for autonomous UAV navigation and flight control has become a prominent research field in the last few years. In this paper, a systematic mapping study is carried out in order to obtain a general view of this subject. The study provides an extensive analysis of papers that address computer vision as regards the following autonomous UAV vision-based tasks: (1) navigation, (2) control, (3) tracking or guidance, and (4) sense-and-avoid. The works considered in the mapping study—a total of 144 papers from an initial set of 2081—have been classified under the four categories above. Moreover, type of UAV, features of the vision systems employed and validation procedures are also analyzed. The results obtained make it possible to draw conclusions about the research focuses, which UAV platforms are mostly used in each category, which vision systems are most frequently employed, and which types of tests are usually performed to validate the proposed solutions. The results of this systematic mapping study demonstrate the scientific community’s growing interest in the development of vision-based solutions for autonomous UAVs. Moreover, they will make it possible to study the feasibility and characteristics of future UAVs taking the role of personal assistants.


2020 ◽  
Vol 08 (04) ◽  
pp. 269-277
Author(s):  
Patricio Moreno ◽  
Santiago Esteva ◽  
Ignacio Mas ◽  
Juan I. Giribet

This work presents a multi-unmanned aerial vehicle formation implementing a trajectory-following controller based on the cluster-space robot coordination method. The controller is augmented with a feed-forward input from a control station operator. This teleoperation input is generated by means of a remote control, as a simple way of modifying the trajectory or taking over control of the formation during flight. The cluster-space formulation presents a simple specification of the system’s motion and, in this work, the operator benefits from this capability to easily evade obstacles by means of controlling the cluster parameters in real time. The proposed augmented controller is tested in a simulated environment first, and then deployed for outdoor field experiments. Results are shown in different scenarios using a cluster of three autonomous unmanned aerial vehicles.


2018 ◽  
Vol 92 ◽  
pp. 447-463 ◽  
Author(s):  
Abdulla Al-Kaff ◽  
David Martín ◽  
Fernando García ◽  
Arturo de la Escalera ◽  
José María Armingol

2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Xiaoxuan Hu ◽  
Jing Cheng ◽  
He Luo

This paper considers a task assignment problem for multiple unmanned aerial vehicles (UAVs). The UAVs are set to perform attack tasks on a collection of ground targets in a severely uncertain environment. The UAVs have different attack capabilities and are located at different positions. Each UAV should be assigned an attack task before the mission starts. Because of uncertain information, many criteria values essential to task assignment are random or fuzzy, and the weights of the criteria are not precisely known. In this study, a novel task assignment approach based on the stochastic multicriteria acceptability analysis (SMAA) method is proposed to address this problem. The uncertainties in the criteria are analyzed, and a task assignment procedure is designed. The results of simulation experiments show that the proposed approach is useful for finding a satisfactory assignment under severely uncertain circumstances.
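The core of SMAA, as referred to in the abstract above, is a Monte Carlo loop that samples the unknown weights and uncertain criteria values and records how often each alternative attains each rank. The sketch below illustrates only that generic step and is not the paper's assignment procedure. It assumes a linear additive utility, weights drawn uniformly from the simplex, and criteria scaled to [0, 1] with higher values being better. The function names and the toy distributions are hypothetical.

```python
import random


def smaa_rank_acceptability(sample_criteria, n_alternatives, n_criteria,
                            n_iter=2000, seed=0):
    """Estimate rank acceptability indices by Monte Carlo simulation.

    sample_criteria(rng) must return an n_alternatives x n_criteria list
    of criteria values drawn from their (assumed) distributions.
    Returns acc, where acc[i][r] is the share of samples in which
    alternative i obtained rank r (r = 0 is best).
    """
    rng = random.Random(seed)
    counts = [[0] * n_alternatives for _ in range(n_alternatives)]
    for _ in range(n_iter):
        # Normalised exponential draws are uniform on the weight simplex.
        draws = [rng.expovariate(1.0) for _ in range(n_criteria)]
        total = sum(draws)
        weights = [d / total for d in draws]

        values = sample_criteria(rng)
        utilities = [sum(w * x for w, x in zip(weights, row))
                     for row in values]

        # Rank alternatives from highest to lowest utility.
        order = sorted(range(n_alternatives),
                       key=lambda i: utilities[i], reverse=True)
        for rank, alt in enumerate(order):
            counts[alt][rank] += 1
    return [[c / n_iter for c in row] for row in counts]


def toy_criteria(rng):
    # Hypothetical data: alternative 0 is clearly better on both criteria.
    return [
        [rng.uniform(0.8, 1.0), rng.uniform(0.8, 1.0)],
        [rng.uniform(0.0, 0.2), rng.uniform(0.0, 0.2)],
    ]


acceptability = smaa_rank_acceptability(toy_criteria, 2, 2, n_iter=200)
```

In the toy case, alternative 0 takes first rank in every sample, so its first-rank acceptability is 1. With overlapping distributions, the indices spread across ranks and show how robust each choice is to the unknown weights.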


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Wei Tan ◽  
Yong-jiang Hu ◽  
Yue-fei Zhao ◽  
Wen-guang Li ◽  
Xiao-meng Zhang ◽  
...  

Unmanned aerial vehicles (UAVs) are increasingly used in different military missions. In this paper, we focus on the autonomous mission allocation and planning abilities of UAV systems. Such abilities enable adaptation to more complex and dynamic mission environments. We first examine the mission planning of a single unmanned aerial vehicle. Based on that, we then investigate the multi-UAV cooperative system under the mission background of cooperative target destruction and show that it is a many-to-one rendezvous problem. A heterogeneous UAV cooperative mission planning model is then proposed in which the mission background is generated based on the Voronoi diagram. We then adopt the tabu genetic algorithm (TGA) to obtain the multi-UAV mission plan. The simulation results show that single-UAV and multi-UAV mission planning can be effectively realized by the Voronoi diagram-TGA (V-TGA). It is also shown that the proposed algorithm improves the performance by 3% in comparison with the Voronoi diagram-particle swarm optimization (V-PSO) algorithm.

