A Distributed Automatic Video Annotation Platform

2020 ◽  
Vol 10 (15) ◽  
pp. 5319
Author(s):  
Md Anwarul Islam ◽  
Md Azher Uddin ◽  
Young-Koo Lee

In the era of digital devices and the Internet, thousands of videos are taken and shared through the Internet. Similarly, CCTV cameras in the digital city produce a large amount of video data that carry essential information. To handle the increased video data and generate knowledge, there is an increasing demand for distributed video annotation. Therefore, in this paper, we propose a novel distributed video annotation platform that explores the spatial information and temporal information. Afterward, we provide higher-level semantic information. The proposed framework is divided into two parts: spatial annotation and spatiotemporal annotation. We propose a spatiotemporal descriptor, namely, volume local directional ternary pattern-three orthogonal planes (VLDTP–TOP), implemented in a distributed manner using Spark. Moreover, we developed several state-of-the-art appearance-based and spatiotemporal-based feature descriptors on top of Spark. We also provide distributed video annotation services so that end-users can easily use the video annotation, as well as APIs that developers can use to produce new video annotation algorithms. Due to the lack of a spatiotemporal video annotation dataset that provides ground truth for both spatial and temporal information, we introduce a video annotation dataset, namely, STAD, which provides ground truth for spatial and temporal information. An extensive experimental analysis was performed to validate the performance and scalability of the proposed feature descriptors, and the results demonstrated the effectiveness of the proposed approach.

2020 ◽  
Vol 10 (3) ◽  
pp. 966
Author(s):  
Zeyu Jiao ◽  
Guozhu Jia ◽  
Yingjie Cai

In this study, we consider fully automated action recognition based on deep learning in the industrial environment. In contrast to most existing methods, which rely on professional knowledge to construct complex hand-crafted features, or only use basic deep-learning methods, such as convolutional neural networks (CNNs), to extract information from images in the production process, we exploit a novel and effective method, which integrates multiple deep-learning networks including CNNs, spatial transformer networks (STNs), and graph convolutional networks (GCNs) to process video data in industrial workflows. The proposed method extracts both spatial and temporal information from video data. The spatial information is extracted by estimating the human pose of each frame, and the skeleton image of the human body in each frame is obtained. Furthermore, multi-frame skeleton images are processed by GCN to obtain temporal information, meaning the action recognition results are predicted automatically. By training on a large human action dataset, Kinetics, we apply the proposed method to the real-world industrial environment and achieve superior performance compared with the existing methods.


2020 ◽  
Vol 325 ◽  
pp. 01007
Author(s):  
Haoxiang Su ◽  
Zhenghong Dong ◽  
Fan Yang

With the continuous innovation of optical remote sensing technology and the increasing demand for spatial information, satellite videos, which can provide higher spatial and temporal resolution, have received a lot of attention. Extracting moving vehicles from satellite videos is one of the most important tasks. By analyzing the shortcomings of current algorithms for extracting moving vehicles from satellite videos, and taking into account the characteristics of satellite videos and moving vehicles, this paper proposes an extraction algorithm in which some vehicles are first separated from the background using image extreme points and mean differences, and the moving vehicles are then extracted by joint detection of inter-frame vehicle motion. Based on the extracted moving vehicles, we also propose a method that can extract road masks using only three frames. Finally, we use Jilin-1 satellite video data to validate the proposed methods experimentally. We also compare the proposed methods with two other algorithms; the results show that the proposed methods greatly improve the accuracy and quality of moving vehicle detection in satellite videos.
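As a rough illustration of inter-frame motion detection in general, the sketch below uses plain three-frame differencing on toy grayscale frames. This is a generic technique swapped in for illustration, not the algorithm of the paper above, which relies on image extreme points and mean differences. The function name `frame_diff_mask`, the helper `make_frame`, and the threshold value are all hypothetical.

```python
def frame_diff_mask(prev, curr, nxt, threshold=30):
    """Return a binary mask of pixels that changed in both frame pairs.

    Frames are equally sized 2D lists of grayscale intensities (0-255).
    A pixel is marked as moving only if it differs from both the previous
    and the next frame by more than `threshold` (an assumed value).
    """
    rows, cols = len(curr), len(curr[0])
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            d1 = abs(curr[r][c] - prev[r][c])  # change since previous frame
            d2 = abs(nxt[r][c] - curr[r][c])   # change towards next frame
            if d1 > threshold and d2 > threshold:
                mask[r][c] = 1
    return mask


def make_frame(col, background=20, vehicle=200):
    """Build a toy 3x4 frame with one bright 'vehicle' pixel in row 1."""
    frame = [[background] * 4 for _ in range(3)]
    frame[1][col] = vehicle
    return frame


# The bright pixel moves one column to the right in each frame.
mask = frame_diff_mask(make_frame(0), make_frame(1), make_frame(2))
for row in mask:
    print(row)
```

Requiring a change in both frame pairs keeps only the object's position in the middle frame and suppresses the "ghost" left at its previous position, which is one common reason for using three frames rather than two.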


2020 ◽  
Vol 2020 (4) ◽  
pp. 116-1-116-7
Author(s):  
Raphael Antonius Frick ◽  
Sascha Zmudzinski ◽  
Martin Steinebach

In recent years, the number of forged videos circulating on the Internet has immensely increased. Software and services to create such forgeries have become more and more accessible to the public. In this regard, the risk of malicious use of forged videos has risen. This work proposes an approach, based on the Ghost effect known from image forensics, for detecting video forgeries that replace faces in video sequences or change the facial expression. The experimental results show that the proposed approach is able to identify forgery in high-quality encoded video content.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Masahiro Inagawa ◽  
Toshinobu Takei ◽  
Etsujiro Imanishi

Many cooking robots have been developed in response to the increasing demand for such robots. However, most existing robots must be programmed according to specific recipes to enable cooking using robotic arms, which requires considerable time and expertise. Therefore, this paper proposes a method to allow a robot to cook by analyzing recipes available on the internet, without any recipe-specific programming. The proposed method can be used to plan robot motion based on the analysis of the cooking procedure for a recipe. We developed a cooking robot to execute the proposed method and evaluated the effectiveness of this approach by analyzing 50 recipes. More than 25 recipes could be cooked using the proposed approach.


2021 ◽  
Vol 10 (3) ◽  
pp. 166
Author(s):  
Hartmut Müller ◽  
Marije Louwsma

The Covid-19 pandemic put a heavy burden on member states in the European Union. To govern the pandemic, having access to reliable geo-information is key for monitoring the spatial distribution of the outbreak over time. This study aims to analyze the role of spatio-temporal information in governing the pandemic in the European Union and its member states. The European Nomenclature of Territorial Units for Statistics (NUTS) system and selected national dashboards from member states were assessed to analyze which spatio-temporal information was used, how the information was visualized and whether this changed over the course of the pandemic. Initially, member states focused on their own jurisdiction by creating national dashboards to monitor the pandemic. Information between member states was not aligned. Producing reliable data and timely reporting was problematic, just like selecting indicators to monitor the spatial distribution and intensity of the outbreak. Over the course of the pandemic, with more knowledge about the virus and its characteristics, interventions of member states to govern the outbreak were better aligned at the European level. However, further integration and alignment of public health data, statistical data and spatio-temporal data could provide even better information for governments and actors involved in managing the outbreak, both at national and supra-national level. The Infrastructure for Spatial Information in Europe (INSPIRE) initiative and the NUTS system provide a framework to guide future integration and extension of existing systems.


Author(s):  
Wei Sun ◽  
Ethan Stoop ◽  
Scott S. Washburn

Florida’s interstate rest areas are heavily utilized by commercial trucks for overnight parking. Many of these rest areas regularly experience 100% utilization of available commercial truck parking spaces during the evening and early-morning hours. Being able to communicate availability of commercial truck parking space to drivers in advance of arriving at a rest area would reduce unnecessary stops at full rest areas as well as driver anxiety. In order to do this, it is critical to implement a vehicle detection technology that correctly reflects the parking status of the rest area. The objective of this project was to evaluate three different wireless in-pavement vehicle detection technologies as applied to commercial truck parking at interstate rest areas. This paper mainly focuses on the following aspects: (a) accuracy of the vehicle detection in parking spaces, (b) installation, setup, and maintenance of the vehicle detection technology, and (c) truck parking trends at the rest area study site. The final project report includes a more detailed summary of the evaluation. The research team recorded video of the rest areas as the ground-truth data and developed a software tool to compare the video data with the parking sensor data. Two accuracy tests (event accuracy and occupancy accuracy) were conducted to evaluate each sensor’s ability to reflect the status of each parking space correctly. Overall, it was found that all three technologies performed well, with accuracy rates of 95% or better for both tests. This result suggests that, for implementation, pricing and/or maintenance issues may be more significant factors in the choice of technology.
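The abstract above does not define its two accuracy tests, so the following is only a minimal sketch under assumed definitions: occupancy accuracy is taken as the fraction of time samples where the sensor state matches the video ground truth, and event accuracy as the ratio of detected to true state changes, capped at 1. The function names and the sample series are hypothetical.

```python
def occupancy_accuracy(sensor, truth):
    """Fraction of time samples where the sensor state matches ground truth."""
    if len(sensor) != len(truth) or not truth:
        raise ValueError("series must be non-empty and equally long")
    matches = sum(1 for s, t in zip(sensor, truth) if s == t)
    return matches / len(truth)


def count_events(series):
    """Count state changes (arrivals and departures) in a 0/1 series."""
    return sum(1 for a, b in zip(series, series[1:]) if a != b)


def event_accuracy(sensor, truth):
    """Ratio of detected to true events, capped at 1.0 (assumed definition)."""
    true_events = count_events(truth)
    if true_events == 0:
        return 1.0
    return min(count_events(sensor), true_events) / true_events


# One parking space sampled at ten time steps: 1 = occupied, 0 = empty.
truth = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1]   # from video (ground truth)
sensor = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]  # sensor reports arrival one step late

print(occupancy_accuracy(sensor, truth))  # 0.9
print(event_accuracy(sensor, truth))      # 1.0
```

In this toy series the sensor catches every arrival and departure but reports one arrival late, so the two measures diverge; this is one reason an evaluation might report both.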


2001 ◽  
Vol 10 (04) ◽  
pp. 715-734 ◽  
Author(s):  
SHU-CHING CHEN ◽  
MEI-LING SHYU ◽  
CHENGCUI ZHANG ◽  
R. L. KASHYAP

The identification of the overlapped objects is a great challenge in object tracking and video data indexing. For this purpose, a backtrack-chain-updation split algorithm is proposed to assist an unsupervised video segmentation method called the "simultaneous partition and class parameter estimation" (SPCPE) algorithm to identify the overlapped objects in the video sequence. The backtrack-chain-updation split algorithm can identify the split segment (object) and use the information in the current frame to update the previous frames in a backtrack-chain manner. The split algorithm provides more accurate temporal and spatial information of the semantic objects so that the semantic objects can be indexed and modeled by multimedia input strings and the multimedia augmented transition network (MATN) model. The MATN model is based on the ATN model that has been used in artificial intelligence (AI) areas for natural language understanding systems, and its inputs are modeled by the multimedia input strings. In this paper, we will show that the SPCPE algorithm together with the backtrack-chain-updation split algorithm can significantly enhance the efficiency of spatio-temporal video indexing by improving the accuracy of multimedia database queries related to semantic objects.


2019 ◽  
Vol 7 (2) ◽  
Author(s):  
Chien-Hong Chao ◽  
Huey-Wen Chou ◽  
Chih-Hao Tu

With the popularity of the Internet and the development of information technology, digital reading has affected human reading styles. In essence, digital reading is different from conventional reading in many ways. This research primarily explores the differences in reading behaviors among different digital reading devices. Results reveal that the reading experience on the Tablet PC is superior to that on the other two digital devices. Subjects in the Tablet PC group demonstrate the highest preference in terms of depth reading, which implies that the Tablet PC should be the most appropriate device for digital learning platforms in the future. Discussion and suggestions are given in the conclusions at the end of this paper.


eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Avner Wallach ◽  
Erik Harvey-Girard ◽  
James Jaeyoon Jun ◽  
André Longtin ◽  
Len Maler

Learning the spatial organization of the environment is essential for most animals’ survival. This requires the animal to derive allocentric spatial information from egocentric sensory and motor experience. The neural mechanisms underlying this transformation are mostly unknown. We addressed this problem in electric fish, which can precisely navigate in complete darkness and whose brain circuitry is relatively simple. We conducted the first neural recordings in the preglomerular complex, the thalamic region exclusively connecting the optic tectum with the spatial learning circuits in the dorsolateral pallium. While tectal topographic information was mostly eliminated in preglomerular neurons, the time-intervals between object encounters were precisely encoded. We show that this reliable temporal information, combined with a speed signal, can permit accurate estimation of the distance between encounters, a necessary component of path-integration that enables computing allocentric spatial relations. Our results suggest that similar mechanisms are involved in sequential spatial learning in all vertebrates.
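The relation described above, that time intervals between encounters combined with a speed signal yield distances, can be sketched as a simple product of speed and elapsed time. The version below assumes a constant speed and is only an illustration of the arithmetic, not the neural model from the paper; the function name and the sample values are hypothetical.

```python
def distances_between_encounters(encounter_times, speed):
    """Estimate the distance travelled between successive object encounters.

    encounter_times: increasing timestamps (seconds) of object encounters.
    speed: assumed constant speed (distance units per second).
    """
    intervals = [t1 - t0 for t0, t1 in zip(encounter_times, encounter_times[1:])]
    return [speed * dt for dt in intervals]


times = [0.0, 2.0, 5.0]  # three encounters
speed = 4.0              # constant speed, an assumption for this sketch

distances = distances_between_encounters(times, speed)
print(distances)       # [8.0, 12.0]
print(sum(distances))  # 20.0, total path length between first and last encounter
```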

