Toward Scalable Video Analytics Using Compressed-Domain Features at the Edge

2020 ◽  
Vol 10 (18) ◽  
pp. 6391
Author(s):  
Dien Van Nguyen ◽  
Jaehyuk Choi

Intelligent video analytics systems have come to play an essential role in many fields, including public safety, transportation safety, and many other industrial areas, serving as automated tools for extracting data from and analyzing huge datasets such as multiple live video streams transmitted from large numbers of cameras. A key characteristic of such systems is that real-time analytics is critical for providing timely, actionable alerts on various tasks, activities, and conditions. Due to the computation-intensive and bandwidth-intensive nature of these operations, however, video analytics servers may not fulfill the requirements when serving a large number of cameras simultaneously. To handle these challenges, we present an edge computing-based system that minimizes the transfer of video data from surveillance camera feeds to a cloud video analytics server. Based on a novel approach that utilizes information from the encoded bitstream, the edge can track objects in surveillance videos with low processing complexity and filter non-motion frames out of the data forwarded to the cloud server. To demonstrate the effectiveness of our approach, we implemented a video surveillance prototype consisting of edge devices with low computational capacity and a GPU-enabled server. The evaluation results show that our method efficiently captures frame characteristics and is compatible with the edge-to-cloud platform in terms of accuracy and delay sensitivity. The average processing time of this method is approximately 39 ms/frame for high-definition video, which outperforms most state-of-the-art methods. In the scenario implementation of the proposed system, the method reduces the cloud server's GPU load by 49%, its CPU load by 49%, and network traffic by 55% while maintaining the accuracy of video analytics event detection.
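
A minimal sketch of the frame-filtering idea follows: deciding from compressed-domain motion vectors whether a frame contains motion before forwarding it. This is an illustration rather than the authors' implementation; the motion-vector layout, thresholds, and helper names are assumptions, and it presumes the vectors have already been extracted from the encoded bitstream (e.g., by a decoder configured to export them).

```python
import math
from typing import Iterable, List, Tuple

# Illustrative layout: (src_x, src_y, dst_x, dst_y) per motion vector.
MotionVector = Tuple[int, int, int, int]

def frame_has_motion(motion_vectors: List[MotionVector],
                     magnitude_threshold: float = 2.0,
                     min_active_vectors: int = 10) -> bool:
    """Flag a frame as containing motion when enough motion vectors
    exceed a magnitude threshold (both thresholds are illustrative)."""
    active = 0
    for src_x, src_y, dst_x, dst_y in motion_vectors:
        if math.hypot(dst_x - src_x, dst_y - src_y) >= magnitude_threshold:
            active += 1
            if active >= min_active_vectors:
                return True
    return False

def frames_to_forward(frames_with_mvs: Iterable[Tuple[bytes, List[MotionVector]]]):
    """Yield only frames whose compressed-domain motion vectors indicate
    activity, so the edge forwards a reduced stream to the cloud server."""
    for frame, mvs in frames_with_mvs:
        if frame_has_motion(mvs):
            yield frame
```

Because the decision uses only bitstream metadata, static scenes can be filtered without full pixel-domain processing, which is what keeps the per-frame cost low on weak edge devices.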

2018 ◽  
Author(s):  
Richard Robert Suminski Jr ◽  
Gregory Dominick ◽  
Philip Sapanaro

BACKGROUND A considerable proportion of outdoor physical activity takes place on sidewalks and streets. For example, we found that ~70% of adults who walked during the previous week used the sidewalks/streets around their homes. Interventions conducted at geographical levels (e.g., community) and studies examining relationships between environmental conditions (e.g., traffic) and walking/biking necessitate a reliable measure of physical activities performed on sidewalks/streets. The Block Walk Method (BWM) is one of the more common approaches available for this purpose. Although it utilizes reliable observation techniques and displays criterion validity, it has remained relatively unchanged since its introduction in 2006. It is a non-technical, labor-intensive, first-generation method. Advancing the BWM would contribute significantly to our understanding of physical activity behavior. OBJECTIVE The objective of the proposed study is therefore to develop and test a new BWM that utilizes a wearable video device (WVD) and computer video analysis to assess physical activities performed on sidewalks/streets. The following aims will be completed to accomplish this objective. Aim 1: Improve the BWM by incorporating a WVD into the methodology. The WVD is a pair of eyeglasses with a high-definition video camera embedded in the frames. We expect the WVD to be a viable option for improving the acquisition and accuracy of data collected using the BWM. Aim 2: Advance the WVD-enhanced BWM by applying machine learning and recognition software to automatically extract information on physical activities occurring on the sidewalks/streets from the videos. METHODS Trained observers (one wearing and one not wearing the WVD) will walk together at a set pace along predetermined, 1000 ft sidewalk/street observation routes representing low, medium, and high walkable areas. During the walks, the non-WVD observer will use the traditional BWM to record the number of individuals standing/sitting, walking, biking, and running along the routes. The WVD observer will only record video while walking. Later, two investigators will view the videos to determine the numbers of individuals performing physical activities along the routes. For Aim 2, the video data will be analyzed automatically using multiple deep convolutional neural networks (CNNs) to determine the number of humans along an observation route as well as the types of physical activities being performed. Bland–Altman methods and intraclass correlation coefficients will be used to assess agreement. Potential sources of error such as occlusions (e.g., trees) will be assessed using moderator analyses. RESULTS Outcomes from this study are pending; however, preliminary studies supporting the research protocol indicate that the BWM is reliable and that the number of individuals seen walking along routes correlates with several environmental characteristics (e.g., traffic, sidewalk defects). Further, we have used CNNs to detect cars, bikes, and pedestrians, as well as individuals using park facilities. CONCLUSIONS We expect the new approach to enhance measurement accuracy while reducing the burden of data collection. In the future, the capabilities of the WVD-CNN system will be expanded to allow for the determination of other characteristics captured by the videos, such as caloric expenditure and environmental conditions.
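
A rough illustration of the Aim 2 counting step: use a pretrained, off-the-shelf detector to count pedestrians and cyclists in a single video frame. The protocol does not name the networks to be used, so the model choice, score threshold, and label mapping below are assumptions.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

# Pretrained COCO detector as a stand-in for the study's CNNs.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

COCO_PERSON, COCO_BICYCLE = 1, 2  # COCO category ids

@torch.no_grad()
def count_activity(frame_rgb, score_threshold: float = 0.7) -> dict:
    """Count detected pedestrians and cyclists in one frame (H x W x 3 uint8)."""
    prediction = model([to_tensor(frame_rgb)])[0]
    keep = prediction["scores"] >= score_threshold
    labels = prediction["labels"][keep]
    return {
        "pedestrians": int((labels == COCO_PERSON).sum()),
        "cyclists": int((labels == COCO_BICYCLE).sum()),
    }
```

Per-route totals would then be aggregated across frames, with tracking or de-duplication needed to avoid counting the same person repeatedly.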


Author(s):  
Jaber Almutairi ◽  
Mohammad Aldossary

Recently, the number of Internet of Things (IoT) devices connected to the Internet has increased dramatically, as has the data produced by these devices. This requires offloading IoT tasks to release their heavy computation and storage demands to resource-rich nodes such as edge computing and cloud computing. Although edge computing is a promising enabler for latency-sensitive applications, its deployment produces new challenges. Moreover, different service architectures and offloading strategies have different impacts on the service time performance of IoT applications. Therefore, this paper presents a novel approach for task offloading in an Edge-Cloud system in order to minimize the overall service time for latency-sensitive applications. This approach adopts fuzzy logic algorithms, considering application characteristics (e.g., CPU demand, network demand, and delay sensitivity) as well as resource utilization and resource heterogeneity. A number of simulation experiments are conducted to evaluate the proposed approach against other related approaches; it is found to improve the overall service time for latency-sensitive applications and to utilize edge-cloud resources effectively. The results also show that different offloading decisions within the Edge-Cloud system can lead to varying service times due to differences in computational resources and communication types.
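
A fuzzy-logic offloading decision of the kind described can be sketched with hand-rolled membership functions. The rule base, membership shapes, and input set below are illustrative assumptions, not the paper's actual rules.

```python
def ramp_up(x: float, a: float, b: float) -> float:
    """Degree to which x is 'high': 0 below a, 1 above b, linear in between."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

def offload_score(cpu_demand: float,
                  delay_sensitivity: float,
                  edge_utilization: float) -> float:
    """Crisp score in [0, 1]; higher favors offloading to the cloud.
    All inputs are normalized to [0, 1]."""
    high_cpu   = ramp_up(cpu_demand, 0.3, 0.7)        # fuzzification
    delay_crit = ramp_up(delay_sensitivity, 0.3, 0.7)
    edge_busy  = ramp_up(edge_utilization, 0.5, 0.9)

    to_cloud = min(high_cpu, edge_busy)               # rule: heavy task, busy edge
    to_edge  = min(delay_crit, 1.0 - edge_busy)       # rule: delay-critical, idle edge

    total = to_cloud + to_edge                        # weighted-average defuzzification
    return 0.5 if total == 0 else to_cloud / total

# A CPU-heavy, delay-tolerant task on a busy edge node goes to the cloud.
target = "cloud" if offload_score(0.8, 0.2, 0.9) > 0.5 else "edge"
```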


2021 ◽  
Vol 13 (3) ◽  
pp. 63
Author(s):  
Maghsoud Morshedi ◽  
Josef Noll

Video conferencing services based on the web real-time communication (WebRTC) protocol are growing in popularity among Internet users as multi-platform solutions enabling interactive communication from anywhere, especially during this pandemic era. Meanwhile, Internet service providers (ISPs) have deployed fiber links and customer premises equipment that operate according to recent 802.11ac/ax standards and promise users the ability to establish uninterrupted video conferencing calls with ultra-high-definition video and audio quality. However, the best-effort nature of 802.11 networks and the high variability of wireless medium conditions hinder users from experiencing uninterrupted high-quality video conferencing. This paper presents a novel approach to estimating the perceived quality of service (PQoS) of video conferencing using only 802.11-specific network performance parameters collected from Wi-Fi access points (APs) on customer premises. This study produced datasets comprising 802.11-specific network performance parameters collected from off-the-shelf Wi-Fi APs operating under the 802.11g/n/ac/ax standards on both 2.4 and 5 GHz frequency bands to train machine learning algorithms. In this way, we achieved classification accuracies of 92–98% in estimating the level of PQoS of video conferencing services on various Wi-Fi networks. To efficiently troubleshoot wireless issues, we further analyzed the machine learning model to correlate features in the model with the root cause of quality degradation. Thus, ISPs can utilize the approach presented in this study to provide predictable and measurable wireless quality by implementing a non-intrusive quality monitoring approach in the form of edge computing that preserves customers' privacy while reducing the operational costs of monitoring and data analytics.
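
The training step can be sketched as follows: fit a classifier on AP-side 802.11 features, then inspect feature importances to hint at root causes, mirroring the paper's model-analysis step. The feature names, model choice, and random placeholder data are assumptions; the study's actual parameters and algorithm are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical AP-side 802.11 metrics; placeholders for the paper's feature set.
FEATURES = ["rssi", "retry_rate", "channel_utilization", "phy_rate", "client_count"]

rng = np.random.default_rng(0)
X = rng.random((1000, len(FEATURES)))   # placeholder measurements
y = rng.integers(0, 3, 1000)            # placeholder PQoS levels (good/fair/poor)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

# Ranking features by importance suggests which wireless conditions drive
# quality degradation, supporting root-cause troubleshooting.
for name, importance in sorted(zip(FEATURES, model.feature_importances_),
                               key=lambda pair: -pair[1]):
    print(f"{name}: {importance:.3f}")
```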


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4045
Author(s):  
Alessandro Sassu ◽  
Jose Francisco Saenz-Cogollo ◽  
Maurizio Agelli

Edge computing is the best approach for meeting the exponential demand and the real-time requirements of many video analytics applications. Since most of the recent advances in extracting information from images and video rely on computation-heavy deep learning algorithms, there is a growing need for solutions that allow the deployment and use of new models on scalable and flexible edge architectures. In this work, we present Deep-Framework, a novel open-source framework for developing edge-oriented real-time video analytics applications based on deep learning. Deep-Framework has a scalable multi-stream architecture based on Docker and abstracts away from the user the complexity of cluster configuration, service orchestration, and GPU resource allocation. It provides Python interfaces for integrating deep learning models developed with the most popular frameworks, and it also provides high-level APIs based on standard HTTP and WebRTC interfaces for consuming the extracted video data on clients running in browsers or on any other web-based platform.
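
Consuming extracted results from a web-based client could look roughly like the sketch below. The abstract only states that Deep-Framework exposes HTTP and WebRTC interfaces; the route, port, and response shape here are hypothetical placeholders, not the framework's documented API.

```python
import requests

# Hypothetical endpoint; consult the Deep-Framework documentation for real routes.
RESULTS_URL = "http://edge-node:8000/api/streams/cam01/results"

def fetch_results() -> list:
    """Poll the (assumed) HTTP API for the latest extracted video data."""
    response = requests.get(RESULTS_URL, timeout=5)
    response.raise_for_status()
    return response.json()  # e.g., detected objects with timestamps

if __name__ == "__main__":
    for item in fetch_results():
        print(item)
```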


Author(s):  
Friedrich Knuth ◽  
Leila Belabassi ◽  
Lori Garzio ◽  
Michael Smith ◽  
Michael Vardaro ◽  
...  

Author(s):  
Brendan J. Russo ◽  
Emmanuel James ◽  
Cristopher Y. Aguilar ◽  
Edward J. Smaglik

In the past two decades, cell phone and smartphone use in the United States has increased substantially. Although mobile phones provide a convenient way for people to communicate, the distraction caused by the use of these devices has led to unintended traffic safety and operational consequences. While it is well recognized that distracted driving is extremely dangerous for all road users (including pedestrians), the potential impacts of distracted walking have not been as comprehensively studied. Even though practitioners should design facilities with the safety, efficiency, and comfort of pedestrians in mind, it is still important to investigate certain pedestrian behaviors at existing facilities to minimize the risk of pedestrian–vehicle crashes and to reduce behaviors that may unnecessarily increase delay at signalized intersections. To gain new insights into factors associated with distracted walking, pedestrian violations, and walking speed, 3,038 pedestrians were observed across four signalized intersections in New York and Arizona using high-definition video cameras. The video data were reduced and summarized, and an ordinary least squares (OLS) regression model was estimated to analyze factors affecting walking speeds. In addition, binary logit models were estimated to analyze both pedestrian distraction and pedestrian violations. Ultimately, several site- and pedestrian-specific variables were found to be significantly associated with pedestrian distraction, violation behavior, and walking speeds. The results provide important information for researchers, practitioners, and legislators, and may be useful in planning strategies to reduce or mitigate the impacts of pedestrian behavior that may be considered unsafe or potentially inefficient.
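
For readers unfamiliar with the modeling step, a binary logit for distraction can be estimated as sketched below. The covariates and data are illustrative placeholders, not the study's variables or results.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Placeholder observational data; real covariates would include site- and
# pedestrian-specific variables coded from the video.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "distracted": rng.integers(0, 2, 500),      # 1 = phone use observed
    "group_size": rng.integers(1, 5, 500),
    "crossing_time_s": rng.normal(12, 3, 500),
})

X = sm.add_constant(df[["group_size", "crossing_time_s"]])
logit_model = sm.Logit(df["distracted"], X).fit(disp=False)
print(logit_model.summary())  # coefficients, standard errors, p-values
```

Walking speeds would be handled analogously with an OLS specification (sm.OLS) on the same kind of design matrix.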


Author(s):  
Urvish Trivedi ◽  
Jonielle McDonnough ◽  
Muhaimen Shamsi ◽  
Andrez Izurieta Ochoa ◽  
Alec Braynen ◽  
...  

Detecting humans and objects while walking is a very difficult problem for people with visual impairment. To safely avoid collisions with objects or humans and to navigate from one location to another, it is important to know how far away an obstacle is and what kind of obstacle the user is facing. In recent years, much research has shown that providing different vibration stimuli can be very useful for conveying important information to the user. In this paper, we present our stereovision system with a high-definition camera to detect and identify humans and obstacles in real time, and compare it with a modified version of an existing wearable haptic belt that uses high-performance ultrasonic sensors. The aim of this paper is to present the practicability of the stereovision system over the cane and assistive technologies such as the vibrotactile belt. The study is based on two assistive technologies. The first consists of the vibrotactile belt connected to ultrasonic sensors and an accelerometer that returns user movement and speed information to the microcontroller. The microcontroller initiates expressive vibrotactile stimulation based on sensor data. Data gathered from this technology will be used as the baseline for comparison with our stereovision system. Second, we present a novel approach to detecting the type of obstacle using an object recognition algorithm and the best approach to avoiding it using stereovision feedback. Data gathered from this technology will be compared against the baseline data from the vibrotactile belt. In addition, we present the results of the comparative study, which show that the stereovision system has numerous advantages over the vibrotactile belt.
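
The stereovision range estimate at the heart of the comparison can be sketched with a standard block-matching pipeline. The calibration values, region of interest, and matcher parameters below are illustrative assumptions rather than the authors' configuration.

```python
import cv2
import numpy as np

FOCAL_LENGTH_PX = 700.0   # focal length in pixels (from calibration; assumed)
BASELINE_M = 0.12         # distance between the two cameras in meters (assumed)

stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)

def obstacle_distance(left_gray: np.ndarray, right_gray: np.ndarray) -> float:
    """Estimate the distance (m) to the closest point in the central image
    region from a rectified grayscale stereo pair."""
    disparity = stereo.compute(left_gray, right_gray).astype(np.float32) / 16.0
    h, w = disparity.shape
    center = disparity[h // 3: 2 * h // 3, w // 3: 2 * w // 3]
    valid = center[center > 0]
    if valid.size == 0:
        return float("inf")  # no reliable stereo match in view
    # depth = focal_length * baseline / disparity; the largest disparity
    # corresponds to the nearest point.
    return FOCAL_LENGTH_PX * BASELINE_M / float(valid.max())
```

The estimated distance would then be mapped to a feedback cue for the user, analogous to the belt's ultrasonic readings.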


Author(s):  
Daniel Danso Essel ◽  
Ben-Bright Benuwa ◽  
Benjamin Ghansah

Sparse Representation (SR) and Dictionary Learning (DL) based classifiers have shown promising results in classification tasks, with impressive recognition rates on image data. In Video Semantic Analysis (VSA), however, the local structure of video data contains significant discriminative information required for classification. To the best of our knowledge, this has not been fully explored by recent DL-based approaches. Further, similar sparse codes are not consistently obtained for video features belonging to the same category. Based on the foregoing, a novel learning algorithm, Sparsity-based Locality-Sensitive Discriminative Dictionary Learning (SLSDDL) for VSA, is proposed in this paper. In the proposed algorithm, a category-specific discriminant loss function based on the sparse coding coefficients is introduced into the structure of the Locality-Sensitive Dictionary Learning (LSDL) algorithm. Finally, the sparse coefficients for a test video feature sample are solved with the optimized SLSDDL method, and the video semantic classification result is obtained by minimizing the error between the original and reconstructed samples. The experimental results show that the proposed SLSDDL significantly improves the performance of video semantic detection compared with state-of-the-art approaches. The proposed approach also shows robustness to diverse video environments, demonstrating its universality.
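
The final classification step resembles the standard sparse-representation decision rule, sketched below; the SLSDDL-specific locality and discriminant terms are omitted here.

```latex
% Sparse coding of the test feature, then minimum-reconstruction-error decision:
\hat{\alpha} = \arg\min_{\alpha} \; \lVert y - D\alpha \rVert_2^2
             + \lambda \lVert \alpha \rVert_1 ,
\qquad
\hat{c} = \arg\min_{c} \; \lVert y - D\,\delta_c(\hat{\alpha}) \rVert_2 ,
```

Here y is the test video feature, D the learned dictionary, lambda a sparsity weight, and delta_c(.) keeps only the coefficients associated with class c, so the predicted class is the one whose sub-dictionary reconstructs y with minimum error.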


Author(s):  
Sheila M. Pinto-Cáceres ◽  
Jurandy Almeida ◽  
Vânia P. A. Neris ◽  
M. Cecília C. Baranauskas ◽  
Neucimar J. Leite ◽  
...  

The fast evolution of technology has led to a growing demand for video data, increasing the amount of research into efficient systems for managing such material. Making efficient use of video information requires that data be accessible in a user-friendly way. Ideally, one would like to perform video search using an intuitive tool. Most existing browsers for interactive video search, however, employ overly rigid layouts to arrange the results, restricting users to exploring them through list- or grid-based views. This paper presents a novel approach to interactive search that displays the result set in a flexible manner. The proposed method is based on a simple and fast algorithm for building video stories and on an effective visual structure for arranging the storyboards, called Clustering Set. It groups together videos with similar content and organizes the result set in a well-defined tree. Results from a rigorous empirical comparison with a subjective evaluation show that such a strategy makes navigation more coherent and engaging for users.
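
The grouping behind the Clustering Set can be approximated with ordinary hierarchical clustering over per-video descriptors. The descriptors, linkage choice, and group count below are illustrative stand-ins, not the paper's algorithm.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(2)
video_features = rng.random((20, 128))   # one descriptor per video (placeholder)

# Build a tree of videos by content similarity, then cut it into story groups.
tree = linkage(video_features, method="average", metric="cosine")
groups = fcluster(tree, t=4, criterion="maxclust")   # 4 storyboard groups
print(groups)                                        # group label per video
```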

