Status on Video Data Reduction and Air Delivery Payload Pose Estimation

Author(s):  
Oleg Yakimenko ◽  
Robert Berlind ◽  
Chad Albright
2016 ◽  
pp. 8-13
Author(s):  
Daniel Reynolds ◽  
Richard A. Messner

Video copy detection is the process of comparing and analyzing videos to extract a measure of their similarity in order to determine whether they are copies, modified versions, or entirely different videos. With video frame sizes increasing rapidly, a data-reduction step is important for achieving fast video comparisons. Further, monitoring the streaming and storage of legal and illegal video data necessitates fast, efficient implementations of video copy detection algorithms. In this paper, several commonly used video copy detection algorithms are implemented with the Log-Polar transformation as a pre-processing step that reduces the frame size prior to signature calculation. Two global-based algorithms were chosen to validate the use of Log-Polar as an acceptable data-reduction stage. The results of this research demonstrate that this pre-processing step significantly reduces the computation time of the overall video copy detection process without significantly affecting the detection accuracy of the algorithm used.
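As a concrete illustration of the data-reduction idea described above, the sketch below resamples a frame onto a small log-polar grid and then computes a simple global signature. This is a minimal numpy sketch, not the paper's implementation; the function names, the 64 × 64 grid size, and the histogram-based signature are illustrative assumptions.

```python
import numpy as np

def log_polar_reduce(frame, out_shape=(64, 64)):
    """Resample a grayscale frame onto a log-polar grid centred on the image.

    Radii are spaced logarithmically, angles uniformly, so an arbitrarily
    large frame shrinks to a small fixed-size representation before
    signature calculation.
    """
    h, w = frame.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    n_rho, n_theta = out_shape
    max_r = min(cy, cx)
    # logarithmic radius samples from 1 px out to the largest inscribed circle
    rho = np.exp(np.linspace(0.0, np.log(max_r), n_rho))
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(rho, theta, indexing="ij")
    ys = np.clip(np.round(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return frame[ys, xs]

def global_signature(reduced):
    """A simple global signature: the normalised intensity histogram."""
    hist, _ = np.histogram(reduced, bins=32, range=(0, 256))
    return hist / max(hist.sum(), 1)

def similarity(sig_a, sig_b):
    """Histogram-intersection similarity in [0, 1]; 1 means identical."""
    return float(np.minimum(sig_a, sig_b).sum())
```

Because the signature is computed on a 64 × 64 grid rather than the full frame, the per-frame cost of signature calculation becomes independent of the source resolution, which is where the speed-up comes from.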


2020 ◽  
Vol 34 (07) ◽  
pp. 10631-10638
Author(s):  
Yu Cheng ◽  
Bo Yang ◽  
Bo Wang ◽  
Robby T. Tan

Estimating 3D poses from a monocular video remains a challenging task, despite the significant progress made in recent years. Generally, the performance of existing methods drops when the target person is too small/large, or the motion is too fast/slow, relative to the scale and speed of the training data. Moreover, to our knowledge, many of these methods are not explicitly designed or trained to handle severe occlusion, which compromises their performance when occlusion occurs. Addressing these problems, we introduce a spatio-temporal network for robust 3D human pose estimation. Because humans in videos appear at different scales and move at various speeds, we apply multi-scale spatial features for 2D joint or keypoint prediction in each individual frame, and multi-stride temporal convolutional networks (TCNs) to estimate the 3D joints or keypoints. Furthermore, we design a spatio-temporal discriminator based on body structures as well as limb motions to assess whether the predicted pose forms a valid pose and a valid movement. During training, we explicitly mask out some keypoints to simulate occlusion cases from minor to severe, so that our network learns to be robust to varying degrees of occlusion. As 3D ground-truth data are limited, we further utilize 2D video data to inject a semi-supervised learning capability into our network. Experiments on public datasets validate the effectiveness of our method, and our ablation studies show the strengths of our network’s individual submodules.
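The occlusion-simulation step described above, masking out keypoints during training, can be sketched as randomly hiding a subset of joints. This is a minimal numpy sketch under assumed conventions (a (J, 2) keypoint array with a per-joint visibility flag), not the authors' code; `max_masked` controls how severe the simulated occlusion gets.

```python
import numpy as np

def mask_keypoints(keypoints, visibility, max_masked=4, rng=None):
    """Simulate occlusion by hiding a random subset of 2D keypoints.

    keypoints:  (J, 2) array of 2D joint coordinates
    visibility: (J,)  array of 1.0 (visible) / 0.0 (occluded)
    Returns masked copies; hidden joints get zeroed coordinates and
    visibility 0, so the network must infer them from spatio-temporal
    context instead of direct observation.
    """
    if rng is None:
        rng = np.random.default_rng()
    kp = keypoints.copy()
    vis = visibility.copy()
    n_joints = kp.shape[0]
    n_mask = rng.integers(0, max_masked + 1)   # minor to severe occlusion
    idx = rng.choice(n_joints, size=n_mask, replace=False)
    kp[idx] = 0.0
    vis[idx] = 0.0
    return kp, vis
```

Applied fresh at every training iteration, this gives the network a different occlusion pattern each time it sees a sequence, which is what drives the robustness the abstract reports.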


2018 ◽  
Author(s):  
Alexander Mathis ◽  
Richard Warren

Pose estimation is crucial for many applications in neuroscience, biomechanics, genetics and beyond. We recently presented DeepLabCut, a highly efficient method for markerless pose estimation based on transfer learning with deep neural networks. Current experiments produce vast amounts of video data, which pose challenges for both storage and analysis. Here we improve the inference speed of DeepLabCut by up to tenfold and benchmark these updates on various CPUs and GPUs. In particular, depending on the frame size, poses can be inferred offline at up to 1200 frames per second (FPS); for instance, 278 × 278 images can be processed at 225 FPS on a GTX 1080 Ti graphics card. Furthermore, we show that DeepLabCut is highly robust to standard video compression (ffmpeg): compression rates greater than 1,000 decrease accuracy by only about half a pixel (for a 640 × 480 frame size). DeepLabCut’s speed and robustness to compression can save both time and hardware expenses.
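For intuition about the compression rates quoted above, one common definition is the raw (uncompressed RGB) video size divided by the encoded file size. The snippet below is a small sketch under that assumed definition; the paper may define the rate differently.

```python
def compression_rate(n_frames, width, height, compressed_bytes, bytes_per_pixel=3):
    """Compression rate = raw uncompressed RGB size / encoded file size."""
    raw_bytes = n_frames * width * height * bytes_per_pixel
    return raw_bytes / compressed_bytes

# e.g. a 10,000-frame 640x480 clip encoded down to ~9 MB
rate = compression_rate(10_000, 640, 480, 9_000_000)  # -> 1024.0
```

Under this definition, a ten-minute 640 × 480 clip encoded to a few megabytes easily exceeds a rate of 1,000, which is the regime where the abstract reports only about half a pixel of accuracy loss.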


2021 ◽  
Vol 17 (4) ◽  
pp. e1008914
Author(s):  
Laetitia Hebert ◽  
Tosif Ahamed ◽  
Antonio C. Costa ◽  
Liam O’Shaughnessy ◽  
Greg J. Stephens

An important model system for understanding genes, neurons and behavior, the nematode worm C. elegans naturally moves through a variety of complex postures, for which estimation from video data is challenging. We introduce an open-source Python package, WormPose, for 2D pose estimation in C. elegans, including self-occluded, coiled shapes. We leverage advances in machine vision afforded from convolutional neural networks and introduce a synthetic yet realistic generative model for images of worm posture, thus avoiding the need for human-labeled training. WormPose is effective and adaptable for imaging conditions across worm tracking efforts. We quantify pose estimation using synthetic data as well as N2 and mutant worms in on-food conditions. We further demonstrate WormPose by analyzing long (∼ 8 hour), fast-sampled (∼ 30 Hz) recordings of on-food N2 worms to provide a posture-scale analysis of roaming/dwelling behaviors.
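The core idea above, a generative model that renders labeled images so no human annotation is needed, can be sketched in toy form: draw a parametric centerline and rasterize it with a body width, keeping the centerline as the ground-truth label. This numpy toy is far simpler than WormPose's realistic image model; all names, the sinusoidal posture, and the parameters are illustrative assumptions.

```python
import numpy as np

def synthetic_worm_frame(shape=(128, 128), amplitude=12.0, phase=0.0,
                         half_width=3.0):
    """Render a toy synthetic worm image from a parametric centerline.

    The centerline is a sinusoid (a stand-in for a real posture model);
    pixels within half_width of the curve are set bright. Returns the
    image together with the ground-truth centerline used to draw it.
    """
    h, w = shape
    xs = np.linspace(10, w - 10, 200)
    ys = h / 2 + amplitude * np.sin(2 * np.pi * xs / w + phase)
    img = np.zeros(shape, dtype=np.uint8)
    yy, xx = np.mgrid[0:h, 0:w]
    # distance from every pixel to the nearest centerline sample
    d = np.min(np.hypot(xx[..., None] - xs, yy[..., None] - ys), axis=-1)
    img[d <= half_width] = 255
    return img, np.stack([xs, ys], axis=1)
```

Because every rendered frame comes with its exact generating posture, a network trained on such pairs learns pose estimation without any hand labeling, which is the property the abstract highlights.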


2021 ◽  
Vol 33 (3) ◽  
pp. 547-555
Author(s):  
Hitoshi Habe ◽  
Yoshiki Takeuchi ◽  
Kei Terayama ◽  
Masa-aki Sakagami ◽  
...  

We propose a pose estimation method for fish schools using a National Advisory Committee for Aeronautics (NACA) airfoil model. This method allows one to understand the state in which fish are swimming from their posture and its dynamic variations; moreover, their collective behavior can be understood from their posture changes. Fish pose is therefore a crucial indicator for collective behavior analysis. We use the NACA model to represent the fish posture; because the model describes posture dynamics, this enables more accurate tracking and movement prediction. To fit the model to video data, we first adopt the DeepLabCut toolbox to detect body parts (i.e., head, center, and tail fin) in an image sequence. Subsequently, we apply a particle filter to fit a set of parameters from the NACA model. The results from DeepLabCut, i.e., three points on the fish body, are used to adjust the components of the state vector, which yields more reliable estimates when the speed and direction of the fish change abruptly. Experimental results on both simulated data and real video data demonstrate that the proposed method performs well, even when the swimming direction changes rapidly.
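For readers unfamiliar with the NACA model, the standard symmetric four-digit thickness distribution is a closed-form polynomial; a fish-body outline can then be obtained by offsetting a (possibly curved) midline by ±y_t. The formula below is the standard NACA one; its use here as the body model is a sketch of the paper's approach, not the authors' exact parameterization.

```python
import numpy as np

def naca_half_thickness(x, t=0.12):
    """Half-thickness of a symmetric NACA 4-digit airfoil at chordwise x in [0, 1].

    t is the maximum thickness as a fraction of chord (t = 0.12 for a
    NACA 0012). The curve starts at zero at the nose, peaks near 30% of
    the chord, and tapers toward the tail.
    """
    return 5.0 * t * (0.2969 * np.sqrt(x) - 0.1260 * x
                      - 0.3516 * x**2 + 0.2843 * x**3 - 0.1015 * x**4)

x = np.linspace(0.0, 1.0, 101)
yt = naca_half_thickness(x)   # peaks at ~0.06 chord near x ~= 0.3 for t = 0.12
```

The appeal of this model for fish is that a single thickness parameter plus a midline describes the whole body outline, which keeps the particle filter's state vector small.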


Author(s):  
Cara Piazza ◽  
Joseph Schroeder ◽  
Chiahao Lu ◽  
Arthur Erdman ◽  
Matthew Johnson ◽  
...  

Patients who suffer from Parkinson’s disease are more prone to postural instability, a major risk factor for falls. One of the most common clinical methods of gauging the severity of a patient’s postural instability is the retropulsion test [1], in which a clinician perturbs the patient’s balance and then rates their response to the perturbation. This test is subjective, relying largely on the clinician’s observations. To improve postural instability diagnosis and encourage more meaningful therapies for this cognitive-motor symptom, there is a clinical need for more objective, quantifiable approaches to measuring postural instability. In this paper, we describe a novel computational approach to quantifying the number, length, and trajectory of steps taken during a retropulsion test or other balance perturbation, using a single camera facing the anterior side (front) of the subject. The computational framework first analyzes the video data with markerless pose estimation algorithms to track the movement of the subject’s feet. These pixel data are then converted from 2D to 3D using calibrated transformation functions and checked for consistency against known step lengths. The testing data showed step-length estimation accurate to within 1 cm, suggesting this computational approach could have utility in a variety of clinical environments.
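The pixel-to-metric conversion described above can be sketched, under the simplifying assumption that the feet stay on a ground plane, as a calibrated pixel-to-ground homography followed by distances between consecutive foot placements. The matrix `H` and the function names are illustrative; the paper's exact calibrated transformation functions are not specified here.

```python
import numpy as np

def pixels_to_ground(points_px, H):
    """Map 2D pixel coordinates to ground-plane coordinates via a homography.

    points_px: (N, 2) pixel positions of a tracked foot
    H:         (3, 3) pixel-to-ground homography, assumed known from a
               prior calibration step
    """
    pts = np.hstack([points_px, np.ones((points_px.shape[0], 1))])
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]   # homogeneous divide

def step_lengths(foot_positions_m):
    """Euclidean distance between consecutive foot placements (metres)."""
    return np.linalg.norm(np.diff(foot_positions_m, axis=0), axis=1)
```

Comparing the output of `step_lengths` against measured step lengths is the consistency check the abstract describes, and the ~1 cm figure bounds the residual error of the calibrated mapping.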

