Status on Video Data Reduction and Air Delivery Payload Pose Estimation

Author(s):  
Oleg Yakimenko ◽  
Robert Berlind ◽  
Chad Albright
2016 ◽  
pp. 8-13
Author(s):  
Daniel Reynolds ◽  
Richard A. Messner

Video copy detection is the process of comparing and analyzing videos to extract a measure of their similarity in order to determine whether they are copies, modified versions, or entirely different videos. With video frame sizes increasing rapidly, a data-reduction step is important for achieving fast video comparisons. Further, monitoring the streaming and storage of legal and illegal video data necessitates fast, efficient implementations of video copy detection algorithms. In this paper, several commonly used video copy detection algorithms are implemented with the Log-Polar transformation as a pre-processing step that reduces the frame size prior to signature calculation. Two global-based algorithms were chosen to validate the use of Log-Polar as an acceptable data-reduction stage. The results of this research demonstrate that this pre-processing step significantly reduces the computation time of the overall video copy detection process without significantly affecting the detection accuracy of the algorithm used.
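As a concrete illustration of the data-reduction idea described above, the sketch below resamples a frame onto a small log-polar grid and then computes a simple global signature. This is a minimal numpy sketch, not the paper's implementation; the function names, the 64 × 64 grid size, and the histogram-based signature are illustrative assumptions.

```python
import numpy as np

def log_polar_reduce(frame, out_shape=(64, 64)):
    """Resample a grayscale frame onto a log-polar grid centred on the image.

    Radii are spaced logarithmically, angles uniformly, so an arbitrarily
    large frame shrinks to a small fixed-size representation before
    signature calculation.
    """
    h, w = frame.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    n_rho, n_theta = out_shape
    max_r = min(cy, cx)
    # logarithmic radius samples from 1 px out to the largest inscribed circle
    rho = np.exp(np.linspace(0.0, np.log(max_r), n_rho))
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(rho, theta, indexing="ij")
    ys = np.clip(np.round(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return frame[ys, xs]

def global_signature(reduced):
    """A simple global signature: the normalised intensity histogram."""
    hist, _ = np.histogram(reduced, bins=32, range=(0, 256))
    return hist / max(hist.sum(), 1)

def similarity(sig_a, sig_b):
    """Histogram-intersection similarity in [0, 1]; 1 means identical."""
    return float(np.minimum(sig_a, sig_b).sum())
```

Because the signature is computed on a 64 × 64 grid rather than the full frame, the per-frame cost of signature calculation becomes independent of the source resolution, which is where the speed-up comes from.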


2020 ◽  
Vol 34 (07) ◽  
pp. 10631-10638
Author(s):  
Yu Cheng ◽  
Bo Yang ◽  
Bo Wang ◽  
Robby T. Tan

Estimating 3D poses from a monocular video remains a challenging task, despite the significant progress made in recent years. Generally, the performance of existing methods drops when the target person is too small/large, or the motion is too fast/slow, relative to the scale and speed of the training data. Moreover, to our knowledge, many of these methods are not explicitly designed or trained to handle severe occlusion, which compromises their performance when occlusion occurs. Addressing these problems, we introduce a spatio-temporal network for robust 3D human pose estimation. Because humans in videos appear at different scales and move at various speeds, we apply multi-scale spatial features for 2D joint or keypoint prediction in each individual frame, and multi-stride temporal convolutional networks (TCNs) to estimate the 3D joints or keypoints. Furthermore, we design a spatio-temporal discriminator based on body structures as well as limb motions to assess whether the predicted pose forms a valid pose and a valid movement. During training, we explicitly mask out some keypoints to simulate occlusion cases from minor to severe, so that our network learns to be robust to varying degrees of occlusion. As 3D ground-truth data are limited, we further utilize 2D video data to inject a semi-supervised learning capability into our network. Experiments on public datasets validate the effectiveness of our method, and our ablation studies show the strengths of our network’s individual submodules.
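The occlusion-simulation step described above, masking out keypoints during training, can be sketched as randomly hiding a subset of joints. This is a minimal numpy sketch under assumed conventions (a (J, 2) keypoint array with a per-joint visibility flag), not the authors' code; `max_masked` controls how severe the simulated occlusion gets.

```python
import numpy as np

def mask_keypoints(keypoints, visibility, max_masked=4, rng=None):
    """Simulate occlusion by hiding a random subset of 2D keypoints.

    keypoints:  (J, 2) array of 2D joint coordinates
    visibility: (J,)  array of 1.0 (visible) / 0.0 (occluded)
    Returns masked copies; hidden joints get zeroed coordinates and
    visibility 0, so the network must infer them from spatio-temporal
    context instead of direct observation.
    """
    if rng is None:
        rng = np.random.default_rng()
    kp = keypoints.copy()
    vis = visibility.copy()
    n_joints = kp.shape[0]
    n_mask = rng.integers(0, max_masked + 1)   # minor to severe occlusion
    idx = rng.choice(n_joints, size=n_mask, replace=False)
    kp[idx] = 0.0
    vis[idx] = 0.0
    return kp, vis
```

Applied fresh at every training iteration, this gives the network a different occlusion pattern each time it sees a sequence, which is what drives the robustness the abstract reports.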


2018 ◽  
Author(s):  
Alexander Mathis ◽  
Richard Warren

Pose estimation is crucial for many applications in neuroscience, biomechanics, genetics and beyond. We recently presented DeepLabCut, a highly efficient method for markerless pose estimation based on transfer learning with deep neural networks. Current experiments produce vast amounts of video data, which pose challenges for both storage and analysis. Here we improve the inference speed of DeepLabCut by up to tenfold and benchmark these updates on various CPUs and GPUs. In particular, depending on the frame size, poses can be inferred offline at up to 1200 frames per second (FPS); for instance, 278 × 278 images can be processed at 225 FPS on a GTX 1080 Ti graphics card. Furthermore, we show that DeepLabCut is highly robust to standard video compression (ffmpeg): compression rates greater than 1,000 decrease accuracy by only about half a pixel (for a 640 × 480 frame size). DeepLabCut’s speed and robustness to compression can save both time and hardware expenses.
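For intuition about the compression rates quoted above, one common definition is the raw (uncompressed RGB) video size divided by the encoded file size. The snippet below is a small sketch under that assumed definition; the paper may define the rate differently.

```python
def compression_rate(n_frames, width, height, compressed_bytes, bytes_per_pixel=3):
    """Compression rate = raw uncompressed RGB size / encoded file size."""
    raw_bytes = n_frames * width * height * bytes_per_pixel
    return raw_bytes / compressed_bytes

# e.g. a 10,000-frame 640x480 clip encoded down to ~9 MB
rate = compression_rate(10_000, 640, 480, 9_000_000)  # -> 1024.0
```

Under this definition, a ten-minute 640 × 480 clip encoded to a few megabytes easily exceeds a rate of 1,000, which is the regime where the abstract reports only about half a pixel of accuracy loss.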


2021 ◽  
Vol 17 (4) ◽  
pp. e1008914
Author(s):  
Laetitia Hebert ◽  
Tosif Ahamed ◽  
Antonio C. Costa ◽  
Liam O’Shaughnessy ◽  
Greg J. Stephens

An important model system for understanding genes, neurons and behavior, the nematode worm C. elegans naturally moves through a variety of complex postures, for which estimation from video data is challenging. We introduce an open-source Python package, WormPose, for 2D pose estimation in C. elegans, including self-occluded, coiled shapes. We leverage advances in machine vision afforded from convolutional neural networks and introduce a synthetic yet realistic generative model for images of worm posture, thus avoiding the need for human-labeled training. WormPose is effective and adaptable for imaging conditions across worm tracking efforts. We quantify pose estimation using synthetic data as well as N2 and mutant worms in on-food conditions. We further demonstrate WormPose by analyzing long (∼ 8 hour), fast-sampled (∼ 30 Hz) recordings of on-food N2 worms to provide a posture-scale analysis of roaming/dwelling behaviors.
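The core idea above, a generative model that renders labeled images so no human annotation is needed, can be sketched in toy form: draw a parametric centerline and rasterize it with a body width, keeping the centerline as the ground-truth label. This numpy toy is far simpler than WormPose's realistic image model; all names, the sinusoidal posture, and the parameters are illustrative assumptions.

```python
import numpy as np

def synthetic_worm_frame(shape=(128, 128), amplitude=12.0, phase=0.0,
                         half_width=3.0):
    """Render a toy synthetic worm image from a parametric centerline.

    The centerline is a sinusoid (a stand-in for a real posture model);
    pixels within half_width of the curve are set bright. Returns the
    image together with the ground-truth centerline used to draw it.
    """
    h, w = shape
    xs = np.linspace(10, w - 10, 200)
    ys = h / 2 + amplitude * np.sin(2 * np.pi * xs / w + phase)
    img = np.zeros(shape, dtype=np.uint8)
    yy, xx = np.mgrid[0:h, 0:w]
    # distance from every pixel to the nearest centerline sample
    d = np.min(np.hypot(xx[..., None] - xs, yy[..., None] - ys), axis=-1)
    img[d <= half_width] = 255
    return img, np.stack([xs, ys], axis=1)
```

Because every rendered frame comes with its exact generating posture, a network trained on such pairs learns pose estimation without any hand labeling, which is the property the abstract highlights.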


2021 ◽  
Vol 33 (3) ◽  
pp. 547-555
Author(s):  
Hitoshi Habe ◽  
Yoshiki Takeuchi ◽  
Kei Terayama ◽  
Masa-aki Sakagami ◽  
...  

We propose a pose estimation method for fish schools using a National Advisory Committee for Aeronautics (NACA) airfoil model. This method allows one to understand the state in which fish are swimming from their posture and its dynamic variations; moreover, their collective behavior can be understood from their posture changes. Fish pose is therefore a crucial indicator for collective behavior analysis. We use the NACA model to represent the fish posture; because the model describes posture dynamics, this enables more accurate tracking and movement prediction. To fit the model to video data, we first adopt the DeepLabCut toolbox to detect body parts (i.e., head, center, and tail fin) in an image sequence. Subsequently, we apply a particle filter to fit a set of parameters from the NACA model. The results from DeepLabCut, i.e., three points on the fish body, are used to adjust the components of the state vector, which yields more reliable estimates when the speed and direction of the fish change abruptly. Experimental results on both simulated data and real video data demonstrate that the proposed method performs well, even when the swimming direction changes rapidly.
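For readers unfamiliar with the NACA model, the standard symmetric four-digit thickness distribution is a closed-form polynomial; a fish-body outline can then be obtained by offsetting a (possibly curved) midline by ±y_t. The formula below is the standard NACA one; its use here as the body model is a sketch of the paper's approach, not the authors' exact parameterization.

```python
import numpy as np

def naca_half_thickness(x, t=0.12):
    """Half-thickness of a symmetric NACA 4-digit airfoil at chordwise x in [0, 1].

    t is the maximum thickness as a fraction of chord (t = 0.12 for a
    NACA 0012). The curve starts at zero at the nose, peaks near 30% of
    the chord, and tapers toward the tail.
    """
    return 5.0 * t * (0.2969 * np.sqrt(x) - 0.1260 * x
                      - 0.3516 * x**2 + 0.2843 * x**3 - 0.1015 * x**4)

x = np.linspace(0.0, 1.0, 101)
yt = naca_half_thickness(x)   # peaks at ~0.06 chord near x ~= 0.3 for t = 0.12
```

The appeal of this model for fish is that a single thickness parameter plus a midline describes the whole body outline, which keeps the particle filter's state vector small.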


Author(s):  
Cara Piazza ◽  
Joseph Schroeder ◽  
Chiahao Lu ◽  
Arthur Erdman ◽  
Matthew Johnson ◽  
...  

Patients who suffer from Parkinson’s disease are more prone to postural instability, a major risk factor for falls. One of the most common clinical methods of gauging the severity of a patient’s postural instability is the retropulsion test [1], in which a clinician perturbs the patient’s balance and then rates their response to the perturbation. This test is subjective, relying largely on the clinician’s observations. To improve postural instability diagnosis and encourage more meaningful therapies for this cognitive-motor symptom, there is a clinical need for more objective, quantifiable approaches to measuring postural instability. In this paper, we describe a novel computational approach to quantifying the number, length, and trajectory of steps taken during a retropulsion test or other balance perturbation, using a single camera facing the anterior side (front) of the subject. The computational framework first analyzes the video data with markerless pose estimation algorithms to track the movement of the subject’s feet. These pixel data are then converted from 2D to 3D using calibrated transformation functions and checked for consistency against known step lengths. The testing data showed step-length estimation accurate to within 1 cm, suggesting this computational approach could have utility in a variety of clinical environments.
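The pixel-to-metric conversion described above can be sketched, under the simplifying assumption that the feet stay on a ground plane, as a calibrated pixel-to-ground homography followed by distances between consecutive foot placements. The matrix `H` and the function names are illustrative; the paper's exact calibrated transformation functions are not specified here.

```python
import numpy as np

def pixels_to_ground(points_px, H):
    """Map 2D pixel coordinates to ground-plane coordinates via a homography.

    points_px: (N, 2) pixel positions of a tracked foot
    H:         (3, 3) pixel-to-ground homography, assumed known from a
               prior calibration step
    """
    pts = np.hstack([points_px, np.ones((points_px.shape[0], 1))])
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]   # homogeneous divide

def step_lengths(foot_positions_m):
    """Euclidean distance between consecutive foot placements (metres)."""
    return np.linalg.norm(np.diff(foot_positions_m, axis=0), axis=1)
```

Comparing the output of `step_lengths` against measured step lengths is the consistency check the abstract describes, and the ~1 cm figure bounds the residual error of the calibrated mapping.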

