Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes

2021 ◽  
Vol 40 (5) ◽  
pp. 1-18
Author(s):  
Hyeongseok Son ◽  
Junyong Lee ◽  
Jonghyeop Lee ◽  
Sunghyun Cho ◽  
Seungyong Lee

For the success of video deblurring, it is essential to utilize information from neighboring frames. Most state-of-the-art video deblurring methods adopt motion compensation between video frames to aggregate information from multiple frames that can help deblur a target frame. However, the motion compensation methods adopted by previous deblurring methods are not blur-invariant, and consequently, their accuracy is limited for blurry frames with different blur amounts. To alleviate this problem, we propose two novel approaches to deblur videos by effectively aggregating information from multiple video frames. First, we present blur-invariant motion estimation learning to improve motion estimation accuracy between blurry frames. Second, for motion compensation, instead of aligning frames by warping with estimated motions, we use a pixel volume that contains candidate sharp pixels to resolve motion estimation errors. We combine these two processes to propose an effective recurrent video deblurring network that fully exploits deblurred previous frames. Experiments show that our method achieves state-of-the-art performance both quantitatively and qualitatively compared to recent deep-learning-based methods.
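
As a rough illustration of the pixel-volume idea (not the authors' implementation), the sketch below gathers, for each target pixel, a small neighborhood of candidate pixels from the previous deblurred frame around the position given by the estimated motion; a network can then fuse these candidates instead of committing to a single warped pixel. The function name, neighborhood size, and nearest-neighbor sampling are assumptions for the example.

```python
import numpy as np

def build_pixel_volume(prev_frame, flow, radius=1):
    """Stack candidate pixels from prev_frame around the flow-compensated
    position of every target pixel (a minimal sketch of a 'pixel volume').

    prev_frame: (H, W) previous (deblurred) frame, single channel
    flow:       (H, W, 2) estimated motion from target to previous frame (x, y)
    returns:    (H, W, (2*radius+1)**2) candidate volume
    """
    H, W = prev_frame.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Flow-compensated sampling positions (nearest neighbor for simplicity).
    px = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    py = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)

    candidates = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cy = np.clip(py + dy, 0, H - 1)
            cx = np.clip(px + dx, 0, W - 1)
            candidates.append(prev_frame[cy, cx])
    return np.stack(candidates, axis=-1)

# Toy usage: a 4x4 frame, zero motion, 3x3 candidate neighborhoods.
frame = np.arange(16, dtype=np.float32).reshape(4, 4)
volume = build_pixel_volume(frame, np.zeros((4, 4, 2)), radius=1)
print(volume.shape)  # (4, 4, 9)
```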

Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2085
Author(s):  
Lei Han ◽  
Cien Fan ◽  
Ye Yang ◽  
Lian Zou

Recently, convolutional neural networks have achieved remarkable performance in video super-resolution. However, how to exploit the spatial and temporal information of video efficiently and effectively remains challenging. In this work, we design a bidirectional temporal-recurrent propagation unit. The bidirectional temporal-recurrent propagation unit makes it possible to flow temporal information in an RNN-like manner from frame to frame, which avoids complex motion estimation modeling and motion compensation. To better fuse the information of the two temporal-recurrent propagation units, we use channel attention mechanisms. Additionally, we adopt a progressive up-sampling method instead of one-step up-sampling, and find that progressive up-sampling yields better experimental results than one-step up-sampling. Extensive experiments show that our algorithm outperforms several recent state-of-the-art video super-resolution (VSR) methods with a smaller model size.
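
A hedged sketch of the progressive up-sampling idea (two x2 sub-pixel stages rather than a single x4 stage); the layer widths and the use of PixelShuffle are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ProgressiveUpsampler(nn.Module):
    """Upscale x4 in two x2 sub-pixel steps instead of one x4 step."""
    def __init__(self, channels=64):
        super().__init__()
        self.stage1 = nn.Sequential(           # features -> x2 resolution
            nn.Conv2d(channels, channels * 4, 3, padding=1),
            nn.PixelShuffle(2),
            nn.ReLU(inplace=True),
        )
        self.stage2 = nn.Sequential(           # x2 -> x4 resolution
            nn.Conv2d(channels, channels * 4, 3, padding=1),
            nn.PixelShuffle(2),
            nn.ReLU(inplace=True),
        )
        self.to_rgb = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, feat):
        return self.to_rgb(self.stage2(self.stage1(feat)))

feat = torch.randn(1, 64, 32, 32)              # low-resolution feature map
print(ProgressiveUpsampler()(feat).shape)      # torch.Size([1, 3, 128, 128])
```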


Author(s):  
M. Shahbazi ◽  
G. Sohn ◽  
J. Théau ◽  
P. Ménard

In this paper, we propose a robust technique using a genetic algorithm for detecting inliers and estimating accurate motion parameters from putative correspondences containing any percentage of outliers. The proposed technique aims to increase computational efficiency and modelling accuracy in comparison with the state of the art via the following contributions: i) guided generation of initial populations for both avoiding degenerate solutions and increasing the rate of useful hypotheses, ii) replacing random search with evolutionary search, iii) the possibility of evaluating the individuals of every population by parallel computation, iv) being applicable to images with unknown internal orientation parameters, v) estimating the motion model via detecting a minimal, yet more than sufficient, set of inliers, vi) ensuring the robustness of the motion model against outliers, degeneracy and poor-perspective camera models, vii) making no assumptions about the probability distribution of inlier and/or outlier residuals from the estimated motion model, viii) detecting all the inliers by setting the threshold on their residuals adaptively with regard to the uncertainty of the estimated motion model and the position of the matches. The proposed method was evaluated both on synthetic data and real images. The results were compared with the most popular techniques from the state of the art, including RANSAC, MSAC, MLESAC, Least Trimmed Squares and Least Median of Squares. Experimental results show that the proposed approach performs better than the others in terms of accuracy of motion estimation, accuracy of inlier detection, and computational efficiency.
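
A generic toy illustration (not the authors' algorithm) of the core idea of replacing RANSAC-style random sampling with an evolutionary search over minimal samples, using the inlier count as fitness. The data, tolerance, and GA settings below are assumed, and a 2D line replaces the motion model for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: points on the line y = 2x + 1 plus 30% gross outliers.
x = rng.uniform(0, 10, 100)
y = 2 * x + 1 + rng.normal(0, 0.05, 100)
y[:30] = rng.uniform(-20, 20, 30)
pts = np.column_stack([x, y])

def fit_line(sample):
    (x1, y1), (x2, y2) = pts[sample]
    a = (y2 - y1) / (x2 - x1 + 1e-12)
    return a, y1 - a * x1                      # slope, intercept

def fitness(sample, tol=0.2):
    a, b = fit_line(sample)
    residuals = np.abs(pts[:, 1] - (a * pts[:, 0] + b))
    return np.sum(residuals < tol)             # number of inliers

# Evolutionary search over minimal samples (pairs of point indices).
pop = [rng.choice(len(pts), 2, replace=False) for _ in range(40)]
for _ in range(30):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:20]
    children = []
    for p, q in zip(parents[::2], parents[1::2]):
        child = np.array([p[0], q[1]])          # crossover: swap one index
        if rng.random() < 0.3:                  # mutation: replace an index
            child[rng.integers(2)] = rng.integers(len(pts))
        if child[0] != child[1]:
            children.append(child)
    pop = parents + children

best = max(pop, key=fitness)
print("inliers:", fitness(best), "model:", fit_line(best))
```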


2019 ◽  
Vol 2019 ◽  
pp. 1-15
Author(s):  
Qinghui Zhang ◽  
Junqiu Li ◽  
Zhenping Qiang ◽  
Libo He

Estimating the motions of the common carotid artery wall plays a very important role in early diagnosis of carotid atherosclerotic disease. However, disturbances caused by either the instability of the probe operator or the breathing of subjects degrade the estimation accuracy of arterial wall motion when performing speckle tracking on B-mode ultrasound images. In this paper, we propose a global registration method to suppress external disturbances before motion estimation. The local vector images, transformed from B-mode images, were used for registration. To take advantage of both the structural information from the local phase and the geometric information from the local orientation, we propose a confidence coefficient to combine the two. Furthermore, we modified the speckle-reducing anisotropic diffusion filter to improve the performance of disturbance suppression. We compared this method with schemes extracting wall displacement directly from B-mode or phase images. The results show that this scheme can effectively suppress the disturbances and significantly improve the estimation accuracy.
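
For context, a minimal sketch of extracting local phase (structural information) and local orientation (geometric information) from an image via the monogenic signal; the band-pass filtering step is omitted, and the final weighted combination with a fixed alpha is only illustrative, not the paper's confidence coefficient.

```python
import numpy as np

def monogenic_phase_orientation(img):
    """Local phase and orientation from the monogenic signal (Riesz transform).
    Band-pass filtering is omitted for brevity, so this operates on the raw
    image spectrum."""
    H, W = img.shape
    fy = np.fft.fftfreq(H)[:, None]
    fx = np.fft.fftfreq(W)[None, :]
    freq = np.sqrt(fx ** 2 + fy ** 2)
    freq[0, 0] = 1.0                                   # avoid division by zero at DC

    F = np.fft.fft2(img)
    r1 = np.real(np.fft.ifft2(-1j * fx / freq * F))    # Riesz component x
    r2 = np.real(np.fft.ifft2(-1j * fy / freq * F))    # Riesz component y

    even = img
    odd = np.sqrt(r1 ** 2 + r2 ** 2)
    phase = np.arctan2(odd, even)                      # structural information
    orientation = np.arctan2(r2, r1)                   # geometric information
    return phase, orientation

img = np.random.rand(64, 64)                           # stand-in for a B-mode frame
phase, orient = monogenic_phase_orientation(img)

# Illustrative combination of the two cues with an assumed fixed weight.
alpha = 0.5
combined = alpha * phase + (1 - alpha) * orient
```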


Data ◽  
2021 ◽  
Vol 6 (8) ◽  
pp. 87
Author(s):  
Sara Ferreira ◽  
Mário Antunes ◽  
Manuel E. Correia

Deepfake and manipulated digital photos and videos are being increasingly used in a myriad of cybercrimes. Ransomware, the dissemination of fake news, and digital kidnapping-related crimes are the most recurrent, in which tampered multimedia content has been the primary disseminating vehicle. Digital forensic analysis tools are widely used in criminal investigations to automate the identification of digital evidence in seized electronic equipment. The number of files to be processed and the complexity of the crimes under analysis have highlighted the need to employ efficient digital forensics techniques grounded on state-of-the-art technologies. Machine Learning (ML) researchers have been challenged to apply techniques and methods to improve the automatic detection of manipulated multimedia content. However, the implementation of such methods has not yet been massively incorporated into digital forensic tools, mostly due to the lack of realistic and well-structured datasets of photos and videos. The diversity and richness of the datasets are crucial to benchmark ML models and to evaluate their appropriateness for real-world digital forensics applications. An example is the development of third-party modules for the widely used Autopsy digital forensic application. This paper presents a dataset obtained by extracting a set of simple features from genuine and manipulated photos and videos, which are part of state-of-the-art existing datasets. The resulting dataset is balanced, and each entry comprises a label and a vector of numeric values corresponding to the features extracted through a Discrete Fourier Transform (DFT). The dataset is available in a GitHub repository, and the total numbers of photos and video frames are 40,588 and 12,400, respectively. The dataset was validated and benchmarked with deep learning Convolutional Neural Networks (CNN) and Support Vector Machines (SVM) methods; however, a plethora of other existing methods can be applied. Overall, the results show a better F1-score for CNN when compared with SVM, for both photo and video processing. CNN achieved an F1-score of 0.9968 and 0.8415 for photos and videos, respectively. Regarding SVM, the results obtained with 5-fold cross-validation are 0.9953 and 0.7955 for photo and video processing, respectively. A set of methods written in Python is available for researchers, namely to preprocess and extract the features from the original photo and video files and to build the training and testing sets. Additional methods are also available to convert the original PKL files into CSV and TXT, which gives ML researchers more flexibility to use the dataset with existing ML frameworks and tools.
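
A hedged sketch of one plausible way to derive a fixed-length DFT-based feature vector from an image (azimuthally averaged log power spectrum), a common choice for spectrum-based deepfake detection; the exact features in the released dataset may differ, and the image and bin count here are placeholders.

```python
import numpy as np

def dft_radial_features(img, n_bins=50):
    """Azimuthally averaged log power spectrum of a grayscale image,
    summarizing its 2D DFT as a fixed-length feature vector."""
    F = np.fft.fftshift(np.fft.fft2(img))
    power = np.log1p(np.abs(F) ** 2)

    H, W = img.shape
    cy, cx = H // 2, W // 2
    y, x = np.mgrid[0:H, 0:W]
    r = np.sqrt((y - cy) ** 2 + (x - cx) ** 2)

    features = np.empty(n_bins)
    edges = np.linspace(0, r.max(), n_bins + 1)
    for i in range(n_bins):
        mask = (r >= edges[i]) & (r < edges[i + 1])
        features[i] = power[mask].mean() if mask.any() else 0.0
    return features

img = np.random.rand(128, 128)                  # stand-in for a decoded photo or frame
vec = dft_radial_features(img)
print(vec.shape)                                # (50,)
```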


2020 ◽  
Vol 34 (07) ◽  
pp. 10607-10614 ◽  
Author(s):  
Xianhang Cheng ◽  
Zhenzhong Chen

Learning to synthesize non-existing frames from the original consecutive video frames is a challenging task. Recent kernel-based interpolation methods predict pixels with a single convolution process to replace the dependency on optical flow. However, when scene motion is larger than the pre-defined kernel size, these methods yield poor results even though they take thousands of neighboring pixels into account. To solve this problem, in this paper we propose to use deformable separable convolution (DSepConv) to adaptively estimate kernels, offsets and masks, allowing the network to obtain information from far fewer but more relevant pixels. In addition, we show that kernel-based methods and conventional flow-based methods are specific instances of the proposed DSepConv. Experimental results demonstrate that our method significantly outperforms other kernel-based interpolation methods and shows performance on par with or even better than the state-of-the-art algorithms both qualitatively and quantitatively.
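
As a rough sketch of the sampling step behind such adaptive kernels (not the authors' end-to-end learned, bilinear-sampled implementation), each output pixel below is synthesized from K samples of the input frame: learned offsets decide where to look, and learned kernel weights and masks decide how much each sample contributes. Names, shapes, and nearest-neighbor sampling are assumptions.

```python
import numpy as np

def dsepconv_sample(frame, offsets, kernels, masks):
    """Combine K adaptively located samples per output pixel.

    frame:   (H, W) input frame
    offsets: (H, W, K, 2) per-pixel sampling offsets (dy, dx)
    kernels: (H, W, K) per-pixel kernel weights
    masks:   (H, W, K) per-pixel modulation masks in [0, 1]
    """
    H, W, K, _ = offsets.shape
    ys, xs = np.mgrid[0:H, 0:W]
    out = np.zeros((H, W))
    for k in range(K):
        sy = np.clip(np.round(ys + offsets[..., k, 0]).astype(int), 0, H - 1)
        sx = np.clip(np.round(xs + offsets[..., k, 1]).astype(int), 0, W - 1)
        out += kernels[..., k] * masks[..., k] * frame[sy, sx]
    return out

H, W, K = 8, 8, 9
frame = np.random.rand(H, W)
offsets = np.random.uniform(-2, 2, (H, W, K, 2))   # would be network outputs
kernels = np.full((H, W, K), 1.0 / K)              # would be network outputs
masks = np.ones((H, W, K))                         # would be network outputs
print(dsepconv_sample(frame, offsets, kernels, masks).shape)  # (8, 8)
```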


2017 ◽  
Vol 24 (6) ◽  
pp. 1283-1295 ◽  
Author(s):  
Tomáš Faragó ◽  
Petr Mikulík ◽  
Alexey Ershov ◽  
Matthias Vogelgesang ◽  
Daniel Hänschke ◽  
...  

An open-source framework for conducting a broad range of virtual X-ray imaging experiments, syris, is presented. The simulated wavefield created by a source propagates through an arbitrary number of objects until it reaches a detector. The objects in the light path and the source are time-dependent, which enables simulations of dynamic experiments, e.g. four-dimensional time-resolved tomography and laminography. The high-level interface of syris is written in Python and its modularity makes the framework very flexible. The computationally demanding parts behind this interface are implemented in OpenCL, which enables fast calculations on modern graphics processing units. The combination of flexibility and speed opens new possibilities for studying novel imaging methods and systematic search of optimal combinations of measurement conditions and data processing parameters. This can help to increase the success rates and efficiency of valuable synchrotron beam time. To demonstrate the capabilities of the framework, various experiments have been simulated and compared with real data. To show the use case of measurement and data processing parameter optimization based on simulation, a virtual counterpart of a high-speed radiography experiment was created and the simulated data were used to select a suitable motion estimation algorithm; one of its parameters was optimized in order to achieve the best motion estimation accuracy when applied on the real data. syris was also used to simulate tomographic data sets under various imaging conditions which impact the tomographic reconstruction accuracy, and it is shown how the accuracy may guide the selection of imaging conditions for particular use cases.
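
For context, a minimal numpy sketch of free-space wavefield propagation by the Fresnel transfer-function method, the kind of building block such wave-optics simulations rest on; this is a generic illustration with assumed parameters, not the syris API.

```python
import numpy as np

def fresnel_propagate(wavefield, wavelength, pixel_size, distance):
    """Propagate a complex wavefield by `distance` using the Fresnel
    transfer function in the Fourier domain (paraxial approximation)."""
    H, W = wavefield.shape
    fy = np.fft.fftfreq(H, d=pixel_size)[:, None]
    fx = np.fft.fftfreq(W, d=pixel_size)[None, :]
    # H(fx, fy) = exp(-i * pi * lambda * z * (fx^2 + fy^2)), constant phase omitted
    transfer = np.exp(-1j * np.pi * wavelength * distance * (fx ** 2 + fy ** 2))
    return np.fft.ifft2(np.fft.fft2(wavefield) * transfer)

# Toy example: a circular aperture illuminated by a plane wave.
n, pixel = 512, 1e-6                                  # 512x512 grid, 1 micron pixels
y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2] * pixel
aperture = (np.sqrt(x ** 2 + y ** 2) < 50e-6).astype(complex)
intensity = np.abs(fresnel_propagate(aperture, 1e-10, pixel, 0.1)) ** 2
```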


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1137
Author(s):  
Ondřej Holešovský ◽  
Radoslav Škoviera ◽  
Václav Hlaváč ◽  
Roman Vítek

We compare event-cameras with fast (global shutter) frame-cameras experimentally, asking: "What is the application domain in which an event-camera surpasses a fast frame-camera?" Surprisingly, finding the answer has been difficult. Our methodology was to test event- and frame-cameras on generic computer vision tasks where event-camera advantages should manifest. We used two methods: (1) a controlled, cheap, and easily reproducible experiment (observing a marker on a rotating disk at varying speeds); (2) a challenging practical ballistic experiment (observing a flying bullet, with ground truth provided by an expensive ultra-high-speed frame-camera). The experimental results include sampling/detection rates and position estimation errors as functions of illuminance and motion speed, and the minimum pixel latency of two commercial state-of-the-art event-cameras (ATIS, DVS240). Event-cameras respond more slowly to positive than to negative large and sudden contrast changes. They outperformed a frame-camera in bandwidth efficiency in all our experiments. Both camera types provide comparable position estimation accuracy. The better event-camera was limited by pixel latency when tracking small objects, resulting in motion blur effects. Sensor bandwidth limited the event-camera in object recognition. However, future generations of event-cameras might alleviate bandwidth limitations.
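
A back-of-the-envelope illustration, with assumed rather than measured numbers, of why an event stream can be more bandwidth-efficient than full frames when only a small fraction of the scene changes per unit time.

```python
# Rough bandwidth comparison under assumed (not measured) parameters.
width, height, fps, bits_per_pixel = 640, 480, 1000, 8
frame_rate_bits = width * height * fps * bits_per_pixel      # full frames

active_pixels = 0.01 * width * height      # assume 1% of pixels change per ms
events_per_sec = active_pixels * 1000
bits_per_event = 64                        # assumed address-event record size
event_rate_bits = events_per_sec * bits_per_event

print(f"frame camera: {frame_rate_bits / 1e6:.0f} Mbit/s")   # ~2458 Mbit/s
print(f"event camera: {event_rate_bits / 1e6:.0f} Mbit/s")   # ~197 Mbit/s
```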


2021 ◽  
Author(s):  
◽  
J. N. Mendoza Chavarría

Spectral unmixing has proven to be a great tool for the analysis of hyperspectral data, with linear mixing models (LMMs) being the most used in the literature. Nevertheless, due to the limitations of LMMs in accurately describing multiple light scattering effects in multi- and hyperspectral imaging, new mixing models have emerged to describe nonlinear interactions. In this paper, we propose a new nonlinear unmixing algorithm based on a multilinear mixture model, called Nonlinear Extended Blind Endmember and Abundance Extraction (NEBEAE), which builds on a linear unmixing method established in the literature. The results of this study show that the proposed method decreases the estimation errors of the spectral signatures and abundance maps, as well as the execution time, with respect to state-of-the-art methods.
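
For context, a minimal sketch of abundance estimation under the linear mixing model that such nonlinear methods extend: fully constrained least squares via an augmented nonnegative least-squares problem. This is the LMM baseline, not NEBEAE, and all spectra and numbers below are synthetic.

```python
import numpy as np
from scipy.optimize import nnls

# Linear mixing model: y = E @ a + noise, with E the endmember signatures
# and a the nonnegative, sum-to-one abundances.
rng = np.random.default_rng(1)
n_bands, n_endmembers = 100, 3
E = rng.uniform(0, 1, (n_bands, n_endmembers))         # endmember spectra
a_true = np.array([0.5, 0.3, 0.2])                     # true abundances
y = E @ a_true + rng.normal(0, 0.001, n_bands)         # observed pixel spectrum

# Enforce sum-to-one by augmenting the system with a heavily weighted row of ones.
delta = 10.0
E_aug = np.vstack([E, delta * np.ones(n_endmembers)])
y_aug = np.append(y, delta)
a_hat, _ = nnls(E_aug, y_aug)                          # fully constrained LS
print(a_hat.round(3))                                  # close to [0.5 0.3 0.2]
```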


2012 ◽  
Vol 268-270 ◽  
pp. 1667-1670
Author(s):  
Yun Peng Liu ◽  
Ren Fang Wang ◽  
Jin Li

Motion coding is a key technique of the MPEG-4 VM. Motion encoding and decoding use "Unrestricted Motion Estimation", "Advanced Prediction", and "Overlapped Motion Compensation"; in addition, the MPEG-4-specific "Repetitive Padding" and "Modified Block Matching" are added. Several issues arising in motion estimation are also examined in detail, and some new viewpoints are put forward.
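
For reference, a minimal full-search block-matching sketch, the textbook baseline that refinements such as Modified Block Matching build on; the block size, search range, and SAD criterion are conventional choices assumed for the example.

```python
import numpy as np

def full_search_block_match(ref, cur, block=8, search=7):
    """For each block of the current frame, find the displacement within
    +/- search pixels of the reference frame that minimizes the sum of
    absolute differences (SAD)."""
    H, W = cur.shape
    vectors = np.zeros((H // block, W // block, 2), dtype=int)
    for by in range(0, H - block + 1, block):
        for bx in range(0, W - block + 1, block):
            target = cur[by:by + block, bx:bx + block]
            best, best_sad = (0, 0), np.inf
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if 0 <= y <= H - block and 0 <= x <= W - block:
                        sad = np.abs(ref[y:y + block, x:x + block] - target).sum()
                        if sad < best_sad:
                            best, best_sad = (dy, dx), sad
            vectors[by // block, bx // block] = best
    return vectors

ref = np.random.rand(32, 32)
cur = np.roll(ref, shift=(2, -1), axis=(0, 1))     # known global motion
print(full_search_block_match(ref, cur)[1, 1])     # [-2  1] recovers the shift
```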

