Rectangling irregular videos by optimal spatio-temporal warping

2021 ◽  
Vol 8 (1) ◽  
pp. 93-103
Author(s):  
Jin-Liang Wu ◽  
Jun-Jie Shi ◽  
Lei Zhang

Abstract Image and video processing based on geometric principles typically changes the rectangular shape of video frames into an irregular one. This paper presents a warping-based approach for rectangling such irregular frame boundaries in space and time, i.e., making them rectangular again. To reduce geometric distortion during rectangling, we employ content-preserving deformation of a mesh grid, with line structures as constraints, to warp the frames. To conform to the original inter-frame motion, we preserve the feature-trajectory distribution as a constraint during motion compensation, ensuring stability after the frames are warped. Such spatially and temporally optimized warps produce regular rectangular boundaries for the video frames with low geometric distortion and jitter. Our experiments demonstrate that our approach generates plausible video rectangling results in a variety of applications.
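The content-preserving mesh deformation described above can be illustrated with a minimal sketch: boundary vertices of an irregular mesh are pinned to a target rectangle, while each interior vertex keeps its original Laplacian coordinate (the vertex minus the mean of its 4-neighbours). This is a simplified stand-in for the paper's energy, which additionally uses line structures and temporal terms; all names here are illustrative.

```python
import numpy as np

# Hedged sketch (not the paper's full energy): deform an irregular mesh so
# its boundary becomes a rectangle while interior vertices preserve their
# local shape via Laplacian coordinates.
H, W = 4, 5
grid = np.stack(np.meshgrid(np.arange(W, dtype=float),
                            np.arange(H, dtype=float)), axis=-1)
rng = np.random.default_rng(0)
src = (grid + 0.3 * rng.standard_normal(grid.shape)).reshape(-1, 2)  # irregular mesh

n = H * W
A = np.zeros((n, n))
b = np.zeros((n, 2))
for i in range(H):
    for j in range(W):
        k = i * W + j
        if i in (0, H - 1) or j in (0, W - 1):
            A[k, k] = 1.0                      # boundary: snap to the rectangle
            b[k] = (j, i)
        else:
            nbrs = [k - 1, k + 1, k - W, k + W]
            A[k, k] = 1.0                      # interior: preserve Laplacian
            A[k, nbrs] = -0.25                 # coordinate of the source mesh
            b[k] = src[k] - 0.25 * src[nbrs].sum(axis=0)

V = np.linalg.solve(A, b)                      # deformed vertex positions
mesh = V.reshape(H, W, 2)
```

Solving this linear system snaps the boundary exactly onto the rectangle while distributing the residual distortion smoothly over the interior; the real method solves an analogous (larger) least-squares system per frame.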

Author(s):  
Tao Sun ◽  
Yaping Wu ◽  
Yan Bai ◽  
Zhenguo Wang ◽  
Chushu Shen ◽  
...  

Abstract As a non-invasive imaging tool, Positron Emission Tomography (PET) plays an important role in brain science and disease research. Dynamic acquisition is one mode of brain PET imaging. Its wide application in clinical research has often been hindered by practical challenges, such as involuntary patient movement, which can degrade both image quality and the accuracy of quantification. This is even more pronounced in scans of patients with neurodegeneration or mental disorders. Conventional motion compensation methods, based either on images or on raw measured data, have been shown to reduce the effect of motion on image quality. For a dynamic PET scan, however, motion compensation is particularly challenging, as tracer kinetics and relatively high noise are present in the dynamic frames. In this work, we propose an image-based inter-frame motion compensation approach specifically designed for dynamic brain PET imaging. Our method has an iterative implementation that only requires reconstructed images, from which inter-frame subject movement can be estimated and compensated. The method uses tracer-specific kinetic modelling and can handle both simple and complex movement patterns. A synthetic phantom study showed that the proposed method can compensate for simulated motion in scans with 18F-FDG, 18F-Fallypride, and 18F-AV45. Fifteen dynamic 18F-FDG patient scans with motion artifacts were also processed. The quality of the recovered images was superior to that of both the uncorrected images and images corrected with other image-based methods. The proposed method enables retrospective image quality control for dynamic brain PET imaging, and hence facilitates the application of dynamic PET in clinics and research.
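The core of any image-based inter-frame compensation is estimating each frame's motion relative to a reference and undoing it. As a minimal sketch, assuming pure integer translation (the paper handles general rigid motion driven by kinetic modelling), the shift can be recovered from the peak of an FFT-based cross-correlation:

```python
import numpy as np

def estimate_shift(ref, frame):
    # FFT cross-correlation: the peak location gives the integer (dy, dx)
    # by which `frame` is translated relative to `ref`.
    xc = np.fft.ifft2(np.conj(np.fft.fft2(ref)) * np.fft.fft2(frame)).real
    dy, dx = np.unravel_index(np.argmax(xc), xc.shape)
    h, w = ref.shape
    if dy > h // 2:          # wrap circular shifts into signed offsets
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)

rng = np.random.default_rng(1)
ref = rng.random((32, 32))
moved = np.roll(ref, shift=(3, -2), axis=(0, 1))   # simulated inter-frame motion
dy, dx = estimate_shift(ref, moved)
aligned = np.roll(moved, shift=(-dy, -dx), axis=(0, 1))  # compensated frame
```

An iterative scheme like the paper's would alternate between such an alignment step and re-fitting the kinetic model on the realigned frames until the estimated motion converges.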


2015 ◽  
Vol 28 (2) ◽  
pp. 165-175
Author(s):  
Andrzej Napieralski ◽  
Jakub Cłapa ◽  
Kamil Grabowski ◽  
Małgorzata Napieralska ◽  
Wojciech Sankowski ◽  
...  

This paper presents recent research at DMCS, covering image processing and biometric research projects. One of the key elements is image acquisition and processing. The most recent biometric research projects address authentication in uncooperative scenarios and the use of multiple biometric traits (multimodal biometric systems). Recent research on the removal of geometric distortion from live video streams using FPGA and GPU hardware is also presented, together with preliminary performance results.


2018 ◽  
Vol 1 (2) ◽  
pp. 17-23
Author(s):  
Takialddin Al Smadi

This survey outlines the use of computer vision in image and video processing across multidisciplinary applications, in both academia and industry. The scope of this paper covers the theoretical and practical aspects of image and video processing as well as computer vision, from fundamental research to the evolution of applications. Various topics in image processing and computer vision are demonstrated, spanning the evolution of mobile augmented reality (MAR) applications, augmented reality with 3D modeling and real-time depth imaging, and video processing algorithms for achieving higher-depth video compression. On the mobile platform, an automatic computer vision system for citrus fruit has been implemented, with Bayesian classification and Boundary Growing used to detect text in video scenes. The paper also illustrates the usability of a hand-based interactive method for a portable projector based on augmented reality. © 2018 JASET, International Scholars and Researchers Association


Author(s):  
Chamin Morikawa ◽  
Michihiro Kobayashi ◽  
Masaki Satoh ◽  
Yasuhiro Kuroda ◽  
Teppei Inomata ◽  
...  

2021 ◽  
Vol 11 (4) ◽  
pp. 1438
Author(s):  
Sebastián Risco ◽  
Germán Moltó

Serverless computing has introduced scalable event-driven processing in Cloud infrastructures. However, it is not trivial for multimedia processing to benefit from the elastic capabilities of serverless applications. To this end, this paper introduces the evolution of a framework to support the execution of customized runtime environments in AWS Lambda, in order to accommodate workloads that do not satisfy its strict computational requirements: longer execution times and the ability to use GPU-based resources. This has been achieved through the integration of AWS Batch, a managed service to deploy virtual elastic clusters for the execution of containerized jobs. In addition, a Functions Definition Language (FDL) is introduced for the description of data-driven workflows of functions. These workflows can simultaneously leverage AWS Lambda for the highly scalable execution of short jobs and AWS Batch for the execution of compute-intensive jobs that can profit from GPU-based computing. To assess the developed open-source framework, we executed a case study on efficient serverless video processing. The workflow automatically generates subtitles from the audio and applies GPU-based object recognition to the video frames, thus simultaneously harnessing different computing services. This allows the creation of cost-effective, highly parallel, scale-to-zero serverless workflows in AWS.
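The Lambda/Batch split described above can be sketched as a simple routing rule. The workflow dictionary below is illustrative, not the actual FDL syntax; the only hard fact used is AWS Lambda's 15-minute execution limit, which is what pushes long or GPU-bound steps onto AWS Batch.

```python
# Hedged sketch: route each workflow step to AWS Lambda (short, highly
# scalable jobs) or AWS Batch (long-running or GPU-bound jobs). Step names
# and fields are hypothetical, modelled on the video case study.
workflow = [
    {"name": "extract-audio",      "gpu": False, "max_seconds": 120},
    {"name": "generate-subtitles", "gpu": False, "max_seconds": 300},
    {"name": "object-recognition", "gpu": True,  "max_seconds": 3600},
]

LAMBDA_TIMEOUT = 900  # AWS Lambda's maximum execution time, in seconds

def route(step):
    # GPU work, or anything that may exceed Lambda's limit, goes to a Batch
    # compute environment; everything else runs as a Lambda function.
    if step["gpu"] or step["max_seconds"] > LAMBDA_TIMEOUT:
        return "aws-batch"
    return "aws-lambda"

plan = {step["name"]: route(step) for step in workflow}
```

In the framework itself this decision is declared in the FDL rather than computed at runtime, but the resulting placement is the same: subtitle generation stays on Lambda while GPU object recognition lands on Batch.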


2016 ◽  
Vol 26 (04) ◽  
pp. 1750054
Author(s):  
M. Kiruba ◽  
V. Sumathy

The Discrete Cosine Transform (DCT) plays a significant role in signal processing applications such as image and video processing. In traditional hardware designs, the 8-point DCT architecture occupies a large number of logic slices and uses several multipliers to update the weights, which leads to high area consumption and power dissipation. To mitigate these drawbacks, this paper presents a novel Hierarchical-Based Expression (HBE) Multiple Constant Multiplication (MCM) multiplier architecture for the 8-point DCT structure used in video CODEC applications. The proposed work comprises a modified data-path architecture and a Floating Point Processing Element (FPPE) architecture. Our proposed multiplier and DCT architecture requires fewer components than the traditional DCT method. The HBE-MCM-based multiplier architecture consists of shifters and adders, reducing the number of Flip-Flops (FFs) and Look-Up Tables (LUTs) used, and the smaller components in turn reduce power consumption. The design is described in Verilog and implemented on a Field Programmable Gate Array (FPGA). The performance of the proposed architecture is evaluated against the traditional DCT architecture in terms of the number of FFs and LUTs, area, power, delay, and speed.
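The multiplierless idea behind MCM is that each fixed DCT coefficient, once quantized to an integer, can be multiplied by shifts and adds alone, e.g. 23·x = 16x + 4x + 2x + x = (x << 4) + (x << 2) + (x << 1) + x. A minimal software sketch of that decomposition (not the paper's HBE sharing scheme, which additionally reuses common subexpressions across coefficients):

```python
def shift_add_mul(x, c):
    # Multiplierless constant multiplication: decompose the constant c into
    # powers of two and accumulate shifted copies of x, as a shift-and-add
    # datapath would in hardware.
    acc, bit = 0, 0
    while c:
        if c & 1:
            acc += x << bit   # one adder per set bit of the constant
        c >>= 1
        bit += 1
    return acc
```

In an 8-point DCT the cosine coefficients are fixed, so each such decomposition is wired once; MCM then shares intermediate terms between coefficients to cut the adder count further.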


Symmetry ◽  
2020 ◽  
Vol 13 (1) ◽  
pp. 38
Author(s):  
Dong Zhao ◽  
Baoqing Ding ◽  
Yulin Wu ◽  
Lei Chen ◽  
Hongchao Zhou

This paper proposes a method for discovering the primary objects in single images by learning from videos in a purely unsupervised manner: the learning process is based on videos, but the resulting network can discover objects from a single input image. The rough idea is that an image typically consists of multiple object instances (such as the foreground and background) that undergo spatial transformations across video frames and can be sparsely represented. By exploring the sparse representation of a video with a neural network, one may learn the features of each object instance without any labels, which can be used to discover, recognize, or distinguish object instances in a single image. In this paper, we consider a relatively simple scenario in which each image roughly consists of a foreground and a background. Our proposed method is based on encoder-decoder structures that sparsely represent the foreground, background, and segmentation mask, which are then composited to reconstruct the original image. We apply the feed-forward network trained on videos to object discovery in single images, unlike previous co-segmentation methods that require videos or collections of images as input at inference time. The experimental results on various object segmentation benchmarks demonstrate that the proposed method extracts primary objects accurately and robustly, suggesting that unsupervised image learning tasks can benefit from the sparsity of images and the inter-frame structure of videos.
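The reconstruction step at the heart of this setup is an alpha blend of the decoded foreground and background under the predicted mask. A minimal sketch, with illustrative names (the actual decoders are neural networks and the mask is typically soft):

```python
import numpy as np

# Hedged sketch of the compositing step: given decoder outputs for the
# foreground, background, and segmentation mask, reconstruct the image as
# their alpha blend. The reconstruction loss against the input image is
# what drives the unsupervised training.
def reconstruct(fg, bg, mask):
    return mask * fg + (1.0 - mask) * bg

rng = np.random.default_rng(0)
fg, bg = rng.random((8, 8)), rng.random((8, 8))
mask = (rng.random((8, 8)) > 0.5).astype(float)   # hard mask for the sketch
img = reconstruct(fg, bg, mask)
```

Because the composite must match the input image while the foreground stream stays sparse, the mask is pushed toward covering exactly the primary object, which is what makes it usable as a segmentation at test time.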

