Visual Features and Their Own Optical Flow

2021 ◽  
Vol 4 ◽  
Author(s):  
Alessandro Betti ◽  
Giuseppe Boccignone ◽  
Lapo Faggi ◽  
Marco Gori ◽  
Stefano Melacci

Symmetries, invariances and conservation equations have always been an invaluable guide in Science to model natural phenomena through simple yet effective relations. For instance, in computer vision, translation equivariance is typically a built-in property of neural architectures that are used to solve visual tasks; networks with computational layers implementing such a property are known as Convolutional Neural Networks (CNNs). This kind of mathematical symmetry, as well as many others that have been recently studied, is typically generated by some underlying group of transformations (translations in the case of CNNs, rotations, etc.) and is particularly suitable for processing highly structured data such as molecules or chemical compounds, which are known to possess those specific symmetries. When dealing with video streams, common built-in equivariances are able to handle only a small fraction of the broad spectrum of transformations encoded in the visual stimulus and, therefore, the corresponding neural architectures have to resort to a huge amount of supervision in order to achieve good generalization capabilities. In this paper, we formulate a theory on the development of visual features that is based on the idea that movement itself provides trajectories on which to impose consistency. We introduce the principle of Material Point Invariance, which states that each visual feature is invariant with respect to the associated optical flow, so that features and corresponding velocities are an indissoluble pair. Then, we discuss the interaction of features and velocities and show that certain motion invariance traits could be regarded as a generalization of the classical concept of affordance. These analyses of feature-velocity interactions and their invariance properties lead to a visual field theory which expresses the dynamical constraints of motion coherence and might lead to discovering the joint evolution of the visual features along with the associated optical flows.
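As a compact way to read the Material Point Invariance principle described above, the following is a plausible formalization (the notation is an assumption, not taken verbatim from the paper): a feature field paired with its own optical flow is transported along that flow, i.e. its material derivative vanishes.

```latex
% Plausible reading of Material Point Invariance (notation assumed):
% the feature field \varphi is constant along the trajectories generated
% by its associated optical flow v.
\frac{D\varphi}{Dt}
  = \frac{\partial \varphi(x,t)}{\partial t}
  + v(x,t)\cdot\nabla\varphi(x,t)
  = 0
```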

2020 ◽  
Vol 34 (07) ◽  
pp. 10713-10720
Author(s):  
Mingyu Ding ◽  
Zhe Wang ◽  
Bolei Zhou ◽  
Jianping Shi ◽  
Zhiwu Lu ◽  
...  

A major challenge for video semantic segmentation is the lack of labeled data. In most benchmark datasets, only one frame per video clip is annotated, which prevents most supervised methods from utilizing the information in the remaining frames. To exploit the spatio-temporal information in videos, many previous works use pre-computed optical flows, which encode the temporal consistency, to improve video segmentation. However, video segmentation and optical flow estimation are still treated as two separate tasks. In this paper, we propose a novel framework for joint video semantic segmentation and optical flow estimation. Semantic segmentation provides semantic information to handle occlusion, yielding more robust optical flow estimation, while the non-occluded optical flow provides accurate pixel-level temporal correspondences that guarantee the temporal consistency of the segmentation. Moreover, our framework is able to utilize both labeled and unlabeled frames in the video through joint training, while no additional computation is required at inference time. Extensive experiments show that the proposed model makes video semantic segmentation and optical flow estimation benefit from each other and outperforms existing methods under the same settings in both tasks.
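As a rough illustration of how non-occluded optical flow can enforce temporal consistency of segmentation, here is a minimal PyTorch sketch under assumed tensor shapes and names; it is not the authors' architecture, only the flow-warping idea.

```python
import torch
import torch.nn.functional as F

def warp_with_flow(prev_logits, flow):
    """Warp previous-frame segmentation logits to the current frame using a
    dense optical flow field given in pixels, shape [B, 2, H, W]."""
    b, _, h, w = flow.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(flow.device)   # [2, H, W]
    coords = base.unsqueeze(0) + flow                             # follow the flow
    # Normalize to [-1, 1] as required by grid_sample.
    gx = 2.0 * coords[:, 0] / (w - 1) - 1.0
    gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                          # [B, H, W, 2]
    return F.grid_sample(prev_logits, grid, align_corners=True)

def temporal_consistency_loss(curr_logits, prev_logits, flow, non_occluded_mask):
    """Penalize disagreement between current logits and flow-warped previous
    logits, but only where the flow is valid (non-occluded regions)."""
    warped = warp_with_flow(prev_logits, flow)
    return (non_occluded_mask * (curr_logits - warped).abs()).mean()
```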


Author(s):  
Antônio Busson ◽  
Alan L. V. Guedes ◽  
Sergio Colcher

In the Machine Learning field, methods based on Deep Learning (e.g., CNNs, RNNs) have become the state of the art in several problems of the multimedia domain, especially in audio-visual tasks. Typically, Deep Learning methods are trained in a supervised manner on datasets containing thousands or millions of media examples and several related concepts/classes. During training, the Deep Learning methods learn a hierarchy of filters that are applied to the input data to classify/recognize the media content. In the computer vision scenario, for example, given image pixels, the successive layers of the network learn to extract visual features: the shallow layers extract lower-level features (e.g., edges, corners, contours), while the deeper layers combine these features to produce higher-level ones (e.g., textures, parts of objects). These representative features can be clustered into groups, each one representing a specific concept. H.761 NCL currently lacks support for Deep Learning methods in its application specification, because such languages still focus on presentation tasks such as capture, streaming, and presentation. They do not allow programmers to describe the semantic understanding of the used media or to handle the recognition of such understanding. In this proposal, we aim at extending NCL to provide such support. More precisely, our proposal enables NCL applications to: (1) describe learning based on structured multimedia datasets; (2) recognize content semantics of the media elements at presentation time. To achieve these goals, we propose an extension that includes: (a) a new "knowledge" element that describes concepts based on multimedia datasets; (b) an "area" anchor with an associated "recognition" event that describes when a concept occurs in the multimedia content.


Sensors ◽  
2019 ◽  
Vol 19 (11) ◽  
pp. 2523 ◽  
Author(s):  
Gangik Cho ◽  
Jongyun Kim ◽  
Hyondong Oh

Due to payload restrictions on micro aerial vehicles (MAVs), vision-based approaches have been widely studied owing to their light weight and cost effectiveness. In particular, optical flow-based obstacle avoidance has proven to be one of the most efficient methods in terms of obstacle avoidance capability and computational load; however, existing approaches do not consider complex 3-D environments. In addition, most approaches are unable to deal with situations where there are wall-like frontal obstacles. Although some algorithms consider wall-like frontal obstacles, they cause jitter or unnecessary motion. To address these limitations, this paper proposes a vision-based obstacle avoidance algorithm for MAVs using the optical flow in 3-D textured environments. The image obtained from a monocular camera is first split into half planes, both horizontally and vertically. The desired heading direction and climb rate are then determined by comparing the sums of optical flow between the horizontal and vertical half planes, respectively, for obstacle avoidance in 3-D environments. In addition, the proposed approach is capable of avoiding wall-like frontal obstacles by considering the divergence of the optical flow at the focus of expansion and navigating to the goal position using a sigmoid weighting function. The performance of the proposed algorithm was validated through numerical simulations and indoor flight experiments in various situations.
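A minimal sketch of the half-plane flow-balancing idea described above (the gains, names and sign conventions are assumptions; the wall-like frontal obstacle handling via the flow divergence at the focus of expansion is omitted):

```python
import numpy as np

def avoidance_command(flow, k_yaw=1.0, k_climb=1.0):
    """Toy half-plane optical-flow balancing. `flow` is a dense flow field of
    shape [H, W, 2] from a forward-looking monocular camera; gains are
    illustrative, not the paper's values."""
    mag = np.linalg.norm(flow, axis=-1)                     # per-pixel flow magnitude
    h, w = mag.shape
    left, right = mag[:, : w // 2].sum(), mag[:, w // 2 :].sum()
    top, bottom = mag[: h // 2, :].sum(), mag[h // 2 :, :].sum()
    # Steer away from the half with larger flow (closer obstacles):
    # positive yaw_rate = turn right, positive climb_rate = climb.
    yaw_rate = k_yaw * (left - right) / (left + right + 1e-6)
    climb_rate = k_climb * (bottom - top) / (top + bottom + 1e-6)
    return yaw_rate, climb_rate
```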


2014 ◽  
Vol 2014 ◽  
pp. 1-8 ◽  
Author(s):  
Joko Hariyono ◽  
Van-Dung Hoang ◽  
Kang-Hyun Jo

This paper presents a pedestrian detection method from a moving vehicle using optical flows and the histogram of oriented gradients (HOG). A moving object is extracted from the relative motion by segmenting the region sharing the same optical flow after compensating for the egomotion of the camera. To obtain the optical flow, two consecutive images are divided into grid cells of 14×14 pixels; each cell in the current frame is then tracked to find the corresponding cell in the next frame. Using at least three corresponding cells, an affine transformation is estimated from the corresponding cells in the consecutive images, so that conformed optical flows are extracted. The regions of the moving object are detected as transformed objects that differ from the previously registered background. A morphological process is applied to obtain candidate human regions. In order to recognize the object, HOG features are extracted from the candidate region and classified using a linear support vector machine (SVM); the HOG feature vectors are used as input to the linear SVM to classify the given input as pedestrian or non-pedestrian. The proposed method was tested on a moving vehicle and also confirmed through experiments on a pedestrian dataset. It shows a significant improvement over the original HOG on the ETHZ pedestrian dataset.
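For the classification stage, a minimal sketch of HOG feature extraction with a linear SVM, assuming OpenCV and scikit-learn and the common 64×128 pedestrian window (the paper's exact HOG parameters are not given in the abstract):

```python
import cv2
import numpy as np
from sklearn.svm import LinearSVC

# HOG descriptor with the default 64x128 pedestrian window.
hog = cv2.HOGDescriptor()

def hog_features(patch):
    """Resize a grayscale candidate region to the HOG window and compute its descriptor."""
    patch = cv2.resize(patch, (64, 128))
    return hog.compute(patch).ravel()

def train_classifier(pos_patches, neg_patches):
    """Train a linear SVM on HOG features of pedestrian / non-pedestrian patches."""
    X = np.array([hog_features(p) for p in pos_patches + neg_patches])
    y = np.array([1] * len(pos_patches) + [0] * len(neg_patches))
    return LinearSVC(C=0.01).fit(X, y)

def is_pedestrian(clf, candidate_region):
    """Classify one candidate region as pedestrian (True) or non-pedestrian."""
    return clf.predict([hog_features(candidate_region)])[0] == 1
```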


2012 ◽  
Vol 24 (4) ◽  
pp. 686-698 ◽  
Author(s):  
Lei Chen ◽  
Hua Yang ◽  
Takeshi Takaki ◽  
Idaku Ishii

In this paper, we propose a novel method for accurate optical flow estimation in real time for both high-speed and low-speed moving objects based on High-Frame-Rate (HFR) videos. We introduce a multiframe-straddling function to select several pairs of images with different frame intervals from an HFR image sequence, even when the estimated optical flow is required to be output at standard video rates (NTSC at 30 fps and PAL at 25 fps). The multiframe-straddling function can remarkably improve the measurable range of velocities in optical flow estimation without heavy computation by adaptively selecting a small frame interval for high-speed objects and a large frame interval for low-speed objects. On the basis of the relationship between the frame intervals and the accuracy of the optical flows estimated by the Lucas–Kanade method, we devise a method to determine multiple frame intervals for optical flow estimation and to select an optimal frame interval among them according to the amplitude of the estimated optical flow. Our method was implemented in software on a high-speed vision platform, IDP Express. The estimated optical flows were accurately output at intervals of 40 ms in real time by using three pairs of 512×512 images; these images were selected by frame-straddling a 2000-fps video with intervals of 0.5, 1.5, and 5 ms. Several experiments were performed on high-speed movements to verify that our method can remarkably improve the measurable range of velocities in optical flow estimation, compared to optical flows estimated from 25-fps videos with the Lucas–Kanade method.
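A simplified sketch of the interval-selection idea (thresholds and data layout are assumptions, not the paper's values): among the frame-straddled image pairs, prefer the largest interval whose Lucas-Kanade displacement still lies in the accurately measurable range.

```python
import numpy as np
import cv2

# Candidate frame-straddling intervals (ms) from the 2000-fps stream; the
# displacement bound below is an illustrative assumption.
INTERVALS_MS = [0.5, 1.5, 5.0]
MAX_DISPLACEMENT_PX = 8.0   # LK stays accurate below this per-pair displacement

def select_interval(frames_by_interval, prev_pts):
    """Pick the largest interval whose median LK displacement is still within
    the accurate range (large intervals suit slow objects, small ones fast
    objects). `frames_by_interval` maps interval -> (img0, img1); `prev_pts`
    is a float32 corner array of shape (N, 1, 2)."""
    for dt in sorted(INTERVALS_MS, reverse=True):
        img0, img1 = frames_by_interval[dt]
        pts1, status, _ = cv2.calcOpticalFlowPyrLK(img0, img1, prev_pts, None)
        good = status.ravel() == 1
        disp = np.linalg.norm(pts1[good] - prev_pts[good], axis=-1)
        if disp.size and np.median(disp) <= MAX_DISPLACEMENT_PX:
            return dt, (pts1[good] - prev_pts[good]) / dt   # flow in px/ms
    return min(INTERVALS_MS), None
```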


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1150
Author(s):  
Jun Nagata ◽  
Yusuke Sekikawa ◽  
Yoshimitsu Aoki

In this work, we propose a novel method of estimating optical flow from event-based cameras by matching the time surface of events. The proposed loss function measures the timestamp consistency between the time surface formed by the latest timestamp of each pixel and one that is slightly shifted in time. This makes it possible to estimate dense optical flow with high accuracy without restoring luminance or using additional sensor information. In the experiments, we show that the gradient is more accurate and the loss landscape more stable than with the variance loss used in the motion compensation approach. In addition, we show that the optical flow can be estimated with high accuracy by optimization with L1 smoothness regularization using publicly available datasets.
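A toy numpy sketch of a time surface and a timestamp-consistency residual in the spirit described above (the exact loss and event handling are assumptions; polarity is ignored):

```python
import numpy as np

def time_surface(events, shape, t_end):
    """Latest-timestamp-per-pixel map ("time surface") from events up to t_end.
    `events` is an array of (x, y, t) rows."""
    ts = np.full(shape, -np.inf)
    for x, y, t in events[events[:, 2] <= t_end]:
        ts[int(y), int(x)] = max(ts[int(y), int(x)], t)
    return ts

def timestamp_consistency_residual(ts_a, ts_b, flow, dt):
    """Illustrative consistency term (an assumption, not the paper's exact loss):
    after following the flow for dt, latest timestamps should differ by about dt."""
    h, w = ts_a.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xw = np.clip((xs + flow[..., 0] * dt).astype(int), 0, w - 1)
    yw = np.clip((ys + flow[..., 1] * dt).astype(int), 0, h - 1)
    valid = np.isfinite(ts_a) & np.isfinite(ts_b[yw, xw])
    residual = (ts_b[yw, xw] - ts_a) - dt
    return np.abs(residual[valid]).mean()
```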


Author(s):  
V. V. Kniaz ◽  
V. V. Fedorenko

The growing interest in self-driving cars creates a demand for scene understanding and obstacle detection algorithms. One of the most challenging problems in this field is pedestrian detection. The main difficulties arise from the diverse appearances of pedestrians. Poor visibility conditions, such as fog and low light, also significantly decrease the quality of pedestrian detection. This paper presents a new optical flow based algorithm, BipedDetect, that provides robust pedestrian detection on a single-board computer. The algorithm is based on the idea of simplified Kalman filtering suitable for realization on modern single-board computers. To detect a pedestrian, a synthetic optical flow of the scene without pedestrians is generated using a slanted-plane model. The estimate of the real optical flow is generated using a multispectral image sequence. The difference between the synthetic optical flow and the real optical flow yields the optical flow induced by pedestrians. The final detection of pedestrians is done by segmenting this difference of optical flows. To evaluate the BipedDetect algorithm, a multispectral dataset was collected using a mobile robot.
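A minimal sketch of the final segmentation step, i.e. thresholding the difference between the measured and synthetic optical flows and keeping sufficiently large connected components (threshold and minimum area are assumptions):

```python
import numpy as np
from scipy import ndimage

def pedestrian_regions(real_flow, synthetic_flow, threshold=2.0, min_area=50):
    """Segment regions whose measured flow deviates from the ego-motion-only
    synthetic flow; both flows have shape [H, W, 2] in pixels per frame."""
    residual = np.linalg.norm(real_flow - synthetic_flow, axis=-1)
    mask = residual > threshold                       # pedestrian-induced flow
    labels, _ = ndimage.label(mask)                   # connected components
    boxes = ndimage.find_objects(labels)
    # Keep components whose bounding box is large enough to be a pedestrian.
    return [b for b in boxes
            if (b[0].stop - b[0].start) * (b[1].stop - b[1].start) >= min_area]
```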


2013 ◽  
Vol 376 ◽  
pp. 455-460
Author(s):  
Wei Zhu ◽  
Li Tian ◽  
Fang Di ◽  
Jian Li Li ◽  
Ke Jie Li

The optical flow method is an important and valid method in the field of detection and tracking of moving objects for robot inspection systems. Because the traditional Horn-Schunck and Lucas-Kanade optical flow methods cannot meet the demands of real-time performance and accuracy simultaneously, an improved optical flow method based on a Gaussian image pyramid is proposed. The layered structure of the images is obtained by downsampling the original sequential images, so that high-speed motion is converted into continuous motion at a lower speed. The optical flows of corner points at the lowest layer are then calculated by the LK method and delivered to the upper layer, and so on. Thus the estimated optical flow vectors of the original sequential images are obtained. In this way, the requirements of accuracy and real-time performance can be met for robotic moving obstacle recognition.
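A short sketch of sparse coarse-to-fine LK flow at corner points using OpenCV's built-in image pyramid, which is close in spirit to the scheme described above (parameter values are illustrative):

```python
import cv2

def pyramid_lk_flow(prev_gray, curr_gray, max_level=3):
    """Sparse pyramidal Lucas-Kanade flow at corner points between two
    consecutive grayscale frames."""
    corners = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                      qualityLevel=0.01, minDistance=7)
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, corners, None,
        winSize=(21, 21), maxLevel=max_level,
        criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))
    good = status.ravel() == 1
    # Return the tracked corners and their estimated displacement vectors.
    return corners[good], next_pts[good] - corners[good]
```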


2014 ◽  
Vol 2014 ◽  
pp. 1-16 ◽  
Author(s):  
Chong Wang ◽  
Zheng You ◽  
Fei Xing ◽  
Borui Zhao ◽  
Bin Li ◽  
...  

It has been discovered that image motions and optical flows usually become much more nonlinear and anisotropic in space-borne cameras with a large field of view, especially when perturbations or jitters exist. This phenomenon arises from the fact that the attitude motion greatly affects the image of the three-dimensional planet. In this paper, utilizing these characteristics, an optical flow inversion method is proposed for highly accurate measurement of remote sensor attitude motion. The principle of the new method is that angular velocities can be measured precisely by rebuilding certain nonuniform optical flows. The primary step of the method is to determine the relative displacements and deformations between the overlapped images captured by different detectors; a novel dense subpixel image registration approach is developed towards this goal. Based on that, the optical flow can be rebuilt and highly accurate attitude measurements are successfully obtained. In the experiment, a remote sensor and its original photographs are investigated, and the results validate that the method is highly reliable and highly accurate over a broad frequency band.
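As an illustrative stand-in for the dense subpixel registration step (the paper develops its own dense registration approach; this sketch only uses patchwise phase correlation to convey the idea of recovering a sampled displacement field between overlapped detector images):

```python
import cv2
import numpy as np

def patchwise_subpixel_shifts(img_a, img_b, patch=64, step=32):
    """Estimate the local subpixel shift of each patch between two overlapped
    grayscale images by phase correlation; returns rows of (x, y, dx, dy)."""
    h, w = img_a.shape
    shifts = []
    for y in range(0, h - patch, step):
        for x in range(0, w - patch, step):
            a = img_a[y:y + patch, x:x + patch].astype(np.float32)
            b = img_b[y:y + patch, x:x + patch].astype(np.float32)
            (dx, dy), _ = cv2.phaseCorrelate(a, b)
            shifts.append((x, y, dx, dy))
    return np.array(shifts)   # sampled displacement field between the two images
```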


Sensors ◽  
2019 ◽  
Vol 19 (11) ◽  
pp. 2459 ◽  
Author(s):  
Ji-Hun Mun ◽  
Moongu Jeon ◽  
Byung-Geun Lee

Herein, we propose an unsupervised learning architecture under coupled consistency conditions to estimate depth, ego-motion, and optical flow. Previous learning techniques in computer vision adopted large amounts of ground truth data for network training. A ground truth dataset including depth and optical flow collected from the real world requires tremendous pre-processing effort due to exposure to noise artifacts. In this paper, we propose a framework that trains networks using a different type of data, with combined losses derived from a coupled consistency structure. The core concept is composed of two parts. First, we compare the optical flows estimated from depth plus ego-motion with those from the flow estimation network. Subsequently, to suppress the effects of artifacts in occluded regions of the estimated optical flow, we compute flow local consistency along the forward–backward directions. Second, synthesis consistency enables the exploration of the geometric correlation between the spatial and temporal domains in a stereo video. We perform extensive experiments on depth, ego-motion, and optical flow estimation on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset. We verify that the flow local consistency loss improves the optical flow accuracy in occluded regions. Furthermore, we also show that the view-synthesis-based photometric loss enhances the depth and ego-motion accuracy via scene projection. The experimental results exhibit the competitive performance of the estimated depth and optical flow; moreover, the induced ego-motion is comparable to that obtained from other unsupervised methods.
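A minimal PyTorch sketch of a forward-backward flow consistency check of the kind used to mask occluded regions (the thresholds are common heuristic values, not necessarily those of the paper):

```python
import torch
import torch.nn.functional as F

def fb_consistency_mask(flow_fw, flow_bw, alpha=0.01, beta=0.5):
    """Flag a pixel as occluded when the backward flow, sampled at the
    forward-warped location, fails to cancel the forward flow. Flows have
    shape [B, 2, H, W] in pixels; returns 1 for non-occluded pixels."""
    b, _, h, w = flow_fw.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), 0).float().to(flow_fw.device).unsqueeze(0)
    coords = base + flow_fw
    gx = 2.0 * coords[:, 0] / (w - 1) - 1.0
    gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
    bw_warped = F.grid_sample(flow_bw, torch.stack((gx, gy), -1), align_corners=True)
    diff = (flow_fw + bw_warped).pow(2).sum(1)        # |f_fw + f_bw(warped)|^2
    bound = alpha * (flow_fw.pow(2).sum(1) + bw_warped.pow(2).sum(1)) + beta
    return (diff < bound).float()
```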

