Motion Feature Retrieval in Basketball Match Video Based on Multisource Motion Feature Fusion

2022 · Vol 2022 · pp. 1-10
Author(s): Biao Ma, Minghui Ji

Both the human body and its motion are three-dimensional, yet traditional feature descriptions of two-person interaction based on RGB video alone discriminate poorly because they lack depth information. Exploiting the respective advantages and complementary characteristics of RGB video and depth video, a retrieval algorithm based on multisource motion feature fusion is proposed. First, the algorithm combines spatiotemporal interest points with a bag-of-words model to represent the features of the RGB video. Then, a histogram of oriented gradients is used to describe each depth video frame, and statistical features of key frames are introduced to summarize the histogram features of the depth video. Finally, a multifeature image fusion algorithm fuses the two sets of video features. The experimental results show that multisource feature fusion greatly improves the retrieval accuracy of motion features.
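As a minimal sketch of the pipeline described above, the snippet below quantizes local RGB spatiotemporal descriptors into a bag-of-words histogram, describes a depth key frame with a histogram of oriented gradients, and fuses the two normalized vectors by weighted concatenation. The descriptor dimensions, codebook size, and concatenation-style fusion are illustrative assumptions, not the authors' exact design.

```python
# Hypothetical sketch: quantize RGB spatiotemporal descriptors into a
# bag-of-words histogram, describe a depth key frame with HOG, then fuse
# the two normalized feature vectors by weighted concatenation.
import numpy as np
from sklearn.cluster import KMeans
from skimage.feature import hog

def bow_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest visual word."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / (hist.sum() + 1e-12)

def fuse_features(rgb_hist, depth_hog, alpha=0.5):
    """Weighted concatenation of L2-normalized RGB and depth features."""
    rgb = rgb_hist / (np.linalg.norm(rgb_hist) + 1e-12)
    dep = depth_hog / (np.linalg.norm(depth_hog) + 1e-12)
    return np.concatenate([alpha * rgb, (1 - alpha) * dep])

# Toy usage: 500 fake 162-D STIP descriptors, one 128x128 depth key frame.
rng = np.random.default_rng(0)
codebook = KMeans(n_clusters=64, n_init=10).fit(rng.normal(size=(500, 162)))
rgb_hist = bow_histogram(rng.normal(size=(80, 162)), codebook)
depth_hog = hog(rng.random((128, 128)), orientations=9,
                pixels_per_cell=(16, 16), cells_per_block=(2, 2))
print(fuse_features(rgb_hist, depth_hog).shape)
```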

2019 · Vol 15 (1) · pp. 155014771882245
Author(s): Hongjie Yang, Fan Zhou, Ge Lin, Mouguang Lin, Shujin Lin

Three-dimensional animated models are widely used in scientific visualization, entertainment, and virtual applications, especially in visual sensor networks. The main purpose of simplification is to capture the shape sequence of an object with very few elements while preserving the overall shape. Because a three-dimensional animated mesh is time-varying across all frames, a simplification algorithm must balance temporal coherence against geometric distortion. In this article, a novel three-dimensional animated mesh simplification algorithm based on motion features is presented. Here, motion features are the connection areas of relative movement, consisting of vertices and edges, and motion feature extraction amounts to finding a subgraph with this movement property. The dihedral angle of each edge across all frames is used to determine whether the edge is connected to a moving part. A rotation connected graph is then defined to extract motion features; traversing this graph extracts all of them. Based on the motion features, an animated quadric error metric is created and a quadric error matrix is built across all frames. Compared with other methods, the important advantages of this method are an efficient simplification process and smoother simplification results, making it suitable for real-time applications. Experimental results show that the 3D animated mesh simplification produced by our method is satisfactory.
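The dihedral-angle test can be pictured with the rough sketch below (my own naming and threshold), under the assumption that an edge belongs to a motion feature when its dihedral angle varies noticeably over the animation.

```python
# Rough sketch (my naming) of the dihedral-angle test: an edge whose
# dihedral angle varies across frames is treated as part of a motion feature.
import numpy as np

def dihedral_angle(p0, p1, a, b):
    """Angle between triangles (p0, p1, a) and (p1, p0, b) sharing edge p0-p1."""
    n1 = np.cross(p1 - p0, a - p0)
    n2 = np.cross(p0 - p1, b - p1)
    c = np.dot(n1, n2) / (np.linalg.norm(n1) * np.linalg.norm(n2) + 1e-12)
    return np.arccos(np.clip(c, -1.0, 1.0))

def is_motion_edge(frames, edge, tol=0.05):
    """frames: (F, V, 3) vertex positions; edge: (i, j, a, b) vertex indices."""
    i, j, a, b = edge
    angles = [dihedral_angle(f[i], f[j], f[a], f[b]) for f in frames]
    return np.ptp(angles) > tol  # large spread over time => relative movement

# Toy usage: a flat two-triangle strip whose "wing" folds in the last frame.
frames = np.zeros((3, 4, 3))
frames[:] = [[0, 0, 0], [1, 0, 0], [0.5, 1, 0], [0.5, -1, 0]]
frames[2, 2, 2] = 0.8
print(is_motion_edge(frames, (0, 1, 2, 3)))  # True: the fold angle changed
```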


2014 · Vol 530-531 · pp. 534-539
Author(s): Dong Zhang, Cun Qian Feng, Ning Ning Tong, Si San He, Teng Lei

To address the problems of sliding-type scattering centers and the limited viewing angle of a single radar, a method for extracting three-dimensional micro-motion characteristics of ballistic targets based on netted radars is proposed. First, a new feature extraction method based on half-period delay multiplication is introduced, which establishes the relationship between the sliding-type scattering center and the ideal scattering center; the parameters of the echo signals are then extracted using an extended Hough transform. Finally, the three-dimensional micro-motion features and structural features are obtained by solving systems of nonlinear multivariable equations established from the multiple views of the netted radars. Simulations validate that the method achieves high estimation precision while recovering the three-dimensional micro-motion parameters.
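The Hough-style parameter extraction step can be pictured as a vote over candidate micro-motion parameters. The toy below is my own illustrative construction, not the paper's exact formulation: it votes over (amplitude, phase) pairs for a sinusoidal scattering-center track with a known rotation rate, exploiting the fact that the correct pair leaves a nearly constant residual.

```python
# Illustrative only (not the paper's exact algorithm): a Hough-style vote
# over candidate (amplitude, phase) pairs for a sinusoidal scattering-center
# track with known rotation rate omega. The correct pair leaves a nearly
# constant residual, so its histogram vote concentrates in a single bin.
import numpy as np

def sinusoid_hough(t, r, omega, amps, phases, n_bins=64):
    acc = np.zeros((len(amps), len(phases)))
    for i, amp in enumerate(amps):
        for j, ph in enumerate(phases):
            residual = r - amp * np.sin(omega * t + ph)
            hist, _ = np.histogram(residual, bins=n_bins)
            acc[i, j] = hist.max()  # peaked residual histogram => good fit
    i, j = np.unravel_index(acc.argmax(), acc.shape)
    return amps[i], phases[j]

# Toy track: amplitude 1.3, phase 0.7, constant offset 5.0.
t = np.linspace(0, 2, 400)
r = 1.3 * np.sin(4 * np.pi * t + 0.7) + 5.0
print(sinusoid_hough(t, r, 4 * np.pi,
                     np.linspace(0.5, 2.0, 61), np.linspace(0, np.pi, 91)))
```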


2019 · Vol 63 (5) · pp. 50402-1-50402-9
Author(s): Ing-Jr Ding, Chong-Min Ruan

Abstract Acoustic-based automatic speech recognition (ASR) is a mature technique widely used in numerous applications. However, acoustic-based ASR does not maintain standard performance for disabled speakers with atypical facial characteristics, that is, atypical eye or mouth geometry. To address this problem, this article develops a three-dimensional (3D) sensor lip-image-based pronunciation recognition system in which the 3D sensor efficiently captures the variations in a speaker's lip shape during pronunciation. In this work, two different types of 3D lip features for pronunciation recognition are presented: 3D-(x, y, z) coordinate lip features and 3D geometry lip feature parameters. For the 3D-(x, y, z) coordinate lip feature design, 18 location points around the outer and inner lips, each with 3D coordinates, are defined. For the 3D geometry lip features, eight types of features describing the geometrical characteristics of the inner lip space are developed. In addition, feature fusion combining both the 3D-(x, y, z) coordinate and 3D geometry lip features is considered. The performance and effectiveness of the presented 3D sensor lip-image features are evaluated using a classification approach based on principal component analysis. Experimental results on pronunciation recognition over two different datasets, Mandarin syllables and Mandarin phrases, demonstrate the competitive performance of the presented 3D sensor lip-image-based pronunciation recognition system.
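A hedged sketch of the PCA-based classification stage follows, assuming each sample flattens the 18 3D lip points into a 54-dimensional vector and classes are separated by nearest centroid in PCA space; the data shapes and the nearest-centroid rule are illustrative assumptions, not the paper's exact calculation.

```python
# Hedged sketch of the PCA-based classification stage: flatten the 18 lip
# points (54 values) per sample, reduce with PCA, classify by nearest class
# centroid. Data shapes and the nearest-centroid rule are illustrative.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 18 * 3))        # 120 samples of 18 3D lip points
y = rng.integers(0, 4, size=120)          # 4 pronunciation classes

pca = PCA(n_components=10).fit(X)
Z = pca.transform(X)
centroids = np.stack([Z[y == c].mean(axis=0) for c in range(4)])

def classify(sample):
    z = pca.transform(sample.reshape(1, -1))
    return int(np.argmin(np.linalg.norm(centroids - z, axis=1)))

print(classify(X[0]))
```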


2021 · pp. 1-18
Author(s): R.S. Rampriya, Sabarinathan, R. Suganya

In the near future, the combination of UAVs (unmanned aerial vehicles) and computer vision will play a vital role in periodically monitoring railroad condition to ensure passenger safety. The most significant module in railroad visual processing is obstacle detection, where the concern is obstacles that have fallen near the track, inside or outside the gauge. This makes it important to detect and segment the railroad into three key regions: gauge inside, rails, and background. Traditional railroad segmentation methods depend on either manual feature selection or expensive dedicated devices such as Lidar, which are typically less reliable for railroad semantic segmentation. Moreover, although cameras mounted on moving vehicles such as drones can produce high-resolution images, segmenting precise pixel information from these aerial images is challenging because of the cluttered railroad surroundings. RSNet is a multi-level feature fusion algorithm for segmenting railroad aerial images captured by UAV; it comprises an attention-based, computationally efficient convolutional encoder for feature extraction and a modified residual decoder for segmentation that considers only essential features, producing less overhead and higher performance even on real-time railroad drone imagery. The network is trained and tested on a railroad scenic view segmentation dataset (RSSD), which we built from real-time UAV images, and achieves a Dice coefficient of 0.973 and a Jaccard index of 0.94 on the test data, better results than existing approaches such as the residual unit and residual squeeze net.
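For reference, the two reported metrics are commonly computed as below for binary masks; this is a small standard-definition implementation, not the authors' evaluation code.

```python
# Reference implementations of the two reported metrics for binary masks
# (standard definitions, not the authors' evaluation code).
import numpy as np

def dice(pred, target, eps=1e-7):
    inter = np.logical_and(pred, target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

def jaccard(pred, target, eps=1e-7):
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)

pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
gt = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
print(dice(pred, gt), jaccard(pred, gt))  # ~0.667, 0.5
```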


Information · 2020 · Vol 12 (1) · pp. 3
Author(s): Shuang Chen, Zengcai Wang, Wenxin Chen

The effective detection of driver drowsiness is an important measure for preventing traffic accidents. Most existing drowsiness detection methods use only a single facial feature to identify fatigue status, ignoring both the complex correlations among fatigue features and their temporal information, which reduces recognition accuracy. To solve these problems, we propose a driver sleepiness estimation model based on factorized bilinear feature fusion and a long short-term recurrent convolutional network to detect driver drowsiness efficiently and accurately. The proposed framework includes three modules: fatigue feature extraction, fatigue feature fusion, and driver drowsiness detection. First, we use a convolutional neural network (CNN) to extract deep representations of eye- and mouth-related fatigue features from the face region detected in each video frame. Then, based on the factorized bilinear feature fusion model, we perform a nonlinear fusion of the deep feature representations of the eyes and mouth. Finally, we feed the sequence of fused frame-level features into a long short-term memory (LSTM) unit to capture the temporal information of the features, and we use a softmax classifier to detect sleepiness. The proposed framework was evaluated on the National Tsing Hua University drowsy driver detection (NTHU-DDD) video dataset. The experimental results show that this method has better stability and robustness than other methods.
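A minimal PyTorch sketch of the factorized bilinear fusion step is given below, assuming an MFB-style formulation (project both features, multiply elementwise, sum-pool over k factors, then power- and L2-normalize); the feature dimensions and k are illustrative, not the paper's settings.

```python
# Minimal MFB-style sketch of factorized bilinear fusion: project the eye
# and mouth features, multiply elementwise, sum-pool over k factors, then
# apply power and L2 normalization. Dimensions and k are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedBilinearFusion(nn.Module):
    def __init__(self, dim_eye, dim_mouth, dim_out, k=4):
        super().__init__()
        self.U = nn.Linear(dim_eye, dim_out * k, bias=False)
        self.V = nn.Linear(dim_mouth, dim_out * k, bias=False)
        self.dim_out, self.k = dim_out, k

    def forward(self, x_eye, x_mouth):
        joint = self.U(x_eye) * self.V(x_mouth)               # elementwise product
        joint = joint.view(-1, self.dim_out, self.k).sum(2)   # sum-pool over k
        joint = torch.sign(joint) * torch.sqrt(joint.abs() + 1e-12)  # power norm
        return F.normalize(joint, dim=1)                      # L2 norm

fusion = FactorizedBilinearFusion(256, 256, 128)
z = fusion(torch.randn(8, 256), torch.randn(8, 256))
print(z.shape)  # torch.Size([8, 128])
```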


2005 · Vol 22 (7) · pp. 909-929
Author(s): Hirohiko Masunaga, Christian D. Kummerow

Abstract A methodology to analyze precipitation profiles using the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) and precipitation radar (PR) is proposed. Rainfall profiles are retrieved from PR measurements, defined as the best-fit solution selected from profiles precalculated by cloud-resolving models (CRMs) under explicitly defined assumptions about the drop size distribution (DSD) and ice hydrometeor models. The PR path-integrated attenuation (PIA), where available, is further used to adjust the DSD in a manner similar to the PR operational algorithm. Combined with the TMI-retrieved nonraining geophysical parameters, the three-dimensional structure of the geophysical parameters is obtained across the satellite-observed domains. Microwave brightness temperatures are then computed and compared with TMI observations to examine whether the radar-retrieved rainfall is consistent in the radiometric measurement space. The inconsistency in microwave brightness temperatures is reduced by iterating the retrieval procedure with updated assumptions about the DSD and ice-density models. The proposed methodology is expected to refine the a priori rain profile database and the error models used by parametric passive microwave algorithms aimed at the Global Precipitation Measurement (GPM) mission, as well as future TRMM algorithms.
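The iteration structure alone can be illustrated with a deliberately simplified toy: a scalar "DSD factor" is nudged until a stand-in forward model matches an "observed" brightness temperature. The real procedure replaces both toy functions with full profile retrieval and radiative-transfer computation.

```python
# Deliberately simplified toy of the iteration structure only: a scalar
# "DSD factor" is nudged until a stand-in forward model matches an
# "observed" brightness temperature. The real procedure replaces both toy
# functions with full profile retrieval and radiative-transfer computation.
def toy_forward_model(rain_rate):
    return 180.0 + 2.5 * rain_rate        # stand-in Tb(rain) relation

def iterate_dsd(observed_tb, tol=0.1, max_iter=50):
    dsd_factor = 1.0
    rain_rate = 0.0
    for _ in range(max_iter):
        rain_rate = 10.0 * dsd_factor     # stand-in PR retrieval
        tb = toy_forward_model(rain_rate)
        if abs(tb - observed_tb) < tol:   # consistent with the radiometer
            break
        dsd_factor *= observed_tb / tb    # crude multiplicative DSD update
    return rain_rate

print(iterate_dsd(observed_tb=210.0))     # converges toward rain_rate = 12
```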


2021 · Vol 11 (1)
Author(s): Parsa Omidi, Mohamadreza Najiminaini, Mamadou Diop, Jeffrey J. L. Carson

Abstract Spatial resolution in three-dimensional fringe projection profilometry is determined in large part by the number and spacing of fringes projected onto an object. Because fringe projection profilometry is intensity-based, fringe patterns must be generated in succession, which is time-consuming; as a result, the surface features of highly dynamic objects are difficult to measure. Here, we introduce multispectral fringe projection profilometry, a novel method that uses multispectral illumination to project a multispectral fringe pattern onto an object, combined with a multispectral camera that detects the deformation of the fringe patterns by the object. The multispectral camera detects 8 unique monochrome fringe patterns, representing 4 distinct directions, in a single snapshot. Furthermore, for each direction, the camera detects two π-phase-shifted fringe patterns. Each pair of fringe patterns can be differenced to generate a differential fringe pattern that corrects for illumination offsets and mitigates the effects of glare from highly reflective surfaces. The new multispectral method solves many practical problems of conventional fringe projection profilometry and doubles the effective spatial resolution. The method is suitable for high-quality, fast 3D profilometry at video frame rates.
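The offset-cancelling differencing can be verified with a one-dimensional toy signal (illustrative numbers only): subtracting the two π-shifted patterns removes the common offset and doubles the fringe modulation.

```python
# One-dimensional check of the differencing idea: two pi-shifted fringe
# patterns share the same illumination offset, so subtracting them cancels
# the offset (and glare bias) and doubles the fringe modulation.
import numpy as np

x = np.linspace(0, 4 * np.pi, 512)
offset, amplitude = 0.6, 0.3                    # ambient/glare offset, fringe depth
i0 = offset + amplitude * np.cos(x)             # pattern at phase 0
i_pi = offset + amplitude * np.cos(x + np.pi)   # pi-shifted pattern
diff = i0 - i_pi                                # = 2*amplitude*cos(x), offset-free
print(np.allclose(diff, 2 * amplitude * np.cos(x)))  # True
```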


2020 · Vol 6 (2) · pp. eaay6036
Author(s): R. C. Feord, M. E. Sumner, S. Pusdekar, L. Kalra, P. T. Gonzalez-Bellido, ...

The camera-type eyes of vertebrates and cephalopods exhibit remarkable convergence, but it is currently unknown whether the mechanisms for visual information processing in these brains, which exhibit wildly disparate architectures, are also shared. To investigate stereopsis in a cephalopod species, we affixed “anaglyph” glasses to cuttlefish and used a three-dimensional perception paradigm. We show that (i) cuttlefish have also evolved stereopsis (i.e., the ability to extract depth information from the disparity between the left and right visual fields); (ii) when stereopsis information is intact, the time and distance covered before striking at a target are shorter; and (iii) stereopsis in cuttlefish works differently from that in vertebrates, as cuttlefish can extract stereopsis cues from anticorrelated stimuli. These findings demonstrate that although depth computation has evolved convergently, cuttlefish stereopsis is likely afforded by a different algorithm than in humans, not just a different implementation.


2013 · Vol 319 · pp. 343-347
Author(s): Ru Ting Xia, Xiao Yan Zhou

This research aimed to reveal characteristics of the visual attention of low-vision drivers. Near and far stimuli were presented using a three-dimensional (3D) attention measurement system that simulated a traffic environment. We measured subjects' reaction times while their attention shifted under three simulated levels of peripheral environmental illuminance (daylight, twilight, and dawn conditions). Subjects were required to judge whether the target appeared nearer or farther than the fixation point. The results showed that peripheral environmental illuminance had a clear influence on drivers' reaction times: reaction times were slower in the dawn and twilight conditions than in the daylight condition, and attention was distributed preferentially to nearer space over farther space; that is, shifts of attention in 3D space were anisotropic in depth. The results suggest that (1) visual attention can be manipulated with both a precueing paradigm and stimulus controls that include depth information, and (2) the anisotropy of attention shifting depends on the distance over which attention moves, and it was more pronounced in the dawn condition than in the daylight and twilight conditions.

